Exploiting the potential of large databases of electronic health records for research using rapid search algorithms and an intuitive query interface



Extracting accurate, complete and timely data from large health record databases has long been a struggle. The authors of this article report that "an end-to-end system, underpinned by an innovative search algorithm, allows the user to extract information in near real-time via an intuitive query interface." [1]


Advances in technology allow large databases to collect vast amounts of information from millions of patients, which could assist with research and the development of studies or medicines. One such database is the Clinical Practice Research Datalink (CPRD), which is accessed by scientists, clinicians and government agencies from around the world.[1] The challenge lies in the time a large, complex query can take to complete and in the format of the output, which is typically extracted as rows and columns. Although this is a starting point, days of sorting and filtering may still be needed before findings emerge.


One goal is to allow users with little coding experience to run a query that gathers the data they need. The other goal is to reduce the time it takes to run the query. "TrialViz is a simple online intuitive interface to the large and complex data held within CPRD."[1]


Multiple disciplines collaborated on this extensive project, including data analysts, epidemiologists, software engineers and computer scientists.[1] After algorithms were developed and ways of extracting data were identified, TrialViz emerged as a new tool that can assist with large-scale data extraction. Users were involved at all stages of development.


TrialViz was found to reduce query times dramatically. For example, a query that typically took 5 hours from start to finish now takes 2 seconds.[1] Similar systems exist; however, they are not capable of working with a database of the size that CPRD provides.
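To put the reported improvement in perspective, the short sketch below works out the speed-up factor implied by going from 5 hours to 2 seconds. The function name `speedup` is chosen here for illustration and is not part of TrialViz or the original article; only the two timings come from the source.

```python
def speedup(before_seconds: float, after_seconds: float) -> float:
    """Return how many times faster the new run is than the old one."""
    if after_seconds <= 0:
        raise ValueError("after_seconds must be positive")
    return before_seconds / after_seconds


# Timings reported in the article: 5 hours before, 2 seconds after.
before = 5 * 60 * 60  # 5 hours expressed in seconds
after = 2

factor = speedup(before, after)
print(f"Roughly {factor:,.0f} times faster")
```

On these figures the query runs about 9,000 times faster, which is what moves data extraction from an overnight job to something a researcher can do interactively.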


With this new software, data can be analyzed accurately in near real time by users with minimal coding experience. This could change how candidates for research are recruited.


This study demonstrates some of the challenges that we face today and a solution to them. I currently face some of those very challenges and have at times needed to wait days for a report to be generated. This can be frustrating when facing deadlines; however, I have now learned that there are solutions. The article clearly demonstrates how advances in query algorithms can help with data extraction.

Related articles


  1. Tate AR, Beloff N, Al-Radwan B, Wickson J, Puri S, Williams T, Van Staa T, Bleach A. Exploiting the potential of large databases of electronic health records for research using rapid search algorithms and an intuitive query interface. Journal of the American Medical Informatics Association 2014; 21: 292-298. Published online 5 November 2013. http://jamia.oxfordjournals.org/content/21/2/292