Electronic Health Records GeCIP Domain


The sequence of health-related events that we all experience is increasingly captured in health records, on smartphones or on devices that we wear. To get the most benefit from the genetic sequences held in the 100,000 Genomes Project datacentre, for ourselves, our families and others, we need to bring together these two sequences: on the one hand our health as it unfolds over time, and on the other the string of 3 billion letters that make up our genome.

The Farr Institute seeks to do just that. This major, exciting challenge demands new ways of doing research and new methods and tools in six related areas:

  1. To support recruitment, helping patients get into the 100,000 Genomes Project in the first place.
  2. To bring in the patient's perspective (e.g. reporting how they feel on a smartphone).
  3. To cross the bridge between the well-established biological toolkits used in genomics and the newer field of examining health records.
  4. To use the ‘cradle to grave’ strengths of the NHS to find out about health over time.
  5. To offer patients the opportunity to take part in more clinical trials, with less intrusion on their time.
  6. To ensure that the NHS learns from new knowledge in order to drive better patient outcomes from genomic medicine.

Below are the current subdomains for this domain. You can find the full details of the research proposed by this domain in the Electronic Health Records detailed research plan.

Recruitment
Richard Dobson and Jackie Cassell
This proposal specifically addresses the immediate mission-critical GeL objectives for the use of data within clinical records at each GMC site for:
1. candidate recruitment, and
2. population of GeL disease models.

These objectives will be met through the provision of an information retrieval and extraction platform, CogStack. CogStack implements best-of-breed enterprise search, natural language processing, analytics and visualisation technologies that are known to be absent from the majority of GMCs’ business and clinical intelligence capabilities. We have demonstrated successes at two GMC sites, South London and The Maudsley NHS Foundation Trust (SLaM) and King’s College Hospital (KCH), where we are already seeing a step change in GMC capabilities to expedite patient recruitment for the 100KGP and populate phenotype data models. It is already clear that this project has triggered lasting change at this GMC NHS Trust.
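
As a rough illustration of the kind of free-text retrieval step that candidate recruitment relies on, the sketch below screens clinical notes for eligibility terms. It is not CogStack's API and makes no claim about how CogStack works internally: the `Note` type, the `screen_candidates` function, the search terms and the sample notes are all hypothetical. A real deployment would rely on enterprise search and natural language processing (for example, to handle negation such as "no evidence of epilepsy"), which this toy keyword match does not attempt.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    """A hypothetical free-text clinical note linked to a patient."""
    patient_id: str
    text: str


def screen_candidates(notes, terms):
    """Return {patient_id: sorted list of matched terms} for notes mentioning any term.

    Matching is case-insensitive and whole-word only; patients with no
    matching note are left out of the result.
    """
    patterns = {
        term: re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for term in terms
    }
    hits = {}
    for note in notes:
        for term, pattern in patterns.items():
            if pattern.search(note.text):
                hits.setdefault(note.patient_id, set()).add(term)
    return {pid: sorted(matched) for pid, matched in hits.items()}


# Invented example notes, for illustration only.
notes = [
    Note("P001", "Family history of cardiomyopathy; referred to cardiology."),
    Note("P002", "Routine follow-up, no new concerns."),
    Note("P001", "Echo consistent with dilated Cardiomyopathy."),
    Note("P003", "Suspected epilepsy, EEG requested."),
]
candidates = screen_candidates(notes, ["cardiomyopathy", "epilepsy"])
```

Here `candidates` maps each flagged patient to the terms found in their notes, giving a shortlist that a clinician would still need to review before any approach is made.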

We aim to build a legacy within the NHS business intelligence community through the deployment of CogStack across all GMC sites. Beyond the principal goals of the 100KGP, the implementation of the proposed toolset and the associated upskilling in data mining techniques will create a transformational capability in the fields of business intelligence, research and audit. One of the key outcomes will be a dramatically reduced cost of clinical trials in routine practice.

Patient perspective and PROMs
Zina Ibrahim and Mark Duman
Patient-reported outcome measures (PROMs) provide vital insight into patient experience with care and are an important clinical tool whose use has evolved from research and monitoring quality of care to supporting outcome improvement (Black ’13). Not only do PROMs provide data on large numbers of patients, but they are also representative of typical, everyday practice, thus facilitating research on the observational effectiveness (rather than controlled efficacy) of treatments (Devlin ’10). PROMs also represent high-quality standardised phenotypes, enabling the quantification of patient state, comparisons of treatments and patient response to therapy. In this regard, PROMs can be used to generate a greatly enriched phenotype, of benefit to the Genomics England 100,000 Genomes Project.

In addition to PROM data, passive remote monitoring using the sensors on mobile phones and wearable devices can greatly enhance our understanding of the clinical phenotype, and provide powerful and potentially disruptive means to support innovation in, and democratization of, healthcare delivery. Streamed data has the potential to support early diagnosis and prognosis, and to provide a means for stratification and intervention delivery. The data captured offers the first real opportunity to collect objective metrics at high resolution in daily life, providing a more complete picture of the patient and a new class of phenotype target to complement genetic, molecular and neuroscience research. We aim to build on our present success in this area, the €22M IMI2-funded project RADAR-CNS (with an additional €1.2M contribution from technology partners Intel and Software AG), which is coordinated by KCL and makes the work proposed here highly cost-effective.

To achieve all this, we need a user-centred design to build informatics tools that provide the right functionality and information for patients as well as for healthcare professionals. Our objective is to create and demonstrate the benefits of a computing infrastructure supporting an interoperable and personalized environment enabling remote patient-led phenotype provision through PROMs and wearable-generated data directly to hospital

Ensembl for EHR
Nick Luscombe and Tim Hubbard
There is ongoing debate regarding disease classification and the development of novel phenotypic cohorts of patients that may encompass one or more diseases. A “new taxonomy of disease” based on underlying biology, rather than traditional descriptions, will allow dramatically different approaches to diagnosis, treatment and prognosis for risk evaluation. Many diseases currently studied by GeL have not been formally defined in computable terms using existing clinical terminologies. Additionally, for diseases where diagnosis codes do exist, the resolution provided is coarse, and as a result substantial overlap and ambiguity between data elements exist. Raw genomic and EHR data are neither research- nor clinic-ready, and a substantial amount of work is required to transform them into a resource that can be analysed and interpreted. The EHR GeCIP provides an ideal opportunity to curate omic, biomarker, imaging, EHR and other data, and present them as clinically meaningful disease models.

The aims of this sub-domain are:
(i) to develop novel computational software infrastructure for automatically curating genomic and EHR data;
(ii) to use these curations to help develop a "new taxonomy" of disease, based on computable disease models;
(iii) to feed back these models to GMCs as a way to continually improve patient care and data collection;
(iv) to provide programmatic access to curated disease models for researchers.

We will develop and evolve the EHR research community by fundamentally shifting the cultural landscape, providing a ‘go-to’ resource of information, tools and knowledge exchange and promoting sharing and transparency.

The subdomain is closely aligned with the principal aims of the Farr Institute and the major MRC Medical Bioinformatics Awards, and it builds on ongoing initiatives such as the UK Biobank.

Longitudinal phenotypes and data linkage
Spiros Denaxas and Martin Landray
In order to drive NHS transformation, GeL has to deliver patient benefit in terms of improved clinical outcomes. The current data models for the majority of GeL phenotypes are based on an a priori selected set of data elements, which are being provided by the GMCs through a mixture of approaches including case report forms. These data elements are of limited scope, as they capture cross-sectional data generated during secondary care interactions within the healthcare system. There is a multitude of phenotypically diverse and longitudinal data for patients, spanning primary and secondary care and non-health data sources, that are of high interest to GeL and are currently not systematically captured or extracted.

These rich data sources can be used to construct longitudinal disease phenotypes by establishing the transition between disease states from onset to progression, capturing all episodes of care (from initial presentation in primary care to diagnosis and treatment in secondary care). High-resolution longitudinal phenotypes can be utilized for hypothesis-driven and hypothesis-free observational research across all clinical GeCIPs.
Furthermore, the linkage with national coded primary and secondary care electronic health records will enable scalable and cost-effective ascertainment of long-term outcomes for clinical trials.
Similarly, the linkage with administrative and social datasets will facilitate high-impact research across clinical domains at the intersection between health and social care by enabling the creation of non-health-related phenotypes.
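
To make the idea of a longitudinal phenotype more concrete, the sketch below merges coded events from two sources and reduces them to an ordered sequence of disease-state transitions per patient. It is a minimal illustration under stated assumptions, not the subdomain's method: the `Event` type, the `build_trajectories` function, the state labels and the sample records are hypothetical, and it assumes records are already linked on a shared patient identifier, which in practice is a substantial task in its own right.

```python
from dataclasses import dataclass
from datetime import date
from itertools import groupby


@dataclass(frozen=True)
class Event:
    """A hypothetical coded event from one care setting."""
    patient_id: str
    when: date
    source: str  # e.g. "primary_care" or "secondary_care"
    state: str   # illustrative disease-state label


def build_trajectories(*sources):
    """Merge events from all sources into {patient_id: [(date, state), ...]}.

    Events are ordered by date within each patient, and only entries where
    the state differs from the previous one are kept, so each list reads as
    a sequence of transitions.
    """
    merged = sorted(
        (event for source in sources for event in source),
        key=lambda e: (e.patient_id, e.when),
    )
    trajectories = {}
    for patient_id, events in groupby(merged, key=lambda e: e.patient_id):
        path = []
        for event in events:
            if not path or path[-1][1] != event.state:
                path.append((event.when, event.state))
        trajectories[patient_id] = path
    return trajectories


# Invented example records, for illustration only.
primary = [
    Event("P001", date(2014, 3, 2), "primary_care", "presentation"),
    Event("P001", date(2014, 3, 20), "primary_care", "presentation"),
]
secondary = [
    Event("P001", date(2015, 1, 7), "secondary_care", "treatment"),
    Event("P001", date(2014, 5, 11), "secondary_care", "diagnosis"),
]
trajectories = build_trajectories(primary, secondary)
```

For the sample patient, the repeated primary care presentation collapses into a single entry and the secondary care events are placed in date order, giving a three-step path from presentation to diagnosis to treatment.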

EHR-enabled trials for precision medicine
JP Casas and Folkert Asselbergs
Delivering precision medicine for the NHS requires a suite of new, genomically informed, randomised trials. Generating this evidence demands innovation in the efficient use of NHS data (in order to optimise each stage of trial design, conduct and implementation), as well as attention to the specific challenges and opportunities afforded by the availability of genomic sequence data. This is crucial to address questions such as testing dose, timing and combinations of existing drugs, repurposing hypotheses, licensed drug adjuncts to existing treatments (which may arise from genetic discoveries) as well as new innovative treatments. It will also be crucial to streamline and better power new trials for patients with rare diseases and for ethnic groups under-represented in trials.

The overarching aim of this Farr GeCIP sub-domain is to intersect OMICs technologies and EHRs to enable precision medicine trials within the NHS.

The specific objectives of this sub-domain are:
(i) to drive EHR phenotypic augmentation of OMICs data for evaluating trial feasibility, point-of-care randomisation and follow-up for trial safety and efficacy outcomes, applicable to any trial design;
(ii) to generate an intelligent computational platform that builds on OMICs data to optimise trial design and execution across a wide range of trial designs.

Learning Health Systems
Brendan Delaney and Tjeerd van Staa
Translating the potential benefits of GeL’s sequence data into patient benefits requires a matching transformation in information systems. The Learning Health System (LHS), as defined by the US Institute of Medicine (2008, 2012), is a term that describes the formal linkage of research in routine healthcare settings with the application of the knowledge created. The widespread adoption of electronic health records (EHRs) and their use in real time during the consultation enables the LHS to develop as an integrated technological system at potentially much greater scope and scale than traditional human-social systems. The LHS is a wider concept than personalized medicine alone, as it has two important implications:
1. Research (‘Big Data/’omics/personalized medicine’) is insufficient in isolation to create impact on patients; it requires a ‘Big Knowledge’ strategy to accompany it.
2. To progress at scale, without duplication, silos or fragmentation, we require a significant step up in informatics approaches to facilitate the LHS.

We propose methods research to develop core informatics standards, models and approaches for system integration across the EHR GeCIP subdomains. We will build on and collaborate with the very best UK, European and international groups in this area.

Other Projects