The National Genomic Research Library (NGRL) now contains one of the richest genomic datasets in the world for both rare disease and cancer research, following the curation and addition of all consented genomes from the 100,000 Genomes Project. The Library now holds the data of over 110,000 clinically linked genomes, from over 97,000 participants.
This latest data update to the Genomics England Research Environment, the platform used to access the NGRL, is a major milestone for the 100,000 Genomes Project as well as for future research. With this update, an aggregate dataset of over 78,000 genomes, built from individual genomic variant call format (gVCF) files, was also added to the Library. It allows researchers to quickly view each variant alongside information on all the participants who carry it. This will make research such as genome-wide association studies significantly easier to do within our Research Environment.
The recent update means the NGRL now represents one of the world's largest collections of whole genomes for rare diseases and for several cancers, with both somatic and germline genomes included for cancer. Because these are disease-specific cohorts, the impact is even greater. The detail and granularity of the genomic data are complemented by constantly improving clinical data, fed back to the Library through NHS Digital and Public Health England.
We are delighted to have made available a very large aggregate of whole genome sequence data from our participants, and we expect it will be useful for many downstream discovery analyses. This dataset contains over 700 million germline variants from over 78,000 participants; though it does not include the tumour genomes, it does include the germline samples from cancer participants. We have been using this dataset to contribute to international efforts to discover the genetic factors that influence the severity of COVID-19.
We are extremely grateful to all participants who generously donated their genomic data for research. Now we can tell each of them that they are actively helping to improve the way we understand human disease and therefore healthcare outcomes.
Dr Loukas Moutsianas
Head of Bioinformatics Research Services at Genomics England
Genomics England’s Research Environment is the platform through which approved researchers can securely access the genomic and other health data held in the NGRL. Participants must give their consent before we can include their de-identified genome and clinical records in this incredible resource for research.