Academics, clinicians, and students worldwide can join our research community, the Genomics England Clinical Interpretations Partnership (GECIP, for short).
Genotype-Phenotype Analysis of Patients with TUFT1 variants.
Project Lead
Siddharth Banka
Project Date
06/07/2021
Lay Summary
We have identified a new genetic disorder caused by changes in the TUFT1 gene. Affected patients appear to have primarily skin-related problems. We intend to explore the effect of different changes in the TUFT1 gene in a wider cohort of patients who were recruited as part of the 100,000 Genomes Project.
Validation of interpretable artificial intelligence algorithms for genetic diagnostics
Project Lead
Rocío Acuna-Hidalgo
Project Date
23/06/2021
Lay Summary
Over 300 million people around the world are affected by a rare genetic disease. Patients with rare genetic diseases wait on average 5 years between the first onset of symptoms and finally reaching a diagnosis, often undergoing multiple visits, referrals and series of medical tests along this journey. This translates into delayed medical care, which affects patients’ quality of life and incurs high costs for healthcare systems.
Exome and genome sequencing can help diagnose patients with rare genetic diseases by identifying the variants in their DNA that are causing their disease. Analysing a genetic test to find disease-causing variants in the patient’s DNA is a crucial step in genetic testing, known as variant interpretation. Variant interpretation is currently a manual process carried out by a specialized (human) analyst, which makes it time-consuming and costly, and it often delivers inaccurate or uncertain results. As a consequence, patients must often wait several months to receive their genetic test results. Furthermore, fewer than 50% of patients with rare genetic diseases undergoing exome or genome sequencing receive a clear diagnosis, which is crucial for providing appropriate medical care.
We have developed machine learning algorithms to analyse genetic tests to diagnose rare genetic disorders (specifically, to perform variant interpretation for exome and genome sequencing). Our goal is to automate the step of variant interpretation for genetic tests to decrease the costs, turnaround time and other limitations associated with this manual process.
We will use the Genomics England dataset to test the performance of these algorithms and ensure they can be implemented in the clinic to reliably automate the analysis of genetic tests. This would allow genetic tests to be analysed accurately and automatically, lowering costs and turnaround time. For patients with rare genetic diseases, this translates into earlier access to a clear molecular and clinical diagnosis, which is essential to receive appropriate medical care and to make informed medical decisions. Finally, at the population level, shortening the journey from first onset of symptoms to diagnosis leads to better outcomes for patients with rare diseases while lowering costs for healthcare systems, by reducing the number of patients receiving unnecessary tests and medical interventions.
Enhanced interpretation of coding and non-coding variants at the 3’ end of genes
Project Lead
Siddharth Banka
Project Date
20/05/2021
Lay Summary
Many types of variant in a person’s genes are known to cause rare genetic diseases. In some cases, the effects of these variants are well understood. However, the effects of variants at the far end of genes are often unclear. This is because the far ends of genes are read and processed differently in our cells. We want to look at these variants at the far end of genes to find out what makes them different. We hope to find new diagnoses for patients with these variants, and to learn more about the biology of rare genetic diseases.
Exomiser performance and new diagnoses for the 100,000 Genomes Project RD cohort
Project Lead
Damian Smedley
Project Date
14/04/2021
Lay Summary
Exomiser is a popular, open-source software tool used worldwide (including in Genomics England's own pipeline) to ease the task of identifying the variant responsible for causing a rare disease from amongst the millions of letters of code that are sequenced. The National Genomic Research Library (NGRL) now contains one of the largest collections of cases that have been solved as a result of whole genome sequencing, along with their associated clinical information. The value of this software in solving conditions previously thought undiagnosable needs to be demonstrated to clinicians with clear evidence. Hence, we will evaluate how effective Exomiser is in identifying diagnoses for those recruited to the 100,000 Genomes Project and make this information available to the wider rare disease community.
We will then apply the latest versions of Exomiser to unsolved cases in the 100,000 Genomes Project to try to find new diagnoses in genes that may not have been fully considered as disease genes before, or involving variants that may not have been fully studied, such as larger structural variants or variants that affect the splicing of genes.
Prevalence of genetic forms of Dilated Cardiomyopathy (DCM)
Project Lead
Ana Barat
Project Date
03/02/2021
This research project is approved, but is not approved for publication.
Lay Summary
Epidemiological data on genetic DCM are elusive, which confirms how difficult it is to know the “real” prevalence of this condition within the overall scope of heart failure (HFrEF) and in the general population. Our aim is thus to investigate the prevalence and the phenotypic characteristics of patients with DCM of genetic aetiology in general, as well as to characterise the contributions of the various specific mutation types to the prevalence of genetic DCM.
Identifying additional genomic variants within the promoter region of the TXNL4A gene associated with the spliceosomal developmental disorder Burn-McKeown syndrome.
Project Lead
Jamie Ellingford
Project Date
29/01/2021
Lay Summary
Burn-McKeown syndrome (BMKS) is a rare craniofacial developmental disorder caused by biallelic variants in the pre-messenger RNA splicing factor gene TXNL4A. The majority of individuals with BMKS have a 34 base pair (bp) deletion in the promoter region of one allele of TXNL4A combined with a loss-of-function variant on the other allele. Some BMKS patients are homozygous for a slightly different 34 bp deletion in the promoter region of TXNL4A. Functional evidence has demonstrated the importance of the affected regions of the promoter for TXNL4A gene expression, and we have identified the critical nucleotides within the 34 bp promoter deletions that are essential for TXNL4A expression. We therefore intend to probe the Genomics England data for further variants within this region of the TXNL4A promoter to identify additional genotypes associated with BMKS.
Genotype-Phenotype Analysis of Patients with ING3 variants.
Project Lead
Siddharth Banka
Project Date
15/01/2021
Lay Summary
Our collaborators in Canada have identified a new genetic disorder caused by changes in the ING3 gene. Affected patients have intellectual disability and epilepsy; however, as far as we know, this condition seems to be very rare. We intend to explore the effect of different changes in the ING3 gene in a wider cohort of patients studied in the 100,000 Genomes Project.
Genotype-Phenotype Analysis of Patients with HNRNPDL variants.
Project Lead
Siddharth Banka
Project Date
15/01/2021
Lay Summary
Our collaborators in the US have identified a new genetic disorder caused by changes in the HNRNPDL gene. Affected patients have problems with the nervous system; however, as far as we know, this condition seems to be very rare. We intend to explore the effect of different changes in the HNRNPDL gene in a wider cohort of patients studied in the 100,000 Genomes Project.
Genotype-Phenotype Analysis of Patients with WSB2 variants.
Project Lead
Siddharth Banka
Project Date
15/01/2021
Lay Summary
Our collaborators in the US have identified a new genetic disorder caused by changes in the WSB2 gene. Affected patients have problems with muscle function; however, as far as we know, this condition seems to be very rare. We intend to explore the effect of different changes in the WSB2 gene in a wider cohort of patients studied in the 100,000 Genomes Project.
Genotype-Phenotype Analysis of Patients with C12orf66 variants.
Project Lead
Siddharth Banka
Project Date
15/01/2021
Lay Summary
We have identified a new genetic disorder caused by changes in the C12orf66 gene. Affected patients have varying combinations of developmental delay and neurological problems. We intend to explore the effect of different changes in the C12orf66 gene in a wider cohort of patients studied in the 100,000 Genomes Project.
Genotype-Phenotype Analysis of Patients with XPO7 variants.
Project Lead
Siddharth Banka
Project Date
15/01/2021
Lay Summary
We have identified a new genetic disorder caused by changes in the XPO7 gene. Affected patients have varying combinations of developmental delay and neurological problems. We intend to explore the effect of different changes in the XPO7 gene in a wider cohort of patients studied in the 100,000 Genomes Project.
Genotype-Phenotype Analysis of Patients with NEUROG2 variants.
Project Lead
Siddharth Banka
Project Date
11/01/2021
Lay Summary
We have identified a new genetic disorder caused by changes in the NEUROG2 gene. Affected patients have varying combinations of developmental delay and neurological problems. We intend to explore the effect of different changes in the NEUROG2 gene in a wider cohort of patients studied in the 100,000 Genomes Project.
Impact of structural variants detection on biomarker discovery and target identification efforts.
Project Lead
Pierre Farmer
Project Date
04/12/2020
This research project is approved, but is not approved for publication.
Lay Summary
While the majority of genetic analyses focus on Single Nucleotide Polymorphisms (SNPs) due to the technical simplicity and maturity of the methods, much less attention has been given to structural variants (insertions / deletions, repeat expansions and copy number variations). Attention to structural variation is valuable because human genomes differ more between individuals as a consequence of structural variation than as a consequence of single-base-pair differences.
Our research will focus on better understanding how structural variants cause disease, from three specific angles: (i) by developing computational methods to better identify and study them, (ii) by increasing our knowledge of how structural variants are related to diseases and (iii) by investigating new ways to treat diseases.
Assessing the prevalence of premature stop codons in diseases for potential indication expansion for an early phase drug development program.
Project Lead
Pierre Farmer
Project Date
25/11/2020
This research project is approved, but is not approved for publication.
Lay Summary
In genetics, a premature termination codon (PTC) mutation is a point mutation in the DNA that generates a stop signal, prematurely interrupting the synthesis of a specific protein. The functional effect of these PTC mutations depends on their location within the coding DNA.
Further, a research investigation estimated that this type of mutation represents 11% of all catalogued human genetic lesions that cause disease.
We will investigate more precisely the prevalence of PTC mutations within the 100,000 Genomes Project cohort and study more closely their relationship with diseases.
Understanding the genotype-phenotypic spectrum in GSN-related amyloidosis
Project Lead
Siddharth Banka
Project Date
20/11/2020
Lay Summary
A rare condition called Finnish-type Amyloidosis is caused by changes in a gene called Gelsolin. So far the changes reported have all been of a similar type. We have found a new change in a family known to us for many years who have an unusual form of amyloidosis which is similar to Finnish-type. We would like to use the 100,000 Genomes Project data to look for more families where gene changes in Gelsolin could be contributing to their disease, which has, so far, remained undiagnosed.
We hope that the identification and characterisation of these gene changes in more families will help us better understand the disorder and may, eventually, lead to options for treatment and follow-up.
Characterizing the phenotype associated with HDAC2 variants
Project Lead
Stefan Barakat
Project Date
22/10/2020
Lay Summary
We recently identified HDAC2 in a functional screen as a possible new disease gene for a neurodevelopmental disorder, and have identified a number of individuals with variants in this gene who have neurodevelopmental symptoms. We aim to bring together all patients with disease-causing changes in HDAC2 to better describe this condition.
CamBridge: Linking human disease genetics to cell biology
Project Lead
Evan Reid
Project Date
19/10/2020
Lay Summary
There is a wealth of genetic information in the National Genomic Research Library (NGRL) which has not yet been explored. One of the issues, however, is a communication and skills gap amongst researchers. On the one hand, geneticists are able to use the genetic data, but do not in general have the ability to carry out specific functional assays. On the other hand, cell, developmental and structural biologists can carry out functional assays, but do not have the bioinformatics skills to obtain and analyse the genetic information contained in the NGRL. There are significant opportunities if this gap can be addressed, and this project aims to do so. It will remove the barriers that prevent functional biology researchers from using the rare Mendelian disease genetic information contained in the NGRL by providing them with analysis of genes that they nominate. By doing so it will facilitate identification of new disease genes via rapid functional validation of possible variants. It will reveal important insights into the functional cell biology of the genes identified, identifying disease mechanisms and high-quality validated drug targets, which could provide future therapeutic approaches.
Initially we will provide this analysis for groups based in the Cambridge Institute for Medical Research (CIMR), the cell biology research institute in which we are based. Rare genetic disease research is a major thematic focus within the institute, and so this will facilitate links with a core group of researchers who are specifically interested in the functional cell biology of rare genetic disease. However, if this is successful, and if suitable funding can be found, we aim to extend the reach of CamBridge to the wider Cambridge functional cell biology community, and then nationally.
Diagnosis of Rare Disease using Whole Genome Sequencing as part of Routine Healthcare
Project Lead
Anna Need
Project Date
15/10/2020
Lay Summary
This paper will summarize the rare disease arm of the 100,000 Genomes Project. It will report what improvements were made to the diagnostic approach over the years the project was running and what learnings we can take from the project to offer an optimal Genomic Medicine Service.
Clinical delineation of TSPEAR-associated ectodermal dysplasia.
Project Lead
Siddharth Banka
Project Date
07/10/2020
Lay Summary
Changes in the TSPEAR gene have been reported to cause a particular disorder which affects the development of skin, hair, teeth and sweat glands. Conditions like this are called ectodermal dysplasias and there are many other causes. Only a handful of patients with variants in TSPEAR have been described, so it is difficult to truly understand the full spectrum of problems that can occur with this disorder. We would like to use the 100,000 Genomes Project data to find and describe more patients with this condition.
Genomics England consultancy project proposal for PTC Therapeutics: Identification of individuals with likely aromatic L-amino acid decarboxylase (AADC) deficiency based on DDC variation in the Genomics England dataset
Project Date
03/09/2020
Lay Summary
Variants in the DDC gene result in AADC deficiency. As a result of the reduced activity of the AADC enzyme, nerve cells produce less dopamine and serotonin. Changes in the levels of these neurotransmitters contribute to the developmental delay, intellectual disability, abnormal movements, and autonomic dysfunction seen in people with AADC deficiency. PTC Therapeutics are interested in identifying and characterising individuals in the NGRL with DDC variants implicated in aromatic L-amino acid decarboxylase (AADC) deficiency. The project will assemble a cohort of individuals on the basis of their genomic variation at the DDC gene.