Select from the domains and explore our active research projects.
Research projects
A comprehensive catalog of somatic mutations and structural variations in lung cancer and correlation with main clinical characteristics
Project Lead
Shicheng Guo
Project Date
23/02/2023
Lay Summary
Lung cancer is the leading cause of cancer-related death worldwide, and a better understanding of the genetic changes that drive its progression is crucial for improving patient outcomes. The aim of this research plan is to study the frequency of lung cancer mutations and structural variations, and their association with clinical information, metastasis, drug response, and survival, in the lung cancer patient cohort from the 100,000 Genomes Project. Whole genome sequences provide more insight into non-coding regions of DNA than other sources of genomic data from cancer patients (e.g. TCGA and TARGET). Non-coding regions of DNA, which make up the majority of the human genome, play important roles in gene regulation, chromatin structure, and other biological processes. Mutations or structural variations in non-coding regions can affect gene expression and contribute to the development and progression of cancer. Therefore, studying non-coding regions of DNA can provide valuable insights into the molecular mechanisms of cancer. In this study, Dr. Shicheng Guo, Dr. Christopher Moy, Dr. Joel Greshock, Dr. Ramzi Temmani and Dr. Tommaso Mansi aim to provide a comprehensive understanding of the genetic changes that drive lung cancer progression and contribute to its resistance to current therapies. The identification of biomarkers of drug response and survival will support clinical trial development, facilitate personalized medicine approaches, and improve outcomes for lung cancer patients. Overall, the research plan has the potential to improve the diagnosis, treatment, and outcomes for patients with lung cancer. By contributing to a better understanding of the molecular basis of the disease, this research can help advance precision medicine and improve the quality of care for patients.
Analysing the differences between inferred ancestry groups within breast cancer
Project Lead
Claude Chelala
Project Date
22/02/2023
Lay Summary
We investigate ancestry-associated differences in breast cancer with the aim of identifying actionable windows of opportunity to improve patient safety and outcomes. To this end, we will use the Genomics England breast cancer cohort to conduct preliminary analyses and compare the results of this national cohort with our Barts breast cancer cohort (collected from patients within North East London (NEL)) and with independent cohorts.
Breast cancer patients from NEL present with disease earlier and with more aggressive cancer than patients nationwide. One-year cancer survival rates in NEL are in the lowest 10% in the country and are particularly poor for breast cancer.
Validation and Expansion of Synthetic Lethal Cancer Targets Using the Genomics England Dataset
Project Lead
Shicheng Guo
Project Date
22/02/2023
Lay Summary
Cancer is a leading cause of death worldwide, and the development of effective treatments is of utmost importance. Synthetic lethal-based drug development is a strategy for developing cancer treatments that focuses on targeting specific genetic alterations in cancer cells that are not present in normal cells. This approach takes advantage of the unique vulnerabilities of cancer cells, which often have specific genetic changes that make them more dependent on certain cellular processes. By targeting these processes, synthetic lethal-based drugs can selectively kill cancer cells while leaving normal cells unharmed. However, most current synthetic lethal targets are not yet fully validated, so we aim to validate these targets in the population cohort of cancer participants from the 100,000 Genomes Project. In this project, the team of Dr. Shicheng Guo, Dr. Christopher Moy, Dr. Tim Schultz, Dr. Kirti Snigdha, Dr. Barbara Weir, Dr. Yu Sun, Dr. Ramzi Temmani, Dr. Tommaso Mansi and Dr. Joel Greshock aims to improve our understanding of synthetic lethal targets and their potential for cancer drug development. By validating the findings in this large human population, we will be able to better evaluate the clinical relevance of synthetic lethal targets and expand our knowledge of their potential as cancer treatments. This will inform our drug discovery programmes in this field, ensuring that we pursue R&D to develop medicines around synthetic lethal targets that are confirmed to be relevant for cancer patients in the UK.
De novo and mendelian inherited causal genetic variations identification for bladder related phenotypes
Project Lead
Sharad Agarwal
Project Date
22/02/2023
Lay Summary
We aim to explore the genomic sequences of cancer participants from the 100,000 Genomes Project to identify genetic variations that appear to be associated with human phenotypic changes and differential drug response. In this project, our team of Dr. Sharad Agarwal, Dr. Shicheng Guo, Dr. Ramzi Temmani and Felipe Golib will apply deep learning and statistical genetics approaches to identify genomic variation associated with bladder-related cancer (muscle-invasive bladder cancer and metastatic bladder cancer) and to understand its role in disease progression as well as in drug discovery and development. We will also analyse the cancer cohort of the 100,000 Genomes Project to validate the findings from a previous GWAS that we conducted within UK Biobank. The results of this study can be used to develop targeted and personalized treatments, leading to improved outcomes for patients. Validating these findings in the participants of the cancer cohort of the 100,000 Genomes Project will further strengthen the impact of this research on healthcare.
Predictive modeling of Brain metastases in patients diagnosed with Non-small Cell Lung Cancer using clinical-genomic real-world data
Project Lead
Alvaro Ulloa
Project Date
22/02/2023
Lay Summary
Lung cancer is the leading cause of cancer-related deaths in the US. Non-small Cell Lung Cancer (NSCLC) represents approximately 84% of all lung cancers. Patients diagnosed with NSCLC and brain metastasis (BM) have poor prognoses, with median survival ranging between 4 and 6 months. Both early diagnosis and intervention may improve patient outcomes. Amivantamab + Lazertinib is a combination therapy for the treatment of certain types of NSCLC. Amivantamab is a bispecific antibody that targets EGFR and MET, while Lazertinib is a tyrosine kinase inhibitor that targets specific genetic mutations found in NSCLC. The combination of these two drugs works to block the growth and spread of cancer cells. In the MARIPOSA trial, Janssen is currently assessing the effectiveness of Amivantamab + Lazertinib in patients with advanced NSCLC who have not received any previous treatment and whose tumors harbor select EGFR mutations (exon 19 deletions or L858R). The main goal of the trial is to see how long patients can go without their cancer getting worse, which is known as progression-free survival (PFS). If the intervention is found to be effective, we hypothesize that prophylactic treatment with Amivantamab + Lazertinib, targeting patients with a high risk of developing brain metastasis within 2 years, may further improve patient outcomes. This will help oncologists make better decisions about treating these patients in the future. To support this trial, we have developed a model using clinical and genomic information from Concert AI’s Genome 360 database. In this study, Dr. Alvaro Ulloa, Dr. Amanda Zheutlin, Dr. Yi Zhang, Dr. Kaitlin Hood, Dr. Breno Neri, and Dr. Shicheng Guo plan to analyse the genomic and clinical data from the non-small cell lung cancer participants in the 100,000 Genomes Project to validate the performance estimates of this model for predicting brain metastases in patients diagnosed with NSCLC using clinical-genomic real-world data. This study will be important to support our clinical trial to prevent brain metastases from happening in these high-risk patients.
Characterisation of the contribution of genetics in influencing cancer outcomes and treatment response
Project Lead
Athena Matakidou
Project Date
21/02/2023
Lay Summary
Cancer is a disease that shows great variability in how patients respond to therapies and in their overall prognosis (survival). This variability is not limited to different cancer types but is also seen among patients with the same diagnosis, with few clinically relevant biomarkers currently available to guide treatment choices. This project aims to explore how genetic variation, whether inherited (germline) or acquired (cancer/somatic), impacts response to commonly used anti-cancer therapies and patient outcomes, by analysing the clinical and genetic data of cancer participants in the 100,000 Genomes Project, found in the National Genomic Research Library (NGRL). The results may help identify novel biomarkers of treatment response as well as provide insights into genes and pathways for the development of novel anti-cancer therapies.
Assessing the public health impact of endogenous viral elements in the human population
Project Lead
Gkikas Magiorkinis
Project Date
19/02/2023
Lay Summary
Deep analysis of whole genome sequencing (WGS) data has become a major asset in characterizing mobile genetic elements in humans. One large order of these elements, the endogenous viral elements (EVEs), consists of integrations of partial or full-length viral genomic material into the host genome. These insertions are vertically inherited and can confer an advantageous or deleterious phenotype in humans. Thus, it is of critical importance to identify those viral genetic imprints and investigate their biological functions. In this project, we aim to analyze a massive volume of human genome data in depth in order to identify and characterize novel viral integrations. Through sophisticated bioinformatic algorithms, the project will catalogue novel polymorphic EVE integrations, estimate their frequency in the human population and assess their public health impact.
Exploring the role of rare non-coding variation in breast cancer risk
Project Lead
Douglas Easton
Project Date
16/02/2023
Lay Summary
The risk of cancer, including breast cancer, can be influenced by inherited genetic changes. The human genome is made up of genes, which code for proteins, and other “non-coding” DNA that does not code for proteins. The non-coding DNA can still serve important functional roles, such as regulating nearby genes. While the role of alterations in genes (such as BRCA1) in determining the risk of cancer has been well studied, the role of the non-coding genome, particularly rare genetic changes, is much less understood. The aim of this project is to investigate the role of rare alterations in non-coding DNA in determining breast cancer risk.
Characterisation of the pan-cancer epigenomic mutational landscape
Project Lead
Simon Furney
Project Date
10/02/2023
Lay Summary
Within cells, DNA is organised and packaged together with proteins into a complex known as chromatin. Certain families of genes are responsible for controlling this packaging of DNA in chromatin. This allows genes to be expressed and translated into proteins, or repressed so that the genes are not expressed at the wrong time or in the wrong cell type. However, many of the genes that control these processes are mutated in cancer, leading to abnormal chromatin (“epigenomic”) states. This in turn leads to abnormal expression or repression of other genes, which contributes to cancer development. In this project, we aim to investigate the prevalence and consequences of mutations in the genes which control chromatin states. To do this, we will look at germline and tumour mutations in these genes in cancer patients’ genomes to try to understand which mutations are important and may lead to tumour development.
microRNA-associated variation in rare epilepsies
Project Lead
Ifeolutembi Fashina
Project Date
08/02/2023
Lay Summary
This project aims to identify classes of gene regulators that cause epilepsy. This study is part of an effort to understand how inherited changes in non-protein-coding features influence epilepsy. We need to use whole genome data from consortia such as Genomics England to test these gene regulators in different epilepsy subtypes. Successful interrogation could point molecular researchers towards druggable pathways.
Analysis of MCPH1 and CSMD1 in the 100,000 genomes cohort
Project Lead
Sandra Bell
Project Date
07/02/2023
Lay Summary
Genetic changes in the developmental genes MCPH1 and CSMD1 have been identified in a range of cancer types, particularly breast cancer. We wish to perform a comprehensive review of all MCPH1 and CSMD1 germline and somatic variants in the 100,000 Genomes cohort. We aim to determine the extent to which the MCPH1 and CSMD1 genes contribute to both neurological disorders and cancer development.
Large-scale functional characterisation of variants observed in patients
Project Lead
Matthew Hurles
Project Date
06/02/2023
Lay Summary
We are using an experimental technique, Saturation Genome Editing, to generate scores for assessing the severity of all possible genetic variants within genes. The maps we compile can be used by clinicians in the interpretation of variants of uncertain significance, facilitating diagnoses for those with genetic disease. The availability of variant data from affected populations is critical for compiling accurate variant effect maps for the genes of interest.
We plan to use the anonymised variant information from patients recruited as part of the Rare Disease Programme in GEL to establish the specificity and sensitivity of the variant maps we generate. We will identify and export de-identified DNA sequence variants present in specific genes with disease associations, along with their inheritance status. These variants will be tested in our functional assay alongside hundreds of other SNVs, and will then serve as positive control variants in the analysis of our experimental data, and thus help establish the efficacy of our maps to disambiguate previously uninterpretable variants. We anticipate performing this data extraction for approximately 10 genes per month, for a total of around 300 genes. This work will be carried out by the Hurles Group (https://www.sanger.ac.uk/group...).
Comparative medical genetics to facilitate the interpretation of rare variation
Project Lead
Bushra Haque
Project Date
02/02/2023
Lay Summary
Reading entire DNA sequences of individuals provides a comprehensive view of the unique genetic variation that contributes to a range of diseases, from rare genetic disorders to common cancers. Genetic variants create changes in the genetic code that often negatively affect the proteins that DNA codes for and cause harmful effects. However, distinguishing rare variants that cause disease from the majority that have no consequences remains a challenge. This study aims to develop and test a new approach for determining if rare variants are associated with human disease. An area that has not been investigated is the overlap between germline DNA variants that cause genetic disorders in humans and variants in other contexts. Germline variants are changes that are incorporated into the DNA of every cell in the body and that are often inherited from the egg and sperm cells at conception. Examples of variants in other contexts include somatic variants, which occur in specific cell types and cause cancer, and non-human animal variants.
This study aims to explore this further by extracting variants from databases, such as cancer mutation databases and the Online Mendelian Inheritance in Animals (OMIA). Several hundred of these disease-causing variants were extracted from these databases and will be queried in human germline databases, including ClinVar and the GEL database. Where the same variant is found in both settings, its known functional consequence in the other context can be used to interpret its impact in human genomes with germline disease. This data will help to improve the way we interpret whether novel rare variants are disease-causing in humans and improve clinical diagnosis to allow better treatment and health outcomes.
Genetic determinants for healthcare costs - genCOST consortium
Project Lead
Andrea Ganna
Project Date
31/01/2023
Lay Summary
Healthcare costs continue to rise worldwide, and in 2018, global healthcare spending reached $8.3 trillion, or 10% of the global gross domestic product. Accurate measurement of healthcare costs associated with different risk factors and health outcomes is important to prioritize public health promotion and prevention programs. Moreover, healthcare utilization, and associated healthcare costs, can be used to compare the impact of risk factors on individual health burden in a disease-agnostic manner. Thus, analysis of healthcare costs is of significant interest from an epidemiological, public health, and policy perspective.
In this project we aim to study the impact of genetic factors on healthcare costs. There are three major motivations. First, healthcare costs can provide an objective measure of morbidity. Thus, genetic associations with healthcare costs can help identify biological pathways that are implicated in overall health maintenance. Second, implementation of genetic-based screening tools in a clinical setting requires cost-effectiveness evaluations. Estimating the relationship between genetic risk factors and healthcare costs can help estimate the cost-effectiveness of novel genetic-based interventions. Third, genetic associations with healthcare costs can be used to inform the causal relationship between modifiable risk factors and healthcare costs using statistical genetic approaches.
G4 quadruplex regions of the first Intron
Project Lead
Stephen Henderson
Project Date
30/01/2023
Lay Summary
We have identified important regulatory regions of the genome called G4 quadruplexes. We wish to search through multiple cancers to see if these regions, which we know to be biologically important, may also be selectively disrupted by mutation in different types of cancer.
External validation of a clinical score estimating the pre-test probability of obtaining a molecular diagnosis using massive parallel sequencing data in adult patients with kidney diseases of unknown origin.
Project Lead
Albertien van Eerde
Project Date
20/01/2023
Lay Summary
We developed a clinical score estimating the pre-test probability of obtaining a molecular diagnosis in adult patients with kidney diseases without a genetic diagnosis. It was internally validated using a cohort of 497 patients from Sorbonne University hospitals (Paris, France), with good discriminative performance (AUROC 0.73) and calibration.
Quality Control of somatic calls from whole-genome sequencing
Project Lead
Giulio Caravagna
Project Date
11/01/2023
Lay Summary
The prelude to interpreting cancer genomes is to generate good-quality sequencing data. With the advent of high-resolution whole-genome sequencing (WGS) data we are seeing many distinct bioinformatics algorithms flourish for calling somatic mutations (and more complex alterations) from WGS data, but Quality Control (QC) algorithms for such calls are still missing. These algorithms are fundamental to distinguishing good-quality from bad-quality data, and to refining data quality until desired criteria are met. In this project we will use the wealth of pan-cancer data at Genomics England to develop and test new QC algorithms for WGS data, allowing researchers to get the best out of their sequencing datasets.
Genotyping MUCs VNTR linked to Ovarian cancer/fertility using Whole Genome Sequencing
Project Lead
Steven Conlan
Project Date
10/01/2023
Lay Summary
Mucin (MUC) proteins are large proteins that contain repeating subunits called variable number tandem repeat (VNTR) regions that are decorated with sugar molecules (glycosylated). They are found throughout the body, including in the uterus (womb) and ovaries. We are proposing to investigate the relationship between the VNTR length of two proteins, MUC1 and MUC16, and fertility and cancer, making use of the 100,000 Genomes Project.
The fertility part of the project would investigate MUC1 VNTR length linked to fertility status (number of pregnancies). Despite thorough investigation, many cases of infertility remain unexplained. Although normal embryos are transferred in most in-vitro fertilisation (IVF) procedures, a successful pregnancy occurs in only about one in five attempts. Failure of implantation of the embryo is probably the reason for the lack of success. The level of MUC1 increases after ovulation and persists during implantation. High genetic variation in the number of VNTRs is a characteristic of MUC1, and we will test whether this variation could be linked to pregnancy outcome.
In the ovarian cancer part of the project we would look at VNTR length in MUC16 as it is the ovarian cancer biomarker known as CA125. CA125 is not always a reliable diagnostic marker for ovarian cancer (although it is the best marker we have, and is excellent in monitoring ovarian cancer treatment response). We think that false positive CA125 tests could be due to VNTR length (genetic variation), and knowing the relationship between MUC16 VNTR length and MUC16 (CA125) levels could give a better risk score for ovarian cancer diagnosis/detection.
Additional Findings Evaluation: health economic analysis
Project Lead
Lyn Chitty
Project Date
09/01/2023
Lay Summary
100,000 Genomes Project participants were asked if they wanted additional health information to be looked for in their genome sequence, known as 'additional findings'. For those who said yes, gene alterations or ‘spelling mistakes’ were looked for in a specific list of genes that could increase the risk of developing certain health conditions. These alterations are rare - it is expected that about 1 in 100 participants will have one of these findings - but for this small number of people, steps can be taken to reduce the likelihood of the health condition developing, or the condition can be treated or monitored.
The process of looking for 'additional findings' and the steps or treatments which may follow a new finding will come with extra costs to the NHS and the patient. It may also bring benefits from taking early action to treat, or reduce the risk of, a health condition. It is important to assess both the potential costs and benefits to help decide whether or not we should look for 'additional findings' in NHS patients, including whether it will be a good use of taxpayer money. This project will use standard health economics methods to carry out this assessment.