Finding genetic causes for rare conditions using the ‘Rareservoir’
By Florence Cornish on
This blog features research from our research community, completed using data donated by participants to the National Genomic Research Library.
A recent study found 3 new genetic causes for rare conditions using a database called the Rareservoir. The Rareservoir is part of a newly designed process for handling data, which lets us identify rare gene variants more efficiently than before.
Background
At present, there are 10,000 catalogued rare conditions. Collectively, they affect 1 in 20 people around the globe. Despite this, fewer than half of these catalogued rare conditions have a resolved genetic cause.
The human genome refers to all of the DNA present in an individual person. By studying the genomes of large, diverse groups of people with rare conditions, we may gain valuable insights into what might be causing particular conditions. While this could help increase rates of diagnosis, the approach comes with several challenges.
For example, the human genome sequence is made up of 3 billion letters. This means that full genome data from large groups needs specialised storage, which is costly, and processing the data can also be expensive.
In addition, recent data frameworks are often designed to detect both rare and common gene variants in each person’s genome. This makes it difficult to study rare conditions alone.
Finally, to accommodate large sets of data, we require multiple storage systems and different types of software. This adds complications when attempting to use the data in research.
What did this study aim to do?
The overall aim of this study was to improve our understanding of the genetic causes of rare conditions. The researchers, Greene et al., addressed the challenges of combining data from large-scale genome sequencing with data from the clinic, using a database they created called the Rareservoir.
The Rareservoir is a database that contains information from over 77,000 participants of the 100,000 Genomes Project. It is part of a powerful new framework, designed by researchers to combat the difficulties of handling large amounts of genomic data.
This work could bring us one step closer to standard genome sequencing in the clinic, and ultimately a better understanding of rare condition causes.
What did the study find?
While the Rareservoir contains data from thousands of participants, it only stores the rare genetic variants from each person’s genome. This means that the amount of data stored is greatly reduced, and can all exist in one central system. This makes the data easier to manage, which is especially useful for larger studies.
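To picture how keeping only rare variants shrinks the data, here is a minimal sketch in Python. It is an illustration only, not the researchers' actual method: the `Variant` record, the `keep_rare` function, the gene names and the 0.001 frequency cut-off are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One genetic variant seen in one participant (hypothetical record layout)."""
    participant_id: str
    gene: str
    allele_frequency: float  # how common the variant is in the population, from 0 to 1


# Hypothetical cut-off: the blog does not state the threshold used by the study.
RARE_THRESHOLD = 0.001


def keep_rare(variants, threshold=RARE_THRESHOLD):
    """Return only the variants whose population frequency is below the threshold."""
    return [v for v in variants if v.allele_frequency < threshold]


# Made-up example data: two common variants and two rare ones.
variants = [
    Variant("P1", "GENE_A", 0.25),
    Variant("P1", "GENE_B", 0.0002),
    Variant("P2", "GENE_A", 0.25),
    Variant("P2", "GENE_C", 0.00005),
]

rare = keep_rare(variants)
print(f"Stored {len(rare)} of {len(variants)} variants")
```

In this toy example, half of the records are dropped before anything is stored. In a real genome, most of the variants a person carries are common ones, so a filter of this kind would be expected to reduce the stored data far more.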
Using the Rareservoir, researchers investigated 269 rare conditions across participants in the database. They identified 260 genetic associations that link to particular conditions, 19 of which were previously unknown.
Out of the 19, 3 genetic associations were selected for extra investigation. Here’s what the researchers found.
1. A gene for primary lymphoedema
The first genetic association selected relates to primary lymphoedema. Primary lymphoedema is a group of genetic conditions that cause chronic swelling throughout tissues in the body.
Using the Rareservoir, researchers identified an association between a gene called ERG and primary lymphoedema.
A non-faulty ERG gene plays a vital role in blood vessel formation; however, not much is known about ERG and the lymphatic system.
The lymphatic system is a network of vessels that maintains the amount of fluid in the body. After identifying the rare ERG variant, researchers confirmed that ERG is also important for maintaining the structure of lymphatic vessels.
When the ERG gene doesn’t function, lymphatic vessels cannot form properly; fluid cannot be sufficiently drained and chronic swelling occurs.
2. A gene for Loeys-Dietz syndrome
The researchers identified an association between Loeys-Dietz syndrome and a gene called PMEPA1. Loeys-Dietz syndrome is a condition that affects connective tissue, such as bones and tissue of the organs.
Researchers found that when faulty PMEPA1 genes produce a shorter-than-normal protein, this can result in Loeys-Dietz syndrome. All patients with this faulty version exhibited skeletal features, such as scoliosis, abnormally long fingers and toes, and deformities in the chest wall. Some patients also displayed thoracic aortic aneurysm disease (swelling in the aorta, the main blood vessel leaving the heart).
Researchers found that PMEPA1 plays a vital role in a pathway that is known to be associated with several aortic diseases, including Loeys-Dietz.
3. A gene for congenital hearing impairment
Finally, researchers found that non-functioning variants in a gene called GPR156 are associated with congenital hearing impairment, meaning hearing impairment from birth.
GPR156 has recently been identified as an important regulator of structures in the ear called stereocilia. Stereocilia are small extensions on hair cells of the ear, essential for hearing and balance.
Findings suggest that when people inherit 2 faulty copies of GPR156, stereocilia do not form properly, leading to impaired hearing ability.
What does the Rareservoir mean for patients?
While talk of data and data processing may seem far removed from patients in the clinic, the Rareservoir offers great promise for shortening the diagnostic odyssey.
The more efficiently we can look through genomic data, the more understanding we can gain about genetic causes of conditions. This allows clinicians to provide more specific, effective treatment and management options to patients with rare conditions.
None of the 3 genes identified here was previously known to be linked with disease. By using data from the Rareservoir, researchers found new gene-disease associations, helping to reduce the gap in our knowledge.
It is important to remember that all data within the Rareservoir was volunteered by participants of the 100,000 Genomes Project. The findings here show how important the contributions of participants are, and how much they can help to achieve.
Next steps
Overall, this study presents a new, efficient way to identify unknown genetic causes behind rare conditions. Findings show that genome sequencing, if paired with a powerful enough framework, could provide answers to those currently without a genetic diagnosis.
Out of the 19 new gene-disease associations found in the study, 3 were confirmed. The next step is to examine the remaining 16 with further experiments, before they can be considered as causes.
In addition, other regions of the genome could also be explored using approaches similar to this one. The Rareservoir database, together with efficient data processing, offers a promising future for rare condition diagnosis.
Endnote
This research was produced by the Genomics England Clinical Interpretations Partnership, or GECIP for short. These partnerships are formed by academics, clinicians and students, who join together to form research communities.