
Genomics England supports the development of new AI tool aiding long-read sequencing

Genomics England researchers have played a key role in the development of a new artificial intelligence tool, known as SAVANA, that uses long-read sequencing data to find specific changes within a person’s DNA linked to cancer. The tool can quickly and accurately analyse genetic information from patient samples to help diagnose cancer and inform treatment approaches. 

Long-read sequencing technologies analyse long, continuous stretches of DNA. These methods have the potential to improve researchers’ ability to detect complex genetic changes in cancer genomes. However, the complex structure of cancer genomes means that standard analysis tools can produce false-positive results and unreliable interpretations of the data. These misleading results can compromise understanding of how tumours evolve and respond to treatment, and can limit the ability to diagnose and treat cancer.

The research was led by EMBL’s European Bioinformatics Institute (EMBL-EBI) and published in Nature Methods. SAVANA was developed and tested by researchers at the Institute alongside Genomics England, in collaboration with clinical partners at University College London, the Royal National Orthopaedic Hospital, Instituto de Medicina Molecular João Lobo Antunes, and Boston Children’s Hospital.

The research team compared SAVANA’s results from long-read data with results for the same samples from a whole-genome sequencing analysis pipeline used to deliver clinical reports. The findings were highly consistent across technologies, demonstrating that SAVANA performs on par with current clinical standards while revealing additional cancer-relevant alterations.

Genomics England explored SAVANA’s use as part of its work on the clinical potential of long-read sequencing technology to support earlier, faster diagnosis of cancer. It also provided data for researchers to check the accuracy of the new tool.

“Using SAVANA can help clinicians receive accurate and reliable genomic data, enabling them to confidently integrate advanced genomic sequencing methods such as long-read sequencing into routine patient care.” 

Greg Elgar

Director of Sequencing R&D at Genomics England

 “Because other analysis tools are not developed to account for the particularities of cancer genomics data, they often pick up false positives that could lead to incorrect clinical and biological interpretations. 

“SAVANA changes this. By training the algorithm directly on long-read sequencing data from cancer samples, we created a new method that can tell the difference between true cancer-related genomic alterations and sequencing artefacts.” 

Isidro Cortes-Ciriano

Group Leader at EMBL-EBI

SAVANA is also being deployed as part of nationwide initiatives, such as the UK Stratified Medicine Paediatrics project funded by Cancer Research UK and Children with Cancer UK. This project aims to develop more successful and easier treatments for childhood cancers, using advanced sequencing technologies to better understand tumour biology and monitor disease recurrence.

Media contact

[email protected]
