Sequencing a genome

Your genome is your unique sequence of DNA. It is over 3 billion letters long. It is found in almost every cell in your body.

Collecting DNA

People take part in the 100,000 Genomes Project at NHS Genomic Medicine Centres. They donate a sample of DNA, usually from a small blood sample of about 5 ml (a teaspoon). Cancer patients also donate a small piece of their tumour.

DNA is taken from the samples at NHS hospital laboratories. The DNA sample is then stored at the national biorepository.

Sequencing

An Illumina HiSeqX sequencing machine.

There are different methods and machines that can sequence genomes.

In the 100,000 Genomes Project, DNA is sequenced by our partners at Illumina. One human genome can be sequenced in about a day, though the analysis takes much longer.

DNA sequencing machines cannot sequence the whole genome in one go. Instead, they sequence the DNA in short pieces, around 150 letters long. Each of these short sequences is called a ‘read’.
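As a rough illustration of what a set of reads looks like, the sketch below cuts fixed-length pieces from random positions in a made-up sequence. It is a toy model, not how a sequencing machine works: the function name `simulate_reads`, the 1,000-letter random "genome" and the error-free reads are all assumptions made for the example.

```python
import random

def simulate_reads(genome, read_length=150, n_reads=10, seed=0):
    """Cut n_reads pieces of read_length letters from random positions in genome."""
    if read_length > len(genome):
        raise ValueError("read_length must not exceed the sequence length")
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    reads = []
    for _ in range(n_reads):
        # Pick a start position that leaves room for a full-length read.
        start = rng.randint(0, len(genome) - read_length)
        reads.append(genome[start:start + read_length])
    return reads

# A made-up 1,000-letter sequence stands in for a genome (assumption for the demo).
letters = random.Random(42)
toy_genome = "".join(letters.choice("ACGT") for _ in range(1000))

reads = simulate_reads(toy_genome, read_length=150, n_reads=5)
for read in reads:
    print(len(read), read[:20] + "...")
```

Each read on its own carries no record of where it came from, which is why the next step has to work out where every read belongs.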

Mapping

The reads from the sequencing machine are matched to a ‘reference genome sequence’. This is done by ‘mapping’ software on high performance computers. The software finds where each read belongs on the genome.
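The idea of finding where a read belongs can be sketched in a few lines. The example below is a deliberately simplified illustration, not the software used in the project: it indexes every 8-letter stretch of a short made-up reference, looks up the start of the read in that index, and accepts a placement if only a few letters differ. The names `build_index` and `map_read`, the seed length `k` and the mismatch limit are all assumptions for the example. Real mapping software uses far more sophisticated indexing and also copes with inserted and deleted letters.

```python
from collections import defaultdict

def build_index(reference, k=8):
    """Record every position at which each k-letter stretch of the reference starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index

def map_read(read, reference, index, k=8, max_mismatches=2):
    """Return (position, mismatches) for the best placement of read, or None."""
    best = None
    # Candidate positions are those where the first k letters match exactly.
    for pos in index.get(read[:k], []):
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # the read would run off the end of the reference
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (pos, mismatches)
    return best

# Made-up reference sequence (assumption for the demo).
reference = "ACGTTAGCCGATAGGCTTAACGGATCCGTAGCTAGGATCCA"
index = build_index(reference, k=8)

# Take 14 letters starting at position 13, then change one letter
# to imitate a difference between the sample and the reference.
read = reference[13:27]
read = read[:10] + "T" + read[11:]

placement = map_read(read, reference, index)
print(placement)
```

The read is placed back at position 13 with a single mismatch. That mismatch is exactly the kind of difference the later analysis step looks at.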

The reference sequence is used by scientists world-wide. It is a representative example of a human genome sequence. It is made up of DNA sequences from 13 anonymous donors, so it does not belong to any single person. The reference sequence came out of the original Human Genome Project, which published a draft sequence in 2001 and was declared complete in 2003.

The position of most of our genes is known, and is shown on the reference sequence. The next step is to identify the differences between your genome and the reference.

Analysis

Every person's genome differs from the reference sequence in millions of places. These differences are called ‘variants’. A variant might be a single changed letter, or a string of letters that is missing or in a different place. Most variants are completely harmless; they are the reason we are different from each other.
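Spotting differences can be illustrated by comparing two sequences letter by letter. The sketch below assumes the sample has already been lined up against the reference, with `-` marking a gap in either sequence. The function name `find_variants`, the tuple layout and the two short sequences are made up for the example, and it ignores complications such as sequencing errors and letters that have moved to a different place.

```python
def find_variants(reference_aligned, sample_aligned):
    """List differences between two aligned sequences of equal length.

    Each variant is (reference_position, kind, reference_letter, sample_letter),
    where reference_position counts letters of the reference only, from 0.
    """
    if len(reference_aligned) != len(sample_aligned):
        raise ValueError("aligned sequences must have the same length")
    variants = []
    ref_pos = 0
    for ref_letter, sample_letter in zip(reference_aligned, sample_aligned):
        if ref_letter == "-":
            # Extra letter in the sample, sitting just before ref_pos.
            variants.append((ref_pos, "insertion", "-", sample_letter))
            continue
        if sample_letter == "-":
            variants.append((ref_pos, "deletion", ref_letter, "-"))
        elif sample_letter != ref_letter:
            variants.append((ref_pos, "substitution", ref_letter, sample_letter))
        ref_pos += 1
    return variants

# Made-up aligned sequences (assumption for the demo).
reference_aligned = "ACGTACGT-AC"
sample_aligned    = "ACCTA-GTTAC"

variants = find_variants(reference_aligned, sample_aligned)
for variant in variants:
    print(variant)
```

For these two toy sequences the function reports one changed letter, one missing letter and one extra letter.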

Some differences could be causing a disease. Scientists use a range of software to filter millions of differences down to just a few that could be harmful.
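Filtering can be pictured as applying a series of tests to each variant and keeping only those that pass them all. The sketch below is hypothetical: the `Variant` fields, the three criteria (inside a gene, predicted to be damaging, rare in the population) and the 1% frequency cut-off are assumptions chosen for illustration, not the rules used in the 100,000 Genomes Project.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variant:
    # All fields are hypothetical and exist only for this illustration.
    position: int
    population_frequency: float  # share of people who carry this variant
    in_gene: bool                # does it fall inside a known gene?
    predicted_damaging: bool     # does a prediction tool flag it?

def shortlist(variants, max_frequency=0.01):
    """Keep variants that are rare, inside a gene and predicted to be damaging."""
    return [
        v for v in variants
        if v.in_gene
        and v.predicted_damaging
        and v.population_frequency <= max_frequency
    ]

# Made-up variants (assumption for the demo).
all_variants = [
    Variant(position=1200, population_frequency=0.35, in_gene=True, predicted_damaging=False),
    Variant(position=5400, population_frequency=0.002, in_gene=False, predicted_damaging=False),
    Variant(position=8100, population_frequency=0.0005, in_gene=True, predicted_damaging=True),
    Variant(position=9900, population_frequency=0.20, in_gene=True, predicted_damaging=True),
]

candidates = shortlist(all_variants)
print(candidates)
```

Only the variant at position 8100 passes all three tests here. In practice the list starts with millions of variants, and the choice of tests depends on the condition being investigated.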

Any change that is likely to be the cause of someone’s symptoms or disease is given back to the NHS. They then confirm the result in their laboratories. The findings and any implications are then discussed with the patient. Find out more about results in the 100,000 Genomes Project.

If it is not clear that a change is causing disease, it is sent to researchers for further analysis.

Bioinformatics

Bioinformatics is the science of collecting and analysing complex biological data, such as genomic data.

Bioinformaticians are scientists who specialise in analysing genomic or other biological data. They develop methods and software tools to understand and interpret genomic data. They may have studied biology, engineering, computing or maths, and have training in bioinformatics.

Infographic showing how a human genome is sequenced.

Scientist loading a DNA sample onto a sequencing machine. Cambridge, UK
