
Diverse Data

The initiative aims to reduce health inequalities and improve patient outcomes within genomic medicine.


The need for diverse data

To date, studies of human genetics have largely focused on populations of European ancestry, which has left many other populations underrepresented in genomics research.

Europeans represent 78% of people in genome-wide association studies (GWAS)

Only 10% of GWAS participants are of Asian ancestry

2% are African, 1% are Hispanic, and all other ethnicities represent <1%

Polygenic risk scores are 4.5x more accurate for people of European ancestry than for people of African ancestry

How diverse data might be missed

Art by Stef Posavec

Data structures and systems can create situations where data doesn’t fit neatly, is deleted, or is relegated to ‘other’. An acute example of this in genomics is how to handle data from people of mixed ancestries.

The cube represents the data environment or system, and the circles represent information that fits within the data structures. The circles bouncing outside depict information that doesn’t fit the data structures and environments neatly.
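As a rough illustration of how a data structure can push information into ‘other’, the sketch below compares a field that accepts only one ancestry label with a field that keeps every reported ancestry. The category list, function names, and sample records are all hypothetical and are not drawn from any real system or dataset.

```python
from collections import Counter

# Hypothetical fixed category list, for illustration only.
ALLOWED = {"European", "Asian", "African", "Hispanic"}


def single_label(ancestries):
    """Force a record into one slot; anything that doesn't fit becomes 'Other'."""
    if len(ancestries) == 1 and ancestries[0] in ALLOWED:
        return ancestries[0]
    return "Other"


def multi_label(ancestries):
    """Keep every reported ancestry (sorted), so nothing is discarded."""
    return tuple(sorted(ancestries))


# Made-up example records: two of the four report mixed ancestry.
records = [
    ["European"],
    ["African", "European"],
    ["Asian"],
    ["Asian", "Hispanic"],
]

single_counts = Counter(single_label(r) for r in records)
multi_counts = Counter(multi_label(r) for r in records)

print(single_counts)
print(multi_counts)
```

Under the single-label field, both mixed-ancestry records collapse into ‘Other’ and the detail is lost. The multi-label field keeps each combination as its own entry, at the cost of more categories to handle downstream.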
