
Data Engineer

London, England

Company Description

Genomics England successfully led the world-leading 100,000 Genomes Project, which compared and analysed individuals’ genetic codes to help diagnose, treat and prevent illness.

We're now accelerating our impact, working with the NHS to further develop and embed genomic healthcare and research in Britain. Our next chapter involves working with patients, doctors, scientists, government and industry to improve genomic testing, and help researchers access the health data and technology they need to make new medical discoveries and create more effective, targeted medicines for everybody.

Job Description

We are looking for experienced data engineers to join our growing data family at Genomics England. You will be part of multidisciplinary squads and teams delivering a step change in the way that healthcare is delivered, ensuring that patients receive improved diagnosis and care. We use the best in science and technology to deliver genetic insights for personalised medicine.

You will get to work with truly large-scale clinical and genomic data. We currently hold over 40 petabytes of structured and unstructured data. Our organisation is dynamic, agile and full of talented people working at the cutting edge of science. We look forward to having you join our organisation.

Key Responsibilities:

  • Automating and optimising big data pipelines on structured, semi-structured and unstructured data
  • Continuous integration and deployment of end-to-end data delivery
  • Acquisition, ingestion, transformation, curation and productisation of data assets
  • Thought leadership and innovation for data delivery
  • Supporting the business in migrating to a scalable cloud architecture (AWS)

Key Skills:

  • Experience with tooling for data manipulation in programming languages, low-code ETL tooling and/or data visualisation software
  • Strong programming skills against cloud-based pipelines
  • Experience in co-design of curated data products from raw data assets working with business users to meet business needs
  • Experience of data modelling and developing/reverse-engineering data assets and the necessary components for data model conformance
  • Experience developing, optimising and automating data extract, transform and load routines to create coherent, high-quality, comprehensive curated data
  • Experience with developing scheduled data flows using an integration/workflow engine, including message management and troubleshooting, log integration and complex data lineage
  • Strong AWS experience preferred
  • Healthcare experience helpful

Example tooling:

  • Cloud: AWS, Azure
  • ETL: AWS Glue, Trifacta, KNIME
  • Metadata & master data management: WhiteRabbit
  • Data models: XML, JSON, HL7 FHIR, OMOP
  • Databases: AWS S3 & Athena, AWS DynamoDB, AWS RDS, AWS Aurora (Postgres)
  • Continuous deployment: Jenkins, AWS Lambda, Docker, Kubernetes
  • Programming languages: Python, R, SQL
  • Visualisation software: Tableau
  • Machine learning: Regression, decision trees, SVM, Bayes, NLP
  • Practices: DMBOK2, Continuous Integration/Continuous Deployment (CI/CD)


Ideally, you will hold a Master’s degree or have equivalent experience working in data management, biostatistics, clinical informatics or data analysis.

Additional Information

Originally conceived as a project, Genomics England has transformed to meet the long-term opportunities created by our scientific breakthroughs in understanding the human genome. Being part of this journey is a reward in itself, but we're also pleased to offer our colleagues a great benefits package including:

  • competitive salary
  • 30 days holiday
  • generous pension scheme
  • individual learning budgets for every colleague
  • a raft of other benefits

Talk to our Talent Team and find out how a career with Genomics England will benefit you.


As part of our recruitment process, all successful candidates are subject to a Standard Disclosure and Barring Service (DBS) check. We therefore require applicants to disclose any previous offences at the point of application, as some unspent convictions may mean we are unable to proceed with your application due to the nature of our work in healthcare.

Genomics England operate a blended working model, as we know our people appreciate the flexibility. We expect most people to come into the office at least twice a month. However, this will vary according to role and will be agreed with your team leader. For some people this is one day a quarter; for others it is several days a week. There is no expectation that staff will return to the office full time unless they want to. The exception is roles that require you to be on site full time, such as lab teams and the reception team.

Our teams and squads have reflected, and will continue to reflect, on what works best for them to work together successfully, and have the freedom to design working patterns to suit, beyond the minimum. Our offices are currently in Cambridge and Farringdon (London), and in autumn 2022 we are relocating our London office to Canary Wharf. We will also be expanding our regional offices.

Looking ahead to our move to Canary Wharf, we will be designing our new space with blended working in mind, and with the flexibility to adapt to changing work patterns. During the pandemic we will be following government guidance on working from home.


Diversity and Inclusion is important to Genomics England. We want an organisation where you can bring your whole self and feel welcome and included in our mission and workplace. We use the answers to these questions completely anonymously to track and report on our progress in attracting and hiring a diverse workforce. Your answers to these questions are not visible to recruiters or hiring managers and are not used in the selection process.
