Abstract

In response to the escalating SARS-CoV-2 pandemic, in March 2020 the COVID-19 Genomics UK (COG-UK) consortium was established to enable national-scale genomic surveillance in the UK. By the end of 2020, 49% of all SARS-CoV-2 genome sequences globally had been generated as part of the COG-UK programme, and to date, this system has generated >3 million SARS-CoV-2 genomes. Rapidly and reliably analysing this unprecedented number of genomes was an enormous challenge. To fulfil this need and to inform public health decision-making, we developed a centralized pipeline that performs quality control, alignment, and variant calling and provides the global phylogenetic context of sequences. We present this pipeline and describe how we tailored it as the pandemic progressed to scale with the increasing amounts of data and to provide the most relevant analyses on a daily basis.

Rights

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Cite as

Colquhoun, R., O’Toole, Á., Hill, V., Yu, X., Poplawski, R., Whalley, T., Groves, N., Ellaby, N., Loman, N., Connor, T., Rambaut, A., McCrone, J. & Nicholls, S. 2024, 'A phylogenetics and variant calling pipeline to support SARS-CoV-2 genomic epidemiology in the UK', Virus Evolution, 10(1), article no: veae083. https://doi.org/10.1093/ve/veae083

Last updated: 15 October 2025