Addressing pandemic-wide systematic errors in the SARS-CoV-2 phylogeny

Hunt, Martin and Hinrichs, Angie S. and Anderson, Daniel and Karim, Lily and Dearlove, Bethany L. and Knaggs, Jeff and Constantinides, Bede and Fowler, Philip W. and Rodger, Gillian and Street, Teresa and Lumley, Sheila and Webster, Hermione and Sanderson, Theo and Ruis, Christopher and Kotzen, Benjamin and de Maio, Nicola and Amenga-Etego, Lucas N. and Amuzu, Dominic S. Y. and Avaro, Martin and Awandare, Gordon A. and Ayivor-Djanie, Reuben and Barkham, Timothy and Bashton, Matthew and Batty, Elizabeth M. and Bediako, Yaw and De Belder, Denise and Benedetti, Estefania and Bergthaler, Andreas and Boers, Stefan A. and Campos, Josefina and Carr, Rosina Afua Ampomah and Chen, Yuan Yi Constance and Cuba, Facundo and Dattero, Maria Elena and Dejnirattisai, Wanwisa and Dilthey, Alexander and Duedu, Kwabena and Endler, Lukas and Engelmann, Ilka and Francisco, Ngiambudulu M. and Fuchs, Jonas and Gnimpieba, Etienne Z. and Groc, Soraya and Gyamfi, Jones and Heemskerk, Dennis and Houwaart, Torsten and Hsiao, Nei-yuan and Huska, Matthew and Hölzer, Martin and Iranzadeh, Arash and Jarva, Hanna and Jeewandara, Chandima and Jolly, Bani and Joseph, Rageema and Kant, Ravi and Ki, Karrie Ko Kwan and Kurkela, Satu and Lappalainen, Maija and Lataretu, Marie and Lemieux, Jacob and Liu, Chang and Malavige, Gathsaurie Neelika and Mashe, Tapfumanei and Mongkolsapaya, Juthathip and Montes, Brigitte and Mora, Jose Arturo Molina and Morang’a, Collins M. and Mvula, Bernard and Nagarajan, Niranjan and Nelson, Andrew and Ngoi, Joyce M. and da Paixão, Joana Paula and Panning, Marcus and Poklepovich, Tomas and Quashie, Peter K. and Ranasinghe, Diyanath and Russo, Mara and San, James Emmanuel and Sanderson, Nicholas D. 
and Scaria, Vinod and Screaton, Gavin and Sessions, October Michael and Sironen, Tarja and Sisay, Abay and Smith, Darren and Smura, Teemu and Supasa, Piyada and Suphavilai, Chayaporn and Swann, Jeremy and Tegally, Houriiyah and Tegomoh, Bryan and Vapalahti, Olli and Walker, Andreas and Wilkinson, Robert J. and Williamson, Carolyn and Zair, Xavier and Biere, Barbara and Dürrwald, Ralf and Mache, Christin and Oh, Djin-Ye and Schulze, Jessica and Wedde, Marianne and Wolff, Thorsten and Fuchs, Stephan and Semmler, Torsten and Paraskevopoulou, Sofia and Kerber, Romy and Kröger, Stefan and Haas, Walter and Bode, Konrad and Corman, Victor and Erren, Michael and Finzer, Patrick and Grosser, Roger and Haffner, Manuel and Hermann, Beate and Kiel, Christina and Krumbholz, Andi and Lorentz, Thomas and Meinck, Kristian and Nitsche, Andreas and Petzold, Markus and Schwanz, Thomas and Szabados, Florian and Tewald, Friedemann and Tiemann, Carsten and de Oliveira, Tulio and Peto, Timothy EA and Crook, Derrick and Corbett-Detig, Russell and Iqbal, Zamin (2026) Addressing pandemic-wide systematic errors in the SARS-CoV-2 phylogeny. Nature Methods. ISSN 1548-7091

s41592-025-02947-1.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

The majority of SARS-CoV-2 genomes obtained during the pandemic were derived by amplifying overlapping windows of the genome (‘tiled amplicons’), reconstructing their sequences and fitting them together. This leads to systematic errors in genomes unless the software is aware of both the amplicon scheme and the error modes of amplicon sequencing. Additionally, over time, amplicon schemes need to be updated as new mutations in the virus interfere with the primer binding sites at the ends of amplicons. Thus, waves of variants swept the world during the pandemic and were followed by waves of systematic errors in the genomes, which had significant impacts on the inferred phylogenetic tree.

Here we reconstruct the genomes from all public data as of June 2024 using an assembly tool called Viridian (https://github.com/iqbal-lab-org/viridian), developed to rigorously process amplicon sequence data. With these high-quality consensus sequences we provide a global phylogenetic tree of 4,471,579 samples, viewable at https://viridian.taxonium.org. We provide simulation and empirical validation of the methodology, and quantify the improvement in the phylogeny.

Item Type: Article
Identification Number: 10.1038/s41592-025-02947-1
Dates: Accepted 27 October 2025; Published Online 9 February 2026
Subjects: CAH03 - biological and sport sciences > CAH03-01 - biosciences > CAH03-01-02 - biology (non-specific)
Divisions: Life and Health Sciences > Life and Sports Sciences
Depositing User: Gemma Tonks
Date Deposited: 11 Feb 2026 12:40
Last Modified: 11 Feb 2026 12:40
URI: https://www.open-access.bcu.ac.uk/id/eprint/16859
