SARS-CoV-2 evolution threatens vaccine- and natural infection–derived immunity and the efficacy of therapeutic antibodies. To improve public health preparedness, we sought to predict which existing amino acid mutations in SARS-CoV-2 might contribute to future variants of concern. We tested the predictive value of features comprising epidemiology, evolution, immunology, and neural network–based protein sequence modeling and identified primary biological drivers of SARS-CoV-2 intrapandemic evolution. We found evidence that ACE2-mediated transmissibility and resistance to population-level host immunity have each waxed and waned as a primary driver of SARS-CoV-2 evolution over time. We retrospectively identified, with high accuracy (area under the receiver operating characteristic curve = 0.92 to 0.97), mutations that would go on to spread, up to 4 months in advance, across different phases of the pandemic. The behavior of the model was consistent with a plausible causal structure in which epidemiological covariates combine the effects of diverse and shifting drivers of viral fitness. We applied our model to forecast mutations that will spread in the future and to characterize how these mutations affect the binding of therapeutic antibodies. These findings demonstrate that it is possible to forecast the driver mutations that could appear in emerging SARS-CoV-2 variants of concern. We validated this result against Omicron, showing elevated predictive scores for its component mutations before emergence and a rapid increase in scores across daily forecasts during emergence. This modeling approach may be applied to any rapidly evolving pathogen with sufficiently dense genomic surveillance data, such as influenza, and to as-yet-unknown future pandemic viruses.
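
For readers unfamiliar with the accuracy metric quoted above, the following is a minimal sketch of how an area under the receiver operating characteristic curve (AUROC) can be computed for a set of mutation forecasts. It is not the authors' pipeline: the function name `auroc`, the toy labels (1 = the mutation later spread, 0 = it did not), and the toy scores are all hypothetical and chosen only for illustration. The sketch uses the pairwise interpretation of AUROC: the probability that a randomly chosen spreading mutation receives a higher score than a randomly chosen non-spreading one.

```python
def auroc(labels, scores):
    """Pairwise AUROC: fraction of (positive, negative) pairs in which the
    positive example has the higher score; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the paper): 1 = mutation later spread.
labels = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.91, 0.78, 0.80, 0.66, 0.40, 0.35, 0.20, 0.10]

toy_auc = auroc(labels, scores)
print(f"toy AUROC = {toy_auc:.3f}")
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported range of 0.92 to 0.97 indicates that spreading mutations were ranked above non-spreading ones in the large majority of such pairs.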


This work is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, which permits unrestricted use, distribution, and reproduction in any medium, provided that the original work is properly cited. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. This license does not apply to figures/photos/artwork or other content included in the article that is credited to a third party; obtain authorization from the rights holder before using this material.

Cite as

Maher, M., Bartha, I., Weaver, S., Di Iulio, J., Ferri, E., Soriaga, L., Lempp, F., Hie, B., Bryson, B., Berger, B., Robertson, D., Snell, G., Corti, D., Virgin, H., Kosakovsky Pond, S. & Telenti, A. 2022, 'Predicting the mutational drivers of future SARS-CoV-2 variants of concern', Science Translational Medicine, 14(633). https://doi.org/10.1126/scitranslmed.abk3445

Last updated: 08 November 2022