- Published: 05 July 2022
- Journal article
gofasta: Command-line utilities for genomic epidemiology research
- Authors
- Source: Bioinformatics
Abstract
Summary:
gofasta comprises a set of command-line utilities for handling alignments of short assembled genomes in a genomic epidemiology context. It was developed for processing large numbers of closely related SARS-CoV-2 viral genomes, and should be useful with other densely sampled pathogen genomic datasets. It provides functions to convert sam-format pairwise alignments between assembled genomes to fasta format; to annotate mutations in multiple sequence alignments; and to extract sets of sequences by genetic distance measures for use in outbreak investigations.
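To make the idea of extracting sequences by genetic distance concrete, the following is a minimal Python sketch of a nearest-neighbour search over an alignment. It is not gofasta's implementation and does not call gofasta. The function names `snp_distance` and `closest`, the sample names, and the choice to skip sites where either sequence has an ambiguous base or a gap are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumption: only sites where both sequences carry A, C, G or T are compared;
# N, other ambiguity codes and gap characters are skipped.
UNAMBIGUOUS = set("ACGT")


def snp_distance(a: str, b: str) -> int:
    """Count aligned sites where both bases are unambiguous and differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x in UNAMBIGUOUS and y in UNAMBIGUOUS and x != y
    )


def closest(query: str, alignment: Dict[str, str], n: int = 1) -> List[Tuple[str, int]]:
    """Return the n sequence names nearest to the query, with their distances."""
    scored = [(name, snp_distance(query, seq)) for name, seq in alignment.items()]
    # Sort by distance, then by name so that ties resolve deterministically.
    scored.sort(key=lambda item: (item[1], item[0]))
    return scored[:n]


# Hypothetical toy alignment: three 10-base sequences.
alignment = {
    "sample_A": "ACGTACGTAC",
    "sample_B": "ACGTACGTAT",
    "sample_C": "ACNTTCG-AT",
}
query = "ACGTACGTAC"
nearest = closest(query, alignment, n=2)
print(nearest)
```

A real tool working with very large alignments would need a more compact sequence encoding and a deliberate policy for ambiguity codes. This sketch only shows the shape of the computation.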
Availability and Implementation:
gofasta is an open-source project distributed under the MIT license. Binaries are available at https://github.com/virus-evolution/gofasta, from Bioconda, and through the Go programming language’s package management system. Source code and further documentation, including walkthroughs for common use cases, are available on the GitHub repository.
Rights
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Cite as
Jackson, B. 2022, 'gofasta: Command-line utilities for genomic epidemiology research', Bioinformatics, article no: btac424. https://doi.org/10.1093/bioinformatics/btac424