Phylogenetic placement of exact amplicon sequences improves associations with clinical information

Stefan Janssen, Daniel McDonald, Antonio Gonzalez, Jose A. Navas-Molina, Lingjing Jiang, Zhenjiang Zech Xu, Kevin Winker, Deborah M. Kado, Eric Orwoll, Mark Manary, Siavash Mirarab, Rob Knight

Research output: Contribution to journal (Article)

10 Citations (Scopus)

Abstract

Recent algorithmic advances in amplicon-based microbiome studies enable the inference of exact amplicon sequence fragments. These new methods enable the investigation of sub-operational taxonomic units (sOTU) by removing erroneous sequences. However, short (e.g., 150-nucleotide [nt]) DNA sequence fragments do not contain sufficient phylogenetic signal to reproduce a reasonable tree, introducing a barrier in the utilization of critical phylogenetically aware metrics such as Faith's PD or UniFrac. Although fragment insertion methods do exist, those methods have not been tested for sOTUs from high-throughput amplicon studies in insertions against a broad reference phylogeny. We benchmarked the SATé-enabled phylogenetic placement (SEPP) technique explicitly against 16S V4 sequence fragments and showed that it outperforms the conceptually problematic but often-used practice of reconstructing de novo phylogenies. In addition, we provide a BSD-licensed QIIME2 plugin (https://github.com/biocore/q2-fragment-insertion) for SEPP and integration into the microbial study management platform QIITA.

IMPORTANCE: The move from OTU-based to sOTU-based analysis, while providing additional resolution, also introduces computational challenges. We demonstrate that one popular method of dealing with sOTUs (building a de novo tree from the short sequences) can provide incorrect results in human gut metagenomic studies and show that phylogenetic placement of the new sequences with SEPP resolves this problem while also yielding other benefits over existing methods.
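
To illustrate why a reliable tree matters for phylogenetically aware metrics such as Faith's PD, here is a minimal Python sketch that computes the rooted variant of Faith's PD (the total branch length spanned by the observed taxa and the root) on a toy tree. The tree, its branch lengths, the taxon names, and the function name are all hypothetical and chosen only for illustration; this is not the SEPP or q2-fragment-insertion implementation.

```python
# Toy rooted tree stored as: node -> (parent, branch length to parent).
# Every name and length here is an assumption made up for illustration.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "C": ("n2", 3.0),
    "n1": ("n2", 0.5),
    "n2": (None, 0.0),  # root has no parent
}


def faith_pd(tree, observed):
    """Sum each branch once along the paths from the observed tips to the root."""
    counted = set()  # nodes whose branch to the parent is already included
    total = 0.0
    for tip in observed:
        node = tip
        # Walk towards the root, stopping once we join an already counted path.
        while node is not None and node not in counted:
            parent, length = tree[node]
            counted.add(node)
            total += length
            node = parent
    return total


pd_ab = faith_pd(TREE, ["A", "B"])  # branches A, B, and n1
pd_ac = faith_pd(TREE, ["A", "C"])  # branches A, n1, and C
```

Because the metric is a sum over branches of the tree, any error in the topology or branch lengths (for example, from a de novo tree built on short fragments) propagates directly into the diversity estimate.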

Original language: English (US)
Article number: e00021-18
Journal: mSystems
Volume: 3
Issue number: 3
DOI: https://doi.org/10.1128/mSystems.00021-18
State: Published - May 1 2018

Fingerprint

Exact Sequence
Phylogenetics
Placement
Fragment
phylogenetics
DNA sequences
phylogeny
Insertion
Nucleotides
Phylogeny
Throughput
Metagenomics
Unit
Microbiota
methodology
Plug-in
DNA Sequence
High Throughput
Resolve
method

Keywords

  • Amplicon sequencing
  • Microbial community analysis
  • Phylogenetic placement
  • SEPP

ASJC Scopus subject areas

  • Microbiology
  • Physiology
  • Biochemistry
  • Ecology, Evolution, Behavior and Systematics
  • Modeling and Simulation
  • Molecular Biology
  • Genetics
  • Computer Science Applications

Cite this

Janssen, S., McDonald, D., Gonzalez, A., Navas-Molina, J. A., Jiang, L., Xu, Z. Z., ... Knight, R. (2018). Phylogenetic placement of exact amplicon sequences improves associations with clinical information. mSystems, 3(3), [e00021-18]. https://doi.org/10.1128/mSystems.00021-18

Phylogenetic placement of exact amplicon sequences improves associations with clinical information. / Janssen, Stefan; McDonald, Daniel; Gonzalez, Antonio; Navas-Molina, Jose A.; Jiang, Lingjing; Xu, Zhenjiang Zech; Winker, Kevin; Kado, Deborah M.; Orwoll, Eric; Manary, Mark; Mirarab, Siavash; Knight, Rob.

In: mSystems, Vol. 3, No. 3, e00021-18, 01.05.2018.

Janssen, S, McDonald, D, Gonzalez, A, Navas-Molina, JA, Jiang, L, Xu, ZZ, Winker, K, Kado, DM, Orwoll, E, Manary, M, Mirarab, S & Knight, R 2018, 'Phylogenetic placement of exact amplicon sequences improves associations with clinical information', mSystems, vol. 3, no. 3, e00021-18. https://doi.org/10.1128/mSystems.00021-18
Janssen S, McDonald D, Gonzalez A, Navas-Molina JA, Jiang L, Xu ZZ et al. Phylogenetic placement of exact amplicon sequences improves associations with clinical information. mSystems. 2018 May 1;3(3). e00021-18. https://doi.org/10.1128/mSystems.00021-18
Janssen, Stefan; McDonald, Daniel; Gonzalez, Antonio; Navas-Molina, Jose A.; Jiang, Lingjing; Xu, Zhenjiang Zech; Winker, Kevin; Kado, Deborah M.; Orwoll, Eric; Manary, Mark; Mirarab, Siavash; Knight, Rob. / Phylogenetic placement of exact amplicon sequences improves associations with clinical information. In: mSystems. 2018; Vol. 3, No. 3.
@article{f3d0ec44a6ce4fddbce822354dfbc48a,
title = "Phylogenetic placement of exact amplicon sequences improves associations with clinical information",
abstract = "Recent algorithmic advances in amplicon-based microbiome studies enable the inference of exact amplicon sequence fragments. These new methods enable the investigation of sub-operational taxonomic units (sOTU) by removing erroneous sequences. However, short (e.g., 150-nucleotide [nt]) DNA sequence fragments do not contain sufficient phylogenetic signal to reproduce a reasonable tree, introducing a barrier in the utilization of critical phylogenetically aware metrics such as Faith's PD or UniFrac. Although fragment insertion methods do exist, those methods have not been tested for sOTUs from high-throughput amplicon studies in insertions against a broad reference phylogeny. We benchmarked the SAT{\'e}-enabled phylogenetic placement (SEPP) technique explicitly against 16S V4 sequence fragments and showed that it outperforms the conceptually problematic but often-used practice of reconstructing de novo phylogenies. In addition, we provide a BSD-licensed QIIME2 plugin (https://github.com/biocore/q2-fragment-insertion) for SEPP and integration into the microbial study management platform QIITA. IMPORTANCE The move from OTU-based to sOTU-based analysis, while providing additional resolution, also introduces computational challenges. We demonstrate that one popular method of dealing with sOTUs (building a de novo tree from the short sequences) can provide incorrect results in human gut metagenomic studies and show that phylogenetic placement of the new sequences with SEPP resolves this problem while also yielding other benefits over existing methods.",
keywords = "Amplicon sequencing, Microbial community analysis, Phylogenetic placement, SEPP",
author = "Stefan Janssen and Daniel McDonald and Antonio Gonzalez and Navas-Molina, {Jose A.} and Lingjing Jiang and Xu, {Zhenjiang Zech} and Kevin Winker and Kado, {Deborah M.} and Eric Orwoll and Mark Manary and Siavash Mirarab and Rob Knight",
year = "2018",
month = "5",
day = "1",
doi = "10.1128/mSystems.00021-18",
language = "English (US)",
volume = "3",
journal = "mSystems",
issn = "2379-5077",
publisher = "American Society for Microbiology",
number = "3",

}

TY - JOUR
T1 - Phylogenetic placement of exact amplicon sequences improves associations with clinical information
AU - Janssen, Stefan
AU - McDonald, Daniel
AU - Gonzalez, Antonio
AU - Navas-Molina, Jose A.
AU - Jiang, Lingjing
AU - Xu, Zhenjiang Zech
AU - Winker, Kevin
AU - Kado, Deborah M.
AU - Orwoll, Eric
AU - Manary, Mark
AU - Mirarab, Siavash
AU - Knight, Rob
PY - 2018/5/1
Y1 - 2018/5/1
N2 - Recent algorithmic advances in amplicon-based microbiome studies enable the inference of exact amplicon sequence fragments. These new methods enable the investigation of sub-operational taxonomic units (sOTU) by removing erroneous sequences. However, short (e.g., 150-nucleotide [nt]) DNA sequence fragments do not contain sufficient phylogenetic signal to reproduce a reasonable tree, introducing a barrier in the utilization of critical phylogenetically aware metrics such as Faith's PD or UniFrac. Although fragment insertion methods do exist, those methods have not been tested for sOTUs from high-throughput amplicon studies in insertions against a broad reference phylogeny. We benchmarked the SATé-enabled phylogenetic placement (SEPP) technique explicitly against 16S V4 sequence fragments and showed that it outperforms the conceptually problematic but often-used practice of reconstructing de novo phylogenies. In addition, we provide a BSD-licensed QIIME2 plugin (https://github.com/biocore/q2-fragment-insertion) for SEPP and integration into the microbial study management platform QIITA. IMPORTANCE The move from OTU-based to sOTU-based analysis, while providing additional resolution, also introduces computational challenges. We demonstrate that one popular method of dealing with sOTUs (building a de novo tree from the short sequences) can provide incorrect results in human gut metagenomic studies and show that phylogenetic placement of the new sequences with SEPP resolves this problem while also yielding other benefits over existing methods.
KW - Amplicon sequencing
KW - Microbial community analysis
KW - Phylogenetic placement
KW - SEPP
UR - http://www.scopus.com/inward/record.url?scp=85046907278&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85046907278&partnerID=8YFLogxK
U2 - 10.1128/mSystems.00021-18
DO - 10.1128/mSystems.00021-18
M3 - Article
AN - SCOPUS:85046907278
VL - 3
JO - mSystems
JF - mSystems
SN - 2379-5077
IS - 3
M1 - e00021-18
ER -