Flexible expressed region analysis for RNA-seq with derfinder

Leonardo Collado-Torres, Abhinav Nellore, Alyssa C. Frazee, Christopher Wilks, Michael I. Love, Ben Langmead, Rafael A. Irizarry, Jeffrey T. Leek, Andrew E. Jaffe

Research output: Contribution to journal (Article)

9 Citations (Scopus)

Abstract

Differential expression analysis of RNA sequencing (RNA-seq) data typically relies on reconstructing transcripts or counting reads that overlap known gene structures. We previously introduced an intermediate statistical approach called differentially expressed region (DER) finder that seeks to identify contiguous regions of the genome showing differential expression signal at single-base resolution without relying on existing annotation or potentially inaccurate transcript assembly. We present the derfinder software that improves our annotation-agnostic approach to RNA-seq analysis by: (i) implementing a computationally efficient bump-hunting approach to identify DERs that permits genome-scale analyses in a large number of samples, (ii) introducing a flexible statistical modeling framework, including multi-group and time-course analyses, and (iii) introducing a new set of data visualizations for expressed region analysis. We apply this approach to public RNA-seq data from the Genotype-Tissue Expression (GTEx) project and BrainSpan project to show that derfinder permits the analysis of hundreds of samples at base resolution in R, identifies expression outside of known gene boundaries and can be used to visualize expressed regions at base resolution. In simulations, our base-resolution approaches enable discovery in the presence of incomplete annotation and are nearly as powerful as feature-level methods when the annotation is complete. derfinder analysis using expressed region-level and single base-level approaches provides a compromise between full transcript reconstruction and feature-level analysis. The package is available from Bioconductor at www.bioconductor.org/packages/derfinder.
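
To make the idea of "contiguous regions of the genome" at base resolution concrete, the following is a minimal Python sketch, not the derfinder implementation (derfinder itself is an R/Bioconductor package). It assumes per-base coverage is already available as plain lists, averages it across samples, and reports runs of consecutive bases whose mean coverage exceeds a cutoff. The function names `mean_coverage` and `find_regions`, the cutoff value, and the toy data are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def mean_coverage(samples: List[List[float]]) -> List[float]:
    """Average per-base coverage across samples (one inner list per sample)."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def find_regions(coverage: List[float], cutoff: float) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges where coverage > cutoff.

    Consecutive bases above the cutoff are merged into a single region.
    """
    regions: List[Tuple[int, int]] = []
    start = None  # start index of the region currently being extended
    for pos, value in enumerate(coverage):
        if value > cutoff:
            if start is None:
                start = pos
        elif start is not None:
            regions.append((start, pos))
            start = None
    if start is not None:
        # The last region runs to the end of the coverage vector.
        regions.append((start, len(coverage)))
    return regions


# Toy data: three samples, eight bases (illustrative values only).
toy_samples = [
    [0, 0, 6, 9, 0, 0, 3, 3],
    [0, 0, 3, 6, 0, 0, 6, 3],
    [0, 3, 3, 6, 0, 0, 3, 3],
]
toy_regions = find_regions(mean_coverage(toy_samples), cutoff=2.0)
print(toy_regions)
```

In this toy setting the regions are defined purely by a coverage threshold; testing such regions for differential expression between groups would be a separate downstream step.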

Original language: English (US)
Pages (from-to): e9
Journal: Nucleic Acids Research
Volume: 45
Issue number: 2
DOI: https://doi.org/10.1093/nar/gkw852
State: Published - Jan 1 2017
Externally published: Yes

Fingerprint

RNA Sequence Analysis
Genome
Genes
Software
Genotype

ASJC Scopus subject areas

  • Genetics

Cite this

Collado-Torres, L., Nellore, A., Frazee, A. C., Wilks, C., Love, M. I., Langmead, B., ... Jaffe, A. E. (2017). Flexible expressed region analysis for RNA-seq with derfinder. Nucleic Acids Research, 45(2), e9. https://doi.org/10.1093/nar/gkw852

Flexible expressed region analysis for RNA-seq with derfinder. / Collado-Torres, Leonardo; Nellore, Abhinav; Frazee, Alyssa C.; Wilks, Christopher; Love, Michael I.; Langmead, Ben; Irizarry, Rafael A.; Leek, Jeffrey T.; Jaffe, Andrew E.

In: Nucleic Acids Research, Vol. 45, No. 2, 01.01.2017, p. e9.

Collado-Torres, L, Nellore, A, Frazee, AC, Wilks, C, Love, MI, Langmead, B, Irizarry, RA, Leek, JT & Jaffe, AE 2017, 'Flexible expressed region analysis for RNA-seq with derfinder', Nucleic Acids Research, vol. 45, no. 2, pp. e9. https://doi.org/10.1093/nar/gkw852

Collado-Torres L, Nellore A, Frazee AC, Wilks C, Love MI, Langmead B et al. Flexible expressed region analysis for RNA-seq with derfinder. Nucleic Acids Research. 2017 Jan 1;45(2):e9. https://doi.org/10.1093/nar/gkw852

Collado-Torres, Leonardo ; Nellore, Abhinav ; Frazee, Alyssa C. ; Wilks, Christopher ; Love, Michael I. ; Langmead, Ben ; Irizarry, Rafael A. ; Leek, Jeffrey T. ; Jaffe, Andrew E. / Flexible expressed region analysis for RNA-seq with derfinder. In: Nucleic Acids Research. 2017 ; Vol. 45, No. 2. pp. e9.

@article{7e8f153816ac426185945e85cc572edf,
title = "Flexible expressed region analysis for RNA-seq with derfinder",
abstract = "Differential expression analysis of RNA sequencing (RNA-seq) data typically relies on reconstructing transcripts or counting reads that overlap known gene structures. We previously introduced an intermediate statistical approach called differentially expressed region (DER) finder that seeks to identify contiguous regions of the genome showing differential expression signal at single base resolution without relying on existing annotation or potentially inaccurate transcript assembly. We present the derfinder software that improves our annotation-agnostic approach to RNA-seq analysis by: (i) implementing a computationally efficient bump-hunting approach to identify DERs that permits genome-scale analyses in a large number of samples, (ii) introducing a flexible statistical modeling framework, including multi-group and time-course analyses and (iii) introducing a new set of data visualizations for expressed region analysis. We apply this approach to public RNA-seq data from the Genotype-Tissue Expression (GTEx) project and BrainSpan project to show that derfinder permits the analysis of hundreds of samples at base resolution in R, identifies expression outside of known gene boundaries and can be used to visualize expressed regions at base-resolution. In simulations, our base resolution approaches enable discovery in the presence of incomplete annotation and is nearly as powerful as feature-level methods when the annotation is complete. derfinder analysis using expressed region-level and single base-level approaches provides a compromise between full transcript reconstruction and feature-level analysis. The package is available from Bioconductor at www.bioconductor.org/packages/derfinder.",
author = "Leonardo Collado-Torres and Abhinav Nellore and Frazee, {Alyssa C.} and Christopher Wilks and Love, {Michael I.} and Ben Langmead and Irizarry, {Rafael A.} and Leek, {Jeffrey T.} and Jaffe, {Andrew E.}",
year = "2017",
month = "1",
day = "1",
doi = "10.1093/nar/gkw852",
language = "English (US)",
volume = "45",
pages = "e9",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "2",

}

TY - JOUR

T1 - Flexible expressed region analysis for RNA-seq with derfinder

AU - Collado-Torres, Leonardo

AU - Nellore, Abhinav

AU - Frazee, Alyssa C.

AU - Wilks, Christopher

AU - Love, Michael I.

AU - Langmead, Ben

AU - Irizarry, Rafael A.

AU - Leek, Jeffrey T.

AU - Jaffe, Andrew E.

PY - 2017/1/1

Y1 - 2017/1/1

N2 - Differential expression analysis of RNA sequencing (RNA-seq) data typically relies on reconstructing transcripts or counting reads that overlap known gene structures. We previously introduced an intermediate statistical approach called differentially expressed region (DER) finder that seeks to identify contiguous regions of the genome showing differential expression signal at single base resolution without relying on existing annotation or potentially inaccurate transcript assembly. We present the derfinder software that improves our annotation-agnostic approach to RNA-seq analysis by: (i) implementing a computationally efficient bump-hunting approach to identify DERs that permits genome-scale analyses in a large number of samples, (ii) introducing a flexible statistical modeling framework, including multi-group and time-course analyses and (iii) introducing a new set of data visualizations for expressed region analysis. We apply this approach to public RNA-seq data from the Genotype-Tissue Expression (GTEx) project and BrainSpan project to show that derfinder permits the analysis of hundreds of samples at base resolution in R, identifies expression outside of known gene boundaries and can be used to visualize expressed regions at base-resolution. In simulations, our base resolution approaches enable discovery in the presence of incomplete annotation and is nearly as powerful as feature-level methods when the annotation is complete. derfinder analysis using expressed region-level and single base-level approaches provides a compromise between full transcript reconstruction and feature-level analysis. The package is available from Bioconductor at www.bioconductor.org/packages/derfinder.

UR - http://www.scopus.com/inward/record.url?scp=85014052295&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85014052295&partnerID=8YFLogxK

U2 - 10.1093/nar/gkw852

DO - 10.1093/nar/gkw852

M3 - Article

VL - 45

SP - e9

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - 2

ER -