recount workflow: Accessing over 70,000 human RNA-seq samples with Bioconductor

Leonardo Collado-Torres, Abhinav Nellore, Andrew E. Jaffe

Research output: Contribution to journal (Article)

3 Citations (Scopus)

Abstract

The recount2 resource is composed of over 70,000 uniformly processed human RNA-seq samples spanning TCGA and SRA, including GTEx. The processed data can be accessed via the recount2 website and the recount Bioconductor package. This workflow explains in detail how to use the recount package and how to integrate it with other Bioconductor packages for several analyses that can be carried out with the recount2 resource. In particular, we describe how the coverage count matrices were computed in recount2 as well as different ways of obtaining public metadata, which can facilitate downstream analyses. Step-by-step directions show how to do a gene-level differential expression analysis, visualize base-level genome coverage data, and perform analyses at multiple feature levels. This workflow thus provides further information to understand the data in recount2 and a compendium of R code to use the data.

Original language: English (US)
Article number: 1558
Journal: F1000Research
Volume: 6
DOI: 10.12688/f1000research.12223.1
State: Published - 2017

Keywords

  • Bioconductor
  • Bioinformatics
  • Differential expression
  • Genomics
  • GTEx
  • Human
  • RNA-seq
  • SRA
  • TCGA
  • Visualization

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)
  • Immunology and Microbiology (all)
  • Pharmacology, Toxicology and Pharmaceutics (all)

Cite this

recount workflow: Accessing over 70,000 human RNA-seq samples with Bioconductor. / Collado-Torres, Leonardo; Nellore, Abhinav; Jaffe, Andrew E.

In: F1000Research, Vol. 6, 1558, 2017.

@article{4e6a084fbfcc4377ae4db19f7b60bee0,
title = "recount workflow: Accessing over 70,000 human RNA-seq samples with Bioconductor",
abstract = "The recount2 resource is composed of over 70,000 uniformly processed human RNA-seq samples spanning TCGA and SRA, including GTEx. The processed data can be accessed via the recount2 website and the recount Bioconductor package. This workflow explains in detail how to use the recount package and how to integrate it with other Bioconductor packages for several analyses that can be carried out with the recount2 resource. In particular, we describe how the coverage count matrices were computed in recount2 as well as different ways of obtaining public metadata, which can facilitate downstream analyses. Step-by-step directions show how to do a gene-level differential expression analysis, visualize base-level genome coverage data, and perform analyses at multiple feature levels. This workflow thus provides further information to understand the data in recount2 and a compendium of R code to use the data.",
keywords = "Bioconductor, Bioinformatics, Differential expression, Genomics, GTEx, Human, RNA-seq, SRA, TCGA, Visualization",
author = "Collado-Torres, Leonardo and Nellore, Abhinav and Jaffe, {Andrew E.}",
year = "2017",
doi = "10.12688/f1000research.12223.1",
language = "English (US)",
volume = "6",
journal = "F1000Research",
issn = "2046-1402",
publisher = "F1000 Research Ltd.",
}

TY  - JOUR
T1  - recount workflow
T2  - Accessing over 70,000 human RNA-seq samples with Bioconductor
AU  - Collado-Torres, Leonardo
AU  - Nellore, Abhinav
AU  - Jaffe, Andrew E.
PY  - 2017
Y1  - 2017
AB  - The recount2 resource is composed of over 70,000 uniformly processed human RNA-seq samples spanning TCGA and SRA, including GTEx. The processed data can be accessed via the recount2 website and the recount Bioconductor package. This workflow explains in detail how to use the recount package and how to integrate it with other Bioconductor packages for several analyses that can be carried out with the recount2 resource. In particular, we describe how the coverage count matrices were computed in recount2 as well as different ways of obtaining public metadata, which can facilitate downstream analyses. Step-by-step directions show how to do a gene-level differential expression analysis, visualize base-level genome coverage data, and perform analyses at multiple feature levels. This workflow thus provides further information to understand the data in recount2 and a compendium of R code to use the data.
KW  - Bioconductor
KW  - Bioinformatics
KW  - Differential expression
KW  - Genomics
KW  - GTEx
KW  - Human
KW  - RNA-seq
KW  - SRA
KW  - TCGA
KW  - Visualization
UR  - http://www.scopus.com/inward/record.url?scp=85030687991&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85030687991&partnerID=8YFLogxK
U2  - 10.12688/f1000research.12223.1
DO  - 10.12688/f1000research.12223.1
M3  - Article
C2  - 29043067
AN  - SCOPUS:85030687991
VL  - 6
JO  - F1000Research
JF  - F1000Research
SN  - 2046-1402
M1  - 1558
ER  -