Reproducible biomedical benchmarking in the cloud: Lessons from crowd-sourced data challenges

Kyle Ellrott, Alex Buchanan, Allison Creason, Michael Mason, Thomas Schaffter, Bruce Hoff, James Eddy, John M. Chilton, Thomas Yu, Joshua M. Stuart, Julio Saez-Rodriguez, Gustavo Stolovitzky, Paul C. Boutros, Justin Guinney

Research output: Contribution to journal (Article)

Abstract

Challenges are achieving broad acceptance for addressing many biomedical questions and enabling tool assessment. But ensuring that the methods evaluated are reproducible and reusable is complicated by the diversity of software architectures, input and output file formats, and computing environments. To mitigate these problems, some challenges have leveraged new virtualization and compute methods, requiring participants to submit cloud-ready software packages. We review recent data challenges with innovative approaches to model reproducibility and data sharing, and outline key lessons for improving quantitative biomedical data analysis through crowd-sourced benchmarking challenges.

Original language: English (US)
Article number: 195
Journal: Genome Biology
Volume: 20
Issue number: 1
DOI: https://doi.org/10.1186/s13059-019-1794-0
State: Published - Sep 10 2019

Fingerprint

Benchmarking
Software
Information Dissemination
reproducibility
data analysis
methodology
method

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Genetics
  • Cell Biology

Cite this

Ellrott, K., Buchanan, A., Creason, A., Mason, M., Schaffter, T., Hoff, B., ... Guinney, J. (2019). Reproducible biomedical benchmarking in the cloud: Lessons from crowd-sourced data challenges. Genome Biology, 20(1), [195]. https://doi.org/10.1186/s13059-019-1794-0

@article{90e345326493416ca7ad5c439fbb93b1,
title = "Reproducible biomedical benchmarking in the cloud: Lessons from crowd-sourced data challenges",
abstract = "Challenges are achieving broad acceptance for addressing many biomedical questions and enabling tool assessment. But ensuring that the methods evaluated are reproducible and reusable is complicated by the diversity of software architectures, input and output file formats, and computing environments. To mitigate these problems, some challenges have leveraged new virtualization and compute methods, requiring participants to submit cloud-ready software packages. We review recent data challenges with innovative approaches to model reproducibility and data sharing, and outline key lessons for improving quantitative biomedical data analysis through crowd-sourced benchmarking challenges.",
author = "Kyle Ellrott and Alex Buchanan and Allison Creason and Michael Mason and Thomas Schaffter and Bruce Hoff and James Eddy and Chilton, {John M.} and Thomas Yu and Stuart, {Joshua M.} and Julio Saez-Rodriguez and Gustavo Stolovitzky and Boutros, {Paul C.} and Justin Guinney",
year = "2019",
month = "9",
day = "10",
doi = "10.1186/s13059-019-1794-0",
language = "English (US)",
volume = "20",
journal = "Genome Biology",
issn = "1474-7596",
publisher = "BioMed Central",
number = "1",
pages = "195",
}

TY - JOUR
T1 - Reproducible biomedical benchmarking in the cloud
T2 - Lessons from crowd-sourced data challenges
AU - Ellrott, Kyle
AU - Buchanan, Alex
AU - Creason, Allison
AU - Mason, Michael
AU - Schaffter, Thomas
AU - Hoff, Bruce
AU - Eddy, James
AU - Chilton, John M.
AU - Yu, Thomas
AU - Stuart, Joshua M.
AU - Saez-Rodriguez, Julio
AU - Stolovitzky, Gustavo
AU - Boutros, Paul C.
AU - Guinney, Justin
PY - 2019/9/10
Y1 - 2019/9/10
AB - Challenges are achieving broad acceptance for addressing many biomedical questions and enabling tool assessment. But ensuring that the methods evaluated are reproducible and reusable is complicated by the diversity of software architectures, input and output file formats, and computing environments. To mitigate these problems, some challenges have leveraged new virtualization and compute methods, requiring participants to submit cloud-ready software packages. We review recent data challenges with innovative approaches to model reproducibility and data sharing, and outline key lessons for improving quantitative biomedical data analysis through crowd-sourced benchmarking challenges.
UR - http://www.scopus.com/inward/record.url?scp=85072025943&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85072025943&partnerID=8YFLogxK
U2 - 10.1186/s13059-019-1794-0
DO - 10.1186/s13059-019-1794-0
M3 - Article
C2 - 31506093
AN - SCOPUS:85072025943
VL - 20
JO - Genome Biology
JF - Genome Biology
SN - 1474-7596
IS - 1
M1 - 195
ER -