Practical Computational Reproducibility in the Life Sciences

Björn Grüning, John Chilton, Johannes Köster, Ryan Dale, Nicola Soranzo, Marius van den Beek, Jeremy Goecks, Rolf Backofen, Anton Nekrutenko, James Taylor

Research output: Contribution to journal › Comment/debate

5 Citations (Scopus)

Abstract

Many areas of research suffer from poor reproducibility, particularly in computationally intensive domains where results rely on a series of complex methodological decisions that are not well captured by traditional publication approaches. Various guidelines have emerged for achieving reproducibility, but implementation of these practices remains difficult due to the challenge of assembling software tools plus associated libraries, connecting tools together into pipelines, and specifying parameters. Here, we discuss a suite of cutting-edge technologies that make computational reproducibility not just possible, but practical in both time and effort. This suite combines three well-tested components (a system for building highly portable packages of bioinformatics software, containerization and virtualization technologies for isolating reusable execution environments for these packages, and workflow systems that automatically orchestrate the composition of these packages for entire pipelines) to achieve an unprecedented level of computational reproducibility. We also provide a practical implementation and five recommendations to help set a typical researcher on the path to performing data analyses reproducibly.

Original language: English (US)
Pages (from-to): 631-635
Number of pages: 5
Journal: Cell Systems
Volume: 6
Issue number: 6
DOI: https://doi.org/10.1016/j.cels.2018.03.014
State: Published - Jun 27 2018

Fingerprint

Biological Science Disciplines
Software
Technology
Workflow
Computational Biology
Libraries
Publications
Research Personnel
Guidelines
Research

ASJC Scopus subject areas

  • Pathology and Forensic Medicine
  • Histology
  • Cell Biology

Cite this

Grüning, B., Chilton, J., Köster, J., Dale, R., Soranzo, N., van den Beek, M., ... Taylor, J. (2018). Practical Computational Reproducibility in the Life Sciences. Cell Systems, 6(6), 631-635. https://doi.org/10.1016/j.cels.2018.03.014

@article{3fa23a6fd61c412f897a0997818276fa,
title = "Practical Computational Reproducibility in the Life Sciences",
abstract = "Many areas of research suffer from poor reproducibility, particularly in computationally intensive domains where results rely on a series of complex methodological decisions that are not well captured by traditional publication approaches. Various guidelines have emerged for achieving reproducibility, but implementation of these practices remains difficult due to the challenge of assembling software tools plus associated libraries, connecting tools together into pipelines, and specifying parameters. Here, we discuss a suite of cutting-edge technologies that make computational reproducibility not just possible, but practical in both time and effort. This suite combines three well-tested components (a system for building highly portable packages of bioinformatics software, containerization and virtualization technologies for isolating reusable execution environments for these packages, and workflow systems that automatically orchestrate the composition of these packages for entire pipelines) to achieve an unprecedented level of computational reproducibility. We also provide a practical implementation and five recommendations to help set a typical researcher on the path to performing data analyses reproducibly.",
author = "Bj{\"o}rn Gr{\"u}ning and John Chilton and Johannes K{\"o}ster and Ryan Dale and Nicola Soranzo and {van den Beek}, Marius and Jeremy Goecks and Rolf Backofen and Anton Nekrutenko and James Taylor",
year = "2018",
month = "6",
day = "27",
doi = "10.1016/j.cels.2018.03.014",
language = "English (US)",
volume = "6",
pages = "631--635",
journal = "Cell Systems",
issn = "2405-4712",
publisher = "Cell Press",
number = "6",

}

TY - JOUR

T1 - Practical Computational Reproducibility in the Life Sciences

AU - Grüning, Björn

AU - Chilton, John

AU - Köster, Johannes

AU - Dale, Ryan

AU - Soranzo, Nicola

AU - van den Beek, Marius

AU - Goecks, Jeremy

AU - Backofen, Rolf

AU - Nekrutenko, Anton

AU - Taylor, James

PY - 2018/6/27

Y1 - 2018/6/27

N2 - Many areas of research suffer from poor reproducibility, particularly in computationally intensive domains where results rely on a series of complex methodological decisions that are not well captured by traditional publication approaches. Various guidelines have emerged for achieving reproducibility, but implementation of these practices remains difficult due to the challenge of assembling software tools plus associated libraries, connecting tools together into pipelines, and specifying parameters. Here, we discuss a suite of cutting-edge technologies that make computational reproducibility not just possible, but practical in both time and effort. This suite combines three well-tested components (a system for building highly portable packages of bioinformatics software, containerization and virtualization technologies for isolating reusable execution environments for these packages, and workflow systems that automatically orchestrate the composition of these packages for entire pipelines) to achieve an unprecedented level of computational reproducibility. We also provide a practical implementation and five recommendations to help set a typical researcher on the path to performing data analyses reproducibly.

AB - Many areas of research suffer from poor reproducibility, particularly in computationally intensive domains where results rely on a series of complex methodological decisions that are not well captured by traditional publication approaches. Various guidelines have emerged for achieving reproducibility, but implementation of these practices remains difficult due to the challenge of assembling software tools plus associated libraries, connecting tools together into pipelines, and specifying parameters. Here, we discuss a suite of cutting-edge technologies that make computational reproducibility not just possible, but practical in both time and effort. This suite combines three well-tested components (a system for building highly portable packages of bioinformatics software, containerization and virtualization technologies for isolating reusable execution environments for these packages, and workflow systems that automatically orchestrate the composition of these packages for entire pipelines) to achieve an unprecedented level of computational reproducibility. We also provide a practical implementation and five recommendations to help set a typical researcher on the path to performing data analyses reproducibly.

UR - http://www.scopus.com/inward/record.url?scp=85048421153&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85048421153&partnerID=8YFLogxK

U2 - 10.1016/j.cels.2018.03.014

DO - 10.1016/j.cels.2018.03.014

M3 - Comment/debate

C2 - 29953862

AN - SCOPUS:85048421153

VL - 6

SP - 631

EP - 635

JO - Cell Systems

JF - Cell Systems

SN - 2405-4712

IS - 6

ER -