Biocompute objects - A step towards evaluation and validation of biomedical scientific computations

Vahan Simonyan, Jeremy Goecks, Raja Mazumder

Research output: Contribution to journal (Review article)

6 Citations (Scopus)

Abstract

The unpredictability of actual physical, chemical, and biological experiments due to the multitude of environmental and procedural factors is well documented. What is systematically overlooked, however, is that computational biology algorithms are also affected by a multiplicity of parameters and are no less volatile. The complexity of computation protocols and of interpreting outcomes is only part of the challenge: there are also virtually no standardized, industry-accepted metadata schemas for reporting the computational objects that record the parameters used for computations together with the results of those computations. Thus, it is often impossible to reproduce the results of a previously performed computation because information on parameters, versions, arguments, conditions, and application launch procedures is missing. In this article we describe the concept of biocompute objects, developed specifically to satisfy regulatory research needs for evaluation, validation, and verification of bioinformatics pipelines. We envision generalized versions of biocompute objects, called biocompute templates, that support a single class of analyses but can be adapted to meet unique needs. To make these templates widely usable, we outline a simple but powerful cross-platform implementation. We also discuss the reasoning behind and potential usability of such a concept within the larger scientific community through the creation of a biocompute object database, initially consisting of records relevant to the U.S. Food and Drug Administration. A biocompute object database record will be similar in form to a GenBank record; the difference is that, instead of describing a sequence, the biocompute record will include parameters, dependencies, usage, and other information related to a specific computational instance. This mechanism will extend similar efforts and also serve as a collaborative ground to ensure interoperability between different platforms, industries, scientists, regulators, and other stakeholders interested in biocomputing.
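
As a rough illustration of the kind of record the abstract describes, the sketch below stores parameters, dependencies, and usage for one computational instance and derives a fingerprint from them. The class name, every field name, and all example values are hypothetical; they are not taken from the article or from any published biocompute schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class BiocomputeRecord:
    """Hypothetical record layout; all field names are illustrative assumptions."""
    name: str
    tool: str
    tool_version: str
    parameters: dict = field(default_factory=dict)
    dependencies: list = field(default_factory=list)
    usage: str = ""

    def to_json(self) -> str:
        # Sorted keys give a canonical serialization, so records with
        # equal contents always produce identical text.
        return json.dumps(asdict(self), sort_keys=True)

    def fingerprint(self) -> str:
        # SHA-256 of the canonical JSON; it changes whenever any
        # recorded parameter, version, or dependency changes.
        return hashlib.sha256(self.to_json().encode("utf-8")).hexdigest()


# Made-up example values for a single computational instance.
record = BiocomputeRecord(
    name="example-alignment-run",
    tool="example-aligner",
    tool_version="1.0.0",
    parameters={"seed_length": 20, "max_mismatches": 2},
    dependencies=["example-reference-genome"],
    usage="example-aligner --seed-length 20 --max-mismatches 2 reads.fastq",
)
print(record.fingerprint())
```

Two runs recorded with identical parameters, versions, and dependencies yield the same fingerprint, which gives one simple way to check whether a reported computation was repeated under the same recorded conditions.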

Original language: English (US)
Pages (from-to): 136-146
Number of pages: 11
Journal: PDA Journal of Pharmaceutical Science and Technology
Volume: 71
Issue number: 2
DOIs: 10.5731/pdajpst.2016.006734
State: Published - Mar 1 2017
Externally published: Yes

Fingerprint

Computational Biology
Industry
Databases
Volatilization
Nucleic Acid Databases
United States Food and Drug Administration
Metadata

Keywords

  • Biocompute object
  • Computation reproducibility
  • FDA
  • NGS standardization
  • Regulatory research

ASJC Scopus subject areas

  • Pharmaceutical Science

Cite this

Biocompute objects - A step towards evaluation and validation of biomedical scientific computations. / Simonyan, Vahan; Goecks, Jeremy; Mazumder, Raja.

In: PDA Journal of Pharmaceutical Science and Technology, Vol. 71, No. 2, 01.03.2017, p. 136-146.

@article{16c7b0cf8c3c47b3b6fa91fb15e73184,
title = "Biocompute objects - A step towards evaluation and validation of biomedical scientific computations",
abstract = "The unpredictability of actual physical, chemical, and biological experiments due to the multitude of environmental and procedural factors is well documented. What is systematically overlooked, however, is that computational biology algorithms are also affected by multiplicity of parameters and have no lesser volatility. The complexities of computation protocols and interpretation of outcomes is only a part of the challenge: There are also virtually no standardized and industry-accepted metadata schemas for reporting the computational objects that record the parameters used for computations together with the results of computations. Thus, it is often impossible to reproduce the results of a previously performed computation due to missing information on parameters, versions, arguments, conditions, and procedures of application launch. In this article we describe the concept of biocompute objects developed specifically to satisfy regulatory research needs for evaluation, validation, and verification of bioinformatics pipelines. We envision generalized versions of biocompute objects called biocompute templates that support a single class of analyses but can be adapted to meet unique needs. To make these templates widely usable, we outline a simple but powerful cross-platform implementation. We also discuss the reasoning and potential usability for such concept within the larger scientific community through the creation of a biocompute object database initially consisting of records relevant to the U.S. Food and Drug Administration. A biocompute object database record will be similar to a GenBank record in form; the difference being that instead of describing a sequence, the biocompute record will include information related to parameters, dependencies, usage, and other information related to specific computational instance. 
This mechanism will extend similar efforts and also serve as a collaborative ground to ensure interoperability between different platforms, industries, scientists, regulators, and other stakeholders interested in biocomputing.",
keywords = "Biocompute object, Computation reproducibility, FDA, NGS standardization, Regulatory research",
author = "Vahan Simonyan and Jeremy Goecks and Raja Mazumder",
year = "2017",
month = "3",
day = "1",
doi = "10.5731/pdajpst.2016.006734",
language = "English (US)",
volume = "71",
pages = "136--146",
journal = "PDA Journal of Pharmaceutical Science and Technology",
issn = "1079-7440",
publisher = "Parenteral Drug Association Inc.",
number = "2",

}

TY - JOUR
T1 - Biocompute objects - A step towards evaluation and validation of biomedical scientific computations
AU - Simonyan, Vahan
AU - Goecks, Jeremy
AU - Mazumder, Raja
PY - 2017/3/1
Y1 - 2017/3/1

AB - The unpredictability of actual physical, chemical, and biological experiments due to the multitude of environmental and procedural factors is well documented. What is systematically overlooked, however, is that computational biology algorithms are also affected by multiplicity of parameters and have no lesser volatility. The complexities of computation protocols and interpretation of outcomes is only a part of the challenge: There are also virtually no standardized and industry-accepted metadata schemas for reporting the computational objects that record the parameters used for computations together with the results of computations. Thus, it is often impossible to reproduce the results of a previously performed computation due to missing information on parameters, versions, arguments, conditions, and procedures of application launch. In this article we describe the concept of biocompute objects developed specifically to satisfy regulatory research needs for evaluation, validation, and verification of bioinformatics pipelines. We envision generalized versions of biocompute objects called biocompute templates that support a single class of analyses but can be adapted to meet unique needs. To make these templates widely usable, we outline a simple but powerful cross-platform implementation. We also discuss the reasoning and potential usability for such concept within the larger scientific community through the creation of a biocompute object database initially consisting of records relevant to the U.S. Food and Drug Administration. A biocompute object database record will be similar to a GenBank record in form; the difference being that instead of describing a sequence, the biocompute record will include information related to parameters, dependencies, usage, and other information related to specific computational instance. 
This mechanism will extend similar efforts and also serve as a collaborative ground to ensure interoperability between different platforms, industries, scientists, regulators, and other stakeholders interested in biocomputing.

KW - Biocompute object
KW - Computation reproducibility
KW - FDA
KW - NGS standardization
KW - Regulatory research
UR - http://www.scopus.com/inward/record.url?scp=85019622328&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85019622328&partnerID=8YFLogxK
U2 - 10.5731/pdajpst.2016.006734
DO - 10.5731/pdajpst.2016.006734
M3 - Review article
C2 - 27974626
AN - SCOPUS:85019622328
VL - 71
SP - 136
EP - 146
JO - PDA Journal of Pharmaceutical Science and Technology
JF - PDA Journal of Pharmaceutical Science and Technology
SN - 1079-7440
IS - 2
ER -