TY - JOUR
T1 - The influence of disease categories on gene candidate predictions from model organism phenotypes
AU - Oellrich, Anika
AU - Koehler, Sebastian
AU - Washington, Nicole
AU - Mungall, Chris
AU - Lewis, Suzanna
AU - Haendel, Melissa
AU - Robinson, Peter N.
AU - Smedley, Damian
AU - Sanger Mouse Genetics Project
N1 - Funding Information:
Publication in this supplement was supported by National Institutes of Health (NIH) grant [1 U54 HG006370-01]. This article has been published as part of the Journal of Biomedical Semantics, Volume 5 Supplement 1, 2014: Proceedings of the Bio-Ontologies Special Interest Group Meeting 2013. The full contents of the supplement are available online at http://www.jbiomedsem.com/supplements/5/S1.
Funding Information:
This work was supported by core infrastructure funding from the Wellcome Trust and National Institutes of Health (NIH) grant [1 U54 HG006370-01].
Publisher Copyright:
© 2014 Oellrich et al; licensee BioMed Central Ltd.
PY - 2014
Y1 - 2014
N2 - Background: The molecular etiology is still to be identified for about half of the currently described Mendelian diseases in humans, thereby hindering efforts to find treatments or preventive measures. Advances, such as new sequencing technologies, have led to increasing amounts of data becoming available with which to address the problem of identifying disease genes. Therefore, automated methods are needed that reliably predict disease gene candidates based on available data. We have recently developed Exomiser as a tool for identifying causative variants from exome analysis results by filtering and prioritising using a number of criteria including the phenotype similarity between the disease and mouse mutants involving the gene candidates. Initial investigations revealed a variation in performance for different medical categories of disease, due in part to a varying contribution of the phenotype scoring component. Results: In this study, we further analyse the performance of our cross-species phenotype matching algorithm, and examine in more detail the reasons why disease gene filtering based on phenotype data works better for certain disease categories than others. We found that in addition to misleading phenotype alignments between species, some disease categories are still more amenable to automated predictions than others, and that this often ties in with community perceptions on how well the organism works as a model. Conclusions: Our automated disease gene candidate predictions are highly dependent on the organism used for the predictions and the disease category being studied. Future work on computational disease gene prediction using phenotype data would benefit from methods that take into account the disease category and the source of model organism data.
AB - Background: The molecular etiology is still to be identified for about half of the currently described Mendelian diseases in humans, thereby hindering efforts to find treatments or preventive measures. Advances, such as new sequencing technologies, have led to increasing amounts of data becoming available with which to address the problem of identifying disease genes. Therefore, automated methods are needed that reliably predict disease gene candidates based on available data. We have recently developed Exomiser as a tool for identifying causative variants from exome analysis results by filtering and prioritising using a number of criteria including the phenotype similarity between the disease and mouse mutants involving the gene candidates. Initial investigations revealed a variation in performance for different medical categories of disease, due in part to a varying contribution of the phenotype scoring component. Results: In this study, we further analyse the performance of our cross-species phenotype matching algorithm, and examine in more detail the reasons why disease gene filtering based on phenotype data works better for certain disease categories than others. We found that in addition to misleading phenotype alignments between species, some disease categories are still more amenable to automated predictions than others, and that this often ties in with community perceptions on how well the organism works as a model. Conclusions: Our automated disease gene candidate predictions are highly dependent on the organism used for the predictions and the disease category being studied. Future work on computational disease gene prediction using phenotype data would benefit from methods that take into account the disease category and the source of model organism data.
UR - http://www.scopus.com/inward/record.url?scp=84938329175&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84938329175&partnerID=8YFLogxK
U2 - 10.1186/2041-1480-5-S1-S4
DO - 10.1186/2041-1480-5-S1-S4
M3 - Article
AN - SCOPUS:84938329175
SN - 2041-1480
VL - 5
JO - Journal of Biomedical Semantics
JF - Journal of Biomedical Semantics
M1 - S4
ER -