TY - GEN

T1 - Parallel discovery of direct causal relations and Markov boundaries with applications to gene networks

AU - Nikolova, Olga

AU - Aluru, Srinivas

N1 - Copyright 2011 Elsevier B.V., All rights reserved.

PY - 2011

Y1 - 2011

AB - Bayesian networks enable formal probabilistic reasoning on a set of interacting variables of a domain, and have been shown to have broad applicability. In bioinformatics in particular, Bayesian networks are used to model gene interactions. Learning the structure of a Bayesian network is an NP-hard problem, making it necessary to employ heuristics for solving large-scale problems. In this paper, we present parallel algorithms for two problems that arise in connection with network structure learning and analysis: (i) the discovery of all direct causal relations for each variable, i.e., the set of parents and children of each node in the corresponding Bayesian network, and (ii) the computation of the Markov boundary of each variable, defined as the minimal set of variables that shields the target variable from all other variables in the domain. Our parallel algorithms are based on state-of-the-art constraint-based heuristic optimization methods. They are shown to be work-optimal and communication-efficient, and exhibit nearly perfect scaling.

KW - Bayesian networks

KW - Causal relations

KW - Constraint-based learning

KW - Markov boundaries

UR - http://www.scopus.com/inward/record.url?scp=80155187592&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80155187592&partnerID=8YFLogxK

U2 - 10.1109/ICPP.2011.49

DO - 10.1109/ICPP.2011.49

M3 - Conference contribution

AN - SCOPUS:80155187592

SN - 9780769545103

T3 - Proceedings of the International Conference on Parallel Processing

SP - 512

EP - 521

BT - Proceedings - 2011 International Conference on Parallel Processing, ICPP 2011

T2 - 40th International Conference on Parallel Processing, ICPP 2011

Y2 - 13 September 2011 through 16 September 2011

ER -