TY - GEN
T1 - Parallel discovery of direct causal relations and Markov boundaries with applications to gene networks
AU - Nikolova, Olga
AU - Aluru, Srinivas
N1 - Copyright 2011 Elsevier B.V., All rights reserved.
PY - 2011
Y1 - 2011
N2 - Bayesian networks enable formal probabilistic reasoning on a set of interacting variables of a domain, and have been shown to have broad applicability. More specifically, in bioinformatics, Bayesian networks are used to model gene interactions. Learning the structure of a Bayesian network is an NP-hard problem, making it necessary to employ heuristics for solving large-scale problems. In this paper, we present parallel algorithms for two problems that arise in network structure learning and analysis: (i) the discovery of all direct causal relations for each variable, i.e., the set of parents and children of each node in the corresponding Bayesian network, and (ii) the computation of the Markov boundary of each variable, defined as the minimal set of variables that shield the target variable from all other variables in the domain. Our parallel algorithms are based on state-of-the-art constraint-based heuristic optimization methods. They are shown to be work-optimal and communication-efficient, and exhibit nearly perfect scaling.
AB - Bayesian networks enable formal probabilistic reasoning on a set of interacting variables of a domain, and have been shown to have broad applicability. More specifically, in bioinformatics, Bayesian networks are used to model gene interactions. Learning the structure of a Bayesian network is an NP-hard problem, making it necessary to employ heuristics for solving large-scale problems. In this paper, we present parallel algorithms for two problems that arise in network structure learning and analysis: (i) the discovery of all direct causal relations for each variable, i.e., the set of parents and children of each node in the corresponding Bayesian network, and (ii) the computation of the Markov boundary of each variable, defined as the minimal set of variables that shield the target variable from all other variables in the domain. Our parallel algorithms are based on state-of-the-art constraint-based heuristic optimization methods. They are shown to be work-optimal and communication-efficient, and exhibit nearly perfect scaling.
KW - Bayesian networks
KW - Causal relations
KW - Constraint-based learning
KW - Markov boundaries
UR - http://www.scopus.com/inward/record.url?scp=80155187592&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80155187592&partnerID=8YFLogxK
U2 - 10.1109/ICPP.2011.49
DO - 10.1109/ICPP.2011.49
M3 - Conference contribution
AN - SCOPUS:80155187592
SN - 9780769545103
T3 - Proceedings of the International Conference on Parallel Processing
SP - 512
EP - 521
BT - Proceedings - 2011 International Conference on Parallel Processing, ICPP 2011
T2 - 40th International Conference on Parallel Processing, ICPP 2011
Y2 - 13 September 2011 through 16 September 2011
ER -