Given n random variables and a set of m observations of each of the n variables, the Bayesian network inference problem is to infer a directed acyclic graph (DAG) on the n variables such that the implied joint probability distribution best explains the set of observations. Bayesian networks are widely used in fields ranging from data mining to computational biology. Exact inference of Bayesian networks takes O(n^2 · 2^n) time plus the cost of O(n · 2^n) evaluations of an application-specific scoring function. In this paper, we present a parallel algorithm for exact Bayesian network inference that is work-optimal and communication-efficient. We demonstrate the applicability of our method with an implementation on the IBM Blue Gene/L, with experimental results that exhibit near-perfect scaling.
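To make the stated bounds concrete, the following is a minimal sequential sketch of a textbook dynamic program over variable subsets that attains O(n^2 · 2^n) time with O(n · 2^n) scoring calls. It is an illustration under assumptions, not necessarily the formulation used in this paper, and it is not the parallel algorithm. The names `exact_structure`, `score`, `bps`, `bpa`, `best_net`, and `sink` are hypothetical; `score(v, parents_mask)` stands in for the application-specific scoring function and is assumed to be decomposable, so that a network's score is the sum of per-variable scores (higher is better).

```python
from math import inf
from typing import Callable, List, Tuple


def exact_structure(n: int, score: Callable[[int, int], float]) -> Tuple[float, List[int]]:
    """Return (best total score, parents), where parents[v] is a bitmask of v's parents.

    Assumes a decomposable score supplied by the caller (hypothetical interface).
    """
    full = (1 << n) - 1

    # bps[v][S]: best score for v with parents drawn from subset S (v not in S).
    # bpa[v][S]: the parent set (bitmask) that achieves bps[v][S].
    bps = [dict() for _ in range(n)]
    bpa = [dict() for _ in range(n)]
    for v in range(n):
        # Increasing mask order guarantees every S minus one element is done before S.
        for S in range(1 << n):
            if (S >> v) & 1:
                continue
            best, arg = score(v, S), S  # one scoring call per (v, S) pair
            rest = S
            while rest:
                u = rest & -rest  # lowest set bit of rest
                rest ^= u
                sub = S ^ u       # S with element u removed
                if bps[v][sub] > best:
                    best, arg = bps[v][sub], bpa[v][sub]
            bps[v][S], bpa[v][S] = best, arg

    # best_net[S]: best score of a DAG on subset S; sink[S]: the sink chosen for S.
    best_net = [0.0] * (1 << n)
    sink = [-1] * (1 << n)
    for S in range(1, 1 << n):
        best_net[S] = -inf
        for v in range(n):
            if (S >> v) & 1:
                others = S ^ (1 << v)
                cand = best_net[others] + bps[v][others]
                if cand > best_net[S]:
                    best_net[S], sink[S] = cand, v

    # Peel off sinks from the full set to recover each variable's parent set.
    parents = [0] * n
    S = full
    while S:
        v = sink[S]
        S ^= 1 << v
        parents[v] = bpa[v][S]
    return best_net[full], parents
```

Each of the n variables is scored against the 2^(n-1) subsets of the remaining variables, which accounts for the O(n · 2^n) scoring evaluations; the inner loops over subset elements contribute the extra factor of n in the running time. Because every variable's parents are chosen from the variables remaining after it is removed as a sink, the result is acyclic by construction.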