# Score-based Structure Learning BNs

Posted on Fri 24 June 2016 in score_based

With a bit of delay, I am now working on a basic PR for score-based structure estimation for Bayesian Networks. It comes with two ingredients:

• The StructureScore-Class and its subclasses BayesianScore and BICScore. They are initialized with a data set and provide a score-method to compute how well a given BayesianModel can be fitted to the data, according to different criteria. Since those scores are decomposable for BNs, a local_score-method is also exposed for node-by-node computation.
• The StructureSearch-Class and its subclasses ExhaustiveSearch and HCSearch. They are initialized with a StructureScore-instance and optimize that score over all BayesianModels. The latter subclass has a number of optional search enhancements.

So far BayesianScore supports BDeu and K2 priors, I’ll think for a good interface to specify other prior weights. With K2 priors the score is given by the following form:

$$score^{K2}_D(m) = \log(P(m)) + \sum_{X\in nodes(m)} local\_score^{K2}_D(X, parents_m(X))$$

where $$P(m)$$ is an optional structure prior that is quite negligible in practice. $$local\_score^{K2}$$ is computed for each node as follows:

$$local\_score^{K2}_D(X, P_X) = \sum_{j=1}^{q(P_X)} (\log(\frac{(r-1)!}{(N_j+r-1)!}) + \sum_{k=1}^r \log(N_{jk}!))$$

Where $$r$$ is the cardinality of the variable $$X$$, $$q(P_X)$$ is the product of the cardinalities of the parents of $$X$$ (= the possible states of $$P_X$$) and $$N_{jk}$$ is the number of times that variable $$X$$ is in state $$k$$ while parents are in state $$j$$ in the data sample. Finally, $$N_j:=\sum_{k=1}^r N_{jk}$$.