HillClimbEstimator done

Posted on Sun 14 August 2016 in score_based

pgmpy now has a basic hill-climbing estimator for learning the structure of a Bayesian network (BN) from data.

Usage:

import pandas as pd
import numpy as np
from pgmpy.estimators import HillClimbSearch, BicScore

# create data sample with 9 random variables:
data = pd.DataFrame(np.random.randint(0, 5, size=(5000, 9)), columns=list('ABCDEFGHI'))
# add 10th dependent variable
data['J'] = data['A'] * data['B']

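# score candidate structures with BIC
# (later pgmpy releases expect scoring_method as an argument to estimate() instead)
est = HillClimbSearch(data, scoring_method=BicScore(data))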
best_model = est.estimate()

print(sorted(best_model.nodes()))
print(sorted(best_model.edges()))

Output:

['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
[('A', 'J'), ('B', 'J')]
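
For readers curious about what a hill-climb structure search does in general, here is a minimal standalone sketch. It is not pgmpy's implementation: the helper names (bic_local, creates_cycle, hill_climb) are made up for illustration, it uses only the standard library, and it considers only single-edge additions and removals (no edge reversals, no caching). Starting from an empty graph, it repeatedly applies the one change that improves a BIC-style score the most and stops when no change helps.

```python
import itertools
import math
import random
from collections import Counter


def bic_local(rows, child, parents):
    """BIC-style score of one node given its parents (rows: list of dicts)."""
    n = len(rows)
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in rows)
    parent_counts = Counter(tuple(r[p] for p in parents) for r in rows)
    # maximised log-likelihood of the child's conditional distribution
    loglik = sum(c * math.log(c / parent_counts[pa]) for (pa, _), c in joint.items())
    # number of free parameters, using the cardinalities observed in the data
    child_card = len({r[child] for r in rows})
    parent_card = math.prod(len({r[p] for r in rows}) for p in parents)
    return loglik - 0.5 * math.log(n) * (child_card - 1) * parent_card


def creates_cycle(parents, u, v):
    """True if adding u -> v closes a directed cycle (v is an ancestor of u)."""
    stack, seen = [u], set()
    while stack:
        node = stack.pop()
        if node == v:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False


def hill_climb(rows, nodes):
    """Greedy search over DAGs; returns a sorted list of (parent, child) edges."""
    parents = {node: set() for node in nodes}
    while True:
        best_gain, best_op = 1e-9, None
        for u, v in itertools.permutations(nodes, 2):
            old = bic_local(rows, v, sorted(parents[v]))
            if u in parents[v]:
                new_parents = parents[v] - {u}      # candidate: remove u -> v
            elif not creates_cycle(parents, u, v):
                new_parents = parents[v] | {u}      # candidate: add u -> v
            else:
                continue
            gain = bic_local(rows, v, sorted(new_parents)) - old
            if gain > best_gain:
                best_gain, best_op = gain, (v, new_parents)
        if best_op is None:                         # local optimum reached
            return sorted((p, c) for c in nodes for p in parents[c])
        child, new_parents = best_op
        parents[child] = new_parents


# toy data: three independent variables plus J = A * B
rng = random.Random(0)
rows = []
for _ in range(2000):
    row = {name: rng.randint(0, 2) for name in "ABC"}
    row["J"] = row["A"] * row["B"]
    rows.append(row)

edges = hill_climb(rows, ["A", "B", "C", "J"])
print(edges)
```

On this toy data the search should connect J to both A and B and leave C unconnected. The direction of an edge can vary, because BIC gives u -> v and v -> u the same score when considered in isolation, so ties are broken by evaluation order and rounding.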