Hierarchical Independence Thresholding for learning Bayesian network classifiers
Knowledge-Based Systems (IF 8.8), Pub Date: 2020-11-25, DOI: 10.1016/j.knosys.2020.106627
Yang Liu, Limin Wang, Musa Mammadov, Shenglei Chen, Gaojie Wang, Sikai Qi, Minghui Sun

Bayesian networks are powerful tools for knowledge representation and inference under uncertainty. However, learning an optimal Bayesian network classifier (BNC) is an NP-hard problem because the complexity of its topology grows exponentially with the number of attributes. Researchers have proposed applying information-theoretic criteria to measure conditional dependence, and independence assumptions are introduced, implicitly or explicitly, to simplify the network topology of a BNC. In this paper, we clarify the mapping between conditional mutual information and local topology. We then illustrate that informational independence does not correspond to probabilistic independence, and that the criterion of probabilistic independence does not necessarily hold for the independence topology. We present a novel framework of semi-naive Bayesian operation, called Hierarchical Independence Thresholding (HIT), which efficiently identifies informational conditional independence and probabilistic conditional independence by applying an adaptive thresholding method. As a result, redundant edges are filtered out and the learned topology fits the data better. Extensive experimental evaluation on 58 publicly available datasets shows that when HIT is applied to BNCs (such as tree-augmented naive Bayes or the k-dependence Bayesian classifier), the resulting BNCs achieve classification performance competitive with state-of-the-art learners such as random forest and logistic regression.
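To make the role of conditional mutual information concrete, the sketch below estimates I(Xi; Xj | C) from discrete data and drops attribute pairs whose score falls below a cutoff. It is not the HIT procedure itself: the abstract does not describe how its adaptive threshold is computed, so the fixed `threshold` argument and the function names `conditional_mutual_information` and `filter_edges` are assumptions made purely for illustration.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(xi, xj, c):
    """Empirical I(Xi; Xj | C) in bits for three equal-length discrete sequences."""
    n = len(c)
    n_ijc = Counter(zip(xi, xj, c))  # joint counts of (xi, xj, class)
    n_ic = Counter(zip(xi, c))       # counts of (xi, class)
    n_jc = Counter(zip(xj, c))       # counts of (xj, class)
    n_c = Counter(c)                 # class counts
    cmi = 0.0
    for (a, b, y), count in n_ijc.items():
        # P(a, b | y) / (P(a | y) * P(b | y)) expressed with raw counts
        ratio = count * n_c[y] / (n_ic[(a, y)] * n_jc[(b, y)])
        cmi += (count / n) * log2(ratio)
    return cmi


def filter_edges(columns, labels, threshold):
    """Keep attribute pairs whose conditional mutual information exceeds threshold.

    `threshold` is a fixed, hypothetical cutoff; the paper's method is adaptive.
    """
    names = list(columns)
    kept = []
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            score = conditional_mutual_information(columns[u], columns[v], labels)
            if score > threshold:
                kept.append((u, v, score))
    return kept
```

On a toy dataset where one attribute copies another and a third is independent of both within each class, only the copied pair would survive the filter. A fixed cutoff is the simplest possible choice here and stands in for whatever data-dependent rule a real learner would use.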




Updated: 2020-12-01