Journal of Parallel and Distributed Computing (IF 3.4), Pub Date: 2021-01-08, DOI: 10.1016/j.jpdc.2020.12.012
Yulai Zhang, Jiachen Wang, Gang Cen, Kueiming Lo
Inferring the causal direction between two variables from observational data is one of the most fundamental and challenging problems in data science. A causal direction inference algorithm maps the observed data to a binary value that represents either "X causes Y" or "Y causes X". By their nature, the outputs of these algorithms are unstable under changes in the data points. The accuracy of causal direction inference can therefore be improved significantly by using parallel ensemble frameworks. In this paper, new causal direction inference algorithms based on several parallel ensemble schemes are proposed. Theoretical analyses of their accuracy rates are given. Experiments are conducted on both artificial and real-world data sets. The accuracy of the methods and their computational efficiency in a parallel computing environment are demonstrated.
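As a rough illustration of the idea in the abstract, the sketch below bootstraps the paired sample, runs a base inference routine on each resample in parallel, and takes a majority vote. It is not the paper's algorithm: the function names, the +1/-1 encoding, the bootstrap scheme, and the `infer` callable (a stand-in for any base causal direction method) are all assumptions made for illustration. The accuracy function is the standard majority-vote formula under the idealized assumption that the ensemble members err independently, which resamples of one data set only approximate.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from math import comb


def majority_vote_accuracy(p, n):
    """Probability that a majority of n independent voters is correct,
    when each voter is correct with probability p (n must be odd)."""
    if n % 2 == 0:
        raise ValueError("n must be odd to avoid ties")
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))


def ensemble_direction(x, y, infer, n_members=11, seed=0, max_workers=4):
    """Majority vote of `infer` over bootstrap resamples of (x, y).

    `infer(xs, ys)` is a hypothetical base learner that returns
    +1 for "X causes Y" and -1 for "Y causes X".
    """
    if n_members % 2 == 0:
        raise ValueError("n_members must be odd to avoid ties")
    rng = random.Random(seed)
    m = len(x)
    resamples = []
    for _ in range(n_members):
        # Resample index pairs with replacement so x[i] stays matched to y[i].
        idx = [rng.randrange(m) for _ in range(m)]
        resamples.append(([x[i] for i in idx], [y[i] for i in idx]))
    # Threads keep the sketch simple; a process pool would be the usual
    # choice when the base learner is CPU-bound.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        votes = list(pool.map(lambda xy: infer(*xy), resamples))
    return 1 if sum(votes) > 0 else -1
```

Under the independence assumption, three members that are each correct with probability 0.6 give a majority-vote accuracy of 0.648, and the value grows with the number of members as long as each member is better than chance.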
Title: Parallel ensemble methods for causal direction inference