An Instance Space Analysis of Regression Problems, ACM Transactions on Knowledge Discovery from Data



An Instance Space Analysis of Regression Problems
ACM Transactions on Knowledge Discovery from Data (IF 3.6), Pub Date: 2021-03-27, DOI: 10.1145/3436893
Mario Andrés Muñoz ₁ , Tao Yan ₁ , Matheus R. Leal ₂ , Kate Smith-Miles ₁ , Ana Carolina Lorena ₃ , Gisele L. Pappa ₂ , Rômulo Madureira Rodrigues ₄


The quest for greater insights into algorithm strengths and weaknesses, as revealed when studying algorithm performance on large collections of test problems, is supported by interactive visual analytics tools. A recent advance is Instance Space Analysis, which presents a visualization of the space occupied by the test datasets, and the performance of algorithms across the instance space. The strengths and weaknesses of algorithms can be visually assessed, and the adequacy of the test datasets can be scrutinized through visual analytics. This article presents the first Instance Space Analysis of regression problems in Machine Learning, considering the performance of 14 popular algorithms on 4,855 test datasets from a variety of sources. The two-dimensional instance space is defined by measurable characteristics of regression problems, selected from over 26 candidate features. It enables the similarities and differences between test instances to be visualized, along with the predictive performance of regression algorithms across the entire instance space. The purpose of creating this framework for visual analysis of an instance space is twofold: one may assess the capability and suitability of various regression techniques; meanwhile the bias, diversity, and level of difficulty of the regression problems popularly used by the community can be visually revealed. This article shows the applicability of the created regression instance space to provide insights into the strengths and weaknesses of regression algorithms, and the opportunities to diversify the benchmark test instances to support greater insights.
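As a rough illustration of the idea in the abstract, and not the authors' method, the sketch below places synthetic instances in a two-dimensional space and records which algorithm performs best on each. Everything in it is an assumption made for illustration: the random feature matrix, the random error table, the sizes, the helper names `project_to_2d` and `best_algorithm`, and the use of plain PCA as a stand-in projection. The abstract does not say how the paper selects features or builds its projection.

```python
import numpy as np

def project_to_2d(features):
    """Project an (n_instances, n_features) matrix to 2-D with plain PCA.

    PCA is an assumed stand-in here, not the projection used in the paper.
    """
    # Standardize each feature so no single scale dominates the projection.
    std = features.std(axis=0)
    std[std == 0] = 1.0
    z = (features - features.mean(axis=0)) / std
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return z @ vt[:2].T

def best_algorithm(errors):
    """Return, per instance, the index of the algorithm with the lowest error."""
    return np.argmin(errors, axis=1)

rng = np.random.default_rng(0)
n_instances, n_features, n_algorithms = 200, 8, 3       # hypothetical sizes
features = rng.normal(size=(n_instances, n_features))   # synthetic meta-features
errors = rng.random(size=(n_instances, n_algorithms))   # synthetic per-algorithm errors

coords = project_to_2d(features)
winners = best_algorithm(errors)

# Summarize where each algorithm wins in the projected space.
for a in range(n_algorithms):
    mask = winners == a
    centroid = coords[mask].mean(axis=0)
    print(f"algorithm {a}: wins on {mask.sum()} instances, centroid {centroid.round(2)}")
```

Plotting `coords` as a scatter colored by `winners` would give a simple picture of where each algorithm does well and where the synthetic test set is sparse. With real data, the feature matrix would hold measured characteristics of each regression dataset and the error table would hold each algorithm's measured predictive error.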


Updated: 2021-03-27
