Predicting pulsar stars using a random tree boosting voting classifier (RTB-VC), Astronomy and Computing

Predicting pulsar stars using a random tree boosting voting classifier (RTB-VC)
Astronomy and Computing (IF 1.9), Pub Date: 2020-08-12, DOI: 10.1016/j.ascom.2020.100404
F. Rustam, A. Mehmood, S. Ullah, M. Ahmad, D. Muhammad Khan, G.S. Choi, B.-W. On

Pulsar candidate search experiments and surveys have driven recent exponential growth in both the data volume and the number of identified pulsar stars. In this study, we investigated existing methods and techniques for pulsar prediction, such as applying filters based on pulsar observations, which can reduce prediction accuracy. Some existing methods cannot handle large volumes of data, and others fail to accurately select the best candidates from pulsar observations. We therefore developed a new approach based on traditional supervised machine learning algorithms, which yields faster and more accurate results. We present a hybrid machine learning classifier, the random trees boosting voting classifier (RTB-VC), for predicting pulsar stars. RTB-VC combines tree-based classifiers and uses the High Time Resolution Universe 2 (HTRU2) data set, which comprises eight features describing pulsar and non-pulsar candidates. The HTRU2 data set is imbalanced; we address this with the synthetic minority oversampling technique (SMOTE), which generates artificial data to obtain a balanced data set. The feature set is used to separate pulsar from non-pulsar candidates because the different distributions of the variables in the data set help in training models. In the proposed approach, the prediction stage of RTB-VC combines soft voting, hard voting, and weighted voting to obtain highly accurate and relevant criteria for the final pulsar or non-pulsar prediction. The ensemble-based structure of RTB-VC yields accurate results on pulsar observations, with a high $F_{1}$ score for pulsars (98.3%). We evaluated the learning algorithm in terms of accuracy, precision, recall, and $F_{1}$ score.
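The abstract names three mechanisms that are easier to follow with a small example: synthetic minority oversampling, voting over classifier outputs (hard, soft, and weighted), and the $F_{1}$ score. The sketch below illustrates each idea in plain Python. It is not the authors' implementation: the function names, the toy probabilities, and the weights are all assumptions made for illustration, and the abstract does not say how RTB-VC combines the three voting schemes. In practice, library implementations such as scikit-learn's `VotingClassifier` and imbalanced-learn's `SMOTE` would normally be used instead.

```python
import random
from collections import Counter


def smote_like(minority, n_new, k=2, seed=0):
    """Core SMOTE idea (simplified): create each synthetic point by
    interpolating between a minority sample and one of its k nearest
    minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbours of x among the other minority samples
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        t = rng.random()  # position along the segment from x to nb
        synthetic.append([a + t * (b - a) for a, b in zip(x, nb)])
    return synthetic


def hard_vote(labels):
    """Hard voting: majority vote over predicted class labels."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probas, weights=None):
    """Soft voting: average the per-class probabilities and take the argmax.
    Passing weights gives weighted voting."""
    weights = weights or [1.0] * len(probas)
    total = sum(weights)
    n_classes = len(probas[0])
    avg = [
        sum(w * p[c] for w, p in zip(weights, probas)) / total
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=avg.__getitem__)


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (hypothetical numbers): three classifiers score one candidate
# as [P(non-pulsar), P(pulsar)].
probas = [[0.60, 0.40], [0.55, 0.45], [0.10, 0.90]]
labels = [max(range(2), key=p.__getitem__) for p in probas]  # [0, 0, 1]

print(hard_vote(labels))             # 0: two of three classifiers say non-pulsar
print(soft_vote(probas))             # 1: the confident third classifier wins
print(soft_vote(probas, [3, 3, 1]))  # 0: down-weighting it flips the result
```

The toy example shows why the choice of voting scheme matters: the same three classifier outputs give different final labels under hard, soft, and weighted voting.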

Updated: 2020-08-12
