The feasibility and flexibility of selecting quasars by variability using ensemble machine learning algorithms
Research in Astronomy and Astrophysics (IF 1.8), Pub Date: 2021-05-20, DOI: 10.1088/1674-4527/21/4/99
Da-Ming Yang, Zhang-Liang Xie, Jun-Xian Wang

In this work, we train three decision-tree-based ensemble machine learning algorithms (Random Forest Classifier, Adaptive Boosting, and Gradient Boosting Decision Tree) to study quasar selection in the variable source catalog of SDSS Stripe 82. We build training and test samples (each containing quasars and stars in a 1:1 ratio) from the spectroscopically confirmed sources in SDSS DR14 (8330 quasars and 3966 stars). We find that when trained with variability parameters alone, all three models select quasars with similar and remarkably high precision and completeness (∼98.5% and 97.5%, respectively), even better than when trained with SDSS colors alone (∼97.2% and 96.5%), consistent with previous studies. By applying the trained models to the variable sources without spectroscopic identifications, we estimate that the spectroscopically confirmed quasar sample in the Stripe 82 variable source catalog is ∼93% complete (95% for m_i < 19.0). Using the Random Forest Classifier, we derive the relative importance of the observational features used for classification. We further show that even with only one or two years of time-domain observations, variability-based quasar selection can still be highly efficient.
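
The two figures of merit quoted in the abstract, precision and completeness, can be illustrated with a short sketch. It is not the authors' code: the function names `balanced_split` and `precision_and_completeness` are hypothetical, the label convention (1 for quasar, 0 for star) is an assumption, and undersampling the larger class is only one plausible way to obtain the 1:1 samples, since the abstract does not say how the balance was achieved.

```python
import random


def balanced_split(quasars, stars, test_fraction=0.25, seed=0):
    """Undersample the larger class to 1:1, then split into train and test.

    Assumption: the abstract only states that both samples are 1:1;
    undersampling and the 25% test fraction are illustrative choices.
    """
    rng = random.Random(seed)
    n = min(len(quasars), len(stars))
    q = rng.sample(quasars, n)  # n quasars drawn without replacement
    s = rng.sample(stars, n)    # n stars drawn without replacement
    n_test = int(n * test_fraction)
    # Label convention (assumed): 1 = quasar, 0 = star.
    test = [(x, 1) for x in q[:n_test]] + [(x, 0) for x in s[:n_test]]
    train = [(x, 1) for x in q[n_test:]] + [(x, 0) for x in s[n_test:]]
    return train, test


def precision_and_completeness(y_true, y_pred):
    """Return (precision, completeness) for the quasar class (label 1).

    precision    = TP / (TP + FP): fraction of selected sources that are quasars
    completeness = TP / (TP + FN): fraction of true quasars that are selected
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    completeness = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, completeness
```

Completeness as defined here is what machine learning libraries usually call recall. Calling `balanced_split` on placeholder lists of 8330 quasars and 3966 stars keeps 3966 objects of each class, so every split it returns preserves the 1:1 ratio.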



Updated: 2021-05-20