The probabilistic random forest applied to the selection of quasar candidates in the QUBRICS survey
Monthly Notices of the Royal Astronomical Society (IF 4.8), Pub Date: 2021-06-29, DOI: 10.1093/mnras/stab1867
Francesco Guarneri 1, 2 , Giorgio Calderone 2 , Stefano Cristiani 2, 3, 4 , Fabio Fontanot 2 , Konstantina Boutsia 5 , Guido Cupani 2 , Andrea Grazian 6 , Valentina D’Odorico 2, 3, 7

The number of known, bright (i < 18), high-redshift (z > 2.5) QSOs in the Southern hemisphere is considerably lower than the corresponding number in the Northern hemisphere, owing to the lack of multiwavelength surveys at δ < 0. Recent works, such as the QUBRICS survey, successfully identified new, high-redshift QSOs in the South by means of a machine-learning approach applied to a large photometric data-set. Building on the success of QUBRICS, we present a new QSO selection method based on the Probabilistic Random Forest (PRF), an improvement of the classic Random Forest algorithm. The PRF takes measurement errors into account by treating input data as probability distribution functions: this allows us to obtain better accuracy and a robust predictive model. We applied the PRF to the same photometric data-set used in QUBRICS, based on the SkyMapper DR1, Gaia DR2, 2MASS, WISE, and GALEX databases. The resulting candidate list includes 626 sources with i < 18. We estimate for our proposed algorithm a completeness of ∼84 per cent and a purity of ∼78 per cent on the test data-sets. Preliminary spectroscopic campaigns allowed us to observe 41 candidates, of which 29 turned out to be z > 2.5 QSOs. The performance of the PRF, currently comparable to that of the CCA, is expected to improve as the number of high-z QSOs available for the training sample grows; the results are nevertheless already promising, even though this is one of the first applications of the method in an astrophysical context.
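As a reading aid for the figures quoted above, the following minimal Python sketch shows how completeness and purity are conventionally defined for a candidate selection (they correspond to recall and precision). The confusion-matrix counts `tp`, `fn`, and `fp` are hypothetical values chosen only to reproduce roughly the quoted percentages; they are not taken from the paper. Only the 41 observed and 29 confirmed candidates come from the abstract.

```python
def completeness(tp: int, fn: int) -> float:
    """Fraction of true QSOs recovered by the selection (recall)."""
    return tp / (tp + fn)


def purity(tp: int, fp: int) -> float:
    """Fraction of selected candidates that are true QSOs (precision)."""
    return tp / (tp + fp)


# Hypothetical counts (an assumption, not from the paper), picked so the
# ratios land near the quoted ~84 per cent and ~78 per cent.
tp, fn, fp = 84, 16, 24
print(f"completeness = {completeness(tp, fn):.2f}")
print(f"purity       = {purity(tp, fp):.2f}")

# Numbers stated in the abstract: 41 candidates observed, 29 confirmed
# as z > 2.5 QSOs.
observed, confirmed = 41, 29
success_rate = confirmed / observed
print(f"spectroscopic success rate = {success_rate:.3f}")
```

The spectroscopic success rate is an empirical counterpart to the purity estimated on the test data-sets, measured on a small observed sample, so some scatter between the two is to be expected.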

Updated: 2021-06-29