How to Train Your Flare Prediction Model: Revisiting Robust Sampling of Rare Events
The Astrophysical Journal Supplement Series (IF 8.6), Pub Date: 2021-05-17, DOI: 10.3847/1538-4365/abec88
Azim Ahmadzadeh 1, Berkay Aydin 1, Manolis K. Georgoulis 2, Dustin J. Kempton 1, Sushant S. Mahajan 3, Rafal A. Angryk 1

We present a case study of solar flare forecasting by means of metadata feature time series, by treating it as a prominent class-imbalance and temporally coherent problem. Taking full advantage of pre-flare time series in solar active regions is made possible via the Space Weather Analytics for Solar Flares (SWAN-SF) benchmark data set, a partitioned collection of multivariate time series of active region properties comprising 4075 regions and spanning over 9 yr of the Solar Dynamics Observatory period of operations. We showcase the general concept of temporal coherence triggered by the demand of continuity in time series forecasting and show that lack of proper understanding of this effect may spuriously enhance models’ performance. We further address another well-known challenge in rare-event prediction, namely, the class-imbalance issue. The SWAN-SF is an appropriate data set for this, with a 60:1 imbalance ratio for GOES M- and X-class flares and an 800:1 imbalance ratio for X-class flares against flare-quiet instances. We revisit the main remedies for these challenges and present several experiments to illustrate the exact impact that each of these remedies may have on performance. Moreover, we acknowledge that some basic data manipulation tasks such as data normalization and cross validation may also impact the performance; we discuss these problems as well. In this framework we also review the primary advantages and disadvantages of using true skill statistic and Heidke skill score, two widely used performance verification metrics for the flare-forecasting task. In conclusion, we show and advocate for the benefits of time series versus point-in-time forecasting, provided that the above challenges are measurably and quantitatively addressed.
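The contrast between the two verification metrics named in the abstract can be illustrated with a small sketch. It assumes the standard 2x2 contingency-table definitions of the true skill statistic and the Heidke skill score (the form often labeled HSS2 in the flare-forecasting literature); the abstract does not state which form the paper uses. The function names and all counts below are hypothetical and chosen only for illustration. They are not results from the paper. The 60:1 ratio simply mirrors the imbalance quoted for M- and X-class flares.

```python
def tss(tp: int, fn: int, fp: int, tn: int) -> float:
    """True skill statistic: hit rate minus false-alarm rate."""
    return tp / (tp + fn) - fp / (fp + tn)


def hss(tp: int, fn: int, fp: int, tn: int) -> float:
    """Heidke skill score (2x2 form): skill relative to random chance."""
    numerator = 2 * (tp * tn - fn * fp)
    denominator = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    return numerator / denominator


# Hypothetical classifier with a hit rate of 0.8 and a false-alarm rate of 0.1,
# evaluated on two test sets that differ only in the number of negatives.
balanced = dict(tp=80, fn=20, fp=10, tn=90)        # 1:1 negatives to positives
imbalanced = dict(tp=80, fn=20, fp=600, tn=5400)   # 60:1 negatives to positives

for name, counts in [("balanced", balanced), ("imbalanced", imbalanced)]:
    print(f"{name:>10}: TSS={tss(**counts):.3f}  HSS={hss(**counts):.3f}")
```

With these made-up counts, TSS stays at 0.70 in both cases because it depends only on the two per-class rates, while HSS falls from 0.70 to about 0.18 once the negatives outnumber the positives 60:1. This is one reason the two scores should not be compared across data sets with different imbalance ratios.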




Updated: 2021-05-17