Proxy-based Prediction of Solar Extreme Ultraviolet Emission Using Deep Learning


Published 2021 April 6 © 2021. The American Astronomical Society. All rights reserved.
Citation Anthony Pineci et al 2021 ApJL 910 L25 DOI 10.3847/2041-8213/abee89


2041-8205/910/2/L25

Abstract

High-energy radiation from the Sun governs the behavior of Earth's upper atmosphere and such radiation from any planet-hosting star can drive the long-term evolution of a planetary atmosphere. However, much of this radiation is unobservable because of absorption by Earth's atmosphere and the interstellar medium. This motivates the identification of a proxy that can be readily observed from the ground. Here, we evaluate absorption in the near-infrared 1083 nm triplet line of neutral orthohelium as a proxy for extreme ultraviolet (EUV) emission in the 30.4 nm line of He ii and 17.1 nm line of Fe ix from the Sun. We apply deep learning to model the nonlinear relationships, training and validating the model on historical, contemporaneous images of the solar disk acquired in the triplet He i line by the ground-based SOLIS observatory and in the EUV by the NASA Solar Dynamics Observatory. The model is a fully convolutional neural network that incorporates spatial information and accounts for the projection of the spherical Sun to 2D images. Using normalized target values, results indicate a median pixelwise relative error of 20% and a mean disk-integrated flux error of 7% on a held-out test set. Qualitatively, the model learns the complex spatial correlations between He i absorption and EUV emission and has a predictive ability superior to that of a pixel-by-pixel model; it can also distinguish active regions from high-absorption filaments that do not result in EUV emission.


1. Introduction

The Sun emits a mere ∼10⁻⁶ of its energy in the extreme ultraviolet (EUV, λ = 10–120 nm) but this radiation heats Earth's upper atmosphere, causing it to expand, and is a critical input for predictions of the lifetime of low-Earth-orbit satellites (Vourlidas & Bruinsma 2018). Models predict that over timescales of 0.1–1 Gyr, EUV radiation from host stars can drive significant escape of the atmospheres of planets on close-in orbits (Owen 2019), and the evolution in EUV emission as stars spin down and become less magnetically active is an area of active research in stellar and planetary astronomy (e.g., Linsky et al. 2014). Problematically, the EUV is only observable from space and is heavily absorbed by the interstellar medium; only the Sun and some of the nearest normal stars have been detected. Any proxy that can be monitored from the ground could greatly improve our understanding of EUV radiation from other stars.

Helium, an abundant element in all stars, has a neutral "ortho" state with a triplet of absorption lines at 1083 nm (near-infrared) that is readily observed from the ground. In cool stars, the metastable orthohelium state is primarily populated by recombination of singly ionized He, which under quiescent (nonflaring) conditions is a product of photoionization by EUV (λ < 50.4 nm) photons (Oklopčić & Hirata 2018). Neutral orthohelium is depleted both by ionizing UV photons with λ < 259 nm and de-excitation to the singlet state by electron collisions. Thus, there is a causal connection between EUV emission, which emanates from the transition region and corona (Golding et al. 2017), and the strength of He i 1083 nm absorption, which arises in the upper chromosphere. Three-dimensional radiation-magnetohydrodynamic simulations by Leenaarts et al. (2016) suggest that the He-ionizing radiation field has a diffuse, highly scattered component from the corona and a localized component from the transition region. The latter gives rise to spatial covariance between He i absorption and EUV emission; both are elevated in active plage regions and covary with time between solar minima and maxima (Floyd et al. 2005). Long-term monitoring has established that the He i 1083 nm line strength is an accurate proxy for EUV emission (Harvey & Livingston 1994; Deland & Cebula 2008). Since the advent of large-format near-infrared imaging arrays, the disk-resolved He i line has been routinely monitored on the Sun (Penn 2014). The disk-integrated line has also been surveyed among mainly Sun-like stars, including planet hosts (Andretta et al. 2017).

Due to the complexity of the solar atmosphere, particularly in magnetically active regions in which much of the EUV emission occurs, quantitative prediction of the EUV–He i relation is at the frontier of model physics and computational resources (Leenaarts et al. 2016; Rempel 2017). Given sufficient quantity and quality of data, empirical deep learning (DL) approaches can succeed where theoretical approaches cannot. Here, we apply DL to map ground-based images of the 1083 nm He i line strength to contemporaneous solar images of EUV emission obtained with the Solar Dynamics Observatory (SDO) space telescope on its geosynchronous orbit (Pesnell et al. 2012). The high-cadence, high spatial resolution, multiwavelength aspects of SDO lend themselves to DL (Galvez et al. 2019), and DL has been used to translate images of solar Ca ii emission into magnetic field maps (magnetograms; Shin et al. 2020), and magnetograms into solar UV/EUV (Park et al. 2019). DL has also been used on solar EUV images to predict coronal holes (Illarionov et al. 2020) and solar wind intensity (Upendran et al. 2020).

In this approach, a DL model is trained to accurately map infrared input (He i 1083 nm) images to EUV output (He ii 30.4 nm) images. The model's (hyper)parameters are tuned while evaluating the performance on a second, independent validation set of infrared-EUV image pairs. Finally, the ability of the model to predict EUV emission is evaluated on a third, independent test or holdout data set. Our goals were to (i) identify the DL model that most accurately predicts the disk-integrated EUV emission and (ii) better understand the spatially local relationship between He i line strength and EUV emission, and any connection to specific kinds of structures in the stellar atmosphere. A better understanding of this relationship could eventually lead to proxy-based estimates for replacing missing solar EUV data, and to estimates of EUV emission from distant planet host stars where direct measurements are currently not possible.

2. Data Sets and Methods

We used full-disk solar images (i.e., line-scanning-based maps) of the He i 1083 nm line strength (i.e., equivalent width) obtained by the Vector SpectroMagnetograph (VSM) at the Synoptic Optical Long-term Investigations of the Sun (SOLIS) observatory on Kitt Peak (Henney et al. 2009; see Figure 1). We paired these with full-disk images in the EUV obtained by the Atmospheric Imaging Assembly (AIA; Lemen et al. 2012) instrument on SDO. We primarily analyzed AIA images of emission in the He ii line at 30.4 nm, the dominant line in the EUV spectrum of the quiet Sun (Del Zanna et al. 2015). Helium is very abundant (8.7% by atom) and uniformly distributed in the Sun, and correlations between lines of He i and He ii are not due to any variation in abundance.


Figure 1. (a) Deep neural networks are trained to map He i absorption in the near-infrared at 1083 nm (top left) to the EUV emission of He ii at 30.4 nm (top right). Predictions on a test set (bottom right) correctly differentiate structures on the solar disk, and can be used to estimate total EUV emission with small and uniform error (bottom left). Units are standard deviation from the mean of the logarithmically scaled values. (b) Close-up of a different region, showing that while He i absorption and EUV emission are correlated, an exception is filaments, which are not associated with local sources of EUV emission.


SOLIS operated from 2005 to 2015, whereas SDO has been operational since 2010. A subset of 1321 image pairs was selected from data obtained during the overlap interval between 2010 May and 2015 July. VSM images were obtained on a daily basis, weather permitting, while SDO images are obtained with a 12 s cadence, and are available for download at a 2 minute cadence; therefore, each He i image was matched with the closest AIA image in time. The offset in time between the He i and EUV images in a pair is always less than 30 minutes and typically less than 1 minute.
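The nearest-in-time pairing described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `match_nearest` and the use of plain `datetime` lists are assumptions, and the 30 minute limit simply mirrors the maximum offset quoted in the text.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def match_nearest(vsm_times, aia_times, max_offset=timedelta(minutes=30)):
    """Pair each VSM time with the closest AIA time.

    aia_times must be sorted in ascending order. Pairs whose time offset
    is not smaller than max_offset are dropped (hypothetical policy).
    """
    pairs = []
    for t in vsm_times:
        i = bisect_left(aia_times, t)
        # Candidates are the AIA frames just before and just after t.
        candidates = aia_times[max(i - 1, 0):i + 1]
        if not candidates:
            continue
        best = min(candidates, key=lambda a: abs(a - t))
        if abs(best - t) < max_offset:
            pairs.append((t, best))
    return pairs
```

With AIA frames every 2 minutes, a He i image can be at most 1 minute from the nearest frame when AIA data are available, which is consistent with the typical offset quoted above.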

AIA images are acquired at a plate scale of 0.6'' per pixel, but these were spatially binned to a scale of 2.4'' per pixel, then cropped to 864 × 864 pixels to remove space outside of the limb. VSM 2048 × 2048 images with an original plate scale of 1'' were resized using linear interpolation to match the size and plate scale of the cropped EUV images. One hundred one image pairs have missing or corrupted data due to instrument errors. Of these pairs, 68 have faulty AIA images, while the remainder have problematic SOLIS images. AIA images are automatically metered according to the total solar intensity, and 28 images were obtained during solar flares and thus have short integration times and low signal to noise in regions of the solar disk outside of the flares. These images were excluded to reduce systematic effects on the training. AIA images are normalized according to their exposure time. The final processed data set consists of 1221 matched pairs of He i absorption and EUV emission. All image preprocessing is done using astropy (Astropy Collaboration et al. 2013, 2018), including using FITS headers for extraction of image exposure times, removing corrupted images, and matching the Sun scale of input and output images.
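As a rough sketch of the AIA preprocessing steps above (exposure normalization, 4 × 4 binning from 0.6'' to 2.4'' per pixel, and cropping), one could write the following with NumPy. The helper names are hypothetical, and two details are assumptions not stated in the text: that binning averages (rather than sums) each block, and that the crop is centered on the image.

```python
import numpy as np


def bin_image(img, factor):
    """Downsample a 2D array by averaging non-overlapping factor x factor blocks."""
    h, w = img.shape
    if h % factor or w % factor:
        raise ValueError("image dimensions must be divisible by the binning factor")
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def center_crop(img, size):
    """Return the central size x size portion of a 2D array."""
    h, w = img.shape
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]


def preprocess_aia(img, exposure_s, factor=4, size=864):
    """Normalize by exposure time, bin, and crop (assumed order of operations)."""
    return center_crop(bin_image(img / exposure_s, factor), size)
```

For a 4096 × 4096 input, the defaults yield a 1024 × 1024 binned image that is then cropped to 864 × 864.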

AIA images are also affected by long-term degradation in detector sensitivity. We corrected AIA pixel values using normalized values of the sensitivity (effective area) of the telescopes obtained from analysis of sounding-rocket calibrations (Boerner et al. 2012), with linear interpolation in time between calibration points.
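The degradation correction can be illustrated with a short sketch using `numpy.interp`. The function names and the calibration numbers in the usage note are hypothetical placeholders, not actual AIA calibration values; only the idea of dividing by a linearly interpolated, normalized sensitivity comes from the text.

```python
import numpy as np


def degradation_factor(obs_day, calib_days, calib_sensitivity):
    """Linearly interpolate normalized sensitivity at the observation time.

    calib_days must be increasing; outside the calibrated range,
    np.interp holds the first or last calibration value constant.
    """
    return np.interp(obs_day, calib_days, calib_sensitivity)


def correct_degradation(img, obs_day, calib_days, calib_sensitivity):
    """Divide pixel values by the interpolated sensitivity."""
    return img / degradation_factor(obs_day, calib_days, calib_sensitivity)
```

For example, with made-up calibration points of 1.0 at day 0 and 0.8 at day 100, an image taken at day 50 would be divided by 0.9.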

EUV emission values have very large variance, so logarithmic values are used instead for DL training. AIA data have been calibrated using electron hits on portions of the sensor (Boerner et al. 2012), a standardization which can result in negative emission values, so a constant term was added to the entire data set before the log-transform so that ∼1 in 1 million values are clipped.
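A minimal sketch of the offset-and-log transform follows. The text does not give the constant that was added, so here it is derived from a quantile of the data such that roughly the stated fraction of values is clipped; the `floor` value of 1.0 and all function names are assumptions. The standardization step reflects the units quoted in the Figure 1 caption (standard deviations from the mean of the log-scaled values).

```python
import numpy as np


def fit_log_offset(values, clip_fraction=1e-6, floor=1.0):
    """Choose an offset so that about clip_fraction of shifted values fall below floor."""
    return floor - np.quantile(values, clip_fraction)


def log_scale(values, offset, floor=1.0):
    """Shift, clip at floor so the logarithm is defined, then take the natural log."""
    return np.log(np.clip(values + offset, floor, None))


def standardize(log_values):
    """Express log-scaled values in standard deviations from their mean."""
    return (log_values - log_values.mean()) / log_values.std()
```

In practice the offset, mean, and standard deviation would presumably be computed once over the training data and reused for validation and test images.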

The performance of three neural network models was compared. The primary model was a Fully Convolutional Network (FCN; Long et al. 2015) based on a VGG16 architecture (Simonyan & Zisserman 2014), with 512 × 512 pixel single-channel input and output, and "skip" connections (which bypass successive layers) with 8× upsampling using transposed convolutions. We also used variations on a pixelwise predictor that contained no information from neighboring pixels (essentially an FCN with 1 × 1 kernels), and a U-Net architecture (Ronneberger et al. 2015) that is capable of capturing features on the scale of the solar disk.
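To make the "FCN with 1 × 1 kernels" description of the pixelwise predictor concrete, the sketch below implements a stack of 1 × 1 convolutions in NumPy and shows that each output pixel depends only on the input at the same location. The hidden width, ReLU activation, and random weights are illustrative assumptions; the actual models were implemented in pytorch and their layer sizes are not given here.

```python
import numpy as np


def conv1x1(x, weight, bias):
    """1x1 convolution: x is (C_in, H, W), weight is (C_out, C_in), bias is (C_out,)."""
    return np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]


def pixelwise_predict(x, layers):
    """Apply a stack of 1x1 convolutions with ReLU between (not after) layers."""
    for k, (w, b) in enumerate(layers):
        x = conv1x1(x, w, b)
        if k < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return x


rng = np.random.default_rng(0)
# Hypothetical two-layer model: 1 input channel -> 8 hidden -> 1 output channel.
layers = [(rng.normal(size=(8, 1)), np.zeros(8)),
          (rng.normal(size=(1, 8)), np.zeros(1))]
image = rng.normal(size=(1, 4, 4))
prediction = pixelwise_predict(image, layers)
```

Because no kernel spans more than one pixel, perturbing one input pixel changes only the corresponding output pixel, which is why such a model cannot use neighboring context to tell a filament from an active region.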

We account for some physical effects by including two additional input channels (features). The first is the Euclidean distance from the center of the disk to the limb. This addresses the geometric projection of the spherical Sun onto a two-dimensional image and increasing distortion toward the limb, as well as enhanced absorption of EUV photons along the line of sight through the solar atmosphere. The second additional channel is the solar latitude, which accounts for variation in the intensity and geometry of the solar magnetic field, the distribution and behavior of the active regions (plage and sunspots), and thus the spatial patterning of both He i absorption and EUV emission. North–south symmetry is assumed. Any effects of the ±7° annual variation in projection due to the solar obliquity were ignored.
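The two extra input channels might be constructed along the following lines. This is a sketch under stated assumptions: the disk is centered in the image with solar north up, the tilt of the rotation axis is ignored (as in the text), the distance is normalized by the disk radius, and off-disk pixels are set to zero. The function name and these conventions are not taken from the original work.

```python
import numpy as np


def geometry_channels(size, radius_pix):
    """Return (normalized distance from disk center, absolute latitude in degrees).

    Both outputs have shape (size, size); pixels outside the disk are zero.
    """
    center = (size - 1) / 2.0
    rows, cols = np.mgrid[0:size, 0:size]
    x = (cols - center) / radius_pix
    y = (center - rows) / radius_pix   # row 0 is the top (north) of the image
    r = np.hypot(x, y)                 # 0 at disk center, 1 at the limb
    on_disk = r <= 1.0
    # With no axial tilt, sin(latitude) equals the projected height y;
    # the absolute value imposes north-south symmetry.
    abs_lat = np.degrees(np.arcsin(np.clip(np.abs(y), 0.0, 1.0)))
    return np.where(on_disk, r, 0.0), np.where(on_disk, abs_lat, 0.0)
```

These arrays would be stacked with the He i image to form a three-channel input.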

The data were split by month in order to minimize correlations between training and test images. The 1221 image pairs were grouped by month, then months were randomly assigned to training, validation, and test sets with proportions of 60%, 20%, and 20%, respectively.
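A month-grouped split of this kind can be sketched with the standard library alone. The function below is illustrative: its name, the fixed seed, and the choice to apply the 60/20/20 proportions to the number of months (rather than the number of images) are assumptions.

```python
import random
from collections import defaultdict
from datetime import date


def split_by_month(dates, fractions=(0.6, 0.2, 0.2), seed=0):
    """Assign whole calendar months to train/validation/test; return index lists."""
    groups = defaultdict(list)
    for idx, d in enumerate(dates):
        groups[(d.year, d.month)].append(idx)
    months = sorted(groups)
    random.Random(seed).shuffle(months)
    n_train = round(fractions[0] * len(months))
    n_val = round(fractions[1] * len(months))
    parts = (months[:n_train],
             months[n_train:n_train + n_val],
             months[n_train + n_val:])
    return tuple([i for m in part for i in groups[m]] for part in parts)
```

Keeping all images from a given month in the same set prevents nearly identical images from consecutive days, which show the same slowly evolving structures, from appearing in both training and test data.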

Models were implemented in pytorch (Paszke et al. 2019) and trained using the Adam optimization algorithm (Kingma & Ba 2015). The following hyperparameters were optimized for the FCN model by performing a grid search with sherpa (Hertel et al. 2020): the learning rate, mini-batch size, dropout regularization, and the choice of training objective function, either mean absolute error (MAE) or mean squared error (MSE). The best model (lowest validation set MSE) minimized the MAE objective using a learning rate of 0.0001, mini-batch size of 2, dropout rate of 0.1 in all layers, and a weight decay coefficient of 1 × 10⁻⁷. The learning rate was decreased by a factor of 0.1 when no improvement in validation score occurred for 5 epochs, and training was stopped when no improvement occurred over 15 epochs (iterations through the training data set).
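The learning-rate reduction and early-stopping rules can be illustrated in plain Python by replaying a sequence of validation losses. This is one simple reading of the rules quoted above, not the authors' training loop; the function name and the exact counting convention are assumptions. PyTorch's `ReduceLROnPlateau` scheduler offers comparable behavior, although the text does not say whether it was used.

```python
def replay_schedule(val_losses, lr=1e-4, factor=0.1, lr_patience=5, stop_patience=15):
    """Replay validation losses; return (epoch at which training stops or None, final lr)."""
    best = float("inf")
    bad_epochs = 0       # epochs since the best validation loss so far
    bad_since_drop = 0   # non-improving epochs since the last learning-rate change
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, bad_epochs, bad_since_drop = loss, 0, 0
            continue
        bad_epochs += 1
        bad_since_drop += 1
        if bad_epochs >= stop_patience:
            return epoch, lr          # early stopping
        if bad_since_drop >= lr_patience:
            lr *= factor              # reduce the learning rate on a plateau
            bad_since_drop = 0
    return None, lr
```

Under this reading, a run that never improves after its first epoch would see two learning-rate reductions before stopping at the fifteenth non-improving epoch.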

3. Results

The performance of the different models is compared in Table 1 in terms of the MAE, root mean squared error (RMSE), and median absolute relative error computed over all predicted pixels in the test set.
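For reference, the three pixelwise metrics could be computed as in the sketch below. It assumes that predictions have already been transformed back to the original data units and that the relative error is defined as |predicted − observed| / |observed| with nonzero observed values; neither convention is spelled out in the text, and the function name is hypothetical.

```python
import numpy as np


def pixel_metrics(pred, target):
    """Return (MAE, RMSE, median absolute relative error in percent)."""
    err = pred - target
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    median_rel = 100.0 * np.median(np.abs(err) / np.abs(target))
    return mae, rmse, median_rel
```

The median is less sensitive than the mean to the few very bright pixels in active regions, which may be why it is reported for the pixelwise relative error.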

Table 1. Performance on Test Data

Model                     MAE          RMSE         Median Relative Error (%)   Mean Relative Error (%)
                          (pixelwise)  (pixelwise)  (pixelwise)                 (disk-integrated)
Pixelwise                 2.2 × 10²    4.4 × 10²    28.4                        13.2
Pixelwise + Limb          2.1 × 10²    4.2 × 10²    27.1                        12.8
Pixelwise + Limb + Lat    2.0 × 10²    7.8 × 10²    27.0                        11.6
U-Net                     2.0 × 10²    4.7 × 10²    22.8                        14.7
U-Net + Limb              2.0 × 10²    4.7 × 10²    22.7                        15.2
U-Net + Limb + Lat        2.0 × 10²    4.2 × 10²    25.4                        15.6
FCN                       1.7 × 10²    3.4 × 10²    22.1                        7.3
FCN + Limb                1.5 × 10²    3.3 × 10²    19.8                        7.2
FCN + Limb + Lat          1.5 × 10²    3.3 × 10²    19.9                        7.0

Note. MAE and RMSE are given in terms of the units of the original data: counts s⁻¹ pixel⁻¹.


The convolutional neural networks that include information from neighboring pixels (U-Net and FCN) perform better than the models that make predictions for each pixel independently. This shows that spatial features in the He i image contain important information, e.g., for discriminating between active regions and filaments. However, the fully convolutional network architecture with limited connectivity outperforms the U-Net architecture that is optimized to detect features on the scale of the entire disk. This outcome can be explained by the fact that the spatial features (e.g., active regions and filaments) are restricted to small portions of the solar disk, so the U-Net's ability to capture large-scale features leads to overfitting rather than providing additional useful information. We expect this effect to diminish with a larger training data set.

Inclusion of limb distance and latitude as input features improves performance in the Pixelwise and FCN models in terms of the training objective (not shown), and this translates to an improvement in the pixelwise relative error (Figure 2). The same improvement is not seen with the U-Net model, which can be explained by the difference in spatial information available to the different architectures: the Pixelwise and FCN models are restricted to using spatially local information, while the U-Net model can use whole-disk information to make predictions. Thus, the addition of spatial features is less useful to the U-Net model, and can even hurt performance by contributing to overfitting.


Figure 2. Median of the signed relative percent error in pixel values using different neural network architectures. Columns contain results from Pixelwise, U-Net, and FCN models, while rows contain results with the addition of physics-informed features. These images show how the error is correlated with latitude in models that do not include latitude as an input channel.


A prediction for disk-integrated flux in the 30.4 nm He ii line was obtained by summing the pixel values within the disk. The best-performing FCN model also produces the most accurate prediction of the disk-integrated flux in terms of mean absolute relative error over the test images (Figure 3).
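The disk integration and its error metric can be sketched as follows. The circular mask centered on the image, the function names, and the assumption that pixel values are in the original (not log-scaled) units before summation are all illustrative choices rather than details taken from the text.

```python
import numpy as np


def disk_mask(size, radius_pix):
    """Boolean mask of pixels within radius_pix of the image center."""
    center = (size - 1) / 2.0
    rows, cols = np.mgrid[0:size, 0:size]
    return np.hypot(cols - center, rows - center) <= radius_pix


def disk_integrated_flux(img, mask):
    """Sum the pixel values that fall on the solar disk."""
    return float(img[mask].sum())


def mean_abs_relative_error(pred_fluxes, obs_fluxes):
    """Mean of |predicted - observed| / observed over a set of images, in percent."""
    pred = np.asarray(pred_fluxes, dtype=float)
    obs = np.asarray(obs_fluxes, dtype=float)
    return 100.0 * float(np.mean(np.abs(pred - obs) / obs))
```

Because positive and negative pixel errors partly cancel in the sum, the disk-integrated error can be much smaller than the pixelwise error, as the FCN rows of Table 1 show.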


Figure 3. Plot of predicted vs. observed disk-integrated flux for Pixelwise and FCN models on the test set. Perfect predictions would fall on the red dotted line, and the two models achieve a mean relative error of 13.2% and 7.0%, respectively. Training on log-scaled values results in a systematic underestimate of the flux; a linear regression model fit to the FCN predictions is shown in brown with slope 0.92.


Qualitatively, the results indicate that the model learned some aspects of the spatial correlations between He i absorption and EUV emission. For example, an obvious exception to the trend for areas of high He i absorption to correspond with EUV emission is filaments (a.k.a. prominences): thin, arcuate structures of magnetically confined plasma suspended above the chromosphere (Parenti 2014; Kuckein et al. 2016). In these filaments, high He i absorption corresponds to low EUV emission, unlike in other regions of the Sun. The model is able to detect this behavior and correctly inverts the trend for absorption filaments (Figure 1).

4. Summary and Discussion

Using contemporaneous historical data from ground- and space-based solar telescopes for training data, we have demonstrated that a deep convolutional neural network can be used to predict the emission in a prominent EUV line (accessible only from space) from the absorption in an infrared line (accessible on the ground) with high accuracy. We find that model performance can be improved through a physics-informed architecture design that (i) uses limited spatial information to discriminate between different physical phenomena including filaments and plage regions, and (ii) accounts for the projection of the Sun's surface onto a 2D image.

The SDO AIA images the Sun at nine other wavelengths between 9.4 and 450 nm, each of which probes different temperatures and regions of the solar atmosphere (Lemen et al. 2012). We used the FCN model to predict emission in another prominent EUV line, that of Fe ix at 17.1 nm. Figure 4(a) shows a representative prediction and its errors on an out-of-sample image. The resulting error of this model is larger than for the corresponding 30.4 nm model (26.1% versus 19.9% pixelwise median relative error on the test set). This is expected because 17.1 nm emission probes a region that is higher in the solar atmosphere, so its contribution to the formation of triplet He i at any given point will be more spatially dispersed, and thus finer scales in patterns of emission in the 17.1 nm line will not be captured. Although predictions capture the overall distribution of emission (bottom right panel of Figure 4(a)), filamentous structures that reflect the magnetic field topology in that region of the atmosphere are not reproduced.


Figure 4. (a) Contemporaneous solar images in He i (top left) and Fe ix 17.1 nm (top right). Predictions of 17.1 nm emission (bottom right) have higher error (bottom left) than predictions of emission at 30.4 nm. (b) Example of a SOLIS image of He i 1083 nm absorption (top left) and the predicted AIA image of He ii 30.4 nm (top right) using the best-performing FCN model. The same architecture is then used to predict the He i image (bottom left) from the observed EUV image (bottom right).


We also trained an FCN model in reverse, swapping the model inputs and outputs, to test whether aspects of physical causality are evident. Although it is for practicality that we predict EUV emission at 30.4 nm (the less accessible observation) based on He i absorption at 1083 nm (the more accessible observation), it is EUV photoionization of He and subsequent recombination that produces triplet He i. Figure 4(b) compares one representative prediction to the observation for each direction. The pixelwise median relative error of 19.9% for the He i → EUV sense (top) increases to 24.1% for EUV → He i (bottom). One marked difference is the inability of the EUV → He i model to predict the existence of filaments. Since the He i absorption in filaments is not locally driven, it cannot be predicted from the local distribution of EUV emission.

Our results suggest that at least partial reconstruction of EUV emission could be achieved using He i observations, either to reconstruct missing data at past epochs, or to estimate EUV emission at future times when EUV observations from space are interrupted or become unavailable. The Extreme ultraviolet Imaging Telescope on the Solar and Heliospheric Observatory obtained full-disk images in four EUV lines, including that of He ii at 30.4 nm, for 15 yr, albeit at much lower cadence and spatial resolution than the AIA (Delaboudinière et al. 1995), and represents another potential opportunity to test the performance of DL models, particularly in the 5 yr overlap interval with SOLIS. Future work could also test whether the inclusion of other imaging data from ground-based observatories, e.g., Hα, Ca ii HK, or magnetograms (like those produced by SOLIS-VSM; Gosain et al. 2013), can improve predictions. Application to other stars is inhibited by the absence of resolved-disk information, but potentially such DL models could be used as a computationally efficient means to convert multiple disk-integrated observables, including triplet He i absorption at 1083 nm, into total EUV emission using descriptions of the distribution of activity with a few tunable parameters.

Tom Schad offered valuable comments on a draft of this manuscript. A.P. was supported in part by Samuel P. and Frances Krown through the Caltech Summer Undergraduate Research Fellowship program. This material is based upon work supported by the National Science Foundation under grant No. 2008344. Advanced computing resources from the University of Hawai'i Information Technology Services Cyberinfrastructure are gratefully acknowledged. E.G. acknowledges support as a long-term visitor in the Center for Space and Habitability at the University of Bern.
