Full length article
A generic, cluster-centred lossless compression framework for joint auroral data

https://doi.org/10.1016/j.jvcir.2021.103185

Abstract

Studying the well-known phenomenon “aurora” plays a pivotal role in investigating the solar–terrestrial coupling mechanism. A special auroral spectrograph at Antarctic Zhongshan Station constitutes a joint auroral observation system with satellite-borne sensors of the Defense Meteorological Satellite Program. Multipoint observation by this system provides more essential information for relevant studies than single observation by each instrument, but it also multiplies the volume of data, which becomes difficult to store or transmit. To address this difficulty, we develop a clustering-based, generic lossless data compression framework that combines various high-performing compressors with a hierarchical clustering algorithm to exploit the strengths of all the compressors in data reduction. This framework consistently achieves the best compression performance for datasets of different sizes with reasonable time consumption, which makes it promising for the design of pipelines for real-time data transmission.

Introduction

Aurora is a manifestation of energetic particle precipitation into the upper atmosphere of Earth’s polar regions. It is the visible spontaneous emission from atmospheric constituents excited by collisions with precipitating electrons/protons originating from the solar wind or the magnetosphere [1], [2]. Relevant studies are crucial for uncovering the secrets of the multiscale coupling of the Sun–Earth system.

Auroral observation can be conducted independently by various devices, e.g., optical imagers that measure electromagnetic energy from a distance and particle detectors capable of in-situ energy measurements. Some of these are operated on the ground, and some are satellite-borne and pass over fixed locations at different times, even within a day. Multi-point observation is also feasible, i.e., different sensors measure the auroral characteristics of interest simultaneously, in which case these devices constitute a joint observing system. Such a system collects information on a larger scale than observations by individual devices, facilitating a comprehensive understanding of relevant physical and chemical processes.

Up to now, several cooperative observing networks, including the United States Antarctic Program [3], Scientific Committee on Antarctic Research (SCAR) programs [4] and the Chinese Meridian Project [5], have been established for auroral observations. Alternatively, virtual observation platforms can be built for economic reasons, provided that all involved instruments concurrently observe the same event or phenomenon. An example of such a joint auroral observing system consists of a customized high-resolution auroral spectrograph (ASG) at Antarctic Zhongshan Station (ZHS) [6] together with the Special Sensor Ultraviolet Spectrographic Imager (SSUSI) [7] and Special Sensor J5 (SSJ5) [8] on board a Defense Meteorological Satellite Program (DMSP) satellite. In Fig. 1, a DMSP satellite passes the magnetic meridian over ZHS; the intersecting blue and green curves show the trajectory of the satellite and the effective field of view of ASG, respectively.

There is no doubt that the combined use of jointly observed data from different sources benefits auroral research, e.g., in simulating auroral particle transport in the atmosphere [9], [10], [11] and in tracing back the origins of aurora [12], [13], [14]. However, it is associated with great difficulty in preserving and manipulating numerous files of various formats. Lossless compression is widely used for reversible data reduction. Besides, the problem caused by the great variety of data (which refers to the diversity of data types and data sources) can be overcome by several means.

In this regard, this paper presents a lossless compression framework for mixed auroral observation data that uses multiple high-performing compressors with the help of hierarchical clustering. That is, raw data are allocated to several clusters regardless of their types and then losslessly compressed cluster by cluster. The data in each cluster are processed by the same compressor, although more than one compressor is adopted overall. This framework is generally applicable to any dataset, and its compression performance is expected to always be better than that of every compressor used. For validation, we combine the natural data collected by the satellite-borne detectors SSUSI and SSJ5 with auroral spectral data (ASD) simultaneously obtained at ZHS, concentrating on two-hour and two-day periods for the experiments. Results demonstrate that our proposed framework improves the encoding efficiency over the other compressors and effectively utilizes the similarity between data, though it slightly compromises computational efficiency due to the use of hierarchical clustering.
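
The cluster-then-compress idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the compressors (`zlib`, `bz2`, `lzma` from the standard library), the byte-histogram feature, the L1 distance, the average-linkage rule and the `threshold` parameter are all assumptions chosen only to make the example self-contained, and all function names are hypothetical.

```python
import bz2
import lzma
import zlib

# Stand-in compressors (an assumption; the paper's compressor set is not
# listed in this excerpt). Each entry is (compress, decompress).
COMPRESSORS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def byte_histogram(data):
    """Normalised 256-bin byte histogram, used here as a crude file feature."""
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    total = max(len(data), 1)
    return [c / total for c in counts]


def distance(h1, h2):
    """L1 distance between two histograms (hypothetical dissimilarity)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def cluster_files(files, threshold=1.0):
    """Average-linkage agglomerative clustering of file names.

    Merging stops once the closest pair of clusters is farther apart
    than `threshold`.
    """
    feats = {name: byte_histogram(data) for name, data in files.items()}
    clusters = [[name] for name in files]

    def linkage(c1, c2):
        total = sum(distance(feats[a], feats[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def cluster_size(name, cluster, files):
    """Total compressed size of a cluster under the compressor `name`."""
    compress = COMPRESSORS[name][0]
    return sum(len(compress(files[f])) for f in cluster)


def compress_by_cluster(files, threshold=1.0):
    """Compress every file with the compressor that is best for its cluster."""
    archive = {}
    for cluster in cluster_files(files, threshold):
        best = min(COMPRESSORS, key=lambda n: cluster_size(n, cluster, files))
        for f in cluster:
            archive[f] = (best, COMPRESSORS[best][0](files[f]))
    return archive


def decompress(archive):
    """Invert compress_by_cluster, restoring the original bytes."""
    return {f: COMPRESSORS[n][1](blob) for f, (n, blob) in archive.items()}
```

In this simplified variant, every cluster receives the compressor that minimises its own total size, so the overall output can never be larger than what any single one of the stand-in compressors would produce on the whole dataset. The sketch compresses files one by one; it does not attempt to exploit similarity between files inside a cluster.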

More details about this implementation are described in the rest of the paper. Section 2 introduces some background knowledge. Section 3 describes how hierarchical clustering and lossless compression are combined to efficiently store a great variety of data. Section 4 discusses the detailed experimental design and analysis. Finally, a summary and an outlook are given.

Section snippets

Lossless compression for auroral data of interest

In this paper, the auroral data of interest are individually collected from the ASG, SSUSI and SSJ5. ASG has run for years at ZHS and produced massive ultraspectral data, ASD. These data are essential for an in-depth analysis of aurorae and, like other spectral data, have at least two dimensions: one spatial and one spectral. Likewise, SSUSI generates spectral data during its remote sensing observation of atmospheric far-ultraviolet emissions. SSJ5 measures energy of its

Generic lossless compression by hierarchical clustering

All of the above-mentioned compressors are sensitive to data format. Applying a single compressor to joint auroral data generally yields inconsistent compression performance, so it is difficult to effectively compress the mixed data jointly obtained by ASG, SSUSI and SSJ5. Though either converting all involved formats to a uniform format [38] or using a file archiver that supports generic data compression is helpful, we reduce this difficulty in a different way. That is, a generic,

The choice of similarity metric

An ideal similarity metric should have a high value for similar data but a low value for distinct data. Following this idea, the similarity measures given in Table 1 are successively applied to a special dataset to determine which measure is the most suitable. This dataset is constructed so that its files are distinct enough to make it easy to tell whether two files are similar; it therefore comprises two types of data, viz. ASD and plain text that are apparently different
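
The selection criterion stated above (high values for similar data, low values for distinct data) can be sketched as a simple separation score. The two candidate measures below, a byte-histogram cosine similarity and one minus the normalized compression distance computed with `zlib`, are assumptions for illustration only; the measures of Table 1 are not reproduced in this excerpt, and all function names are hypothetical.

```python
import math
import zlib
from itertools import combinations


def cosine_similarity(x, y):
    """Cosine similarity between the byte histograms of two byte strings."""
    hx, hy = [0] * 256, [0] * 256
    for b in x:
        hx[b] += 1
    for b in y:
        hy[b] += 1
    dot = sum(a * b for a, b in zip(hx, hy))
    norm = math.sqrt(sum(a * a for a in hx)) * math.sqrt(sum(b * b for b in hy))
    return dot / norm if norm else 0.0


def ncd_similarity(x, y):
    """One minus the normalized compression distance, using zlib sizes."""
    cx, cy = len(zlib.compress(x)), len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return 1.0 - (cxy - min(cx, cy)) / max(cx, cy)


def separation(measure, labelled):
    """Mean same-label similarity minus mean cross-label similarity.

    `labelled` is a list of (label, bytes) pairs containing at least one
    same-label pair and one cross-label pair. A larger score means the
    measure tells the data types apart more clearly.
    """
    same, cross = [], []
    for (la, a), (lb, b) in combinations(labelled, 2):
        (same if la == lb else cross).append(measure(a, b))
    return sum(same) / len(same) - sum(cross) / len(cross)
```

Given a labelled dataset of two clearly different types, the measure with the largest `separation` score would be the preferred candidate under this criterion.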

Conclusions

Aurora helps complement our knowledge about the Earth as well as the Sun. Up to now, various ground-based and in-situ devices have been employed alone for its observation, falling somewhat short of monitoring relevant physical properties from a comprehensive perspective. Though the relevant data can be used in combination with each other, their considerable volume makes long-term data preservation and fast network data transmission very difficult.

To resolve this

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was funded by the National Natural Science Foundation of China (No. 61775175, 61601355, 41874195, 41831072), National Key Research and Development Program of China (No. 2018YFC 1407303), Space Science Pilot Project of the Chinese Academy of Sciences (No. XDA15350202), International Cooperation Advance Research on Key Scientific Issues of the International Meridian Project, China (No. A131901W14).

ASD used in this paper are acquired by ASG located at ZHS, aka the National Observation

References (61)

  • SummerhayesC.P.

    International collaboration in Antarctica: The International Polar Years, the International Geophysical Year, and the Scientific Committee on Antarctic Research

    Polar Rec.

    (2008)
  • ChiW.

    Development of the Chinese Meridian Project

    Chin. J. Space Sci.

    (2010)
  • PaxtonL.J. et al.

    SSUSI: Horizon-to-horizon and limb-viewing spectrographic imager for remote sensing of environmental parameters

  • Kadinsky-CadeK. et al.

    First results from the SSJ5 precipitating particle sensor on DMSP F16: Simultaneous observation of keV and MeV particles during the 2003 Halloween storms

  • SolomonS.C. et al.

    The auroral 6300 Å emission - observations and modeling

    J. Geophys. Res. Space Phys.

    (1988)
  • SolomonS.C.

    Global modeling of thermospheric airglow in the far-ultraviolet: Global airglow model

    J. Geophys. Res. Space Phys.

    (2017)
  • GrubbsG. et al.

    A comparative study of spectral auroral intensity predictions from multiple electron transport models

    J. Geophys. Res. Space Phys.

    (2018)
  • GrubbsG. et al.

    Predicting electron population characteristics in 2-D using multi-spectral ground-based imaging

    Geophys. Res. Lett.

    (2018)
  • AryalS. et al.

    Derivation of the energy and flux morphology in an aurora observed at midlatitude using multispectral imaging

    J. Geophys. Res. Space Phys.

    (2018)
  • KongW. et al.

    A comparative study of estimating auroral electron energy from ground-based hyperspectral imagery and DMSP-SSJ5 particle data

    Remote Sens.

    (2020)
  • KnightH. et al.

    Evidence for significantly greater N2 Lyman-Birge-Hopfield emission efficiencies in proton versus electron aurora based on analysis of coincident DMSP SSUSI and SSJ/5 data

    J. Geophys. Res. Space Phys.

    (2008)
  • MottaG. et al.

    Hyperspectral Data Compression

    (2006)
  • HuangB.

    Satellite Data Compression

    (2011)
  • MielikainenJ. et al.

    Clustered DPCM for the lossless compression of hyperspectral images

    IEEE Trans. Geosci. Remote Sens.

    (2003)
  • MielikainenJ. et al.

    Lossless compression of hyperspectral images using clustered linear prediction with adaptive prediction length

    IEEE Geosci. Remote Sens. Lett.

    (2012)
  • WuJ. et al.

    Lossless compression of hyperspectral imagery via clustered differential pulse code modulation with removal of local spectral outliers

    IEEE Signal Process. Lett.

    (2015)
  • WuX. et al.

    Context-based, adaptive, lossless image coding

    IEEE Trans. Commun.

    (1997)
  • WuX. et al.

    Context-based lossless interband compression-extending CALIC

    IEEE Trans. Image Process.

    (2000)
  • MagliE. et al.

    Optimized onboard lossless and near-lossless compression of hyperspectral data using CALIC

    IEEE Geosci. Remote Sens. Lett.

    (2004)
  • MielikainenJ.

    Lossless compression of hyperspectral images using lookup tables

    IEEE Signal Process. Lett.

    (2006)
This paper has been recommended for acceptance by Zicheng Liu.