Applied Soft Computing (IF 5.472), Pub Date: 2020-10-16, DOI: 10.1016/j.asoc.2020.106807. Di Wang; Ping Wang; Cong Wang; Shuo Zhuang; Junzhi Shi
Distribution regression is the regression setting in which the input objects are probability measures. Many machine learning applications fit into this framework, such as multi-instance learning and learning from noisy data. This paper proposes an interval prediction algorithm for distribution regression. The algorithm is based on conformal prediction, which aims to build reliable prediction systems. To the best of our knowledge, this is the first work to extend conformal prediction to distribution regression problems. Our approach first embeds the input distributions into a reproducing kernel Hilbert space by kernel mean embedding, and then learns a conformal regressor that maps the embeddings to the outputs. To speed up the whole process, we also employ random Fourier features to approximate the kernel. The algorithm was tested on synthetic data sets and applied to statistical postprocessing of ensemble forecasts for temperature and precipitation, which is the first attempt to apply conformal prediction in this application area. The experimental results demonstrate the empirical validity and the effectiveness of our approach when compared with other widely used postprocessing algorithms.
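The pipeline described above (embed each input distribution, then fit a conformal regressor on the embeddings) can be illustrated with a minimal sketch. The abstract does not state which conformal variant, base regressor, or kernel the authors use, so everything below is an assumption made for illustration: a Gaussian kernel approximated by random Fourier features, ridge regression as the underlying regressor, split (inductive) conformal prediction with absolute residuals as the nonconformity score, and toy one-dimensional data. All function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def rff_params(dim, n_features, gamma, rng):
    """Draw random Fourier feature parameters for the Gaussian kernel
    k(x, y) = exp(-gamma * ||x - y||^2) (assumed kernel, not from the paper)."""
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(dim, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return W, b

def mean_embedding(bag, W, b):
    """Approximate kernel mean embedding of one distribution, given a bag
    of samples drawn from it: the average of the random feature maps."""
    Z = np.sqrt(2.0 / W.shape[1]) * np.cos(bag @ W + b)
    return Z.mean(axis=0)

def fit_ridge(Phi, y, lam):
    """Ridge regression weights (assumed base regressor)."""
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)

def split_conformal_interval(Phi_train, y_train, Phi_cal, y_cal, Phi_test,
                             alpha=0.1, lam=1e-3):
    """Split conformal prediction with absolute-residual scores.
    Returns lower and upper bounds with target coverage 1 - alpha."""
    w = fit_ridge(Phi_train, y_train, lam)
    scores = np.sort(np.abs(y_cal - Phi_cal @ w))
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))   # rank of the calibration quantile
    q = scores[k - 1] if k <= n else np.inf   # too few calibration points: infinite width
    pred = Phi_test @ w
    return pred - q, pred + q

# Toy data (purely illustrative): each input is a bag of 50 samples from
# N(mu, 1), and the target is mu plus a little noise.
rng = np.random.default_rng(0)
W, b = rff_params(dim=1, n_features=100, gamma=0.5, rng=rng)
mus = rng.uniform(-2.0, 2.0, size=300)
bags = [rng.normal(mu, 1.0, size=(50, 1)) for mu in mus]
y = mus + rng.normal(scale=0.1, size=300)
Phi = np.stack([mean_embedding(bag, W, b) for bag in bags])

# 100 bags for training, 100 for calibration, 100 for testing.
lo, hi = split_conformal_interval(Phi[:100], y[:100],
                                  Phi[100:200], y[100:200],
                                  Phi[200:], alpha=0.1)
coverage = np.mean((y[200:] >= lo) & (y[200:] <= hi))
print(f"empirical coverage on test bags: {coverage:.2f}")
```

In this sketch the empirical coverage on the held-out bags should lie near the nominal level of 0.9, up to sampling variability; this is the kind of empirical validity check the abstract refers to. Split conformal prediction was chosen here only because it is short to write down; the paper may use a different conformal procedure.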