Abstract
Cloud segmentation plays a crucial role in image analysis for climate modeling. Manually labeling training data for cloud segmentation is time-consuming and error-prone. We explore training segmentation networks with synthetic data, which naturally comes with pixel-level labels. However, the domain gap between synthetic and real images significantly degrades the performance of the trained model. We propose a color space adaptation method that bridges this gap by training a color-sensitive generator and discriminator to adapt synthetic data to real images in color space. Instead of transforming images with general convolutional kernels, we adopt a set of closed-form operations that make color-space adjustments while preserving the labels. We also construct SynCloud, a synthetic-to-real cirrus cloud dataset, and demonstrate the efficacy of our adaptation on the semantic segmentation of cirrus clouds. Training semantic segmentation with our adapted synthetic data improves accuracy on real images by \(6.59\%\), outperforming alternative methods.
Notes
We will release the cirrus clouds dataset, including all the volume data, rendering settings and rendering results.
SWD has properties similar to the Wasserstein distance but is simpler to compute. It is widely used in various applications, including generative modeling and general supervised/unsupervised learning, to measure the quality of generated images [62].
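What makes SWD cheap is that the 1-D Wasserstein distance between projected samples has a closed form obtained by sorting. The following NumPy sketch illustrates the idea; the function name and the Gaussian sampling of projection directions are illustrative choices, not the paper's implementation:

```python
import numpy as np

def sliced_wasserstein_distance(x, y, n_projections=128, rng=None):
    """Approximate the sliced Wasserstein distance between two sample sets.

    x, y: arrays of shape (n, d), samples from two distributions.
    Each random 1-D projection reduces the problem to a 1-D Wasserstein
    distance, which is computed in closed form by sorting.
    """
    rng = np.random.default_rng(rng)
    d = x.shape[1]
    # Random unit vectors defining the projection directions.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both sample sets onto every direction at once.
    px = x @ theta.T  # shape (n, n_projections)
    py = y @ theta.T
    # 1-D Wasserstein-2 distance per projection: sort, then compare.
    px.sort(axis=0)
    py.sort(axis=0)
    return np.mean((px - py) ** 2)
```

Averaging over more projections tightens the Monte Carlo approximation at a linear cost in `n_projections`.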
References
Yuan, F., Lee, Y.H., Meng, Y.S.: Comparison of cloud models for propagation studies in ka-band satellite applications. In: Proceedings of IEEE International Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting (AP-S/URSI), pp. 383–384 (2014)
Christodoulou, C., Michaelides, S., Pattichis, C.: Multifeature texture analysis for the classification of clouds in satellite imagery. IEEE Trans. Geosci. Remote Sens. 41(11), 2662–2668 (2003)
Yuan, F., Lee, Y.H., Meng, Y.S.: Comparison of radio-sounding profiles for cloud attenuation analysis in the tropical region. In: Proceedings of IEEE International Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting (AP-S/URSI), pp. 259–260 (2014)
Mahrooghy, M., Younan, N.H., Anantharaj, V.G., Aanstoos, J., Yarahmadian, S.: On the use of a cluster ensemble cloud classification technique in satellite precipitation estimation. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 5(5), 1356–1363 (2012)
Dev, S., Lee, Y.H., Winkler, S.: Color-based segmentation of sky/cloud images from ground-based cameras. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 10(1), 231–242 (2017)
Dianne, G., Wiliem, A., Lovell, B.C.: Deep-learning from mistakes: automating cloud class refinement for sky image segmentation. In: Proceedings of Digital Image Computing: Techniques and Applications (DICTA), pp. 1–8 (2019)
Dev, S., Manandhar, S., Lee, Y.H., Winkler, S.: Multi-label cloud segmentation using a deep network. In: Proceedings of IEEE International Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting (AP-S/URSI) (2019)
Dev, S., Lee, Y.H., Winkler, S.: Multi-level semantic labeling of sky/cloud images. In: Proceedings of IEEE International Conference on Image Processing (ICIP), pp. 636–640 (2015)
Sun, B., Saenko, K.: From virtual to reality: fast adaptation of virtual object detectors to real domains. In: Proceedings of British Machine Vision Conference (BMVC) (2014)
Massa, F., Russell, B.C., Aubry, M.: Deep exemplar 2d–3d detection by adapting from real to rendered views. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6024–6033 (2016)
Gaidon, A., Wang, Q., Cabon, Y., Vig, E.: Virtualworlds as proxy for multi-object tracking analysis. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4340–4349 (2016)
Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., Abbeel, P.: Domain randomization for transferring deep neural networks from simulation to the real world. In: Proceedings of IEEE/RJS International Conference on Intelligent Robots and Systems (IROS), pp. 23–30 (2017)
Liebelt, J., Schmid, C.: Multi-view object class detection with a 3d geometric model. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1688–1695 (2010)
Stark, M., Goesele, M., Schiele, B.: Back to the future: learning shape models from 3d cad data. In: Proceedings of British Machine Vision Conference (BMVC), pp. 1–11 (2010)
Su, H., Qi, C.R., Li, Y., Guibas, L.J.: Render for CNN: viewpoint estimation in images using CNNs trained with rendered 3D model views. In: Proceedings of IEEE International Conference on Computer Vision (ICCV), pp. 2686–2694 (2015)
Grabner, A., Roth, P.M., Lepetit, V.: 3D pose estimation and 3D model retrieval for objects in the wild. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3022–3031 (2018)
Richter, S.R., Vineet, V., Roth, S., Koltun, V.: Playing for data: ground truth from computer games. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 102–118 (2016)
Ros, G., Sellart, L., Materzynska, J., Vazquez, D., Lopez, A.M.: The synthia dataset: a large collection of synthetic images for semantic segmentation of urban scenes. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3234–3243 (2016)
Bousmalis, K., Silberman, N., Dohan, D., Erhan, D., Krishnan, D.: Unsupervised pixel-level domain adaptation with generative adversarial networks. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 95–104 (2017)
Hoffman, J., Tzeng, E., Park, T., Zhu, J.Y., Isola, P., Saenko, K., Efros, A., Darrell, T.: Cycada: cycle-consistent adversarial domain adaptation. In: Proceedings of International Conference on Machine Learning (ICML), pp. 1989–1998 (2018)
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: Proceedings of Neural Information Processing Systems (NeurIPS), pp. 2672–2680 (2014)
Ledig, C., Theis, L., Huszár, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., Wang, Z., Shi, W.: Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 105–114 (2017)
Sønderby, C.K., Caballero, J., Theis, L., Shi, W., Huszár, F.: Amortised MAP inference for image super-resolution. In: Proceedings of International Conference on Learning Representations (ICLR) (2017)
Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. In: Proceedings of International Conference on Learning Representations (ICLR) (2016)
Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5967–5976 (2017)
Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of IEEE International Conference on Computer Vision (ICCV), pp. 2242–2251 (2017)
Taigman, Y., Polyak, A., Wolf, L.: Unsupervised cross-domain image generation. In: Proceedings of International Conference on Learning Representations (ICLR) (2017)
Liu, M.Y., Tuzel, O.: Coupled generative adversarial networks. In: Proceedings of Neural Information Processing Systems (NeurIPS), pp. 469–477 (2016)
Park, T., Liu, M.Y., Wang, T.C., Zhu, J.Y.: Semantic image synthesis with spatially-adaptive normalization. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2337–2346 (2019)
Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., Lee, H.: Generative adversarial text to image synthesis. In: Proceedings of International Conference on Machine Learning (ICML), vol. 48, pp. 1060–1069 (2016)
Zhang, H., Xu, T., Li, H.: Stackgan: text to photo-realistic image synthesis with stacked generative adversarial networks. In: Proceedings of IEEE International Conference on Computer Vision (ICCV), pp. 5908–5916 (2017)
Hong, S., Yang, D., Choi, J., Lee, H.: Inferring semantic layout for hierarchical text-to-image synthesis. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7986–7994 (2018)
Long, M., Cao, Y., Wang, J., Jordan, M.: Learning transferable features with deep adaptation networks. In: Proceedings of International Conference on Machine Learning (ICML), pp. 97–105 (2015)
Sixt, L., Wild, B., Landgraf, T.: Rendergan: generating realistic labeled data. In: Proceedings of International Conference on Learning Representations (ICLR) (2017)
Wang, B., Yu, Y., Xu, Y.Q.: Example-based image color and tone style enhancement. ACM Trans. Graph. (TOG) 30(4), 64 (2011)
Kuanar, S., Rao, K.R., Mahapatra, D., Bilas, M.: Night time haze and glow removal using deep dilated convolutional network. arXiv preprint arXiv:1902.00855 (2019)
Kuanar, S., Conly, C., Rao, K.R.: Deep learning based HEVC in-loop filtering for decoder quality enhancement. In: Proceedings of Picture Coding Symposium (PCS), pp. 164–168 (2018)
Kuanar, S., Athitsos, V., Mahapatra, D., Rao, K., Akhtar, Z., Dasgupta, D.: Low dose abdominal ct image reconstruction: an unsupervised learning based approach. In: Proceedings of IEEE International Conference on Image Processing (ICIP), pp. 1351–1355 (2019)
Reinhard, E., Ashikhmin, M., Gooch, B., Shirley, P.: Color transfer between images. IEEE Comput. Graph. Appl. 21(5), 34–41 (2001)
Huang, H., Zang, Y., Li, C.F.: Example-based painting guided by color features. Vis. Comput. 26(6), 933–942 (2010)
Huang, H., Xiao, X.: Example-based contrast enhancement by gradient mapping. Vis. Comput. 26(6), 731–738 (2010)
Yan, Z., Zhang, H., Wang, B., Paris, S., Yu, Y.: Automatic photo adjustment using deep neural networks. ACM Trans. Graph. (TOG) 35(2), 11 (2016)
Gharbi, M., Chen, J., Barron, J.T., Hasinoff, S.W., Durand, F.: Deep bilateral learning for real-time image enhancement. ACM Trans. Graph. (TOG) 36(4), 118 (2017)
Chen, Y.S., Wang, Y.C., Kao, M.H., Chuang, Y.Y.: Deep photo enhancer: unpaired learning for image enhancement from photographs with GANs. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6306–6314 (2018)
Limmer, M., Lensch, H.P.A.: Infrared colorization using deep convolutional neural networks. In: Proceedings of International Conference on Machine Learning and Applications (ICMLA), pp. 61–68 (2016)
Park, J., Lee, J.Y., Yoo, D., Kweon, I.S.: Distort-and-recover: color enhancement using deep reinforcement learning. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5928–5936 (2018)
Hu, Y., He, H., Xu, C., Wang, B., Lin, S.: Exposure: a white-box photo post-processing framework. ACM Trans. Graph. (TOG) 37(2), 26 (2018)
Bianco, S., Cusano, C., Piccoli, F., Schettini, R.: Learning parametric functions for color image enhancement. In: Proceedings of International Workshop on Computational Color Imaging (CCIW), vol. 11418, pp. 209–220 (2019)
Chai, Y., Giryes, R., Wolf, L.: Supervised and unsupervised learning of parameterized color enhancement. In: Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 992–1000 (2020)
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3431–3440 (2015)
Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 40(4), 834–848 (2018)
Bi, L., Feng, D.D., Kim, J.: Dual-path adversarial learning for fully convolutional network (FCN)-based medical image segmentation. Vis. Comput. 34(6), 1043–1052 (2018)
Bi, L., Kim, J., Kumar, A., Fulham, M., Feng, D.: Stacked fully convolutional networks with multi-channel learning: application to medical image segmentation. Vis. Comput. 33(6), 1061–1071 (2017)
Wang, J., Zheng, C., Chen, W., Wu, X.: Learning aggregated features and optimizing model for semantic labeling. Vis. Comput. 33(12), 1587–1600 (2017)
He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 42(2), 386–397 (2020)
Jakob, W.: Mitsuba renderer (2010). http://www.mitsuba-renderer.org
Bychkovsky, V., Paris, S., Chan, E., Durand, F.: Learning photographic global tonal adjustment with a database of input / output image pairs. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 97–104 (2011)
Chollet, F., et al.: Keras. https://keras.io (2015)
Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., Kudlur, M., Levenberg, J., Monga, R., Moore, S., Murray, D.G., Steiner, B., Tucker, P., Vasudevan, V., Warden, P., Wicke, M., Yu, Y., Zheng, X.: Tensorflow: a system for large-scale machine learning. In: Proceedings of USENIX Symposium on Operating Systems Design and Implementation (OSDI), pp. 265–283 (2016)
He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of IEEE International Conference on Computer Vision (ICCV), pp. 1026–1034 (2015)
Kingma, D.P., Ba, J.L.: Adam: a method for stochastic optimization. In: Proceedings of International Conference on Learning Representations (ICLR) (2015)
Kolouri, S., Nadjahi, K., Simsekli, U., Badeau, R., Rohde, G.: Generalized sliced Wasserstein distances. In: Proceedings of Neural Information Processing Systems (NeurIPS), pp. 261–272 (2019)
Bonneel, N., Rabin, J., Peyré, G., Pfister, H.: Sliced and Radon Wasserstein barycenters of measures. J. Math. Imaging Vis. 51(1), 22–45 (2015)
Dobashi, Y., Shinzo, Y., Yamamoto, T.: Modeling of clouds from a single photograph. Comput. Graph. Forum (CGF) 29(7), 2083–2090 (2010)
Liu, J., Sun, J., Shum, H.Y.: Paint selection. ACM Trans. Graph. (TOG) 28(3), 69 (2009)
Chen, Q., Li, D., Tang, C.K.: KNN matting. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 35(9), 2175–2188 (2013)
Funding
This study was funded by National Natural Science Foundation of China (Grant Nos. 61772024, 61732016).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendix
1.1 A: Color adjustment operations
Brightness The brightness adjustment operation is defined as
where x is the input image, and \(\alpha _b\) is a scalar parameter that controls the extent of the adjustment. We clip \(\alpha _b\) into the range \([-1,1]\).
Saturation The saturation adjustment operation is defined as
where x is the input image and \(\alpha _s\) is a scalar parameter that controls the extent of the adjustment. We clip \(\alpha _s\) into the range \([-1,1]\).
\(\textsf {L}(x)\) is the per-pixel lightness \(\frac{1}{2}\cdot [\textsf {rgb\_max}(x) + \textsf {rgb\_min}(x)]\), i.e., the average of the largest and smallest of the three channels, and \(\textsf {s}(x,\alpha _s)\) is defined as
\(\textsf {S}(x)\) is defined as a per-pixel ratio
where \(\textsf {delta}(x) = \textsf {rgb\_max}(x) - \textsf {rgb\_min}(x)\).
Contrast The contrast adjustment operation is defined as
where x is the input image, \(\bar{x}\) is the average of all pixel values of x, and \(\alpha _c\) is a scalar parameter that controls the extent of the adjustment. We clip \(\alpha _c\) into the range \([-1,1]\) (Fig. 12).
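As an illustration of how such closed-form, label-preserving adjustments can be implemented, here is a NumPy sketch. The update rules are plausible formulations consistent with the descriptions above (images in \([0,1]\), parameters clipped to \([-1,1]\)), not necessarily the paper's exact definitions:

```python
import numpy as np

def brightness(x, alpha_b):
    """Brightness adjustment: a global gain on the image."""
    alpha_b = np.clip(alpha_b, -1.0, 1.0)
    return np.clip(x * (1.0 + alpha_b), 0.0, 1.0)

def contrast(x, alpha_c):
    """Contrast adjustment: scale pixels about the image mean x_bar."""
    alpha_c = np.clip(alpha_c, -1.0, 1.0)
    x_mean = x.mean()
    return np.clip(x_mean + (x - x_mean) * (1.0 + alpha_c), 0.0, 1.0)

def saturation(x, alpha_s):
    """Saturation adjustment using the HSL-style per-pixel lightness L."""
    alpha_s = np.clip(alpha_s, -1.0, 1.0)
    # L = (rgb_max + rgb_min) / 2, computed per pixel over the channel axis.
    L = 0.5 * (x.max(axis=-1, keepdims=True) + x.min(axis=-1, keepdims=True))
    # Move each pixel away from (alpha_s > 0) or toward (alpha_s < 0) its lightness.
    return np.clip(L + (x - L) * (1.0 + alpha_s), 0.0, 1.0)
```

Because each operation is a differentiable closed form acting only on colors, pixel-level segmentation labels remain valid after any composition of the three.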
1.2 B: Handcrafted approach for image augmentation
In the experiments, we also evaluate an adaptation approach based on handcrafted features. First, we convert the images to HSV space and extract saturation and brightness features. Then, we fit a Gaussian distribution to the feature points of the real images. Next, for each synthetic image, we shift its features toward a target point sampled from this Gaussian distribution. Finally, we reconstruct augmented images from the shifted features. Compared with these handcrafted features, our generator learns more powerful features in higher dimensions and leverages them to decide how best to shift each synthetic image (Fig. 13).
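The steps above can be sketched as follows in NumPy; the precise feature definitions and the shift rule are illustrative choices, not the paper's exact procedure:

```python
import numpy as np

def hsv_features(img):
    """Mean saturation and brightness (HSV value) of an RGB image in [0, 1]."""
    v = img.max(axis=-1)
    s = np.where(v > 0, (v - img.min(axis=-1)) / np.maximum(v, 1e-8), 0.0)
    return np.array([s.mean(), v.mean()])

def fit_gaussian(feature_points):
    """Fit a Gaussian (mean, covariance) to the real images' feature points."""
    pts = np.asarray(feature_points)
    return pts.mean(axis=0), np.cov(pts, rowvar=False)

def shift_toward(img, target, strength=1.0):
    """Move a synthetic image's (saturation, brightness) features toward
    a target point sampled from the fitted Gaussian."""
    s, v = hsv_features(img)
    ts, tv = target
    # Brightness: a global gain moves the mean value toward the target.
    out = img * (1.0 + strength * (tv - v))
    # Saturation: blend each pixel with its per-pixel gray value.
    gray = out.mean(axis=-1, keepdims=True)
    out = gray + (out - gray) * (1.0 + strength * (ts - s))
    return np.clip(out, 0.0, 1.0)
```

A target for each synthetic image would then be drawn with `np.random.default_rng().multivariate_normal(mu, cov)` before calling `shift_toward`.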
1.3 C: Details of style transfer and image post-processing
All the training images are zero-centered and rescaled to \([-1, 1]\). We set the batch size to 8. We adopt the Adam optimizer with \(lr=2\mathrm {e}{-4}\) and \(\beta _1=0.5\) and train both the discriminator and the generator for 100 epochs. We show more color adaptation results on the Pexels dataset in Fig. 14. We also apply the model trained on the Pexels dataset to images from the MIT-FiveK dataset to demonstrate cross-dataset generalization (Fig. 15). While similar effects can be produced by [47], our method does not require reinforcement learning.
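The preprocessing and training settings above can be captured as follows; `preprocess` and `train_config` are illustrative names, and \(\beta _2\) is assumed to keep Adam's default since the text does not state it:

```python
import numpy as np

def preprocess(img_uint8):
    """Zero-center an 8-bit image and rescale it to [-1, 1]."""
    return img_uint8.astype(np.float32) / 127.5 - 1.0

# Hyperparameters as stated in the text; beta_2 = 0.999 is an assumed
# Adam default, not specified by the paper.
train_config = {"batch_size": 8, "epochs": 100,
                "optimizer": "Adam", "lr": 2e-4,
                "beta_1": 0.5, "beta_2": 0.999}
```

The \(\beta _1=0.5\) setting follows common GAN training practice, damping the momentum term to stabilize the adversarial dynamics.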
Lyu, Q., Chen, M. & Chen, X. Learning color space adaptation from synthetic to real images of cirrus clouds. Vis Comput 37, 2341–2353 (2021). https://doi.org/10.1007/s00371-020-01990-7