Siamese Dense Network for Reflection Removal with Flash and No-Flash Image Pairs

Chang, Yakun; Jung, Cheolkon; Sun, Jun; Wang, Fengqiao

doi:10.1007/s11263-019-01276-z


Published: 09 January 2020

International Journal of Computer Vision, Volume 128, pages 1673–1698 (2020)

Yakun Chang¹,
Cheolkon Jung ORCID: orcid.org/0000-0003-0299-7206¹,
Jun Sun¹ &
…
Fengqiao Wang¹

1486 Accesses
18 Citations

Abstract

This work addresses reflection removal with flash and no-flash image pairs, i.e., separating reflection from transmission. When objects are covered by glass, the no-flash image usually contains reflection, so flash is used to enhance transmission details. However, the flash image suffers from the specular highlight that the flash produces on the glass surface. In this paper, we propose a siamese dense network (SDN) for reflection removal with flash and no-flash image pairs. SDN extracts shareable and complementary features via concatenated siamese dense blocks. An image fusion block fuses the intermediate outputs of the two branches. Since severe information loss occurs in the specular highlight, we detect the specular highlight in the flash image based on the gradient of the maximum chromaticity. We observe that flash causes various artifacts such as tone distortion and inhomogeneous brightness. Thus, in addition to synthetic datasets, we collect 758 real flash and no-flash image pairs (including their ground truth) with different cameras to improve generalization. Various experiments show that the proposed method successfully removes reflections using flash and no-flash image pairs and outperforms state-of-the-art methods in terms of visual quality and quantitative measurements. In addition, we apply the SDN to color/depth image pairs and achieve both color reflection removal and depth filling.




References

Agrawal, A., Raskar, R., Nayar, S. K., & Li, Y. (2005). Removing photography artifacts using gradient projection and flash-exposure sampling. ACM Transactions on Graphics (TOG), 24(3), 828–835.
Aksoy, Y., Kim, C., Kellnhofer, P., Paris, S., Elgharib, M., Pollefeys, M., & Matusik, W. (2018). A dataset of flash and ambient illumination pairs from the crowd. In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 634–649).
Bromley, J., Guyon, I., LeCun, Y., Säckinger, E., & Shah, R. (1994). Signature verification using a siamese time delay neural network. In Advances in neural information processing systems (pp. 737–744).
Camplani, M., & Salgado, L. (2012). Efficient spatio-temporal hole filling strategy for kinect depth maps. In Proceedings of SPIE 8290, three-dimensional image processing (3DIP) and applications II (Vol. 8290, p. 82900E). International Society for Optics and Photonics.
Chang, Y., & Jung, C. (2019). Single image reflection removal using convolutional neural networks. IEEE Transactions on Image Processing, 28(4), 1954–1966.
Chang, Y., Jung, C., Ke, P., Song, H., & Hwang, J. (2018). Automatic contrast-limited adaptive histogram equalization with dual gamma correction. IEEE Access, 6, 11782–11792.
Chopra, S., Hadsell, R., & LeCun, Y. (2005). Learning a similarity metric discriminatively, with application to face verification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Diamant, Y., & Schechner, Y.Y. (2008). Overcoming visual reverberations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1–8). IEEE.
Eisemann, E., & Durand, F. (2004). Flash photography enhancement via intrinsic relighting. In ACM Transactions on Graphics (TOG) (Vol. 23, pp. 673–678). ACM.
Fan, Q., Yang, J., Hua, G., Chen, B., & Wipf, D. (2017). A generic deep architecture for single image reflection removal and image smoothing. In Proceedings of the IEEE Conference on Computer Vision (ICCV) (pp. 3258–3267). IEEE.
Farid, H., & Adelson, E.H. (1999). Separating reflections and lighting using independent components analysis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Vol. 1, pp. 262–267). IEEE.
Guo, X., Cao, X., & Ma, Y. (2014). Robust separation of reflection from multiple images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2187–2194).
Han, B. J., & Sim, J. Y. (2017). Reflection removal using low-rank matrix completion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Han, B. J., & Sim, J. Y. (2018). Glass reflection removal using co-saliency-based image alignment and low-rank matrix completion in gradient domain. IEEE Transactions on Image Processing, 27(10), 4873–4888.
Hang, Z., & Dana, K. (2018). Multi-style generative network for real-time transfer (pp. 349–365).
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770–778).
He, S., & Lau, R. W. (2014). Saliency detection with flash and no-flash image pairs. In Proceedings of the European Conference on Computer Vision (pp. 110–124). Springer.
Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 2261–2269).
Kim, H., Jin, H., Hadap, S., & Kweon, I. (2013). Specular reflection separation using dark channel prior. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1460–1467).
Kong, N., Tai, Y. W., & Shin, S. Y. (2012). A physically-based approach to reflection separation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 9–16). IEEE.
Levin, A., Lischinski, D., & Weiss, Y. (2004). Colorization using optimization. ACM Transactions on Graphics, 23(3), 689–694.
Levin, A., & Weiss, Y. (2007). User assisted separation of reflections from a single image using a sparsity prior. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(9), 1647–1654.
Li, Y., & Brown, M.S. (2013). Exploiting reflection change for automatic reflection removal. In Proceedings of the IEEE Conference on Computer Vision (pp. 2432–2439).
Li, Y., & Brown, M. S. (2014). Single image layer separation using relative smoothness. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2752–2759).
Li, Y., Tan, R. T., Guo, X., Lu, J., & Brown, M. S. (2016). Rain streak removal using layer priors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2736–2744).
Lu, C., Drew, M. S., & Finlayson, G. D. (2006). Shadow removal via flash/noflash illumination. In Proceedings of the IEEE Workshop on Multimedia Signal Processing (pp. 198–201). IEEE.
Matsui, S., Okabe, T., Shimano, M., & Sato, Y. (2011). Image enhancement of low-light scenes with near-infrared flash images. Information and Media Technologies, 6(1), 202–210.
Mertens, T., Kautz, J., & Van Reeth, F. (2009). Exposure fusion: A simple and practical alternative to high dynamic range photography. Computer Graphics Forum, 28(1), 161–171.
Nayar, S. K., Fang, X. S., & Boult, T. (1997). Separation of reflection components using color and polarization. International Journal of Computer Vision, 21(3), 163–186.
Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., & Efros, A. A. (2016). Context encoders: Feature learning by inpainting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2536–2544).
Petschnigg, G., Szeliski, R., Agrawala, M., Cohen, M., Hoppe, H., & Toyama, K. (2004). Digital photography with flash and no-flash image pairs. In ACM Transactions on Graphics (TOG) (Vol. 23, pp. 664–672). ACM.
Punnappurath, A., & Brown, M. S. (2019). Reflection removal using a dual-pixel sensor. In The IEEE conference on computer vision and pattern recognition (CVPR).
Schechner, Y. Y., Kiryati, N., & Basri, R. (2000). Separation of transparent layers using focus. International Journal of Computer Vision, 39(1), 25–39.
Schechner, Y. Y., Shamir, J., & Kiryati, N. (2000). Polarization and statistical analysis of scenes containing a semireflector. JOSA A, 17(2), 276–284.
Seo, H. J., & Milanfar, P. (2012). Robust flash denoising/deblurring by iterative guided filtering. EURASIP Journal on Advances in Signal Processing, 2012(1), 3.
Shen, J., & Cheung, S. C. S. (2013). Layer depth denoising and completion for structured-light rgb-d cameras. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1187–1194).
Shih, Y., Krishnan, D., Durand, F., & Freeman, W. T. (2015). Reflection removal using ghosting cues. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 3193–3201).
Shirai, K., Okamoto, M., & Ikehara, M. (2011). Noiseless no-flash photo creation by color transform of flash image. In Proceedings of the IEEE Conference on Image Processing (ICIP) (pp. 3437–3440). IEEE.
Silberman, N., Hoiem, D., Kohli, P., & Fergus, R. (2012). Indoor segmentation and support inference from rgbd images. In Proceedings of the European Conference on Computer Vision. Springer.
Simon, C., & Park, I. K. (2015). Reflection removal for in-vehicle black box videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4231–4239).
Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
Song, S., Lichtenberg, S. P., & Xiao, J. (2015). Sun rgb-d: A rgb-d scene understanding benchmark suite. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 567–576).
Sun, J., Chang, Y., Jung, C., & Feng, J. (2019). Multi-modal reflection removal using convolutional neural networks. IEEE Signal Processing Letters, 26(7), 1011–1015.
Sun, J., Kang, S. B., Xu, Z. B., Tang, X., & Shum, H. Y. (2007). Flash cut: Foreground extraction with flash and no-flash image pairs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1–8). IEEE.
Sun, J., Li, Y., Kang, S. B., & Shum, H. Y. (2006). Flash matting. ACM Transactions on Graphics (TOG), 25(3), 772–778.
Szeliski, R., Avidan, S., & Anandan, P. (2000). Layer extraction from multiple images containing reflections and transparency. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Vol. 1, pp. 246–253). IEEE.
Tan, T., Nishino, K., & Ikeuchi, K. (2003). Illumination chromaticity estimation using inverse-intensity chromaticity space. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Wan, R., Shi, B., Duan, L. Y., Tan, A. H., & Kot, A. C. (2018). Crrn: Multi-scale guided concurrent reflection removal network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4777–4785).
Wei, K., Yang, J., Fu, Y., Wipf, D., & Huang, H. (2019). Single image reflection removal exploiting misaligned training data and network enhancements. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 8178–8187).
Yang, J., Gong, D., Liu, L., & Shi, Q. (2018). Seeing deeply and bidirectionally: A deep learning approach for single image reflection removal. In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 654–669).
Yang, J., Li, H., Dai, Y., & Tan, R. T. (2016). Robust optical flow estimation of double-layer images under transparency or reflection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1410–1419).
Yang, Y., Ma, W., Zheng, Y., Cai, J. F., & Xu, W. (2019). Fast single image reflection suppression via convex optimization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 8141–8149).
Yi, S., Wang, X., & Tang, X. (2014). Deep learning face representation from predicting 10,000 classes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Yu, L., Xun, C., Cheng, J., & Hu, P. (2017). A medical image fusion method based on convolutional neural networks. In Proceedings of the International Conference on Information Fusion.
Yu, L., Xun, C., Hu, P., & Wang, Z. (2017). Multi-focus image fusion with a deep convolutional neural network. Information Fusion, 36, 191–207.
Zagoruyko, S., & Komodakis, N. (2015). Learning to compare image patches via convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4353–4361).
Zhang, L., Zhang, L., Mou, X., & Zhang, D. (2011). Fsim: A feature similarity index for image quality assessment. IEEE Transactions on Image Processing, 20(8), 2378–2386.
Zhang, X., Ng, R., & Chen, Q. (2018). Single image reflection separation with perceptual losses. arXiv preprint arXiv:1806.05376


Author information

Authors and Affiliations

School of Electronic Engineering, Xidian University, Xian, 710071, Shaanxi, China
Yakun Chang, Cheolkon Jung, Jun Sun & Fengqiao Wang


Corresponding author

Correspondence to Cheolkon Jung.

Additional information

Communicated by Stephen Lin.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by the National Natural Science Foundation of China (No. 61872280) and the International S&T Cooperation Program of China (No. 2014DFG12780).

Appendices

Appendix A

We present a tripartite database including real and synthetic images in Sect. III. B. For the synthetic images generated from the data in Aksoy et al. (2018), we obtain natural glass images by directly applying Eqs. (2), (3), and (4), because both flash and no-flash images are available. However, there are no flash images in SUN RGB-D. Simply brightening the whole image cannot simulate the flash condition, because objects are brightened non-uniformly in flash images. This non-uniformity is mainly caused by the distance between the camera and the objects: objects near the camera receive strong illumination, whereas distant objects receive weak illumination. RGB-D data contain depth information, which lets us brighten the image hierarchically. We first fill the holes in the depth images and then normalize the depth values to the range 0 to 1 using their minimum and maximum values. Finally, we hierarchically enhance the color images according to the following function:

$$Y_F(p) = Y(p)^{0.5 + D(p)/1.3} \tag{18}$$

where $Y(p)$ is the pixel value at position $p$ in the Y channel of the YCbCr color space, $D(p)$ is the normalized depth at $p$, and $Y_F(p)$ is the enhanced value. From the enhanced Y channel, we generate the simulated flash image. Figure 18 shows two examples of $T_F$. In both, the wall is not visibly brightened because it is relatively far from the camera, while the chairs, bed, and ground are strongly enhanced because they are close to the camera. Hence, the data synthesis approximates the distance-dependent brightening of a real flash.
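The depth-dependent enhancement of Eq. (18) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `simulate_flash_luma`, the assumption that the Y channel is scaled to [0, 1], the small `eps` guard, and the toy inputs are all hypothetical choices made for illustration, and the depth map is assumed to be hole-filled already. Under the [0, 1] assumption, the nearest pixel gets exponent 0.5 (brightening), while the farthest pixel gets exponent 0.5 + 1/1.3 ≈ 1.27 (a slight darkening).

```python
import numpy as np


def simulate_flash_luma(y, depth, eps=1e-8):
    """Apply Eq. (18) to a luma channel (illustrative sketch only).

    y     : array of Y-channel values, assumed to be scaled to [0, 1].
    depth : hole-filled depth map with the same shape as y.
    eps   : hypothetical guard against division by zero for a flat depth map.
    """
    y = np.asarray(y, dtype=np.float64)
    depth = np.asarray(depth, dtype=np.float64)
    if y.shape != depth.shape:
        raise ValueError("y and depth must have the same shape")

    # Min-max normalize the depth map to [0, 1].
    d = (depth - depth.min()) / (depth.max() - depth.min() + eps)

    # Eq. (18): per-pixel exponent grows with normalized depth.
    exponent = 0.5 + d / 1.3
    return np.clip(y, 0.0, 1.0) ** exponent


# Toy example: uniform luma, depth increasing from 1 to 4.
y = np.full((2, 2), 0.25)
depth = np.array([[1.0, 2.0], [3.0, 4.0]])
y_flash = simulate_flash_luma(y, depth)
```

In this toy case the nearest pixel is brightened from 0.25 to 0.5, and the output decreases monotonically with depth, which matches the qualitative behavior described for the wall and the nearby furniture.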

Appendix B

We provide more results on real images in Figs. 19 and 20. For brevity, the ground truth of the intermediate results, i.e., $T_A$ and $T_F$, is omitted in these figures. The proposed method performs well across the various scenarios shown.


About this article

Cite this article

Chang, Y., Jung, C., Sun, J. et al. Siamese Dense Network for Reflection Removal with Flash and No-Flash Image Pairs. Int J Comput Vis 128, 1673–1698 (2020). https://doi.org/10.1007/s11263-019-01276-z


Received: 03 June 2019
Accepted: 06 December 2019
Published: 09 January 2020
Issue Date: June 2020
DOI: https://doi.org/10.1007/s11263-019-01276-z
