Abstract
Gesture recognition with cameras that capture detailed images is not feasible in many places because of concerns about privacy and information leakage. To address this problem, we have proposed a method that captures shadow pictures by single-pixel imaging to realize privacy-conscious gesture recognition. As a way to implement single-pixel imaging in public spaces, we have studied the use of a high-frame-rate LED display as the light source. With a high-frame-rate LED display, random patterns can be kept latent while the observer perceives an apparent image. However, the image reconstructed by single-pixel imaging with a high-frame-rate LED display is influenced by the apparent image, which makes gesture recognition difficult. In this study, we show that the influence of the apparent image can be removed by restoring the reconstructed image with U-Net, a convolutional neural network, and that high classification accuracy can be achieved with a small number of illuminations by classifying the restored images with LeNet.
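The effect described in the abstract, where the reconstruction is influenced by the apparent image, can be illustrated with a small simulation. The sketch below is not the authors' implementation. It assumes a simple model in which each displayed frame is the apparent image multiplied by an independent random 0/1 dot mask, and it uses a plain correlation (covariance) reconstruction. The function name `reconstruct_spi`, the 8×8 image size, and the number of patterns are all illustrative choices.

```python
import numpy as np


def reconstruct_spi(target, apparent, n_patterns, seed=0):
    """Simulate single-pixel imaging with random dots hidden in an apparent image.

    ASSUMED model (not taken from the paper): frame_k = apparent * mask_k,
    where mask_k is an independent random 0/1 dot pattern.
    """
    rng = np.random.default_rng(seed)
    masks = rng.integers(0, 2, size=(n_patterns, *target.shape)).astype(float)
    frames = masks * apparent  # patterns shown by the display
    # Single-pixel detector: one total intensity value per displayed frame.
    signals = (frames * target).sum(axis=(1, 2))
    # Correlation reconstruction: covariance of the signal with each pixel.
    recon = (signals[:, None, None] * frames).mean(axis=0) \
        - signals.mean() * frames.mean(axis=0)
    # The time average of the frames stands in for what an observer perceives.
    return recon, frames.mean(axis=0)


# Hypothetical test scene: a bright field with a rectangular shadow.
target = np.ones((8, 8))
target[2:6, 3:5] = 0.0
# Hypothetical apparent image: a horizontal brightness gradient.
apparent = np.tile(np.linspace(0.2, 1.0, 8), (8, 1))

recon, perceived = reconstruct_spi(target, apparent, n_patterns=8000)

# Under the assumed model, the reconstruction approaches
# target * apparent**2 / 4, i.e. the shadow picture weighted by the apparent image.
expected = target * apparent**2 / 4
print(np.corrcoef(recon.ravel(), expected.ravel())[0, 1])
```

Under this assumed model the reconstruction follows the shadow picture weighted by the square of the apparent image rather than the shadow picture itself, which is the kind of distortion a restoration network would then have to undo.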
Data availability
The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.
Funding
A part of this work was supported by JSPS KAKENHI (20H05702).
Author information
Contributions
HT contributed to this paper as first author: he conducted the experiments, analyzed the data, and wrote the original draft. MY, SS, and HY designed the experiments and edited the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflicts of interest associated with this manuscript.
About this article
Cite this article
Takatsuka, H., Yasugi, M., Suyama, S. et al. Gesture recognition using deep-learning in single-pixel-imaging with high-frame-rate display with latent random dot patterns. Opt Rev 31, 116–125 (2024). https://doi.org/10.1007/s10043-023-00848-2