Abstract
This paper proposes a first approach that uses wavelet analysis within image processing to detect objects with a repetitive pattern and to perform binary classification in the image plane, with a particular focus on navigation in simulated environments. Algorithms based on convolutional neural networks (CNNs) are now commonly used to process, in the spatial domain, images obtained from the on-board camera of unmanned aerial vehicles (UAVs), and they have proven useful in detection and classification tasks. A CNN architecture can receive images as input in the training stage without pre-processing, which allows it to extract the characteristic features of the image. Nevertheless, in this work we argue that features at different frequencies, both low and high, also affect the performance of the CNN during training. We therefore propose a CNN architecture complemented by the 2D discrete wavelet transform, which serves as a feature extraction method. Wavelet analysis provides time-frequency information through a multiresolution analysis. Accordingly, we investigate combining wavelet-based multiresolution analysis with CNN architectures, so that information in the wavelet domain is used as input to the training stage. This information improves the learning capacity, eliminates overfitting, and yields more efficient detection of a target. Our learning model was also evaluated in the aerial navigation of a drone.
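To make the idea of using wavelet-domain information as CNN input concrete, the following is a minimal sketch of one level of a 2D discrete wavelet transform. The abstract does not state the wavelet family, the decomposition level, or how the subbands are passed to the network, so the Haar wavelet, the single level, and the helper names (`haar_dwt_1d`, `haar_dwt_2d`, `stack_subbands`) are all assumptions made for illustration, not the authors' implementation. Plain Python lists are used only to keep the example dependency-free.

```python
import math


def haar_dwt_1d(signal):
    """One level of the orthonormal Haar transform on an even-length sequence.

    Returns (approximation, detail), each half the input length.
    """
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(row) for row in zip(*matrix)]


def filter_columns(block):
    """Apply the 1D Haar step down every column of `block`."""
    lows, highs = [], []
    for column in transpose(block):
        approx, detail = haar_dwt_1d(column)
        lows.append(approx)
        highs.append(detail)
    return transpose(lows), transpose(highs)


def haar_dwt_2d(image):
    """One level of the separable 2D Haar DWT (assumed wavelet, see lead-in).

    `image` is a list of rows with even height and width. Returns four
    half-size subbands (ll, lh, hl, hh), where the first letter is the
    filter applied along each row and the second the filter applied along
    each column (l = low-pass, h = high-pass).
    """
    low_rows, high_rows = [], []
    for row in image:
        approx, detail = haar_dwt_1d(row)
        low_rows.append(approx)
        high_rows.append(detail)
    ll, lh = filter_columns(low_rows)
    hl, hh = filter_columns(high_rows)
    return ll, lh, hl, hh


def stack_subbands(image):
    """Hypothetical helper: arrange the four subbands as a channel list,
    one possible way to present wavelet-domain data to a CNN."""
    return list(haar_dwt_2d(image))


# Example: a 4x4 image with a repetitive vertical-stripe pattern.
stripes = [[0, 1, 0, 1] for _ in range(4)]
channels = stack_subbands(stripes)  # 4 channels, each 2x2
```

For the striped image, the variation along each row ends up in the `hl` subband while `lh` and `hh` stay at zero, which illustrates how a repetitive pattern separates into distinct frequency bands. In practice a library routine such as `pywt.dwt2` from PyWavelets would normally replace the hand-written transform.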
ACKNOWLEDGMENTS
J. M. Fortuna-Cervantes is a doctoral fellow of CONACYT (México) in the “Ciencias Aplicadas” program at IICO-UASLP. The authors thank INAOE for providing the facilities for the research internship during which part of this work was carried out.
Cite this article
Fortuna-Cervantes, J.M., Ramírez-Torres, M.T., Martínez-Carranza, J. et al. Object Detection in Aerial Navigation using Wavelet Transform and Convolutional Neural Networks: A First Approach. Program Comput Soft 46, 536–547 (2020). https://doi.org/10.1134/S0361768820080113
DOI: https://doi.org/10.1134/S0361768820080113