
An end-to-end annotation-free machine vision system for detection of products on the rack

  • Original Paper
  • Published in: Machine Vision and Applications (2021)

Abstract

Given a single instance (or template image) per product, our objective is to detect the merchandise displayed in images of racks in a supermarket. Our end-to-end solution consists of three consecutive modules: exemplar-driven region proposal, classification of the region proposals, and non-maximal suppression. The two-stage exemplar-driven region proposal works with the example or template image of each product. The first stage estimates the scale between the template images of the products and the rack image. The second stage generates proposals of potential regions using the estimated scale. Subsequently, the potential regions are classified using a convolutional neural network. Neither the generation nor the classification of region proposals requires annotation of the rack image in which the products are recognized. Finally, the products are identified by removing ambiguous, overlapping region proposals with greedy non-maximal suppression. Extensive experiments are performed on one in-house dataset and three publicly available datasets: Grocery Products, WebMarket and GroZi-120. The proposed solution outperforms competing approaches, improving detection accuracy by up to around 4%. Moreover, in the repeatability test, our solution is found to be better than state-of-the-art methods.
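
The final module mentioned in the abstract is the standard greedy non-maximal suppression step. As a minimal NumPy sketch of that step only (not the authors' implementation; the corner-coordinate box format, confidence scores and IoU threshold below are assumptions), it can be written as:

```python
import numpy as np

def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximal suppression.

    boxes:  (N, 4) array of [x1, y1, x2, y2] region proposals.
    scores: (N,)   array of classification confidences.
    Returns the indices of the proposals that survive suppression.
    """
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]   # proposals sorted by descending confidence
    keep = []
    while order.size > 0:
        i = order[0]                 # accept the best remaining proposal
        keep.append(int(i))
        # intersection of proposal i with all remaining proposals
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # discard remaining proposals that overlap proposal i too heavily
        order = order[1:][iou <= iou_threshold]
    return keep
```

At each iteration the highest-scoring surviving proposal is accepted and every remaining proposal whose IoU with it exceeds the threshold is discarded, which is how the ambiguous overlapping region proposals are resolved.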



Notes

  1. https://github.com/keras-team/keras, accessed 04/2020.

  2. https://github.com/mdbloice/Augmentor, accessed 04/2020.

  3. https://github.com/aleju/imgaug, accessed 04/2020.
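
The footnotes above list the deep learning framework (Keras) and the image augmentation libraries (Augmentor, imgaug) only by URL. Purely as an illustration of how such a library is typically used to augment product template images for training a classifier (the transforms, parameter ranges and image sizes below are assumptions, not the configuration used in the paper), a minimal imgaug pipeline could look like this:

```python
import numpy as np
import imgaug.augmenters as iaa

# Hypothetical augmentation pipeline; the transforms and ranges are illustrative only.
seq = iaa.Sequential([
    iaa.Fliplr(0.5),                                 # horizontal flip with probability 0.5
    iaa.Affine(rotate=(-10, 10), scale=(0.9, 1.1)),  # small rotations and rescaling
    iaa.GaussianBlur(sigma=(0.0, 1.0)),              # mild blur
    iaa.Multiply((0.8, 1.2)),                        # brightness variation
])

# Dummy batch standing in for product template crops.
templates = np.random.randint(0, 255, size=(8, 224, 224, 3), dtype=np.uint8)
augmented = seq(images=templates)                    # augmented copies of the templates
```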


Acknowledgements

We would like to thank TCS Limited for partially supporting this work. We would also like to thank NVIDIA Corporation for donating the Titan Xp GPU used in this research.

Author information


Corresponding author

Correspondence to Bikash Santra.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Santra, B., Shaw, A.K. & Mukherjee, D.P. An end-to-end annotation-free machine vision system for detection of products on the rack. Machine Vision and Applications 32, 56 (2021). https://doi.org/10.1007/s00138-021-01186-6
