Image Classification Approach Using Machine Learning and an Industrial Hadoop Based Data Pipeline, Big Data Research


Big Data Research (IF 3.5), Pub Date: 2021-01-12, DOI: 10.1016/j.bdr.2021.100184
Rim Koulali, Hajar Zaidani, Maryeme Zaim

In smart cities, citizens help improve the overall quality of life by reporting infrastructure deficiencies. Multimedia content (images, videos) uploaded from smartphones allows city authorities to respond appropriately to incidents. This paper proposes a benchmark of machine learning (ML) algorithms for image classification, evaluated on a small dataset of citizen-captured images covering problems related to water and electricity distribution. The goal is to assign each image to its corresponding class so that the appropriate decision can be taken to address the reported problem. Several classical supervised ML algorithms and deep learning methods are trained and compared. The experimental results show that transfer learning with data augmentation and fine-tuning on the VGG16 network achieves high classification precision with good time performance. We also deployed our models through a Hadoop-based data pipeline, which led to a significant improvement in precision and image classification time.
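The abstract does not give implementation details, so the following is only a minimal sketch of what transfer learning with data augmentation and fine-tuning on VGG16 could look like in TensorFlow/Keras. The function names (`build_vgg16_classifier`, `unfreeze_last_block`), the number of classes, the 224x224 input size, the specific augmentations, the dropout rate, and the choice to unfreeze only the last convolutional block are all assumptions for illustration, not the authors' configuration.

```python
import tensorflow as tf


def build_vgg16_classifier(num_classes, weights="imagenet", image_size=224):
    """Frozen VGG16 convolutional base plus a small trainable head.

    Hypothetical helper: hyperparameters here are illustrative assumptions.
    Inputs are assumed to be already preprocessed for VGG16.
    """
    base = tf.keras.applications.VGG16(
        weights=weights,
        include_top=False,  # drop the original ImageNet classifier
        input_shape=(image_size, image_size, 3),
    )
    base.trainable = False  # phase 1: only the new head is trained

    # Assumed augmentations; these layers are active only during training.
    augment = tf.keras.Sequential([
        tf.keras.layers.RandomFlip("horizontal"),
        tf.keras.layers.RandomRotation(0.1),
        tf.keras.layers.RandomZoom(0.1),
    ])

    inputs = tf.keras.Input(shape=(image_size, image_size, 3))
    x = augment(inputs)
    x = base(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.5)(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs), base


def unfreeze_last_block(base):
    """Phase 2 (fine-tuning): make only the 'block5' layers trainable."""
    base.trainable = True
    for layer in base.layers:
        layer.trainable = layer.name.startswith("block5")
```

In a typical workflow the model would first be compiled and fitted with the base frozen, then `unfreeze_last_block` would be called and the model recompiled with a much smaller learning rate before continuing training. Images would normally be passed through `tf.keras.applications.vgg16.preprocess_input` beforehand.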


Updated: 2021-01-18
