Unsupervised Machine Learning Via Transfer Learning and k-Means Clustering to Classify Materials Image Data


Integrating Materials and Manufacturing Innovation (IF 2.4), Pub Date: 2021-04-08, DOI: 10.1007/s40192-021-00205-8
Ryan Cohn, Elizabeth Holm

Unsupervised machine learning offers significant opportunities for extracting knowledge from unlabeled datasets and for achieving maximum machine learning performance. This paper demonstrates how to construct, use, and evaluate a high-performance unsupervised machine learning system for classifying images in a popular microstructural dataset. The Northeastern University Steel Surface Defects Database includes micrographs of six different defects observed on hot-rolled steel in a format that is convenient for training and evaluating models for image classification. We use the VGG16 convolutional neural network pre-trained on the ImageNet dataset of natural images to extract feature representations for each micrograph. After applying principal component analysis to extract signal from the feature descriptors, we use k-means clustering to classify the images without needing labeled training data. The approach achieves 99.4% ± 0.16% accuracy, and the resulting model can be used to classify new images without retraining. This approach demonstrates an improvement in both performance and utility compared to a previous study. A sensitivity analysis is conducted to better understand the influence of each step on the classification performance. The results provide insight toward applying unsupervised machine learning techniques to problems of interest in materials science.
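The steps named in the abstract (feature extraction, principal component analysis, k-means clustering, and classification of new images without retraining) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: synthetic Gaussian feature vectors stand in for the VGG16 descriptors so the example runs without downloading weights or the dataset, and the helper names, the number of principal components, and the k-means settings are all hypothetical choices. Ground-truth labels are used only to name the clusters and measure accuracy, never to fit the model.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

N_CLASSES = 6  # the dataset described above contains six defect types


def make_synthetic_features(n_per_class=60, n_dims=256, seed=0):
    """Hypothetical stand-in for CNN feature vectors: one Gaussian blob per class."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(scale=5.0, size=(N_CLASSES, n_dims))
    labels = np.repeat(np.arange(N_CLASSES), n_per_class)
    features = centers[labels] + rng.normal(size=(labels.size, n_dims))
    order = rng.permutation(labels.size)  # shuffle so a simple slice gives a split
    return features[order], labels[order]


def fit_pipeline(features, n_components=20, seed=0):
    """Fit PCA, then k-means on the reduced features. No labels are used here."""
    pca = PCA(n_components=n_components, random_state=seed).fit(features)
    kmeans = KMeans(n_clusters=N_CLASSES, n_init=10, random_state=seed)
    kmeans.fit(pca.transform(features))
    return pca, kmeans


def map_clusters_to_labels(cluster_ids, true_labels):
    """One-to-one cluster-to-class mapping that maximizes agreement (for evaluation)."""
    counts = np.zeros((N_CLASSES, N_CLASSES), dtype=int)
    np.add.at(counts, (cluster_ids, true_labels), 1)  # rows: clusters, cols: classes
    rows, cols = linear_sum_assignment(-counts)  # negate to maximize matches
    mapping = np.zeros(N_CLASSES, dtype=int)
    mapping[rows] = cols
    return mapping


def classify(pca, kmeans, mapping, new_features):
    """Classify unseen feature vectors with the already-fitted PCA and centroids."""
    return mapping[kmeans.predict(pca.transform(new_features))]


features, labels = make_synthetic_features()
train_x, test_x = features[:240], features[240:]
train_y, test_y = labels[:240], labels[240:]

pca, kmeans = fit_pipeline(train_x)
mapping = map_clusters_to_labels(kmeans.labels_, train_y)
train_acc = float(np.mean(mapping[kmeans.labels_] == train_y))
test_acc = float(np.mean(classify(pca, kmeans, mapping, test_x) == test_y))
print(f"train accuracy: {train_acc:.3f}, held-out accuracy: {test_acc:.3f}")
```

In a real setting, `make_synthetic_features` would be replaced by descriptors extracted from the micrographs with a pre-trained network. The `classify` function reuses the fitted PCA projection and cluster centroids, which is one plausible way to label new images without retraining.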


Updated: 2021-04-09
