Abstract
Many studies have sought to improve wafer bin map (WBM) defect classification because accurate WBM classification reveals which abnormal processes are reducing yield. In actual manufacturing, however, engineers label WBMs manually, which introduces a high level of label uncertainty, and this uncertainty has been a major cause of degraded WBM classification performance. In this paper, we propose a class label reconstruction method that subdivides a defect class containing various patterns into several groups, creates a new class for defect samples that cannot be categorized into known classes, and detects unknown defects. The proposed method repeatedly alternates between discriminative feature learning with a Siamese network and class label reconstruction based on Gaussian-means clustering in the learned feature space. We verified the proposed method using a real-world WBM dataset. When the class labels of the training dataset were corrupted, the proposed method increased classification accuracy on the test dataset by enabling corrupted samples to recover their original class labels. As a result, its accuracy was up to 7.8% higher than that of a convolutional neural network (CNN). Furthermore, through the proposed class label reconstruction, we found a new mixed-type defect class that had not previously been identified, and we detected new types of unknown defects that were not used for training with an average accuracy of over 73%.
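The two ingredients named in the abstract, pairwise discriminative feature learning and label reconstruction in a learned feature space, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a standard contrastive pair loss of the kind used with Siamese networks, and it substitutes a simple nearest-centroid reassignment for the Gaussian-means clustering step. The function names, the two-dimensional feature vectors, and the labels "scratch" and "ring" are hypothetical and chosen only for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, same_class, margin=1.0):
    """Contrastive pair loss (assumed form, not taken from the paper).

    Same-class pairs are pulled together; different-class pairs are
    pushed apart until they are at least `margin` away from each other.
    """
    d = euclidean(a, b)
    if same_class:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


def class_centroids(features, labels):
    """Mean feature vector of each class under the current labels."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for i, v in enumerate(f):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def reconstruct_labels(features, labels):
    """Reassign each sample to the class with the nearest centroid.

    Simplified stand-in for cluster-based label reconstruction: the
    paper clusters with Gaussian-means, which is not reproduced here.
    """
    cents = class_centroids(features, labels)
    return [min(cents, key=lambda y: euclidean(f, cents[y])) for f in features]


# Hypothetical learned features: two tight groups in a 2-D feature space.
features = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05), (0.0, 0.0),
            (2.0, 2.1), (2.1, 2.0), (1.9, 2.0)]
# The fourth label is deliberately corrupted ("ring" instead of "scratch").
labels = ["scratch", "scratch", "scratch", "ring", "ring", "ring", "ring"]

new_labels = reconstruct_labels(features, labels)
print(new_labels[3])  # scratch
```

In this toy setting the corrupted fourth sample lies next to the "scratch" group, so the reassignment restores that label. In a full pipeline the features would come from the trained network, and the relabeled data would be used to retrain it in the next round.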
Acknowledgements
This work was supported by the National Research Foundation of Korea (NRF) Grant funded by the Korean Government (MSIT) (Grant No. NRF-2019R1A2B5B01070358).
About this article
Cite this article
Park, S., Jang, J. & Kim, C.O. Discriminative feature learning and cluster-based defect label reconstruction for reducing uncertainty in wafer bin map labels. J Intell Manuf 32, 251–263 (2021). https://doi.org/10.1007/s10845-020-01571-4
DOI: https://doi.org/10.1007/s10845-020-01571-4