Analysis of phonemes and tones confusion rules obtained by ASR

Published: 01 January 2020

Volume 27, pages 3471–3481, (2021)
Wireless Networks

Gulnur Arkin¹,
Askar Hamdulla¹ &
Mijit Ablimit¹

Abstract

This paper explores an effective method for identifying erroneous phoneme pronunciation by learners of Mandarin Chinese whose mother tongue is Uyghur, with the aim of addressing a major problem in language education. It takes a data-driven approach to learner pronunciation: automatic speech recognition (ASR) is used to recognize the phonemes that the learners actually produce. Once the phoneme sequence has been recognized, the standard-pronunciation phonemes corresponding to the recognized phonemes are taken as target phonemes, which yields a mapping between each target phoneme and its recognized phoneme. From this mapping, the likely categories of phoneme error and the likely error rules in pronunciation can be derived. These may be of help to learners and may support a computer-assisted Chinese language learning system and the corresponding pronunciation evaluation model.
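
The mapping step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the target and recognized phoneme sequences have already been aligned one-to-one, and the function name `confusion_rules`, the `min_count` threshold, and the sample pairs are all hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def confusion_rules(pairs, min_count=1):
    """Derive target -> recognized confusion rules from aligned phoneme pairs.

    `pairs` is an iterable of (target_phoneme, recognized_phoneme) tuples.
    One-to-one alignment is an assumption of this sketch, not something
    stated in the abstract.
    """
    pairs = list(pairs)
    # How often each target phoneme occurs in total.
    totals = Counter(target for target, _ in pairs)
    # How often each target phoneme is recognized as a different phoneme.
    errors = Counter((t, r) for t, r in pairs if t != r)

    rules = defaultdict(list)
    for (target, recognized), n in errors.items():
        if n >= min_count:
            # Relative frequency of this substitution for the target phoneme.
            rules[target].append((recognized, n / totals[target]))

    # Most frequent confusion first; ties broken alphabetically.
    for target in rules:
        rules[target].sort(key=lambda item: (-item[1], item[0]))
    return dict(rules)


# Hypothetical aligned data, invented for illustration (not from the paper).
pairs = [
    ("zh", "z"), ("zh", "zh"), ("zh", "z"),
    ("sh", "s"), ("sh", "sh"),
    ("x", "x"),
]
rules = confusion_rules(pairs)
for target, confusions in sorted(rules.items()):
    for recognized, rate in confusions:
        print(f"{target} -> {recognized}: {rate:.2f}")
```

Each entry in `rules` pairs a target phoneme with the phonemes it was recognized as and the share of occurrences affected. Raising `min_count` would discard rare substitutions, which in practice are as likely to be recognizer errors as learner errors.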

Acknowledgments

This work was supported by the National Natural Science Foundation of China (NSFC; Grants 61662078 and 61633013) and the National Key Research and Development Plan of China (2017YFC0820602).

Author information

Authors and Affiliations

Institute of Information Science and Engineering, Xinjiang University, Ürümqi, 830046, China
Gulnur Arkin, Askar Hamdulla & Mijit Ablimit

Corresponding author

Correspondence to Askar Hamdulla.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Arkin, G., Hamdulla, A. & Ablimit, M. Analysis of phonemes and tones confusion rules obtained by ASR. Wireless Netw 27, 3471–3481 (2021). https://doi.org/10.1007/s11276-019-02220-2

Issue Date: July 2021
DOI: https://doi.org/10.1007/s11276-019-02220-2
