research-article

A GDPR-compliant Ecosystem for Speech Recognition with Transfer, Federated, and Evolutionary Learning

Authors:
Di Jiang, AI Group, WeBank Co., Ltd., Shenzhen, China (ORCID: 0000-0003-2309-1809)
Conghui Tan, AI Group, WeBank Co., Ltd., Shenzhen, China (ORCID: 0000-0003-3993-4751)
Jinhua Peng, AI Group, WeBank Co., Ltd., Shenzhen, China
Chaotao Chen, AI Group, WeBank Co., Ltd., Shenzhen, China
Xueyang Wu, Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
Weiwei Zhao, AI Group, WeBank Co., Ltd., Shenzhen, China
Yuanfeng Song, AI Group, WeBank Co., Ltd., Shenzhen, China
Yongxin Tong, BDBC, SKLSDE Lab and IRI, Beihang University, Beijing, China (ORCID: 0000-0002-5598-0312)
Chang Liu, AI Group, WeBank Co., Ltd., Shenzhen, China
Qian Xu, AI Group, WeBank Co., Ltd., Shenzhen, China
Qiang Yang, AI Group, WeBank Co., Ltd., China and Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong
Li Deng, Citadel LLC, Chicago, IL, USA

ACM Transactions on Intelligent Systems and Technology, Volume 12, Issue 3, Article No. 30, pp. 1–19. https://doi.org/10.1145/3447687

Published: 05 May 2021

Abstract

Automatic Speech Recognition (ASR) plays a vital role in a wide range of real-world applications. However, commercial ASR solutions are typically “one-size-fits-all” products, and clients inevitably face the risk of severe performance degradation in field tests. Meanwhile, with new data regulations such as the European Union’s General Data Protection Regulation (GDPR) coming into force, ASR vendors, which traditionally use speech training data in a centralized manner, are increasingly unable to solve this problem, since accessing clients’ speech data is prohibited. Here, we show that by seamlessly integrating three machine learning paradigms, namely Transfer learning, Federated learning, and Evolutionary learning (TFE), we can build a win-win ecosystem for ASR clients and vendors and solve the aforementioned problems. Through large-scale quantitative experiments, we show that with TFE, clients can obtain far better ASR solutions than the “one-size-fits-all” counterpart, and vendors can exploit the abundance of clients’ data to effectively refine their own ASR products.

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Speech recognition


Published in
ACM Transactions on Intelligent Systems and Technology Volume 12, Issue 3
June 2021
218 pages
ISSN: 2157-6904
EISSN: 2157-6912
DOI: 10.1145/3460499
Editor: Yu Zheng, JD Digits, China
Copyright © 2021 Association for Computing Machinery.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 5 May 2021
- Accepted: 1 January 2021
- Revised: 1 November 2020
- Received: 1 March 2020

Author Tags
Speech recognition
federated learning
transfer learning
evolutionary learning
Qualifiers
- research-article
- Refereed