Air traffic control communication (ATCC) speech corpora and their use for ASR and TTS development

Language Resources and Evaluation

Abstract

The paper introduces the motivation for creating dedicated speech corpora of air traffic control communication and describes in detail how the corpora for automatic speech recognition and for text-to-speech synthesis were prepared. It then presents an illustrative example of a speech recognition system developed using the automatic speech recognition corpora, and finally describes the technical aspects of the data and the distribution channel.

Notes

  1. Which, in our case, does not in fact mean “high quality”—see the following sections.

  2. https://www.clarin.eu/.

  3. Although ATC communication can be conducted in the native language in regional air traffic, the use of English is naturally indispensable in international ATC.

  4. https://catalog.ldc.upenn.edu/LDC94S14A.

  5. http://catalog.elra.info/en-us/repository/browse/ELRA-S0293/.

  6. See for example http://aviationknowledge.wikidot.com/aviation:nato-phonetic-alphabet.

  7. Note that the assumption that we know the origin of the data is perfectly reasonable—in our “artificial pseudopilot” scenario, we also only expect the controller’s speech.

  8. http://en.wikipedia.org/wiki/Arpabet.

  9. http://itblp.zcu.cz/.

Funding

Funding was provided by Grantová Agentura České Republiky (Grant No. GBP103/12/G084).

Author information

Corresponding author

Correspondence to Pavel Ircing.

About this article

Cite this article

Šmídl, L., Švec, J., Tihelka, D. et al. Air traffic control communication (ATCC) speech corpora and their use for ASR and TTS development. Lang Resources & Evaluation 53, 449–464 (2019). https://doi.org/10.1007/s10579-019-09449-5

  • DOI: https://doi.org/10.1007/s10579-019-09449-5
