The South African directory enquiries (SADE) name corpus

  • Original Paper
Language Resources and Evaluation

Abstract

We present the design and development of a South African directory enquiries corpus. It contains audio and orthographic transcriptions of a wide range of South African names produced by first-language speakers of four languages: Afrikaans, English, isiZulu and Sesotho. The corpus is a resource for understanding the effect of name language and speaker language on pronunciation, and it is the first that also aims to identify the “intended language”: an implicit assumption about word origin made by the speaker of the name. We describe the design, collection, annotation, and verification of the corpus. This includes an analysis of the algorithms used to tag the corpus with meta information that may be beneficial to pronunciation modelling tasks.

Notes

  1. https://cloud.google.com/translate/docs/reference/rest.

  2. Google Translate API was developed by Google as a proprietary application based on statistical machine translation.

  3. ISLRN 510-842-952-534-8, available from http://hdl.handle.net/20.500.12185/378 under a Creative Commons Attribution License (3.0 Unported).

Acknowledgements

This work is based on research supported by the Department of Arts and Culture (DAC) of the government of South Africa, through their Human Language Technologies (HLT) unit, and the National Research Foundation (NRF). Any opinion, finding and conclusion or recommendation expressed in this material is that of the authors and the NRF does not accept any liability in this regard. The support by both institutions is gratefully acknowledged.

Author information

Corresponding author

Correspondence to Marelie H. Davel.

Appendices

1.1 Appendix 1: The SADE phone set

Table 13 provides a description of the SADE phoneme set as used to annotate the SADE corpus. For each phoneme, the corresponding IPA and X-SAMPA symbols are also provided.

Table 13 SADE phone set

1.2 Appendix 2: Consent form

The following consent form was used during data collection:

[Figure: consent form]

Cite this article

Thirion, J.W.F., van Heerden, C., Giwa, O. et al. The South African directory enquiries (SADE) name corpus. Lang Resources & Evaluation 54, 155–184 (2020). https://doi.org/10.1007/s10579-019-09448-6

  • DOI: https://doi.org/10.1007/s10579-019-09448-6
