Learning opacity in Stratal Maximum Entropy Grammar*

Aleksei Nazarov; Joe Pater

doi:10.1017/S095267571700015X

Published online by Cambridge University Press: 14 August 2017

Aleksei Nazarov, Harvard University (anazarov@fas.harvard.edu)
Joe Pater, University of Massachusetts Amherst (pater@linguist.umass.edu)

Abstract

Opaque phonological patterns are sometimes claimed to be difficult to learn; specific hypotheses have been advanced about the relative difficulty of particular kinds of opaque processes (Kiparsky 1971, 1973), and the kind of data that is helpful in learning an opaque pattern (Kiparsky 2000). In this paper, we present a computationally implemented learning theory for one grammatical theory of opacity, a Maximum Entropy version of Stratal OT (Bermúdez-Otero 1999, Kiparsky 2000), and test it on simplified versions of opaque French tense–lax vowel alternations and the opaque interaction of diphthong raising and flapping in Canadian English. We find that the difficulty of opacity can be influenced by evidence for stratal affiliation: the Canadian English case is easier if the learner encounters application of raising outside the flapping context, or non-application of raising between words (e.g. life with [ʌɪ]; lie for with [aɪ]).
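As a rough illustration of how a Stratal Maximum Entropy grammar could assign probabilities, the sketch below chains two MaxEnt strata: each stratum turns weighted constraint violations into a probability distribution over its candidates, and the probability of a surface form is summed over the intermediate forms that lead to it. The constraint names, candidate sets, violation counts and weights are invented for illustration (a 'writer'-like form with raising at the word level and flapping at the phrase level); they are assumptions, not the constraint set or data used in the paper, and no learning is performed.

```python
import math

def maxent_probs(violations, weights):
    """Return a MaxEnt distribution over candidates.

    violations: dict mapping candidate -> list of violation counts
    weights:    list of non-negative constraint weights (same order)
    """
    # Harmony is the negative weighted sum of violations.
    harmonies = {cand: -sum(w * v for w, v in zip(weights, viols))
                 for cand, viols in violations.items()}
    z = sum(math.exp(h) for h in harmonies.values())
    return {cand: math.exp(h) / z for cand, h in harmonies.items()}

def stratal_probs(word_level, phrase_level, w_word, w_phrase):
    """P(surface) = sum over intermediate forms of P(mid) * P(surface | mid)."""
    p_surface = {}
    for mid, p_mid in maxent_probs(word_level, w_word).items():
        for surf, p in maxent_probs(phrase_level[mid], w_phrase).items():
            p_surface[surf] = p_surface.get(surf, 0.0) + p_mid * p
    return p_surface

# Hypothetical word-level tableau for /raɪtər/.
# Constraint order: [*aɪ-before-voiceless, Ident(low)] (illustrative names).
WORD_LEVEL = {"raɪtər": [1, 0], "rʌɪtər": [0, 1]}
W_WORD = [5.0, 1.0]

# Hypothetical phrase-level tableaux, one per word-level output.
# Constraint order: [*VtV, Ident(sonorant)] (illustrative names).
PHRASE_LEVEL = {
    "raɪtər": {"raɪtər": [1, 0], "raɪɾər": [0, 1]},
    "rʌɪtər": {"rʌɪtər": [1, 0], "rʌɪɾər": [0, 1]},
}
W_PHRASE = [5.0, 1.0]

if __name__ == "__main__":
    probs = stratal_probs(WORD_LEVEL, PHRASE_LEVEL, W_WORD, W_PHRASE)
    for form, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        print(f"{form}\t{p:.4f}")
```

With these made-up weights, raising is favoured at the word level and flapping at the phrase level, so the opaque form with both a raised diphthong and a flap receives most of the probability mass. A learner would adjust the two weight vectors to fit observed surface frequencies.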

Type: Articles
Information: Phonology, Volume 34, Issue 2, August 2017, pp. 299–324

DOI: https://doi.org/10.1017/S095267571700015X
Copyright: © Cambridge University Press 2017

Footnotes

We would like to thank Ricardo Bermúdez-Otero, Paul Boersma, Jeroen Breteler, Ivy Hauser, Jeff Heinz, Coral Hughto, Gaja Jarosz, Marc van Oostendorp, Olivier Rizzolo, Klaas Seinhorst and Robert Staubs, as well as audiences at the 21st Manchester Phonology Meeting, the University of Massachusetts Amherst and the University of Amsterdam, for their insightful feedback on this paper and for stimulating discussion. We also thank the editors of this volume and two anonymous reviewers for their very helpful comments. We are grateful to the National Science Foundation for supporting this work through grants BCS-0813829 and BCS-1424077. All errors are ours.

References

Bermúdez-Otero, Ricardo (1999). Constraint interaction in language change: quantity in English and Germanic. PhD dissertation, University of Manchester & University of Santiago de Compostela.

Bermúdez-Otero, Ricardo (2003). The acquisition of phonological opacity. In Spenader et al. (2003). 25–36.

Boersma, Paul (1998). Functional phonology: formalizing the interactions between articulatory and perceptual drives. PhD dissertation, University of Amsterdam.

Boersma, Paul & van Leussen, Jan-Willem (to appear). Efficient evaluation and learning in multilevel parallel constraint grammars. LI 48.

Byrd, Richard H., Lu, Peihuang, Nocedal, Jorge & Zhu, Ciyou (1995). A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing 16. 1190–1208.

Coetzee, Andries W. & Pater, Joe (2011). The place of variation in phonological theory. In Goldsmith et al. (2011). 401–434.

De Jong, Kenneth J. (2011). Flapping in American English. In van Oostendorp, Marc, Ewen, Colin J., Hume, Elizabeth & Rice, Keren (eds.) The Blackwell companion to phonology. Malden, Mass.: Wiley-Blackwell. 2711–2729.

Dinnsen, Daniel A. & Farris-Trimble, Ashley W. (2008). An opacity-tolerant conspiracy in phonological acquisition. Indiana University Working Papers in Linguistics 6. 99–118.

Eisenstat, Sarah (2009). Learning underlying forms with MaxEnt. MA thesis, Brown University.

Eychenne, Lucien (2014). Schwa and the loi de position in Southern French. Journal of French Language Studies 24. 223–253.

Goldsmith, John A., Riggle, Jason & Yu, Alan C. L. (eds.) (2011). The handbook of phonological theory. 2nd edn. Malden, Mass.: Wiley-Blackwell.

Goldwater, Sharon & Johnson, Mark (2003). Learning OT constraint rankings using a Maximum Entropy model. In Spenader et al. (2003). 111–120.

Hayes, Bruce & Wilson, Colin (2008). A maximum entropy model of phonotactics and phonotactic learning. LI 39. 379–440.

Hoerl, Arthur E. & Kennard, Robert W. (1970). Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12. 55–67.

Idsardi, William J. (2000). Clarifying opacity. The Linguistic Review 17. 337–350.

Jarosz, Gaja (2006). Rich lexicons and restrictive grammars: maximum likelihood learning in Optimality Theory. PhD dissertation, Johns Hopkins University.

Jarosz, Gaja (2014). Serial markedness reduction. In John Kingston, Claire Moore-Cantwell, Joe Pater & Robert Staubs (eds.) Proceedings of the 2013 Meeting on Phonology. Available (May 2017) at http://journals.linguisticsociety.org/proceedings/index.php/amphonology/article/view/40.

Jarosz, Gaja (2015). Expectation driven learning of phonology. Ms, University of Massachusetts Amherst.

Jarosz, Gaja (2016). Learning opaque and transparent interactions in Harmonic Serialism. In Gunnar Ólafur Hansson, Ashley Farris-Trimble, Kevin McMullin & Douglas Pulleyblank (eds.) Proceedings of the 2015 Annual Meeting on Phonology. Available (May 2017) at http://journals.linguisticsociety.org/proceedings/index.php/amphonology/article/view/3671.

Johnson, Mark (2013). A gentle introduction to maximum entropy, log-linear, exponential, logistic, harmonic, Boltzmann, Markov Random Fields, Conditional Random Fields, etc., models. Slides of paper presented to the Macquarie University Machine Learning Reading Group. Available (May 2017) at http://web.science.mq.edu.au/~mjohnson/papers/Johnson12IntroMaxEnt.pdf.

Johnson, Mark, Pater, Joe, Staubs, Robert & Dupoux, Emmanuel (2015). Sign constraints on feature weights improve a joint model of word segmentation and phonology. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics. 303–313.

Joos, Martin (1942). A phonological dilemma in Canadian English. Lg 18. 141–144.

Kaye, Jonathan (1990). What ever happened to Dialect B? In Mascaró, Joan & Nespor, Marina (eds.) Grammar in progress: GLOW essays for Henk van Riemsdijk. Dordrecht: Foris. 259–263.

Kim, Yun Jung (2012). Do learners prefer transparent rule ordering? An artificial language learning study. CLS 48:1. 375–386.

Kiparsky, Paul (1971). Historical linguistics. In Dingwall, William Orr (ed.) A survey of linguistic science. College Park: University of Maryland Linguistics Program. 576–642.

Kiparsky, Paul (1973). Abstractness, opacity, and global rules. In Fujimura, Osamu (ed.) Three dimensions in linguistic theory. Tokyo: TEC. 57–86.

Kiparsky, Paul (2000). Opacity and cyclicity. The Linguistic Review 17. 351–365.

Kullback, S. & Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics 22. 79–86.

McCarthy, John J. (2007). Hidden generalizations: phonological opacity in Optimality Theory. Sheffield & Bristol, Conn.: Equinox.

McCarthy, John J. & Pater, Joe (eds.) (2016). Harmonic Grammar and Harmonic Serialism. London: Equinox.

Moreux, Bernard (1985). La ‘Loi de Position’ en français du Midi. I: Synchronie (Béarn). Cahiers de Grammaire 9. 45–138.

Moreux, Bernard (2006). Les voyelles moyennes en français du Midi: une tentative de synthèse en 1985. Cahiers de Grammaire 30. 307–317.

Odden, David (2011). Rules v. constraints. In Goldsmith et al. (2011). 1–39.

Pater, Joe (2014). Canadian raising with language-specific weighted constraints. Lg 90. 230–240.

Pater, Joe (2016). Universal Grammar with weighted constraints. In McCarthy & Pater (2016). 1–46.

Pater, Joe, Staubs, Robert, Jesney, Karen & Smith, Brian (2012). Learning probabilities over underlying representations. In Proceedings of the 12th Meeting of the Special Interest Group on Computational Morphology and Phonology. Montreal: Association for Computational Linguistics. 62–71. Available (May 2017) at www.aclweb.org/anthology/W12-2308.

R Core Team (2013). R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. http://www.R-project.org/.

Rizzolo, Olivier (2002). Du leurre phonétique des voyelles moyennes en français et du divorce entre licenciement et licenciement pour gouverner. PhD dissertation, University of Nice-Sophia Antipolis.

Selkirk, Elisabeth (1978). The French foot: on the status of ‘mute’ e. Studies in French Linguistics 1:2. 141–150.

Smolensky, Paul (1986). Information processing in dynamical systems: foundations of Harmony Theory. In Rumelhart, D. E., McClelland, J. L. & the PDP Research Group (eds.) Parallel Distributed Processing: explorations in the micro-structure of cognition. Vol. 1: Foundations. Cambridge, Mass.: MIT Press. 194–281.

Smolensky, Paul & Legendre, Géraldine (eds.) (2006). The harmonic mind: from neural computation to optimality-theoretic grammar. 2 vols. Cambridge, Mass.: MIT Press.

Spenader, Jennifer, Eriksson, Anders & Dahl, Östen (eds.) (2003). Variation within Optimality Theory: Proceedings of the Stockholm Workshop on ‘Variation within Optimality Theory’. Stockholm: Department of Linguistics, Stockholm University.

Staubs, Robert (2014a). Computational modeling of learning biases in stress typology. PhD dissertation, University of Massachusetts Amherst.

Staubs, Robert (2014b). Stratal MaxEnt Solver. Software package. Available (July 2017) at http://www.linguist.robertstaubs.org/HGR/serialMaxEnt.zip.

Staubs, Robert & Pater, Joe (2016). Learning serial constraint-based grammars. In McCarthy & Pater (2016). 369–388.

Tesar, Bruce & Smolensky, Paul (2000). Learnability in Optimality Theory. Cambridge, Mass.: MIT Press.
