Communicative Efficiency in the Lexicon

Author: Peter Graff, Year: 2012

In this dissertation, I argue that a variety of probabilistic patterns in natural language phonology derive from communicative efficiency. I present evidence from phonetically transcribed dictionaries of 60 languages from 25 major language families showing that both the probability distributions over phonological structures licensed by the categorical grammar and the global organization of the phonological lexicon as a whole facilitate the efficient communication of intended messages from speaker to listener.

Specifically, I show that the occurrence probabilities of different grammatical structures render natural language phonology an efficient code for communication given the effort involved in producing different categories and the specific kinds of noise introduced by the human language channel. I also present evidence that co-occurrence restrictions on consonants sharing place features serve a communicative purpose in that they facilitate the identification of words with respect to each other. Furthermore, I show that the organization of the phonological lexicon as a whole is subject to communicative efficiency. Concretely, I show that words in human language preferentially rely on highly perceptible contrasts for distinctness, beyond what is expected from the probabilistic patterning of the individual sounds that distinguish them. This shows that redundancy in the phonological code is not randomly distributed, but exists to supplement imperceptible distinctions between larger units as needed.

I argue that cross-linguistic biases in the distributions of individual sounds arise from humans using their language in ways that accommodate anticipated mistransmission (Jurafsky et al. 2001, van Son and Pols 2003, Aylett and Turk 2004), thus presenting a serious challenge to theories relegating the emergence of communicative efficiency in phonology to properties of the human language channel only (Ohala 1981, Blevins 2004, 2006). Furthermore, I present preliminary computational and experimental evidence that the optimization of the lexicon as a whole could have arisen from the aggregate effects of speakers’ biases to use globally distinct word forms over the course of a language’s history (cf. Martin 2007).