Computational Structure of Human Language

E. S. Ristad, 1990

for $19.95 x

The central thesis of this report is that human language is NP-complete.  That is, the process of comprehending and producing utterances is bounded above by the class NP, and below by NP-hardness.  This constructive complexity thesis has two empirical consequences.  The first is to predict that a language model outside NP is unnaturally powerful.  The second is to predict that a language model less complex than NP-hard is descriptively inadequate.

To prove the lower bound, I show that the following three subproblems of language comprehension are all NP-hard: decide whether a given sound is a possible sound of a given language; disambiguate a sequence of words; and compute the antecedents of pronouns.  The complexity proofs are based directly on the empirical facts of the language user"s knowledge, under an appropriate idealization.  Therefore, they are invariant across linguistic theories.  No knowledge of linguistic theory is needed to understand the proofs, only knowledge of English.

To illustrate the usefulness of the upper bound, I show that two widely-accepted analyses of the language user"s knowledge (of syntactic ellipsis and phonological dependencies) lead to complexity outside NP (PSPACE-hard and Undecidable, respectively).  Next, guided by the complexity proofs, I construct alternate linguistic analyses that are strictly superior on descriptive grounds, as well as being less complex computationally (in NP).

The report also presents a new framework for linguistic theorizing, that resolves important puzzles in generative linguistics, and guides the mathematical investigation of human language.

Artificial Intelligence Laboratory, MIT

Table of Contents

1          Foundation of the Investigation                                                              1

            1.1       The conceptual framework                                                                    2

            1.2       The constructive complexity thesis                                                         8

            1.3       Direct complexity analysis                                                                     9

            1.4       Summary of the report                                                              10

2          Structure of Phonological Knowledge                                                    13

            2.1       Segmental phonology                                                                            15

                        2.1.1    Complexity of segmental recognition and generation       18

                        2.1.2    Restricting the segmental model                                     20

                        2.1.3    The SPE evaluation metric                                                        30

                        2.1.4    Modeling phonological dependencies                                        34

            2.2       Autosegmental phonology                                                                     35

                        2.2.1    Complexity of autosegmental recognition                                   35

                        2.2.2    Suprasegmental dependencies                                       38

3          Syntactic Agreement and Lexical Ambiguity                                                      42

            3.1       Morpho-syntactic dependencies                                                            44

            3.2       Complexity of linguistic transforms                                                         51

            3.3       Complexity of agreement interactions                                        57

            3.4       Conclusions                                                                                          62

                        3.4.1    Locality in linguistic theory                                                         62

                        3.4.2    The search for uniform mechanisms                                           65

4          Complexity of Anaphora                                                                                   68

            4.1       Preliminaries                                                                                         69

            4.2       Two models of the anaphora problem                                       71

                        4.2.1    The agreement model                                                                71

                        4.2.2    The referential dependence model                                             76

                        4.2.3    From satisfiability to referential dependence                               84

            4.3       Evidence for an NP upper bound                                                          86

                        4.3.1    Copy theory of ellipsis                                                  89

                        4.3.2    Complexity of anaphora in the copy theory                                92

                        4.3.3    Ellipsis reconsidered                                                                 101

            4.4       Analysis of linguistic theories                                                                 106

5          References                                                                                                       110

A         Philosophical Issues                                                                                          118

            A.1      Implications of a complexity classification                                              118

            A.2      Unbounded idealizations                                                                       123

                        A.2.1   Unbounded agreement features                                     125

                        A.2.2   Unbounded phonological features                                              127

                        A.2.3   Limiting unbounded idealizations                                                129

B          Structure of Elliptical Dependencies                                                                   130

            B.1       Previous work reconsidered                                                                  132

                        B.1.1    Covariance rediced to prediction of subjects                             133

                        B.1.2    The problem of nonsubject covariance                          134

                        B.1.3    Covariance reduced to bound anaphora                                    137

            B.2       Invisible obviation                                                                                 140

            B.3       Necessary structure of an explicit theory                                                145

            B.4       The proposal system of representation                                       146

                        B.4.1    Segregate phrase and thematic structure                                    146

                        B.4.2    Two relations of referential dependency                         148

                        B.4.3    Ellipsis as a shared thematic-function                                         150

            B.5       The space of elliptical structures                                                            155

                        B.5.1    Possible domains of ellipsis                                                       155

                        B.5.2    Invisible crossover                                                                    157

                        B.5.3    Invisible obviation                                                                     158

                        B.5.4    Recursive ellipsis                                                                       160

            B.6       Conclusions                                                                                          161