Noam Chomsky’s distinction between competence and performance has been controversial in linguistics and psycholinguistics for 50 years. The proponents of generative grammar presuppose it and rely on it, and have tried explaining the distinction many times, often unsuccessfully. I recently came across a neat way to encapsulate it that comes not from a linguist but from a mathematical meteorologist.
Psycholinguists (concerned with how language is really handled in human minds) and sociolinguists (interested in how language relates to social context) were horrified at Chomsky’s own exposition. The celebrated Page 3 of Aspects of the Theory of Syntax (1965) announces that the actual subject matter of linguistics is the intuitions about sentence structure of an imaginary “ideal speaker-listener in a completely homogeneous speech-community, who knows its language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of this language in actual performance.”
Older linguists were shocked. Linguistic science is concerned with depicting the unobservable grammatical intuitions of a fictive ideal person living in a monodialectal dreamworld where memory is unlimited and no one ever makes mistakes? Linguists whose training had involved the sternly empirical behavioral science of the first half of the 20th century saw all this as horrifyingly regressive, harking back to the bad old days when psychologists worked from intuitive hunches about mental life. Indeed, the very idea that linguistics was part of psychology appalled some, who had spent the previous decades trying to separate the rigorous description of grammatical systems from the experimental study of the behavioral and cognitive quirks of the imperfect organisms who use them.
And Chomsky was quite serious about shifting his discipline in a psychological direction: He insisted that intuitions were not just the evidence for theoretical linguistics (which was bad enough) but the subject matter. He even linked the subject back to rationalist philosophers of 17th-century France who spoke of “l’expression naturelle de nos pensées” rather than the structure of utterances, and “l’esprit de l’écrivain” rather than the content of texts. Quelle horreur! Linguistics was being driven back to the time of Descartes.
Yet the competence/performance distinction is perfectly sensible; as a grammarian I couldn’t do without it for a minute. And I think it can be clarified nicely without any reference to intuitions, or to the “esprit” of an imaginary ideal being.
A recent Economist article tells me that Edward Lorenz, the pioneer of applied nonlinear dynamics, once distinguished weather from climate not in terms of averaging out daily weather over some longer period, but in a conceptually simpler way: “Climate is what you expect; weather is what you get.”
Exactly what we need for the purpose at hand: Competence is what you expect; performance is what you get.
Suppose you want to state the facts about how many occurrences of “the” are found at the beginning of a simple definite noun phrase (NP) in English. The right answer is 1; “the value” is grammatical but *“the the value” is not. Yet counting occurrences of “the” as determiner in the standard corpus of 1987–1989 Wall Street Journal articles that computational linguists use for testing, we get a different figure: not 1, but roughly 1.0001.
The first file of the corpus contains the previously most potent clot-dissolving agent and the the Levitts; the second file contains the loose ears of their opponents and the the first two months of 1987; the third has the utility and the 30-day grace period and the the Soviet and Brazilian situations; and so it goes on. There are 231 occurrences of “the the” in total, for nearly 2.27 million NPs beginning with “the.”
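For readers who want to see what such a count involves, here is a minimal sketch in Python. It is an assumption about how one might go about it, not the procedure actually used on the corpus: the helper name `count_doubled_the` is hypothetical, the sample sentence is made up around a phrase quoted above, and the pattern only spots an immediately repeated “the” in raw text, without checking that it is really a determiner at the start of an NP.

```python
import re

# Matches "the" immediately followed by another "the", in any capitalization.
# This is a plain string-level check; it does no grammatical analysis.
DOUBLED_THE = re.compile(r"\bthe\s+the\b", re.IGNORECASE)


def count_doubled_the(text: str) -> int:
    """Count non-overlapping occurrences of a doubled 'the' in text (hypothetical helper)."""
    return len(DOUBLED_THE.findall(text))


# Invented sample sentence built around a phrase quoted in the text.
sample = "Sales fell in the the first two months of 1987, the report said."
print(count_doubled_the(sample))
```

A real count over the corpus would need parsed text to tell genuine determiners from other uses, which this sketch does not attempt.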
If we hug the data too closely we get the frankly silly result that when constructing an NP with the definite article as its determiner you should aim to use an average of about 1.0001 occurrences of “the.”
The sensible view, of course, is to say that 1 is what we expect (because there really is a grammatical rule limiting NPs to a single determiner), but 1.0001 is what we get on average (because journalists occasionally double up a “the” by mistake).
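The arithmetic behind that 1.0001 can be checked in a few lines of Python. This is a rough sketch under two assumptions: that the NP count is exactly 2,270,000 (the text says only “nearly 2.27 million”), and that each of the 231 doubled cases is one of those NPs and contributes two occurrences rather than one. The variable names are illustrative.

```python
# Figures quoted in the text; the NP count is approximate ("nearly 2.27 million").
nps_with_the = 2_270_000
doubled = 231

# What the grammatical rule leads us to expect: one determiner per NP.
expected_per_np = 1

# What the corpus gives on average: every NP has one "the",
# and each doubled case adds one extra occurrence.
observed_mean = (nps_with_the + doubled) / nps_with_the

print(f"expected: {expected_per_np}")
print(f"observed: {observed_mean:.4f}")
```

Under those assumptions the observed mean rounds to 1.0001, about one extra “the” for every ten thousand NPs.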
That’s one way of explaining why Chomsky said that we cannot possibly base grammatical description on performance. A grammar is not supposed to predict the vagaries of what we will get. It is supposed to tell us what to expect. The expectation of one definite article per NP is the only one that we should take as our guide when trying to compose an utterance. In real life occasional unintentional repetitions occur (and I suppose very occasional errors of omitting “the” as well), but only because of sporadic and unintended mistakes. Not everything you come across is something the rules of grammar should take account of.