Code-breaking

We think of the empirical study of grammar as an exercise in code-breaking. A metaphor, no doubt, but a most fruitful one: it structures and dominates the way we think about the enterprise, and some of the consequences of adopting it are detailed below. English is to be viewed as a gigantic code, a system for processing messages into sentences, a vast battery of alternative encoding routines or programs, all taking messages as their original inputs and English sentences as their outputs. The connection between ‘complete’ messages and ‘complete’ sentences is that one message is encoded into one sentence by one complete machine run through some encoding program of the language.

The enterprise is nothing less than to crack the code for English: to delineate its encoding programs, and thereby show how messages are articulated in the sentences which encode them. 

Technical Vocabulary

Code-breakers have to be utterly scrupulous in keeping clear which terms are syntactic, belonging to the realm of words and sentences, and which semantic, belonging to the realm of ideas and messages. To conflate the two is tantamount to airbrushing the science of grammar out of the picture, for grammar is precisely the study of the relationship between them. Accordingly we adopt throughout distinct vocabularies for the two realms. At first, this means some hard work: there will be many unfamiliar technical terms (unfamiliar because newly coined) to learn. The payoff comes later, when we are able to give clear and precise grammatical accounts of tense, aspect and modality.

Invariance

The code-breaking metaphor imposes an austere methodology, for it is a fundamental assumption of the code-breaking approach that behind every word or other syntactic device (like phase-modification, or formal tense) is a unique and invariant informational factor (its meaning, if you like) which triggers selection of that word or device in the encoding process. There is no other half-sensible way in which a code could work.

Of course, there is the phenomenon of lexical ambiguity, where one string of letters has two or more dictionary entries. We are not about to deny that the string ‘f-r-o-g’ can betoken a depression in the top of a housebrick as well as an animal. But that phenomenon raises no genuine difficulty for our methodological principle, only the need for a terminology to delineate it. We respond by discerning two different words, each spelt ‘f-r-o-g’. Which is, in effect, what dictionaries do in giving separate entries for frog1, frog2, and so on.

But beyond such trivial cases, the methodological principle rules: same word, same idea.

For definiteness, an example. It is a matter of observation that the sentence

That dart might have landed in baby’s eye

can encode two quite different messages.  (Lucky it didn’t? Or perhaps it did?).  The Philosophers are prone to sheet the ambiguity home to the modal, discerning different senses of ‘might’.  For us, that would be not an explanatory theory, but a sheer admission of defeat.  An acceptance of brute diversity is always the last resort for a code-breaker.  Or again,

Her Majesty will walk home

can equally encode a message concerning a single future event (Her Majesty will walk home tonight), and a message concerning Her Majesty’s present habits (Nowadays, thanks to the Osborne cuts, Her Majesty will invariably walk home).  How is this datum to be explained?   Not, according to us, by discerning varying uses of ‘will’.

What is instead happening in both these cases is that quite different encoding programs, designed for quite different purposes, nonetheless end up outputting the same sentence.
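As a toy illustration of this point, the following Python sketch models two encoding programs that accept different kinds of message and still emit the same string. Everything in it is an assumption made for the illustration: the message fields (`agent`, `action`, `occasion`, `regularity`), the two function names, and the way each program assembles its output are invented here, and none of it is offered as the actual analysis of ‘will’.

```python
# Toy model only. The message fields and both programs are hypothetical,
# invented for this sketch; they are not an analysis of English.

def encode_future_event(message):
    """Hypothetical program for a message about one future occasion."""
    # The message must carry an occasion, but no word in the output records it.
    assert "occasion" in message
    return f"{message['agent']} will {message['action']}"


def encode_present_habit(message):
    """Hypothetical program for a message about a present habit."""
    # The message must carry a regularity, but no word in the output records it.
    assert "regularity" in message
    return f"{message['agent']} will {message['action']}"


event = {"agent": "Her Majesty", "action": "walk home", "occasion": "tonight"}
habit = {"agent": "Her Majesty", "action": "walk home", "regularity": "invariably"}

sentence_from_event = encode_future_event(event)
sentence_from_habit = encode_present_habit(habit)

# Two different messages, two different programs, one and the same sentence.
print(sentence_from_event == sentence_from_habit)
```

The sketch locates the difference between the two readings in which program ran and on what message, not in the word ‘will’, which is the shape of explanation the text asks for.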

That possibility should come as no surprise.  There is nothing in the idea of a code to suggest that different messages can never end up with the same signal. Nor is it necessarily a defect in a code that it generate ambiguous signals in this way.  Practical utility, for instance, does not dictate univocity of signal.  As a means to communicating messages from mind to mind, a code succeeds to the extent that the right messages are conveyed, by whatever overall process, upon individual occasions of broadcast. It is not necessary that the messages be recoverable from the signals alone.  The Native English Speaker is clever, and a naturally evolving code will exploit that fact to the hilt.

It is a corollary to this last thought that there be the phenomenon of …

Unsignalled Information

Unsignalled information is the obverse of ambiguity. Messages regularly comprise both SIGNALLED and UNSIGNALLED information. Signalled information is the sum of those informational factors which end up encoded in the words which make up the signal. Unsignalled information is the sum of informational factors ingredient in the message,  but nowhere encoded in the output signal. Unsignalled information has to be read in by the hearer in her attempt to grasp the message intended to be conveyed by the speaker.
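The division can be pictured with a small Python sketch. It is only a toy under stated assumptions: the factor labels, the table `WORDS_TRIGGERED_BY`, and the idea that a `setting` factor goes unencoded are all hypothetical, and the example sentence is borrowed from the Word Order section below.

```python
# Toy model only: the factor labels and the word table are hypothetical.

# Which informational factor triggers which words in the signal.
WORDS_TRIGGERED_BY = {
    ("agent", "TAI"): "Tai",
    ("action", "EAT"): "eats",
    ("instrument", "CHOPSTICKS"): "with chopsticks",
}


def split_factors(message, words_triggered_by):
    """Partition a message's factors into signalled and unsignalled."""
    signalled = [f for f in message if f in words_triggered_by]
    unsignalled = [f for f in message if f not in words_triggered_by]
    return signalled, unsignalled


message = [
    ("agent", "TAI"),
    ("action", "EAT"),
    ("instrument", "CHOPSTICKS"),
    ("setting", "AT_HOME"),  # part of the message, encoded by no word
]

signalled, unsignalled = split_factors(message, WORDS_TRIGGERED_BY)
signal = " ".join(WORDS_TRIGGERED_BY[f] for f in signalled)

print(signal)
print(unsignalled)
```

On this picture the hearer receives only `signal` and must supply the contents of `unsignalled` herself.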

More on unsignalled information, illustrated through examples, is to be found here

Grammatical Structure

Sentences have a simple enough structure: that of a string of sausages. There is a first word, and a second word, and so on. Messages are organised on quite different lines. They have grammatical structure. We can offer a precise account: the structure of a message is that imposed by the encoding program which generates its corresponding sentence.

The overall encoding process has two distinct and successive phases: an inquisition phase, followed by an execution phase.  In the inquisition phase, the message is submitted to a programmed interrogation.   Information is extracted from the message in a tree of questions, with answers to earlier questions helping to determine subsequent lines of enquiry.  The result of the inquisition, when the last jot of information has been wrung from the message, is a structural arrangement of informational choices, the structure naturally reflecting the course of the inquisition itself.  The execution phase sees this structured arrangement somehow transformed into a sentence, an item with utterly different structural properties.

The grammatical structure of a message is precisely the structural arrangement of informational choices. The code-breaker’s task is to plot the course of the inquisition. His object is to discover the routine whereby information conditioning the eventual string of words was systematically elicited from the originating message. He wants to be able to see the message articulated in the sentence.
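The two phases can be sketched in Python as a pair of functions. This is a minimal sketch under loud assumptions: the questions, the idea labels (`EAT`, `TAI`, `CHOPSTICKS`), the word table and the helper `ask` are hypothetical, and the `execution` function is a mere placeholder that makes no attempt to reproduce English word order.

```python
# Toy model only: the questions, idea labels and word table are hypothetical.

WORD_FOR_IDEA = {"EAT": "eats", "TAI": "Tai", "CHOPSTICKS": "with chopsticks"}


def ask(question, answer, followups=()):
    """Record one question, the answer extracted, and the enquiries it opened."""
    return {"question": question, "answer": answer, "followups": list(followups)}


def inquisition(message):
    """Interrogate the message; the returned tree mirrors the questioning."""
    followups = [ask("who does it?", message["agent"])]
    # An earlier answer decides whether this later question is put at all.
    if message["uses_instrument"]:
        followups.append(ask("with what?", message["instrument"]))
    return ask("what is done?", message["action"], followups)


def execution(structure):
    """Placeholder flattening: visit the tree top-down, one table entry per choice."""
    words = [WORD_FOR_IDEA[structure["answer"]]]
    for child in structure["followups"]:
        words.append(execution(child))
    return " ".join(words)


message = {
    "action": "EAT",
    "agent": "TAI",
    "uses_instrument": True,
    "instrument": "CHOPSTICKS",
}
structure = inquisition(message)  # a tree of informational choices
sentence = execution(structure)   # a flat string of words

print(sentence)
```

The point of the sketch is the difference in shape between the two results: `structure` is a tree that records the course of the questioning, while `sentence` is a plain string in which that tree is no longer visible.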

Terminology: when a message m is encoded as a sentence S, we call m an INTERPRETATION of S.

Word Order

N.V.B.I. We do not think of the question of word order as a part of our enterprise. Word order is determined in the execution phase, not the inquisition, and whatever it is that determines word order in English is independent of the structure imposed by the inquisition phase. To us, the problem of word order must remain a mystery.

Tai eats with chopsticks

is no doubt preferred to both of

Tai with chopsticks eats
With chopsticks Tai eats

except by poets, who are granted that kind of licence.  But almost any arrangement would do when it comes to recovering the message.

Eats Tai with chopsticks
Eats with chopsticks Tai

would work just as well.  And indeed, many languages would organise the word order thus.  Same message, same structural arrangement of informational choices, same grammatical structure.  But a different word order.
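A short Python sketch makes the independence claim concrete. It rests on a strong simplifying assumption, flagged here and in the code: every chunk of words is taken to betray the informational choice that triggered it, which real English does not guarantee. The role names, idea labels and the table `CHUNK_FOR_CHOICE` are hypothetical.

```python
from itertools import permutations

# Toy model only: roles, idea labels and the word table are hypothetical, and
# the sketch assumes every chunk of words betrays the choice that triggered it.
CHUNK_FOR_CHOICE = {
    ("agent", "TAI"): "Tai",
    ("action", "EAT"): "eats",
    ("instrument", "CHOPSTICKS"): "with chopsticks",
}
CHOICE_FOR_CHUNK = {chunk: choice for choice, chunk in CHUNK_FOR_CHOICE.items()}


def linearise(choices, order):
    """Lay the chunks out in the given order of roles."""
    return [CHUNK_FOR_CHOICE[(role, choices[role])] for role in order]


def recover(chunks):
    """Rebuild the choices from the chunks, ignoring their order."""
    return dict(CHOICE_FOR_CHUNK[chunk] for chunk in chunks)


choices = {"agent": "TAI", "action": "EAT", "instrument": "CHOPSTICKS"}
orders = list(permutations(choices))  # every ordering of the three roles

sentences = [" ".join(linearise(choices, order)) for order in orders]
all_recovered = all(recover(linearise(choices, order)) == choices for order in orders)

print(len(set(sentences)), all_recovered)
```

Under that assumption each of the six orderings yields a different string, yet `recover` returns the same `choices` from all of them.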

So we shall keep questions of word order entirely off the screen.  This fact alone (there are several others) serves to distinguish our grammatical programme from anything remotely Chomskian.  Indeed, from our perspective the Chomskian enterprise runs together illicitly two different questions:  the question of grammatical structure (a property only of messages) and the question of word order (a property only of sentences).  Phrase-marker analysis, for instance, is thoroughly confused.  It conflates the realm of messages and the realm of sentences.  Wicked waste.