Language 2

Syntax and semantics


The syntax-semantics distinction

The Chinese Room spotlights a distinction between syntax and semantics. It is a widely used distinction and one we need to be as clear about as we can. If I try to explain it, you must not assume that it is straightforwardly there to be understood. Maybe you will want to reject the idea that there is a legitimate distinction of this kind. But let me try and say clearly what the distinction is supposed to be.

The analogy I shall suggest is this: the distinction between syntax and semantics is like the distinction between a train of wagons and what's in the wagons.

A sentence is like a train of loaded wagons. What's in the wagons is the meaning, the semantics. The wagons themselves, and their couplings, are the syntax.

Showing its mixed traffic pedigree, 6990 passes on a demonstration mixed goods train on 22nd June 1996. Pic and caption courtesy John Neave at Going Loco.

So the idea is that there is such a thing as the structure of a language, and there is the meaning that the structure can be used to convey.

The structure is the syntax. The meaning it can be used to convey is the semantics.




 

GRAMMAR

How are we to think of the syntax of a language?

In the old days, people used to speak of 'grammar'. There were different categories of words, and rules which said how these might be combined if a properly grammatical sentence was to be arrived at. For example one category of word was said to be the 'verb'. And lots of grammar books said you can't have a properly grammatical sentence without a verb in it somewhere.

Just to remind you of another example:

Verbs were supposed to divide into two subcategories, transitive and intransitive.

('Mr Badger respected his friend the Rat' gives us an example of a transitive verb.

'Badgers smell dreadfully' gives us an example of an intransitive verb.)

Another familiar rule was this. If you have in a sentence a word belonging to the category 'transitive verb', that sentence must also have a word belonging to the category 'noun', and this was glossed as the 'object' of the transitive verb.

So this rule would say: you can't have a grammatically proper sentence 'Mr Badger respected.' In the sentence 'Mr Badger respected his friend Ratty', 'Ratty' is a noun, and the object of the verb 'respected'.

So the idea seems to be here that words belong to various categories, and there are rules governing which categories go with which.

We are to think of these rules as having nothing to do with meaning. We are not talking about what's in the wagons, just what different types of train element there are and how they may be linked together. For example a proper train must have a locomotive at the front. Or a railway rule might be: you have to have the wagons linked head to tail, you can't have one wagon linked directly to two others.

If we are thinking of a sentence as a vehicle, we can talk about it and say what elements it might have and how they must be related together if the sentence is to be a proper one - all without talking about meaning. Meaning is what we put into the wagons. And we have not got to that point yet.

Uninterpreted symbols

What is there in language that corresponds to the empty wagon?

People speak in this connection of an 'uninterpreted symbol': a language element which has no meaning.

Are there any?

Prompt: What would you give as an example?

 

What are we looking for? A mark which carries no meaning, but whose life is governed by rules.

Prompt: Have a go at making a formal system. You will need some symbols, and then some rules governing how they can be combined.

There is the idea, then, of a set of symbols and a set of rules governing what constitutes a 'grammatical' sequence of those symbols. That is to say, there is the concept of what might be called a 'formal' language.
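
If you want something to compare your own attempt with, here is a minimal sketch in Python of the kind of thing I mean. The three marks and the single syntax rule are my own inventions, just for illustration; nothing in what follows depends on these particular choices.

# A minimal sketch of a formal system: some symbols, and a rule
# saying how they may be combined. Symbols and rule are invented here.

SYMBOLS = {"µ", "‡", "¥"}   # the marks of the system

def is_well_formed(sequence):
    """Made-up syntax rule: a sequence is 'grammatical' if it uses only
    the symbols of the system and strictly alternates between the marks
    µ/¥ and the connective mark ‡ (so µ ‡ ¥ passes, µ µ ‡ does not)."""
    if not sequence or any(s not in SYMBOLS for s in sequence):
        return False
    for i, s in enumerate(sequence):
        if (i % 2 == 1) != (s == "‡"):   # odd positions must be ‡, the rest must not be
            return False
    return True

print(is_well_formed(["µ", "‡", "¥"]))   # True  - follows the rule
print(is_well_formed(["µ", "µ", "‡"]))   # False - two marks in a row
print(is_well_formed(["µ", "‡", "x"]))   # False - 'x' is not a symbol of the system

Notice that nothing in the check appeals to what any of the marks might mean. It is a rule about marks and their arrangement, and nothing more.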

Logical calculi

Now think of adding a third feature, besides symbols and rules for sequencing them.

Look at the existing features this way: they let you tell of any sequence of marks put forward whether it is 'syntactically correct' or not.

But then suppose for some reason you want to collect a set of sequences.

You could bang any old symbols together, keep that as a sequence, then bang another few together, and that will give you a second; and so on. You could get yourself a pile of sequences very quickly and easily that way. You could automate the process.

But you could make another collection, if for any reason you wanted to, by limiting the sequences you kept to those which were syntactically correct. To make this kind of collection you bang sequences together as before, but you check each out by seeing if it follows the rules of syntax. If it does you keep it; if it doesn't, you junk it.

I'm not saying why you might want this collection rather than the first, but it's more select.

Then think of another way of cutting down your collection. Think of a rule which says: if you have got sequence A then add sequence B.

E.g. if you've got µ ‡ ¥, then add ¥ ‡ µ.

If you just have the one rule, and it's as simple as this one, this will limit your collection very severely.

If µ ‡ ¥ is in your collection to begin with, and you apply just this one rule, you will in fact end up with just two sequences.

But then think of inventing a few more rules, and you will be able to think again of building up a decent collection.

What you will then have is a logical calculus: a set of uninterpreted symbols with rules which say which further sequences you can have, given what you've already got.
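
Here, purely as an illustration, is that one-rule calculus written out in Python. The starting sequence and the rule are the ones from the example above; the way of writing them down is my own choice and nothing more.

# A toy calculus: uninterpreted symbols, plus a rule saying which further
# sequences you may add given the ones you already have. The single rule
# is the one above (if you have µ ‡ ¥, add ¥ ‡ µ), read here as: you may
# add the reversal of any sequence already in the collection.

def reverse_rule(sequence):
    """If a sequence is in the collection, its reversal may be added."""
    return tuple(reversed(sequence))

def generate(axioms, rules, max_rounds=10):
    """Keep applying the rules to everything collected so far,
    stopping when nothing new turns up."""
    collection = set(axioms)
    for _ in range(max_rounds):
        new = {rule(seq) for seq in collection for rule in rules}
        if new <= collection:        # nothing new: the collection is closed
            break
        collection |= new
    return collection

print(generate({("µ", "‡", "¥")}, [reverse_rule]))
# {('µ', '‡', '¥'), ('¥', '‡', 'µ')} - just the two sequences, as predicted

Pass in a few more rules - rules that add, delete or substitute marks, say - and the same little engine will build up a much larger collection.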

Is it interesting, or a doodle?

You could have fun with a friend seeing how many sequences you could collect with just ten symbols and six rules, say - competing over the clever design of each rule.

Or you can think of it as a starter question for Who wants to be a millionaire?: "Given these symbols and these rules, see who can generate the greatest number of legitimate sequences in 2 minutes, starting Now."

Is there anything more than the pleasures of a doodle here?

I think what we are talking about is formal logic.

And a system of this kind - uninterpreted symbols, rules that legitimize certain sequences if certain others are already legitimized - gets intriguing if the rules you set up copy the rules that appear to be used in real argument, or real sums. Then you have got a formalisation of argument - logic - or a formalisation of calculation - mathematics.

In a logical calculus there are simply:

1. Rules determining what counts as an expression in the system.

2. Rules determining which expressions count as well-formed (well-formed formulae, or wffs).

3. Rules determining which sequences of wffs count as proofs.

The propositional calculus is an example; the predicate calculus another.
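
As a small illustration of component 2, here is a sketch in Python of the formation rules for a tiny fragment of the propositional calculus: sentence letters, negation and a single two-place connective. The way of writing formulae as nested tuples is my own choice, made just so the rules can be stated briefly.

ATOMS = {"P", "Q", "R"}   # the sentence letters of our tiny system

def is_wff(expr):
    """The usual formation rules: an atom is a wff; if A is a wff, so is
    ('not', A); if A and B are wffs, so is ('->', A, B)."""
    if expr in ATOMS:
        return True
    if isinstance(expr, tuple):
        if len(expr) == 2 and expr[0] == "not":
            return is_wff(expr[1])
        if len(expr) == 3 and expr[0] == "->":
            return is_wff(expr[1]) and is_wff(expr[2])
    return False

print(is_wff(("->", "P", ("not", "Q"))))   # True:  P -> not-Q is well formed
print(is_wff(("->", "P")))                 # False: the arrow needs two wffs

Component 3 would then be a further set of rules - rules of proof - saying which sequences of wffs you may add to your collection, in just the way the reversal rule did above.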




This large set of 4 brass symbols means Good Luck, Prosperity, Joy and Long Life. 5" diameter. $28.00/set of 4. Pic and caption courtesy the International Feng Shui Guild

The foregoing suggests we should recognize two quite different senses of 'symbol'.

 

TWO SENSES OF SYMBOL

1. A something that stands for something.

2. A mark that belongs to a set of marks, and a set of rules, such that the rules determine how the marks may be combined.

Symbols of the second kind, if there are any, belong to symbol systems or calculi. A calculus is a purely syntactical system. The propositional calculus would be one such. Maybe algebra is another? Could you see a game like chess as a third?




'Grammar' in natural languages

Natural languages like English and Japanese are not sets of uninterpreted symbols whose sequencing is governed by rules, but it has been suggested that the concept of a formal language gives us a way of approaching one aspect of a natural language like English.

Chomsky. Pic courtesy Johnson County Community Network

Noam Chomsky suggested that this was the way to look at the grammatical aspect of a natural language. In our everyday linguistic dealings we encounter strings of words, and we fire strings of words back. But there are rules governing what strings are OK and which strings are not. There would appear to be more than one set of rules governing what strings are allowed and what strings aren't. There are rules to do with the meaning of what we say, whatever that turns out to be. We shouldn't fire back a string which makes no sense in relation to the string that has just been directed at us. There's another set of rules which have to do with truth. We shouldn't fire back strings which make sense but which aren't truthful.

But let us put meaning rules and truth rules on one side for the moment. For now let us concentrate on another set of rules - the rules which govern which strings are grammatical. These rules seem to have a strong parallel in the rules which help constitute formal languages.

Chomsky argued that the words of a natural language fell into a set of 'syntactical' categories. And he argued that as language-users we were equipped with a set of rules which governed how words from these categories could legitimately be sequenced. That was Chomsky's conception of grammar. It was a computational view of grammar. Remember our guiding question is: how do you get a machine to do language? Chomsky is addressing an aspect of that. If you are to get a machine to speak like we do, you will have to get it to speak grammatically. And the way to do this is to program into it the syntactical categories, and the rules governing how they may legitimately be sequenced.

When you try and do this, you find you have to set up a large number of categories. The dozen or so types of words that traditional grammar distinguished were by no means enough. And the rules that seemed to be buried in English governing the sequencing of words in the different categories turned out to be many and often complex.

The rules would be such that, if you started with a single language element, they would allow you to keep generating legitimate ways of adding further elements to that first one. Thus the rules allowed you to generate sentences - and to go on generating sentences indefinitely. They constituted a 'generative grammar'.
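
Here, for what it is worth, is a toy generative grammar in Python, in roughly the spirit just described: a handful of syntactic categories and rewrite rules which can go on turning out grammatical word-strings. The categories, rules and vocabulary are invented for the occasion - this is not Chomsky's own grammar of English, and a serious grammar would need vastly more of both.

import random

# Rewrite rules: each category expands to one of the listed sequences.
RULES = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"], ["Det", "Adj", "N"]],
    "VP":  [["Vi"], ["Vt", "NP"]],              # intransitive or transitive verb
    "Det": [["the"], ["a"]],
    "Adj": [["dreadful"], ["respected"]],
    "N":   [["badger"], ["rat"], ["friend"]],
    "Vi":  [["smells"], ["sleeps"]],
    "Vt":  [["respects"], ["admires"]],
}

def expand(symbol="S"):
    """Expand a category by one of its rules until only words remain."""
    if symbol not in RULES:                     # a word, not a category
        return [symbol]
    expansion = random.choice(RULES[symbol])
    return [word for part in expansion for word in expand(part)]

print(" ".join(expand()))   # e.g. "the badger respects a dreadful rat"

Make one of the rules recursive - let NP expand to something containing another NP, say - and the grammar will be able to generate sentences of any length, which is the 'indefinitely' of the paragraph above.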

Chomsky's observation is that children learn a vast amount in learning a language; and that in fact they learn much more than can be acquired through the teaching they appear to receive!

He is thinking of the speed with which children master their first language, which he thinks cannot be accounted for in terms of any learning theory.

This is sometimes known as the 'poverty of stimulus' argument.

His conclusion is that much of the language one comes to speak is 'hardwired'. We are born with a language chip, so to speak.




Semantics

How does the notion of syntax help with the project of getting a computer to speak?

You could say that it gets you halfway there. Computers handle uninterpreted symbols with relish. The problem is that it is unclear whether they can do anything else.

If you go along with the analogy of the train of wagons that can carry goods, the question now is: where does the merchandise come from? How does it get into the wagons?

If we can understand how the brain might manipulate uninterpreted symbols according to rules, how are we to understand how those symbols might stop being uninterpreted and carry meaning?

Here is a starting idea: a 'symbol', if we are thinking of a mechanical brain, is something electronic, a pattern of electrical activity in a set of neurons. Suppose it was always stimulated by a particular type of thing in the environment. Wouldn't that give it meaning? If a particular pattern of electrical activity in the brain was always triggered by your seeing a badger, wouldn't that confer the meaning 'badger' on the pattern?

Prompt: Any good?

Lycan's summary comment.

 


END




Images will be removed if notified of copyright infringement.

Anybody is welcome to use my stuff as they wish.

VP


Revised 11:04:03
