Prosody

Prosody refers to all aspects of the sound system above the level of segmental sounds, e.g. stress, intonation and rhythm. The annotations in prosodically annotated corpora typically follow widely accepted descriptive frameworks for prosody, such as that of O'Connor and Arnold (1961). Usually only the most prominent intonation features are annotated, rather than the intonation of every syllable. The example below is taken from the London-Lund corpus:

1 8 14 1470 1 1 A 11 ^what a_bout a cigar\ette# .		   /
1 8 15 1480 1 1 A 20 *((4 sylls))*				   /
1 8 14 1490 1 1 B 11 *I ^w\on't have one th/anks#* - - - 	   /
1 8 14 1500 1 1 A 11 ^aren't you .going to sit d/own# - 	   /
1 8 14 1510 1 1 B 11 ^[/\m]# -					   /
1 8 14 1520 1 1 A 11 ^have my _coffee in p=eace# - - - 		   /
1 8 14 1530 1 1 B 11 ^quite a nice .room to !s\it in ((actually))# /
1 8 14 1540 1 1 B 11 *^\isn't* it#				   /
1 5 15 1550 1 1 A 11 *^y/\es#* - - -				   /

The codes used in this example are:

# end of tone group
^ onset
/ rising nuclear tone
\ falling nuclear tone
/\ rise-fall nuclear tone
_ level nuclear tone
[] enclose partial words and phonetic symbols
. normal stress
! booster: higher pitch than preceding prominent syllable
= booster: continuance
(( )) unclear
* * simultaneous speech
- pause of one stress unit
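
Because the codes sit inside the words themselves, reading such a record by program takes two steps: separating the reference fields from the transcription, and then removing the prosodic symbols. The Python sketch below illustrates one way this might be done. It rests on assumptions that are not stated in the example itself: that the first eight whitespace-separated fields are reference codes with the speaker in seventh position, and that the final "/" closes the record. The names parse_line and strip_prosody are hypothetical.

```python
import re

# Assumption: eight whitespace-separated reference fields come first
# (the seventh being the speaker) and a final "/" closes the record.
RECORD = re.compile(r"^((?:\S+\s+){8})(.*?)\s*/\s*$")

# The symbols from the code list above, treated as single characters.
MARKUP = re.compile(r"[#^/\\_.!=*\[\]()-]")


def parse_line(line):
    """Split one record into (speaker, annotated text)."""
    match = RECORD.match(line)
    if match is None:
        raise ValueError("line does not look like a corpus record")
    fields = match.group(1).split()
    return fields[6], match.group(2)


def strip_prosody(text):
    """Remove prosodic codes to approximate the plain wording."""
    # Passages such as "((4 sylls))" contain no recoverable words.
    text = re.sub(r"\(\(\d+ sylls\)\)", " ", text)
    text = MARKUP.sub("", text)
    # Collapse the gaps left behind by removed pause marks.
    return " ".join(text.split())


speaker, text = parse_line(
    r"1 8 14 1470 1 1 A 11 ^what a_bout a cigar\ette# .   /"
)
print(speaker, "|", strip_prosody(text))  # A | what about a cigarette
```

Note that this simple approach also deletes any genuine hyphens or full stops in the wording, since the same characters serve as pause and stress marks.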

Problems of Prosodic Corpora

  1. Judgements are inherently impressionistic. For example, the level of a tone movement is difficult to agree upon: some listeners may perceive a fall in pitch, while others may perceive a slight rise after the fall. This leads to the second point:
  2. Consistency is difficult to maintain, especially if more than one person annotates the corpus. (This can be alleviated to some degree by having two people both annotate a small part of the corpus, so that their decisions can be compared.)
  3. Recoverability is difficult (see Leech's 1st Maxim): prosodic features are carried by syllables rather than whole words, so annotations appear within the words themselves, making it difficult for software to retrieve the raw corpus.
  4. Sometimes special graphics characters are used to indicate prosodic phenomena, but not all computers and printers can handle such characters. It is hoped that the TEI guidelines for text encoding will alleviate these difficulties.
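
When two people annotate the same small part of the corpus, as suggested under point 2, their agreement can be put into figures. The Python sketch below uses Cohen's kappa, which is one common choice of measure and not one prescribed by the text above. The function name cohen_kappa and the tone labels in the example are invented for illustration and are not taken from any corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators' label lists."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Proportion of items on which the annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented nuclear-tone judgements for six tone groups.
annotator_1 = ["fall", "rise", "fall", "rise-fall", "level", "fall"]
annotator_2 = ["fall", "rise", "rise-fall", "rise-fall", "level", "rise"]
print(round(cohen_kappa(annotator_1, annotator_2), 3))  # 0.571
```

A value of 1 means complete agreement, while a value near 0 means the annotators agree no more often than chance would predict.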
