Introduction to the British National Corpus
The British National Corpus (BNC) is a very large corpus of present-day British English, containing 100 million words of text. It was collected in the early 1990s, but many of the texts date from earlier years. It contains both written and spoken texts, as outlined in the table below.
| Broad text category | Text category and description | Number of words | Closest equivalent in Brown, Frown, LOB, FLOB |
|---|---|---|---|
| Written | "informative" writing: 8 types = world affairs, leisure, arts, commerce & finance, belief & thought, social science, applied science, natural & pure science | 70.1 million | Sections A-J |
| Written | "imaginative" writing (= fiction) | 19.7 million | Sections K-R |
| Spoken | "spoken demographic" = informal conversation which has been demographically sampled across the population of the UK | 4.2 million | none |
| Spoken | "spoken context governed" = speech recorded at specific locations for specific events, such as business meetings and public talks | 6.2 million | none |
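The proportions implied by the table can be checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary name `WORDS_MILLIONS` and the helper `share_by_broad_category` are hypothetical, and the figures are simply the rounded word counts copied from the table, not values read from the corpus itself.

```python
# Word counts in millions, copied from the table above (rounded figures).
WORDS_MILLIONS = {
    ("written", "informative"): 70.1,
    ("written", "imaginative"): 19.7,
    ("spoken", "demographic"): 4.2,
    ("spoken", "context governed"): 6.2,
}


def share_by_broad_category(counts):
    """Return each broad category's percentage of the total word count."""
    total = sum(counts.values())
    per_broad = {}
    for (broad, _text_category), words in counts.items():
        per_broad[broad] = per_broad.get(broad, 0.0) + words
    return {broad: 100 * words / total for broad, words in per_broad.items()}


total = sum(WORDS_MILLIONS.values())
shares = share_by_broad_category(WORDS_MILLIONS)

print(f"total: {total:.1f} million words")
for broad, pct in shares.items():
    print(f"{broad}: {pct:.1f}%")
```

On these rounded figures the four categories add up to slightly more than the headline 100 million words, and written text makes up roughly nine tenths of the corpus.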