Lancaster University Department of Linguistics and Modern English Language

Thinning Queries

 

To thin a query is to reduce the number of hits so that the data is more manageable. Whether you should use all the hits will depend on your research question, but if you work on high-frequency words, BNCweb can be very slow when indexing the file or calculating collocational statistics.

Here is how to thin the hits:

  1. After you get the concordance lines, choose "Thin" from the drop-down menu right next to the [Go!] button and press [Go!].
  2. On the BNC Thinning Options page, choose one of the options from "first n hits", "random", or "1 per text".
  3. Then type in the number you want to reduce the hits to, and press [Thin Solution Set].
  4. The concordance lines will reappear, thinned according to the method you just specified.
These are the Thin options you can choose from.
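
To make the three options more concrete, here is a minimal Python sketch of what each thinning method does to a list of hits. It is only an illustration and not how BNCweb itself is implemented: the `Hit` type, the function names, and the `seed` parameter are all hypothetical, and the sketch assumes that "1 per text" keeps the first hit found in each text.

```python
import random
from typing import List, NamedTuple, Optional


class Hit(NamedTuple):
    """One concordance hit (hypothetical structure, for illustration only)."""
    text_id: str  # identifier of the corpus text the hit comes from
    line: str     # the concordance line itself


def thin_first_n(hits: List[Hit], n: int) -> List[Hit]:
    """'first n hits': keep the first n hits in their original order."""
    return hits[:n]


def thin_random(hits: List[Hit], n: int, seed: Optional[int] = None) -> List[Hit]:
    """'random': keep n hits chosen at random, still in corpus order."""
    if n >= len(hits):
        return list(hits)
    rng = random.Random(seed)
    # Sample positions rather than hits so the original order can be restored.
    keep = sorted(rng.sample(range(len(hits)), n))
    return [hits[i] for i in keep]


def thin_one_per_text(hits: List[Hit]) -> List[Hit]:
    """'1 per text': keep one hit per text (assumption: the first one seen)."""
    seen = set()
    kept = []
    for hit in hits:
        if hit.text_id not in seen:
            seen.add(hit.text_id)
            kept.append(hit)
    return kept


if __name__ == "__main__":
    # Invented sample data; the text IDs and lines are not real corpus output.
    sample = [
        Hit("A01", "what a lovely day"),
        Hit("A01", "a lovely surprise"),
        Hit("B02", "how lovely to see you"),
        Hit("C03", "a lovely idea"),
    ]
    print(len(thin_first_n(sample, 2)))
    print(len(thin_random(sample, 2, seed=1)))
    print([h.text_id for h in thin_one_per_text(sample)])
```

The random method is usually the safer choice for frequency-based work, because "first n hits" only draws on whichever texts appear earliest in the results.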

For practice, try thinning the lines of "lovely" from the exercise on page one down to 1000 with the random selection method.