Comparing frequencies for corpora of different sizes
We cannot directly compare the raw frequencies from the previous exercise,
because the sections of the corpora are of different sizes.
A common solution to this problem is to convert each frequency into a value per million
words, or per thousand words. This is called normalizing the frequency scores.
Frequency per million words = ( frequency ÷ number of words in the text ) × 1,000,000
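If you prefer to script the calculation, the formula above can be sketched in a few lines of Python. The function name per_million and the counts used here are made up for illustration; they are not taken from the exercise table.

```python
def per_million(frequency, total_words):
    """Normalize a raw frequency to occurrences per million words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    # Multiply first so that round numbers stay exact in floating point.
    return frequency * 1_000_000 / total_words


# Hypothetical counts, for illustration only:
# a word found 150 times in a 250,000-word section
# and 300 times in a 1,000,000-word section.
small_section = per_million(150, 250_000)
large_section = per_million(300, 1_000_000)

print(small_section)  # 600.0
print(large_section)  # 300.0
```

In this made-up case the word has the lower raw count in the smaller section, yet it is twice as frequent there once the scores are normalized.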
Now try filling in the "per million" column of the table, and think about the
patterns.
Use the computer's calculator if you don't have your own pocket calculator:
[ Start - Programs - Accessories - Calculator ]