Lancaster University Department of Linguistics and Modern English Language

Managing and Exploiting Files

 

Managing files by directories

Once you have saved your text files, it is important to store and manage them systematically. There are sophisticated ways of classifying text data using SGML/XML markup, but here we will only cover how to manage files by sorting them into different directories. Take a few minutes to create your own subdirectories and save some of the files you downloaded into them.

In Windows, the quickest way to create a subdirectory is to open Windows Explorer and click [File] - [New] - [Folder] ("folder" is just the Windows name for "directory").

Example:

An example of a good, systematic directory structure.
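
For readers who prefer to script this step rather than click through Windows Explorer, the following is a minimal Python sketch of the same idea. The function name `make_corpus_dirs` and the folder names in the usage note are hypothetical; they are illustrations only and are not part of WordSmith or of this course.

```python
from pathlib import Path

def make_corpus_dirs(root, subdirs):
    """Create `root` and each named subdirectory beneath it.

    Existing directories are left untouched. Returns the list of
    subdirectory paths so the caller can save files into them.
    """
    root = Path(root)
    created = []
    for name in subdirs:
        path = root / name
        # parents=True also creates `root` itself if it is missing;
        # exist_ok=True makes the call safe to repeat.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

For example, `make_corpus_dirs("corpus", ["news", "fiction", "spoken"])` would create a `corpus` folder with one subfolder per text category (the category names here are assumptions; choose whatever grouping suits your data).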
 

IMPORTANT NOTE: The present version of WordSmith only recognises file names of up to 8 characters (plus a 3-character extension).
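
If you have many files, a quick check of their names can save trouble later. The sketch below tests only the length limits stated in the note above (up to 8 characters, plus an extension of up to 3); the function name `fits_8_3` is hypothetical, and the check does not cover any other restrictions WordSmith or Windows may place on file names.

```python
from pathlib import Path

def fits_8_3(filename):
    """Return True if the name is 1-8 characters plus an extension of at most 3.

    Only lengths are checked. Names containing more than one dot are
    rejected, since they do not fit the simple name-plus-extension pattern.
    """
    p = Path(filename)
    stem = p.stem            # the part before the final dot
    ext = p.suffix[1:]       # the extension without its leading dot
    return 1 <= len(stem) <= 8 and len(ext) <= 3 and "." not in stem
```

Under these assumptions, `fits_8_3("news01.txt")` is true, while `fits_8_3("guardian_2001.txt")` is false because the name part is longer than 8 characters.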

Working with your files

Now let’s process your text with WordSmith.

If you need a quick reminder of how WordSmith works, click here.

  1. Open the WordSmith Tools Controller.
  2. Click on [Files] - [Choose texts].
  3. Go to the drive and folder where your files are stored, choose your favourite file and press [OK].
  4. Now do the following:
    1. Make a wordlist and check the text size (types and tokens).
    2. Use Concord to search for any word you are interested in.
    3. If you have time, compute the keywords of the text.
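
To make the terms in step 4 concrete, here is a minimal Python sketch of what a wordlist and a concordance involve. It is not how WordSmith itself works: the tokenisation rule (lowercased runs of letters, with straight apostrophes kept inside words) is an assumption, so counts may differ from those WordSmith reports, and the function names are hypothetical. Tokens are the running words of the text; types are the distinct word forms.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return its word tokens (letters, inner apostrophes)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def wordlist(text):
    """Return (frequency table, number of types, number of tokens)."""
    tokens = tokenize(text)
    freq = Counter(tokens)
    return freq, len(freq), len(tokens)

def concordance(text, node, width=3):
    """Return (left context, node, right context) for each hit of `node`.

    `width` is the number of words of context kept on each side.
    """
    tokens = tokenize(text)
    target = node.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines
```

For the text "The cat sat on the mat. The dog sat too.", `wordlist` reports 7 types and 10 tokens, and `concordance(text, "sat", width=2)` returns two lines, one for each occurrence of "sat". Keywords are not sketched here, because computing them requires comparing the wordlist against a reference corpus.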