Resources Home | Contact Us

This page links to some of the resources and tools used in the case studies in Section C of the book. License restrictions prevent us from making some corpora and packages available on this site; for these, readers are advised to check Useful Links and Corpus Survey for more information. The site also provides resources that are not used in the book but are useful in corpus-based research.

Each case study in the book demonstrates the use of only one corpus exploration package. Demonstrations of how to undertake the case studies using different packages will be added to this page.

This page is updated regularly with new resources.

UCREL CLAWS C7 tagset

Scott Songlin Piao's Multilingual Corpus Tools (requires Java) (474 k)

Tool for preprocessing the Longman Learners' Corpus (requires Perl) (case study 3) (1 k)

Tool for computing factor scores (requires Perl) (case study 5) (1 k)

Search algorithms for use in MF/MD (case study 5)

File-based search patterns for use with the search algorithms (case study 5) (14 k)

Wordlist based on the BNC World Edition (for use with WordSmith 3 in case study 5) (1,630 k)

Wordlist based on FLOB (for use with WordSmith 3 in case study 5) (126 k)

Wordlist for academic prose (for use with WordSmith 3 in case study 5) (171 k)

Wordlist for speech (for use with WordSmith 3 in case study 5) (141 k)

Wordlist for conversation (for use with WordSmith 3 in case study 5) (89 k)

Wordlists (in batches) for three genres (for use with WordSmith 3 in case study 5) (132 k)

Key keywords for three genres (for use with WordSmith 3 in case study 5) (117 k)

Tagged CPSA Corpus (comm797 and comr797) (case study 5; note that there are two copies of comr797, one in a single file and one split into halves) (692 k)

Tagged Frown J category (case study 5) (537 k)

The Chinese-English Parallel Corpus of Public Health (case study 6) (197 k)

The Comparable Chinese Corpus of Public Health (case study 6) (82 k)

The Weekly Corpus of Chinese (case study 6) (269 k)

The Lancaster Corpus of Mandarin Chinese (LCMC)

The Lancaster Los Angeles Spoken Chinese Corpus (LLSCC)

The EMILLE Corpus

The Babel English-Chinese Parallel Corpus

The PFR People's Daily Corpus of Chinese

The PH Corpus of Chinese

The PDC2000 Corpus of Chinese

Academia Sinica Balanced Corpus of Modern Chinese (External link)

Peking University Chinese Corpus (External link, web interface in Chinese)

Peking University Babel Chinese-English Parallel Corpus (External link, web interface in Chinese)

Xiamen University corpora (External link, web interface in Chinese)

Beijing Language and Culture University corpus (External link, web interface in Chinese)

The XML version of FLOB and Frown (for internal use only)

Tagged CallHome Mandarin (for internal use only)


Copyright © 2005-2013 Richard Xiao, Lancaster University. All rights reserved.
Last modified: 21-05-2013.