Andrew Hardie: Publications
Title links either go to an
online version of the text, or to further publication details if the text is
not openly available online.
Books
Culpeper,
J., Hardie, A., & Demmen, J. (2023) The
Arden Encyclopedia of Shakespeare’s Language: Dictionary A-M.
Bloomsbury.
Culpeper,
J., Hardie, A., & Demmen, J. (2023) The
Arden Encyclopedia of Shakespeare’s Language: Dictionary N-Z.
Bloomsbury.
McEnery, T, Hardie, A and Younis, N (eds)
(2019) Arabic
Corpus Linguistics. Edinburgh University Press.
Semino, E, Demjén, Z, Hardie, A, Payne, S and
Rayson, P. (2018) Metaphor,
Cancer, and the End of Life: A corpus-based study. Routledge.
McEnery, T and Hardie, A (2012) Corpus
Linguistics: Method, Theory and Practice. Cambridge
Baker,
P, Hardie, A and McEnery, T (2006)
A Glossary of
Corpus Linguistics. Edinburgh
Journal articles
Hughes, J and Hardie, A (forthcoming) The
psychological reality of a linguistic-statistical construct: Observation of
collocational and non-collocational language processing using event-related
brain potentials.
Hughes, J and Hardie, A (forthcoming) The effect of
collocational strength on the speed of processing of adjective-noun bigrams in
native speakers and learners of English: Evidence from a self-paced reading
experiment.
Hardie, A (forthcoming) A dual
sort-and-filter strategy for statistical analysis of collocation, keywords, and
lockwords.
Hardie, A. and Daraselia, S. (forthcoming) A theory for words in Georgian:
traditional constructs versus corpus annotation.
Jehangir, H. and Hardie, A. (forthcoming) Design and construction
of an openly available Urdu web corpus.
Gillings, M and Hardie, A (2023) The
interpretation of topic models for scholarly analysis: an evaluation and
critique of current practice. Digital Scholarship in the Humanities
38(2): 530–543. https://doi.org/10.1093/llc/fqac075
Collins, L. and Hardie, A. (2022) Making
use of transcription data from qualitative research within a corpus-linguistic
paradigm: Issues, experiences, and recommendations. Corpora 17(1):
123-135. https://doi.org/10.3366/cor.2022.0237
Hardie, A. and Ibrahim, W. (2021) Exploring
and classifying the Arabic copula and auxiliary kāna
via enhanced part-of-speech tagging. Corpora 16(3): 305-335. https://doi.org/10.3366/cor.2021.0225
Culpeper, J., Hardie, A., Demmen, J., Hughes,
J. and Timperley, M. (2021) Supporting the corpus-based study of
Shakespeare’s language: Enhancing a corpus of the First Folio. ICAME
Journal 45(1): 37-86. https://doi.org/10.2478/icame-2021-0002
Collins, L., Semino, E., Demjén, Z., Hardie,
A., Mosely, P., Woods, A. and Alderson-Day, B. (2020) A linguistic approach
to the psychosis continuum: (dis)similarities and (dis)continuities in how
clinical and non-clinical voice-hearers talk about their voices. Cognitive
Neuropsychiatry 25(6): 447-465. https://doi.org/10.1080/13546805.2020.1842727
Hu, X., Xiao, Z. and Hardie, A. (2020) 翻译英语变体的语料库文体统计学分析 [A corpus-based multi-feature stylo-statistical
analysis of translational English]. 外语教学与研究 [Foreign Language Teaching and Research] 52(2): 273-282.
Hardie, A and Dorst, I van (2020) A survey
of grammatical variability in Early Modern English drama. Language and
Literature 29(3): 275-301. https://doi.org/10.1177/0963947020949440
Blything, L, Hardie, A and Cain, K (2020) Question asking during reading comprehension instruction: a corpus
study of how question type influences the linguistic complexity of primary
school students’ responses. Reading
Research Quarterly 55(3): 443-472. https://doi.org/10.1002/rrq.279
Hu, X, Xiao, R and Hardie, A (2019) How do English translations differ from
native English writings? A multi-feature statistical model for linguistic
variation analysis. Corpus
Linguistics and Linguistic Theory 15(2):
347-382. https://doi.org/10.1515/cllt-2014-0047
Love, R., Brezina, V., McEnery, T., Hawtin,
A., Hardie, A. and Dembry, C. (2019) Functional
variation in the Spoken BNC2014 and the potential for register analysis. Register Studies 1(2): 296-317. https://doi.org/10.1075/rs.18013.lov
Love, R, Dembry, C, Hardie, A, Brezina, V and
McEnery, T (2017) The Spoken BNC2014:
designing and building a spoken corpus of everyday conversations. International Journal of Corpus Linguistics
22(3): 319-344. https://doi.org/10.1075/ijcl.22.3.02lov
Semino, E, Demjén, Z, Demmen, J, Koller, V,
Payne, S, Hardie, A, Rayson, P (2017) The
online use of ‘Violence’ and ‘Journey’ metaphors by cancer patients, as
compared with health professionals: a mixed methods study. BMJ Supportive & Palliative Care
2017(7): 60-66.
https://doi.org/10.1136/bmjspcare-2014-000785
Gregory, I, Atkinson, A, Hardie, A,
Joulain-Jay, A, Kershaw, D, Porter, C, Rayson, P and Rupp, CJ (2016) From digital resources to historical
scholarship with the British Library 19th Century Newspaper Collection. Journal of Siberian Federal University:
Humanities and Social Sciences 9(4): 994-1006. https://doi.org/10.17516/1997-1370-2016-9-4-994-1006
Demmen, J, Semino, E, Demjén, Z, Koller, V,
Hardie, A, Rayson, P and Payne, S (2015) A
computer-assisted study of the use of Violence metaphors for cancer and end of
life by patients, family carers and health professionals. International Journal of Corpus Linguistics
20(2): 205-231.
https://doi.org/10.1075/ijcl.20.2.03dem
Murrieta-Flores, P, Baron, A, Gregory, I,
Hardie, A and Rayson, P (2015) Automatically
analysing large texts in a GIS environment: The Registrar General’s reports and
cholera in the nineteenth century. Transactions
in GIS 19(2): 296-320. https://doi.org/10.1111/tgis.12106
Hardie, A (2014) Modest XML for Corpora: Not a standard, but a suggestion. ICAME Journal 38: 73-103. https://doi.org/10.2478/icame-2014-0004
Hardie, A (2012) CQPweb – combining power, flexibility and usability in a corpus
analysis tool. International Journal
of Corpus Linguistics 17 (3): 380-409. https://doi.org/10.1075/ijcl.17.3.04har
[alternative
link]
Hardie, A, Lohani, R and Yadava, YP (2011) Extending corpus annotation of Nepali:
advances in tokenisation and lemmatisation. Himalayan Linguistics 10 (1): 151-165. https://doi.org/10.5070/H910123572
Gregory, I and Hardie, A (2011) Visual GISting:
Bringing together corpus linguistics and Geographical Information Systems. Literary and Linguistic Computing 26
(3): 297-314. https://doi.org/10.1093/llc/fqr022
Hardie, A and McEnery, T (2010) On two traditions in corpus linguistics,
and what they have in common. International
Journal of Corpus Linguistics 15 (3): 384-394. https://doi.org/10.1075/ijcl.15.3.09har
Hardie, A and Mudraya,
O (2009) Collocational patterning in
cross-linguistic perspective: adpositions in English,
Nepali, and Russian. Arena Romanistica 4: 138-149. [accessible from this
website]
Dunning, A, Gregory, I and Hardie, A (2009) Freeing up digital content: new research
means new licenses. Serials 22
(2): 166-173.
Prentice, S and Hardie, A (2009) Empowerment and
disempowerment in the Glencairn uprising: a corpus-based critical analysis of
Early Modern English news discourse. Journal of Historical Pragmatics 10(1): 23-55.
https://doi.org/10.1075/jhp.10.1.03pre
Yadava, Y.P., Hardie, A., Lohani, R.R., Regmi,
B.N., Gurung, S., Gurung, A., McEnery, T., Allwood, J., and Hall, P. (2008). Construction and annotation of a corpus of
contemporary Nepali. Corpora
3(2): 213-225.
https://doi.org/10.3366/E1749503208000166
Koller, V, Hardie, A, Rayson, P and Semino, E
(2008) Using a semantic annotation tool
for the analysis of metaphor in discourse. Metaphorik.de 15. http://www.metaphorik.de/15/
Hardie, A (2008) A collocation-based approach to Nepali postpositions.
Corpus Linguistics and Linguistic Theory 4(1): 19-62.
https://doi.org/10.1515/CLLT.2008.002
Hardie, A (2007) Part-of-speech ratios in
English corpora. International Journal of Corpus Linguistics 12(1):
55-81. https://doi.org/10.1075/ijcl.12.1.05har
Hardie, A (2007) From legacy
encodings to Unicode: the graphical and logical principles in the scripts of
Baker, P, Hardie, A, McEnery, T, Xiao, R, Bontcheva, K, Cunningham, H, Gaizauskas,
R, Hamza, O, Maynard, D, Tablan, V, Ursu, C, Jayaram,
BD and Leisher, M (2004) Corpus
linguistics and South Asian languages: corpus creation and tool development.
Literary and Linguistic Computing
19(4): 509-524. https://doi.org/10.1093/llc/19.4.509
Hardie, A and McEnery, T (2003) The were-subjunctive in British
rural dialects: marrying corpus and questionnaire data. Computers and
the Humanities 37(2): 205-228. https://doi.org/10.1023/A:1022657227889
Chapters in edited volumes
Supanfai,
P. and Hardie, A. (2023) Corpus linguistics and the languages of Asia.
In: Shei, C. and Li, S. (eds) The Routledge Handbook of Asian Linguistics,
pp. 531-547. Routledge. https://doi.org/10.4324/9781003090205-37
McEnery, T. and Hardie, A. (2023) Neo-Firthian corpus linguistics to 2000.
In: Waugh, L.R., Monville-Burston, M. and Joseph, J.E. (eds) The Cambridge History of
Linguistics, pp. 515-517.
Cambridge University Press. https://doi.org/10.1017/9780511842788.027
McEnery, T. and Hardie, A. (2022) Corpus
methods. In: Culpeper, J., Malory, B., Nance, C., Van Olmen, D., Atanasova,
D., Kirkham, S., & Casaponsa, A. (eds) Introducing
Linguistics, pp. 383-399. Routledge.
https://doi.org/10.4324/9781003045571-25
Semino, E, Hardie, A and Zakrzewska, J (2020)
Applying corpus linguistics to a diagnostic tool for pain. In: Demjén, Z
(ed.) Applying linguistics in illness and healthcare contexts, pp. 99-128.
Bloomsbury. https://doi.org/10.5040/9781350057685.0011
Hughes, J. and Hardie, A. (2020). Corpus linguistics and event-related
potentials. In Egbert, J. and Baker, P. (eds) Using corpus methods to triangulate linguistic analysis, pp.
185-218. Routledge. https://doi.org/10.4324/9781315112466-8
McEnery, T, Hardie, A and Younis, N (2019) Introducing Arabic corpus linguistics.
In: McEnery, T, Hardie, A and Younis, N (eds) Arabic
Corpus Linguistics, pp. 1-16. Edinburgh University Press.
Ibrahim, WMA and Hardie, A (2019) Accessible corpus annotation for Arabic.
In: McEnery, T, Hardie, A and Younis, N (eds) Arabic
Corpus Linguistics, pp. 56-75. Edinburgh University Press.
Mohamed, G and Hardie, A (2019) Approaching text typology through cluster
analysis in Arabic. In: McEnery, T, Hardie, A and Younis, N (eds) Arabic
Corpus Linguistics, pp. 201-228. Edinburgh University Press.
Gregory, I, Donaldson, C, Hardie, A and
Rayson, P (in press) Modelling space and time in historical
texts. In: Flanders, J and Jannidis, F (eds) The
Shape of Data in Digital Humanities: Modeling Texts
and Text-based Resources, pp. 133-149. Routledge.
Hardie, A. (2018) Using the Spoken BNC2014 in CQPweb. In: Brezina, V., Love, R. and
Aijmer, K. (eds) Corpus
Approaches to Contemporary British Speech: Sociolinguistic Studies of the
Spoken BNC2014, pp. 27-30. Routledge. https://doi.org/10.4324/9781315268323-4
Hardie, A and Brandt, S (2018) First language acquisition. In:
Culpeper, J., Kerswill, P., Wodak, R., McEnery, T. and Katamba, F. (eds.) English
Language: Description, Variation and Context, 2nd edition,
pp. 541-559. Palgrave Macmillan [reprinted by Bloomsbury]. (Revision of Hardie
2009, “First Language Acquisition”.)
Baker, H, McEnery, T and Hardie, A (2017). A corpus-based investigation into English representations
of Turks and Ottomans in the early modern period. In: Pace-Sigge, M and
Patterson, KJ (eds) Lexical Priming:
Applications and advances, pp. 41-66. John Benjamins. https://doi.org/10.1075/scl.79.02bak
Hardie, A (2016) Infrastructure
for analysis of the CEPhiT
corpus: implementation and applications of corpus annotation and indexing. In: Moskowich,
I., Camiña Rioboo, G., Lareo, I. and Crespo, B. (eds)
‘The Conditioned and the
Unconditioned’: Late Modern English Texts on Philosophy, pp. 61-76.
John Benjamins. https://doi.org/10.1075/z.198.04har
Hardie, A (2016) Corpus linguistics. In Allan, K (ed.) The
Routledge Handbook of Linguistics, pp. 502-515. Routledge.
Gregory, I, Cooper, D, Hardie, A and Rayson,
P (2015). Spatializing and analyzing digital texts: Corpora, GIS and places. In:
Bodenhamer, D, Corrigan, J and Harris, TM (eds) Deep
Maps and Spatial Narratives. Indiana University Press.
Gregory, I, Baron, A, Cooper, D, Hardie, A,
Murrieta-Flores, P and Rayson, P (2014) Crossing
boundaries: Using GIS in literary studies, history and beyond. In: Hueber,
J and Mendes da Silva, A (eds) Keys for
architectural history research in the digital era. Institut
national d’histoire de l’art Actes de colloques. http://inha.revues.org/4931 .
Hardie, A (2014) XML encoding for spoken learner (and other) corpora: a modest approach.
In: Ishikawa, S (ed.) Learner corpus
studies in Asia and the world. Vol. 2. Papers from LCSAW2014, pp. 49-62.
Kobe, Japan: School of Languages and Communication, Kobe University.
McEnery, T. and Hardie, A. (2013) The history of corpus linguistics. In:
Allan, K (ed.) The Oxford Handbook of the
History of Linguistics, pp. 727-746. Oxford University Press. https://doi.org/10.1093/oxfordhb/9780199585847.013.0034
Hardie, A, McEnery, T, and Piao, SS (2010) A corpus-based approach
to text reuse in the newsbooks of the Commonwealth. In: Dooley, B (ed.) The
Dissemination of News and the Emergence of Contemporaneity in Early Modern
Europe, pp. 251-286.
Ashgate.
Hardie, A, Lohani, RR, Regmi, BR, and Yadava,
YP (2009) A morphosyntactic categorisation scheme for the automated analysis of
Nepali. In: Singh, R. (ed.) Annual
Review of South Asian Languages and Linguistics 2009, pp. 171-196. Mouton de Gruyter.
Hardie, A and McEnery, T (2009) Corpus linguistics and historical contexts:
text reuse and the expression of bias in early modern English journalism.
In: Bowen, R, Mobärg, M and Ohlander,
S (eds) (2009) Corpora and
discourse – and stuff: papers in honour of Karin Aijmer, pp. 59-92. Gothenburg
Studies in English 96. Göteborg: Acta Universitatis Gothoburgensis.
Hardie, A (2009) First language acquisition. In: Culpeper, J., Katamba, F., Kerswill,
P., Wodak, R. and McEnery, T. (eds.) English Language: Description,
Variation and Context, pp. 609-624. Houndmills:
Palgrave.
Hardie, A (2009) Corpus linguistics and the languages of
Hardie, A, Baker, P, McEnery, T and Jayaram,
BD (2006) Corpus-building for South
Asian languages. In: Saxena, A and Borin, L
(eds.) Lesser-known
languages in South Asia: Status and Policies, Case Studies and Applications of
Information Technology, pp.
211-242. Mouton
de Gruyter.
Hardie, A and McEnery, T (2006) Statistics. In: Brown, K (ed.) Encyclopaedia of Language
and Linguistics, 2nd edition, vol. 12: 138-146.
Hardie, A (2005) Automated part-of-speech
analysis of Urdu: conceptual and technical issues. In: Yadava, Y,
Bhattarai, G, Lohani, RR, Prasain, B and Parajuli, K (eds.) Contemporary issues in Nepalese linguistics, pp. 48-72. Kathmandu: Linguistic
Society of
Hardie, A, Levin, E and Pęzik,
P (2005) Analiza morfologiczno-składniowa
korpusów (“Part-of-speech tagging”). In: Lewandowska-Tomaszczyk, B (ed.) Podstawy
językoznawstwa korpusowego (“Foundations of Corpus Linguistics”).
Łódź, Poland: Wydawnictwo Uniwersytetu Łódzkiego.
McEnery, T, Baker, JP and Hardie, A (2000a) Swearing
and abuse in modern British English. In: Lewandowska-Tomaszczyk, B and Melia, PJ (eds.) PALC ’99:
Practical Applications in Language Corpora, pp. 37-48. Peter Lang.
McEnery, T, Baker, JP and Hardie, A (2000b) Assessing
claims about language use with corpus data – swearing and abuse. In:
Kirk, J (ed.) Corpora Galore.
Papers in peer-reviewed conference proceedings
Evert, S and Hardie, A (2015) Ziggurat: A
new data model and indexing format for large annotated text corpora.
In: Bański, Piotr; Biber, Hanno; Breiteneder, Evelyn;
Kupietz, Marc; Lüngen, Harald; Witt, Andreas (eds.) Proceedings
of the 3rd Workshop on Challenges in the Management of Large Corpora
(CMLC-3). Mannheim: Institut für Deutsche Sprache,
pp. 21-27.
Rupp, CJ, Rayson, P, Gregory, I, Hardie, A,
Joulain-Jay, A and Hartmann, D (2014) Dealing with
heterogeneous big data when geoparsing historical corpora. In:
Proceedings of the 2014 IEEE International Conference on Big Data, pp. 80-83.
Rupp, CJ, Rayson, P, Baron, A, Donaldson, C,
Gregory, I, Hardie, A and Murrieta-Flores, P (2013) Customising
geoparsing and georeferencing for historical texts. In: Proceedings of the
2013 IEEE International Conference on Big Data, pp. 59-62. [alternative
link]
Gregory, I, Baron, A, Murrieta-Flores, P,
Hardie, A, Rayson, P and Rupp, CJ (2013) Geographical
Text Analysis: GIS approaches to analysing large volumes of texts. In: Proceedings of GISRUK 2013.
Michaud, A, Guillaume, S, Hardie, A and Toda,
M (2012) Combining documentation and
research: Ongoing work on an endangered language. In Xiong, D et al.
(eds.), Proceedings of IALP 2012 (2012
International Conference on Asian Language Processing), pp. 169-172. Hanoi,
Vietnam: MICA Institute, Hanoi University of Science and Technology. [alternative
link]
Evert, S and Hardie, A (2011) Twenty-first
century Corpus Workbench: Updating a query architecture for the new millennium.
In: Proceedings of the Corpus Linguistics
2011 conference. University of Birmingham, UK.
Hardie, A (2007) Collocational
properties of adpositions in Nepali and English. In: Proceedings of
the Corpus Linguistics 2007 conference.
Hardie, A, Koller, V, Rayson, P and Semino, E
(2007) Exploiting a semantic
annotation tool for metaphor analysis. In: Proceedings of the Corpus
Linguistics 2007 conference.
Semino, E, Koller, V, Hardie, A and Rayson, P
(2005) A
computer-assisted approach to the analysis of metaphor variation across genres.
In: Barnden, J, Lee, M, Littlemore, J, Moon, R, Philip, G and Wallington, A
(eds.) Corpus-based approaches to
figurative language: a Corpus Linguistics 2005 colloquium, pp. 145-154.
Xiao, Z, McEnery, T, Baker, P and Hardie, A
(2004) Developing Asian language
corpora: standards and practice. In: Proceedings of the 4th Workshop on Asian Language Resources,
Hardie, A (2003) Developing a tagset for automated
part-of-speech tagging in Urdu. In: Archer, D, Rayson, P, Wilson, A,
and McEnery, T (eds.) (2003) Proceedings
of the Corpus Linguistics 2003 conference. UCREL Technical Papers Volume 16. Department of
Linguistics,
Archer, D, McEnery, T, Rayson, P and Hardie,
A (2003) Developing
an automated semantic analysis system for Early Modern English. In:
Archer, D, Rayson, P, Wilson, A, and McEnery, T (eds.) (2003) Proceedings of the Corpus Linguistics 2003
conference. UCREL Technical Papers
Volume 16. Department of Linguistics,
Baker, P, Hardie, A, McEnery, T and Jayaram, BD (2003) Constructing
corpora of South Asian languages. In: Archer, D, Rayson, P, Wilson, A, and McEnery, T
(eds.) (2003) Proceedings of the Corpus
Linguistics 2003 conference. UCREL Technical
Papers Volume 16. Department of Linguistics,
Baker, P, Hardie, A, McEnery, T and Jayaram,
BD (2003) Corpus
data for South Asian language processing. In: Proceedings of the EACL Workshop on South Asian Languages,
Baker, P, Hardie, A, McEnery, T, Cunningham,
H and Gaizauskas, R (2002) EMILLE, a 67-million word corpus of Indic languages: data collection,
markup and harmonisation. In: Proceedings
of LREC 2002.
Book reviews
Hardie, A (2013) Review of: Vander Viana,
Sonia Zyngier and Geoff Barnbrook (eds.). 2011. Perspectives on Corpus Linguistics.
Amsterdam and Philadelphia: John Benjamins. ICAME Journal 37: 266-271.
Hardie,
A (2004) Review of: Lars Borin (ed).
2002. Parallel corpora, parallel worlds.
Selected papers from a symposium on parallel and comparable corpora at Uppsala
University, Sweden, 22–23 April, 1999. Amsterdam: Rodopi.
Languages in Contrast 5(2): 291-296.
Edited conference proceedings
Formato, F and Hardie, A (eds.) (2015) Corpus
Linguistics 2015: Abstract Book. Lancaster: UCREL.
Hardie, A and Love, R (eds.) (2013) Corpus
Linguistics 2013: Abstract Book. Lancaster: UCREL.
Rayson, P, Wilson, A, McEnery, T, Hardie, A
and Khoja, S (eds.) (2001) Proceedings
of the Corpus Linguistics 2001 conference. UCREL Technical Papers Volume 13
Special Issue. Department of Linguistics,
Baker, P, Hardie, A, McEnery, T and
Siewierska, A (eds.) (2000) Proceedings
of the Third Discourse Anaphora and Reference Resolution Colloquium (2000). UCREL Technical Papers Volume 12 Special Issue. Department of Linguistics,
Unpublished PhD thesis
Hardie, A (2004) The computational analysis of
morphosyntactic categories in Urdu. Unpublished PhD thesis,
Department of Linguistics and English Language,
Talks and conference presentations
November 2019. The methodology of computer-assisted historical discourse analysis: On
the concordance and its centrality. Invited presentation at a workshop on
“Text and Language Analysis from a Diachronic Perspective: Corpus and Discourse
Insights”, University of Modena and Reggio Emilia, Italy.
October 2019. The statistics of collocation: basic principles and potential problems.
Invited talk at Xi’an Jiaotong University, Xi’an,
China.
October 2019. Fundamentals of corpus statistics. Invited talk at Xi’an
International Studies University, Xi’an, China.
October 2019. Designing and documenting a corpus. Invited talk at Xi’an
International Studies University, Xi’an, China.
July 2019. Neuroimaging of
collocation: Implications of electroencephalography findings for a network
model of collocation (with Jennifer Hughes (lead)). Presentation at
the CL2019 conference, University of Cardiff.
July 2019. Managing complex and arbitrary
corpus subsections at scale and at speed: from formalism to implementation
within CQPweb. Presentation at the 7th Workshop on Challenges in the
Management of Large Corpora, University of Cardiff.
June 2019. Lexicogrammar and the
brain, in theory and in practice (with Jennifer Hughes (lead)). Plenary
presentation at the symposium on Corpus Approaches to Lexicogrammar (LxGr), Edge Hill University.
May 2019. Analysing Arabic grammar through corpus data: the case of
copula/auxiliary kāna (with Wesam Ibrahim).
Presentation to the UCREL Corpus Research Seminar, Lancaster University.
October 2018. The basics for corpus linguistics: where do we start with a “new”
language? Plenary presentation at the 4th
International Conference of the Linguistic Association of Pakistan (ICLAP
2018), Fatima Jinnah Women University, Rawalpindi, Pakistan.
October 2018. A corpus-based typological analysis of adverbials in Urdu (with
Humaira Jehangir (lead)). Presentation at the 4th International Conference of
the Linguistic Association of Pakistan (ICLAP 2018), Fatima Jinnah Women
University, Rawalpindi, Pakistan.
October 2018. Corpus analysis with CQPweb: a practical introduction. Presentation
at the 4th International Conference of the Linguistic Association of Pakistan
(ICLAP 2018), Fatima Jinnah Women University, Rawalpindi, Pakistan.
September 2018. The Written BNC2014: Designing the future, respecting the past.
(with Abi Hawtin (lead)). Presentation at the Thai-UK Seminar on Developing and
Exploiting National Corpora, Chulalongkorn University, Bangkok, Thailand.
September 2018. Practical aspects of corpus creation: Markup, annotation and metadata
in National Corpora. Presentation at the Thai-UK
Seminar on Developing and Exploiting National Corpora, Chulalongkorn
University, Bangkok, Thailand.
August 2018. A new morphosyntactic annotation schema for Georgian (with Sophiko
Daraselia (lead)). Presentation at the VII International Summer School in
Digital Humanities, Batumi Shota Rustaveli State University, Georgia.
July 2018. Using corpus methods to investigate guided reading: what teachers say
they do, what they do, and what works (with Liam Blything (lead) and Kate
Cain). Poster presentation at the conference of the Society for Text and
Discourse, Brighton, UK.
July 2018. Teacher Directives and Pupil Responses in SEN Classrooms: insights from
corpus methods (with Gillian Smith
(lead) and Kate Cain). Poster presentation at the conference of the Society for
Text and Discourse, Brighton, UK.
July 2018. The ethics of corpus-building in the age of the Digital Panopticon.
Plenary presentation at the “Lancaster Postgraduate Conference in Linguistics
and Language Teaching” (LAELPG), Lancaster University.
July 2018. Part-of-speech tagging in Shakespeare: Trials, tribulations and
preliminary results (with Jane Demmen (lead) and Jonathan Culpeper).
Presentation at the conference on “Computational Methods for
Literary-Historical Textual Scholarship”, De Montfort University.
July 2017. Exploratory analysis of word frequencies across corpus texts: towards a
critical contrast of approaches. Plenary presentation at the CL2017
conference, University of Birmingham.
July 2017. Guided reading: Using corpus methods to investigate how teacher
strategies differ across children’s reading ability, SES, and teacher
experience (with Liam Blything and Kate Cain). Presentation at the CL2017
conference, University of Birmingham.
July 2017. The ESRC Centre for Corpus Approaches to Social Science: An
introduction and overview. Presentation at the CL2017 pre-conference
workshop “CLARIN-UK: Promoting Cross-disciplinary Corpus Linguistics”,
University of Birmingham.
July 2017. A corpus-based assessment of a diagnostic pain questionnaire. (with Elena Semino and Joanna
Zakrzewska). Presentation at the CL2017 pre-conference workshop on “Corpus
approaches to health communication”, University of Birmingham.
July 2017. Morphosyntactic annotation schemata:
From background considerations to conflicting design imperatives. Plenary presentation at CAMRL2017 (Computational
Approaches to Morphologically Rich Languages 2017), University of Leeds.
June 2017. The Spoken BNC2014: designing and building a spoken corpus of everyday
conversations (with Robbie Love and Claire Dembry). Presentation at the
“Spoken BNC2014 Symposium”, Lancaster University.
May 2017. Plotting and comparing corpus lexical growth curves as an assessment of
OCR quality in historical news data. Presentation at the ICAME 38
conference, Charles University Prague, Czech Republic.
April 2017. Corpus methods in the humanities and social sciences: Three case
studies. Invited talk at the Master Class on “Modalities of the Text”,
University College Cork, Ireland.
March 2017. Introducing corpus linguistic analysis with CQPweb. Invited talk at
the Department of English Philology, Complutense
University of Madrid, Spain.
February 2017. Using CQPweb to analyse EEBO-TCP. Invited talk at the NEH Workshop
on “The Genealogy of Texts and Ideas”, Rice University, Houston, Texas.
September 2016. Nineteenth Century Newspapers in CQPweb. Invited talk at the
CLARIN-PLUS workshop on “Working with Digital Collections of Newspapers”, KU
Leuven, Belgium.
June 2016. Some thoughts on transparent effect size measures for collocation.
Invited talk at the Symposium on Collo-Phenomena, University of
Erlangen-Nuremberg, Germany.
September 2015. Part-of-speech tagging in different kinds of language: some theoretical
bases for morphosyntactic annotation schemata. Plenary presentation at the
Language and Modern Technologies 2015 conference, Tbilisi, Georgia.
May 2015. Multidimensional analysis for the masses. Presentation at the ICAME
36 conference, University of Trier, Germany.
March 2015. “Big data” in language studies: from cargo-cult science to phantom
revolution. Plenary presentation at the CILC conference, University of
Valladolid, Spain.
October 2014. Fundamentals of corpus statistics. Invited talk at the Department
of English, University of Uppsala, Sweden.
October 2014. The statistics of collocation: from current practice to a new approach.
Invited talk at the Department of English, University of Uppsala, Sweden.
October 2014. The art and science of concordance analysis. Invited talk at the
Department of English, University of Uppsala, Sweden.
June 2014. Extending a corpus analysis tool to support the analysis of field data.
Talk at the Department of Linguistics, University of Ghana.
June 2014. Yesterday, Today, Towards Tomorrow (with Tony McEnery). Plenary
presentation at the IVACS 2014 conference, Newcastle University.
May 2014. Rethinking basic statistical techniques in corpus analysis. Plenary
presentation at the International Symposium on Learner Corpus Studies in Asia
and the World (LCSAW) 2014, Kobe University, Japan.
May 2014. XML encoding for spoken learner (and other) corpora: a modest approach.
Plenary presentation at the International Symposium on Learner Corpus Studies
in Asia and the World (LCSAW) 2014, Kobe University, Japan.
May 2014. Statistical identification of keywords, lockwords and collocations as a
two-step procedure. Presentation at the ICAME 35 conference, University of
Nottingham.
March 2014. Analysing EEBO-TCP as an annotated corpus. Invited talk at the
Sheffield Centre for Early Modern Studies, University of Sheffield.
March 2014. The applicocausative voice in dialectal and standard
Javanese: a corpus-based analysis (with Noor Malihah). Presentation at the
Second Asia Pacific Corpus Linguistics Conference (APCLC 2014), Hong Kong
Polytechnic University.
February 2014. The affordances of corpus analysis software in approaching EEBO-TCP.
Invited presentation at the Northern Renaissance Seminar ‘To set the word against the word’: new directions in early modern
textual analysis, Lancaster University.
January 2014. Using version control software for corpus construction. ESRC Centre
for Corpus Approaches to Social Science Technical Presentation, Lancaster
University.
September 2013: Transforming EEBO-TCP into a Corpus (with Paul Rayson, Alistair
Baron). Presentation at the EEBO-TCP 2013
conference, University of Oxford.
June 2013: Annotation and analysis of Early Modern English corpus data.
Invited presentation at the Contested
Words: The Digital Analysis of Early Modern Texts workshop, University of
Warwick.
June 2013: The statistics of collocation: basic principles and potential problems.
Invited talk at the University of Sheffield.
May 2013: Applying cluster analysis to the problem of text-type classification
(with Ghada Mohamed). Invited talk at the Institute of the Czech National
Corpus, Charles University, Prague.
May 2013: Annotation and analysis: an overview of tools and techniques.
Invited talk at the Institute of the Czech National Corpus, Charles University,
Prague.
May 2013: Spatial analysis of corpus data using Geographical Information Systems.
Invited talk at the University of Erlangen-Nuremberg.
April 2013: Annotation and analysis of Early Modern corpus data. Invited
presentation at Giornata di Studi – Corpus Linguistics and
Historical Corpora, University of Florence.
February 2013: Wrangling large-scale data for specialised corpora. Invited
presentation at the BAAL Corpus
Linguistics SIG Symposium on Building and Mining Small Specialised Corpora,
University of Edinburgh.
January 2013: Approaching text typology through cluster analysis in English and
Arabic corpora (with Ghada Mohamed). Presentation at the LSB2013 conference, Brussels.
September 2012: Prerequisites to a corpus-based analysis of EEBO-TCP (with Alistair
Baron). Presentation at the EEBO-TCP 2012
conference, University of Oxford.
September 2012. Which ‘Lancaster’ do you mean? Disambiguation challenges in extracting
place names for Spatial Humanities (with Paul Rayson and Alistair Baron).
Presentation at the Digital Humanities
Congress conference 2012, University of Sheffield.
January 2012: Modest XML for Corpora. Presentation to the UCREL Corpus Research
Seminar, Department
of Linguistics,
July 2011. Research ethics in corpus linguistics
(with Tony McEnery). Presentation at the
CL2011 conference,
July 2011. Twenty-first century Corpus Workbench:
Updating a query architecture for the new millennium (with Stefan Evert). Presentation at the CL2011 conference,
May 2011: The conceptual convergence of
functional-cognitive theory and neo-Firthian linguistics (with Tony
McEnery). Presentation at the ICAME 32 conference, Oslo.
May 2011: The internal gradience of the adposition category:
some evidence from comparable corpora of English, Nepali and Russian.
Presentation at the ICAME 32 pre-conference workshop on Corpus-Based
Contrastive Analysis, Oslo.
November 2010: Extending a corpus analysis tool to
support the analysis of field data: CQPweb and minority languages of South Asia.
Presentation to the UCREL Corpus Research
Seminar, Department
of Linguistics,
November 2010: Invited panel
presentation at the 5th Chicago Colloquium on Digital Humanities and
Computer Science,
September 2010: An introduction to CQPweb (and its
application to the lesser-studied languages of the world). Invited talk at
CNRS, Paris.
September 2010: Extending a corpus analysis tool to support the analysis of field data:
Bodo and Dimasa data in the CQPweb system. Presentation at the 16th
Himalayan Languages Symposium,
October 2009: Collocational patterning in cross-linguistic perspective: adpositions
in English, Nepali, and Russian. Presentation at the 28th International
conference on Lexis and Grammar,
July 2009: Corpus
evidence and the internal gradience of grammatical categories in Nepali. Presentation at the 15th
Himalayan Languages Symposium,
May 2009: CQPweb –
combining power, flexibility and usability in a corpus analysis tool. Presentation at the ICAME
30 conference,
September 2008: Visual GISting: Merging Corpus Linguistics
and Geographical Information Systems (with Ian Gregory). Presentation at
the Digital Resources for the Humanities
and Arts conference 2008 (DRHA08),
June 2008: Text reuse and ideology: tracing duplicates and variants in the news
discourse of seventeenth-century
May 2008: Computer-assisted metaphor analysis using key semantic domains
(with Veronika Koller, Paul Rayson, and Elena Semino). Presentation at the Researching and Applying Metaphor
conference (RaAM 7),
December 2007: Mentions in time &
space: extracting and visualizing report impact from a corpus of newsbook text.
Presentation at the Places of News conference,
July 2007: Collocational properties
of adpositions in Nepali and English. Presentation at the CL2007
conference,
July 2007: Exploiting a semantic
annotation tool for metaphor analysis (with Paul Rayson, Veronika Koller,
and Elena Semino). Presentation at the CL2007 conference,
June 2007: Historical text mining
applied to Early Modern English Literature. Presentation (jointly with Stephen
Pumfrey) at the workshop on “The Electronic Revolution in Textual Analysis”,
Institute for Advanced Studies,
May 2007: Quantifying syntactic
structures for keyness analysis. Presentation at the ICAME-28 conference,
May 2007: Collocational patterns around prepositions in English. Presentation
at Madan Puraskar Pustakalaya,
April 2007: The
February 2007: Prepositions in English: some thoughts towards a collocation-based
approach to grammatical categorisation. Presentation at the
December 2006: Historical text mining: corpus-based approaches to the newsbooks of the
Commonwealth. Presentation at the workshop on “Time and Space on the Way to
Modernity: The Emergence of Contemporaneity in European Culture”,
December 2006: Corpora and the languages of
February 2006: A collocation-based
approach to Nepali postpositions. Presentation to the Research Issues in
Theoretical Linguistics group, Department of Linguistics,
November 2005: Exploiting the
Nepali National Corpus: postpositions and collocational patterns.
Presentation at the Conference of the Linguistic Society of
November 2005: Automated
part-of-speech analysis of Urdu: conceptual and technical issues.
Presentation at the Conference of the Linguistic Society of
November 2005: Creating and
analysing a corpus of Nepali. Presentation to the Corpus Research Group, Department of
Linguistics,
September 2005: Foundational issues for corpus linguistics and the languages of
July 2005: How common is a noun? Part-of-speech ratios in English.
Presentation at the CL2005 conference,
June 2005: Approaching part-of-speech tagging: manual
and automatic analysis. Presentation at Madan Puraskar
Pustakalaya,
February 2005: Written corpora: design and data
collection. Unicode, XML and XCES: corpus encoding and mark-up. Corpus
annotation. Presentations at Madan Puraskar Pustakalaya,
March 2004: Data
and software resources for natural language processing in the South Asian
languages. Presentation at EuroIndia 2004
conference,
March 2004: Tagging
a new language: a case study in Urdu. Presentation at the University of Łódź,
March 2003: Developing a tagset for automated
part-of-speech tagging in Urdu. Presentation
at the CL2003 conference,
October 2002: A part-of-speech tagset for Urdu. Presentation at the BAAL/CUP
Seminar on Researching the Indic Languages Diaspora in
April 2002: A
part-of-speech tagset for Urdu. Presentation to the Corpus Research Group,
Department of Linguistics,