(BAWE) British Academic Written English Corpus
About BAWE
The British Academic Written English Corpus (BAWE) was collected as part of the project 'An Investigation of Genres of Assessed Writing in British Higher Education'. The project was funded by the Economic and Social Research Council (2004-2007, project number RES-000-23-0800).
The corpus is a record of proficient university-level student writing at the turn of the 21st century. It contains just under 3000 good-standard student assignments (6,506,995 words). Holdings are fairly evenly distributed across four broad disciplinary areas (Arts and Humanities, Social Sciences, Life Sciences and Physical Sciences) and across four levels of study (undergraduate and taught masters level). Thirty main disciplines are represented. This Excel Spreadsheet contains information about the corpus holdings; a more detailed spreadsheet is available from the Oxford Text Archive. Parsed versions of the corpus have been created by Phil Durrant using the Stanford CoreNLP parser and are available here: https://phildurrant.net/parsed-bawe-corpus/
Selected links to the corpus are available through the BAWE Quicklinks project, designed for EAP teachers who would like to use corpus data in their feedback to students.
The corpus is available free of charge to researchers who agree to the conditions of use and who register with the Oxford Text Archive. It can also be searched online via the Sketch Engine open site or Lextutor. Please contact Hilary Nesi for further information, or if you have any queries or comments relating to the project.
Project team
- Professor Hilary Nesi
- Professor Sheena Gardner
- Dr. Siân Alsop
- Dr. Paul Thompson (Birmingham University)
- Dr. Paul Wickens (Oxford Brookes)
- Dr. Maria Leedham (Open University)
- Dr. Signe Oksefjell Ebeling (Oslo University)
Using Sketch Engine with BAWE 2019
Size: 2 MB
BAWE corpus manual
Size: 356 KB