Getting into Academic Corpora

I really want to use corpora more often in class, I really do. I love the idea so much. But it all comes down to interface. The interfaces of some corpora are so dated and cumbersome to navigate that you’d have to devote hours of class time just to get students to use them in a basic way. (Though I do like Just-the-Word and SkELL, and I just came across this list of corpora resources on Twitter today and am hoping there may be a few gems on it.) Corpora are a teaching tool I need to spend more time with and would like to expand my knowledge of.

With corpora of academic English, things are even more restricted, simply because there are so few of them. In terms of student academic writing, the British Academic Written English (BAWE) corpus is a great resource, but the only avenues I can find for interacting with it at the paper level, not just the word and phrase level, are the British Council’s Writing with a Purpose collection and the BAWE collections in Flax. You can access both the BAWE and the British Academic Spoken English (BASE) corpus through Sketch Engine.

The two academic corpora out of Michigan are my favourites. The Michigan Corpus of Upper-level Student Papers (MICUSP) and the Michigan Corpus of Academic Spoken English (MICASE) have interfaces that are so easy to use! And you can explore not just words and phrases, but genre, through access to papers and texts in their entirety.

If you use any other corpus-based resources for teaching academic English, please share!


10 thoughts on “Getting into Academic Corpora”

  1. Hi Jennifer

    do you know the ADVICe corpus of academic speech links here –,

    there’s the PERC corpus, which you just need to register for (free) to get access – but after June 2015 not sure about its free-to-access status 😦

    there also used to be a great one that marked academic moves which you could search on but it is offline now 😦

    if you want the MICUSP corpus for offline use you can contact this guy


  2. Thanks for the plug, Jennifer! At FLAX, we have divided the BAWE into four sub-corpora with texts from across the larger discipline areas. Here is the about section of the Arts and Humanities part. We’d love to hear what you think about our augmented full text approach to building corpora with FLAX. Agreed, concordance lines are a put-off to language learners and teachers who don’t have the querying know-how to utilise them effectively. It seems that the corpus linguistics community is finally waking up to this realisation, which can only mean better tools to come for the language learning and teaching community.

    • Happy to plug your great resources. Flax is one of the few avenues into the BAWE I can actually recommend to my students for self study. FYI: I also had a shoutout/links to your blog and Slideshares in presentations on teaching genre I gave at IATEFL and BALEAP in the past few weeks. Cheers!
