Screencast: AntConc 101: A quick introduction to text corpus analysis




AntConc is a freeware corpus analysis toolkit. Its main function is to identify patterns in large collections of texts, such as novels, blog posts, e-mails, or essays. These patterns might provide you with valuable clues for your research.

Today, we’re going to show you how to get started with AntConc and quickly demo some of its powerful features.

You can download the version for your operating system directly from the author’s website - we’ll include a link in the description too. Next, let’s load in some texts. Project Gutenberg hosts one of the largest collections of public domain e-books in the world. Let’s look at the Sherlock Holmes detective books written by Sir Arthur Conan Doyle. Note that you must use files in a plain text format like .txt with AntConc - you can use a program like Microsoft Word to convert a document to .txt format if you need to.

Now that we’ve loaded our texts into AntConc, we’re ready to analyze.

The Concordance feature works like a search engine that scans our entire corpus. Let’s look for all instances of MURDER. The Concordance Plot visualizes the exact moment where MURDER appears in our novels -- or at least where the word murder appears.
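
If you are curious what a keyword search with context looks like under the hood, here is a minimal Python sketch. It is not AntConc's own code: the function names kwic and relative_positions, the context width, and the sample sentence are all made up for illustration. The first function lists each hit with the text on either side, and the second reports where each hit falls in the text as a fraction from 0 to 1, which is roughly the information a plot of hits would draw.

```python
import re

def _word_pattern(term):
    # Whole-word, case-insensitive match for the search term.
    return re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)

def kwic(text, term, width=30):
    """Return (left, hit, right) tuples: each hit with `width` characters of context."""
    rows = []
    for m in _word_pattern(term).finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        rows.append((left, m.group(0), right))
    return rows

def relative_positions(text, term):
    """Return each hit's start position as a fraction of the text length."""
    return [m.start() / len(text) for m in _word_pattern(term).finditer(text)]

# Invented sample text, standing in for a loaded novel.
sample = "It was murder, Watson. A murder most carefully planned."
for left, hit, right in kwic(sample, "murder", width=12):
    print(f"{left:>12} | {hit} | {right}")
print(relative_positions(sample, "murder"))
```

On a full novel you would read the .txt file into a string first and pass that in place of the sample.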

AntConc contains a number of features to discover trends in the words that occur near or next to our search terms. For today, let’s try Collocate searches with a few different words. AntConc uses a combination of searching and statistical analysis to show us words that appear near our search term, and that were unlikely to appear there by chance alone.

You may wish to play with some of the parameters in the bottom right of the screen. I’m increasing the minimum frequency to 5, which helps make sure AntConc is capturing repeated trends in our data and not just a single weird phrase that appeared once.
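
To make the idea of "unlikely by chance" and the minimum frequency setting a little more concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: it scores neighbours with a simple mutual-information-style measure, which may not match the statistic your copy of AntConc is set to use, and the names collocates, span, and min_freq are invented for this example.

```python
import math
import re
from collections import Counter

def collocates(text, term, span=5, min_freq=5):
    """Rank words found within `span` tokens of `term`, dropping rare co-occurrences."""
    tokens = re.findall(r"[a-z']+", text.lower())
    term = term.lower()
    total = len(tokens)
    word_freq = Counter(tokens)

    # Count every word that falls inside the window around each hit.
    near = Counter()
    for i, tok in enumerate(tokens):
        if tok == term:
            near.update(tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span])

    results = []
    for word, co_freq in near.items():
        if word == term or co_freq < min_freq:
            continue  # the minimum frequency filters out one-off phrases
        # Co-occurrences expected if the two words were scattered at random.
        expected = word_freq[term] * word_freq[word] * 2 * span / total
        results.append((word, co_freq, math.log2(co_freq / expected)))
    return sorted(results, key=lambda r: r[2], reverse=True)

# Invented toy corpus: a repeated phrase plus unrelated filler.
toy = "the brutal murder shocked london " * 5 + "the quiet morning passed slowly " * 20
for word, freq, score in collocates(toy, "murder", span=1, min_freq=5):
    print(word, freq, round(score, 2))
```

Raising min_freq above 5 on this toy corpus returns nothing, which is the same trade-off as in the screencast: a stricter threshold keeps only patterns that repeat.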

AntConc can export your results through the Save Output to Text File feature. These are plain text files, but they use tab-separated formatting -- which means that you can load them into a spreadsheet with Excel! This gives you a lot of options to work further with the data.
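
If you would rather work with the exported file in code than in a spreadsheet, a tab-separated file can be read with Python's standard csv module. This is a small sketch, not a description of AntConc's exact output: the sample rows and their column order are invented, and read_tab_separated is just a name chosen for this example.

```python
import csv
import io

def read_tab_separated(handle):
    """Parse tab-separated lines into lists of fields, skipping blank lines."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader if row]

# Invented rows standing in for an exported results file; real columns may differ.
sample_export = "1\t12\tbrutal\n\n2\t9\tmystery\n"
rows = read_tab_separated(io.StringIO(sample_export))
for row in rows:
    print(row)
```

With a real export you would pass an open file instead of the in-memory sample, for example open("results.txt", encoding="utf-8", newline=""), where the filename is whatever you chose when saving.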

I hope you enjoyed this introduction, and that you keep exploring -- we’ll include some links in the description to other webinars and tutorials. Enjoy AntConc and thanks for watching!
