Lab: Analyzing Text Data
Guests
Avery Blankenship (Northeastern University English) and Jess Frye (UIUC iSchool)
Lab Topic
Today we will be joined by two guests, Avery Blankenship (PhD Candidate, Northeastern University English) and Jess Frye (PhD student, UIUC iSchool), who will share some of the text analysis methods they use as researchers on the Virality of Racial Terror (VRT) project, which is part of the larger Viral Texts project.
Collaborative Lab Notes Doc
Instructions
The materials you will need for today’s lab can all be found in this Google Drive Folder.
Lab Task: Perform your own topic model analysis
The premise of this lab assignment is pretty straightforward: use the topic modeling tool we explored in lab this week to model the topics in the example corpora we did not directly use in class. In class we used the “race riot” and “white women” datasets; for your independent work you can use the “lynch” or “klan” data from VRT. OR, if you’re feeling ambitious, consider preparing your own corpus for analysis following the instructions here. The single biggest choice is how you delineate an individual “document” in your corpus, as this will dramatically influence which words are most strongly associated with one another in the model.
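To make the "what counts as a document" choice concrete, here is a minimal sketch in Python, assuming your corpus is a single plain-text string. It is not the workflow from the linked instructions or the tool we used in lab; the function names and the sample text are hypothetical and only illustrate two common ways to divide the same text.

```python
import re


def split_by_paragraph(text):
    """Treat each blank-line-separated paragraph as one document."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_by_word_count(text, size=100):
    """Treat each run of `size` consecutive words as one document."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


# Hypothetical sample standing in for a real corpus file.
sample = (
    "First article about one event.\n\n"
    "Second article, reprinted elsewhere.\n\n"
    "Third article."
)

by_paragraph = split_by_paragraph(sample)
by_chunk = split_by_word_count(sample, size=3)

print(len(by_paragraph), "documents when splitting on paragraphs")
print(len(by_chunk), "documents when splitting every 3 words")
```

The same text yields a different number of documents, with different boundaries, under each scheme. Because a topic model learns from which words co-occur within a document, the scheme you pick changes which words can end up grouped together. Fixed-size chunks can cut across article boundaries, while paragraph splits keep each unit intact but produce documents of uneven length.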
Resources
- Programming Historian’s list of distant reading lessons, most specifically Shawn Graham, Scott Weingart, and Ian Milligan, [“Getting Started with Topic Modeling and MALLET”](http://programminghistorian.org/en/lessons/topic-modeling-and-mallet)
- Sathvika Anand, Quinn Dombrowski, and Xanda Schofield, DSC #20: Xanda Rescues the Topic Model Disaster (2023)