Lab: Distant-Reading The Crisis with Voyant Tools

Cox, Flores, Sohl, Zaidi

We picked the word “college” and found that Vol. 14 No. 1, published in May 1917, had the highest raw frequency of the word. Although Voyant Tools reported 171 uses of the word, a Ctrl-F search of the PDF returned only 36 results. This suggests a mismatch between the text that is searchable in the PDF and the corpus that Voyant analyzed. In the PDF, the searchable uses of “college” appeared almost entirely in advertisements or in descriptions of people, such as “went to X college” or “assistant dean at Y college.”
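We do not know exactly why the two counts differ, but a short sketch shows how two ways of counting the same word can disagree on the same text. The function names and the sample sentence below are made up for illustration. This is not how Voyant or a PDF viewer actually counts words.

```python
import re
from collections import Counter

def token_count(text, word):
    """Count exact whole-word matches after splitting the text into lowercase tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)[word.lower()]

def substring_count(text, word):
    """Count every place the letters appear in a row, even inside a longer word."""
    return text.lower().count(word.lower())

# Hypothetical sample text, including a word hyphenated across a line break.
sample = "Howard College. The college dean. Colleges and col-\nlege halls."

print(token_count(sample, "college"))
print(substring_count(sample, "college"))
```

In this sample the whole-word count skips “Colleges,” the substring count includes it, and both miss the word split by a line-break hyphen. Differences like these, along with differences in the underlying text layer, are the kind of thing to check when two tools disagree.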


By reducing The Crisis to graphical primitives, we gain a meta-analysis of the work but lose context. We can also select individual words and use them to analyze the text. However, we must go back to the PDF itself to figure out exactly why some words are used more than others.

The Cirrus widget shows word frequency: the bigger a word appears in the cloud, the more often it is used. This gives a general idea of what the common words are and how they relate to the theme of the magazine. I paid particular attention to some of the smaller words. When I came across those less-used words in the PDF, I spent more time analyzing them than I would have if I had not known they were used infrequently.



