So way back when I started messing around with digital history tools, I tweeted some results that turned out to be the best mistake I ever made, because Heather Froelich tweeted back a response. Since then, she has been my across-the-pond guide to some serious sh1t, including AntConc, a freeware concordance program.


However, even better than the technical support she has provided, Heather has slowly been teaching me to stop thinking like a historian. Initially all I cared about was content words. Boring, I was told. Wait, what? Textual analysis = content words. In other words, I wanted to make tools do what my mind normally did. Heather keeps pushing me to see how articles and conjunctions and other “stuff” that historians, at least this one, tend to ignore in our close readings can actually matter.

So conjunctions. I should care why?


  1.  if they’re behaving in a way that is somehow unexpected!


I of course have limited expectations for conjunctions, which clearly occurred to Heather, who then kindly sent me a detailed email. So get your pen and paper (or open that new doc) and learn with me.

fill in the blanks for the following. Try to come up with 3-4 examples each (this shouldn’t be very difficult – you can use one word, or 2+ words…)
he  _______
the ______
_______ not
_______  it


he said
he sucks
he is

the book
the computer
the kid

why not
is not
can not

get it
was it
find it

What were the kinds of words you used to fill in these blanks? My examples for he included “he is”, “he was”, “he has”, “he ate”, “he likes”; for not I had examples like “did not”, “dare not”, “shall not”. The reason this was kind of easy is that we know which words go together VERY often, and those combinations are therefore highly salient. (Pick up a book near you and look to see what is appearing next to he or it, for instance.) These words, as you have noticed, are collocates – yes – but they also build stock phrases as n-grams.*
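For anyone who would rather try the “pick up a book” test on a text file than on paper, here is a minimal sketch in Python. It is not what AntConc does under the hood; it just counts which word comes immediately after a chosen word. The function names (`tokenize`, `right_neighbors`) and the sample sentence are made up for illustration, and the tokenizer is deliberately crude.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and keep runs of letters/apostrophes.
    # A rough stand-in for a real tokenizer (assumption, not AntConc's rules).
    return re.findall(r"[a-z']+", text.lower())


def right_neighbors(tokens, node):
    # Count the word that immediately follows each occurrence of `node`,
    # i.e. the second half of every bigram that starts with `node`.
    return Counter(nxt for cur, nxt in zip(tokens, tokens[1:]) if cur == node)


# Invented sample text, only here so the sketch runs on its own.
sample = "He said he was late. He is sorry, and he was tired. She said he is fine."
counts = right_neighbors(tokenize(sample), "he")

for word, n in counts.most_common(5):
    print(f"he {word}: {n}")
```

Swapping `sample` for the contents of a real file (and “he” for “the”, “not”, or “it”) gives a quick list to compare against the blanks filled in above.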

So, me and my expectations. Conjunctions might not have been where I should start, it seems, as that is pretty hard (they can join pretty much anything, right?). However, pronouns could be promising.
So I look into the above. I pick “he” and check to see if what I thought above is true.

When I say expectations, I mean “What words do you expect to find near these words based on your knowledge of the language anyway?”  You may not have ever thought about them this way – but start by filling in the blanks. If you can produce a quick list of words you’d expect to find like that, go see if they are indeed appearing in your corpus. If they are, great, that means you’re mostly adhering to expectations. If they’re NOT, why not? Can you explain it?

Well SH!T, it worked for “he”:
was
that
is
had
but
as
when
said

“not” also works, as “is” = top collocate, and for “it” I got “was” way up top.
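The “do my expectations hold?” check can also be sketched in a few lines of Python. This is only an approximation of a collocate search: it counts words within a small window on either side of the node word and then reports whether each expected word lands among the most frequent ones. The names `window_collocates` and `check_expectations`, the `span` and `top_n` settings, and the sample sentence are all assumptions for illustration; AntConc has its own window settings and statistics.

```python
import re
from collections import Counter


def tokenize(text):
    # Crude tokenizer: lowercase, keep runs of letters/apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def window_collocates(tokens, node, span=1):
    # Count every word within `span` positions to the left or right
    # of each occurrence of `node`.
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        left = tokens[max(0, i - span):i]
        right = tokens[i + 1:i + 1 + span]
        counts.update(left + right)
    return counts


def check_expectations(counts, expected, top_n=10):
    # For each expected word, say whether it is among the top_n collocates.
    top = {word for word, _ in counts.most_common(top_n)}
    return {word: word in top for word in expected}


# Invented sample text, only here so the sketch runs on its own.
sample = "It was late when he said that he was tired, but he is fine and he had slept."
counts = window_collocates(tokenize(sample), "he", span=1)
report = check_expectations(counts, ["was", "is", "had", "ate"])

for word, found in report.items():
    print(word, "as expected" if found else "missing: why?")
```

Any word that comes back as missing is exactly the kind of thing Heather says to go and explain.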

OK so √ homework, now I can play with fun possessive pronouns and the like!

This is also where things can get interesting. What about “male _________”? (patriarchy, hegemony, oppression …) These are all expected based on your corpus of women’s lib writing. But what if “male hegemony” only shows up after a certain date? That’s interesting, because we might have assumed that the phrase “male hegemony” was already established by 1979 – but you now have evidence which suggests otherwise. (More questions to ask after that: do they all start using “male hegemony” in 1982? Who does it first? How quickly does this phrase spread? etc.) Get creative!
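The “when does the phrase first show up?” question is easy to sketch once each text has a date attached. The Python below assumes the corpus is a list of (year, text) pairs; the function names and the three tiny documents are invented placeholders, not results from Chrysalis or Off Our Backs.

```python
import re
from collections import Counter


def phrase_pattern(phrase):
    # Match the phrase as whole words, allowing any whitespace
    # (including line breaks) between its words.
    words = [re.escape(w) for w in phrase.lower().split()]
    return re.compile(r"\b" + r"\s+".join(words) + r"\b")


def counts_by_year(docs, phrase):
    # docs is a list of (year, text) pairs; returns {year: occurrences}.
    pattern = phrase_pattern(phrase)
    counts = Counter()
    for year, text in docs:
        counts[year] += len(pattern.findall(text.lower()))
    return dict(sorted(counts.items()))


def first_year_with_phrase(docs, phrase):
    # Earliest year in which the phrase occurs at least once, else None.
    years = [y for y, n in counts_by_year(docs, phrase).items() if n > 0]
    return min(years) if years else None


# Invented placeholder documents, NOT real corpus data.
docs = [
    (1978, "placeholder text about male supremacy"),
    (1980, "placeholder text about male hegemony and male supremacy"),
    (1981, "more placeholder text about male\nhegemony"),
]

by_year = counts_by_year(docs, "male hegemony")
first = first_year_with_phrase(docs, "male hegemony")
print(by_year, first)
```

Adding the publication or author to each pair would be one way to get at the “who does it first?” question as well.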

I’m looking forward to working with 1978-1981 Chrysalis, which my amazing graduate student Whitney Esson is digitizing and converting, as well as Off Our Backs for the same dates, courtesy of JSTOR. I’ll be presenting the results at the Albert M. Greenfield Digital Center for the History of Women’s Education in March and, as always, will blog as I go!

*These are not quite ngrams as Google would suggest, but these kinds of n-grams: http://en.wikipedia.org/wiki/N-gram. Basically, this means they are really likely to appear near each other, to the point of being highly saturated in language as high-saliency combinations. A good example of this is to repeat the process with ________ dark. Which of the following leaps to mind first: “after dark” or “table dark”? “After dark” is the stock phrase, whereas “table dark” requires more effort to come up with (and “dark table” would be the more salient use of dark + table in a phrase, for example).
