So if we take as a given that what historians do is make connections between things, then I can't actually fathom why everyone isn't all over digital history. I'm envisioning all sorts of things, ranging from network maps of authors in and among periodicals to topic modeling discourses to see if they line up with what we think of as the "strands" of feminism in each periodical.


Right now I'm trying to figure out the methodology of my topic modeling project while at the same time learning how to topic model. Thankfully for me, the fine folks at the Programming Historian, as well as awesome DH tweeps like Miriam Posner, Ted Underwood, and Scott Weingart, are so helpful and have written huge how-to posts.

So, methodology: I'm looking at the concept of culture in women's liberation periodicals in the United States. I'm interested in how discourse shifts around culture, and whether a particular feminist ideology "matches up" with coverage/discussion of culture.

Off Our Backs is the only fully digitized "movement" periodical. It runs 1970-2008, which is a problem in terms of the limits JSTOR puts on downloading, but I can always request the whole thing. I've had Chrysalis, a magazine of women's culture, digitized for its full run (1977-1981). I'm interested in looking at Women: A Journal of Liberation (fall 1969-1981) as a comparison for OOB, and Heresies (1977-1983), which is already in PDF format thanks to this amazing project and documentary by Joan Braderman, which you should SO bring to your campus ASAP.
So do I model all of OOB and Women, then subset out the dates that match Heresies and Chrysalis? Or do I even need to do that?  
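One way to think about the subsetting question: model everything, but keep the issue dates alongside the documents so any run can be filtered down to a comparable window. A minimal sketch, with made-up filenames standing in for my actual digitized issues:

```python
# Sketch: subset a corpus of issues by date so runs across periodicals
# are comparable. The (name, date) pairs below are hypothetical.
from datetime import date

corpus = [
    ("oob_1970_01", date(1970, 1, 1)),
    ("oob_1977_06", date(1977, 6, 1)),
    ("oob_1985_03", date(1985, 3, 1)),
]

def subset(issues, start, end):
    """Keep only issues whose date falls within [start, end]."""
    return [name for name, d in issues if start <= d <= end]

# Match the Heresies run (1977-1983)
print(subset(corpus, date(1977, 1, 1), date(1983, 12, 31)))
# -> ['oob_1977_06']
```

The nice part of this approach is that the full-run model and the matched-window model come from the same files, so the choice doesn't have to be made up front.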

Should I model individual articles or whole issues?

Also wondering if I should stick with David Newman's tool or switch to MALLET. Confirmed that Newman wrote an awesome GUI for MALLET.
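Whichever tool wins, MALLET's `import-file` command wants a plain-text file with one document per line, in the form `<name> <label> <text...>`. A minimal sketch of writing that file from plain-text documents (the document names and texts here are placeholders, not my real corpus):

```python
# Write a MALLET import-file: one document per line,
# formatted as "<name> <label> <text>". Texts are placeholders.
docs = {
    "heresies_01": "women art work feminist class",
    "heresies_02": "culture politics collective power",
}

with open("mallet_input.txt", "w") as f:
    for name, text in docs.items():
        # "all" is a dummy label; MALLET requires the column to exist
        f.write(f"{name} all {text}\n")
```

From there it's `bin/mallet import-file --input mallet_input.txt ...` on the command line.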

UPDATE

Frustration and getting it all wrong
So, playing around on the interwebs, I came across some new things I just had to play with.
I quickly grabbed a PDF of Heresies #1 and ran it through ABBYY FineReader to create text.
Then I ran that text through Voyant, which yielded some lovely results.
Frequencies

Word          Count   Z-Score   Rel. freq.   Trend
women           417      7.91        64.6       0
art             402      7.62        62.3       0
work            198      3.7         30.7       0
feminist        144      2.66        22.3       0
class           132      2.43        20.4       0
women's         128      2.35        19.8       0
new             123      2.26        19.1       0
woman           123      2.26        19.1       0
like            116      2.12        18         0
la              111      2.02        17.2       0
men             107      1.95        16.6       0
feminism        105      1.91        16.3       0
male             99      1.79        15.3       0
political        98      1.77        15.2       0
world            91      1.64        14.1       0
artists          89      1.6         13.8       0
time             89      1.6         13.8       0
make             78      1.39        12.1       0
que              74      1.31        11.5       0
people           71      1.26        11         0
york             71      1.26        11         0
mural            63      1.1          9.8       0
female           61      1.06         9.4       0
experience       60      1.04         9.3       0
power            59      1.02         9.1       0
working          59      1.02         9.1       0
left             55      0.95         8.5       0
feminists        51      0.87         7.9       0
just             51      0.87         7.9       0
los              51      0.87         7.9       0
movement         51      0.87         7.9       0
way              51      0.87         7.9       0
artist           50      0.85         7.7       0
prison           50      0.85         7.7       0
social           50      0.85         7.7       0
socialist        50      0.85         7.7       0
society          50      0.85         7.7       0
say              49      0.83         7.6       0
culture          48      0.81         7.4       0
day              48      0.81         7.4       0
life             48      0.81         7.4       0
good             47      0.79         7.3       0
love             46      0.77         7.1       0
tion             46      0.77         7.1       0
collective       45      0.76         7         0
en               44      0.74         6.8       0
money            44      0.74         6.8       0
politics         44      0.74         6.8       0
things           44      0.74         6.8       0
fact             43      0.72         6.7       0
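For what it's worth, the z-score column is just each count's distance from the mean count, in standard deviations. A toy sketch of the arithmetic on the top five counts from the table (the results won't match Voyant's column, since Voyant computes over the full word list, not five values):

```python
# Z-scores: (count - mean) / standard deviation.
# Toy computation on the top five counts from the table above.
from statistics import mean, pstdev

counts = [417, 402, 198, 144, 132]
m, s = mean(counts), pstdev(counts)   # m = 258.6
z_scores = [round((c - m) / s, 2) for c in counts]
print(z_scores)
# -> [1.26, 1.14, -0.48, -0.91, -1.01]
```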
I saved the above as a CSV spreadsheet, thinking I could run that through Gephi (although I later realized I didn't export ALL of it, which runs to 219 screens).
(Actually, I'm skipping the part where I messed up the export to CSV, which I realized when I put the file into Gephi and saw it had 0 edges.)
So I happily worked my way through the Gephi tutorial, which is lovely. Then I reached "rank." Clicking on that, I realized that the numbers in my Excel spreadsheet were being entered into the graph. DOH, see what happens when you play without knowing the tools? I'm pretty sure the error is in the formatting and then importing into Gephi, but damned if I can figure it out tonight.
Still, I'm finding the mistakes as instructive as the successes. Nothing like being the utterly clueless prof to give one insights into teaching.
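One likely culprit for the 0-edges problem: Gephi's spreadsheet importer expects an edge list with Source and Target columns, and a word-frequency CSV has no pairs in it at all. A minimal sketch of writing a CSV Gephi would read as edges (the co-occurrence pairs and weights below are hypothetical, not derived from the Heresies data):

```python
# Gephi's "import spreadsheet" reads edges from Source/Target columns;
# a frequency list has no such pairs, hence a graph with 0 edges.
# The word pairs and weights here are made-up illustrations.
import csv

edges = [("women", "art", 12), ("women", "work", 9), ("art", "work", 5)]

with open("edges.csv", "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["Source", "Target", "Weight"])  # headers Gephi looks for
    w.writerows(edges)
```

Getting real edge weights (say, counts of words co-occurring in the same article) is its own step, but a file shaped like this at least imports as a graph rather than a pile of disconnected nodes.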

And Voyant revealed some interesting stuff. For example, looking at the 48 instances of culture gives me the context via the words to the right and to the left. However, this has to be my favorite: knots (or maybe the collocate clusters).
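The contexts view is essentially a keyword-in-context (KWIC) listing; the idea is simple enough to sketch in a few lines. This is a minimal stand-in for what Voyant does, not its actual implementation:

```python
# Minimal keyword-in-context: for each hit, collect the words
# immediately to the left and right of the keyword.
def kwic(tokens, keyword, window=2):
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((left, right))
    return hits

tokens = "the politics of culture in the movement".split()
print(kwic(tokens, "culture"))
# -> [(['politics', 'of'], ['in', 'the'])]
```

Widening `window` is how you move from tight collocates toward fuller sentence-level context.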

Update again: found some really, really lovely stuff done by Lisa Rhody (@lmrhody), topic modeling poetry, and Aditi Muralidharan (@silverasm), very sophisticated text mining of slave narratives and Shakespeare.
