hack vs. yack meets big data vs. small data.
I’m enjoying the provocative collection of essays about digital humanities and theory in the recent issue (are we still calling it an issue?) of the Journal of Digital Humanities. Brought together by Natalia Cecire, the essays ask us to think carefully about the dangers of positivism in the DH turn toward computational power. Theory, which makes many DHers roll their eyes, appears in these articles as the required grounding for DH (even as a number of writers seek to unground assumptions about the digital). Perhaps, the essays suggest, theory is even the very thing that makes DH a humanities endeavor instead of a purely digital one.
I’ve been thinking in particular about Benjamin M. Schmidt’s “Theory First,” which in many ways makes this point. Like Cecire herself, Schmidt claims that the digital humanities must draw upon theoretical inquiries into the very nature of epistemological meaning-making from evidence, especially from “big data,” with its intimations of positivist, statistical truth claims.
What I am struck by in this wonderful piece of writing by Schmidt is that there are two debates wrapped into one here. The first has to do with what DHers call hack vs. yack, which is to say whether DH is about simply diving in and building and coding projects or whether DH is about investigating the ethical, ideological, and, yes, theoretical dimensions of computational power and its uses. As almost all these essayists note, DH is most intriguing when it is about both.
But there is also a second debate lurking here, and it has to do with scale. Schmidt calls for DH to pursue the processing of big data on the premise that “deeper structures are readable in the historical record.” But might not small data also lead to these deeper structures? How do we know that history is statistically quantifiable? Perhaps causalities, meanings, truths, significances concentrate in particular objects, texts, events, people? Perhaps history is ultimately uneven and it is a dangerous distortion to even it out? Or better said, maybe different kinds of history lurk at different scales of looking and listening to the past.
My point is that computers might help us access small data too. We can go both microscopic and macroscopic in order to continue to probe the nature of evidence for historical meaning-making. I’ve gone on this rant before, but I see no reason to assume that more necessarily means more true when it comes to historical evidence. Big data might offer something profitable to marketers, the Pentagon, and corporations, but small data might matter just as much to our understanding of history.
Of course, DHers should, and must, explore the historical record through all lenses—epic and minute, qualitative and quantitative, telescopic and microscopic. Moreover, one collective project, it seems to me, might be for us not only to continue to enhance the dialectic between hack and yack, but also to think about how the digital can enable movement between scales—between the micro and the macro perceptual levels of historical analysis (not to mention the scales of justice, which Benjamin evocatively refers to in his explorations of how theory is much more crucial for the historical losers than the winners).
Hacking this movement between macro and micro scales of the historical record while also yacking about it will be important work.
And now back to following Benjamin’s explorations of Mad Men and Downton Abbey and other fascinating work.
I can’t speak for Ben here, but — for my part — when I suggest that text-mining tends to be more useful toward the large end of the scale continuum, it’s not based on any notion that larger patterns are inherently more meaningful. It comes rather from a suspicion that human beings are already pretty good at catching the smaller patterns. So the comparative advantage of text-mining skews toward the larger end.
I wouldn’t advance that as an absolute rule, though; I think it absolutely is possible to do good text-mining work on a smaller scale.
Hi Ted —
Thanks for the comment. I am ready and willing to explore the big data end of things. Your talk at last year’s Chicago Digital Humanities gathering at Loyola really made sense to me, esp. your notion of using n-gram data as a heuristic.
What’s nagging at me is this concern: are humanists romanticizing mathematical, statistical models of truth telling? Are we doing so without the awareness that these are but one interpretive lens on evidence? And are we in the humanities weirdly imitating, if not embracing, larger fetishes of big data in the corporate and military worlds?
I’m not against big data at all, in its place—I just think it’s important to be aware of the power involved in turning to big data. What I long for is for DH scholars to keep front and center that scale has a politics to it. Something like an awareness of the politics of epistemology here.
The problem with my desire here is that, intellectually, it means we are coming close merely to rehashing debates from the 1970s about cliometrics (here I’ve learned from Will Thomas’s great article on this, “Computing and the Historical Imagination,” http://www.digitalhumanities.org/companion/view?docId=blackwell/9781405103213/9781405103213.xml&doc.view=content&chunk.id=ss1-2-5&toc.depth=1&brand=9781405103213_brand&anchor.id=0).
But if we can find a way out of that stale debate, my point is this: we still don’t really understand scale and its relationship to causality and correlation when it comes to humanities topics. There could be some promise, I think, in moving between big and “small” data rather than pitting one against the other or privileging one over the other, as some seem to do (not you, not Ben, but the idea is out there).
After all, big data has a history, one linked to hyper-rationalized, dehumanizing modes of control (everything from Dr. Strangelove stuff to bad educational policy decisions to economic suffering to Vietnam kill ratios). Using big data has also done good in the world. My hope is that we keep big data’s problems in mind as well as its possibilities.
Thanks again!
— Michael