New tutorials on network analysis with Cytoscape

The Cytoscape interface, featuring a pane on the left with buttons and a graph diagram on the right
I find the Cytoscape interface more intuitive than Gephi’s, although in both cases, you need to have a basic understanding of key NA terms.

For some reason I got it into my head to write a bunch of tutorials on using Cytoscape for network analysis. They’re now all up on GitHub. (I’ve been moving to GitHub for tutorials because they’re easier to update there.)

I started writing these for the students in my spring-quarter class and, even though the class is over, I’ve been adding to them compulsively. They’ll take you from zero to an interactive, web-based network graph, with stops along the way for projecting a two-mode network to a one-mode network and working with node attributes. (If you don’t know what any of that stuff means, they explain that, too.)
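If “projecting a two-mode network to a one-mode network” sounds abstract, here is a minimal sketch of the idea in plain Python. It is not taken from the tutorials, which do this inside Cytoscape; the people, the films, and the `project_one_mode` function are all made up for illustration. The assumption is that two people become connected whenever they share a film, and that the edge weight counts how many films they share.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical two-mode edge list: each pair links a person to a film.
affiliations = [
    ("Ana", "Film A"),
    ("Ben", "Film A"),
    ("Ben", "Film B"),
    ("Cy", "Film B"),
    ("Ana", "Film C"),
    ("Ben", "Film C"),
    ("Cy", "Film C"),
]


def project_one_mode(pairs):
    """Project (person, group) pairs onto a person-to-person network.

    Returns a dict mapping each alphabetically sorted pair of people to
    the number of groups they share, used here as the edge weight.
    """
    # Collect the set of people attached to each group.
    members = defaultdict(set)
    for person, group in pairs:
        members[group].add(person)

    # Every pair of people within the same group gets one unit of weight.
    weights = defaultdict(int)
    for people in members.values():
        for a, b in combinations(sorted(people), 2):
            weights[(a, b)] += 1
    return dict(weights)


edges = project_one_mode(affiliations)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

The films disappear from the result entirely, which is the point of the projection: what is left is a network of people only, with the films folded into the edge weights.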

There’s a bit of a Gephi-versus-Cytoscape battle right now among people who do network analysis. I actually started out on Cytoscape, only because I found it slightly more intuitive, and switched to Gephi when I discovered most people used that. But in recent years, I’ve had a really hard time dealing with Gephi. First, there was the Legendary Java Problem, and although the new version is purportedly more stable, I actually just cannot get it to work on my Mac and have frankly kind of lost the will to keep trying.

Cytoscape is Fine. It’s designed for scientists, really, and other people who care very much about statistical measures of networks, which to be honest, I don’t really care that much about. (I don’t think most humanists trust these measures anyway, so I don’t see much point in hammering on them.) I find Cytoscape’s web service, CyNetShare, to be pretty janky-looking, but … you can interact with the network diagram, so that’s good, I guess.

To be honest, I’ve been slowly making the switch from Gephi/Cytoscape/etc. to R’s igraph package, and to D3 for displaying networks on the web, just because they’re so much nicer looking. One thing I like about Cytoscape is that after you’ve measured various aspects of your network, you can export JSON that’s set up specifically for D3’s popular force-layout network.
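For a sense of what that kind of JSON looks like, here is a rough Python sketch that builds the `nodes` and `links` structure commonly seen in D3 force-layout examples. This is an assumption about the general shape, not a reproduction of Cytoscape’s actual export; the node names, the `degree` attribute, and the `to_d3_json` helper are hypothetical.

```python
import json

# Hypothetical node attributes, e.g. measures computed in a network tool.
node_attrs = {
    "Ana": {"degree": 2},
    "Ben": {"degree": 2},
    "Cy": {"degree": 2},
}

# Hypothetical weighted edges between those nodes.
edge_weights = {
    ("Ana", "Ben"): 2,
    ("Ana", "Cy"): 1,
    ("Ben", "Cy"): 2,
}


def to_d3_json(nodes, edges):
    """Serialize nodes and weighted edges as a nodes/links JSON string."""
    node_list = [{"id": name, **attrs} for name, attrs in sorted(nodes.items())]
    link_list = [
        {"source": a, "target": b, "value": weight}
        for (a, b), weight in sorted(edges.items())
    ]
    return json.dumps({"nodes": node_list, "links": link_list}, indent=2)


graph_json = to_d3_json(node_attrs, edge_weights)
print(graph_json)
```

Depending on the D3 version and how the layout is configured, links refer to nodes either by their position in the `nodes` array or by an identifier like the `id` used here, so it is worth checking which one your D3 code expects.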

When I was visiting Stanford last winter, I got to see a preview of a network analysis tool that the Humanities + Design team is building, and I really liked the way they placed the emphasis on exploration and discovery, rather than statistical measures. I’ll be looking forward to seeing the release of that tool (I think it’s called Idiographic?), since I do feel that humanists have different interests when it comes to networks than scientists or social scientists.

New book chapter

I’m really proud to have a new chapter in an open-access volume edited by Eric Hoyt and Charles Acland called The Arclight Guidebook to Media History and the Digital Humanities, published by the UK press REFRAME. The chapter, which is called “How is a Digital Project Like a Film?”, is really about data and narrative. What does it mean to tell stories with data? On what basis can we call data-based narratives true, and where do they necessarily lie? And what role does the interface play in all of this?

The full TOC includes lots of great stuff, including pieces by Deb Verhoeven, Haidee Wasson, Greg Waller, and Lea Jacobs. I think it does a nice job bridging the gap between traditional film studies and other forms of scholarship, and I’m very pleased to be included.

Materials on Image-Mining for Medical History

Last week, I taught the image-mining portion of the Images and Texts in Medical History workshop at the National Library of Medicine. I am far from an expert on OpenCV, the open-source computer-vision library. But as usual, that didn’t stop me from attempting to teach it.

The materials I created for the workshop include detailed instructions on how to use OpenCV to extract images from scanned journal pages (using a script written by Chris Adams), as well as a detailed breakdown of how to use the Python OpenCV library to take the average color of an image. I’ve also included links to my favorite resources on OpenCV and computer-vision in general. (My experience has been that there are a lot of really terrible tutorials out there, so I’ve tried to link only to those that are actually helpful.)
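To give a flavor of the average-color idea, here is a small sketch in plain Python rather than the workshop code itself. It assumes an image is a grid of pixels, each a (blue, green, red) triple, which is the channel order OpenCV uses when it loads an image. The tiny 2×2 `image` and the `average_color` function are made up for illustration.

```python
def average_color(pixels):
    """Return the mean (blue, green, red) color of an image.

    `pixels` is a list of rows, and each row is a list of 3-value tuples.
    """
    totals = [0, 0, 0]
    count = 0
    for row in pixels:
        for pixel in row:
            # Add each channel's value to that channel's running total.
            for channel, value in enumerate(pixel):
                totals[channel] += value
            count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return tuple(total / count for total in totals)


# Hypothetical 2x2 image: three pure-red pixels and one pure-blue pixel.
image = [
    [(0, 0, 255), (0, 0, 255)],
    [(255, 0, 0), (0, 0, 255)],
]

mean_bgr = average_color(image)
print(mean_bgr)
```

With a real scan, you would load the file with OpenCV and average over the resulting pixel array instead of looping by hand, but the arithmetic is the same: add up each channel and divide by the number of pixels.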

Ben Schmidt taught the text-mining portion of the workshop, and his materials are really great. His handouts in particular are concise, opinionated rundowns of the strengths and weaknesses of various forms of text analysis.

In preparation for the workshop, Ben and I created a virtual machine, provisioned via Vagrant with all the dependencies and data the participants needed. If you’d like to install the VM, it has everything you need for both Ben’s and my portions of the workshop, and the instructions should be pretty clear. (The VM is based on one that Andrew Goldstone created for his Literary Data class.)

The process of getting the VM installed on participants’ own computers was … complicated. We learned many things about Vagrant and VirtualBox, including the fact that Windows 7 and 8 don’t come with any way to handle SSH.

It was definitely the most technically complex workshop either of us has attempted (with a group of about 50!!). It was not hitch-free, but it was really satisfying to see participants get excited about computer vision, and to talk about ways they might use these techniques in their own research.

Money and Time

This is an edited version of a talk I gave at UC Irvine on February 5, at a symposium organized by Peter Krapp and Geoffrey Bowker.

Digital humanities, as we all know, is sexy right now. It seems to be everywhere, including the New York Times, the New Republic, and the Atlantic. Mellon’s funding it, the NEH is funding it, ACLS is funding it, we’re telling our grad students to prepare to work in it. Digital humanities initiatives or centers are popping up everywhere, and what a luxury to be part of a field that’s so frequently mentioned that people create angry memes about it.

At UCLA, I run and teach in our digital humanities minor and graduate certificate, which started four years ago and now enrolls about 60 undergraduates and 30 graduate students. Students are genuinely excited about DH, and it is a total blast for me to work with them to chart out the possibilities of this expanding field.

University departmental structures aren’t always congenial to interdisciplinary work, but students seem to get it right away. They’re really fascinated by the basic questions DH raises about knowledge organization, history, and epistemology, and I love the way they push the field’s boundaries just by asking the questions that come most naturally to them. I’ve actually felt extremely lucky to be part of a field that’s growing so quickly, and even to be in a position to help chart its direction.

But all this excitement and energy might conceal some less exciting ground truths. I have been spending a ton of time on the road lately, meeting with people who are starting DH centers and talking with people who are keeping initiatives and centers going. And they are tired. They are all really tired.

And once you drill down into the specific staffing and labor configurations of these DH initiatives, you’ll begin to see why. So many of these programs are staffed entirely by postdocs, perhaps with a faculty director who spends a portion of his or her time running the center.

In other cases, a DH initiative consists of a single librarian, who’s probably also responsible for liaising with several academic departments. If a DH initiative has programmers, they’re usually what you’d call “matrixed,” meaning they have multiple bosses, to whom they have to account for their time in exquisite detail. Or if the DH activity is coming from faculty, it’s from people who have to use every ounce of their ingenuity to scare up resources to support their students and their research.

Why is this widespread shortstaffing happening? Some of it is probably just because DH is new and untested, and it is notoriously difficult to launch new, interdisciplinary programs at universities, especially big ones like most of the UCs.

And DH has had the bad timing to emerge during a moment of particular budget austerity, at least when it comes to paying for academic programs. (Whether that’s coincidental is another, much longer discussion.) Launching a program with a two-year postdoc is clearly absurd and shortsighted, but it’s nevertheless become standard operating procedure for many places looking to get a program going. So, in a way, many of these conditions are just typical of our corner of academia at our current moment.

I wonder, though, if part of the problem might also be that our institutions have absorbed some of the widespread rhetoric about the immateriality of digital labor. We’ve come to think that stuff that you do on a computer can be done anywhere, anytime — and thus everywhere, all the time, with no particular material requirements.

We’re used to getting digital stuff for free, from Facebook to iPhone apps, so perhaps we think digital academic programs shouldn’t be any different. People built Linux for free; why shouldn’t they donate their time to build a DH program?

The wide availability of free software and the general enthusiasm about all things digital have probably contributed to this notion that all we need to make a DH center is a laptop and a postdoc. For my part, I’ve optimized absolutely everything about my job I can possibly optimize, from text-expanders and email auto-filters to IFTTT pipelines to automatic appointment-booking software. We’re all lifehacking, right? And I still feel like I’m teetering on the brink of burnout.

Don’t worry, this isn’t really about me. I mean, we should all be concerned with every laborer’s working conditions, and we should all be concerned about what’s happening with academic labor. I suspect we all are. But I actually want to make a somewhat different argument here, one that has more to do with the possible futures of both of our fields.

Recently, I was talking to a group of our grad students about the kinds of work people are doing right now in digital humanities, and they asked some uncomfortable questions.

Take digital mapping. Postcolonial theorists have known since forever that the Mercator projection enshrines Western European, Cartesian models of space, when in fact there are many different ways of understanding geography. Why does every DH project use the Mercator projection?

Or take network analysis software. The tools we tend to use, like Gephi and Cytoscape, are great at measuring centrality and clustering coefficients. But what about some of the most basic things a humanist might like to do, like transforming the network diagram to reflect the perception of a different historical actor? That’s just not a possibility for us. Why is that?

Why? It’s simple. Because we’re relying on tools and infrastructure built for industry — or, in the best cases, for scientists. Which makes a certain amount of sense; one doesn’t want to reinvent the wheel. But it’s also had material effects on the kind of work we can produce, and the horizons of possibility our work can open. When we choose not to invest in our own infrastructure, we choose not to articulate a different possible version of the world.

In fact, this state of affairs is already very well-documented for edtech. By outsourcing development of key components of educational technology to for-profit vendors, we’ve chosen to invest in the development of software companies that mine our students’ data, encourage us to spy on their work, and lock us into a closed ecosystem of for-profit technology whose philosophy bears very little resemblance to the kinds of teachers we started out wanting to be.

And for all of the excitement about grant funding opportunities and enthusiastic administrators, the actual state of DH funding is less flush-with-cash than boom-and-bust. An NEH grant, no matter how prestigious, doesn’t secure a salary for very long. A postdoc, no matter how smart and committed, isn’t going to singlehandedly change campus culture.

It’s one thing to get an awesome project going; it’s another thing to pay for the routine maintenance necessary to keep it up and running. Recently, we saw the closure of HyperCities, UCLA’s well-known mapping platform for humanistic projects. People were tired of piecing together grant funding to keep it lurching along. Meanwhile, Google decided to shut down its support for the Google Earth browser plugin, so … it’s gone. That’s what happens when we don’t invest in our own infrastructure.

Don’t get me wrong, I get tired all the time of trying to wrestle with the exhausting bureaucracy of a public school, and I’ve turned to private-industry solutions plenty of times. Most recently, I’ve given up on trying to control my own space on university servers and started encouraging my students to purchase their own space from hosting companies for class projects.

It seems like the reasonable thing to do, since Lord knows I’ve had my stuff written over and erased from university servers more times than I can count. But I’m also aware that by choosing not to invest in support for this kind of thing, we’re relinquishing all of this work to private servers. We’ll never get it back again.

Last year, UCLA announced an app competition. The contest promised a $5,000 prize for the best app to, quote, promote “UCLA’s mission of education, research and service.”  I’m 100% sure that the offices that sponsored this contest had the best intentions, and I salute the winners. But this is not support. This is not research support. How long does it take to build an app? How many people does it take? How is the app going to get updated once the contest is over? What message are we sending our students by telling them they should work for free? Has anyone thought this through?

We want to believe that we can be agile and innovative, like Silicon Valley says it is, by making DH run with short-term grants, app contests, and temporary labor. We want to have a sort of Uber-style sharing economy for DH-research. But this is not how one supports careful, enduring scholarship and teaching.

Why does digital humanities look the way it does right now? I think the boom-and-bust cycle of grant-chasing and temporary funding has had a huge but largely unacknowledged effect on the kind of scholarship we’re producing. If we want to produce truly challenging scholarship and keep our best scholars from burning out, we need to pressure our institutions to, frankly, pay up. You can optimize, streamline, lifehack, and crowdsource almost everything you do — but good scholarship still takes money and time.

A better way to teach technical skills to a group

a stack of orange, blue, yellow, and pink post-it-notes
“Post-It Notes,” by Dean Hochman

My DH101 class this year was my biggest yet, with 45 undergrads. I suppose that’s not huge compared with many other classes, but DH101 is very hands-on. I am fortunate enough to have a TA, the awesome Francesca Albrezzi, who runs separate weekly labs. Still, I often have to teach students to do technical things in a large-group setting, and the size of the class this year prompted me to rethink how I do this.

As I see it, many of my students’ biggest problem with computers is their own anxiety. Obviously, I have a self-selecting group, since I teach a class with “digital” in the title, but even so, many of my students tell me that they are just “not technical.” Many of them are so convinced of this that they see any failure to get something to work as confirmation of what they already knew: they’re just not good with computers.

And since this is UCLA, the vast majority of my students do not fit the stereotype of the Silicon Valley programmer. This is awesome for the class, since we have so many different voices in the room. But it also puts many of my students at risk for stereotype threat, in which students’ performance suffers because they fear their mistakes will be seen as representative of their entire race or gender.

I’ve seen a version of this happen in workshops countless times. The instructor issues directions while students try to keep up at each step. Some students accomplish each step quickly, but some students take a little longer to find the right menu item or remember where they’ve saved a file. No matter how often you tell students to please interrupt or raise a hand if they need help, most students won’t do this. They don’t want to slow everyone else down with what they’re sure is a stupid question. Eventually, these students stop trying to follow along, and the workshop becomes, in their minds, further evidence that they’re not cut out for this.


Rehabbing DH101

Someday I’ll come up with a better way of illustrating blog posts than a Flickr search for “data,” but in the meantime, here’s “Untitled” by Karen Blaha.

I’m teaching Introduction to Digital Humanities for the third time this year, along with Francesca Albrezzi, my wonderful two-time teaching assistant, and I’m really enjoying it. It’s a challenging but rewarding class, with 45 students, a 10-week quarter, and a large number of moving parts. I reworked the syllabus quite a bit for this iteration, and I thought it might be useful to talk about what I’ve done differently and why.

As I’ve taught through the class a few times, its purpose and value have become more clear to me. My version of DH101 is about developing a humanistic attitude toward data. To me, that means the ability to hold in one’s mind simultaneously the value of any particular dataset and its inevitable poverty, compared with the phenomena it purports to describe. I want students to be able to “work” with data — that is, to analyze, visualize, and map it — but also to retain a perpetually critical, interrogative stance toward it.

In the service of this goal, I’ve completely rewritten the students’ final project assignment. The previous assignment, which I first inherited and then adapted, was for students to work in groups to build Omeka projects on topics of their choice. This had the benefit of exposing them to the demands and complexity of Dublin Core metadata, but I felt that the students were spending too much time describing objects and not enough time working directly with data. Since Omeka has no real export function, they weren’t able to do much with the metadata they were creating, besides build exhibits.

The (sort-of) selfies class

Room of boisterous students mugging for the camera
Class selfie! Lotta brilliance in this room!

Last winter I taught a class called Selfies, Snapchat, and Cyberbullies: Coming of Age Online. It was incredibly fun and rewarding, and I learned a ton. Mark Marino simultaneously taught a great class on selfies over at USC, and while we weren’t able to sync up our classes as much as we might have liked, we were able to have a joint Facebook group for them, which was really fun.

(Mark and I were able to teach our classes at all in large part because of the generosity of the scholars involved in the Selfies Research Network, to whom I owe a big debt of gratitude.)

Mark’s class generated a ton of publicity, and because he mentioned my own class, I rode Mark’s coattails a bit as we got mentioned in the New York Times, the LA Times, and elsewhere. Of course, Mark and I knew that the only reason our classes were getting any press was so that people could talk about how ridiculous a selfie class is. But it was still fun, and we tried to inject as much substance as we could into the conversation.

Meanwhile, the ever-awesome Liz Losh took the time to really dig into the substance of my class in this excellent post on the DML blog; I was really honored to be interviewed.

I got an interview request for another outlet, and since the article seems not to have seen the light of day, I thought I’d just post my responses to the interviewer’s questions here.

Incidentally, I don’t really take my own selfies, not because I disapprove of them, but because I’m really bad at it. Much respect to people who can do it well!

What enticed you to teach a class centered around the selfie?

The class wasn’t entirely centered around the selfie. It was about the experience of being a young adult in the digital age and, more broadly, how we should think about the relationship between technological and cultural change. I wanted to teach this class because I’ve heard a lot of generalizations about millennials, both in the media and from people I know, and I felt that many of these characterizations didn’t accurately reflect the complicated, diverse people I encounter in the classroom at UCLA. I wanted to submit those generalizations to rigorous scrutiny, to see whether they held up.

I also noticed that every time I mentioned social media or online culture in the classroom, students were really eager to chime in with their own experiences. I thought it would be fun and interesting for us to carefully study something they care so much about. I also have a sister who’s 21, so I felt a personal investment in countering some of the more pernicious stereotypes about young adults.

What insights and observations have you gained regarding the relationship between students and social media?

My students had a ton to say about social media and its relationship to youth culture. One thing I found most interesting is how worried they are about social media’s effects on their attention spans and relationships. That makes sense, since they’re hearing the same news stories and media messages about millennials that we are! But they’re thinking very hard about technology and social change; no one should assume that just because a young adult has her eyes on her phone, she’s not also self-aware and thoughtful.

Can you give an example of an assignment for the course?

Students’ main project was a digital ethnography (meaning an in-depth study of a particular culture) of an online community. I asked them to immerse themselves in, and in some cases participate in, an online community of their choice. We had a couple of Tinder papers, one on Yik Yak, and a few on Instagram. Students were surprised at how hard it was! We spent a long time talking about how to be an ethical, honest, and diligent participant-observer.

Based on what you’ve seen among students, are there specific aspects that constitute a typical selfie?

I think it really depends on context. Selfies can have different meanings, depending on who’s taking them and for what purpose, and often you’ll find people consciously imitating or exaggerating elements of the “typical selfie” for ironic effect. For example, many teenage girls will offer up an exaggerated “duckface” to the camera, in a conscious and ironic imitation of the “typical selfie.” Just as any portrait can, a selfie can mean many different things, and one has to be very alert to its context when one’s trying to suss out the meaning of any particular image.

Outside of classroom purposes, do you condone taking selfies? If yes, how do you justify a selfie as something more than an act of narcissism?

I don’t really think it’s my place to condone or not condone any form of participation in a visual culture. Community, as we all know, means a lot to people, and taking selfies is one way that some people participate in a community. I think we should also be very alert to what is connoted by the word “selfie.” As the term is popularly used, it’s closely associated with teenaged girls, who have frequently been the object of scathing ridicule in American culture. I think of selfie-opprobrium as somewhat akin to people’s annoyance at vocal fry: both phenomena are associated with teenage girls, and both suggest a degree of annoyance (perhaps even fear) at girls’ temerity in entering the public sphere.

What do you hope students carry away from the course?

On our last day of class, I asked students what they’d remember about what we learned. They all agreed that “It’s complicated!”  — which is also the title to danah boyd’s recent book, which we read in class. What boyd means, and what my students meant, is that you can’t assume that all online youth culture is one thing, or that every young person experiences life online in the same way. Phenomena that look very similar to outside observers can turn out, on closer inspection, to have very complicated and multilayered meanings. Young people — like all human beings — are complicated, diverse, and multifaceted. Sweeping generalizations about them are unhelpful and usually wrong.

They also said they’d remember our discussions about the need to “hustle,” by which we meant the reality of labor in the twenty-first century. Students carry unprecedented educational debt these days, even as the likelihood of them owning their own homes, or even attaining the same living standard as their parents, is lower than it has ever been. Steady jobs, the kind with pensions and benefit plans, are becoming increasingly rare, and students are facing the possibility of a future made up of freelance gigs and short-term contracts. It’s no wonder they feel compelled to create complex online identities. In an economic moment in which their online identities can determine their ability to earn a wage, it’s incumbent upon them to create charismatic online personae.

What’s in your conference travel bag?

A red purse sits in the backyard. In front are a laptop, notebag, pens, two small gray zippered pouches, a power adapter, a power strap, a pill case, and a striped pouch.
Taken in a hotel room, appropriately. iPhone not pictured, since I had to take the picture somehow.

Anyone else have a weakness for those “What’s in your bag?” features? My stuff is not nearly as nice as the stuff those people carry, but deep in my heart, I seem to cling to the belief that my life really would be better if I could just optimize a few things.

Anyway, I posted on Facebook about a new receipt-filing thing I’d bought, and the response was so enthusiastic (what is wrong with my friends?) that I thought I’d do a quick post about what I’ve been carrying lately. I’ve been traveling for work a ton this year (way too much, obviously) and I’ve been devoting more thought than I’d like to admit to making my conference travel bag efficient.


“Stronger and Whiter Light Down Deeper and Darker Holes”: Jacob Sarnoff and the Strange World of Anatomical Filmmaking

For some reason, I love this image, from Sarnoff’s The Human Body in Pictures (1927). It’s so stylish!

I have an essay up over on the National Library of Medicine’s Medical Movies on the Web site about Jacob Sarnoff, a Brooklyn surgeon who made thousands of anatomical and surgical films. I’m also so excited that the NLM posted Sarnoff’s weird 1927 film “The Human Body in Pictures.”

From the essay:

Motion pictures’ utility for surgeons might seem to be their ability to show things just as they appear to an observer present at the scene. But a film like Sarnoff’s suggests that there is a gulf between what mechanical reproduction shows and the way that something like circulation actually appears to the surgeon present.

For surgeons like Sarnoff, the value of film wasn’t only, or even chiefly, its ability to mechanically reproduce reality, but its ability to function as a dynamic collage: to offer students of surgery a lesson on how to move back and forth seamlessly between the messy substance of reality and the neat diagrams that populate anatomical atlases.

I was especially happy to write something for the NLM because the Library’s History of Medicine division has been invaluable to my work. From my first, exploratory research into my dissertation, their librarians and archivists have been true research partners (and sometimes cheerleaders!). The History of Medicine division does invaluable work, and I’m so grateful to its staff.

What’s Next: The Radical, Unrealized Potential of Digital Humanities

This is a lightly edited version of the keynote address I was honored to give at the Keystone Digital Humanities Conference at the University of Pennsylvania on July 22, 2015. Thank you to the organizing committee for inviting me!

My sincere thanks, too, to Lauren Klein and Roderic Crooks for their advice and feedback on this talk. I’d also like to acknowledge the huge intellectual debt I owe to David Kim and Johanna Drucker, with whom I’ve argued, negotiated, and formulated a lot of these ideas, mostly in the context of teaching together. David’s important dissertation, Archives, Models, and Methods for Critical Approaches to Identities: Representing Race and Ethnicity in the Digital Humanities (UCLA, 2015), takes on many of these issues at much greater length.

I gave the title of this talk to Dot Porter some time ago in a fit of ambition, and it’s seemed wildly hubristic to me ever since. But it’s something I care a lot about, and so tonight I’d like to outline some ideas about how digital humanities might critically investigate structures of power, like race and gender.

We are doing some of that now, as evidenced by some of the work at this conference, but I don’t think we’re doing it with the energy or the creativity that we might. I’ll argue that to truly engage in this kind of work would be so much more difficult and fascinating than what we’re currently talking about for the future of DH; in fact, it would require dismantling and rebuilding much of the organizing logic, like the data models or databases, that underlies most of our work.

So I’ll start by saying a little about where I think we are with digital humanities now, and also about some new directions, with respect to these structures of power, in which I’d like to see the field go.
