A fun way to introduce DH students to dataviz

As a teacher, I’ve always operated on the assumption that students are primarily interested in each other. Here’s a fun activity that takes advantage of that interest to teach students a little about data visualization. It’s an extremely unscientific Cosmo-style quiz, designed to show students which interests they have in common with each other. It’s just an introductory lesson, but it gives you a fun dataset to play with. You’ll probably want to split this across a few class sessions, since students will need at least one full class just to get familiar with Gephi.

Of course, it’s also a good chance to talk about how authoritative graphs like these can look, and whether the data they contain actually means much at all. (Probably not!)

Make a questionnaire for your students

wpid1844-media_1429464822280.png

I’d do this about a week before you do the dataviz lesson. I used Google Forms for this. Just to make things more fun, I called it the Mysterious DH Questionnaire. I asked five questions, each of which had five options. The possible answers were literally the first options that occurred to me.

Of course, you can choose whatever you want; just be sure you have a constrained list of choices (no write-ins).

Make your spreadsheet into a two-mode edge list

wpid1845-media_1429465322984.png

Now that you have your data, you’ll want it in three different formats: 1) raw; 2) an edge list for a two-mode network graph; and 3) an edge list for a one-mode network graph. To get your two-mode list, use OpenRefine to transpose cells across columns into rows. The idea is to go from the layout shown in the above screenshot to …

wpid1846-media_1429465687328.png

… this one. It’s the same data, just rearranged into two columns.
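If you’d rather script this step, or just want to show students what the rearrangement actually does, here’s a minimal Python sketch of the same reshaping. It isn’t the OpenRefine workflow, just the same idea using the standard library. The column names (`Name`, `Book`, `Movie`) and the sample rows are invented for illustration, so swap in whatever your form export actually contains.

```python
import csv
import io

# Hypothetical sample of the raw questionnaire export: one row per student,
# one column per question. Names and answers are invented for illustration.
RAW_CSV = """Name,Book,Movie
Ana,Beloved,Clueless
Ben,Beloved,Alien
Cal,Dune,Alien
"""


def to_two_mode_edges(raw_csv, id_column="Name"):
    """Return (student, answer) pairs, one pair per answered question."""
    edges = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        student = row[id_column]
        for question, answer in row.items():
            # Skip the name column itself and any unanswered question.
            if question == id_column or not answer:
                continue
            edges.append((student, answer))
    return edges


edges = to_two_mode_edges(RAW_CSV)

# Print as two columns, with the Source/Target header used for edge tables.
print("Source,Target")
for source, target in edges:
    print(f"{source},{target}")
```

One thing to watch for: if the same answer text can appear under two different questions, you may want to prefix each answer with its question so the two don’t collapse into a single node.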

Make your spreadsheet into a one-mode edge list

wpid1847-media_1429466525579.png

Then, if you want (you don’t have to, but it can help students see the difference between one-mode and two-mode network graphs), you can project your two-mode edge list into a one-mode edge list, using Gephi and this tutorial from Shawn Graham.
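To make the idea of projection concrete for students, here’s a rough sketch in plain Python. It is not the Gephi procedure from the tutorial, just one simple way to do the same kind of projection: two students get linked whenever they picked the same answer, and the edge weight counts how many answers they share. The sample edge list and the function name `project_to_students` are made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical two-mode edge list of (student, answer) pairs,
# invented for illustration.
two_mode = [
    ("Ana", "Beloved"), ("Ana", "Clueless"),
    ("Ben", "Beloved"), ("Ben", "Alien"),
    ("Cal", "Dune"), ("Cal", "Alien"),
]


def project_to_students(edges):
    """Link two students once per shared answer; the count is the edge weight."""
    # Group students by the answer they chose.
    students_by_answer = defaultdict(set)
    for student, answer in edges:
        students_by_answer[answer].add(student)

    # Every pair of students who chose the same answer gains one unit of weight.
    weights = Counter()
    for students in students_by_answer.values():
        for a, b in combinations(sorted(students), 2):
            weights[(a, b)] += 1
    return weights


one_mode = project_to_students(two_mode)

print("Source,Target,Weight")
for (a, b), weight in sorted(one_mode.items()):
    print(f"{a},{b},{weight}")
```

In this toy data, Ana and Ben end up connected through a shared book, and Ben and Cal through a shared movie. The answers themselves disappear from the graph, which is exactly the difference between the two-mode and one-mode versions.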

Make an alluvial diagram

wpid1848-media_1429467279291.png

You can do this with the class. Use RAW to make alluvial diagrams from the raw dataset, experimenting with different categories. It’s fun to see the various relationships between, say, book and movie preferences.
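If students ask what the diagram is actually drawing, it can help to show the counts behind it: each band in an alluvial diagram is as wide as the number of respondents who share a given pair of answers. Here’s a small sketch that tallies those pairs for two categories. The responses and the column names are invented for illustration, and this is only a way to inspect the numbers, not a description of how RAW works internally.

```python
from collections import Counter

# Hypothetical raw responses, one dict per student; invented for illustration.
responses = [
    {"Name": "Ana", "Book": "Beloved", "Movie": "Clueless"},
    {"Name": "Ben", "Book": "Beloved", "Movie": "Alien"},
    {"Name": "Cal", "Book": "Dune", "Movie": "Alien"},
    {"Name": "Dee", "Book": "Beloved", "Movie": "Alien"},
]


def flow_counts(rows, left, right):
    """Count how many respondents chose each (left answer, right answer) pair."""
    return Counter((row[left], row[right]) for row in rows)


flows = flow_counts(responses, "Book", "Movie")

# Largest flows first, i.e. the widest bands in the diagram.
for (book, movie), count in flows.most_common():
    print(f"{book} -> {movie}: {count}")
```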

Make network graphs

wpid1849-media_1429467435887.png

When the class is ready, move on to using the datasets to show which students have the most in common. Here’s a tutorial I prepared for students to use with this dataset (names have been blurred out). (And here’s a Word version of the Gephi tutorial, in case you’d like to alter it.)

Start with the two-mode network diagram, and when the class is ready, move on to the one-mode. Students really enjoyed seeing who had the most in common, examining the communities Gephi was able to detect, and comparing those communities to their own groups of friends.