Visualize a network of film casts and crews

A friend of mine wrote to me recently with a request. For his dissertation, he’s unearthing the filmmaking culture of a particular time and place. “I keep running across these names of actors and filmmakers,” he wrote, “and I know I’ve seen them before, but I can’t remember all the relationships. Is there a way I can visualize the overlapping networks of people within this culture?”

There is! To demonstrate, I’ll use the last dozen films of one of my favorite filmmakers, Alfred Hitchcock. This is a fun way to get started making network visualizations.

Get some data

media_1319853289300.png

First, I needed a data source, and I chose IMDb. My friend, by contrast, could record his data as he unearths it over the course of his archival research.

Record your data in the form of subject-verb-object.

media_1319850188217.png

Using Excel (or your favorite spreadsheet application), create one column for “Person,” one column for “Relationship,” and a third column for “Film.” You can choose the language you use to describe relationships, but be consistent. You’re welcome to download my spreadsheet as an example.

(Incidentally, there are ways to automate a process like this, though IMDb doesn’t provide an API and its terms of service disallow scraping. For our purposes, let’s assume you’re gathering your data by hand.)

Save your spreadsheet and give it a name you’ll remember.
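If you're comfortable with a little scripting, the same three-column layout can also be produced as a plain CSV file, which Cytoscape's table import accepts alongside Excel. What follows is only a rough sketch in Python, not how I built my own spreadsheet. The relationship words ("directed," "acted in," "composed music for") and the function names are my own illustrative choices, and the small check at the end flags relationship labels that differ only in capitalization or spacing, since consistency is the one rule that matters here.

```python
import csv
import sys

# Illustrative rows only: the relationship vocabulary here is an assumption,
# not necessarily the wording used in the example spreadsheet.
ROWS = [
    ("Alfred Hitchcock", "directed", "Psycho"),
    ("Anthony Perkins", "acted in", "Psycho"),
    ("Janet Leigh", "acted in", "Psycho"),
    ("Bernard Herrmann", "composed music for", "Psycho"),
    ("Alfred Hitchcock", "directed", "Vertigo"),
    ("James Stewart", "acted in", "Vertigo"),
    ("Bernard Herrmann", "composed music for", "Vertigo"),
]


def write_triples(rows, handle):
    """Write a header row plus one Person/Relationship/Film row per triple."""
    writer = csv.writer(handle)
    writer.writerow(["Person", "Relationship", "Film"])
    writer.writerows(rows)


def relationship_variants(rows):
    """Group relationship labels that differ only by case or spacing.

    Returns a dict mapping the normalized label to its sorted spellings,
    containing only the labels that were written more than one way.
    """
    seen = {}
    for _person, relationship, _film in rows:
        key = " ".join(relationship.lower().split())
        seen.setdefault(key, set()).add(relationship)
    return {key: sorted(spellings) for key, spellings in seen.items()
            if len(spellings) > 1}


if __name__ == "__main__":
    write_triples(ROWS, sys.stdout)
    for key, spellings in relationship_variants(ROWS).items():
        print(f"Inconsistent label {key!r}: {spellings}", file=sys.stderr)
```

To save a file instead of printing, pass a file opened with open("hitchcock.csv", "w", newline="") in place of sys.stdout. The filename is just a placeholder.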

Download Cytoscape

media_1319850408334.png

Head over to http://www.cytoscape.org. Cytoscape is a free, open-source platform for visualizing network data. It runs on Windows, Mac OS X, and Linux. Download and install Cytoscape as you would any other program.

Open Cytoscape

media_1319850637381.png

Double-click on the application’s icon. If a dialog box opens asking you to choose a template, just close the box by clicking on the red button. Don’t worry if the application looks baffling to begin with.

Import your spreadsheet

media_1319850812132.png

Head to File, Import, and then Network from Table (Text/MS Excel)…

Select your file and show text import options

media_1319850968325.png

  1. Click on Select File(s) and choose the spreadsheet you created.
  2. Click on Show Text File Import Options.

Tell Cytoscape how to interpret your data

media_1319851173591.png

  1. If you’ve labeled your columns in the first row of your spreadsheet (like I did), click the box next to Transfer first line as attribute names.
  2. For Source Interaction, choose Column 1.
  3. For Interaction Type, stick with Default Interaction.
  4. For Target Interaction, choose Column 3.
  5. Click Import.

A dialog box will appear to tell you that you’ve successfully imported your data. Click Close.
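If it helps to see what those import settings amount to, here is a small Python sketch of the same idea: column 1 becomes the source of each edge, column 3 becomes the target, and whatever is left over (here, Relationship) rides along as an attribute of the edge. This is my own approximation, not Cytoscape's actual import code, and the function name, its parameters, and the sample rows are all made up for illustration.

```python
import csv
import io

# Stand-in for the saved spreadsheet, exported as CSV.
# The relationship words are illustrative.
TABLE = """Person,Relationship,Film
Alfred Hitchcock,directed,Psycho
Anthony Perkins,acted in,Psycho
Bernard Herrmann,composed music for,Psycho
"""


def read_edges(handle, source_col=0, target_col=2, first_row_is_header=True):
    """Turn table rows into edges.

    Column positions are 0-based, so 0 and 2 correspond to "Column 1" and
    "Column 3" in the import dialog. Every other column is kept as an
    attribute of the edge.
    """
    rows = [row for row in csv.reader(handle) if row]  # skip blank lines
    if not rows:
        return []
    if first_row_is_header:
        names, rows = rows[0], rows[1:]
    else:
        names = [f"Column {i + 1}" for i in range(len(rows[0]))]
    edges = []
    for row in rows:
        attributes = {
            names[i]: value
            for i, value in enumerate(row)
            if i not in (source_col, target_col)
        }
        edges.append({
            "source": row[source_col],
            "target": row[target_col],
            "attributes": attributes,
        })
    return edges


if __name__ == "__main__":
    for edge in read_edges(io.StringIO(TABLE)):
        print(edge)
```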

You’ve got a (confusing) visualization!

media_1319851509279.png

Hey, you’ve got something! But what is it? Each circle, called a node, represents either a film or a person. Each line, called an edge, represents a relationship. The whole thing — all the lines and circles — is called a network.
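To make those three terms concrete, here is a minimal Python sketch that builds the same kind of structure from a handful of rows: a set of nodes, a list of edges, and a count of how many edges touch each node. The sample rows and function names are my own, chosen only for illustration, and the sketch assumes that no person shares a name with a film.

```python
from collections import defaultdict

# Illustrative (person, relationship, film) rows; the relationship words
# are an assumption, not the example spreadsheet's vocabulary.
TRIPLES = [
    ("Alfred Hitchcock", "directed", "Psycho"),
    ("Anthony Perkins", "acted in", "Psycho"),
    ("Bernard Herrmann", "composed music for", "Psycho"),
    ("Alfred Hitchcock", "directed", "Vertigo"),
    ("James Stewart", "acted in", "Vertigo"),
    ("Bernard Herrmann", "composed music for", "Vertigo"),
]


def build_network(triples):
    """Return (nodes, edges): every person or film is a node, every row an edge."""
    nodes = set()
    edges = []
    for person, relationship, film in triples:
        nodes.update([person, film])
        edges.append((person, film, relationship))
    return nodes, edges


def degree(edges):
    """Count how many edges touch each node."""
    counts = defaultdict(int)
    for source, target, _relationship in edges:
        counts[source] += 1
        counts[target] += 1
    return dict(counts)


if __name__ == "__main__":
    nodes, edges = build_network(TRIPLES)
    print(f"{len(nodes)} nodes, {len(edges)} edges")
    for name, count in sorted(degree(edges).items()):
        print(f"{name}: {count}")
```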

Start making sense of your network

media_1319851669450.png

  1. As a start, click on the little network diagram (hover over for a tool tip) to create a force-directed layout. Already that’s much better!
  2. Expand the size of your sheet by clicking and dragging the corner.
  3. Zoom in by clicking on the magnifying glass and then clicking on the diagram.
  4. Magnify different parts of your network by dragging around the shaded window.

You can also change the location of individual nodes by clicking and dragging them.

Cool, huh? Now you have a big-picture view of your network, and you can already start to ask questions. For example, why is Frenzy (up in the top right-hand corner) the only film in which no cast or crew member is connected to any of Hitchcock’s previous networks?
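A question like that can also be checked outside the picture. The Python sketch below groups nodes into connected clusters and reports any film that sits in a cluster by itself, meaning none of its people appear on another film. Treat it as a toy: the rows are a tiny hand-picked sample, so Frenzy's isolation here follows from the sample rather than from the full credits. I've also left Hitchcock out, since as director he would tie every film to every other. The function names are my own.

```python
from collections import defaultdict, deque

# A tiny illustrative sample, not the full data set.
TRIPLES = [
    ("James Stewart", "acted in", "Vertigo"),
    ("Kim Novak", "acted in", "Vertigo"),
    ("Bernard Herrmann", "composed music for", "Vertigo"),
    ("Cary Grant", "acted in", "North by Northwest"),
    ("Bernard Herrmann", "composed music for", "North by Northwest"),
    ("Anthony Perkins", "acted in", "Psycho"),
    ("Bernard Herrmann", "composed music for", "Psycho"),
    ("Jon Finch", "acted in", "Frenzy"),
    ("Barry Foster", "acted in", "Frenzy"),
]


def components(triples):
    """Group people and films into connected clusters (breadth-first search)."""
    neighbours = defaultdict(set)
    for person, _relationship, film in triples:
        neighbours[person].add(film)
        neighbours[film].add(person)
    seen = set()
    groups = []
    for start in list(neighbours):
        if start in seen:
            continue
        group = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            group.add(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        groups.append(group)
    return groups


def isolated_films(triples):
    """Films whose cluster contains no other film, sorted by title."""
    films = {film for _person, _relationship, film in triples}
    return sorted(
        next(iter(group & films))
        for group in components(triples)
        if len(group & films) == 1
    )


if __name__ == "__main__":
    print(isolated_films(TRIPLES))
```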

Customize your visualization

media_1319852120117.png

Cytoscape allows you to change almost everything about your network visualization, from the color of the background to the size of the font. It comes loaded with several preexisting templates. To get access to them, click on the right-pointing arrow directly to the right of Network in the sidebar.

From the dropdown menu labeled Current Visual Style, try selecting different options. Each one will change the look of your visualization.

Tell Cytoscape that you’d like to see relationships, too

Screen_Shot_2011-10-28_at_9.45.15_PM.png

You’ve got a cool network visualization, but you don’t yet know who did what. To get Cytoscape to display this information, click on the words Double-Click that appear to the right of Edge Label. From the drop-down menu, choose Relationship. Then click on Mapping Type and choose Passthrough Mapper.
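"Passthrough" just means that the value stored in the chosen column is passed straight through to the label, unchanged. Here is a tiny Python sketch of that idea. It is my own illustration rather than Cytoscape's code, and the edge structure, function names, and relationship words are assumptions.

```python
# Edges as they might look after import, with the "Relationship" column
# stored as an edge attribute. The relationship words are illustrative.
EDGES = [
    {"source": "Alfred Hitchcock", "target": "Psycho",
     "attributes": {"Relationship": "directed"}},
    {"source": "Janet Leigh", "target": "Psycho",
     "attributes": {"Relationship": "acted in"}},
]


def passthrough_label(edge, column):
    """Use the attribute's value, unchanged, as the label (empty if missing)."""
    value = edge["attributes"].get(column)
    return "" if value is None else str(value)


def label_edges(edges, column="Relationship"):
    """Return (source, label, target) for every edge."""
    return [
        (edge["source"], passthrough_label(edge, column), edge["target"])
        for edge in edges
    ]


if __name__ == "__main__":
    for source, label, target in label_edges(EDGES):
        print(f"{source} --[{label}]--> {target}")
```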

Now you can see who did what!

media_1319853024939.png

OK, it’s not perfect, but you’ve got a network! In a subsequent post, I show how to further customize its look. If you’d like to save views of your network, click on the camera icon and save your view in the format of your choice. And have fun exploring the network you’ve created.