Wordle on the NYC Tenements

Creating visualizations is often hard when the data is tricky. That is exactly what happened with this week’s blog post. I decided to get a head start on our big project by using my own data, as it would give me an excuse to really look into the data and make a visualization. My data consists of about 1100 photographs of New York City tenements taken by inspectors between 1934 and 1938. The issue with the data is that instead of a hyperlink for each photograph, there is a permalink that takes you to the collection website and shows you only that individual photograph (the archive database has no scrolling function). Additionally, because the label on all of them is “NYC Tenements” and there are only 5 different year options, neither of those fields gave me much to work with, so I decided to use the notes. The notes had a lot more information that could actually be used to create a visualization (disclaimer: I am sure I can create better digital representations of my data once we have moved further along in the course).

The notes contained information about the picture itself, such as “baby sitting on a bed”, generic information about what the photograph showed, such as “storefront”, and even the address where the photograph was taken. I copied all of the notes and pasted them into Wordle. While I waited for Wordle to create a “word cloud” of the most common words found in the description notes of over 1100 data entries, I expected to see words like “storefront” or “child” or “st” (because of the addresses) come out bigger than the rest. Instead, the result made me think about a whole other aspect of my data that I had not even considered exploring.

When the cloud arrived, these were the huge words: Manhattan, Brooklyn, and Bronx. That’s when I thought that maybe instead of focusing so much on what was in the picture, I could categorize the photographs according to where in New York each one was taken. I already knew that those were boroughs immigrants flooded into at that time, and thought that could have something to do with why the photos showed small enclosed spaces with big families, crowded storefronts on building corners, tall buildings with many windows signaling many apartments, etc. Thanks to this word cloud, I was able to see that most of these photographs were taken in 3 specific boroughs, where before I was too busy focusing on what each photograph contained. Now, with this new outlook on my data, I can attack it in a way that is organized and much easier to manage. In other words: Online Visualizations 1, Excel Sheet 0.
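
For anyone curious about what a word cloud is actually measuring, here is a minimal sketch in Python of the counting step. This is not how Wordle itself is implemented, just an illustration of the idea. The sample notes, the tiny stop-word list, and the function name `word_counts` are all made up for the example and do not come from the real archive.

```python
import re
from collections import Counter

# Hypothetical sample notes, invented for illustration (not from the real archive).
SAMPLE_NOTES = [
    "Storefront with children, Manhattan",
    "Baby sitting on a bed, Brooklyn",
    "Storefront on a corner, Manhattan",
]

# A tiny, made-up stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "on", "with", "the", "of"}


def word_counts(notes, stop_words=frozenset()):
    """Count how often each word appears across all notes, ignoring case."""
    counts = Counter()
    for note in notes:
        # Lowercase the note and pull out runs of letters (apostrophes allowed).
        words = re.findall(r"[a-z']+", note.lower())
        counts.update(w for w in words if w not in stop_words)
    return counts


if __name__ == "__main__":
    counts = word_counts(SAMPLE_NOTES, STOP_WORDS)
    # Print the most frequent words; these would be the biggest in the cloud.
    for word, n in counts.most_common(3):
        print(word, n)
```

A word cloud then simply draws each word at a size that grows with its count, which is why place names repeated in almost every note end up dominating the picture.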

[Screenshot: Wordle word cloud of the photograph notes]

3 thoughts on “Wordle on the NYC Tenements”

  1. It was interesting seeing the word visualization. It definitely helps with seeing the bigger picture; as you said, it seems that most of the pictures were taken in Manhattan and Brooklyn. But now that we know this, I wonder if it’s possible to create another word visualization showing which words are most frequent if we exclude those two words. Although it’s definitely good to know where the pictures were taken, I think it would be interesting to see a visualization of the descriptions of the actual tenements rather than the locations. Also, I think excluding words like “with” and “street,” and merging different variations of words (e.g. apartment vs. apartments), could help with visualizing the actual conditions of the tenements.

  2. Great job! I really enjoy word visualizations; they are often used with political data (I think I remember one comparing the words chosen in the RNC acceptance speech by Trump vs. the DNC acceptance speech by Clinton that was particularly informative). I think word visualizations like these have further applications in showing bias, just as they do in politics, so that could be something to explore when working on your Tenement dataset project.

  3. This is terrific, Geraldine. I loved reading about your thought process and the way you’re approaching a very tricky dataset. I think you’re right; the Notes field has a lot of potential. And it’s so cool to see that visualizing your data helped you imagine a path you might not have discovered before!
