OpenRefine

This week's blog topic was my favorite so far and definitely very useful. Getting my feet wet with OpenRefine has shown me how valuable organizing data can be, and how it can be applied to my group project.

Our group was assigned a decently sized dataset about graphic novels. It includes columns for the authors' names, countries of origin, and race, along with many others. The first of many ways OpenRefine can be useful to us is in cleaning up the Firstname column. Many different versions of the same name are likely to come up, so narrowing those down will definitely help. Gender is one of the key factors in our research question, so a cleaner list of names will help us analyze the data regarding gender.
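For anyone curious what this kind of name cleanup looks like outside of OpenRefine, here is a rough Python sketch. It groups spellings that differ only in case, spacing, punctuation, or accents by reducing each one to a simplified key. This is loosely modeled on the idea of key-based clustering and is not OpenRefine's actual implementation. The sample names and the function names `fingerprint` and `cluster_names` are made up for illustration and do not come from our dataset.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a name to a simplified key so near-identical spellings match."""
    # Split accented letters into base letter + accent mark, then drop the marks.
    text = unicodedata.normalize("NFKD", value)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Ignore surrounding whitespace and letter case.
    text = text.strip().lower()
    # Remove punctuation (anything that is not a word character or whitespace).
    text = re.sub(r"[^\w\s]", "", text)
    # Sort and de-duplicate the tokens so word order does not matter.
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


def cluster_names(names):
    """Group names by key; keep only groups with more than one distinct spelling."""
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return {key: found for key, found in groups.items() if len(set(found)) > 1}


# Hypothetical values standing in for a Firstname column.
sample_names = ["Alison", "alison ", "ALISON", "Chris", "Chris.", "José", "Jose", "Marjane"]
clusters = cluster_names(sample_names)

for key, variants in clusters.items():
    print(key, "->", variants)
```

Each group that comes back is a candidate for merging into one spelling. A person should still review the groups, since two different people can share a simplified key.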

Another way I'm interested in using OpenRefine for the project is getting a clear count of the graphic novels' countries of origin. By using the count organizer, we can see how many books came from each place and easily compare the various places of origin. Along with gender and race, origin is one of the key points in our research question, so a straightforward view of the data will help us discuss the information we've been given fluently and analyze it further.
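The counting idea can also be sketched in a few lines of Python. This is only an illustration of tallying values in one column, not how OpenRefine does it internally. The column name "Country", the sample rows, and the helper `count_values` are assumptions made up for the example.

```python
from collections import Counter


def count_values(rows, column):
    """Tally how often each value appears in one column of a list of row dicts."""
    counts = Counter()
    for row in rows:
        # Trim stray spaces so "USA " and "USA" are counted together.
        value = (row.get(column) or "").strip()
        # Track empty cells under their own label instead of dropping them.
        counts[value if value else "(blank)"] += 1
    return counts


# Hypothetical rows standing in for the graphic novel dataset.
sample_rows = [
    {"Title": "Book A", "Country": "USA"},
    {"Title": "Book B", "Country": "Japan"},
    {"Title": "Book C", "Country": "USA "},
    {"Title": "Book D", "Country": "France"},
    {"Title": "Book E", "Country": ""},
]

country_counts = count_values(sample_rows, "Country")

# List the countries from most to least common.
for country, n in country_counts.most_common():
    print(f"{country}: {n}")
```

Keeping blank cells as their own category makes it obvious how much of the column is missing before comparing places of origin.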

One comment

  1. Audie, nice to see that we both got something out of this week's blog post. I definitely feel like the work we're doing is hands-on, something we can use moving forward.
