OpenRefine for Graphic Novels

After following the tutorial posted by the professor, I learned that OpenRefine is a really useful program that makes it easier to understand a dataset containing an almost endless amount of information. Being able to collapse hundreds, if not thousands, of data points into categories that are easier to digest is an invaluable advantage. Facets in OpenRefine do this by grouping the values in a column, and the resulting groups can be sorted by name or by count. It is also extremely convenient that OpenRefine can detect and merge values that are obviously meant to be the same but differ through small human errors, such as minor spelling or capitalization differences.
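To make the idea concrete, here is a rough Python sketch of what a text facet and a cluster-and-merge step do. It is not OpenRefine's actual code: the `fingerprint` function is only loosely modeled on the key-collision idea (trim, lowercase, strip punctuation, sort the words), and the publisher names are made-up sample values, not rows from our dataset.

```python
import string
from collections import Counter, defaultdict


def fingerprint(value: str) -> str:
    """Reduce a value to a normalized key so near-duplicates collide.

    Simplified assumption of a key-collision approach: trim, lowercase,
    drop ASCII punctuation, then sort and de-duplicate the words.
    """
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def text_facet(values):
    """Count how often each exact value appears, like a text facet."""
    return Counter(values)


def cluster(values):
    """Group distinct values that share a fingerprint.

    Only groups with more than one spelling are returned, since those
    are the candidates for merging.
    """
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}


# Hypothetical sample column (not from the real dataset).
publishers = [
    "Image Comics",
    "image comics",
    "Image Comics ",
    "Dark Horse",
    "DC Comics",
    "Dc comics",
]

if __name__ == "__main__":
    # Facet view: every distinct spelling with its count, most common first.
    for value, count in text_facet(publishers).most_common():
        print(repr(value), count)

    # Cluster view: spellings that probably mean the same thing.
    for key, variants in cluster(publishers).items():
        print(key, "->", variants)
```

In this sample, the facet treats all six spellings as separate values, while the cluster step groups the three "Image Comics" variants together and the two "DC Comics" variants together, which is the kind of cleanup described above.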

Although our dataset and its subcategories are quantifiable like any other dataset, our topic doesn’t really pose many questions that can be answered with numbers. I hope that by using OpenRefine to generate several categories, we can find the best approach as a team to represent the answers to our questions.

 

4 comments

  1. Hi there,

    You made a great point here by mentioning how cleaning the data enhances its representation. Especially for data visualization, cleaning up and better organizing the data will make it much easier for us to generate succinct, easy-to-understand visuals.

  2. Hi,
    I do agree that data cleaning is important to the representation of a dataset. I have done some data visualization projects before, and I can conclude that an organized dataset will not only save time but also help you build a more succinct data representation project.

  3. Hi,

    I have the same dataset and am having similar issues with categorizing this data in a quantifiable manner. I also agree about the importance of data cleaning and its effect on how the dataset will be presented; it helps a great deal when you start to work on the data visualization.

  4. Great point in mentioning how your topic does not pose numerical questions. I have that same issue, but I’m sure OpenRefine will help us represent our answers in a creative way. Nice job!
