Our dataset consists of multiple Excel sheets with thousands of records. Initially, it was overwhelming. My group did not know where to start, since the information seemed to be all over the place. But with OpenRefine, we are able to slowly condense the data down to what we actually need.
It took me a while to get used to OpenRefine. The first time I uploaded my data set, it did not load. However, when I re-uploaded it, it went through successfully and I was able to experiment with OpenRefine. Since our data is heavily based on location, I split it into facets to see where the majority of publishers were located. It seems that most of them are based in the United States, so knowing this alone, I think my group will focus our research on the United States.
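As a rough sketch of what a facet on location does, the same count can be reproduced outside OpenRefine in a few lines of Python. The column names `publisher` and `country` and the sample rows are placeholders I made up for illustration, not our actual data. The whitespace trimming and case normalization is an extra cleaning step that I am assuming is appropriate; it is not something a text facet does by itself.

```python
from collections import Counter

# Hypothetical placeholder rows; the real sheets have different columns and values.
rows = [
    {"publisher": "Publisher A", "country": "United States"},
    {"publisher": "Publisher B", "country": "united states "},
    {"publisher": "Publisher C", "country": "Canada"},
]


def facet_counts(rows, column):
    """Count rows per value of `column` after trimming whitespace and fixing case."""
    return Counter(row[column].strip().title() for row in rows)


counts = facet_counts(rows, "country")

# List the values from most to least common, similar to sorting a facet by count.
for value, count in counts.most_common():
    print(value, count)
```

Normalizing the values before counting keeps small spelling differences from splitting one country into several facet entries.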
One issue I had with my data was that the number of publishers is so large that, even with OpenRefine, it is hard to tell who the individual publishers were. Because each publisher relocated a couple of times, I want to be able to see how many times each publisher relocated. This will allow us to piece together little details in our research.
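One possible way to count relocations, sketched in Python under some assumptions: each record has a publisher name, a year, and a location, and a relocation is any change of location between consecutive records for the same publisher. The field layout, names, years, and cities below are all hypothetical placeholders, not values from our sheets.

```python
from collections import defaultdict

# Hypothetical placeholder records: (publisher, year, location).
records = [
    ("Publisher A", 1900, "City X"),
    ("Publisher A", 1905, "City Y"),
    ("Publisher A", 1910, "City Y"),
    ("Publisher A", 1915, "City X"),
    ("Publisher B", 1900, "City Z"),
]


def count_relocations(records):
    """Return {publisher: number of location changes in chronological order}."""
    by_publisher = defaultdict(list)
    for publisher, year, location in records:
        by_publisher[publisher].append((year, location))

    relocations = {}
    for publisher, entries in by_publisher.items():
        entries.sort()  # chronological order, by year
        locations = [location for _, location in entries]
        # A relocation is counted whenever consecutive records differ in location.
        relocations[publisher] = sum(
            1 for prev, cur in zip(locations, locations[1:]) if prev != cur
        )
    return relocations


relocations = count_relocations(records)
print(relocations)
```

Counting changes between consecutive records, instead of counting distinct locations, means a publisher that moves away and later moves back is counted as relocating twice.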
Although our data set still seems like a lot, OpenRefine helps with the process of cleaning the data. It allows us to group certain categories together and determine whether or not the information is necessary. It also lets us select the columns that we want to target and focus on.