This week I explored OpenRefine and learned how to clean up my data and make it easier to analyze. As someone who works mostly with Excel, I noticed that some functions in OpenRefine, specifically the “split multi-valued columns” function, are very similar to ones that exist in Excel. Nonetheless, I found OpenRefine an excellent resource for cleaning up data!
My group is working with the Scottish Witch Trials data, and I would use OpenRefine to separate the names into first and last names using the “split multi-valued columns” function. I would also clean up several variables, like the county names or even the complaints and notes columns, as I noticed that some words are written using Scottish spelling conventions. Cleaning and bundling up all of this information will make analysis easier.
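To make those two cleaning steps concrete, here is a minimal Python sketch of the same idea outside OpenRefine: splitting a full name at the first space, and mapping spelling variants to one standard form. The function names, the sample names, and the spelling variants in the lookup table are all made up for illustration and are not taken from the actual dataset.

```python
def split_name(full_name):
    """Split a full name into (first, last) at the first run of whitespace."""
    parts = full_name.strip().split(None, 1)
    first = parts[0] if parts else ""
    last = parts[1] if len(parts) > 1 else ""
    return first, last


# Hypothetical spelling variants mapped to one standard form.
# A real table would have to be built by inspecting the data.
COUNTY_SPELLINGS = {
    "edinburghe": "Edinburgh",
    "hadingtoun": "Haddington",
}


def normalize_county(value):
    """Return the standard spelling if the value is a known variant."""
    cleaned = value.strip()
    return COUNTY_SPELLINGS.get(cleaned.lower(), cleaned)


if __name__ == "__main__":
    print(split_name("  Janet Smith "))
    print(normalize_county(" Edinburghe "))
```

Splitting at the first space is a simplification: a name with more than two parts keeps everything after the first word as the last name, so those rows would still need a manual check.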
Additionally, because our dataset has a lot of data types, and a huge portion of it is dedicated to the type of witchcraft performed, I would love to see more sophisticated functions that can group these data together to make them more comprehensive and analyzable. This would help my group greatly, not only in terms of analyzing the data, but also in visualizing and presenting them.
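One way this kind of grouping could work is a "key collision" approach, similar in spirit to the fingerprint method in OpenRefine's clustering feature: each value is reduced to a simplified key, and values that share a key are treated as the same thing. The sketch below is my own rough approximation in Python, not OpenRefine's actual code, and the sample labels are invented for illustration.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Reduce a value to a simplified key for comparison."""
    # Lowercase, trim, and strip accents down to plain ASCII.
    text = unicodedata.normalize("NFKD", value.strip().lower())
    text = text.encode("ascii", "ignore").decode("ascii")
    # Drop punctuation, then sort and de-duplicate the words.
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values whose fingerprints match; keep groups with real variants."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


if __name__ == "__main__":
    # Invented sample labels, not real rows from the dataset.
    labels = ["Folk healing", "folk  healing.", "healing, Folk", "Maleficium"]
    print(cluster(labels))
```

This only catches differences in case, punctuation, spacing, and word order. Variants that are actually spelled differently would need a looser comparison, such as an edit-distance method.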
Overall, I really enjoyed exploring the OpenRefine software, and I will definitely use it to clean up the data before going in-depth in our analysis for the project!
I agree with most of what you said about cleaning up the Scottish Witch Trials dataset! Although it’s true that OpenRefine has some functions similar to Excel’s, it is definitely useful for cleaning up data and clustering related columns together. Pointing out small details such as the Scottish spelling conventions will definitely help us clean them up later and merge variables into more useful subsets.