{"id":1934,"date":"2017-10-31T22:19:26","date_gmt":"2017-11-01T05:19:26","guid":{"rendered":"http:\/\/miriamposner.com\/classes\/dh101f17\/?p=1934"},"modified":"2017-10-31T22:19:26","modified_gmt":"2017-11-01T05:19:26","slug":"using-openrefine-with-cmoa-photography-data","status":"publish","type":"post","link":"http:\/\/miriamposner.com\/classes\/dh101f17\/2017\/10\/31\/using-openrefine-with-cmoa-photography-data\/","title":{"rendered":"Using OpenRefine with CMOA Photography Data"},"content":{"rendered":"<p>When I initially opened my group&#8217;s dataset of the Carnegie Museum of Art Photography Collection, I was overwhelmed by the 3728 records and 26 columns of information included. It was daunting to think that we&#8217;d have to clean up all of this data to uncover useful information for our research questions while excluding extraneous data. Learning how to use OpenRefine for this week&#8217;s blog post, however, eased some of my anxiety. I found the facet\/filter options helpful because they allow us to select certain subsets of data and edit, merge, and re-cluster them.
I also enjoyed being able to examine each column more closely and clean it up.<\/p>\n<p>I attempted to manipulate the data to make it more useful for answering my group&#8217;s research questions:<\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">What locations (country, city) do these photographs originate from?<\/span>\n<ul>\n<li>I used Merge Selected &amp; Re-Cluster for New York City, Pittsburgh, and Perth, and edited the text facet values for Philadelphia, San Francisco, Santa Monica, West Carlisle, and Hoboken.<\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">What kind of subjects are depicted in the photographs?<\/span>\n<ul>\n<li>I could not manipulate this data because the main title that identifies each object or artwork had no repeated values to merge or cluster.<\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">How are the photographs connected: by artist, origin, or content?<\/span>\n<ul>\n<li>I used a Text Facet on the full_name column to find artists who made significant contributions to the museum (those with 50-500 records). Then, for one artist, Clyde Hare, I looked at the titles (subjects) and creation dates of his 142 records.<\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">How is race\/ethnicity represented in the photographs?<\/span>\n<ul>\n<li>The closest I could come to examining this was by looking at the nationality column and cleaning up and re-clustering some of its data.<\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">How is gender represented in the photographs?<\/span>\n<ul>\n<li>Gender is not represented in any of the columns.
It would be helpful to have this information so that we could compare the number of photographs taken by those identifying as female, male, or other.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>I also applied the To date common transform to the creation_date_earliest and creation_date_latest columns to keep their format uniform. In addition, I re-clustered and edited the medium column because some of its values were duplicates that differed only in capitalization or in an extra whitespace character.<\/p>\n<p>Overall, while the dataset requires much more refining, experimenting with OpenRefine&#8217;s easy-to-navigate, intuitive features has made me feel less overwhelmed by all the data.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When I initially opened my group&#8217;s dataset of the Carnegie Museum of Art Photography Collection, I was overwhelmed by the<\/p>\n","protected":false},"author":133,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_eb_attr":"","footnotes":""},"categories":[1],"tags":[],"class_list":["post-1934","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"jetpack_featured_media_url":"","_links":{"self":[{"href":"http:\/\/miriamposner.com\/classes\/dh101f17\/wp-json\/wp\/v2\/posts\/1934","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/miriamposner.com\/classes\/dh101f17\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/miriamposner.com\/classes\/dh101f17\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/miriamposner.com\/classes\/dh101f17\/wp-json\/wp\/v2\/users\/133"}],"replies":[{"embeddable":true,"href":"http:\/\/miriamposner.com\/classes\/dh101f17\/wp-json\/wp\/v2\/comments?post=1934"}],"version-history":[{"count":0,"href":"http:\/\/miriamposner.com\/classes\/dh101f17\/wp-json\/wp\/v2\/posts\/1934\/revisions"}],"wp:attachment":[{"href":"http:
\/\/miriamposner.com\/classes\/dh101f17\/wp-json\/wp\/v2\/media?parent=1934"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/miriamposner.com\/classes\/dh101f17\/wp-json\/wp\/v2\/categories?post=1934"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/miriamposner.com\/classes\/dh101f17\/wp-json\/wp\/v2\/tags?post=1934"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}