Week 5: Mistakes Are Inevitable in DH

When I read the description of the website The Real Face of White Australia, I was struck by how it explained the shortcomings of the face detection script behind it. While the creators have tried to weed out most of the inconsistencies, some faces of white people have escaped their notice. I was eager to see if I could spot one, and sure enough, after a few minutes of scrolling and exploring, I came across a Customs documentation portrait of a white man named Tom Solomon Toby. Even research projects of this scale have deficiencies in their data visualization. The problem does not lie with the data itself; it lies with the computer’s processing of the data. As we have learned in class, the world of the humanities is too complex to be fully processed by a computer, and this example shows how that limitation can carry over into Digital Humanities projects.

This reminded me of what Francesca warned us about in lab on Friday: the data visualization programs we learned about (Many Eyes, Tableau, and Palladio) may not process our data correctly. We must therefore be on the lookout for inconsistencies between our data and its visualization, and be prepared either to find a way around them or to explain why the irregularities have occurred.

The gap between an item search on a website and the wide variety of products that come up is an example of a discrepancy between what is listed in the database and what is shown in the visualization of that data. For example, on Etsy, an online marketplace for independent merchants, a search for “computer case” returns many different items. You can see the results for this search here: https://www.etsy.com/search?q=computer%20case&ref=auto1

In addition to actual laptop cases, the search results included laptop stickers, messenger bags, cosmetic bags, travel tags, and even a faux-crocodile handbag. There is nothing wrong with Etsy’s database; the problems arise when a search engine processes that data in order to display it on the website. Etsy could use a controlled vocabulary to align its database more closely with its search engine, minimizing ambiguous terms like “computer case” and thus streamlining the search process. Again, computers process things in a very rigid way that leaves little room for people’s tendency to use multiple vocabularies and express complex ideas.
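To make the idea of a controlled vocabulary more concrete, here is a minimal Python sketch. It is not how Etsy's search actually works; the vocabulary, the listings, and the function names `normalize` and `search` are all made up for illustration. It only shows the general principle: several ways of phrasing a search are mapped to one preferred term before the database is queried.

```python
# Hypothetical controlled vocabulary: each free-text variant a shopper
# might type is mapped to a single preferred term.
CONTROLLED_VOCABULARY = {
    "computer case": "laptop case",
    "laptop sleeve": "laptop case",
    "notebook case": "laptop case",
    "laptop sticker": "laptop decal",
}

# Hypothetical listings, each tagged with one preferred term.
LISTINGS = [
    {"title": "Felt laptop sleeve, 13 inch", "category": "laptop case"},
    {"title": "Vinyl galaxy sticker for laptops", "category": "laptop decal"},
    {"title": "Faux-crocodile handbag", "category": "handbag"},
]


def normalize(query):
    """Lowercase the query, collapse extra spaces, and look up its preferred term."""
    key = " ".join(query.lower().split())
    # Fall back to the cleaned-up query if it is not in the vocabulary.
    return CONTROLLED_VOCABULARY.get(key, key)


def search(query, listings=LISTINGS):
    """Return the titles of listings whose category matches the preferred term."""
    term = normalize(query)
    return [item["title"] for item in listings if item["category"] == term]


print(search("Computer Case"))
```

In this toy example, a search for "Computer Case" is first translated to the preferred term "laptop case", so only the laptop sleeve comes back and the handbag does not. The trade-off is that someone has to build and maintain the vocabulary by hand, which is where human judgment re-enters the process.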