Decoding Netflix compared to Plateau People’s Web Portal

Source: http://www.slashgear.com/wp-content/uploads/2009/06/apple_macbook_pro_13-inch_teardown_1.jpg

This image shows all the bits and pieces that go into a MacBook computer. It reminds me of the difference between the Plateau People’s Web Portal and Madrigal’s Netflix exploration. Just as a computer must be built before it can be broken down, a website must be put together before the semantics that make up that website can be revealed.

The difference between creating content and cracking the method by which content is created is the difference between building up and breaking down. Thousands of memories and years of history go into creating the background of the Plateau People, just as thousands of directors and actors put together the movies that make up Netflix’s endless categorization. All those moments and all those movies were meticulously uploaded into a database, then made public for browsing and viewing. Breaking down that content, however, takes just a few people and some really good data-gathering software. This can be seen when comparing Madrigal’s “How Netflix Reverse Engineered Hollywood” and Washington State University’s “Plateau People’s Web Portal”.

Clicking through to the “About” page on the Plateau People’s Web Portal reveals that “tribal administrators, working with their tribal governments, have provided information and their own additional materials to the portal as a means of expanding and extending the archival record.” Memories, artifacts, dates and events were used to create a comprehensive history of the Plateau people. The curators pulled out the most potent pieces of information, deciding what must be shown and what can be left out. All this human effort keeps the website from showing random outliers.

To crack Netflix’s “alt-genre” movie categorization algorithm, Madrigal used a plethora of software and equations. He states that the programs took over 20 hours to grab all of Netflix’s possible URLs and patterns, a feat that would have taken years to accomplish without them. Although interpreting the data and finding patterns in it could only have been done by a human, he relied heavily on technology to gather the information he decoded.

At the end of the article, Madrigal reveals the Perry Mason effect: there is a surprisingly large number of categories for a person most Americans today cannot name. This makes it clear that the algorithm cannot decide which information is unimportant or an outlier.

Altogether, this shows that although equations and technology are both essential in cataloguing the information we use today, there is no substitute for human effort.