Week 3: Hybrid Human and Machine Intelligence: Pushing Boundaries

Reading Alexis Madrigal’s article, “How Netflix Reverse Engineered Hollywood,” made me even more interested in the human work and software used to classify, or in this case microtag, information on the Internet. Before reading it, I did not know what tagging or microtagging data on the Internet consisted of. What interested me most was that Netflix altered the system of tagging and microtagging by going deeper into the content of the movies featured on the website. In order to gain more content-based information and make the Netflix experience personal, Netflix’s Vice President of Product, Todd Yellin, and a Netflix crew used a mix of human and machine intelligence to create the Netflix Quantum Theory system of tagging movies and shows. The idea of combining human and machine intelligence into a more powerful system reminds me of the book The Singularity Is Near, written by computer scientist, futurist, and inventor Ray Kurzweil.

Ray Kurzweil is a futurist, or in his terms, a Singularitarian. In his book he explains his theory of the Singularity. The Singularity Is Near defines the Singularity as “a future period during which the pace of technological change will be so rapid, its impact so deep, that human life will be irreversibly transformed.” Kurzweil’s idea of the Singularity emphasizes the emergence of transhumanism: technological progress so immense that it will lead humanity to transcend itself and become a post-human race.

Obviously the Netflix Quantum Theory is not as extreme as Ray Kurzweil’s theory of the Singularity; however, Yellin’s system of combining human and machine intelligence is comparable to Kurzweil’s vision of a human-intelligent society transforming into a machine-intelligent one. Kurzweil thinks that, in order to become the most intelligent and successful society, the human race must first combine with machine technology and later transform into a post-human, machine-intelligent society. Technology like Netflix’s Quantum Theory system is constantly improving and becoming more personalized to the people using it. Netflix is not capable of taking over the human race, but the Quantum Theory is learning to ‘outsmart’ consumers by categorizing information about their interests in a way that manipulates them into continuing to watch Netflix movies and shows. After gaining an understanding of Netflix’s Quantum Theory, Madrigal even said, “But if Netflix’s system didn’t already exist, most people would probably say that it couldn’t exist either.” Who knows which boundaries of technology, in this case of categorizing and tagging data, will be broken next? A tagging system like Netflix’s has added to this whirlwind of out-of-the-box yet logical systems, and such systems can only improve on those that exist today.

Works cited:
Kurzweil, Ray. The Singularity Is Near: When Humans Transcend Biology. New York: Penguin Group, 2005. Print.

Week 3 Blog Post

After reading Alexis C. Madrigal’s article “How Netflix Reverse Engineered Hollywood,” I began to think about how well the Internet knows us. Madrigal, in his article, describes how he uncovered Netflix’s 76,897 unique movie genres, a discovery that initially does not appear to have much significance beyond a good laugh over the ridiculous genre titles. However, it provides insight into how Netflix uses its “hybrid human and machine intelligence approach” to suggest movies to users based on the tags attached to each movie in their viewing history. Madrigal explains, “the underlying tagging data isn’t just used to create genres, but also to increase the level of personalization in all the movies a user is shown. So, if Netflix knows you love Action Adventure movies with high romantic ratings (on their 1-5 scale), it might show you that kind of movie, without ever saying, ‘Romantic Action Adventure Movies.’” I know that I, along with every other Netflix binger, have fallen victim to the recommendation feature; it is simply too hard to pass up.
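To make the quoted idea concrete, here is a minimal Python sketch of how per-movie tag ratings on a 1-5 scale could drive suggestions without ever naming a genre. The titles, tag names, scores, and the `taste_profile` and `recommend` functions are all made up for illustration; Netflix's actual system is not public, and this is only one simple way such matching might work.

```python
# Hypothetical catalog: every movie gets a 1-5 rating for every tag,
# not just the movies that would be labeled "romantic" or "action".
CATALOG = {
    "Movie A": {"action": 5, "adventure": 4, "romance": 4},
    "Movie B": {"action": 1, "adventure": 2, "romance": 5},
    "Movie C": {"action": 4, "adventure": 5, "romance": 1},
    "Movie D": {"action": 5, "adventure": 5, "romance": 5},
}


def taste_profile(history):
    """Average the tag ratings of the movies a user has already watched."""
    tags = CATALOG[history[0]].keys()
    return {t: sum(CATALOG[m][t] for m in history) / len(history) for t in tags}


def recommend(history, top_n=2):
    """Rank unwatched movies by how close their tag ratings are to the profile."""
    profile = taste_profile(history)

    def distance(movie):
        # Squared difference between the movie's ratings and the user's profile.
        return sum((CATALOG[movie][t] - profile[t]) ** 2 for t in profile)

    unwatched = [m for m in CATALOG if m not in history]
    return sorted(unwatched, key=distance)[:top_n]


print(recommend(["Movie A"], top_n=1))  # ['Movie D']
```

A viewer who watched only the romantic action-adventure "Movie A" is shown "Movie D" first, even though no genre label was ever consulted; only the underlying ratings were compared.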

Soon after reading Madrigal’s article, I began to realize that a majority of the websites and apps that I use on a daily basis attempt to manipulate their users in similar ways. Facebook, Instagram, and iTunes all employ specific methods to recommend certain products to their users. Instagram uses two different methods. First, immediately after you follow a high-profile page, a tab drops down recommending three other pages that you might be interested in. Second, the Explore page displays pictures based on whom your friends follow and on pictures you have previously viewed or liked. iTunes recommends music in similar ways: at the bottom of every page there are “Listeners Also Bought” and “Genius Recommendation” sections, which suggest music based on your purchase history.

Facebook, comparably, recommends pages to follow, but it also uses cookies. A cookie is “a small piece of data sent from a website and stored in a user’s web browser while the user is browsing that website. Every time the user loads the website, the browser sends the cookie back to the server to notify the website of the user’s previous activity” (Wikipedia). The use of cookies is easy to notice in the advertisements displayed on your Facebook News Feed. I’ve always noticed that after I browse certain websites, my feed suddenly fills up with advertisements from those websites and others similar to them. It is crazy to think just how well the Internet knows you.
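The round trip in that definition can be sketched in a few lines of Python using the standard library's `http.cookies` module. The cookie name `last_viewed` and its value are invented for the example, and this only illustrates the general mechanism from the Wikipedia definition, not how Facebook or any advertiser actually uses cookies.

```python
from http.cookies import SimpleCookie

# 1. Website side: the site attaches a cookie to its response.
response_cookie = SimpleCookie()
response_cookie["last_viewed"] = "running-shoes"  # hypothetical name and value
set_cookie_header = response_cookie.output(header="Set-Cookie:")

# 2. Browser side: the browser stores the name=value pair and sends it
#    back in a Cookie header every time it loads the site again.
stored = {name: morsel.value for name, morsel in response_cookie.items()}
cookie_header = "; ".join(f"{name}={value}" for name, value in stored.items())

# 3. Website side again: the site reads the returned cookie and learns
#    what the user did on a previous visit.
returned = SimpleCookie()
returned.load(cookie_header)
previous_activity = returned["last_viewed"].value

print(set_cookie_header)   # Set-Cookie: last_viewed=running-shoes
print(cookie_header)       # last_viewed=running-shoes
print(previous_activity)   # running-shoes
```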

Week 3 Blog Post

There is a consistent theme in many articles relating to digital humanities: metadata is important, and good categorization of information is essential for a digital database, website or exhibit to be functional. An ontology is a “formal framework for representing knowledge…and that framework names and defines the types, properties and interrelationships of the entities in a domain of discourse.” (Wikipedia).

In Local-Global, J. Wallack and R. Srinivasan highlight the importance of intersecting thought-out ontologies with information systems. Ontologies that are mismatched “impede communities’ ability to impart and communicate information and states’ ability to fully understand the territories they govern.” (link to the article)

Although ontologies are meant to represent some sort of reality, they can also be used to shape new realities. If properly designed and executed, these information systems can serve as incredible tools. They can, and should, be used for better city planning (as discussed in J. Wallack and R. Srinivasan’s article), better health care systems, and more. The opportunities are endless, but realizing them takes intersecting human thought and decision making with the power of digital tools. The authors of Local-Global write about improvements that need to be made to information systems, but some success stories already exist. Pandora created what is, in my opinion, a still-unsurpassed music library by building the Music Genome Project, and Netflix created an online movie library. Both systems learn your preferences and tailor a unique experience. They have an extensive vocabulary and grammar for categorizing and describing their content, and they rely on impressive algorithms written to learn about the user. For both of these systems to be successful, though, it took a perfect marriage of human intuition and decision making with the searching, learning, and sorting functions of digital tools. I haven’t really heard anyone describe this better than A. Madrigal in How Netflix Reverse Engineered Hollywood: “to me, that’s the key step: It’s where the human intelligence of the taggers gets combined with the machine intelligence of the algorithms.”

I extrapolate this “perfect marriage” to the interaction of humans and technology in general, not just in information systems. Right now, wearable technology is growing in popularity and market size. There are tons of wearables already on the market, and new ones continue to emerge: Forbes covered Microsoft’s announcement of a wearable expected to be released this holiday season, which got a lot of attention. The “coming-soon” wearable that caught my attention, though, was will.i.am’s new PULS. Wearables are remarkable new pieces of technology: not only do they incorporate many of the same functions as your smartphone, but they also serve as metadata collectors. (PULS reportedly can even read your emotions!)

I have been most interested in the fitness and health trackers included in these devices. Wearables collect all kinds of information about you (your sleep habits, your daily activity, and your levels of exertion, among many others) and then present that information back to you in a way that can shape your decisions and future behavior. These devices connect humans to technology in a newly involved way. And although I have been impressed by and interested in all of the functions of these wearables, I have resisted entering the market. For some reason the watches (or ‘cuffs,’ as will.i.am calls his new PULS devices) did not seem “human” enough. To me, they were all ugly and clunky, and certainly did not serve as a fashion statement. This is why will.i.am’s new PULS campaign caught my attention.

Human elements need to be incorporated into our development of new technology and information systems, and Pandora and Netflix serve as great testimonials. Will.i.am and his new brand FASHIONOLOGY believe something similar: that it is “inevitable that fashion and technology will come together.” People like me have been hesitant to enter the wearable market because it lacked a certain human element: fashionable design. In a recent press conference, former Vogue editor Andre Leon Talley stated, “it doesn’t matter if a gadget can organize my life and make my dinner, if it’s ugly to look at,” and I couldn’t agree more. He insists that closer collaboration between fashion and technology is urgently required, a collaboration between humans and technology that I think should be extended to all aspects of the digital humanities.

Check out will.i.am’s promotional video for his wearable cuff. His new brand i.am/FASHIONOLOGY seeks to take the wearable market mainstream, and once you see the video it’s hard to resist the movement.

Week 3: A Beautiful Complex

After completing the reading, I initially found Madrigal’s Netflix quest to be daunting and failed to see the significance behind his efforts. Simply put, Alexis Madrigal, with the aid of others, discovered that Netflix possessed 76,897 unique genres on its website. This indicated how precise and descriptive the teams of taggers at Netflix truly were. This high level of specificity has allowed Netflix to accurately suggest what subscribers should watch based on their viewing history.

 

I soon realized just how remarkable this complex and interrelated system was by comparing it to the densely packed World Wide Web. Michael Stevens from Vsauce goes into detail about the origins of the web and how it connects various sources in a nonlinear fashion. This relates to the Netflix article because all of the tags that are created are linked to one another at some level. As Madrigal describes, “every movie gets a romance rating, not just the ones labeled ‘romantic’ in the personalized genres.” Furthermore, every movie’s plot is tagged, as are the jobs of the characters and the locations. Thus, all of the movies share some degree of similarity, which makes the system ever more complex.

 

In the early days of the Internet, information was organized illogically through hierarchical methods. As Michael explains, that began to change when Tim Berners-Lee wrote a proposal to rethink how pieces of information were connected to one another. In his Information Management: A Proposal, Berners-Lee described a structure that would allow information to develop and evolve while reducing information loss. He argued that a “web” of notes with links between them would be far more useful than the fixed hierarchical systems in use at that time. In other words, documents would be connected to one another in nonlinear ways through hypertext, which ultimately gave rise to the web that runs on top of the Internet.

 

This all ties into the Netflix article once we acknowledge how intricate the system of organization has become. What started as simple relationships between pieces of information, or in Netflix’s case, tags, has developed into vast webs that have evolved through the continual ingestion of new data and algorithms. In the Netflix article, Todd Yellin, VP of product management, discusses how unexplainable occurrences within the system make life interesting through serendipity. He even states, “The more complexity you add to a machine world, you’re adding serendipity that you couldn’t imagine.” To think that in a digital world of 1’s and 0’s there can still be surprising elements that cannot be entirely foreseen is, in a sense, quite beautiful. Whether a bug or a feature, as both Todd and Madrigal described, these imperfections contribute to an intricately dense system and bring an element of surprise and excitement to a realm often perceived as one of rigid analytics.

 

Works Cited:

1. Alexis C. Madrigal, “How Netflix Reverse Engineered Hollywood,” The Atlantic, January 2, 2014

2. http://www.w3.org/History/1989/proposal.html

3. Video Provided in Post

The Relationship Between Netflix and Pinterest

When reading this week’s article about Netflix’s use of metadata to categorize genres, I was struck by the author’s question: “How do you systematically dismember thousands of movies using a bunch of different people who all need to have the same understanding of what a given microtag means?” This inquiry took me back to our discussion in class, where we talked about how assigning a category to something implies a belief about that item, or an ideology about the world, that may not be universally held. If it were up to the viewers to assign the categories, their differences in perspective would yield different interpretations of what the genres should be. Netflix addressed this problem by establishing a systematized rating system for different parts of movies; in other words, by turning to the actual content to speak for itself when choosing a label for it. In this way, many different tangible parts of a movie come together to create a single, specific, and coherent genre. Allowing the content itself to create the categories makes introspection possible (genres tell viewers not just what they would like, but what kinds of things they would like) and adds to the viewer’s discovery not just of movies, but of themselves.

This introspection reminded me of Pinterest’s “Guided Search” feature, described in the article “Pinterest puts metadata to good use with Guided Search” (http://www.techtimes.com/articles/6081/20140425/pinterest-puts-metadata-to-good-use-with-guided-search.htm ). Basically, the system uses user-generated metadata from the titles, comments, and descriptions of individual pins to classify them into sub-categories. These sub-categories pop up when a user makes a broad search, allowing him or her to narrow the search within the broad category. Because this metadata is derived from the actual content of the pins, users can stumble across subcategories that are actually pertinent to them, instead of being confined to the website’s thirty-two broad categories. The more tailored the search, the better the system can detect the user’s specific likes, and thus suggest more material it expects the user to like. Similar to Netflix, this process also encourages introspection in that it shows users what kinds of things they like, not just what they like. This reliance on content to complete the digital categorization of a topic mirrors that of the field of Digital Humanities in general. Our job is to unite human-created content with technologically created classification systems to enhance the way we discover, view, and analyze information.
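As a rough illustration of the idea, the Python sketch below derives narrower search terms from the text users attach to pins. The sample pins, the stopword list, and the `guided_terms` function are all hypothetical; Pinterest's actual Guided Search is not described in enough detail in the article to reproduce, so this is only a simple co-occurrence count under those assumptions.

```python
from collections import Counter

# Hypothetical pins: each entry is the user-written text attached to one pin.
PINS = [
    "easy vegan dinner recipes",
    "vegan dessert recipes for parties",
    "quick dinner recipes for two",
    "living room paint colors",
]

# Small, made-up list of filler words to ignore when suggesting terms.
STOPWORDS = {"for", "the", "and", "to"}


def guided_terms(query, pins, top_n=3):
    """Suggest narrower terms that co-occur with a broad query in pin text."""
    query_words = set(query.lower().split())
    counts = Counter()
    for text in pins:
        words = set(text.lower().split())
        if query_words <= words:  # the pin matches the broad search
            counts.update(words - query_words - STOPWORDS)
    # Most frequent first; ties are broken alphabetically so output is stable.
    ranked = sorted(counts, key=lambda word: (-counts[word], word))
    return ranked[:top_n]


print(guided_terms("recipes", PINS, top_n=2))  # ['dinner', 'vegan']
```

A broad search for "recipes" surfaces "dinner" and "vegan" as sub-categories simply because users wrote those words on matching pins, which is the sense in which the content creates the categories.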

Netflix Recommendations – Netflix + Scandinavian Folklore

Everyone who watches Netflix knows how easy it is to be physically unable to stop watching Netflix. This is partly because of the solid recommendations it provides, but also because of how awesome it is to stream movie after movie via Xbox on a Saturday night, accompanied by Lay’s potato chips and Diet Coke. Personally, I also noticed the strange genres that Netflix would come up with to link the movie I just watched to a potentially compatible movie that is one click away. Users like myself really do take for granted all the work and effort put into creating the metadata that classifies all the movies, simply so they can watch another one just like it in a matter of seconds. I also want to know where I can get a job that requires you to watch movies all day.

This recommendation feature on Netflix is very convenient for users, which made me think about how it could be applied to other media or resources. I was reminded of the program that is used in Scandinavian C171, a class I am taking about Scandinavian folk narrative. The professor spent many years writing a book (Danish Folktales, Legends, and Other Stories) that includes a CD with access to a digital database of thousands of mostly Danish folktales. This program uses metadata to classify the stories, and each has a call number (e.g. DS_VII_505) that resembles those used in libraries. Metadata is also used for recommending other tales, much like how Netflix recommends movies, except without the goofy genre titles.

Screenshot 2014-10-20 01.22.18 Screenshot 2014-10-20 01.22.28

As seen in the screenshots of the program above, the stories’ pages provide a vast amount of information. Not only do the pages provide the original manuscript transcription and translation, a map showing the story’s origin, and the dates when it was told, but they also have sections dedicated to associated keywords (blue), story indices (green), and recommended stories (red). These recommended stories, much like Netflix recommendations, let the user continue reading without stopping, which is completely possible because the recommended stories have their own recommended stories, which have their own recommended stories, and so on. Since this use of metadata for recommendation, as seen on Netflix, can also be applied to Scandinavian folktales, there is no limit to how other media could be grouped together and recommended.

Example keywords: mound dweller, troll, ghosts, mares, coins, bottle, toad

UCLA Dininghall metadata #yum

I absolutely loved the article about Netflix and felt like sharing it with everyone I know who watches Netflix. I do stand-up comedy, and the joke photos were so funny; it’s such good material to make jokes about. But honestly, the work done by Alexis Madrigal with computer programs like AntConc was incredible. Regarding Netflix, I sometimes don’t like how specific the altgenres get. For example, say you babysit a little kid and watch a show with them: for the next month you get suggestions for little-kid TV shows, which I always thought was kind of dumb. Sometimes I want to see things that are completely new to me. But I do understand their approach, and I think it has been very successful for the most part. It especially creates the conditions for binge watching, which is kind of an American epidemic.
Anyway, I was going to talk about how cool Pandora’s Music Genome Project is, but reading this article on Howstuffworks.com made me realize that I myself have been part of a metadata analysis group right here at UCLA.

I would still like to compare what I did to the Pandora project; below is an excerpt from HowStuffWorks:

“Pandora relies on a Music Genome that consists of 400 musical attributes covering the qualities of melody, harmony, rhythm, form, composition and lyrics. It’s a project that began in January 2000 and took 30 experts in music theory five years to complete. The Genome is based on an intricate analysis by actual humans (about 20 to 30 minutes per four-minute song) of the music of 10,000 artists from the past 100 years. The analysis of new music continues every day since Pandora’s online launch in August 2005. As of May 2006, the Genome’s music library contains 400,000 analyzed songs from 20,000 contemporary artists.”

When I was living in the dorms my senior year, I became a part of the Distinguished Palate Committee, which pretty much meant I got to eat for free at the dining halls and then rate the dishes and the overall atmosphere. At the dining hall Feast, they specifically recruited students of Asian descent to make sure each dish retained its authenticity; the students were considered experts in their field, kind of like the experts in music theory mentioned in the quote above. They also had computers at the front of each restaurant where students could rate dishes on temperature, presentation, taste, etc. And because I worked as a taste tester, I learned how long it took them to develop the Bruin Plate menu: it actually took years, because of the balance they had to strike between being healthy and being delicious.
I think the main thing to take away from all these articles and experiences is the understanding that it really takes a lot of work and data to make things such as a song selection or a plate of food look simple and easy.
http://computer.howstuffworks.com/internet/basics/pandora.htm

 

Week 3: DDC to Netflix

As we take a more in-depth look at methods of classification, I am reminded of a visit to my elementary school library, where I was first introduced to the Dewey Decimal System. When I was 11 or 12, the librarian sat us down and explained how the library used this relatively simple system to categorize its awe-inspiring collection of books. First established back in 1876, then revised and expanded through over 20 major editions, the Dewey Decimal Classification (known as DDC, link) is a system of numbering books based on content. Information is divided into ten broad areas, and these groups are then broken up into smaller and more specific topics. Topics are given call numbers, which you can look up to see what books the library has on a topic. For example, “Tigers” are given the call number 599.756.
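The way a call number narrows from a broad area to a specific topic can be sketched in Python. The `ddc_path` function below is a hypothetical helper written for this post, and it assumes a call number with exactly three digits before the decimal point; it only shows the nesting of the numbers, not the official name of each class.

```python
def ddc_path(call_number):
    """Return the chain of classes from broadest to most specific."""
    whole, _, fraction = call_number.partition(".")
    path = [
        whole[0] + "00",  # one of the ten main classes, e.g. 500
        whole[:2] + "0",  # division within that class, e.g. 590
        whole,            # section within that division, e.g. 599
    ]
    # Each extra digit after the decimal point narrows the topic further.
    for i in range(1, len(fraction) + 1):
        path.append(f"{whole}.{fraction[:i]}")
    # Drop repeats (e.g. "500" is its own class, division, and section).
    return list(dict.fromkeys(path))


print(ddc_path("599.756"))
# ['500', '590', '599', '599.7', '599.75', '599.756']
```

Reading the output left to right retraces the librarian's explanation: start in one of the ten broad areas and keep adding digits until you arrive at tigers.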

 

I enjoyed all of this week’s articles, but “How Netflix Reverse Engineered Hollywood” definitely stood out for me from the selection. Paired with my nostalgia involving my elementary school library, I couldn’t help but think of how far classification has progressed. The article featured how Netflix creates obscure, but helpfully user-specific genres for its subscribers. The site uses a “real combination: machine-learned, algorithms, algorithmic syntax” (link). The hybrid human and machine intelligence implemented by this system shows the development of classification as the world gravitates toward a digital focus. Netflix partially abandoned a system that depended solely on numerical values, like ratings, broadening their scope to involve a bit of human introspection.

 

While topics in the DDC are very broad, like “500 Math and Science” or “800 Literature,” this article highlighted the outside-the-box methods used by Netflix, such as “quanta” and “microtags,” to classify its film collection and personally tailor recommendations for its users. Other user-friendly digital media sites have come to prominence in recent years, especially in the music industry. For example, Pandora’s Music Genome Project has attempted a similar formula to achieve what Netflix has, but it hasn’t yet reached the success of its movie-streaming counterpart. 8tracks also comes to mind with its wide selection of ‘tags,’ where you can find a playlist tailored especially for a certain activity, such as “classical + studying” or “electronic + gym.” It’ll be interesting to see who branches out next and tries to add their own personal spin to classification.

The Tiers of Categorization

Food Chain

All animals and living organisms are classified within a specific tier of the food chain. These classifications have been established and molded for centuries, and help to define the general flow of survival. In this week’s reading titled Sorting Things Out, the two concepts of classification and standards are broken down into concrete definitions. In my opinion, standards serve as the foundation that allows various forms of classification to occur. Without a specific set of standards or guidelines to associate with animals or objects, classification is essentially meaningless. The article states, “a standard spans more than one community of practice… it has temporal reach as well in that it persists over time”. The scope of the standards linked to the food chain has been transformed over the years and altered to allow new classifications to be possible as groundbreaking discoveries continue to be made regarding new species. All communities have accepted the set of standards that are tied to the multitude of unique food chains.

 

The food chain has become an accepted way of ranking superiority in our world. While the sun is often seen as the main cog that turns this wheel of life, different forms of food chains can be broken down and applied to more focused groups. This version of classification within certain pre-conceived categories helps to further specify and define the different levels of consumers and producers in our ecosystems. Without intricate categorization that hashes out all aspects of the species involved in a food chain, it is hard to tell where the public’s general knowledge of other species and organisms would be. It is true that different communities may view the categorization of some species in separate ways, but as Sorting Things Out mentions, in practicing classification and the implementation of standards, objects must be “able to both travel across borders and maintain some sort of constant identity”. The overarching layout of the general food chain and its subcategories has become embedded in today’s society. The standards that have been developed over the years will continue to change with unpredictable discoveries and shifts in worldview. Certain classifications may seem to be set in stone and unarguable, but there is always potential that the standards could be slightly altered with time. Categorization under the boundaries set in place by certain standards is absolutely necessary to compartmentalize society and analyze its specificities, but the way people think and process information will never stop changing and will always have a direct effect on the categorization process.

 

Sources:

Selections from Bowker and Star, Sorting Things Out (Cambridge, MA: MIT, 1999).

 

http://education-portal.com/cimages/multimages/16/Trophiclevels.jpg

 

“Network thinking in ecology and evolution”. http://eeb19.biosci.arizona.edu/Faculty/Dornhaus/courses/materials/papers/Proulx%20Promislow%20Phillips%20networks%20ecol%20evol.pdf

Week 3: Netflix and Metadata

The article about how Netflix reverse engineered Hollywood was extremely interesting because not only was it relevant to what we have been learning about in class, but also to my everyday life. I love watching Netflix and most of my friends do as well, but most of my friends do not have the insight into Netflix categorization and metadata that I do. I have noticed the extended categories on Netflix before, but didn’t make the connection that this was metadata until reading this article. It makes sense that they want to categorize the movies as specifically as possible.

One of the first things the article mentioned that we talked about in class was the use of controlled vocabularies. In order to correctly categorize movies, Netflix had to pick certain words and phrases to use and the order in which these words should be put. The article spends a lot of time dissecting the controlled vocabulary of the lengthy category descriptions and figuring out how all 76,897 categories were formed. It turned out that this was not all of the metadata used for categorizing the movies; it did not even come close to scratching the surface. When the author met with the man who designed the actual categorization system, it became apparent that the metadata behind the classifications was much more complicated than some controlled vocabulary tags. This metadata was made up of categories that rated each movie on its main characters, romance, likeability, and main actors. All of this metadata determines what is chosen for the public metadata categories.

This metadata is what makes Netflix so successful at not only keeping subscribers, but also getting new ones. It makes sense that they would advertise similar movies next to the movies that people are watching and appeal to what people want. Most people already understand this, but a more surprising thing I learned from this article is that Netflix also uses all of this information when creating shows. Wildly successful shows right now, such as Orange Is the New Black and House of Cards, were created by people at Netflix who had already been studying what people want. Through these shows they give people the elements of television that they have observed to be the most popular. After learning so much about Netflix’s system of categorization through metadata, I am very curious to learn more about the metadata of my other favorite websites like Facebook, Spotify, and Pandora.

Netflix Home Page