Robots Reading Vogue – Fashionable Big Data

Fashion and data science: do they sound incompatible? Yet this pairing is exactly what Robots Reading Vogue is about. A joint project by Lindsay King and Peter Leonard, Robots Reading Vogue analyzed every issue of Vogue magazine since its inception and uncovered fresh, thought-provoking insights into the leading fashion magazine.

Sources

125 years, 2,700 issues, 91,880 articles, 400,000 pages and 6TB of data: all of this became the building blocks of Robots Reading Vogue. Every issue of Vogue can be retrieved from the digital archive made available by Condé Nast, the mass media company that owns Vogue. The pictures, words and colors of the magazine are processed computationally and analyzed statistically to reveal patterns and trends.

Processing and Presentation

To explore patterns of color, the pictures must first be processed so that they become quantifiable. The images in a dataset are sliced into a uniform number of pieces and graphed as a histogram. The slice histogram, the product of this processing and visualization technique, reveals trends in the brightness and saturation of Vogue over the years.
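To make the slicing idea concrete, here is a minimal sketch of my own, not the project's actual code. It assumes an image is simply a list of rows of (r, g, b) tuples, cuts it into horizontal bands, and averages saturation and brightness per band. The function name `slice_stats` and the choice of horizontal bands are my assumptions; the project may slice its images differently.

```python
import colorsys

def slice_stats(pixels, n_slices):
    """Split an image (list of rows of (r, g, b) tuples, 0-255) into
    n_slices horizontal bands and return (mean_saturation, mean_brightness)
    for each band. Hypothetical helper, for illustration only."""
    height = len(pixels)
    stats = []
    for i in range(n_slices):
        # Integer bounds so that every row lands in exactly one band.
        start = i * height // n_slices
        end = (i + 1) * height // n_slices
        sat_total = val_total = count = 0
        for row in pixels[start:end]:
            for r, g, b in row:
                _, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
                sat_total += s
                val_total += v
                count += 1
        # An empty band (more slices than rows) is reported as zeros.
        stats.append((sat_total / count, val_total / count) if count else (0.0, 0.0))
    return stats

# Toy 4x2 "cover": top half pure red, bottom half mid grey.
toy = [[(255, 0, 0)] * 2] * 2 + [[(128, 128, 128)] * 2] * 2
bands = slice_stats(toy, 2)
```

In this toy example the red band comes out fully saturated and fully bright, while the grey band has zero saturation. Plotting such per-band values for every cover, ordered by date, would give a rough picture of how brightness and saturation drift over time.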

A colormetric analysis is also used to discover how colorful Vogue covers have been over the years. First, the researchers use the ImagePlot package to visualize the covers quantitatively. ImagePlot outputs large, high-quality images (uncompressed TIFF files). The cover images are then plotted on a graph, where users can zoom in to see each image more clearly. The researchers also use a web visualization framework, d3.js, to display a large number of individual images that can be moved and rearranged along the axes in real time. I feel this is a good way to discover color relationships, since the pattern is easy to see when images of similar color are arranged together.
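As a rough illustration of arranging images of similar color together, the sketch below orders a few toy covers by the hue of their average color. This is only my assumption of one possible ordering key; ImagePlot and the d3.js view work on real image files and may use entirely different measurements. The names `mean_hue` and `covers` are made up for the example.

```python
import colorsys

def mean_hue(pixels):
    """Hue (0-1) of the average RGB color of a flat list of (r, g, b) pixels.
    Hypothetical helper, for illustration only."""
    n = len(pixels)
    r = sum(p[0] for p in pixels) / n
    g = sum(p[1] for p in pixels) / n
    b = sum(p[2] for p in pixels) / n
    h, _, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return h

# Toy data: each "cover" is just two pixels.
covers = {
    "cover_a": [(250, 10, 10), (240, 20, 20)],   # mostly red
    "cover_b": [(10, 10, 250), (20, 20, 240)],   # mostly blue
    "cover_c": [(10, 250, 10), (20, 240, 20)],   # mostly green
}

# Sort cover names along the hue axis: red, then green, then blue.
ordered = sorted(covers, key=lambda name: mean_hue(covers[name]))
```

Averaging in RGB before converting to hue avoids the wrap-around problem of averaging hue angles directly, although it will wash out covers that mix very different colors.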

Text processing is also a major part of the Robots Reading Vogue experiment. The team built an n-gram search function that lets you look up how frequently a word is used across more than 400,000 pages of ads and articles. Users can enter any word and see its frequency on a time-series graph in different metrics, such as Words per Million and Percent of Words.
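The Words per Million metric can be sketched in a few lines. The snippet below is a simplified assumption of how such a count might work for single words, using naive whitespace tokenization; it is not the team's search tool, and `words_per_million` and the sample texts are invented for illustration.

```python
from collections import Counter

def words_per_million(docs_by_year, term):
    """Return {year: occurrences of a one-word term per million words}.
    Hypothetical helper; tokenization is a plain lowercase split."""
    term = term.lower()
    result = {}
    for year, texts in docs_by_year.items():
        counts = Counter()
        for text in texts:
            counts.update(text.lower().split())
        total = sum(counts.values())
        # Scale the relative frequency up to a per-million rate.
        result[year] = counts[term] / total * 1_000_000 if total else 0.0
    return result

# Made-up sample text, not real Vogue content.
sample = {
    1950: ["the new skirt and the new hat"],
    1990: ["pants and more pants for the office"],
}
freq = words_per_million(sample, "pants")
```

Percent of Words is the same ratio on a different scale: divide the per-million figure by 10,000. Plotting `freq` with the years on the x-axis gives the kind of time-series graph described above.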

Words are also clustered to discover common themes, a technique the team calls topic modeling. The researchers use computer-generated topics and statistical methods to discover themes based on term co-occurrence, that is, how often words appear close to each other. A word cloud visualizes which terms are grouped under a given theme, such as art or dressmaking. A graph is then plotted to show how dominant each theme is over time. This helps us learn about the magazine's editorial focus at different points in time.
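Term co-occurrence itself is easy to illustrate. The sketch below only counts how often two words fall within a small window of each other; it is not a topic model, and the post does not say which algorithm the team used. The function name `cooccurrence`, the window size and the sample sentence are my own assumptions.

```python
from collections import Counter

def cooccurrence(tokens, window=2):
    """Count unordered word pairs that appear within `window` positions
    of each other. Hypothetical helper, for illustration only."""
    pairs = Counter()
    for i, word in enumerate(tokens):
        # Look only forward so each pair of positions is counted once.
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                pairs[tuple(sorted((word, other)))] += 1
    return pairs

# Made-up sample text, not real Vogue content.
tokens = "silk dress silk gown painting gallery painting".split()
pairs = cooccurrence(tokens, window=2)
```

Here "dress" and "silk" co-occur twice, as do "gallery" and "painting", while "dress" and "gallery" never meet. A real topic model builds on this kind of signal, grouping words that tend to appear together into themes such as dressmaking or art.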

In short, a variety of methods and visualization tools can be employed to discover the trends behind almost anything, even a fashion magazine.
