Course blog

The Social Network of Instagram

After reading this week’s articles about networks, the various social networks I use were the first things to pop into my mind (probably since I use them almost every day). For example, Instagram is a social network where friends can connect and share pictures with each other. People can also follow celebrities and/or favorite businesses to stay updated on whatever activities they wish to share. The two types of nodes involved with this social media are users and their pictures. Thus, Instagram is a bimodal network. These two types of nodes are connected by an asymmetric edge; the edge in this case being “is the photographer of.” For example, “user A is the photographer of picture B.” However, one cannot switch the two nodes around, as in “picture B is the photographer of user A,” hence the asymmetry of this edge, which is called a directed edge.

However, Instagram is considered a social network because there are edges not just between user and picture, but between users and other users, other users and pictures, and pictures and other users. Users can connect with other users by following them, in which the second user may choose (or not choose) to follow back. Depending on if the following is reciprocal or not, this edge is either an undirected edge with a symmetric relationship (“user A is following user B” and equally “user B is following user A”) or a directed edge with an asymmetric relationship (“user A is following user B” but not vice versa because user B did not follow user A back). Users can “like” another user’s photo, thus connecting other users with other photos in an asymmetric directed edge (“user B likes user A’s photo”). One user’s photos may have tags of other users in them, connecting photos with different users in another asymmetric directed edge (“user A is tagged in user B’s photo”). From my understanding of what a dense network is (which may or may not be correct), Instagram serves as an example of one. A user and his photos are connected to other users and their photos in multiple edges, linking the social world together through a variety of interwoven relationships. Algorithms can detect trends in connections and put together a string of suggested photos a user may like based on similar connections of the other users they follow. More and more so, social networks are becoming a platform to discover new interests and people to connect with, in addition to connecting to friends and interests one already has.
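To make the follow-back distinction concrete, here is a minimal Python sketch. The users A, B, and C and their follow edges are invented for illustration, and the density function uses the usual definition (the share of all possible directed edges that actually exist), which is an assumption about what "dense" means here rather than anything taken from Instagram itself.

```python
# Toy data: users A, B, C are invented for illustration only.
# Each pair is a directed edge (follower, followed).
follows = {("A", "B"), ("B", "A"), ("A", "C")}

def follow_relationship(u, v, edges):
    """Describe the follow edge from u's point of view."""
    if (u, v) in edges and (v, u) in edges:
        return "mutual"    # reciprocal, so it behaves like an undirected edge
    if (u, v) in edges:
        return "one-way"   # directed: u follows v, but v did not follow back
    return "none"          # u does not follow v

def density(node_count, edges):
    """Share of all possible directed edges that are actually present."""
    possible = node_count * (node_count - 1)
    return len(edges) / possible if possible else 0.0

print(follow_relationship("A", "B", follows))
print(follow_relationship("A", "C", follows))
print(density(3, follows))
```

In this toy network A and B follow each other, A follows C without a follow back, and three of the six possible follow edges exist. A real measure for Instagram would also need to decide whether likes and tags count as edges.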

 

http://instagram.com

Week 6

The “Demystifying Networks” article immediately made me think of my interpretation of the Google search engine. The discussion prompts me to understand algorithms as a modern form of a community. This approach lends itself to the idea of digital humanities. I understand algorithms as communities because they dictate exactly what we see on the Internet based on our previous searches. These dictations are different for every individual, and can shield people from gaining a complete and well-rounded view of the world. Similarly, communities of people establish the ways that individuals understand the world because historically they have been sheltered from seeing views outside of their own.

algorithm

 

When I type the word “Europe” into Google, I receive links to travel websites and news articles because in recent searches I have been planning my study abroad trips and looking up news articles for my classes. If my roommate, who is a Dance Major, types the word “Europe” into Google, she gets links to performances and travel locations. This difference in results is key. My roommate will not receive nearly as many news articles about Europe and will thus not be informed of global happenings, despite the fact that she is interested in that aspect of society. Even further, if a stranger who has no interest in travel and who could not point out Europe on a map types the word “Europe” into a Google search, their results will be even more different. This reality shows the danger of blindly trusting algorithms. Weingart’s quote “Nothing worth discovering has ever been found in safe waters. Or rather, everything worth discovering in safe waters has already been discovered, so it’s time to shove off into the dangerous waters of methodology appropriation, cognizant of the warnings but not crippled by them” can act as a warning against relying completely on algorithms and as advice to push past what is initially given to us and discover new information.

W5 – Information Visualization Cont. & The Refugee Project

In “Humanities Approaches to Graphical Display,” Johanna Drucker suggests that we “rethink the foundation of the way data are conceived as capta by shifting its terms from certainty to ambiguity and find graphical means of expressing interpretive complexity.” She begins her paper by differentiating between “capta” (knowledge constructed through interpretive processes) and data (that which is observed and recorded). For her, the problem with information visualizations in the field of digital humanities is that they render capta as if it were data. She writes, “the digital humanities can no longer afford to take its tools and methods from disciplines whose fundamental epistemological assumptions are at odds with humanistic method.”

 

She proposes several approaches for representing the temporality of time and space in visualizations. Her modifications to a bar chart showing the number of new novels put into print by a single publisher in the years 1855-1862 (Fig. 3) are notable because the result displays so much more than just numbers. It displays publication data in relation to the time of writing, acquisition, editing, pre-press work, release, etc. with color-coded timeline elements superimposed on the time axis. In this way, it presents the interpretive process behind the information displayed. However, non-traditional representations such as this add extra layers of complexity to information visualizations. Although this example is pretty straightforward (only one publication year is broken down into its relational components), things can get ugly quickly. Imagine a large data set displayed like this or like Fig. 9. There comes a point when Drucker’s approach confuses the reader and impedes their understanding of the information, which defeats the purpose of information visualization in the first place. In this way, there will always be a tradeoff between representing information in a way that conveys its true nature and context and representing information in a way that is easily understandable to the reader.

 

refugee_project_2

 

The Refugee Project ( http://www.therefugeeproject.org ) is an interactive, narrative, temporal map of refugee migrations since 1975. “UN data is complemented by original histories of the major refugee crises of the last four decades, situated in their individual contexts.” This visualization is a great example of information rendered in a way that is simple to understand, yet multi-faceted and descriptive in nature. The map view displays numbers of refugees per country (represented by circles of various sizes) and where they fled to (on mouse hover). Statistics and quantitative information are linked to historical events with narrative information. In addition, there is a timeline feature for the map and different view options (country of origin/country of asylum, refugees/[refugees/population]). Although the people are in the end treated as numbers, The Refugee Project does an excellent job of presenting “the big picture.”

W6 – Networks and Software Flow

In his blog post “Demystifying Networks: An Introduction,” Scott Weingart explains the underlying concepts behind networks and how they can be applied to digital humanities work. His most basic definition of a network is “stuff and relationships.” He outlines several compatibility issues that arise when subjective digital humanities stuff is linked by complex and interpretive relationships. First, he argues that the tools available to graph networks are not suitable for nuanced stuff. He writes, “as it stands now, network science is ill-equipped to deal with multimodal networks. 2-mode networks are difficult enough to work with, but once you get to three or more varieties of nodes, most algorithms used in network analysis simply do not work.” In addition, digital humanities data often must be cut or cleaned to fit existing network methodologies and algorithms. Second, networking is not suitable for all datasets and it can create misleading relationships. Network relationships often add “a layer of interpretation not intrinsic in the objects themselves.” Despite these challenges “network analysis remains a viable methodology for answering and raising humanistic questions – we simply must be cautious, and must be willing to get our hands dirty editing the algorithms to suit our needs.”

uiFlow

 

Weingart’s blog post got me thinking of other uses for networks. One of the most interesting applications of networks is user interface (UI) design. UI designers are tasked with designing software interfaces that can accommodate a dizzying array of use cases. Site maps, wireframes, and UI flows are important methods of visualizing the relationships between the content, screens, and code of an app, website, or other piece of software. All of these things qualify as networks, although they are different in nature from the networks described in Weingart’s blog post. First, most UI networks are uni-modal, meaning that there is only one type of stuff. For example, site maps are networks of pages where each page is a node connected by explicit edges on the web. Second, most UI networks have asymmetrical, directed edges. In his blog post, “A shorthand for designing UI flows,” Ryan of Basecamp explains his method of sketching out flows ( https://signalvnoise.com/posts/1926-a-shorthand-for-designing-ui-flows ). The relationships between these event nodes are asymmetrical in nature because node order matters; “what the user sees next” doesn’t cause “what the user sees.” This kind of chronology is inherent in UI networks because their purpose is to present many interconnected use cases. Ryan’s networking scheme is useful because it combines the visual information of wireframes with the functional information of UI flows. Each node contains visual and functional information to provide a bigger picture of how the interface drives the user and the other way around. Ryan’s shorthand is unique because it allows for bimodal networks in a field of largely unimodal ones.

323-flow-template

325-login-flow

 

Week Six: Network Analysis

alisnetwork-1-1

 

The first thing that came to mind when reading Demystifying Networks was the infamously creepy social network LinkedIn. Weingart introduces the basics of networks along with the inevitable challenges that come with them. The blog post is directed, as Weingart notes, to digital humanists. Therefore, the issues are directed at humanist scholars who face the challenge of dealing with data that is “uncertain, open to interpretation, flexible, and not easily definable”.

This is where I began thinking about social networks. Weingart warns of the dangers of using networks to analyze data. First, “networks can be used on any project. Networks should be used on fewer”. Second, “methodology appropriation is dangerous”; scientific approach, as we know, does not always map on neatly to a humanist one. Social networks connect people. I am not sure how nodes and edges work within social networks, but I assume that these are in use for websites’ features like “People You May Know”.

There are many articles online that question LinkedIn’s analysis techniques. For example, David Veldt’s article LinkedIn: The Creepiest Social Network for Interactually.com takes a critical look at some of the site’s functions and features. I don’t personally have a LinkedIn account but know from friends and family that use it that they often see the most random, unexpected people pop up in their LinkedIn “People You May Know” section. Veldt lists a couple of examples from his own experience with “People You May Know”. The suggestions it comes up with are often inexplicable – it seems that LinkedIn has no possible way of knowing that this person is your mailman’s cousin! It even sometimes suggests someone who shares a name with a person you know but is not actually that person.

Veldt attempts to analyze LinkedIn’s established network. Although I am not positive, it is pretty safe to assume that LinkedIn has some system of “edges”, which Weingart defines as descriptive links that connect nodes. I believe this is what Veldt is after – what is LinkedIn using to inform its edges? LinkedIn’s Help Center quotes only two factors that the “People You May Know” section is based on: “Commonalities between you and other members. For example, you may have common connections, similar profile information and experiences, work at the same company or in the same industry, or attend the same school” and “Members you’ve imported from other address books in your Contacts list”. Veldt discovers that there are pre-checked boxes within his account that allow LinkedIn to share his data with third party applications as well as giving information about his site visits to pages that use LinkedIn plugins. However, Facebook (as Veldt suspected) is not listed as one of these plugins. The mystery remains…

I am genuinely interested in how LinkedIn succeeds in such creepiness. This example resonates with Weingart’s opinion on humanist approach to data (as far as I understand how LinkedIn works). Weingart argues, “Unfortunately, given that humanist data are often uncertain and biased to begin with, every arbitrary act of data-cutting has the potential to add further uncertainty and bias to a point where the network no longer provides meaningful results. The ability to cut away just enough data to make the network manageable, but not enough to lose information, is as much an art as it is a science”. Plotting links between human relationships seems so complicated, but LinkedIn somehow masters it to an uncomfortable degree.

 

http://www.interactually.com/linkedin-creepiest-social-network/

Six Degrees of Separation

2000px-Six_degrees_of_separation.svg

Scott Weingart in this week’s reading “Demystifying Networks” discusses the basics of networks. He describes networks as “a net-like arrangement of threads, wires, etc...It later came to stand for any complex, interlocking system.” Networks cannot stand on their own; they are interdependent on connections between all of the little parts that make them up. Weingart further explains, “Network analysis generally deals with one or a small handful of types of stuff, and then a multitude of examples of that type.” In Weingart’s example, he uses books and authors as his nodes—which are basically an assortment of stuff. Nodes also have attributes, like page number, title, birth and death, etc. The combination of books and authors makes it a bimodal network and if we add publishers, then it is multimodal. By doing this, each book is connected to an author, who is then connected to one or more publishers (Weingart). Ultimately, as expected, these connections form relationships and, in this case, an authorship relationship.
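As a rough illustration of the books-and-authors example, the sketch below stores nodes with attributes and labels the network by how many types of "stuff" it contains. The node names, attribute values, and the `mode_label` helper are all hypothetical and only meant to show the idea, not Weingart's own data or method.

```python
# Hypothetical nodes: each has a "kind" (its type of stuff) plus attributes.
nodes = {
    "Book X": {"kind": "book", "pages": 320, "title": "Book X"},
    "Author Y": {"kind": "author", "born": 1900, "died": 1970},
}

# Each edge is (source, relationship, target).
edges = [("Author Y", "is the author of", "Book X")]

def mode_label(nodes):
    """Label the network by the number of distinct node types it contains."""
    kinds = {attrs["kind"] for attrs in nodes.values()}
    if len(kinds) <= 1:
        return "unimodal"
    if len(kinds) == 2:
        return "bimodal"
    return "multimodal"

print(mode_label(nodes))  # books and authors only

# Adding a third type of node, a publisher, changes the label.
nodes["Publisher Z"] = {"kind": "publisher"}
edges.append(("Publisher Z", "is the publisher of", "Book X"))
print(mode_label(nodes))
```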

This concept of networks and everything that they are comprised of (nodes, relationships, types, etc.) immediately made me think of six degrees of separation. Six degrees of separation is “the theory that everyone and everything is six or fewer steps away, by way of introduction, from any other person in the world, so that a chain of “a friend of a friend” statements can be made to connect any two people in a maximum of six steps” (Wikipedia). Everyone is essentially connected to each other through a vast web of friends, acquaintances, and strangers. The creator of this theory, Frigyes Karinthy, proposed that “the modern world was ‘shrinking’ due to [the] ever-increasing connectedness of human beings…He [believed] that despite great physical distances between the globe’s individuals, the growing density of human networks made the actual social distance far smaller.” This was back in 1929 when the “technological advances” in communication and travel were far less developed than they are today. Due to the progress in modern technology, the six degrees, nowadays, is probably more around three.
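The "chain of a friend of a friend" can be counted with a breadth-first search, which finds the shortest path between two people in a friendship network. The sketch below is a minimal version under the assumption that friendships are mutual; the names and the `friends` dictionary are made up for illustration.

```python
from collections import deque

# Invented friendship network; each friendship is listed in both directions.
friends = {
    "Ana": ["Ben"],
    "Ben": ["Ana", "Cal"],
    "Cal": ["Ben", "Dee"],
    "Dee": ["Cal"],
    "Eve": [],  # knows nobody in this toy network
}

def degrees_of_separation(friends, start, goal):
    """Return the fewest introductions linking start to goal, or None."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, steps = queue.popleft()
        for other in friends.get(person, ()):
            if other == goal:
                return steps + 1
            if other not in seen:
                seen.add(other)
                queue.append((other, steps + 1))
    return None  # no chain of friends connects the two people

print(degrees_of_separation(friends, "Ana", "Dee"))
print(degrees_of_separation(friends, "Ana", "Eve"))
```

Here Ana reaches Dee through Ben and Cal, so they are three steps apart, while Eve cannot be reached at all. The six degrees claim is that, in the real world, this number rarely needs to exceed six.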

JA-Degrees-of-Separation

I have personal experience in this “six degrees of separation” phenomenon. Because of the multitude of social networks available on the Internet it is hard to not be constantly in contact with people across the world. When I was accepted into college and looking for a roommate I was contacted by and put into contact with many friends of friends, or many “my best camp-friend’s older brother’s ex-girlfriends” who I happened to know of through seeing pictures of them on Facebook. In the ever-increasing technological world, instances like this will only become more common and maybe even one day there will only be a two or one degree of separation.

Works cited:

Scott Weingart, “Demystifying Networks”

http://en.wikipedia.org/wiki/Six_degrees_of_separation

Week 6: Social Network Analysis

SNA Graph

This week’s readings got me interested in the science behind social network analysis. Network theory involves nodes, which represent individuals within the network, along with ties, which show the various ways these individuals come together. This could be through a number of factors, including friendships, organizations, hobbies, and other topics of that nature. A social network diagram shows nodes represented by points, and ties as lines. Social network analysis has been able to provide a mathematical way of analyzing human relationships. For example, management consultants have implemented it with their business clients for what they call Organizational Network Analysis. Scott Weingart’s blog post “Demystifying Networks” looks at how networks are being created so frequently that it’s difficult to keep up with the network’s true meaning. His definition of network as “a net-like arrangement of threads and wire” gives an easy visualization to what is actually a complicated subject. His use of authors and books sets a simple stage for me before diving into the wide array of social network analysis topics.

Team Sports

While typing in “social network analysis of…” on Google, the first auto fill options were “…terrorist organizations in India, …Alice and Wonderland, …a criminal hacker community.” I didn’t know where to start; all the options seemed equally obscure but attention grabbing. After looking through a few of these various, unorthodox topics that could be studied through social network analysis, I stumbled across “The Application of Social Network Analysis to Team Sports,” by Dean Lusher. The study allowed for the simultaneous examination of social relations with the individual-level qualities of members of the team. By incorporating a range of attitudes and behaviors, along with other individual-level attributes, it examined how these may affect and be affected by team structures. Players were asked whom they considered friends on their team along with whom they saw as the most influential. Afterward, they were asked whom they viewed as the ‘best’ player on the team, and anyone who received more than five votes was denoted with a black node. The image above shows how the ‘best’ players on a team formed their own group that was connected to, but still separate from, the other white nodes. This illustration shows two social networks (friendship and influence) coupled with an individual-level attribute (playing ability). It was interesting to see a digital angle applied to this very human subject.


Week 6 Blog Post

network data and our newly digital interactions with information

“Data visualization is the presentation of data in pictorial or graphical format. For centuries, people have depended on visual representations such as charts and maps to understand information more easily and quickly.” Data Visualization: What it is and why it is important.

Digitized archives and data visualizations are incredibly powerful tools. They give users the ability to make sense of large amounts of information, allowing them to form questions, make predictions, learn lessons and even plan future actions. As these new mediums of displaying information become more prominent, it is important to understand their limitations and drawbacks, as well as to understand how they are transferred from raw data to their meaningful digitized form. In my blog post last week I discussed in more detail an approach that might be taken when learning from and interpreting data visualizations. Just as we are taught in our statistics classes, for example, to approach graphs and charts skeptically (to ensure they are not misleading or mistaken), we as scholars should also approach data visualizations with a bit of skepticism. In order to best utilize tools such as digital archives or data visualizations, we must first understand the process by which this information is transposed. A great overview of how data can be sorted and “networked” was given in the blog post Demystifying Networks. The author first recognizes that “humanities scholars are often dealing with the interactions of many types of things, and so the algorithms developed for traditional network studies are insufficient for the networks we often have.” He then goes on to note “humanists also struggle with fitting square pegs in round holes. Humanistic data are almost by definition uncertain, open to interpretation, flexible, and not easily definable.” Not only is humanities data not easily transferable into a digital form, but many initial decisions must be made before the information is transformed, so that the digital form presents the information in a way that supports the author’s arguments or intentions. When interacting with live maps, timelines or online archives, for example, we rarely consider the work and decisions that had to be made to publish those works.

Lauren Klein, in The Image of Absence: Archival Silence, Data Visualization, and James Hemings, makes the point that “as scholars, we do not see the labor involved in transcribing manuscripts into machine-readable text, nor do we think of the discussions—equal parts technical and theoretical—that contribute to the development of the encoding standards and database design that allow us to perform our search queries”. We live and learn in an age of digitized information, and since digital form is relatively new, people don’t quite understand the inputs that make it possible. Because of these tools we can interact with data in a completely new way; we just need to educate ourselves on the inputs that are required for the tools as final products. We must understand these inputs, just as we must understand chart and graph standards in statistics, especially so we can look for and correct mistakes with new, powerful tools like Google Refine.

The inputs we must understand include data networking and mapping, employing controlled vocabularies when entering information into a database, and potential errors and bias that may be present in visualizations.

Although the video below is relatively dry, it emphasizes the importance of controlled vocabularies, and again highlights the incredible amount of back work that must be completed for these tools to be usable.

Intertwining Tweets

Every single thing in this world is connected in one way or another. Some may be connected on an extremely basic level, but these connections always exist. In Scott Weingart’s blog post titled “Demystifying Networks”, he discusses how many of these “networks” are created each and every day. More specifically, he touches on how many assumptions are made when various networks are established as well as the risks associated with them. The analogy “when you’re given your first hammer, everything looks like a nail” outlines the practice of creating too many networks and complicating an idea or object’s true purpose or intention. The vast array of unique networks can each include a certain topic or specific event, which makes it difficult to associate that topic or event with one specific emotion or purpose.

Twitter-map

Twitter is a type of social network that incorporates millions of these “sub-networks”. The image above shows the most prevalent hashtags used on Twitter following the Boston Marathon bombing. Every time a user posted a tweet using one of these hashtags, it immediately connected their post and their profile to a huge network of other Twitter users discussing the same event. Users established networks using single hashtags such as “#prayforboston” and “#bostonstrong”, which then all combined to form the more comprehensive network shown in the picture, relating to the Boston Marathon tragedy as a whole. This example shows how many extensive networks can branch off from one single event or subject, and how many individuals can become instantly connected to a network through just one hashtag they post to Twitter.
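A small Python sketch can show how single hashtags form sub-networks that then merge into a larger one. The three users and their tweets below are invented; only the two hashtags come from the example above, and the helper names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Invented tweets: (user, set of hashtags used in that user's tweet).
tweets = [
    ("user1", {"#prayforboston"}),
    ("user2", {"#prayforboston", "#bostonstrong"}),
    ("user3", {"#bostonstrong"}),
]

def users_by_hashtag(tweets):
    """Group users into one sub-network per hashtag."""
    groups = defaultdict(set)
    for user, tags in tweets:
        for tag in tags:
            groups[tag].add(user)
    return dict(groups)

def shared_hashtag_pairs(tweets):
    """Pairs of users connected by at least one shared hashtag."""
    pairs = set()
    for users in users_by_hashtag(tweets).values():
        pairs.update(combinations(sorted(users), 2))
    return pairs

print(users_by_hashtag(tweets))
print(sorted(shared_hashtag_pairs(tweets)))
```

In this toy data user1 and user3 never share a hashtag, yet both are linked to user2, so the two sub-networks join into one larger network about the same event. The shared edge says nothing about whether their tweets express the same feelings.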

Social media sites in general have exponentially increased the number of networks created, and have made the process of creating a new network nearly effortless. Going off of Weingart’s personal beliefs pertaining to the subject, we need to ensure that we do not lose the “theoretical and logical caveats” associated with all of these networks. Naturally, everyone will have differing feelings and outlooks towards specific networks that are created, and we must recognize this difference and understand its implications. Although a certain hashtag on Twitter may connect a string of users’ posts and ideas on a network, by no means does that make the unique feelings and ideas embedded within the posts exactly the same. It is difficult to decipher the level of connectedness across different networks, which is why we must use caution when analyzing any data drawn from them.

 

Sources:

  1. Scott Weingart, “Demystifying Networks” http://www.scottbot.net/HIAL/?p=6279
  2. Image: http://www.washington.edu/news/files/2014/03/Twitter-map.jpg

Writing Without Words: Networking Jack Kerouac’s On The Road

Stefanie Posavec’s data visualization of “On the Road” part 1

For this blog post I originally wanted to find a network showing the relationships of Beat Artists to one another, using the edge “has shared a romantic partner”. Although I had no luck in this endeavor, I came across Stefanie Posavec’s “Writing Without Words” project, which uses different types of data visualization to “visually represent the rhythm and structure of Kerouac’s literary space, creating works that are not only gorgeous from the point of view of graphic design, but also exhibit scientific rigor and precision in their formulation: meticulous scouring the surface of the text, highlighting and noting sentence length, prosody and themes, Posavec’s approach to the text is not unlike that of a surveyor.” (https://netmap.wordpress.com/tag/jack-kerouac).

The space is Jack Kerouac’s On the Road, Part 1. The first branch of the network is chapters, which are broken up into paragraphs, sentences, and words. These are directed edges (sentences can be broken up into words, but words cannot be broken up into sentences). The words are color coded to indicate the theme they correspond to, some of the themes being Social Events & Interaction, Travel, Work & Survival, and Character Sketches.

I interpret this visualization as, broadly, a representation of the themes in Part One of On The Road. It has been broken up further, but it would make the same amount of sense if there was one point in the middle representing this space, with colored lines coming off it representing themes, weighted to show how common each theme is. However, such a network would fail to show the spirit of Kerouac’s work. With the structure Posavec has used, we can also see Kerouac’s writing style and get a sense of the book itself: average paragraph length, chapter length, and sentence structure. Even so, this is a very simple unimodal network.
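The simpler alternative described above, one central point with theme lines weighted by how common each theme is, can be sketched by counting theme labels across the chapter, paragraph, and sentence hierarchy. The theme names come from the visualization, but the tiny nested structure and the label assigned to each sentence are invented for illustration and are not Posavec's actual coding of the text.

```python
from collections import Counter

# Invented stand-in for Part One: chapter -> paragraphs -> one theme per sentence.
part_one = {
    "Chapter 1": [["Travel", "Character Sketches"], ["Travel"]],
    "Chapter 2": [["Work & Survival", "Travel"]],
}

def theme_weights(book):
    """Count how many sentences carry each theme across the whole book."""
    weights = Counter()
    for paragraphs in book.values():
        for sentence_themes in paragraphs:
            weights.update(sentence_themes)
    return weights

def paragraph_lengths(book):
    """Sentences per paragraph, which the flat theme count throws away."""
    return {
        chapter: [len(paragraph) for paragraph in paragraphs]
        for chapter, paragraphs in book.items()
    }

print(theme_weights(part_one))
print(paragraph_lengths(part_one))
```

The first function gives only the weighted theme lines; the second recovers the paragraph and chapter structure that the fuller visualization keeps and the simpler network would lose.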

I was debating whether this project could even be considered a network, as it almost seems too straightforward. But thinking through Weingart’s “Demystifying Networks”, it seems to be applicable. First, the variables are interdependent rather than independent. Second, the relationships can be described with the term “x is the theme of this paragraph”. The edges are inferred, because other readers could break the themes up in different ways than Posavec. Not all the nodes are connected, as in not all the words/sentences relate to the same theme, which makes the network effective.

Posavec’s network of On the Road Part One is a good example of a digital humanities use for networks because it maps a literary work itself and seeks to capture the spirit of the book without using words.