R Tutorial - Facebook
In this tutorial, you will analyze your own social network by connecting to the Facebook Open Graph API, and visualize the results in Gephi (you need at least version 0.8.1).
- Start RStudio. It makes sense to start a new "project" in a directory of your choosing.
- This tutorial uses the rjson, RCurl, XML, igraph and bitops libraries. You can install these on Linux or OS X like this:
install.packages(c('rjson','RCurl','XML','igraph','bitops'), dependencies=TRUE)
On Windows you will have to manually download the packages from this directory and install them like this:
install.packages(choose.files(), repos=NULL)
and then select the downloaded packages.
- Download facebook.R, save it to your project directory, and open it in RStudio.
- Click on "Source" in the top right corner of the pane in which the file was loaded.
- Go to https://developers.facebook.com/tools/explorer, log in with your Facebook account, and click on "Get Access Token". Be sure to tick the following boxes:
- User Data Permissions:
- Friends Data Permissions:
- Extended Permissions:
Click on "Get Access Token" and paste the long string of characters that appears in the box after "Access Token" between the two quotation marks in facebook.R. For instance:
access_token <- "AAACEdEose0cBAAwgqS…O1DRHN506ObjNxv2P4hJ02ugH"
NB: This token may 'expire' during this lab session… in that case most of the functions won't work anymore. To fix this, go back to https://developers.facebook.com/tools/explorer and get a new access token.
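The package installation step above can also be scripted so that you only see (and install) what is still missing. This is a minimal sketch in base R; the helper name missing_packages is made up for this example and is not part of facebook.R.

```r
# Packages named in this tutorial.
required <- c("rjson", "RCurl", "XML", "igraph", "bitops")

# Hypothetical helper: return the entries of `wanted` that are not installed.
missing_packages <- function(wanted,
                             installed = rownames(installed.packages())) {
  setdiff(wanted, installed)
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Not yet installed: ", paste(to_install, collapse = ", "))
  # Uncomment to install on Linux or OS X; on Windows follow the manual
  # download route described above.
  # install.packages(to_install, dependencies = TRUE)
} else {
  message("All required packages are installed.")
}
```

The install call is left commented out on purpose, so running the snippet only reports what is missing.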
The Facebook Graph API Explorer
The Facebook Graph API Explorer at https://developers.facebook.com/tools/explorer allows you to manually construct queries against the Facebook Graph API.
- Try building a couple of queries, e.g. using your friends, or friends list, and look at the results.
- What are suitable queries (i.e. what are suitable attributes of your profile) for constructing an Affiliation Graph?
Building a Friend Graph
We will now use the facebook.R script to query your Facebook account for your friends and the friends you mutually share (Facebook does not allow us to see friends of friends that you are not connected to yourself).
As soon as you Source the facebook.R script, it will use the facebook function to retrieve your friends:
friends <- facebook(path="me/friends", access_token=access_token)
It then creates two vectors that hold, respectively, a list of identifiers and a list of names (friends.name) for your friends.
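To get a feel for how such vectors can be derived, here is a small sketch in base R. It assumes that the parsed result keeps the usual shape of a Graph API list response, i.e. a data element holding one entry per friend with a name and an id; the real friends object returned by facebook.R may look different. The toy data and the helper name field are made up for this example.

```r
# Toy stand-in for a parsed me/friends response (names and ids are made up).
toy_friends <- list(data = list(
  list(name = "Alice Example", id = "1001"),
  list(name = "Bob Example",   id = "1002"),
  list(name = "Carol Example", id = "1003")
))

# Hypothetical helper: pull one field out of every entry.
# vapply guarantees that the result is a character vector.
field <- function(entries, key) {
  vapply(entries, function(entry) entry[[key]], character(1))
}

toy_ids   <- field(toy_friends$data, "id")
toy_names <- field(toy_friends$data, "name")

print(data.frame(id = toy_ids, name = toy_names))
```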
Use the friendships function to call the Graph API for the mutual friends list of each friend, and construct a friendship matrix:
f_m <- friendships()
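The idea behind such a matrix can be sketched without calling the API at all. The example below is not the code of the friendships function; it assumes we already have, for every friend, the identifiers of the friends you share with that person (the toy list mutual), and fills a square 0/1 matrix from it. All names here are hypothetical.

```r
# Toy identifiers and mutual-friend lists (made up for this sketch).
ids <- c("a", "b", "c", "d")
mutual <- list(a = c("b", "c"), b = c("a"), c = c("a", "d"), d = c("c"))

# Hypothetical helper: one row and one column per friend,
# with a 1 wherever two friends know each other.
build_matrix <- function(ids, mutual) {
  m <- matrix(0, nrow = length(ids), ncol = length(ids),
              dimnames = list(ids, ids))
  for (id in ids) {
    shared <- intersect(mutual[[id]], ids)
    m[id, shared] <- 1
    m[shared, id] <- 1   # friendship is mutual, so mirror the entry
  }
  m
}

toy_f_m <- build_matrix(ids, mutual)
print(toy_f_m)
```

Mirroring each entry keeps the matrix symmetric even if one of the mutual-friend lists is incomplete.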
Create a graph out of it:
f_q <- friendgraph(f_m)
Write the graph to a GraphML file as follows:
(use getwd() to see in which directory it was saved)
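If you want to try the matrix-to-graph-to-file steps on a small example first, the sketch below does so with plain igraph calls. It does not show what friendgraph does internally; it assumes a symmetric 0/1 matrix like the one described above and uses the current igraph function names graph_from_adjacency_matrix and write_graph. The toy matrix and the file name are made up, and the file goes to a temporary directory rather than your project directory.

```r
library(igraph)

# Toy symmetric friendship matrix for four people (made up).
people <- c("a", "b", "c", "d")
toy_f_m <- matrix(c(0, 1, 1, 0,
                    1, 0, 0, 0,
                    1, 0, 0, 1,
                    0, 0, 1, 0),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(people, people))

# Undirected graph: one edge for every pair with a 1 in the matrix.
toy_graph <- graph_from_adjacency_matrix(toy_f_m, mode = "undirected")

# Write it as GraphML so that Gephi can open it.
out_file <- file.path(tempdir(), "toy_friends.graphml")
write_graph(toy_graph, out_file, format = "graphml")
message("Saved to: ", out_file)
```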
And open the graph in Gephi via the File menu.
- Have a look at the friendship matrix
f_m, what kind of matrix is this?
- In Gephi, show the labels of the graph (in the usual way) and run some analyses.
- What modules does the graph contain? Do they correspond to reality?
- Who (apart from yourself) are the most central nodes in your graph? Explain.
- Who are the "unhappiest" people in your network? Explain the results.
- To what extent does the strong triadic closure principle apply? Tip: create a complement of the graph using
unfriendship.graph <- graph.complementer(friendship.graph), and load it in Gephi. Or calculate the number of expected edges compared to the existing edges. Explain.
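The comparison of expected and existing edges from the last question can be sketched on a toy graph as follows. This is only a rough proxy: the strong triadic closure principle is about tie strength, which this data does not contain, so the global transitivity below measures plain triadic closure. The toy graph is made up; complementer is the current igraph name for graph.complementer.

```r
library(igraph)

# Toy graph: a triangle a-b-c plus one extra person d who only knows c.
g <- graph_from_literal(a - b, b - c, a - c, c - d)

n <- vcount(g)
possible_edges <- n * (n - 1) / 2     # every pair of people connected
existing_edges <- ecount(g)

# The complement holds exactly the friendships that do NOT exist in g.
unfriendship <- complementer(g)

# Share of connected triples that are closed into triangles.
closure <- transitivity(g, type = "global")

cat("possible:", possible_edges,
    "existing:", existing_edges,
    "missing:", ecount(unfriendship),
    "transitivity:", closure, "\n")
```

A transitivity close to 1 means that most pairs of your friends who share a friend also know each other.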
Building an Affiliation Graph
In question 1 you gave some thought to what focal points you could use to create an Affiliation Graph. Let's see how far we can get in building such a graph.
The frienddata function creates a matrix (table) with each friend's attributes:
f_d <- frienddata()
Have a look at it.
Use last week's tricks to create a bipartite graph of your friends and one of these attributes. For instance, the location attribute:
friend_location_matrix <- as.matrix(ifelse(table(f_d$name,f_d$location) > 0, 1, 0))
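To see what this line does, you can run the same steps on a tiny made-up table. The sketch assumes that f_d behaves like a data frame with a name and a location column; the toy data below is invented for the example.

```r
# Toy version of the friend data (made up).
f_d_toy <- data.frame(
  name     = c("Alice", "Bob", "Carol", "Dave"),
  location = c("Amsterdam", "Amsterdam", "Utrecht", "none"),
  stringsAsFactors = FALSE
)

# table() counts how often each (name, location) pair occurs ...
counts <- table(f_d_toy$name, f_d_toy$location)

# ... and ifelse() turns any positive count into a 1, everything else into 0.
toy_matrix <- as.matrix(ifelse(counts > 0, 1, 0))
print(toy_matrix)
```

Each row is a friend and each column a location, so a 1 marks an affiliation between the two.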
Create a bipartite graph from the matrix:
f_l <- graph.incidence(friend_location_matrix)
Have a look at the vertices:
As you can see, the names of your friends appear first, and the locations they have appear second.
Add some labels for use in Gephi:
n_friends <- length(unique(f_d$name))
n_locations <- length(unique(f_d$location))
V(f_l)[1:n_friends]$kind <- "friend"
V(f_l)[(n_friends+1):(n_friends+n_locations)]$kind <- "location"
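An alternative that does not depend on counting positions is to use the logical type attribute that igraph attaches to the vertices of a bipartite graph built from a matrix (FALSE for the rows, TRUE for the columns). The sketch below uses a made-up matrix. Newer igraph releases call the constructor graph_from_biadjacency_matrix and older ones graph_from_incidence_matrix, so the code picks whichever is available; the name make_bipartite is hypothetical.

```r
library(igraph)

# Toy friend-by-location matrix (made up).
toy_matrix <- matrix(c(1, 0,
                       1, 0,
                       0, 1),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(c("Alice", "Bob", "Carol"),
                                     c("Amsterdam", "Utrecht")))

# Pick the constructor that this igraph version provides.
make_bipartite <- if (exists("graph_from_biadjacency_matrix",
                             envir = asNamespace("igraph"),
                             inherits = FALSE)) {
  igraph::graph_from_biadjacency_matrix
} else {
  igraph::graph_from_incidence_matrix
}

toy_f_l <- make_bipartite(toy_matrix)

# Rows (friends) get type FALSE, columns (locations) get type TRUE.
V(toy_f_l)$kind <- ifelse(V(toy_f_l)$type, "location", "friend")
print(data.frame(name = V(toy_f_l)$name, kind = V(toy_f_l)$kind))
```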
Set the Label attribute to the name to make life easier in Gephi, and save the graph to a GraphML file:
V(f_l)$Label <- V(f_l)$name
write.graph(f_l, "friend_location.graphml", format="graphml")
Open the graph in Gephi and color the nodes according to the partition. What do you see?
Apply a filter to remove the 'none' location:
* Go to filters (on the right)
* Click on Attributes -> Equal -> kind
* Fill in (?!none).* in the pattern box, and check "Use regex"
* Click ok, and then Filter
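The pattern used in this filter is a negative lookahead: it accepts any value that does not start with none. You can try it out in R before using it in Gephi. The sketch assumes that Gephi matches the pattern against the whole value, which is why the R version adds the anchors ^ and $; without them grepl would look for a match anywhere in the string and would also accept "none". The test values are made up.

```r
# Values to test the filter pattern against (made up).
values <- c("none", "Amsterdam", "Utrecht", "nonexistent")

# (?!none) rejects a match if the text at this position starts with "none";
# ^ and $ force the pattern to cover the whole value.
pattern <- "^(?!none).*$"

# Lookaheads need the Perl-compatible regex engine.
keep <- grepl(pattern, values, perl = TRUE)
print(data.frame(value = values, kept = keep))
```

Note that a value such as "nonexistent" is filtered out as well, because the lookahead only inspects the beginning of the value.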
- Does the network make any sense? What can you say about the locations, and the importance of the location in the different parts of your network?
- Add the first friend graph you created via the file menu. Make sure to check the
- What can you say about the results now? How could we improve these results?
- Do the same for the
- How do the results differ from the location attribute? What attribute is a stronger focal point? In what way?
NB: I can try to add functionality on request. If you think another attribute is more interesting…
VU University Amsterdam R Tutorials by Willem Robert van Hage is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
This tutorial has been developed within the COMBINE project supported by the ONR Global NICOP grant N62909-11-1-7060.