R Tutorial - Facebook
By Rinke Hoekstra and Willem Robert van Hage.
In this tutorial, you will analyze your own social network by connecting to the Facebook Open Graph API, and visualizing the results in Gephi (you need at least version 0.8.1).
Again, we will use the open source R language and the RStudio program for statistical analysis.
Getting Started
- Start RStudio. It makes sense to start a new "project" in a directory of your choosing.
- This tutorial uses the rjson, RCurl, XML, igraph and bitops libraries. You can install these on Linux or OS X like this:
install.packages(c('rjson','RCurl','XML','igraph','bitops'),dependencies=TRUE)
On Windows you will have to manually download the packages from this directory and install them like this:
install.packages(choose.files(), repos=NULL)
and then select the downloaded packages.
- Download facebook.R, save it to your project directory, and open it in RStudio.
- Click on "Source" in the top right corner of the pane in which the file was loaded.
- Go to https://developers.facebook.com/tools/explorer, log in with your Facebook account, and click on "Get Access Token". Be sure to tick the following boxes:
  - User Data Permissions: user_hometown, user_location, user_interests, user_likes, user_relationships
  - Friends Data Permissions: friends_hometown, friends_location, friends_interests, friends_likes, friends_relationships
  - Extended Permissions: read_friendlists
  Click on "Get Access Token" and paste the long string of characters that appears in the box after "Access Token" between the two quotation marks in the assignment to access_token in facebook.R. For instance:
  access_token <- "AAACEdEose0cBAAwgqS…O1DRHN506ObjNxv2P4hJ02ugH"
NB: This token may expire during this lab session, in which case most of the functions will stop working. To fix this, go back to https://developers.facebook.com/tools/explorer and get a new access token.
The Facebook Graph API Explorer
The Facebook Graph API Explorer at https://developers.facebook.com/tools/explorer allows you to manually construct queries against the Facebook Graph API.
Question 1
- Try building a couple of queries, e.g. using your friends or your friend lists, and look at the results.
- What are suitable queries (i.e. what are suitable attributes of your profile) for constructing an Affiliation Graph?
Building a Friend Graph
We will now use the facebook.R script to query your Facebook account for your friends and the friends you mutually share (Facebook does not allow us to see friends of friends that you are not connected to yourself).
As soon as you Source the facebook.R script, it will use the facebook function to query the Graph API for a list of friends:
friends <- facebook(path="me/friends", access_token=access_token)
It then creates two vectors, friends.id and friends.name, that respectively hold the identifiers and the names of your friends.
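The code that builds these vectors lives in facebook.R and is not shown here. As a rough sketch of the idea, the snippet below extracts identifiers and names from a mock parsed response. The shape of the list (a data element holding one entry per friend) and the names and ids in it are assumptions made for illustration, so the real script may differ.

```r
# Mock of a parsed "me/friends" response. The structure is an assumption
# and the names and ids are invented for this example.
friends <- list(data = list(
  list(name = "Alice", id = "101"),
  list(name = "Bob",   id = "102"),
  list(name = "Carol", id = "103")
))

# Pull one field out of every entry; vapply guarantees a character vector.
friends.id   <- vapply(friends$data, function(f) f$id,   character(1))
friends.name <- vapply(friends$data, function(f) f$name, character(1))

print(friends.id)
print(friends.name)
```

Both vectors keep the order of the response, so the i-th identifier belongs to the i-th name.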
Call the friendships function to query the Graph API for the mutual friends list of each friend, and construct a friendship matrix:
f_m <- friendships()
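The friendships function itself is defined in facebook.R. To give a feel for how a friendship matrix could be assembled from per-friend mutual friend lists, here is a minimal sketch on invented data. The variable names friend_ids, mutual and m are hypothetical and this is not necessarily how the script does it.

```r
# Invented example data: for each friend, the ids of the friends you share.
friend_ids <- c("101", "102", "103", "104")
mutual <- list(
  "101" = c("102", "103"),
  "102" = c("101"),
  "103" = c("101"),
  "104" = character(0)
)

# Start with an all-zero matrix, one row and one column per friend.
m <- matrix(0, nrow = length(friend_ids), ncol = length(friend_ids),
            dimnames = list(friend_ids, friend_ids))

# Mark each mutual friendship in both directions.
for (id in names(mutual)) {
  for (other in mutual[[id]]) {
    m[id, other] <- 1
    m[other, id] <- 1
  }
}

print(m)
```

Printing the result next to f_m can help when answering the first part of Question 2.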
Create a graph out of it:
f_g <- friendgraph(f_m)
Write the graph to a GraphML file as follows:
write.graph(f_g, "friends.graphml",format="graphml")
(use getwd() to see in what directory it was saved)
Then open the graph in Gephi via the File menu.
Question 2
- Have a look at the friendship matrix f_m. What kind of matrix is this?
- In Gephi, show the labels of the graph (in the usual way) and run some analyses.
- What modules does the graph contain? Do they correspond to reality?
- Who (apart from yourself) are the most central nodes in your graph? Explain.
- Who are the "unhappiest" people in your network? Explain the results.
- To what extent does the strong triadic closure principle apply? Tip: create a complement of the graph using
unfriendship.graph <- graph.complementer(friendship.graph)
and load it in Gephi. Or calculate the number of expected edges compared to the existing edges. Explain.
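The edge-counting route from the tip can be sketched in base R on a toy network. The snippet below compares existing edges with the number of possible edges and counts how many connected triples are closed into triangles. It treats every tie as equally strong, which is a simplifying assumption, and the four-node network is invented; substitute your own friendship matrix to get real numbers.

```r
# Toy undirected network (invented): A-B, A-C, B-C form a triangle, D hangs off A.
nodes <- c("A", "B", "C", "D")
adj <- matrix(0, 4, 4, dimnames = list(nodes, nodes))
edges <- rbind(c("A", "B"), c("A", "C"), c("B", "C"), c("A", "D"))
for (i in seq_len(nrow(edges))) {
  adj[edges[i, 1], edges[i, 2]] <- 1
  adj[edges[i, 2], edges[i, 1]] <- 1
}

n <- nrow(adj)
existing_edges <- sum(adj) / 2                     # each edge is stored twice
possible_edges <- n * (n - 1) / 2                  # edges in a complete graph
missing_edges  <- possible_edges - existing_edges  # edges of the complement

# A connected triple is a node together with two of its neighbours.
degree <- rowSums(adj)
connected_triples <- sum(choose(degree, 2))

# Each triangle contributes six closed walks of length 3 to the diagonal.
triangles <- sum(diag(adj %*% adj %*% adj)) / 6

# Fraction of connected triples that are closed (every triangle closes three).
closed_fraction <- 3 * triangles / connected_triples

print(c(existing = existing_edges, possible = possible_edges,
        missing = missing_edges, closed_fraction = closed_fraction))
```

A closed fraction near 1 means most pairs of friends who share a friend also know each other, while a low value points to many open triads.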
Building an Affiliation Graph
In Question 1 you gave some thought to what focal points you could use to create an Affiliation Graph. Let's see how far we can get in building such a graph.
The frienddata function creates a matrix (table) with each friend's name, location and hometown attributes:
f_d <- frienddata()
Have a look at it.
Use last week's tricks to create a bipartite graph of your friends and one of these attributes. For instance, the location attribute:
friend_location_matrix <- as.matrix(ifelse(table(f_d$name,f_d$location) > 0, 1, 0))
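To see what this line produces, the same expression can be run on a small stand-in for f_d. The data frame f_d_example and its contents are invented for illustration; the real f_d comes from frienddata().

```r
# Invented stand-in for f_d with the two columns used above.
f_d_example <- data.frame(
  name     = c("Alice", "Bob", "Carol", "Dave"),
  location = c("Amsterdam", "Utrecht", "Amsterdam", "none"),
  stringsAsFactors = FALSE
)

# table() counts how often each (name, location) pair occurs;
# ifelse() turns those counts into 0/1 membership flags.
counts <- table(f_d_example$name, f_d_example$location)
incidence <- as.matrix(ifelse(counts > 0, 1, 0))

print(incidence)
```

Rows are friends and columns are locations, with a 1 wherever a friend has that location. Note that "none" shows up as a location of its own.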
Create a bipartite graph from the friend_location_matrix:
f_l <- graph.incidence(friend_location_matrix)
Have a look at the vertices:
V(f_l)
As you can see, the names of your friends appear first, and the locations they have appear second.
Add some labels for use in Gephi:
n_friends <- length(unique(f_d$name))
n_locations <- length(unique(f_d$location))
V(f_l)[1:n_friends]$kind <- "friend"
V(f_l)[n_friends+1:n_locations]$kind <- "location"
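The index expression n_friends+1:n_locations relies on R's operator precedence: the colon operator binds more tightly than the plus, so the sequence is built first and then shifted. The short check below uses made-up counts to show the difference from the superficially similar parenthesised form.

```r
# Made-up counts for illustration.
n_friends   <- 3
n_locations <- 2

# ':' is evaluated before '+', so this is n_friends + c(1, 2).
idx <- n_friends + 1:n_locations
print(idx)

# Adding parentheses around the sum changes the meaning:
# this is the sequence from 4 down to 2.
wrong <- (n_friends + 1):n_locations
print(wrong)

# An equivalent, more explicit way to write the intended range.
explicit <- (n_friends + 1):(n_friends + n_locations)
print(explicit)
```

The first and last forms select the location vertices that follow the friend vertices.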
Set the Label attribute of each vertex to its name to make life easier in Gephi, and save the graph to a GraphML file:
V(f_l)$Label <- V(f_l)$name
write.graph(f_l,"friend_location.graphml",format="graphml")
Open the graph in Gephi and color the nodes according to the partition. What do you see?
Apply a filter to remove the 'none' location:
- Go to Filters (on the right).
- Click on Attributes -> Equal -> kind.
- Fill in (?!none).* in the pattern box, and check "Use regex".
- Click OK, and then Filter.
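The pattern (?!none).* uses a negative lookahead: it accepts any value that does not start with "none". Its effect can be previewed in R, assuming Gephi matches the pattern against the whole value, which is mimicked here with the ^ and $ anchors. The sample values are invented.

```r
# Invented sample values, including the placeholder "none".
values <- c("Amsterdam", "none", "Utrecht", "nonesuch")

# perl = TRUE enables lookahead syntax. The anchors force a whole-string
# match; without them the pattern could match from a later position.
keep <- grepl("^(?!none).*$", values, perl = TRUE)

print(values[keep])
```

Note that any value merely beginning with "none" is dropped as well, not only the exact placeholder.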
Question 3
- Does the network make any sense? What can you say about the locations, and the importance of the location in the different parts of your network?
- Add the first friend graph you created via the File menu. Make sure to check the append graph checkbox!
- What can you say about the results now? How could we improve these results?
Question 4
- Do the same for the hometown attribute.
- How do the results differ from the location attribute? Which attribute is a stronger focal point? In what way?
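One possible way to put a number on "stronger focal point", offered only as a sketch, is to count how many pairs of friends share a value for each attribute. The helper shared_pairs and the sample vectors below are hypothetical; other measures may suit your network better.

```r
# Hypothetical helper: number of friend pairs that share the same value,
# ignoring the placeholder used for missing data.
shared_pairs <- function(values, ignore = "none") {
  group_sizes <- as.vector(table(values[values != ignore]))
  sum(choose(group_sizes, 2))  # a group of k friends contains choose(k, 2) pairs
}

# Invented attribute values for five friends.
location <- c("Amsterdam", "Amsterdam", "Utrecht", "Amsterdam", "none")
hometown <- c("Leiden", "Groningen", "Leiden", "none", "none")

print(shared_pairs(location))
print(shared_pairs(hometown))
```

With your own data, f_d$location and f_d$hometown would take the place of the sample vectors.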
NB: I can try to add functionality on request, for instance if you think another attribute is more interesting.
VU University Amsterdam R Tutorials by Willem Robert van Hage is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
This tutorial has been developed within the COMBINE project supported by the ONR Global NICOP grant N62909-11-1-7060.