In the context of our Web Science and Semantic Web research and education program at the VU University Amsterdam we regularly build educational material. This web page lists our currently available tutorials about how to do science on and about the Web using the R statistical programming language and some external tools, like Gephi.
- R Tutorial - Alice in Wonderland: Analyzing Alice in Wonderland from the Gutenberg project. Word distributions and Information Retrieval basics.
- R Tutorial - Wikipedia as a Graph: Exploring the link graph network of Wikipedia in R and Gephi (at least version 0.8.1).
- R Tutorial - Facebook: Exploring the Facebook social network in R using the Facebook Open Graph API.
- SPARQL for R Tutorial - Linked Open Piracy: Analyzing Maritime Piracy in R using the SPARQL package on the Linked Open Piracy data set. This tutorial is about basic RDFS semantics, statistics, and mapping.
- SPARQL for R Tutorial - Hollywood Social Network Analysis: Analyzing the Hollywood movie social network in R using the SPARQL package on the DBpedia data set. This tutorial is about network analysis and visualization in Gephi.
Requirements
The tutorials have been tested with RStudio, R 2.15.1 and Gephi 0.8.1. The following R libraries are used: SPARQL, igraph, XML, RCurl, rjson, sp, zoo, mapproj, ggmap. You can install these on Linux or OS X like this:
install.packages(c('SPARQL','sp','ggmap','mapproj','igraph','network','ergm','zoo','gsubfn','rjson'),dependencies=TRUE)
On Windows, copy the files from this directory to your computer and install them like this:
install.packages(choose.files(), repos=NULL)
and then select the downloaded zip files. If you are running a Linux system, you may need additional libraries, like GLUT. A detailed description of how to install these on Ubuntu can be found here.
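If some of these packages are already present, it may be convenient to install only the ones that are missing. The following is a minimal sketch of that idea, not part of the tutorials themselves: the helper name missing_packages is hypothetical, and the package list is copied from the install command above. The install step only runs in an interactive session. Otherwise the script just reports what is missing.

```r
# Package names copied from the install command on this page.
required <- c("SPARQL", "sp", "ggmap", "mapproj", "igraph",
              "network", "ergm", "zoo", "gsubfn", "rjson")

# Hypothetical helper: returns the names in `wanted` that do not occur in `have`.
# By default `have` is the set of packages installed in the current R library.
missing_packages <- function(wanted, have = rownames(installed.packages())) {
  setdiff(wanted, have)
}

to_install <- missing_packages(required)

if (length(to_install) == 0) {
  message("All required packages are already installed.")
} else if (interactive()) {
  # Only install when a user is present to pick a CRAN mirror.
  install.packages(to_install, dependencies = TRUE)
} else {
  message("Missing packages: ", paste(to_install, collapse = ", "))
}
```

Calling missing_packages(required) again after installation should return an empty character vector if everything installed correctly.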
A self-contained RDF store
The Linked Open Piracy tutorial and the Hollywood Social Network Analysis tutorial make use of SPARQL queries on an RDF triple store. To make it easy for everybody to reproduce the results presented in these tutorials we provide a self-contained Jena Fuseki RDF triple store with all the data used in these tutorials. You can download a zip file that contains all the data here: tutorial_data_jena_fuseki.zip. Just unzip the file, enter the jena-fuseki-0.2.4 directory and run either run.sh (Linux or OS X) or run.bat (Windows). This starts the server on your own computer, so that the SPARQL client in R can contact it locally.
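Once the server is running, a quick way to check that R can reach it is to send a small query. The sketch below makes several assumptions: it uses Fuseki's default port 3030, and the dataset name "tutorial" in the endpoint URL is hypothetical, so replace it with the name the bundled server actually exposes. The query itself only fetches ten arbitrary triples and is not taken from the tutorials. The request is sent only in an interactive session with the SPARQL package installed.

```r
# Assumption: Fuseki listens on its default port 3030.
# Assumption: "tutorial" is a placeholder dataset name, not the real one.
endpoint <- "http://localhost:3030/tutorial/query"

# A generic smoke-test query: fetch ten arbitrary triples from the store.
query <- "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

if (interactive() && requireNamespace("SPARQL", quietly = TRUE)) {
  # SPARQL() returns a list; the query results are in its `results` element.
  res <- SPARQL::SPARQL(url = endpoint, query = query)
  print(res$results)
} else {
  message("Skipping query: run interactively with the SPARQL package installed.")
}
```

If the call fails with a connection error, the server is probably not running or the endpoint URL does not match the dataset name configured in the store.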
Some of the tutorials mentioned on this page make use of the SPARQL package for R. You can find more information about this package, which requires R 2.15, at http://cran.r-project.org/web/packages/SPARQL/. You can find more tutorials about SPARQL and R on http://linkedscience.org.
If you have any questions, remarks or additions, please contact me:
Willem Robert van Hage, E-mail
VU University Amsterdam R Tutorials by Willem Robert van Hage is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
The tutorials on this website have been developed in part within the COMBINE project supported by the ONR Global NICOP grant N62909-11-1-7060.