Raw data, conversion scripts and results

On this page, we list the input and intermediate results of the Amsterdam Museum conversion, as described in the paper submitted to ESWC 2012, "Supporting Linked Data Production for Cultural Heritage institutes: The Amsterdam Museum Case Study". The steps we refer to are listed in the figure below:


Step 1: XML ingestion files

The XML input consists of multiple parts. The collection metadata was provided in 73 files of 1000 records each. The download files can be found at the VUA Eculture Git server. The concept thesaurus and person authority file were provided as separate XML files and can be downloaded directly from the Git server: people.xml (10.2 MB) and thesaurus.xml (9.3 MB).
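As a quick sanity check after downloading, the record count per file can be verified against the figures above. The following is a minimal sketch only: the element name `record` and the sample document are hypothetical stand-ins, since the actual structure of the Amsterdam Museum XML is not described on this page.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for one ingestion file; the real element names
# used in the Amsterdam Museum XML are an assumption here.
SAMPLE_FILE = """<recordList>
  <record><title>Example object 1</title></record>
  <record><title>Example object 2</title></record>
</recordList>"""


def count_records(xml_text, record_tag="record"):
    """Count elements named record_tag anywhere in one XML document."""
    root = ET.fromstring(xml_text)
    return sum(1 for _ in root.iter(record_tag))


def count_all(xml_texts, record_tag="record"):
    """Sum the record counts over several XML documents."""
    return sum(count_records(text, record_tag) for text in xml_texts)


# Two copies of the sample stand in for the downloaded files.
print(count_all([SAMPLE_FILE, SAMPLE_FILE]))
```

For the real download, the strings would be replaced by the contents of the 73 collection files, and `record_tag` adjusted to whatever element the files actually use.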

Step 2 and 3: XMLRDF files

XMLRDF rewrite files:
Filename           Description
rewrite_data.pl    Rewrite rules for collection metadata
rewrite_people.pl  Rewrite rules for Person Authority file
rewrite_thes.pl    Rewrite rules for Thesaurus
util.pl            Prolog utility predicates (re)used for the conversions


Step 2 and 3: Converted RDF

Here we list pointers to the individual converted Turtle files. The files are hosted on the Eculture Git server.
The RDF Data files:
Size (bytes)  Filename          Description
240530197     am-data.ttl       Collection Metadata (large)
5318673       am-thesaurus.ttl  Thesaurus Data
8483717       am-people.ttl     Person Authority File
2320          void.ttl          VoID file, describing the dataset
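To get a feel for the download size, the figures in the table above can be converted to megabytes and summed. This is a small sketch that assumes the listed sizes are byte counts; the page does not state the unit explicitly.

```python
# Sizes copied from the table above; assumed to be in bytes.
SIZES = {
    "am-data.ttl": 240530197,
    "am-thesaurus.ttl": 5318673,
    "am-people.ttl": 8483717,
    "void.ttl": 2320,
}


def to_megabytes(n_bytes):
    """Convert a byte count to decimal megabytes (10**6 bytes)."""
    return n_bytes / 1_000_000


total = sum(SIZES.values())
for name, size in SIZES.items():
    print(f"{name}: {to_megabytes(size):.1f} MB")
print(f"total: {to_megabytes(total):.1f} MB")
```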


Step 4: Create metadata schema mapping

Below, we list the schema mapping RDF files that map the Amsterdam Museum classes and properties to EDM classes and properties (DCterms, SKOS, RDA Group 2 Elements and a few EDM-specific classes and properties).
The RDF Schema files:
Size (bytes)  Filename                     Description
10567         am-schema.ttl                Collection Metadata Schema
855           am-thesaurus-schema.ttl      Thesaurus Schema
2211          am-people-rdagr2-schema.ttl  Person Schema (to RDA Group 2 Elements)
51590         ElementsGr2.rdf              The RDA Group 2 Elements Schema


Step 5: Align vocabularies

The Amalgame alignment strategy used is detailed here. Below we list the mapping files that are the result of this strategy.
The mapping files:
Size (bytes)  Filename                  Description
126129        am_to_aat_nonamb.ttl      Thesaurus to AATNed (high-precision)
94957         am_to_aat_amb.ttl         Thesaurus to AATNed (mid-precision)
3680          am_to_dbp_high.ttl        Persons to DBpedia (high-precision)
28058         am_to_dbp_low_nonamb.ttl  Persons to DBpedia (low-precision)
3680          am_to_geonl.ttl           Thesaurus to GeoNames (high-precision)
35141         am_to_ulannl_amb.ttl      Persons to ULAN (mid-precision)
92206         am_to_ulannl_nonamb.ttl   Persons to ULAN (high-precision)


Step 6: Publish as LOD

The converted, schema-mapped and aligned Amsterdam Museum Linked Open Data is registered at CKAN. There, you can find example URIs, the link to the server and the SPARQL endpoint.
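Once the endpoint address has been looked up at CKAN, a query can be sent to it as an ordinary HTTP GET request with a `query` parameter, as defined by the SPARQL Protocol. The sketch below only builds the request URL; the endpoint address `http://example.org/sparql` is a placeholder and the query is a generic example, neither taken from this page.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the real address is listed on the CKAN page.
ENDPOINT = "http://example.org/sparql"

# Generic example query: fetch any ten triples from the dataset.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_query_url(endpoint, query):
    """Return a SPARQL Protocol GET URL carrying the given query."""
    return endpoint + "?" + urlencode({"query": query})


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL can be fetched with any HTTP client, for example `urllib.request.urlopen`, typically with an `Accept: application/sparql-results+json` header to request JSON results.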