Wordnet 3.0 in RDF

This RDF version of Wordnet 3.0 uses the following namespace:

http://purl.org/vocabularies/princeton/wn30/

Browsers requesting URLs in this namespace will be redirected by purl.org to an HTML rendering of the requested resource. See wn30:synset-chair-noun-1 for an example. Semantic Web applications that use the HTTP Accept request header to explicitly request application/rdf+xml will be redirected to an RDF/XML rendering of the symmetric concise bounded description of the resource. You can also override the request headers of your browser by adding a .rdf or .ttl suffix to the URL. See wn30:synset-chair-noun-1.rdf for an example.
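The two ways of asking for RDF described above can be sketched in Python with only the standard library. This is an illustration under assumptions, not part of the published service: the helper names (rdf_request, suffixed_url, fetch) are hypothetical, and whether the purl.org redirect still resolves is not something the sketch checks.

```python
import urllib.request

# Namespace as given in the text above.
NAMESPACE = "http://purl.org/vocabularies/princeton/wn30/"


def rdf_request(local_name):
    """Build (but do not send) a request that asks for RDF/XML via the Accept header."""
    return urllib.request.Request(
        NAMESPACE + local_name,
        headers={"Accept": "application/rdf+xml"},
    )


def suffixed_url(local_name, suffix="rdf"):
    """Build the URL that selects a serialization by suffix instead of by header."""
    if suffix not in ("rdf", "ttl"):
        raise ValueError("the text only mentions the .rdf and .ttl suffixes")
    return f"{NAMESPACE}{local_name}.{suffix}"


def fetch(request, timeout=10):
    """Send the request; urllib follows HTTP redirects such as the one from purl.org."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling fetch(rdf_request("synset-chair-noun-1")) would perform the actual network request; the first two helpers only construct the request and the URL, so they can be tried offline.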

Alternatively, you can browse or download the latest version of all source files directly from our git repository.

More good news: Wordnet 3.0 is now also present in the famous LOD cloud graph! Many thanks to Anja and Richard!

Mark, Antoine and I are busy cleaning up the schema files. This is not ready yet, but you can follow our progress on GitHub.

I've published my mappings from Wordnet 3.0 synsets to Wordnet 2.0 synsets in the wn20mappings folder.

Note: The mappings in this folder are not based on any Princeton source file. All erroneous mappings are my responsibility, not Princeton's.

Mapping statistics and origin:

The mappings in this folder have been created in multiple steps. The result of each step is reflected in a separate file.

In the RDF version we have 117,657 Wordnet 3.0 synsets to be mapped.

Step one: detecting synsets with identical label and gloss (103,339)

I've detected 103,339 Wordnet 3.0 synsets with a unique one-to-one mapping to a Wordnet 2.0 synset on the basis of having both an identical label and an identical gloss. I assume these synsets correspond. Note that this first step already covers around 88% of all synsets.
Results are in file: glossmatches-m.ttl
An additional twelve Wordnet 3.0 synsets were found that had a mapping to two Wordnet 2.0 synsets, based on identical gloss and label, as well as two Wordnet 3.0 synsets that were both mapped to the same Wordnet 2.0 synset.
Results are in file: glossmatches-p.ttl
Since the mappings in the file above are ambiguous, we will ignore them in the following steps.
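Step one can be sketched roughly as follows. This is a minimal illustration, not the code that produced the published files: the function name, the dictionary layout (synset id mapped to a label and gloss pair) and the toy ids in the usage note are all assumptions. A pair counts as unique only when exactly one synset on each side carries that label and gloss; everything else with a match is set aside as ambiguous.

```python
from collections import defaultdict


def match_by_label_and_gloss(wn30, wn20):
    """Split exact label+gloss matches into one-to-one and ambiguous mappings.

    wn30, wn20: dict of synset id -> (label, gloss). Hypothetical layout.
    Returns (unique, ambiguous): unique maps a 3.0 id to one 2.0 id,
    ambiguous maps a 3.0 id to the sorted list of candidate 2.0 ids.
    """
    by_key30 = defaultdict(list)
    for sid, key in wn30.items():
        by_key30[key].append(sid)
    by_key20 = defaultdict(list)
    for sid, key in wn20.items():
        by_key20[key].append(sid)

    unique, ambiguous = {}, {}
    for key, ids30 in by_key30.items():
        ids20 = by_key20.get(key, [])
        if not ids20:
            continue  # no 2.0 synset with this label and gloss
        if len(ids30) == 1 and len(ids20) == 1:
            unique[ids30[0]] = ids20[0]
        else:
            # Several candidates on either side: keep them apart from the clean set.
            for sid in ids30:
                ambiguous[sid] = sorted(ids20)
    return unique, ambiguous
```

With toy data such as wn30 = {"a3": ("chair", "a seat")} and wn20 = {"a2": ("chair", "a seat")}, the function returns ({"a3": "a2"}, {}).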

Step two: detecting synsets with identical label and strong family resemblances.

For every 3.0 synset that does not yet have a one-to-one mapping, I've looked at 2.0 synsets that have an identical label and that:

Have both a broader and a narrower synset in the hyponym hierarchy that were already matched by an earlier step.
Results are in file: label-childparent-matches.ttl (1,272/1,550).
Have only a broader match (based on the hyponym, meronym or instance relation).
Results are in file: label-parent-matches.ttl (3,396/3,682).
Results are in file: label-meronym-matches.ttl (1,403/1,561).
Results are in file: label-instance-matches.ttl (507/486).
Have only a narrower match (hyponym axis).
Results are in file: label-child-matches.ttl (309/141).
If none of the above applies, but a label occurs only once in wn30 and only once in wn20 (within the same part of speech), we consider the corresponding synsets to match as well.
Results are in file: label-unique-matches.ttl (1,562/1,200).
If none of the above applies, but the labels match and the glosses are very similar, we consider the corresponding synsets to match as well.
Results are in file: label-neargloss-matches.ttl (823/666).
Before saving the above 3 results, we removed the synsets for which this step resulted in ambiguous alignments, and saved these ambiguous mappings in a separate file.
Results are in file: ambiguous-label-pc-matches.ttl (253/279).
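The broader/narrower checks of step two could look roughly like the sketch below. It assumes the caller only passes in pairs of synsets whose labels are already identical, and it covers a single relation axis; the function names, the dictionary layout and the returned category strings (chosen to echo the file names above) are hypothetical.

```python
def shares_matched_neighbour(s30, s20, rel30, rel20, mapping):
    """True if some neighbour of s30 is already mapped to a neighbour of s20.

    rel30, rel20: dict of synset id -> set of neighbour ids along one relation.
    mapping: dict of already accepted 3.0 id -> 2.0 id mappings.
    """
    targets = rel20.get(s20, set())
    return any(mapping.get(n) in targets for n in rel30.get(s30, set()))


def classify(s30, s20, broader30, broader20, narrower30, narrower20, mapping):
    """Classify a same-label candidate pair by which neighbours are already matched."""
    up = shares_matched_neighbour(s30, s20, broader30, broader20, mapping)
    down = shares_matched_neighbour(s30, s20, narrower30, narrower20, mapping)
    if up and down:
        return "childparent"
    if up:
        return "parent"
    if down:
        return "child"
    return None  # falls through to the unique-label and near-gloss rules
```

The ordering of the branches mirrors the order of the rules in the text: the strongest evidence (both a broader and a narrower match) is tested first.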

Step three: rerun step two

We rerun step two multiple times to take advantage of the newly generated mappings, repeating until no new mappings are found (this was the case after three repetitions). The second number in the statistics above shows the number at which each count stabilizes.
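Rerunning a matching step until nothing new turns up is a fixed-point iteration. The sketch below shows the idea on a three-synset toy chain; the function names, the toy ids and the single parent rule are illustrative assumptions and stand in for the full set of rules in step two.

```python
def rerun_until_stable(step, mapping):
    """Apply step repeatedly until it yields no new mappings.

    step: function taking the current mapping and returning candidate additions.
    Returns the final mapping and the number of rounds that added something.
    """
    mapping = dict(mapping)  # work on a copy
    rounds = 0
    while True:
        new = {k: v for k, v in step(mapping).items() if k not in mapping}
        if not new:
            return mapping, rounds
        mapping.update(new)
        rounds += 1


# Toy data: a chain a -> b -> c in both versions, with only "a" matched so far.
PARENT30 = {"b3": "a3", "c3": "b3"}
PARENT20 = {"b2": "a2", "c2": "b2"}
SAME_LABEL = {"b3": "b2", "c3": "c2"}  # candidate pairs with identical labels


def parent_rule(mapping):
    """Accept a candidate pair when the parents are already mapped to each other."""
    return {
        s30: s20
        for s30, s20 in SAME_LABEL.items()
        if s30 not in mapping and mapping.get(PARENT30[s30]) == PARENT20[s20]
    }


final, rounds = rerun_until_stable(parent_rule, {"a3": "a2"})
```

In the toy chain, "c3" can only be matched after "b3" has been, which is why a single pass is not enough and later passes pick up additional mappings.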

Analysis of recall

This leaves us with 117657 - 103339 - 1550 - 3682 - 1561 - 486 - 141 - 1200 - 666 - 165 = 4869 unmapped synsets. These are in the file:
to_be_mapped.ttl (4869)
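The subtraction above can be rechecked with a few lines of Python, using only the figures quoted in the text. Recomputed this way, the remainder comes out at 4,867 rather than the stated 4,869, and the last term (165) is not explained in the text; the sketch reports both numbers rather than choosing between them. The variable names are illustrative.

```python
TOTAL_WN30 = 117657   # Wordnet 3.0 synsets to be mapped
STEP_ONE = 103339     # unique label+gloss matches

# Terms subtracted in the text, in order, labelled by the file they appear to come from.
SUBTRACTED = {
    "label-childparent-matches": 1550,
    "label-parent-matches": 3682,
    "label-meronym-matches": 1561,
    "label-instance-matches": 486,
    "label-child-matches": 141,
    "label-unique-matches": 1200,
    "label-neargloss-matches": 666,
    "last term (not explained in the text)": 165,
}

STATED_REMAINDER = 4869  # figure given in the text and for to_be_mapped.ttl

mapped = STEP_ONE + sum(SUBTRACTED.values())
remainder = TOTAL_WN30 - mapped
difference = STATED_REMAINDER - remainder
step_one_coverage = round(100 * STEP_ONE / TOTAL_WN30)  # percentage covered by step one

print(f"recomputed remainder: {remainder}, stated: {STATED_REMAINDER}, difference: {difference}")
print(f"step one coverage: about {step_one_coverage}%")
```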

A quick manual inspection showed that many of these unmapped synsets are new senses of existing words. Improvements on the mappings will be posted on this site.

Today we publish the RDF version of Princeton Wordnet 3.0 as Linked Open Data in the following namespace:

http://purl.org/vocabularies/princeton/wn30/

Browsers requesting URLs in this namespace will be redirected by purl.org to an HTML rendering of the requested resource. Semantic Web applications requesting application/rdf+xml will be redirected to an RDF/XML rendering of the resource.

Unlike the 2.0 version hosted by W3C, you can interactively search this version by using the textbox at the right of this page. You can use this, for example, to explore the many senses of the word bank.
