rdf_abstract.pl -- Abstract RDF graphs

This module performs simple manipulations on RDF graphs represented as lists of rdf(S,P,O) terms. Supported operations:

merge_sameas_graph(+GraphIn, -GraphOut, +Options)
Merge nodes by owl:sameAs
bagify_graph(+GraphIn, -GraphOut, -Bags, +Options)
Bagify a graph, returning a new graph holding bags of resources playing a similar role in the graph.
abstract_graph(+GraphIn, -GraphOut, +Options)
Abstract nodes or edges using rdf:type, rdfs:subClassOf and/or rdfs:subPropertyOf
merge_sameas_graph(+GraphIn, -GraphOut, +Options) is det
Collapse nodes in GraphIn that are related through an identity mapping. By default, owl:sameAs is the identity relation. Options defines:
predicate(+PredOrList)
Use an alternative predicate, or a list of predicates, to be treated as identity relations.
sameas_mapped(-Assoc)
Assoc mapping each resource to the resource it was mapped to.
sameas_map(+Graph, +SameAs, -Map:assoc) is det [private]
Create an assoc with R->Set, where Set contains an ordered set of resources equivalent to R.
same_as(+Predicate:resource, +SameAs:list) is semidet [private]
True if Predicate expresses a same-as mapping.
representer_map(+List:list(Repr-Set), -Assoc) is det [private]
Assoc maps each element of Set to its representer.
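To illustrate the representer map, here is a minimal sketch rather than the module's actual code. The name representer_map_sketch/2 is hypothetical, and the sketch assumes that no resource occurs in two sets, since list_to_assoc/2 raises an error on duplicate keys.

```prolog
:- use_module(library(assoc)).
:- use_module(library(lists)).

% representer_map_sketch(+Pairs, -Assoc)
% Hypothetical helper. Pairs is a list of Repr-Set; Assoc maps every
% member of Set to Repr. Assumes the sets are disjoint.
representer_map_sketch(Pairs, Assoc) :-
    findall(Member-Repr,
            ( member(Repr-Set, Pairs),
              member(Member, Set)
            ),
            Mapped),
    list_to_assoc(Mapped, Assoc).
```

With the input [a-[a,b], c-[c,d]], looking up b in the resulting assoc yields a and looking up d yields c.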
bagify_graph(+GraphIn, -GraphOut, -Bags, +Options) is det
If a graph contains multiple objects of the same type (class) in the same location in the graph (i.e. all links are the same), create a bag. The bag is represented by a generated resource of type rdf:Bag and the RDF for the bags is put in Bags. I.e. appending GraphOut and Bags provides a proper RDF model. Options provides additional abstraction properties. In particular:
class(+Class)
Try to bundle objects under Class rather than their rdf:type. This option may be given multiple times.
property(+Property)
Treat predicates that are an rdfs:subPropertyOf Property as the same relation.
bagify_literals(+Bool)
If true (default), also try to put literals into a bag. Works well to collapse non-preferred labels.
To be done
- Handle the property option
canonise_options(+OptionsIn, -OptionsOut) is det [private]
Rewrite option list from possible Name=Value to Name(Value).
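The rewrite from Name=Value to Name(Value) can be sketched as follows. This is an illustration under the assumption that options already in Name(Value) form are passed through unchanged; canonise_options_sketch/2 is a hypothetical name, not the private predicate itself.

```prolog
% canonise_options_sketch(+OptionsIn, -OptionsOut)
% Hypothetical helper: turn each Name=Value into Name(Value) and
% leave every other option as it is.
canonise_options_sketch([], []).
canonise_options_sketch([Name=Value|T0], [Option|T]) :-
    !,
    Option =.. [Name, Value],
    canonise_options_sketch(T0, T).
canonise_options_sketch([Option|T0], [Option|T]) :-
    canonise_options_sketch(T0, T).
```

?- canonise_options_sketch([class=c, bagify_literals(true)], L).
L = [class(c), bagify_literals(true)].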
group_resources_by_class(+Resources, -ByClass, +Options) is det [private]
ByClass is a list of lists of resources that belong to the same class. As a first step, the classes specified in Options are processed.
has_class(+Match, +Class, +Node) is semidet [private]
class_of(+Node, +Match, -Class) is det [private]
class_of(+Node, +Match, +Class) is semidet [private]
resource_bags(+ByClass:list(list(resource)), +NodeToEdges:list(node-list(edges)), -RawBags:list(list(resource))) is det [private]
Find bags of resources that have the same connections.
ord_subkeys(+Keys, +Pairs, -SubPairs) is det [private]
SubPairs is the sublist of Pairs with a key in Keys.
Arguments:
Keys - Sorted list of keys
Pairs - Key-sorted pair-list
SubPairs - Key-sorted pair-list
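Because both inputs are sorted, the selection can be done in a single merge-style pass. The sketch below shows one way to do this; ord_subkeys_sketch/3 and its helpers are hypothetical names and the code is not taken from the module.

```prolog
% ord_subkeys_sketch(+Keys, +Pairs, -SubPairs)
% Hypothetical helper. Keys must be sorted and Pairs key-sorted.
ord_subkeys_sketch([], _, []).
ord_subkeys_sketch([K|Ks], Pairs, Sub) :-
    subkeys_(Pairs, K, Ks, Sub).

% Walk Pairs while K is the current candidate key.
subkeys_([], _, _, []).
subkeys_([K0-V|Pairs], K, Ks, Sub) :-
    compare(Order, K, K0),
    subkeys_(Order, K, Ks, K0-V, Pairs, Sub).

% =: keep the pair; the same key may occur again in Pairs.
subkeys_(=, K, Ks, Pair, Pairs, [Pair|Sub]) :-
    subkeys_(Pairs, K, Ks, Sub).
% <: K is not (or no longer) in Pairs; move to the next key.
subkeys_(<, _, Ks, Pair, Pairs, Sub) :-
    ord_subkeys_sketch(Ks, [Pair|Pairs], Sub).
% >: the pair's key is not the current key; skip the pair.
subkeys_(>, K, Ks, _, Pairs, Sub) :-
    subkeys_(Pairs, K, Ks, Sub).
```

?- ord_subkeys_sketch([a,c], [a-1,b-2,c-3], X).
X = [a-1, c-3].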
same_edges(+NodeToEdges:list(node-edges), -Bags:list(list)) is det [private]
Bags is a list of lists of resources (nodes) that share the same (abstracted) edges with the rest of the graph.
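Grouping nodes by identical edge lists can be sketched with library(pairs). This is an illustration only: same_edges_sketch/2 is a hypothetical name, the edge terms e(P,O) are made up for the example, and the sketch assumes that each edge list is already in a canonical (sorted) order and that only groups with at least two members count as bags.

```prolog
:- use_module(library(pairs)).
:- use_module(library(lists)).

% same_edges_sketch(+NodeToEdges, -Bags)
% Hypothetical helper. NodeToEdges is a list of Node-Edges, where
% Edges is a sorted list. Bags holds the groups of two or more nodes
% whose edge lists are identical.
same_edges_sketch(NodeToEdges, Bags) :-
    transpose_pairs(NodeToEdges, ByEdges),      % Edges-Node, key-sorted
    group_pairs_by_key(ByEdges, Grouped),       % Edges-[Node, ...]
    findall(Nodes,
            ( member(_-Nodes, Grouped),
              Nodes = [_,_|_]                   % at least two members
            ),
            Bags).
```

?- same_edges_sketch([n1-[e(p,x)], n2-[e(p,x)], n3-[e(q,y)]], Bags).
Bags = [[n1, n2]].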
graph_node_edges(+Graph, -NodeEdges:assoc, +Options) is det [private]
NodeEdges is an assoc from resource to a sorted list of involved triples. Only subject and objects are considered.

Processes the bagify_literals and property options.

property_map(+Options, -Map:assoc(P-Super)) [private]
Process the options, creating a map that replaces a property by its registered super.
abstract_property(+P0, +Map0, -P, -Map) is det [private]
Find the abstract property for some property P.
assign_bagids(+Bags:list(bag), -IDBags:list(id-bag)) [private]
Assign a bag identifier to each bag in Bags.
make_rdf_graphs(+IDBags, -RDFBags) is det [private]
Translate BagID-Members into an RDF graph.
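The following sketch shows what such a translation could look like. The documentation above only states that a bag is a resource of type rdf:Bag; the use of the container membership properties rdf:_1, rdf:_2, ... for the members is an assumption of this sketch, and all predicate names in it are hypothetical.

```prolog
:- use_module(library(lists)).
:- use_module(library(apply)).

% rdf_iri(+Local, -IRI): IRI in the RDF namespace (helper for this sketch).
rdf_iri(Local, IRI) :-
    atom_concat('http://www.w3.org/1999/02/22-rdf-syntax-ns#', Local, IRI).

% make_rdf_graphs_sketch(+IDBags, -RDFBags)
% Hypothetical helper. IDBags is a list of BagID-Members; RDFBags is
% a flat list of rdf(S,P,O) triples describing all bags.
make_rdf_graphs_sketch(IDBags, RDFBags) :-
    maplist(bag_triples, IDBags, PerBag),
    append(PerBag, RDFBags).

% One rdf:type triple plus one rdf:_N triple per member (assumed layout).
bag_triples(BagID-Members, [rdf(BagID, Type, Bag)|MemberTriples]) :-
    rdf_iri(type, Type),
    rdf_iri('Bag', Bag),
    findall(rdf(BagID, P, Member),
            ( nth1(I, Members, Member),
              atom_concat('_', I, Local),
              rdf_iri(Local, P)
            ),
            MemberTriples).
```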
merge_properties(+GraphIn, -GraphOut, +Options) is det [private]
Merge equivalent properties joining the same nodes. They are replaced by their common ancestors.
Arguments:
GraphIn - List of rdf(S,P,O)
GraphOut - List of rdf(S,P,O)
Options - Option list (unused)
common_ancestor_forest(:Pred, +Objects, -Forest) is det [private]
Forest is a minimal set of minimal spanning trees with real branching (more than one child per node) covering all Objects. The partial ordering is defined by the non-deterministic goal call(Pred, +Node, -Parent).
  • Build up a graph represented as Node->Children and a list of roots. The initial list of roots is Objects. The graph is built using breadth-first search to minimize depth.
  • Once we have all roots, we delete all branches that have only a single child.
Arguments:
Forest - A list of trees. Each tree is represented as Root-Children, where Children is a possibly empty list of sub-trees.
To be done
- First prune dead-ends?
rdf_db:rdf_global_term([ulan:assisted_by, ulan:cousin_of], In),
gtrace,
rdf_abstract:common_ancestor_forest(sub_property_of, In, Out).
keys_to_assoc(+Keys:list, +Value, -Assoc) is det [private]
True if Assoc is an assoc where each Key maps to Value.
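A minimal sketch of this relation using library(assoc) is given below. The name keys_to_assoc_sketch/3 is hypothetical and the code is an illustration, not the module's implementation.

```prolog
:- use_module(library(assoc)).

% keys_to_assoc_sketch(+Keys, +Value, -Assoc)
% Hypothetical helper: every key in Keys maps to the same Value.
keys_to_assoc_sketch(Keys, Value, Assoc) :-
    empty_assoc(Assoc0),
    add_keys(Keys, Value, Assoc0, Assoc).

add_keys([], _, Assoc, Assoc).
add_keys([Key|Keys], Value, Assoc0, Assoc) :-
    put_assoc(Key, Assoc0, Value, Assoc1),
    add_keys(Keys, Value, Assoc1, Assoc).
```

?- keys_to_assoc_sketch([a,b], 0, A), assoc_to_list(A, L).
L = [a-0, b-0].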
ancestor_tree(+Open, +Closed, +Targets, :Pred, +NodesIn, -NodesOut, -Roots) is det [private]
Explore the ancestor graph one more step. This is the main loop looking for a spanning tree. We are done if
  • There is only one open node left and no closed ones. We found the single common root.
  • No open nodes are left. We have a set of closed roots which form our starting points. We still have to figure out the minimal set of these, as some of the trees may overlap others.
  • We have an open node covering all targets. This is the lowest one as we used breadth-first expansion. This step is too expensive.
expand_ancestor_tree(+Open0, -Open, +Closed0, -Closed, +Nodes0, -Nodes, :Pred) [private]
Expand the explored graph with one level. Open are the currently open nodes. Closed are the nodes that have no parent and therefore are roots.
Arguments:
Nodes - An assoc R->(State*list(Child))
add_parents(+Parents:list, +Child, -NR, +NRT, +Nodes0, -Nodes) [private]
Add links Parent->Child to the tree Nodes0. The difference list NR\NRT contains Parents added new to the tree.
in_tree(?Node, +Root, +Nodes) is nondet [private]
True if Node appears in the tree below Root.
prune_forest(+Nodes, +Roots, -MinimalForest) is det [private]
MinimalForest is the minimal forest overlapping all targets.
To be done
- Currently doesn't remove unnecessary trees.
prune_root(+Nodes, +Root0, -Root) is det [private]
Prune the parts of the search tree that ended up nowhere. The first real branch is where we find a solution or there are multiple parents. This avoids doing double work pruning the trees themselves.
prune_ancestor_tree(Nodes, Root, Tree) is det [private]
Tree is a pruned hierarchy from Root using the branching paths of Nodes.
tree_covers(+Root, +Nodes, -Targets:list) is det [private]
True if Targets is the sorted list of targets covered by the tree for which Root is the root.
map_graph(+GraphIn, +Map:assoc, -GraphOut) is det [private]
Map a graph to a new graph by mapping all fields of the RDF statements over Map. Then delete duplicates from the resulting graph as well as rdf(S,P,S) links that did not appear before the mapping.
To be done
- Should we look inside literals for mapped types? That would be consistent with abstract_graph/3.
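The mapping described for map_graph/3 can be sketched as below. This is an illustration under stated assumptions: fields that have no entry in Map are kept unchanged, literals are not inspected, and map_graph_sketch/3 and map_field/3 are hypothetical names.

```prolog
:- use_module(library(assoc)).
:- use_module(library(lists)).

% map_graph_sketch(+GraphIn, +Map, -GraphOut)
% Hypothetical helper. Map every field of each rdf(S,P,O) over Map,
% drop self-links that only exist because of the mapping, and remove
% duplicates (sort/2 also orders the result).
map_graph_sketch(GraphIn, Map, GraphOut) :-
    findall(rdf(S,P,O),
            ( member(rdf(S0,P0,O0), GraphIn),
              map_field(Map, S0, S),
              map_field(Map, P0, P),
              map_field(Map, O0, O),
              \+ ( S == O, S0 \== O0 )   % new rdf(S,P,S) link: drop
            ),
            Graph1),
    sort(Graph1, GraphOut).

% Fields without an entry in Map map to themselves (assumption).
map_field(Map, X0, X) :-
    (   get_assoc(X0, Map, X1)
    ->  X = X1
    ;   X = X0
    ).
```

?- list_to_assoc([b-a], Map),
   map_graph_sketch([rdf(a,same,b), rdf(b,p,c), rdf(a,p,c)], Map, G).
G = [rdf(a, p, c)].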
map_graph(+GraphIn, +Map:assoc, -GraphOut, -AbstractMap) is det [private]
Map a graph to a new graph by mapping all fields of the RDF statements over Map. The nodes in these graphs are terms of the form Abstract-list(concrete).
Arguments:
AbstractMap - assoc Abstract -> ordset(concrete)
pairs_keys_intersection(+Pairs, +Keys, -PairsInKeys) is det [private]
True if PairsInKeys is a subset of Pairs whose keys appear in Keys. Pairs must be key-sorted and Keys must be sorted. E.g.
?- pairs_keys_intersection([a-1,b-2,c-3], [a,c], X).
X = [a-1,c-3]
map_to_bagged_graph(+GraphIn, +Map, -GraphOut, -Bags) is det [private]
GraphOut is a graph between objects and bags, using the most specific common ancestor for representing properties.
rdf_to_paired_graph(+GraphIn, -PairedGraph) is det [private]
Arguments:
GraphIn - Graph as list(rdf(S,P,O))
PairedGraph - Graph as list(S-list(O-P)), where the pair lists are key-sorted.
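The paired representation can be built with keysort/2 and group_pairs_by_key/2, as in the sketch below. The name rdf_to_paired_graph_sketch/2 is hypothetical and the code only illustrates the data structure described in the arguments above.

```prolog
:- use_module(library(pairs)).
:- use_module(library(lists)).
:- use_module(library(apply)).

% rdf_to_paired_graph_sketch(+GraphIn, -PairedGraph)
% Hypothetical helper: turn list(rdf(S,P,O)) into list(S-list(O-P)),
% key-sorted on S, with each inner list key-sorted on O.
rdf_to_paired_graph_sketch(GraphIn, PairedGraph) :-
    findall(S-(O-P), member(rdf(S,P,O), GraphIn), Pairs0),
    keysort(Pairs0, Pairs1),
    group_pairs_by_key(Pairs1, Grouped),
    maplist(sort_values, Grouped, PairedGraph).

sort_values(S-OPs0, S-OPs) :-
    keysort(OPs0, OPs).
```

?- rdf_to_paired_graph_sketch([rdf(s,p,o2), rdf(s,q,o1), rdf(t,p,o1)], G).
G = [s-[o1-q, o2-p], t-[o1-p]].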
used_properties(+S0, +O0, +GraphIn, +AbstractMap, -PredList) is det [private]
Find properties actually used between two bags. S0 and O0 are the subject and object from the abstract graph.
Arguments:
GraphIn - Original concrete graph represented as pairs. See rdf_to_paired_graph/2.
AbstractMap - Assoc Abstract->Concrete, where Concrete is an ordset of resources.
graph_resources(+Graph, -Resources:list(atom)) is det
Resources is a sorted list of unique resources appearing in Graph. All resources are in Resources, regardless of the role played in the graph: node, edge (predicate) or type for a typed literal.
See also
- graph_resources/4 distinguishes the role of the resources.
graph_nodes(+Graph, -Nodes) is det [private]
Nodes is a sorted list of all resources and literals appearing in Graph.
To be done
- Better name
graph_resources(+Graph, -Resources:list(atom), -Predicates:list(atom), -Types:list(atom)) is det
Resources is a sorted list of unique resources appearing in Graph as subject or object of a triple. Predicates is a list of all unique predicates in Graph and Types is a list of all unique literal types in Graph.
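The three roles can be collected in separate passes over the triple list, as the sketch below illustrates. It assumes that resources are atoms and that typed literals use the SWI-Prolog form literal(type(Type, Value)); graph_resources_sketch/4 is a hypothetical name and, unlike the text above, the sketch sorts all three result lists.

```prolog
:- use_module(library(lists)).

% graph_resources_sketch(+Graph, -Resources, -Predicates, -Types)
% Hypothetical helper. Resources are the atoms used as subject or
% object, Predicates the predicates, and Types the datatypes of typed
% literals. All three lists are sorted and free of duplicates.
graph_resources_sketch(Graph, Resources, Predicates, Types) :-
    findall(R,
            ( member(rdf(S,_,O), Graph),
              (   R = S
              ;   atom(O), R = O          % skip literal objects
              )
            ),
            Rs),
    findall(P, member(rdf(_,P,_), Graph), Ps),
    findall(T, member(rdf(_,_,literal(type(T,_))), Graph), Ts),
    sort(Rs, Resources),
    sort(Ps, Predicates),
    sort(Ts, Types).
```

?- graph_resources_sketch([rdf(a,p,b), rdf(b,q,literal(type(int,'1')))], R, P, T).
R = [a, b],
P = [p, q],
T = [int].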
abstract_graph(+GraphIn, -GraphOut, +Options) is det
Unify GraphOut with an abstracted version of GraphIn. The abstraction is carried out triple-by-triple. Note there is no need to abstract all triples to the same level. We do however need to map nodes in the graph consistently. I.e. if we abstract the object of rdf(s,p,o), we must abstract the subject of rdf(o, p2, o2) to the same resource.

If we want to do incremental growing, we must keep track of which nodes were mapped to which resources. Option?

We must also decide on the abstraction level for a node. This can be based on the weight in the search graph, the involved properties and focus such as location and time. Should we express this focus in the weight?

Options:

map_in(?Map)
If present, this is the initial resource abstraction map.
map_out(-Map)
Provide access to the final resource abstraction map.
bags(-Bags)
If provided, bagify the graph, returning the triples that define the bags in Bags. The full graph is created by appending Bags to GraphOut.
merge_concepts_with_super(+Boolean)
If true (default), merge nodes if one is a super-concept of another.
node_map(+Nodes, +Map0, -Map, +Options) is det [private]
Create the abstraction map for the nodes of the graph. It consists of two steps:
  1. Map all instances to their class, except for concepts
  2. If some instances are mapped to class A and others to class B, where A is a super-class of B, map all instances to class A.
identity_map(+List, -Map) is det [private]
find_broaders(+List, +Map0, -Map) is det [private]
deref_map(+Map0, -Map) is det [private]
deref(+Pairs0, -NewPairs) is det [private]
Dereference chains V1-V2, V2-V3 into V1-V3, V2-V3. Note that Pairs0 may contain cycles, in which case all the members of the cycle are replaced by the representative as defined by rdf_representative/2.
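Chain dereferencing for the acyclic case can be sketched as follows. This is a simplified illustration: it assumes each key occurs once in the pair list and that there are no cycles, so it does not cover the cycle handling through rdf_representative/2 described above and would not terminate on a cycle. The name deref_sketch/2 is hypothetical.

```prolog
:- use_module(library(assoc)).
:- use_module(library(apply)).

% deref_sketch(+Pairs0, -Pairs)
% Hypothetical helper: replace each value by the end of its chain.
% Assumes unique keys and no cycles.
deref_sketch(Pairs0, Pairs) :-
    list_to_assoc(Pairs0, Map),
    maplist(deref_pair(Map), Pairs0, Pairs).

deref_pair(Map, Key-Value0, Key-Value) :-
    follow(Map, Value0, Value).

% Follow Value0 through Map until it is no longer a key.
follow(Map, Value0, Value) :-
    (   get_assoc(Value0, Map, Value1),
        Value1 \== Value0
    ->  follow(Map, Value1, Value)
    ;   Value = Value0
    ).
```

?- deref_sketch([v1-v2, v2-v3], P).
P = [v1-v3, v2-v3].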
edge_map(+Edges, +MapIn, -MapOut) is det [private]
concept_of(+Resource, -Concept) is det [private]
True if Concept is the concept Resource belongs to. If Resource is a concept itself, Concept is Resource.
To be done
- Make thesaurus concept classes a subclass of skos:Class.
- Put in a reusable place, merge with kwd_search.pl
broader(+Term, -Broader) is nondet [private]
True if Broader is a broader term according to the SKOS schema.
To be done
- Deal with owl:sameAs (and skos:exactMatch)
rdf_representative(+Resources:list, -Representative:atom) is det [private]
Representative is the most popular resource from the non-empty list Resources. The preferred representative is currently defined as the resource with the highest number of associated edges.
To be done
- Think about the function. Use sum of logs or sum of sqrt?
minimise_graph(+GraphIn, -GraphOut) is det
Remove redundant triples from a graph. Redundant triples are defined as:
  • Super-properties of another property
  • Inverse
  • Symmetric
  • Entailed transitive
To be done
- Implement entailed transitive