rdf_optimise.pl

rdf_optimise(:Goal) is nondet

Optimise Goal and execute the result. Semantically equivalent to call/1.

To be done: - Module handling is not correct.

rdf_optimise(+Goal, -Optimized) is det

Goal is a Prolog control-structure with calls to the rdf_db.pl and SeRQL runtime predicates. Optimized is a semantically equivalent goal, obtained by re-ordering conjunctions in Goal and processing sub-queries that do not share variables independently.

reorder(Goal, Reordered)[private]

Assuming Goal is a conjunction, create permutations of its order. Instead of blindly generating permutations however, we get an arbitrary element of the conjunction to the front and compute subgraphs of goals connected through variables after executing the first goal.

reorder_subgraph_conjs(SubGraphs, -ReorderedSubGraphs)[private]

Reorder the individual subgraphs. As we know these are independent there is no need to order the subgraphs themselves. we also know they are fully connected, and no longer contain cuts, so we use the simplified version of reorder_conj/2 called reorder_conj2/2.

group_by_cut(+List, -BeforeCut, -Cut, -AfterCut)[private]

Split the conjunction over a cut (!) as we cannot guarantee proper behaviour when moving goals to the other side of a cut.

group_by_optional(+List, -NotOptional, -Optional)[private]

Split the conjunction in optional and non-optional part as we always want the optional part to happen last.

bind_args(Goal, !State)[private]

Bind all arguments in Goal or list of goals. Assumes that executing a goal grounds all its arguments. Only the goal A = literal(B), generated by optimising where constraints is handled special.

State is a term bindings(List) that is destructively maintained by instantiate/4.

make_subgraphs(+Goals, -SubGraphs)[private]

Create a list of connected subgraphs from Goals, assuming the variables in the assoc Grounded have been bound.

Arguments:

Goals - is a list goal(Id, Goal, Vars).

unbound_vars(+Goal, -Vars) is det[private]

True when Vars is an ordered set of unbound variables in Goal.

conj_to_list(+Conj, -List)[private]

Translate a conjunction into a list of elements of the format

% goal(Id, Goal, Vars)

Where Id is a goal identifier, Goal  is the goal itself and Vars
is a list of variables inside the  Goal. Variables are sorted to
the standard order of terms.

list_to_conj(+List, -Conj)[private]

id(G, Id) is det[private]

goal(G, Goal) is det[private]

vars(G, Vars) is det[private]

Extract fields from the goal structure.

instantiate_obj(+Arg, -Old, +New, !Bindings) is det[private]

Called to change the state of an RDF object after running some goal. Old is the current instantiation. After executing the RDF Goal the new instantiation is New (typically b). Bindings is a term bindings(List), which is updated using destructive assignment. List is the list of all variables to which we added attributes.

instantiate_unify(A, B, State) is det[private]

We encounter a plain unification. Ideally, we determine the most specific binding and propagate this and execute the unification. The latter however must be undone if we evaluate a disjunction. For now, the code is somewhat simplified and we merely decide that if one side is instantiated, the other will be too, disregarding the actual value we might know.

attr_unify_hook(+Attribute, +Value)[private]

For now, the attribute unify hook only allows unifying with a variable with the same attribute. This deals with the unification that takes place in rdf_optimise/3 for the variables of the saved copy.

instantiate_term(+Term)[private]

Instantiate all unbound variables

uninstantiate_term(+Term, +How)[private]

Remove all skolem instantiations from Term.

delete_instantiated(+List, -Unbound)[private]

Delete all elements of List that are not instantiated (i.e. var and without an instantiated attribute).

rdf_complexity(+GoalIn, -GoalOut, -SpaceEstimate, -TimeEstimate)[private]

Provide an estimate for the size of the search space for executing Goal. For this we estimate the branching factor of each subgoal in the conjunction. If the branching factors are B0, B1, ... then the total complexity estimate is

E = 1 + B0 + B0*B1 + B0*B1*B2, ...

Non-RDF calls are supposed to be boolean tests that can be executed at the first opportunity all arguments are bound by RDF calls. They have a probability of failure, reducing the search space. Using the above formula, any number lower than 1 moves the test as far as possible to the head of the conjunction.

If GoalIn and GoalOut are the same the system will not try to optimize local conjunctions.

ISSUES: control structures ;, if-then-else, etc.

carth_complexity(+Bags, +State, +Size0, -Size, +Time0, +TimeSum0, -TimeSum) is det[private]

Compute the time and space efficiency of the carthesian product. the total cost in time is the sum of the costs of all branches, The search space at the end is still the same product.

independent_complexity(+GoalsIn, -Goals, +State, +Size0, -Size, +Time0, +TimeSum0, -TimeSum) is det[private]

Compute the complexity of an independent conjunction.

Arguments:

Goals - is a list of g(Goal, Branch, Cost), where Branch is the number of solutions expected fromGoal and Cost is the estimated CPU time.

simplify_carthesian(+Bags, -Goal) is det[private]

Peer of the leading little branching goals from the carthesian handler. If there is a bag of one left, turn it into a normal goal.

complexity0(+SI, +PI, +OI, +P, +G, -Setup, -PerAlt, -Branch)[private]

SI,PI,OI describe the instantiation pattern. P is the predicate and G is the actual goal. Complexity is unified to an estimate of the size of the search space and therefore an estimate of the execution time required to prove the goal.

Literal `like' matches come out as +(like(Pattern)). We must estimate the percentage of the literals that match this pattern. Suppose the factor is 1,000. This means the branching is reduced by 1,000, but finding each solution is slow as it requires a linear scan. It is faster than going all the way back to Prolog backtracking however, so we estimate a factor 10 (TBD: verify that number).

ISSUES: rdf_has/3 vs rdf_reachable/3.

rdf_db_goal(+Goal, -Subject, -Predicate, -Object)[multifile]

True if Goal is a pure (logical) predicate on the RDF database involving the given Subject, Predicate and Object. Defined multifile, allowing the optimiser to understand user-defined rdf/3 like predicates.

To be done: - Allow specifying different costs and branching factors

pattern_filter(+Like, -Factor)[private]

Estimate the efficiency of a pattern. This is a bit hard, as characters are not independent.

rdf_estimate_complexity(+Goal, -Complexity)[private]

Estimate the branching factor introduced by Goal. This uses the rdf_db statistics of the hash chains which are based on exploiting the RDFS subPropertyOf reasoning.

In addition, rdf_reachable/3 introduces its own complexity which must be estimate using the branching factor of the relation.

c_bindings(+AtEnd, +AtStart, +CVars, -Vars)[private]

The variables of the difference-list AtEnd..AtStart are conditionally bound. Tag each such variable with CVars.

Arguments:

CVars - Term c(Vars), where Vars are the other variables that have the same conditional binding.

c_bind(+Vars, +State)[private]

Do unconditional binding of Vars. Var may be in a rdfql_bind_null/1 set. In that case, delete it from the set, and set the class to 'c' to make a conditional binding at the end.

id(G, Id) is det[private]

goal(G, Goal) is det[private]

vars(G, Vars) is det[private]

Extract fields from the goal structure.

id(G, Id) is det[private]

goal(G, Goal) is det[private]

vars(G, Vars) is det[private]

Extract fields from the goal structure.

Undocumented predicates

The following predicates are exported, but not or incorrectly documented.

rdf_optimise(Arg1, Arg2, Arg3, Arg4)
serql_select_bind_null(Arg1, Arg2)
rdf_complexity(Arg1, Arg2, Arg3)