PublicShow sourcepure_input.pl -- Pure Input from files and streams

This module is part of pio.pl, dealing with pure input: processing input streams from the outside world using pure predicates, notably grammar rules (DCG). Using pure predicates makes non-deterministic processing of input much simpler.

Pure input uses attributed variables to read input from the external source into a list on demand. The overhead of lazy reading is more than compensated for by using block reads based on read_pending_codes/3.

Ulrich Neumerkel came up with the idea to use coroutining for creating a lazy list. His implementation repositioned the file to deal with re-reading that can be necessary on backtracking. The current implementation uses destructive assignment together with more low-level attribute handling to realise pure input on any (buffered) stream.

To be done
- Provide support for alternative input readers, e.g. reading terms, tokens, etc.
Source phrase_from_file(:Grammar, +File) is nondet
Process the content of File using the DCG rule Grammar. The space usage of this mechanism depends on the length of the not committed part of Grammar. Committed parts of the temporary list are reclaimed by the garbage collector, while the list is extended on demand due to unification of the attributed tail variable. Below is an example that counts the number of times a string appears in a file. The library dcg/basics provides string//1 matching an arbitrary string and remainder//1 which matches the remainder of the input without parsing.
:- use_module(library(dcg/basics)).

file_contains(File, Pattern) :-
        phrase_from_file(match(Pattern), File).

match(Pattern) -->
        string(_),
        string(Pattern),
        remainder(_).

match_count(File, Pattern, Count) :-
        aggregate_all(count, file_contains(File, Pattern), Count).

This can be called as (note that the pattern must be a string (code list)):

?- match_count('pure_input.pl', `file`, Count).

Undocumented predicates

The following predicates are exported, but not or incorrectly documented.

Source stream_to_lazy_list(Arg1, Arg2)
Source phrase_from_stream(Arg1, Arg2)
Source phrase_from_file(Arg1, Arg2, Arg3)
Source syntax_error(Arg1, Arg2, Arg3)
Source lazy_list_location(Arg1, Arg2, Arg3)
Source lazy_list_character_count(Arg1, Arg2, Arg3)