SWI-Prolog -- Manual

5.2.1 Predicates that operate on strings

Strings may be manipulated by a set of predicates similar to those that manipulate atoms. In addition to the list below, string/1 performs the type check for this type and is described in section 4.6.

SWI-Prolog's string primitives are being synchronized with ECLiPSe. We expect the set of predicates documented in this section to be stable, although it might be expanded. In general, SWI-Prolog's text manipulation predicates accept any form of text as input argument and produce the type indicated by the predicate name as output. This policy simplifies migration and writing programs that can run unmodified or with minor modifications on systems that do not support strings. Code should avoid relying on this feature as much as possible for clarity as well as to facilitate a more strict mode and/or type checking in future releases.

atom_string(?Atom, ?String)

Bi-directional conversion between an atom and a string. At least one of the two arguments must be instantiated. Atom can also be an integer or floating point number.
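
As an illustration of the two directions, the sketch below lifts atom_string/2 to lists. The helper name atoms_strings/2 is hypothetical and not part of SWI-Prolog; the example assumes SWI-Prolog 7 or later, where double-quoted text denotes strings.

```prolog
% atoms_strings/2 is a hypothetical helper, not a built-in.
% It relates a list of atoms to the list of corresponding strings and,
% like atom_string/2, can be called with either list unbound.
atoms_strings(Atoms, Strings) :-
    maplist(atom_string, Atoms, Strings).

% ?- atoms_strings([red, green], Strings).
% Strings = ["red", "green"].
% ?- atoms_strings(Atoms, ["red", "green"]).
% Atoms = [red, green].
```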

number_string(?Number, ?String)

Bi-directional conversion between a number and a string. At least one of the two arguments must be instantiated. Besides the type used to represent the text, this predicate differs in several ways from its ISO cousin (note that SWI-Prolog's syntax for numbers is not ISO compatible either):

If String does not represent a number, the predicate fails rather than throwing a syntax error exception.
Leading white space and Prolog comments are not allowed.
Numbers may start with '+' or '-'.
It is not allowed to have white space between a leading '+' or '-' and the number.
Floating point numbers in exponential notation do not require a dot before exponent, i.e., "1e10" is a valid number.
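
Because number_string/2 fails instead of raising a syntax error, it can be used directly in a condition. The following is a minimal sketch of that idiom; string_number_or/3 is a hypothetical helper name introduced only for this example.

```prolog
% string_number_or/3 is a hypothetical helper, not a built-in.
% Number is the number denoted by String, or Default if String
% does not denote a number. No catch/3 is needed because
% number_string/2 simply fails on invalid input.
string_number_or(String, Default, Number) :-
    (   number_string(N, String)
    ->  Number = N
    ;   Number = Default
    ).

% ?- string_number_or("42", 0, N).      % N = 42.
% ?- string_number_or("-7", 0, N).      % N = -7.
% ?- string_number_or("1e10", 0, N).    % N is a float equal to 1.0e10.
% ?- string_number_or("forty", 0, N).   % N = 0.
```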

term_string(?Term, ?String)

Bi-directional conversion between a term and a string. If String is instantiated, it is parsed and the result is unified with Term. Otherwise Term is written using the option quoted(true) and the result is converted to String.
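
The use of quoted(true) matters when the string is to be parsed again. The sketch below writes a term to a string and reads it back; term_round_trip/2 is a hypothetical name used only for illustration.

```prolog
% term_round_trip/2 is a hypothetical helper, not a built-in.
% Term is written to a string (quoted) and that string is parsed
% again into Copy. For ground terms Copy is identical to Term.
term_round_trip(Term, Copy) :-
    term_string(Term, String),
    term_string(Copy, String).

% ?- term_round_trip(f('A b', [1,2]), Copy).
% Copy = f('A b', [1, 2]).
%
% Without quoting, 'A b' would be written as A b, which cannot be
% read back as the same term.
```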

term_string(?Term, ?String, +Options)

As term_string/2, passing Options to either read_term/2 or write_term/2. For example:

?- term_string(Term, 'a(A)', [variable_names(VNames)]).
Term = a(_G1466),
VNames = ['A'=_G1466].

string_chars(?String, ?Chars)

Bi-directional conversion between a string and a list of characters (one-character atoms). At least one of the two arguments must be instantiated.

string_codes(?String, ?Codes)

Bi-directional conversion between a string and a list of character codes. At least one of the two arguments must be instantiated.
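
Converting to a list and back is a common way to apply list predicates to text. A minimal sketch, assuming reverse/2 from library(lists) is available through autoloading; reverse_string/2 is a hypothetical helper name.

```prolog
% reverse_string/2 is a hypothetical helper, not a built-in.
% The string is converted to a list of one-character atoms,
% reversed as a list, and converted back to a string.
reverse_string(String, Reversed) :-
    string_chars(String, Chars),
    reverse(Chars, RevChars),
    string_chars(Reversed, RevChars).

% ?- reverse_string("abc", R).
% R = "cba".
% ?- string_codes("abc", Codes).
% Codes = [97, 98, 99].
```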

[det]text_to_string(+Text, -String)

Converts Text to a string. Text is an atom, string or list of characters (codes or chars). When running in --traditional mode, '[]' is ambiguous and interpreted as an empty string.
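
One possible use is normalising text of unknown representation before comparing it. The sketch below assumes the default (non-traditional) mode; same_text/2 is a hypothetical helper name.

```prolog
% same_text/2 is a hypothetical helper, not a built-in.
% True when A and B denote the same sequence of characters,
% whatever text representation each of them uses.
same_text(A, B) :-
    text_to_string(A, String),
    text_to_string(B, String).

% ?- same_text(abc, "abc").            % true.
% ?- same_text([a,b,c], [97,98,99]).   % true.
% ?- same_text(abc, "abd").            % false.
```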

string_length(+String, -Length)

Unify Length with the number of characters in String. This predicate is functionally equivalent to atom_length/2 and also accepts atoms, integers and floats as its first argument.
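
Since numbers are accepted as well, string_length/2 can be used to lay out numeric columns. A minimal sketch; pad_left/3 is a hypothetical helper name.

```prolog
% pad_left/3 is a hypothetical helper, not a built-in.
% Padded is Text preceded by as many spaces as are needed to reach
% Width characters. Text may be a string, atom or number.
pad_left(Text, Width, Padded) :-
    string_length(Text, Len),
    PadLen is max(0, Width - Len),
    length(PadCodes, PadLen),
    maplist(=(0'\s), PadCodes),          % a list of PadLen space codes
    string_codes(Pad, PadCodes),
    string_concat(Pad, Text, Padded).

% ?- pad_left(42, 5, P).
% P = "   42".
% ?- pad_left("abcdef", 3, P).
% P = "abcdef".
```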

string_code(?Index, +String, ?Code)

True when Code represents the character at the 1-based Index position in String. If Index is unbound the string is scanned from index 1. Raises a domain error if Index is negative. Fails silently if Index is zero or greater than the length of String. The mode string_code(-,+,+) is deterministic if the searched-for Code appears only once in String. See also sub_string/5.
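
The silent failure for an index beyond the end of the string makes string_code/3 convenient for tests on a fixed position. A minimal sketch, using code_type/2 with the type upper; starts_with_upper/1 is a hypothetical helper name.

```prolog
% starts_with_upper/1 is a hypothetical helper, not a built-in.
% True when the first character of String is an uppercase letter.
% Fails for the empty string because index 1 is then out of range.
starts_with_upper(String) :-
    string_code(1, String, Code),
    code_type(Code, upper).

% ?- starts_with_upper("SWI-Prolog").   % true.
% ?- starts_with_upper("prolog").       % false.
% ?- starts_with_upper("").             % false.
```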

get_string_code(+Index, +String, -Code)

Semi-deterministic version of string_code/3. In addition, this version provides strict range checking, throwing a domain error if Index is less than 1 or greater than the length of String. ECLiPSe provides this to support String[Index] notation.

string_concat(?String1, ?String2, ?String3)

Similar to atom_concat/3, but the unbound argument will be unified with a string object rather than an atom. Also, if both String1 and String2 are unbound and String3 is bound to text, it breaks String3, unifying the start with String1 and the end with String2 as append does with lists. Note that this is not particularly fast on long strings, as for each redo the system has to create two entirely new strings, while the list equivalent only creates a single new list-cell and moves some pointers around.
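
The sketch below shows two of the modes described above: removing a known prefix and enumerating all ways to break a string in two. Both helper names, strip_prefix/3 and string_splits/2, are hypothetical.

```prolog
% strip_prefix/3 and string_splits/2 are hypothetical helpers.

% Rest is String with the leading Prefix removed. Fails if String
% does not start with Prefix.
strip_prefix(Prefix, String, Rest) :-
    string_concat(Prefix, Rest, String).

% Pairs is the list of all Front-Back pairs whose concatenation
% is String, in the order found on backtracking.
string_splits(String, Pairs) :-
    findall(Front-Back, string_concat(Front, Back, String), Pairs).

% ?- strip_prefix("foo", "foobar", Rest).
% Rest = "bar".
% ?- string_splits("ab", Pairs).
% Pairs = [""-"ab", "a"-"b", "ab"-""].
```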

[det]split_string(+String, +SepChars, +PadChars, -SubStrings)

Break String into SubStrings. The SepChars argument provides the characters that act as separators and thus the length of SubStrings is one more than the number of separators found if SepChars and PadChars do not have common characters. If SepChars and PadChars are equal, sequences of adjacent separators act as a single separator. Leading and trailing characters for each substring that appear in PadChars are removed from the substring. The input arguments can be either atoms, strings or char/code lists. Compatible with ECLiPSe. Below are some examples:

% a simple split
?- split_string("a.b.c.d", ".", "", L).
L = ["a", "b", "c", "d"].
% Consider sequences of separators as a single one
?- split_string("/home//jan///nice/path", "/", "/", L).
L = ["home", "jan", "nice", "path"].
% split and remove white space
?- split_string("SWI-Prolog, 7.0", ",", " ", L).
L = ["SWI-Prolog", "7.0"].
% only remove leading and trailing white space
?- split_string("  SWI-Prolog  ", "", "\s\t\n", L).
L = ["SWI-Prolog"].

In typical use cases, SepChars either does not overlap PadChars or is identical to it, in which case multiple adjacent separators (often white space) are handled as a single separator. The behaviour with partially overlapping sets of padding and separators should be considered undefined. See also read_string/5.
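
Splitting is often followed by a conversion of each field. The following is a minimal sketch that parses a comma-separated line of numbers; csv_numbers/2 and field_number/2 are hypothetical helper names.

```prolog
% csv_numbers/2 and field_number/2 are hypothetical helpers.

% Numbers is the list of numbers found in the comma-separated Line.
% Blank space around each field is removed by split_string/4.
% Fails if some field is not a number.
csv_numbers(Line, Numbers) :-
    split_string(Line, ",", " ", Fields),
    maplist(field_number, Fields, Numbers).

field_number(Field, Number) :-
    number_string(Number, Field).

% ?- csv_numbers("1, 2.5, 3", Ns).
% Ns = [1, 2.5, 3].
% ?- csv_numbers("1, two, 3", Ns).
% false.
```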

sub_string(+String, ?Before, ?Length, ?After, ?SubString)

SubString is a substring of String. There are Before characters in String before SubString, SubString contains Length characters and is followed by After characters in String. If not enough information is provided to compute the start of the match, String is scanned left-to-right. This predicate is functionally equivalent to sub_atom/5, but operates on strings. The following example splits a string of the form <name>=<value> into the name part (an atom) and the value (a string).

name_value(String, Name, Value) :-
        sub_string(String, Before, _, After, "="), !,
        sub_string(String, 0, Before, _, NameString),
        atom_string(Name, NameString),
        sub_string(String, _, After, 0, Value).
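
Fixing After to 0 anchors the match at the end of the string. A minimal sketch of a suffix test built this way; string_ends_with/2 is a hypothetical helper name.

```prolog
% string_ends_with/2 is a hypothetical helper, not a built-in.
% True when String ends in Suffix. Length and After are both known,
% so the start of the match is computed rather than searched for.
string_ends_with(String, Suffix) :-
    string_length(Suffix, Len),
    sub_string(String, _, Len, 0, Suffix).

% ?- string_ends_with("report.txt", ".txt").   % true.
% ?- string_ends_with("report.txt", ".pdf").   % false.
% ?- sub_string("hello world", Before, _, _, "world").
% Before = 6.
```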

atomics_to_string(+List, -String)

List is a list of strings, atoms, integers or floating point numbers. Succeeds if String can be unified with the concatenated elements of List. Equivalent to

atomics_to_string(List, '', String)

atomics_to_string(+List, +Separator, -String)

Creates a string just like atomics_to_string/2, but inserts Separator between each pair of inputs. For example:

?- atomics_to_string([gnu, "gnat", 1], ', ', A).
A = "gnu, gnat, 1".

string_upper(+String, -UpperCase)

Convert String to upper case and unify the result with UpperCase.

string_lower(+String, -LowerCase)

Convert String to lower case and unify the result with LowerCase.
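
A typical use of the case conversion predicates is comparison that ignores case. A minimal sketch; equal_ignoring_case/2 is a hypothetical helper name.

```prolog
% equal_ignoring_case/2 is a hypothetical helper, not a built-in.
% True when A and B are equal after both are converted to lower case.
equal_ignoring_case(A, B) :-
    string_lower(A, Lower),
    string_lower(B, Lower).

% ?- equal_ignoring_case("Hello", "HELLO").   % true.
% ?- equal_ignoring_case("Hello", "Help").    % false.
% ?- string_upper("hello", U).
% U = "HELLO".
```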

read_string(+Stream, ?Length, -String)

Read at most Length characters from Stream and return them in the string String. If Length is unbound, Stream is read to the end and Length is unified with the number of characters read.
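
Both modes of Length can be seen in the sketch below, which reads from a string through open_string/2 so that no file is needed. The helper name take_chars/4 is hypothetical.

```prolog
% take_chars/4 is a hypothetical helper, not a built-in.
% Head holds at most N leading characters of String, Rest holds
% whatever remains. The stream is closed even if reading fails.
take_chars(String, N, Head, Rest) :-
    setup_call_cleanup(
        open_string(String, In),
        (   read_string(In, N, Head),    % bounded read
            read_string(In, _, Rest)     % unbound Length: read to the end
        ),
        close(In)).

% ?- take_chars("hello world", 5, Head, Rest).
% Head = "hello",
% Rest = " world".
```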

read_string(+Stream, +SepChars, +PadChars, -Sep, -String)

Read a string from Stream, providing functionality similar to split_string/4. The predicate performs the following steps:

Skip all characters that match PadChars
Read up to a character that matches SepChars or end of file
Discard trailing characters that match PadChars from the collected input
Unify String with a string created from the input and Sep with the separator character read. If input was terminated by the end of the input, Sep is unified with -1.

The predicate read_string/5 called repeatedly on an input until Sep is -1 (end of file) is equivalent to reading the entire file into a string and calling split_string/4, provided that SepChars and PadChars are not partially overlapping (behaviour that is fully compatible would require unlimited look-ahead). Below are some examples:

% Read a line
read_string(Input, "\n", "\r", End, String)
% Read a line, stripping leading and trailing white space
read_string(Input, "\n", "\r\t ", End, String)
% Read up to , or ), unifying End with 0', or 0')
read_string(Input, ",)", "\t ", End, String)
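
The repeated-call pattern described above can be sketched as a loop that collects lines until Sep is -1. The example reads from a string through open_string/2 to stay self-contained; string_lines/2 and read_lines/2 are hypothetical helper names, and dropping an empty final line is a design choice of this sketch rather than behaviour of read_string/5.

```prolog
% string_lines/2 and read_lines/2 are hypothetical helpers.

% Lines is the list of lines in String, without line terminators.
string_lines(String, Lines) :-
    setup_call_cleanup(
        open_string(String, In),
        read_lines(In, Lines),
        close(In)).

% Call read_string/5 repeatedly until it reports end of file (-1).
read_lines(In, Lines) :-
    read_string(In, "\n", "\r", Sep, Line),
    (   Sep == -1
    ->  (   Line == ""
        ->  Lines = []              % nothing after the last newline
        ;   Lines = [Line]          % last line had no newline
        )
    ;   Lines = [Line|Rest],
        read_lines(In, Rest)
    ).

% ?- string_lines("a\nb\nc", Lines).
% Lines = ["a", "b", "c"].
```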

open_string(+String, -Stream)

True when Stream is an input stream that accesses the content of String. String can be any text representation, i.e., string, atom, list of codes or list of characters.
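
Any predicate that reads from a stream can be applied to text this way. A minimal sketch that reads all terms from a string using read_term/3; string_terms/2 and read_all_terms/2 are hypothetical helper names.

```prolog
% string_terms/2 and read_all_terms/2 are hypothetical helpers.

% Terms is the list of terms read from String, where each term is
% terminated by a full stop as in a Prolog source file.
string_terms(String, Terms) :-
    setup_call_cleanup(
        open_string(String, In),
        read_all_terms(In, Terms),
        close(In)).

read_all_terms(In, Terms) :-
    read_term(In, Term, []),
    (   Term == end_of_file
    ->  Terms = []
    ;   Terms = [Term|Rest],
        read_all_terms(In, Rest)
    ).

% ?- string_terms("a. b(1). c.", Terms).
% Terms = [a, b(1), c].
```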