3.5 Hashes
A hash, also called a digest, is a way to verify the integrity of data. In typical cases, a hash is significantly shorter than the data itself, and even minuscule changes in the data lead to different hashes.
The hash functionality of this library subsumes and extends that of library(sha), library(hash_stream) and library(md5) by providing a unified interface to all available digest algorithms.
The underlying OpenSSL library (libcrypto) is dynamically loaded if either library(crypto) or library(ssl) is loaded. Therefore, if your application uses library(ssl), you can use library(crypto) for hashing without increasing the memory footprint of your application. In other cases, the specialised hashing libraries are more lightweight but less general alternatives to library(crypto).
3.5.1 Hashes of data and files
The most important predicates to compute hashes are:
- [det]crypto_data_hash(+Data, -Hash, +Options)
- Hash is the hash of Data. The conversion is
controlled by Options:
- algorithm(+Algorithm)
- One of md5 (insecure), sha1 (insecure), ripemd160, sha224, sha256, sha384, sha512, sha3_224, sha3_256, sha3_384, sha3_512, blake2s256 or blake2b512. The BLAKE digest algorithms require OpenSSL 1.1.0 or greater, and the SHA-3 algorithms require OpenSSL 1.1.1 or greater. The default is a cryptographically secure algorithm. If you specify a variable, then that variable is unified with the algorithm that was used.
- encoding(+Encoding)
- If Data is a sequence of character codes, this must be translated into a sequence of bytes, because that is what the hashing requires. The default encoding is utf8. The other meaningful value is octet, claiming that Data contains raw bytes.
- hmac(+Key)
- If this option is specified, a hash-based message authentication code (HMAC) is computed, using the specified Key which is either an atom, string or list of bytes. Any of the available digest algorithms can be used with this option. The cryptographic strength of the HMAC depends on that of the chosen algorithm and also on the key. This option requires OpenSSL 1.1.0 or greater.
Data is either an atom, string or code-list. Hash is an atom that represents the hash in hexadecimal encoding. See the example queries below.
- See also
- hex_bytes/2 for conversion between hexadecimal encoding and lists of bytes.
- crypto_password_hash/2 for the important use case of passwords.
- [det]crypto_file_hash(+File, -Hash, +Options)
- True if Hash is the hash of the content of File. For Options, see crypto_data_hash/3.
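For illustration, a few example queries (the HMAC key and the file name are placeholders; the answer shown is the well-known SHA-256 digest of the string "hello world"):

```prolog
%% SHA-256 hash of a string. Hash is an atom in hexadecimal encoding.
?- crypto_data_hash("hello world", Hash, [algorithm(sha256)]).
Hash = b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9.

%% HMAC-SHA256 over the same data, using a placeholder key.
?- crypto_data_hash("hello world", HMAC, [algorithm(sha256), hmac("a secret key")]).

%% Hash of the content of a (placeholder) file; Options as above.
?- crypto_file_hash('data.bin', FileHash, [algorithm(sha256)]).
```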
3.5.2 Hashes of passwords
For the important case of deriving hashes from passwords, the following specialised predicates are provided:
- [semidet]crypto_password_hash(+Password, ?Hash)
- If Hash is instantiated, the predicate succeeds iff
the hash matches the given password. Otherwise, the call is equivalent
to
crypto_password_hash(Password, Hash, []) and computes a password-based hash using the default options.
- [det]crypto_password_hash(+Password, -Hash, +Options)
- Derive Hash based on Password. This predicate is
similar to
crypto_data_hash/3 in
that it derives a hash from given data. However, it is tailored for the
specific use case of
passwords. One essential distinction is that for this use case,
the derivation of a hash should be as slow as possible to
counteract brute-force attacks over possible passwords.
Another important distinction is that equal passwords must yield, with very high probability, different hashes. For this reason, cryptographically strong random numbers are automatically added to the password before a hash is derived.
Hash is unified with an atom that contains the computed hash and all parameters that were used, except for the password. Instead of storing passwords, store these hashes. Later, you can verify the validity of a password with crypto_password_hash/2, comparing the then entered password to the stored hash. If you need to export this atom, you should treat it as opaque ASCII data with up to 255 bytes of length. The maximal length may increase in the future.
Admissible options are:
- algorithm(+Algorithm)
- The algorithm to use. Currently, the only available algorithm is pbkdf2-sha512, which is therefore also the default.
- cost(+C)
- C is an integer, denoting the binary logarithm of the number of iterations used for the derivation of the hash. This means that the number of iterations is set to 2^C. Currently, the default is 17, and thus more than one hundred thousand iterations. You should set this option as high as your server and users can tolerate. The default is subject to change and will likely increase in the future or adapt to new algorithms.
- salt(+Salt)
- Use the given list of bytes as salt. By default, cryptographically secure random numbers are generated for this purpose. The default is intended to be secure, and constitutes the typical use case of this predicate.
Currently, PBKDF2 with SHA-512 is used as the hash derivation function, using 128 bits of salt. All default parameters, including the algorithm, are subject to change, and other algorithms will also become available in the future. Since computed hashes store all parameters that were used during their derivation, such changes will not affect the operation of existing deployments. Note though that new hashes will then be computed with the new default parameters.
- See also
- crypto_data_hkdf/4 for generating keys from Hash.
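As an illustration, here is a minimal sketch of storing and verifying password hashes; the predicate names and the user_hash/2 store are hypothetical:

```prolog
:- use_module(library(crypto)).

:- dynamic user_hash/2.        % hypothetical in-memory store: user_hash(User, Hash)

%% Registration: derive a salted hash and store it; the password itself
%% is never stored.
register_user(User, Password) :-
        crypto_password_hash(Password, Hash),
        assertz(user_hash(User, Hash)).

%% Login: succeeds iff Password matches the stored hash.
login(User, Password) :-
        user_hash(User, Hash),
        crypto_password_hash(Password, Hash).
```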
3.5.3 HMAC-based key derivation function (HKDF)
The following predicate implements the Hashed Message Authentication Code (HMAC)-based key derivation function, abbreviated as HKDF. It supports a wide range of applications and requirements by concentrating possibly dispersed entropy of the input keying material and then expanding it to the desired length. The number and lengths of the output keys depend on the specific cryptographic algorithms for which the keys are needed.
- [det]crypto_data_hkdf(+Data, +Length, -Bytes, +Options)
- Concentrate possibly dispersed entropy of Data and then
expand it to the desired length. Bytes is unified with a list
of bytes of length Length, and is suitable as input
keying material and initialization vectors to the symmetric encryption
predicates.
Admissible options are:
- algorithm(+Algorithm)
- A hashing algorithm as specified to crypto_data_hash/3. The default is a cryptographically secure algorithm. If you specify a variable, then it is unified with the algorithm that was used.
- info(+Info)
- Optional context and application specific information, specified as an atom, string or list of bytes. The default is the zero length atom ''.
- salt(+List)
- Optionally, a list of bytes that are used as salt. The default is all zeroes.
- encoding(+Atom)
- Either utf8 (default) or octet, denoting the representation of Data as in crypto_data_hash/3.

The info/1 option can be used to generate multiple keys from a single master key, using for example values such as key and iv, or the name of a file that is to be encrypted. This predicate requires OpenSSL 1.1.0 or greater.
- See also
- crypto_n_random_bytes/2 to obtain a suitable salt.
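For example, two independent keys might be derived from a single (placeholder) master secret by varying the info/1 option:

```prolog
%% Derive a 32-byte key and a 16-byte IV from the same input keying material,
%% distinguished by the info/1 option.
?- crypto_data_hkdf("master secret", 32, Key, [algorithm(sha256), info(key)]),
   crypto_data_hkdf("master secret", 16, IV,  [algorithm(sha256), info(iv)]).
```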
3.5.4 Hashing incrementally
The following predicates are provided for building hashes incrementally. This works by first creating a context with crypto_context_new/2, then using this context with crypto_data_context/3 to incrementally obtain further contexts, and finally extract the resulting hash with crypto_context_hash/2.
- [det]crypto_context_new(-Context, +Options)
- Context is unified with the empty context, taking into account Options. The context can be used in crypto_data_context/3. For Options, see crypto_data_hash/3. Context is an opaque pure Prolog term that is subject to garbage collection.
- [det]crypto_data_context(+Data, +Context0, -Context)
- Context0 is an existing computation context, and Context
is the new context after hashing Data in addition to the
previously hashed data. Context0 may be produced by a prior
invocation of either crypto_context_new/2
or crypto_data_context/3
itself.
This predicate allows a hash to be computed in chunks, which may be important while working with Metalink (RFC 5854), BitTorrent or similar technologies, or simply with big files.
- crypto_context_hash(+Context, -Hash)
- Obtain the hash code of Context. Hash is an atom representing the hash code that is associated with the current state of the computation context Context.
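As a sketch, a list of chunks could be hashed incrementally like this (the helper predicate chunks_hash/2 and the choice of sha256 are illustrative):

```prolog
:- use_module(library(crypto)).
:- use_module(library(apply)).   % foldl/4

%% The result is the same as hashing the concatenation of Chunks in one call.
chunks_hash(Chunks, Hash) :-
        crypto_context_new(Context0, [algorithm(sha256)]),
        foldl(crypto_data_context, Chunks, Context0, Context),
        crypto_context_hash(Context, Hash).
```

For instance, chunks_hash(["hello", " ", "world"], Hash) binds Hash to the same value as crypto_data_hash("hello world", Hash, [algorithm(sha256)]).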
The following hashing predicates work over streams:
- [det]crypto_open_hash_stream(+OrgStream, -HashStream, +Options)
- Open a filter stream on OrgStream that maintains a hash. The
hash can be retrieved at any time using crypto_stream_hash/2.
Available
Options in addition to those of crypto_data_hash/3
are:
- close_parent(+Bool)
- If true (default), closing the filter stream also closes the original (parent) stream.
- [det]crypto_stream_hash(+HashStream, -Hash)
- Unify Hash with a hash for the bytes sent to or read from HashStream. Note that the hash is computed on the stream buffers. If the stream is an output stream, it is first flushed and Hash represents the hash at the current location. If the stream is an input stream, Hash represents the hash of the processed input including the already buffered data.
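As a sketch, a file can be hashed by reading it through such a filter stream (the predicate name file_stream_hash/2 is illustrative; the effect is similar to crypto_file_hash/3):

```prolog
:- use_module(library(crypto)).

%% Compute the SHA-256 hash of File by reading its bytes through a hash stream.
file_stream_hash(File, Hash) :-
        setup_call_cleanup(
            open(File, read, Org, [type(binary)]),
            setup_call_cleanup(
                crypto_open_hash_stream(Org, HashStream,
                                        [algorithm(sha256), close_parent(false)]),
                (   read_string(HashStream, _Length, _),   % consume all input
                    crypto_stream_hash(HashStream, Hash)
                ),
                close(HashStream)),
            close(Org)).
```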