Persistent Hashing and Persistent Dictionaries
This module contains functionality that allows hashing with keys that remain valid across interpreter invocations, unlike Python's built-in hashes.
This module also provides a disk-backed dictionary that uses persistent hashing.
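The difference between built-in and persistent hashes can be illustrated with the standard library alone. The sketch below is not pytools code, and the helper name `stable_key` is hypothetical. It relies on the fact that `hash()` of a string is salted per process (unless `PYTHONHASHSEED` is fixed), while a `hashlib` digest depends only on the bytes fed to it.

```python
import hashlib


def stable_key(text: str) -> str:
    """Return a hex digest that is the same in every interpreter run.

    Hypothetical helper for illustration; pytools uses KeyBuilder instead.
    """
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# hash("abc") may differ from one interpreter invocation to the next,
# so it cannot serve as a key for data stored on disk.
volatile = hash("abc")

# The digest below depends only on the input bytes.
persistent = stable_key("abc")
print(persistent)
```

Running the script twice would typically show a different value for `volatile` but the same value for `persistent`.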
- class pytools.persistent_dict.KeyBuilder
A (stateless) object that computes hashes of objects fed to it. Subclassing this class permits customizing the computation of hash keys.
- rec(key_hash, key)
- Parameters:
key_hash – the hash object to be updated with the hash of key.
key – the (immutable) Python object to be hashed.
- Returns:
the updated key_hash
Changed in version 2021.2: Now returns the updated key_hash.
- static new_hash()
Return a new hash instance following the protocol of the ones from hashlib. This will permit switching to different hash algorithms in the future. Subclasses are expected to use this to create new hashes. Not doing so is deprecated and may stop working as early as 2022.
New in version 2021.2.
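The `rec`/`new_hash` protocol described above can be sketched without pytools. The class `SketchKeyBuilder` below is a hypothetical stand-in, not the library's `KeyBuilder`: the choice of SHA-256, the type tags, and the `__call__` convenience method are all assumptions made for illustration. It only shows the documented shape: `new_hash()` creates a hashlib-style object, and `rec()` updates that object with the key and returns it.

```python
import hashlib


class SketchKeyBuilder:
    """Hypothetical stand-in mimicking the documented rec/new_hash protocol."""

    @staticmethod
    def new_hash():
        # Assumption: SHA-256 is used here purely for illustration.
        return hashlib.sha256()

    def rec(self, key_hash, key):
        # Feed a type tag, a length, and a deterministic byte encoding of
        # the key into key_hash, then return the updated hash object.
        if isinstance(key, tuple):
            key_hash.update(f"tuple:{len(key)}:".encode("ascii"))
            for item in key:
                self.rec(key_hash, item)
        elif isinstance(key, str):
            data = key.encode("utf-8")
            key_hash.update(f"str:{len(data)}:".encode("ascii") + data)
        elif isinstance(key, int):
            data = str(key).encode("ascii")
            key_hash.update(f"int:{len(data)}:".encode("ascii") + data)
        else:
            raise TypeError(f"unsupported key type: {type(key).__name__}")
        return key_hash

    def __call__(self, key):
        # Assumed convenience wrapper: start from a fresh hash object.
        return self.rec(self.new_hash(), key).hexdigest()


builder = SketchKeyBuilder()
print(builder(("mesh", 3)))
```

Creating every hash through `new_hash()` keeps the algorithm choice in one place, which is the property the documentation relies on for switching algorithms later.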
- class pytools.persistent_dict.PersistentDict(identifier, key_builder=None, container_dir=None)
A concurrent disk-backed dictionary.
- __init__(identifier, key_builder=None, container_dir=None)
- Parameters:
identifier – a file-name-compatible string identifying this dictionary
key_builder – a subclass of KeyBuilder
- class pytools.persistent_dict.WriteOncePersistentDict(identifier, key_builder=None, container_dir=None, in_mem_cache_size=256)
A concurrent disk-backed dictionary that disallows overwriting/deletion.
Compared with PersistentDict, this class has faster retrieval times.
- __init__(identifier, key_builder=None, container_dir=None, in_mem_cache_size=256)
- Parameters:
identifier – a file-name-compatible string identifying this dictionary
key_builder – a subclass of KeyBuilder
in_mem_cache_size – retain an in-memory cache of up to in_mem_cache_size items
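To make the combination of write-once semantics and a bounded in-memory cache concrete, here is a minimal standard-library sketch. It is not how pytools implements `WriteOncePersistentDict`: the class name `SketchWriteOnceStore`, its `store`/`fetch` methods, the one-file-per-key layout, and the least-recently-used eviction are all assumptions. The idea it illustrates is that an entry that can never change can be served from memory without rechecking the disk.

```python
import hashlib
import os
import pickle
import tempfile
from collections import OrderedDict


class SketchWriteOnceStore:
    """Hypothetical illustration only; no locking, string keys only."""

    def __init__(self, container_dir, in_mem_cache_size=256):
        self.container_dir = container_dir
        self.in_mem_cache_size = in_mem_cache_size
        self._cache = OrderedDict()  # digest -> value, least recent first
        os.makedirs(container_dir, exist_ok=True)

    def _locate(self, key):
        # A persistent digest of the key names the file on disk.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return digest, os.path.join(self.container_dir, digest)

    def store(self, key, value):
        _, path = self._locate(key)
        # Mode "xb" raises FileExistsError if the entry already exists,
        # which gives the write-once behaviour in this sketch.
        with open(path, "xb") as f:
            pickle.dump(value, f)

    def fetch(self, key):
        digest, path = self._locate(key)
        if digest in self._cache:
            # Entries never change, so the cached value cannot be stale.
            self._cache.move_to_end(digest)
            return self._cache[digest]
        try:
            with open(path, "rb") as f:
                value = pickle.load(f)
        except FileNotFoundError:
            raise KeyError(key) from None
        self._cache[digest] = value
        # Evict least recently used items beyond the configured size.
        while len(self._cache) > self.in_mem_cache_size:
            self._cache.popitem(last=False)
        return value


with tempfile.TemporaryDirectory() as tmp:
    store = SketchWriteOnceStore(tmp, in_mem_cache_size=2)
    store.store("answer", 42)
    print(store.fetch("answer"))
```

A production implementation would additionally need to guard against readers observing partially written files, which this sketch does not attempt.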