Persistent Hashing and Persistent Dictionaries
This module contains functionality that allows hashing with keys that remain valid across interpreter invocations, unlike Python’s built-in hashes.
This module also provides a disk-backed dictionary that uses persistent hashing.
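To illustrate what "valid across interpreter invocations" means: Python's built-in hash() of a string is randomized per process unless PYTHONHASHSEED is fixed, whereas a hashlib digest depends only on the input bytes. The following is a minimal sketch of that idea using only the standard library. The helper name stable_digest and the choice of SHA-256 are assumptions made for illustration and say nothing about which algorithm this module uses internally.

```python
import hashlib


def stable_digest(text):
    """Return a hex digest that depends only on the bytes of `text`.

    Hypothetical helper for illustration; SHA-256 is an arbitrary choice.
    """
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# The digest is the same in every interpreter run, so it can safely be
# used as an on-disk key. hash("abc"), by contrast, may change from one
# run to the next because string hashing is randomized per process.
digest = stable_digest("abc")
print(digest)
```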
class pytools.persistent_dict.KeyBuilder
A (stateless) object that computes hashes of objects fed to it. Subclassing this class permits customizing the computation of hash keys.
rec(key_hash, key)
Parameters
key_hash – the hash object to be updated with the hash of key.
key – the (immutable) Python object to be hashed.
Returns
the updated key_hash
Changed in version 2021.2: Now returns the updated key_hash.
static new_hash()
Return a new hash instance following the protocol of the ones from hashlib. This will permit switching to different hash algorithms in the future. Subclasses are expected to use this to create new hashes. Not doing so is deprecated and may stop working as early as 2022.
New in version 2021.2.
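The protocol described above (new_hash() creates a hashlib-style object, rec() feeds a key into it and returns it) can be sketched without pytools as follows. SketchKeyBuilder is a hypothetical stand-in, not the library's KeyBuilder: the supported key types, the byte encoding, the SHA-256 choice, and the key_for helper are all assumptions made for this illustration.

```python
import hashlib


class SketchKeyBuilder:
    """Hypothetical stand-in mirroring the documented rec/new_hash protocol."""

    @staticmethod
    def new_hash():
        # Any object with hashlib's update()/hexdigest() interface fits;
        # SHA-256 is an arbitrary choice for this sketch.
        return hashlib.sha256()

    def rec(self, key_hash, key):
        # Feed a type tag and a stable byte encoding of `key` into key_hash.
        if isinstance(key, tuple):
            key_hash.update(b"tuple(")
            for item in key:
                self.rec(key_hash, item)
            key_hash.update(b")")
        elif isinstance(key, (str, int)):
            key_hash.update(type(key).__name__.encode("utf-8"))
            key_hash.update(b":")
            key_hash.update(repr(key).encode("utf-8"))
            key_hash.update(b";")
        else:
            raise TypeError(f"unsupported key type: {type(key).__name__}")
        # Return the updated hash object, as the documented rec() does.
        return key_hash

    def key_for(self, key):
        # Hypothetical convenience wrapper: hash one key from scratch.
        return self.rec(self.new_hash(), key).hexdigest()


builder = SketchKeyBuilder()
print(builder.key_for(("kernel", 3)))
```

Tagging each value with its type keeps ("a", 1) and ("a", "1") from producing the same key, which is one reason a key builder usually recurses over the structure instead of hashing a plain string rendering of it.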
class pytools.persistent_dict.PersistentDict(identifier, key_builder=None, container_dir=None)
A concurrent disk-backed dictionary.
__init__(identifier, key_builder=None, container_dir=None)
Parameters
identifier – a file-name-compatible string identifying this dictionary
key_builder – a subclass of KeyBuilder
__getitem__(key)
__setitem__(key, value)
clear()
store_if_not_present(key, value, _stacklevel=0)
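A usage sketch for the methods listed above, assuming pytools is installed. The identifier "demo-cache", the tuple key, and the function name persistent_dict_roundtrip are made up for illustration. The sketch also assumes that store_if_not_present leaves an existing entry untouched, as its name suggests, and that a second instance with the same identifier and directory sees entries stored by the first.

```python
def persistent_dict_roundtrip(container_dir):
    """Store, read back, and clear one entry; returns the values read."""
    # pytools is a third-party package. It is imported inside the function
    # so that loading this sketch does not require it.
    from pytools.persistent_dict import PersistentDict

    # "demo-cache" is a made-up, file-name-compatible identifier.
    cache = PersistentDict("demo-cache", container_dir=container_dir)

    cache[("kernel", 3)] = "compiled result"      # __setitem__
    first = cache[("kernel", 3)]                  # __getitem__

    # Assumed to be a no-op here because the key is already present.
    cache.store_if_not_present(("kernel", 3), "ignored")
    second = cache[("kernel", 3)]

    # A fresh instance pointing at the same location reads from disk.
    reopened = PersistentDict("demo-cache", container_dir=container_dir)
    third = reopened[("kernel", 3)]

    cache.clear()                                 # remove all entries
    return first, second, third
```

Calling persistent_dict_roundtrip(tempfile.mkdtemp()) keeps the experiment in a scratch directory instead of the default cache location.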
class pytools.persistent_dict.WriteOncePersistentDict(identifier, key_builder=None, container_dir=None, in_mem_cache_size=256)
A concurrent disk-backed dictionary that disallows overwriting/deletion.
Compared with PersistentDict, this class has faster retrieval times.
__init__(identifier, key_builder=None, container_dir=None, in_mem_cache_size=256)
Parameters
identifier – a file-name-compatible string identifying this dictionary
key_builder – a subclass of KeyBuilder
in_mem_cache_size – retain an in-memory cache of up to in_mem_cache_size items
__getitem__(key)
__setitem__(key, value)
store_if_not_present(key, value, _stacklevel=0)
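A usage sketch for the write-once variant, again assuming pytools is installed. The identifier "demo-write-once", the cache size of 16, and the function name write_once_roundtrip are illustrative choices. Because this class disallows overwriting, the sketch only uses store_if_not_present, which is assumed to keep the first value stored under a key.

```python
def write_once_roundtrip(container_dir):
    """Store a value once and read it back; returns the value read."""
    # pytools is a third-party package. It is imported inside the function
    # so that loading this sketch does not require it.
    from pytools.persistent_dict import WriteOncePersistentDict

    cache = WriteOncePersistentDict(
        "demo-write-once",            # made-up identifier
        container_dir=container_dir,
        in_mem_cache_size=16,         # keep up to 16 items in memory
    )

    cache.store_if_not_present("answer", 42)
    # The key already exists, so this second call is assumed to change
    # nothing. Plain assignment to an existing key is not shown, since
    # the class disallows overwriting.
    cache.store_if_not_present("answer", 0)

    return cache["answer"]
```

Since entries never change once written, retrieved values can be kept in the in-memory cache, which fits the documented claim of faster retrieval compared with PersistentDict.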