serialize - Serialisation Infrastructure
Warning
Internal module — subject to change without notice.
Do not import these functions directly. Use the public serialisation
methods on NestedDictionary instead
(to_json, from_json, to_pickle, from_pickle).
Private serialization infrastructure for ndict_tools.
This module is not part of the public API. It provides the low-level helpers
used by the serialization methods on _StackedDict (to_json, from_json,
to_pickle, from_pickle).
Contents
- _encode_key / _decode_key: JSON key encoding via type-tagged string prefix
- NestedDictionaryEncoder: json.JSONEncoder subclass
- _make_decoder_hook: factory for object_pairs_hook
- _pickle_dump / _pickle_load: pickle helpers with SHA-256 verification
Design decisions
- JSON key encoding (design decision #87): JSON mandates string keys; Python supports arbitrary hashable keys. Non-string keys are encoded as __type__:value tagged strings (e.g., __int__:42, __tuple__:(1, 2)). Decoding uses ast.literal_eval for safe reconstruction of tuple and frozenset values. Known limitation: string keys that already start with a __type__: prefix are indistinguishable from encoded keys.
- API placement (design decision #94): Serialization methods (to_json, from_json, to_pickle, from_pickle) are defined on _StackedDict and delegate to the private helpers below via lazy imports. This creates an accepted import cycle between tools.py and this module.
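The type-tagged prefix scheme from design decision #87 can be sketched as follows. This is a minimal illustration, not the module's implementation: the names `encode_key_sketch` and `decode_key_sketch` are hypothetical, and frozenset keys are left out because `ast.literal_eval` cannot parse a `frozenset(...)` call on its own.

```python
import ast

# Hypothetical helpers for illustration only; not the module's private API.
_TAGGED_TYPES = (bool, int, float, tuple)  # bool first: it is a subclass of int


def encode_key_sketch(key):
    """Encode a key as '__type__:value'; plain strings pass through."""
    if isinstance(key, str):
        return key
    for type_ in _TAGGED_TYPES:
        if isinstance(key, type_):
            return f"__{type_.__name__}__:{key!r}"
    raise TypeError(f"unsupported key type: {type(key).__name__}")


def decode_key_sketch(encoded):
    """Reverse encode_key_sketch using ast.literal_eval, never eval()."""
    for type_ in _TAGGED_TYPES:
        prefix = f"__{type_.__name__}__:"
        if encoded.startswith(prefix):
            return ast.literal_eval(encoded[len(prefix):])
    return encoded


# Round-trip of a tuple key.
assert decode_key_sketch(encode_key_sketch((1, 2))) == (1, 2)

# The known limitation: a string that looks like an encoded key
# comes back as the tagged type, not as the original string.
assert decode_key_sketch(encode_key_sketch("__int__:42")) == 42
```

The last assertion shows why the limitation is inherent to a prefix scheme without escaping: the encoder passes strings through unchanged, so the decoder has no way to tell the two cases apart.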
Key encoding (DD-021)
- ndict_tools.serialize._encode_key(key: Any) → str
Encode a _StackedDict key to a JSON-safe string.
JSON mandates string keys. This function maps any supported hashable Python key to a unique, reversible string using a type-tagged prefix of the form __type__:value (e.g., integer 42 → "__int__:42", tuple (1, 2) → "__tuple__:(1, 2)"). Plain string keys are passed through unchanged. Known limitation: a string key that already starts with a recognised prefix (e.g. "__int__:42") is indistinguishable from an encoded integer key after a round-trip.
- Parameters:
key (Any) – The key to encode. Supported types: str, int, float, bool, flat tuple of scalars, flat frozenset of scalars.
- Returns:
A JSON-safe string representing the key.
- Return type:
str
- Raises:
StackedTypeError – If key is of an unsupported type.
Examples
>>> _encode_key("hello")
'hello'
>>> _encode_key("[42]")
'\\[42]'
>>> _encode_key(42)
'[42]'
>>> _encode_key(3.14)
'[3.14]'
>>> _encode_key(True)
'[True]'
>>> _encode_key((1, 2))
'[(1, 2)]'
>>> _encode_key(frozenset({1, 2}))
'[frozenset{1, 2}]'
Notes
- bool is tested before int because bool is a subclass of int.
- float values use repr() to preserve full precision.
- frozenset is unordered: element order in the encoded form is not guaranteed. Round-trip preserves set equality, not element ordering.
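The three notes above rest on standard Python behaviour that can be checked directly. The snippet below uses only built-ins and makes no assumption about the module itself.

```python
# bool is a subclass of int, so an int check alone would also match True/False.
assert issubclass(bool, int)
assert isinstance(True, int)

# repr() yields a string that converts back to exactly the same float.
x = 0.1 + 0.2
assert repr(x) == "0.30000000000000004"
assert float(repr(x)) == x

# frozensets compare by membership, not by iteration order.
assert frozenset({1, 2}) == frozenset({2, 1})
```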
- ndict_tools.serialize._decode_key(encoded: str) → Any
Decode an encoded JSON key back to its original Python type.
Applies the following decoding rules in priority order:
1. __bool__: prefix → bool (checked before int to avoid misclassification).
2. __int__: prefix → int.
3. __float__: prefix → float.
4. __frozenset__: prefix → frozenset (via ast.literal_eval).
5. __tuple__: prefix → tuple (via ast.literal_eval).
6. No recognised prefix → plain str (identity).
No eval() is used. ast.literal_eval() is used only for flat tuples of Python scalars, which are valid Python literals by definition.
- Parameters:
encoded (str) – The encoded JSON key string produced by _encode_key.
- Returns:
The original Python key.
- Return type:
Any
Examples
>>> _decode_key("hello")
'hello'
>>> _decode_key('\\[42]')
'[42]'
>>> _decode_key("[42]")
42
>>> _decode_key("[3.14]")
3.14
>>> _decode_key("[True]")
True
>>> _decode_key("[(1, 2)]")
(1, 2)
>>> _decode_key("[frozenset{1, 2}]")
frozenset({1, 2})
Notes
Decoding rules (applied in order):
1. Starts with \[ → strip \, return as str.
2. Starts with [frozenset{ and ends with }] → parse inner CSV of scalars, return as frozenset.
3. Starts with [( and ends with )] → ast.literal_eval(), return as tuple.
4. Starts with [ and ends with ] → infer scalar type (True/False → bool; . or e → float; else int).
5. Otherwise → return as str unchanged.
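The five bracket-form rules listed in these notes can be sketched as a small decoder. This is an illustration under stated assumptions, not the module's code: `decode_bracketed_sketch` and `_scalar_sketch` are hypothetical names, and the sketch only handles numeric and boolean scalars inside brackets.

```python
import ast


def _scalar_sketch(text):
    """Infer bool, float or int from the text between the brackets."""
    if text in ("True", "False"):
        return text == "True"
    if "." in text or "e" in text:
        return float(text)
    return int(text)


def decode_bracketed_sketch(encoded):
    # Rule 1: an escaped leading bracket marks a literal string key.
    if encoded.startswith("\\["):
        return encoded[1:]
    # Rule 2: frozenset of scalars, written as comma-separated values.
    if encoded.startswith("[frozenset{") and encoded.endswith("}]"):
        inner = encoded[len("[frozenset{"):-len("}]")]
        parts = [part.strip() for part in inner.split(",") if part.strip()]
        return frozenset(_scalar_sketch(part) for part in parts)
    # Rule 3: a tuple literal, parsed without eval().
    if encoded.startswith("[(") and encoded.endswith(")]"):
        return ast.literal_eval(encoded[1:-1])
    # Rule 4: a single bracketed scalar.
    if encoded.startswith("[") and encoded.endswith("]"):
        return _scalar_sketch(encoded[1:-1])
    # Rule 5: anything else is a plain string key.
    return encoded


assert decode_bracketed_sketch("[42]") == 42
assert decode_bracketed_sketch("\\[42]") == "[42]"
```

The order matters: the frozenset and tuple checks must come before the generic bracket check, since both forms also start with `[` and end with `]`.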
JSON encoder
- class ndict_tools.serialize.NestedDictionaryEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
Bases: JSONEncoder
JSON encoder for _StackedDict instances.
Encodes _StackedDict (and subclasses) as plain nested dict, applying _encode_key to all non-string keys so that the JSON output is valid and fully reversible.
Used internally by _StackedDict.to_json. Not part of the public API.
Examples
>>> import json
>>> from ndict_tools import NestedDictionary
>>> from ndict_tools.serialize import NestedDictionaryEncoder
>>> nd = NestedDictionary({"a": {"b": 1}})
>>> json.dumps(nd, cls=NestedDictionaryEncoder)
'{"a": {"b": 1}}'
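One way a json.JSONEncoder subclass can rewrite non-string keys is to convert the whole structure before handing it to the base class. The sketch below assumes that approach; `TaggedKeyEncoderSketch`, `_tag_key` and `_to_plain` are hypothetical names, and the actual NestedDictionaryEncoder may work differently. Overriding `default` alone would not be enough, because the standard encoder serializes dict subclasses natively and only calls `default` for objects it cannot handle.

```python
import json


def _tag_key(key):
    """Hypothetical key encoder: tag non-string keys, pass strings through."""
    if isinstance(key, str):
        return key
    return f"__{type(key).__name__}__:{key!r}"


def _to_plain(obj):
    """Recursively copy to plain dicts and lists with string-only keys."""
    if isinstance(obj, dict):
        return {_tag_key(k): _to_plain(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [_to_plain(v) for v in obj]
    return obj


class TaggedKeyEncoderSketch(json.JSONEncoder):
    def iterencode(self, o, _one_shot=False):
        # Both json.dumps and json.dump go through iterencode.
        return super().iterencode(_to_plain(o), _one_shot)


text = json.dumps({1: {"b": (2, 3)}, (1, 2): "x"}, cls=TaggedKeyEncoderSketch)
print(text)  # {"__int__:1": {"b": [2, 3]}, "__tuple__:(1, 2)": "x"}
```

Without the conversion step, the tuple key would raise a TypeError and the integer key would be coerced to the string "1", which loses its type on the way back.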
Constructor for JSONEncoder, with sensible defaults.
If skipkeys is false, then it is a TypeError to attempt encoding of keys that are not str, int, float or None. If skipkeys is True, such items are simply skipped.
If ensure_ascii is true, the output is guaranteed to be str objects with all incoming non-ASCII characters escaped. If ensure_ascii is false, the output can contain non-ASCII characters.
If check_circular is true, then lists, dicts, and custom encoded objects will be checked for circular references during encoding to prevent an infinite recursion (which would cause a RecursionError). Otherwise, no such check takes place.
If allow_nan is true, then NaN, Infinity, and -Infinity will be encoded as such. This behavior is not JSON specification compliant, but is consistent with most JavaScript based encoders and decoders. Otherwise, it will be a ValueError to encode such floats.
If sort_keys is true, then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis.
If indent is a non-negative integer, then JSON array elements and object members will be pretty-printed with that indent level. An indent level of 0 will only insert newlines. None is the most compact representation.
If specified, separators should be an (item_separator, key_separator) tuple. The default is (', ', ': ') if indent is None and (',', ': ') otherwise. To get the most compact JSON representation, you should specify (',', ':') to eliminate whitespace.
If specified, default is a function that gets called for objects that can't otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError.
- default(o: Any) → Any
Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
For example, to support arbitrary iterators, you could implement default like this:
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return JSONEncoder.default(self, o)
- encode(o: Any) → str
Return a JSON string representation of a Python data structure.
>>> from json.encoder import JSONEncoder
>>> JSONEncoder().encode({"foo": ["bar", "baz"]})
'{"foo": ["bar", "baz"]}'
- iterencode(o: Any, _one_shot: bool = False)
Encode the given object and yield each string representation as available.
For example:
    for chunk in JSONEncoder().iterencode(bigobject):
        mysocket.write(chunk)
- ndict_tools.serialize._make_decoder_hook(cls: type, class_options: dict[str, Any]) → Callable[[...], Any]
Return an object_pairs_hook that reconstructs a _StackedDict (or subclass) from JSON key-value pairs.
- Parameters:
cls (type) – The _StackedDict subclass to instantiate.
class_options (dict) – Keyword arguments forwarded to cls.from_dict; must include default_setup.
- Returns:
A hook suitable for json.load(..., object_pairs_hook=hook).
- Return type:
Callable
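The factory pattern behind an object_pairs_hook can be sketched with the standard library alone. The names below (`make_hook_sketch`, `decode_int_key`, `MyDict`) are hypothetical, and the signature is simplified: the real helper takes a class and its options and forwards them to cls.from_dict, which this sketch does not attempt to reproduce.

```python
import json


def make_hook_sketch(mapping_cls, decode_key):
    """Return a hook that builds mapping_cls with decoded keys."""
    def hook(pairs):
        # json passes each JSON object as a list of (key, value) pairs.
        return mapping_cls((decode_key(k), v) for k, v in pairs)
    return hook


def decode_int_key(encoded):
    """Hypothetical decoder handling only the __int__: prefix."""
    prefix = "__int__:"
    if encoded.startswith(prefix):
        return int(encoded[len(prefix):])
    return encoded


class MyDict(dict):
    """Stand-in for a dict subclass such as _StackedDict."""


hook = make_hook_sketch(MyDict, decode_int_key)
result = json.loads(
    '{"__int__:1": {"__int__:2": "x"}, "a": 3}',
    object_pairs_hook=hook,
)
assert result == {1: {2: "x"}, "a": 3}
assert type(result[1]) is MyDict
```

The hook runs once per JSON object, innermost first, so nested objects are already instances of the target class by the time the outer object is built.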
Pickle helpers
- ndict_tools.serialize._pickle_dump(nd: Any, path: str | Path, protocol: int | None = None) → None
Write a _StackedDict to a pickle file with a SHA-256 sidecar.
Writes two files:
  - <path>: the pickle file
  - <path>.sha256: hex digest of the pickle bytes
- Parameters:
nd (_StackedDict) – The object to pickle.
path (str or Path) – Destination file path.
protocol (int, optional) – Pickle protocol (default: pickle.DEFAULT_PROTOCOL).
- Warns:
UserWarning – Always emits a warning reminding callers that pickle is unsafe with untrusted files.
- ndict_tools.serialize._pickle_load(path: str | Path, verify: bool = True) → Any
Load a _StackedDict from a pickle file, optionally verifying its SHA-256 sidecar.
- Parameters:
path (str or Path) – Path to the pickle file.
verify (bool, optional) – If True (default), read the .sha256 sidecar and raise StackedValueError if the digest does not match or the sidecar is absent.
- Returns:
The unpickled object.
- Return type:
Any
- Raises:
StackedValueError – If verify=True and the digest mismatches or the sidecar is absent.
- Warns:
UserWarning – Always emits a warning reminding callers that pickle is unsafe with untrusted files.
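The pickle-plus-sidecar scheme described for the two helpers can be sketched as follows. The function names `dump_with_digest` and `load_with_digest` are hypothetical, ValueError stands in for StackedValueError, and the UserWarning is omitted; treat it as an outline of the technique rather than the module's implementation.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def dump_with_digest(obj, path, protocol=None):
    """Write <path> and <path>.sha256 (hex digest of the pickle bytes)."""
    path = Path(path)
    data = pickle.dumps(obj, protocol=protocol)
    path.write_bytes(data)
    sidecar = path.with_name(path.name + ".sha256")
    sidecar.write_text(hashlib.sha256(data).hexdigest())


def load_with_digest(path, verify=True):
    """Read <path>, checking the sidecar digest before unpickling."""
    path = Path(path)
    data = path.read_bytes()
    if verify:
        sidecar = path.with_name(path.name + ".sha256")
        if not sidecar.exists():
            raise ValueError("missing .sha256 sidecar")
        if sidecar.read_text().strip() != hashlib.sha256(data).hexdigest():
            raise ValueError("SHA-256 digest mismatch")
    return pickle.loads(data)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data.pkl"
    dump_with_digest({"a": {"b": 1}}, target)
    restored = load_with_digest(target)

assert restored == {"a": {"b": 1}}
```

The digest is checked before `pickle.loads` runs, so a corrupted file is rejected without being unpickled. A sidecar stored next to the file guards against accidental corruption only: anyone able to replace the pickle can replace the digest too, which is why the warning about untrusted files still applies.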