structured capacity

the Hemera sponge state is 16 Goldilocks field elements. the first 8 (rate) absorb input and produce output. the last 8 (capacity) carry structural context — they are never overwritten by input data. this is how Hemera achieves domain separation without multiple modes.

layout

state[0..8]   rate       input absorption / output squeezing
state[8..16]  capacity   structural context, never touched by input

state[8]      counter    chunk position in file (0-based, u64)
state[9]      flags      structural role (bitfield)
state[10]     msg_length total input byte count (sponge finalization)
state[11]     domain_tag API mode selector
state[12]     ns_min     namespace lower bound (NMT only, zero otherwise)
state[13]     ns_max     namespace upper bound (NMT only, zero otherwise)
state[14]     reserved   must be zero
state[15]     reserved   must be zero
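
the layout above can be written down as named indices. this is a minimal sketch, not Hemera's actual source: the constant names (`IDX_COUNTER`, `IDX_FLAGS`, and so on) and the helper `reserved_is_zero` are hypothetical, and the state is modelled as plain `u64` values rather than a real Goldilocks field type.

```rust
/// Number of field elements in the sponge state (from the layout table).
pub const STATE_WIDTH: usize = 16;
/// state[0..8] is the rate, state[8..16] is the capacity.
pub const RATE_LEN: usize = 8;

// Hypothetical names for the capacity slots listed in the layout table.
pub const IDX_COUNTER: usize = 8;
pub const IDX_FLAGS: usize = 9;
pub const IDX_MSG_LENGTH: usize = 10;
pub const IDX_DOMAIN_TAG: usize = 11;
pub const IDX_NS_MIN: usize = 12;
pub const IDX_NS_MAX: usize = 13;
pub const IDX_RESERVED: [usize; 2] = [14, 15];

/// Returns true when both reserved capacity slots are zero,
/// as the layout requires.
pub fn reserved_is_zero(state: &[u64; STATE_WIDTH]) -> bool {
    IDX_RESERVED.iter().all(|&i| state[i] == 0)
}
```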

flags (state[9])

three single-bit flags, combined via bitwise OR:

FLAG_ROOT   = 0x01    this hash finalizes a tree root
FLAG_PARENT = 0x02    this hash combines two child hashes (internal node)
FLAG_CHUNK  = 0x04    this hash derives a leaf chaining value

valid combinations:

context                          flags           value
plain sponge hash                (none)          0x00
non-root leaf                    CHUNK           0x04
root leaf (single-chunk file)    CHUNK | ROOT    0x05
non-root internal node           PARENT          0x02
root internal node (tree root)   PARENT | ROOT   0x03

flags encode what the hash IS, not what it contains. a flag combination that does not appear in the table is invalid: for example, ROOT alone (0x01) or CHUNK | PARENT (0x06).
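
the table of valid combinations translates directly into a membership check. the sketch below assumes flags are held as a `u64`; the function name `is_valid_flags` is hypothetical and not taken from Hemera's API.

```rust
pub const FLAG_ROOT: u64 = 0x01;
pub const FLAG_PARENT: u64 = 0x02;
pub const FLAG_CHUNK: u64 = 0x04;

/// Returns true only for the five flag values listed as valid:
/// none, CHUNK, CHUNK | ROOT, PARENT, PARENT | ROOT.
pub fn is_valid_flags(flags: u64) -> bool {
    const VALID: [u64; 5] = [
        0x00,
        FLAG_CHUNK,
        FLAG_CHUNK | FLAG_ROOT,
        FLAG_PARENT,
        FLAG_PARENT | FLAG_ROOT,
    ];
    VALID.contains(&flags)
}
```

an explicit allow-list is used instead of bit tests so that any value outside the table, including unknown high bits, is rejected.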

domain tags (state[11])

DOMAIN_HASH             = 0x00    plain hash (default)
DOMAIN_KEYED            = 0x01    keyed hash (MAC)
DOMAIN_DERIVE_KEY_CTX   = 0x02    key derivation — context phase
DOMAIN_DERIVE_KEY_MAT   = 0x03    key derivation — material phase

domain tags are set before the first absorption and never modified. they are orthogonal to flags — a keyed hash of a Merkle leaf would have state[9] = FLAG_CHUNK and state[11] = DOMAIN_KEYED.
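
one way to keep the tag fixed for the lifetime of a hasher is to model it as a closed enum chosen at construction time. this is an illustrative sketch only: the `DomainTag` type and its variant names are hypothetical, and only the numeric values come from the list above.

```rust
use std::convert::TryFrom;

/// Hypothetical enum mirroring the four domain tag values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u64)]
pub enum DomainTag {
    Hash = 0x00,
    Keyed = 0x01,
    DeriveKeyCtx = 0x02,
    DeriveKeyMat = 0x03,
}

impl TryFrom<u64> for DomainTag {
    /// The rejected raw value is returned as the error.
    type Error = u64;

    fn try_from(value: u64) -> Result<Self, Self::Error> {
        match value {
            0x00 => Ok(DomainTag::Hash),
            0x01 => Ok(DomainTag::Keyed),
            0x02 => Ok(DomainTag::DeriveKeyCtx),
            0x03 => Ok(DomainTag::DeriveKeyMat),
            other => Err(other),
        }
    }
}
```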

counter (state[8])

the counter tracks chunk position within a file. chunk 0 gets counter 0, chunk 1 gets counter 1, and so on. the counter prevents chunk reordering: the same data at position 0 and position 5 produces different chaining values.

the counter is set during the structural binding pass of hash_leaf (see tree). plain sponge hashes leave it at zero.

message length (state[10])

total input byte count, stored during sponge finalization. this prevents length extension attacks and distinguishes messages of different lengths that would otherwise produce the same padded block.

namespace bounds (state[12..14])

used only by NMT (Namespace Merkle Tree) nodes. ns_min and ns_max commit the namespace range of a subtree into the hash. when both are zero, hash_node_nmt reduces to hash_node.

the verifier checks: parent.ns_min ≤ left.ns_max < right.ns_min ≤ parent.ns_max (for sorted NMT). namespace bounds enable completeness proofs — cryptographic proof that nothing was withheld for a given namespace.
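
the verifier inequality can be sketched as a small predicate. the `NsRange` struct and the function name `check_nmt_bounds` are hypothetical; namespaces are modelled as `u64`, and the check encodes exactly the chain of comparisons stated above, nothing more.

```rust
/// Hypothetical holder for the namespace range committed by one node.
#[derive(Debug, Clone, Copy)]
pub struct NsRange {
    pub ns_min: u64,
    pub ns_max: u64,
}

/// Checks parent.ns_min <= left.ns_max < right.ns_min <= parent.ns_max.
pub fn check_nmt_bounds(parent: NsRange, left: NsRange, right: NsRange) -> bool {
    parent.ns_min <= left.ns_max
        && left.ns_max < right.ns_min
        && right.ns_min <= parent.ns_max
}
```

for instance, a parent covering 1..=9 with children covering 1..=3 and 5..=9 passes, while children covering 1..=5 and 5..=9 fail the strict middle comparison.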

how capacity provides domain separation

different contexts produce different hashes because different capacity values enter the permutation — not because different functions are called:

plain hash:      state[9] = 0x00, state[11] = 0x00
keyed hash:      state[9] = 0x00, state[11] = 0x01
leaf chunk:      state[8] = counter, state[9] = 0x04
root chunk:      state[8] = counter, state[9] = 0x05
internal node:   state[9] = 0x02
root node:       state[9] = 0x03
NMT node:        state[9] = 0x02, state[12] = ns_min, state[13] = ns_max

capacity fields are mixed into every permutation output. two hashes with identical rate input but different capacity values enter the permutation with different states, so their outputs should be unrelated as long as the permutation behaves like a random permutation. this is how Hemera maintains one function, one mode, and still prevents cross-context collisions.
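
to make the listing concrete, the sketch below builds the initial capacity words for each context and checks that they are pairwise different before any permutation runs. the helper names (`initial_capacity`, `example_contexts`, `all_distinct`) and the sample counter and namespace values are hypothetical; msg_length is left at zero because it is only stored at finalization.

```rust
pub const FLAG_ROOT: u64 = 0x01;
pub const FLAG_PARENT: u64 = 0x02;
pub const FLAG_CHUNK: u64 = 0x04;
pub const DOMAIN_HASH: u64 = 0x00;
pub const DOMAIN_KEYED: u64 = 0x01;

/// Builds state[8..16] in layout order:
/// counter, flags, msg_length, domain_tag, ns_min, ns_max, reserved, reserved.
pub fn initial_capacity(
    counter: u64,
    flags: u64,
    domain_tag: u64,
    ns_min: u64,
    ns_max: u64,
) -> [u64; 8] {
    [counter, flags, 0, domain_tag, ns_min, ns_max, 0, 0]
}

/// One sample capacity per context; counter and namespace values are made up.
pub fn example_contexts() -> Vec<(&'static str, [u64; 8])> {
    vec![
        ("plain hash", initial_capacity(0, 0x00, DOMAIN_HASH, 0, 0)),
        ("keyed hash", initial_capacity(0, 0x00, DOMAIN_KEYED, 0, 0)),
        ("leaf chunk", initial_capacity(5, FLAG_CHUNK, DOMAIN_HASH, 0, 0)),
        ("root chunk", initial_capacity(0, FLAG_CHUNK | FLAG_ROOT, DOMAIN_HASH, 0, 0)),
        ("internal node", initial_capacity(0, FLAG_PARENT, DOMAIN_HASH, 0, 0)),
        ("root node", initial_capacity(0, FLAG_PARENT | FLAG_ROOT, DOMAIN_HASH, 0, 0)),
        ("NMT node", initial_capacity(0, FLAG_PARENT, DOMAIN_HASH, 3, 7)),
    ]
}

/// True when no two entries share the same capacity words.
pub fn all_distinct(items: &[(&'static str, [u64; 8])]) -> bool {
    items
        .iter()
        .enumerate()
        .all(|(i, a)| items[i + 1..].iter().all(|b| a.1 != b.1))
}
```

note that an NMT node with both bounds set to zero would have the same capacity as a plain internal node, which matches the statement that hash_node_nmt reduces to hash_node in that case.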

security invariant

the capacity region is never combined with input data, whether by addition, XOR, or overwrite. input absorption touches only state[0..8]. capacity values are set by the API (flags, counter, domain tag, namespace bounds) or by finalization (message length). the permutation mixes rate and capacity together, but input never directly modifies capacity.

this separation is what makes the flags, counters, and domain tags trustworthy. an attacker who controls the input cannot set capacity values. capacity is controlled by the protocol, not by the data.
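
the invariant can be illustrated with a toy absorption step. this sketch assumes input is absorbed by field addition modulo the Goldilocks prime 2^64 - 2^32 + 1; whether Hemera adds or overwrites is not stated here, and the function name `absorb_block` is hypothetical. the permutation is deliberately left out, since only the absorption boundary is being shown.

```rust
/// Goldilocks prime: 2^64 - 2^32 + 1.
pub const P: u64 = 0xFFFF_FFFF_0000_0001;

/// Adds one block of 8 field elements into the rate, state[0..8].
/// The capacity, state[8..16], is never read or written here.
/// Assumes every element of `state` and `block` is already reduced mod P.
pub fn absorb_block(state: &mut [u64; 16], block: &[u64; 8]) {
    for (slot, &input) in state[..8].iter_mut().zip(block.iter()) {
        // Widen to u128 so the sum cannot overflow before reduction.
        let sum = (*slot as u128 + input as u128) % (P as u128);
        *slot = sum as u64;
    }
}
```

because the loop is bounded by the `state[..8]` slice, no choice of input can reach the capacity slots, which is the property the protocol relies on.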
