encoding specification

how bytes map to field elements and back. this encoding is shared by every system that moves data across the byte↔field boundary.

bytes to field elements

input bytes are packed into field elements using 7-byte little-endian chunks.

bytes:    [b₀, b₁, b₂, b₃, b₄, b₅, b₆,  b₇, b₈, ...]
           └───────── chunk 0 ──────────┘   └── chunk 1 ...
element:  b₀ + b₁·2⁸ + b₂·2¹⁶ + ... + b₆·2⁴⁸

the maximum 7-byte value is 2⁵⁶ − 1 = 72057594037927935. since 2⁵⁶ − 1 < p, every 7-byte chunk maps to a valid field element without conditional reduction. branchless. constant-time.

8 bytes would not work: the maximum 8-byte value 2⁶⁴ − 1 exceeds p. encoding would require a conditional branch to check and reduce — breaking constant-time guarantees and adding complexity.

this is the other half of the double seven (see goldilocks § why this prime): d = 7 is the minimum invertible S-box exponent, and 7 bytes is the maximum unconditional encoding width. both forced by the same prime.

padding

the last chunk may have fewer than 7 bytes. it is zero-padded to 7 bytes. an empty input (0 bytes) absorbs no chunks.

the zero-padding alone is not injective: a 6-byte input ending in non-zero and a 7-byte input with a trailing 0x00 produce the same field element. disambiguation requires recording the total input length externally. in hemera, the length is stored in the sponge capacity (state[10]) during finalization. other systems using this encoding must provide an equivalent mechanism.

field elements to bytes

output field elements are canonicalized (reduced to [0, p)) and serialized as 8-byte little-endian u64 values.

element:  v (where 0 ≤ v < p)
bytes:    [v & 0xFF, (v >> 8) & 0xFF, ..., (v >> 56) & 0xFF]

8 bytes per element, not 7. the output encoding preserves the full field element without loss. the asymmetry (7 bytes in, 8 bytes out) is deliberate: input encoding maximizes absorption throughput, output encoding preserves the canonical representation.

rate and throughput

the throughput depends on the sponge rate r (number of field elements absorbed per permutation call). for hemera with rate r = 8:

r elements × 7 bytes/element = 7r bytes/block

at r = 8: 56 bytes/block. for a 4096-byte chunk: ⌈4096/56⌉ = 74 permutations.

Dimensions

encoding
hemera/reference/encoding
encoding specification Input Encoding (Bytes → Field Elements) Pack input bytes into 7-byte little-endian chunks. Each 7-byte chunk is zero-extended to 8 bytes and interpreted as a u64 in little-endian order, producing one Goldilocks field element. Why 7 bytes, not 8: The maximum 7-byte value is…
nox/reference/encoding
noun encoding specification version: 0.2 status: canonical identity every noun has an identity computed by the instantiated hash function H (see nouns.md structural hash). the nox protocol operates exclusively on hash identities. in the canonical instantiation (nox), the…

Local Graph