parameter decisions

field: Goldilocks

Why not 31-bit fields: capacity=8 at 31 bits yields only 124 bits collision resistance.

Why not 254-bit: multiprecision costs ~10x more than native 64-bit.

Why Goldilocks (p = 2^64 - 2^32 + 1):

  • native CPU width — single 64-bit register per element
  • fast reduction — subtract-and-shift, no division
  • large NTT domain — multiplicative group of order 2^32
  • curve independence — no coupling to any elliptic curve
  • 8-byte elements — clean alignment, no padding

S-box: d=7

The S-box exponent must be a bijection over the field: gcd(d, p-1) = 1.

  • d=3: gcd(3, p-1) = 3. Not invertible.
  • d=5: gcd(5, p-1) = 5. Not invertible.
  • d=7: gcd(7, p-1) = 1. Invertible.

d=7 is the minimum invertible exponent. Multiplicative depth is 3 (computed as x -> x^2 -> x^4 -> x^3 * x^4 = x^7 with squarings and one multiply).

state width: t=16, r=8, c=8

The ecosystem standard t=12 gives exactly 128-bit collision resistance with capacity 4 — zero margin.

BHT quantum collision search at cap=4 is 2^85, insufficient for a permanent system.

Security comparison:

metric cap=4 (t=12) cap=8 (t=16)
classical collision 2^128 2^256
BHT quantum collision 2^85 2^171
classical preimage 2^256 2^512
Grover quantum preimage 2^128 2^256

Throughput is identical: both have rate r=8 = 56 input bytes per permutation call.

round counts: R_F=8, R_P=64

Full rounds (R_F=8): the wide trail strategy guarantees at least 8 active S-boxes across 4 full rounds. Differential probability per S-box is at most 6/2^64. Over 8 active S-boxes: (6/2^64)^8 ~ 2^-480.

Partial rounds (R_P=64): algebraic degree after 64 partial rounds is 7^64 ~ 2^180. An interpolation attack requires ~2^180 evaluations.

The 42 additional partial rounds beyond Plonky3's R_P=22 add only ~19% to total field multiplications. This is cheap margin for a permanent hash.

Context: the Ethereum Foundation bounty program has not produced attacks on Poseidon2 at standard round counts. Hemera's R_P=64 extends well beyond any published attack threshold.

round structure: 8 + 64 = 72

The total 72 is not a power of 2. But the total never appears in code — it is a derived sum.

Loop bounds and array sizes are powers of 2:

  • R_F = 8 (2^3)
  • R_P = 64 (2^6)
  • half-full = 4 (2^2)

R_P=64 was chosen over R_P=56 because the partial round constant array is a data structure. A 64-element array aligns to cache lines and simplifies indexing.

computational elegance

Every parameter that appears as a loop bound, array size, or memory layout is a power of 2:

parameter value power of 2 code role
p (Goldilocks) 2^64 - 2^32 + 1 reduction via shifts field arithmetic
t (state width) 16 2^4 array size, SIMD width
c (capacity) 8 2^3 security parameter
r (rate) 8 2^3 absorption loop bound
R_F (full rounds) 8 2^3 outer loop bound
R_P (partial rounds) 64 2^6 inner loop bound, constant array size
output (bytes) 64 2^6 output buffer size
element (bytes) 8 2^3 memory stride

Only non-power-of-2 values: derived sums (72 total rounds, 192 total constants), input rate (56 = 7 x 8 bytes), and the S-box exponent d=7.

The Goldilocks prime forces 7 twice: as the S-box exponent (minimum invertible) and in the encoding rate (56 bytes = 7 field elements of 8 bytes each).

SIMD-aligned memory access, clean loop unrolling, cache-line alignment — all follow from the power-of-2 discipline.

The permutation loop structure:

for _ in 0..4:        // half-full rounds (power of 2)
    add_constants()
    sbox_full()        // 16 S-boxes (power of 2)
    mds()

for _ in 0..64:       // partial rounds (power of 2)
    add_constant()
    sbox_single()      // 1 S-box
    mds()

for _ in 0..4:        // half-full rounds (power of 2)
    add_constants()
    sbox_full()
    mds()

Dimensions

parameters
/bandwidth/parameters

Local Graph