parameter decisions
field: Goldilocks
Why not 31-bit fields: capacity=8 at 31 bits gives 8 × 31 = 248 capacity bits, so only 124 bits of collision resistance.
Why not 254-bit: multiprecision costs ~10x more than native 64-bit.
Why Goldilocks (p = 2^64 - 2^32 + 1):
- native CPU width: a single 64-bit register per element
- fast reduction: subtract-and-shift, no division
- large NTT domain: multiplicative subgroup of order 2^32, since p - 1 = 2^32 × (2^32 - 1)
- curve independence: no coupling to any elliptic curve
- 8-byte elements: clean alignment, no padding
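The reduction and NTT-domain claims can be checked with a short sketch. It is written in Python for readability, so it relies on arbitrary-precision integers; a native implementation would do the same steps on 64-bit words with explicit carry and borrow handling. The names `reduce128`, `mul`, and `OMEGA` are illustrative and not taken from any Hemera code.

```python
P = 2**64 - 2**32 + 1          # Goldilocks prime
MASK64 = 2**64 - 1
MASK32 = 2**32 - 1

def reduce128(x):
    """Reduce 0 <= x < 2^128 modulo P with shifts, adds, and subtracts only."""
    lo = x & MASK64            # bits 0..63
    hi = x >> 64               # bits 64..127
    hi_lo = hi & MASK32        # bits 64..95
    hi_hi = hi >> 32           # bits 96..127
    # Uses 2^64 = 2^32 - 1 (mod P) and 2^96 = -1 (mod P).
    t = lo + ((hi_lo << 32) - hi_lo) - hi_hi
    if t < 0:                  # t > -2^32, so one correction is enough
        t += P
    if t >= P:                 # t < 2P, so one correction is enough
        t -= P
    return t

def mul(a, b):
    """Field multiplication: full 128-bit product, then fast reduction."""
    return reduce128(a * b)

# NTT domain: P - 1 is divisible by 2^32, so a subgroup of that order exists.
TWO_ADICITY = 32
assert (P - 1) % 2**TWO_ADICITY == 0

# 7 is a quadratic non-residue mod P, so this element has order exactly 2^32.
OMEGA = pow(7, (P - 1) >> TWO_ADICITY, P)

if __name__ == "__main__":
    a, b = 0x0123456789ABCDEF, 0xFEDCBA9876543210
    print(mul(a, b) == (a * b) % P)
    print(pow(OMEGA, 2**31, P) == P - 1)
```

The two conditional corrections replace a general modular division, which is what "no division" refers to.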
S-box: d=7
The S-box map x -> x^d must be a bijection over the field, which requires gcd(d, p-1) = 1.
- d=3: gcd(3, p-1) = 3. Not invertible.
- d=5: gcd(5, p-1) = 5. Not invertible.
- d=7: gcd(7, p-1) = 1. Invertible.
d=7 is the smallest invertible exponent above 1. Multiplicative depth is 3: x^2 = x * x, then x^4 = x^2 * x^2 and x^3 = x^2 * x, then x^7 = x^3 * x^4. That is four multiplications in total (two squarings and two general multiplies).
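A minimal sketch of the exponent check and the depth-3 evaluation follows. The function names are hypothetical, and the inverse exponent is computed here only to demonstrate that the map is a bijection, not as a statement about how any implementation inverts it.

```python
from math import gcd

P = 2**64 - 2**32 + 1  # Goldilocks prime

def sbox7(x):
    """x^7 in four multiplications, multiplicative depth 3."""
    x2 = x * x % P       # depth 1
    x4 = x2 * x2 % P     # depth 2
    x3 = x2 * x % P      # depth 2
    return x3 * x4 % P   # depth 3

# Because gcd(7, P - 1) = 1, the exponent 7 has an inverse modulo P - 1.
D_INV = pow(7, -1, P - 1)  # requires Python 3.8+

def sbox7_inverse(y):
    """Undo sbox7 by raising to the inverse exponent."""
    return pow(y, D_INV, P)

if __name__ == "__main__":
    for d in (3, 5, 7):
        print(d, gcd(d, P - 1))
    x = 123456789
    print(sbox7_inverse(sbox7(x)) == x)
```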
state width: t=16, r=8, c=8
The ecosystem standard t=12 gives exactly 128-bit collision resistance with capacity 4 — zero margin.
BHT quantum collision search at cap=4 is 2^85, insufficient for a permanent system.
Security comparison (generic bounds computed from capacity bits, 64 bits per element):
| metric | cap=4 (t=12) | cap=8 (t=16) |
|---|---|---|
| classical collision | 2^128 | 2^256 |
| BHT quantum collision | 2^85 | 2^171 |
| classical preimage | 2^256 | 2^512 |
| Grover quantum preimage | 2^128 | 2^256 |
Input absorbed per permutation call is identical: both have rate r=8 elements, i.e. 8 × 7 = 56 input bytes per call.
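The table entries can be reproduced from the capacity alone. The sketch below assumes the formulas the table appears to use (c/2 for classical collision, c/3 for BHT, c for classical preimage, c/2 for Grover, with c in bits) and ignores any additional limit imposed by the output length. `generic_bounds` and `BYTES_PER_ELEM` are illustrative names; the value 7 bytes per element is inferred from the 56-byte rate.

```python
def generic_bounds(capacity_elems, element_bits=64):
    """Generic attack exponents (log2 of work) from capacity bits alone."""
    c = capacity_elems * element_bits
    return {
        "classical_collision": c / 2,
        "bht_quantum_collision": c / 3,
        "classical_preimage": c,
        "grover_quantum_preimage": c / 2,
    }

RATE_ELEMS = 8
BYTES_PER_ELEM = 7  # assumption: 7 input bytes are packed into each element
RATE_BYTES = RATE_ELEMS * BYTES_PER_ELEM

if __name__ == "__main__":
    for cap in (4, 8):
        row = generic_bounds(cap)
        print(cap, {k: round(v) for k, v in row.items()})
    # The 31-bit alternative with the same capacity of 8 elements:
    print(generic_bounds(8, element_bits=31)["classical_collision"])
    print(RATE_BYTES)
```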
round counts: R_F=8, R_P=16
Full rounds (R_F=8): use the x⁷ S-box. The wide trail strategy guarantees at least 8 active S-boxes across 4 full rounds. Differential probability per S-box is at most 6/2^64. Over 8 active S-boxes: (6/2^64)^8 = 6^8/2^512 ≈ 2^-491.
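The exponent of the trail bound can be evaluated directly. This sketch assumes only what the paragraph states: a per-S-box probability of at most (d-1)/2^64 with d=7, and 8 active S-boxes. The constant names are illustrative.

```python
import math

SBOX_DEGREE = 7
FIELD_BITS = 64
ACTIVE_SBOXES = 8

# (x + a)^d - x^d = b has degree d - 1 in x, so at most d - 1 solutions.
per_sbox_log2 = math.log2(SBOX_DEGREE - 1) - FIELD_BITS
trail_log2 = ACTIVE_SBOXES * per_sbox_log2

if __name__ == "__main__":
    print(f"per S-box: 2^{per_sbox_log2:.2f}")
    print(f"8 active S-boxes: 2^{trail_log2:.1f}")
```

The result lies a little below 2^-491, far under any 128-bit threshold.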
Partial rounds (R_P=16): use x⁻¹ (field inversion) instead of x⁷. The algebraic degree per partial round is p-2 ≈ 2^64, so after 16 partial rounds it is (p-2)^16 ≈ 2^1024. Combined with the full rounds (7^8 ≈ 2^22), the total degree is ≈ 7^8 × (p-2)^16 ≈ 2^1046. The x⁻¹ S-box achieves far higher algebraic degree with far fewer rounds: 16 instead of 64.
Security margin: 2^1046 / 2^128 = 2^918, i.e. 918 bits over the 128-bit security target. No known or foreseeable algebraic attack comes close.
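The degree arithmetic above can be reproduced with exact integers. The sketch simply multiplies the per-round degrees as the text does; it is a check of the arithmetic, not an independent cryptanalytic estimate. Variable names are illustrative.

```python
import math

P = 2**64 - 2**32 + 1  # Goldilocks prime
R_F, R_P = 8, 16
TARGET_BITS = 128

partial_degree = (P - 2) ** R_P   # x^-1 = x^(p-2), applied 16 times
full_degree = 7 ** R_F            # x^7, applied 8 times
total_degree = full_degree * partial_degree

# math.log2 accepts arbitrarily large Python integers.
total_log2 = math.log2(total_degree)
margin_bits = total_log2 - TARGET_BITS

if __name__ == "__main__":
    print(f"partial rounds: 2^{math.log2(partial_degree):.1f}")
    print(f"full rounds:    2^{math.log2(full_degree):.1f}")
    print(f"total degree:   2^{total_log2:.1f}")
    print(f"margin:         {margin_bits:.1f} bits")
```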
Context: the Ethereum Foundation bounty program has not produced attacks on Poseidon2 at standard round counts. Hemera's x⁻¹ partial S-box with R_P=16 provides a 918-bit margin, more than any other Poseidon2 instantiation.
round structure: 8 + 16 = 24
Total: 24 = 3 × 2³ rounds. Every component is a power of 2 (R_F=8=2³, R_P=16=2⁴). The round structure is 4 initial full rounds + 16 partial rounds + 4 terminal full rounds.
Loop bounds and array sizes are powers of 2:
- R_F = 8 (2^3)
- R_P = 16 (2^4)
- half-full = 4 (2^2)
R_P=16 provides 918 bits of security margin over the 128-bit target. The x⁻¹ partial S-box achieves far higher algebraic degree per round than x⁷, so fewer rounds are needed for equivalent security.
computational elegance
Every parameter that appears as a loop bound, array size, or memory layout is a power of 2:
| parameter | value | power of 2 | code role |
|---|---|---|---|
| p (Goldilocks) | 2^64 - 2^32 + 1 | reduction via shifts | field arithmetic |
| t (state width) | 16 | 2^4 | array size, SIMD width |
| c (capacity) | 8 | 2^3 | security parameter |
| r (rate) | 8 | 2^3 | absorption loop bound |
| R_F (full rounds) | 8 | 2^3 | outer loop bound |
| R_P (partial rounds) | 16 | 2^4 | inner loop bound, constant array size |
| output (bytes) | 32 | 2^5 | output buffer size |
| element (bytes) | 8 | 2^3 | memory stride |
Only non-power-of-2 values: derived sums (24 total rounds, 144 total constants), input rate (56 bytes = 8 elements × 7 bytes), and the S-box exponent d=7.
The Goldilocks prime forces 7 twice: as the S-box exponent (minimum invertible) and in the encoding rate (56 bytes = 8 field elements carrying 7 input bytes each, since 7 bytes always fit below p).
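The power-of-2 discipline and the derived values are easy to verify mechanically. The dictionary keys below are illustrative labels for the parameters in the table, and the 7 bytes per element figure is the packing implied by the 56-byte rate.

```python
def is_power_of_two(n):
    """True when n has exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0

PARAMS = {
    "t": 16, "c": 8, "r": 8,
    "R_F": 8, "R_P": 16,
    "output_bytes": 32, "element_bytes": 8,
}

DERIVED = {
    "total_rounds": PARAMS["R_F"] + PARAMS["R_P"],
    # one constant per state element in full rounds, one per partial round
    "total_constants": PARAMS["R_F"] * PARAMS["t"] + PARAMS["R_P"],
    "input_rate_bytes": PARAMS["r"] * 7,
    "sbox_exponent": 7,
}

if __name__ == "__main__":
    print(all(is_power_of_two(v) for v in PARAMS.values()))
    print({k: (v, is_power_of_two(v)) for k, v in DERIVED.items()})
```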
SIMD-aligned memory access, clean loop unrolling, cache-line alignment — all follow from the power-of-2 discipline.
The permutation loop structure:
for _ in 0..4:          // half-full rounds (power of 2)
    add_constants()
    sbox_full()         // 16 S-boxes (power of 2)
    mds()
for _ in 0..16:         // partial rounds (power of 2)
    add_constant()
    sbox_single()       // 1 S-box
    mds()
for _ in 0..4:          // half-full rounds (power of 2)
    add_constants()
    sbox_full()
    mds()
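The loop structure can be made concrete with a toy sketch. It is not the Hemera permutation: the round constants and the linear layer below are hypothetical placeholders chosen only so the code runs, and only the 4 + 16 + 4 round schedule and the S-box choices (x⁷ in full rounds, x⁻¹ on one element in partial rounds) follow the text.

```python
P = 2**64 - 2**32 + 1  # Goldilocks prime
T, HALF_FULL, R_P = 16, 4, 16

# HYPOTHETICAL constants: counters stand in for the real round constants.
FULL_RC = [[r * T + i + 1 for i in range(T)] for r in range(2 * HALF_FULL)]
PARTIAL_RC = [1000 + r for r in range(R_P)]

def mds(state):
    # PLACEHOLDER linear layer y_i = x_i + sum(x); not the real matrix.
    s = sum(state) % P
    return [(x + s) % P for x in state]

def sbox_full(state):
    # x^7 on all 16 elements
    return [pow(x, 7, P) for x in state]

def sbox_single(state):
    # x^-1 on the first element only, computed as x^(p-2); maps 0 to 0
    return [pow(state[0], P - 2, P)] + state[1:]

def full_round(state, constants):
    state = [(x + c) % P for x, c in zip(state, constants)]
    return mds(sbox_full(state))

def partial_round(state, constant):
    state = [(state[0] + constant) % P] + state[1:]
    return mds(sbox_single(state))

def permute(state):
    assert len(state) == T
    for r in range(HALF_FULL):                 # initial full rounds
        state = full_round(state, FULL_RC[r])
    for r in range(R_P):                       # partial rounds
        state = partial_round(state, PARTIAL_RC[r])
    for r in range(HALF_FULL):                 # terminal full rounds
        state = full_round(state, FULL_RC[HALF_FULL + r])
    return state

if __name__ == "__main__":
    out = permute(list(range(T)))
    print(len(out), all(0 <= x < P for x in out))
```

The constant tables hold 8 × 16 + 16 = 144 entries, matching the total given above.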