square root specification
modular square root over the Goldilocks field. given a ∈ F_p, find r such that r² = a (mod p), or determine that no such r exists.
Legendre symbol
the Legendre symbol determines whether a has a square root:
legendre(a) = a^((p−1)/2) mod p
| result | meaning |
|---|---|
| 0 | a = 0 |
| 1 | a is a quadratic residue (has a square root) |
| p−1 | a is a quadratic non-residue (no square root) |
exactly (p−1)/2 non-zero elements are quadratic residues. the other (p−1)/2 are non-residues.
this is one exponentiation: (p−1)/2 = 0x7FFFFFFF80000000. the same square-and-multiply chain as inversion, with a different exponent.
Tonelli-Shanks algorithm
since p ≡ 1 (mod 4), not every element has a square root, and the simple formula r = a^((p+1)/4) does not apply (that works only when p ≡ 3 mod 4). the general algorithm is Tonelli-Shanks.
setup
factor out powers of 2 from p − 1:
p − 1 = 2³² × s where s = 2³² − 1 = ε
the two-adicity of p is 32. this means Tonelli-Shanks uses at most 32 iterations.
choose a known quadratic non-residue: z = 7 (confirmed by vectors § primitive root).
algorithm
sqrt(n):
if n = 0: return 0
if legendre(n) = p−1: return ⊥ // no square root
M = 32 // two-adicity
c = 7^s mod p // 7^ε, a 2^M-th root of unity
t = n^s mod p // n^s
R = n^((s+1)/2) mod p // initial guess
loop:
if t = 1:
if R > (p−1)/2: R = p − R // canonical: pick r ≤ (p−1)/2
return R
find least i > 0 such that t^(2^i) = 1 // i < M always
b = c^(2^(M−i−1)) mod p // adjustment factor
M = i
c = b² mod p
t = t · c mod p
R = R · b mod p
cost
each iteration: one squaring chain (to find i), one exponentiation (to compute b), two multiplications. the initial setup costs one Legendre check + two exponentiations by s.
worst case: 32 iterations (when t has order exactly 2³²). typical case: far fewer.
constant-time variant
the variable-time find least i loop leaks information about the input through timing. for constant-time operation, always perform all 32 iterations, using conditional moves to select the result:
sqrt_ct(n):
// same setup
for round in 0..32:
// find i by squaring t up to 32 times
// conditionally update (M, c, t, R) based on whether t^(2^i) = 1
this is required when the input is secret (e.g., in zero-knowledge circuits).
sign convention
every non-zero quadratic residue has exactly two square roots: r and p − r. the canonical square root is the one where r ≤ (p−1)/2.
(p−1)/2 = 0x7FFFFFFF80000000
if r > (p−1)/2, return p − r instead.