hashing and confidentiality

long-term integrity: 64-byte (512-bit) digests provide full 256-bit post-quantum pre-image margin (grover only halves the exponent). this keeps verification sound well past 2040 without needing a risky, late migration.
migration risk: recomputing hashes later requires the original bytes. if some chunks are only available on third-party nodes, coordination and bandwidth become the bottleneck. starting at 64 bytes now avoids this systemic risk.
negligible performance cost: blake3’s cost is dominated by message compression, not output size. moving from 32 to 64 bytes adds <5% cpu on large streams and only tens of nanoseconds on small items, while storage proofs roughly double but remain <2% of payload.
privacy-by-default: publishing only anonymised node ids plus commitments to weights lets anyone audit topology without exposing personal balances or authorship by default. this resists “store now, decrypt later” and casual scraping.
selective transparency: commitments are additively homomorphic. any stakeholder can later open chosen edges or identities and prove consistency with the canonical state, enabling targeted audits and disclosures without blanket deanonymisation.
zk readiness: the ranking and audit logic runs inside zk circuits using poseidon2 (constraint-cheap), while storage and networking continue to use blake3 (hardware-fast). this separation keeps everyday ops fast and proofs practical.
dedup and upgrade agility: fastcdc chunking plus content addressing preserves dedup across edits. multicodec tagging and auxiliary poseidon2 tags let us evolve circuit-level primitives without changing stored blake3 roots.
operational simplicity: a single fixed digest width across all merkle nodes, proofs, and indices simplifies implementations, cache behavior, and protocol negotiation.

why these specific primitives

blake3-xof-512 for storage: saturates modern simd, supports verified streaming, and scales linearly; choosing 64-byte output reclaims full quantum headroom at minimal cost.
pedersen commitments for weights: perfectly hiding, additively homomorphic, mature libraries, no need to alter storage keys. later migration to kzg/ipa is possible via zk equivalence proofs.
poseidon2 in circuits: field-friendly hash with low constraint count; we attach poseidon2 tags to blake3 nodes where zk needs to traverse merkle paths.
anonymised node ids: poseidon2(pubkey ∥ salt) gives unlinkability by default and a clean path for voluntary identity disclosure.
range proofs on weights: ensure values are non-negative and bounded, preventing pathological or adversarial inputs without revealing exact balances.

foundational choices

content addressing: blake3-xof, fixed 64-byte digests for all merkle nodes and cids.
chunking/dedup: variable-size fastcdc with target 32 KiB (min 4 KiB, max 256 KiB). each chunk is stored once and referenced by its 64-byte digest.
storage encoding: bao-style verified streaming adapted to 64-byte digests.
multicodec: register blake3-512-xof; cid strings in base32-lowercase.
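
the chunking and content-addressing choices above can be sketched as follows. this is a minimal illustration, not the real chunker: it uses a plain gear rolling hash rather than full fastcdc (no normalised chunking), and `hashlib.blake2b` with a 64-byte digest stands in for blake3-xof-512 only because it ships with python. the names `chunk`, `put`, `digest64` and the seeded gear table are assumptions made for this example.

```python
import hashlib
import random

# sizes from the text: target 32 KiB, min 4 KiB, max 256 KiB
MIN_SIZE, TARGET_SIZE, MAX_SIZE = 4 * 1024, 32 * 1024, 256 * 1024
MASK = TARGET_SIZE - 1            # 15 low bits: a cut fires roughly once per 32 KiB
_rng = random.Random(0)           # fixed seed so every run derives the same gear table
GEAR = [_rng.getrandbits(64) for _ in range(256)]


def digest64(data: bytes) -> bytes:
    # stand-in for blake3-xof-512 (assumption: blake2b is used only for portability)
    return hashlib.blake2b(data, digest_size=64).digest()


def chunk(data: bytes):
    """yield content-defined chunks using a simple gear rolling hash."""
    start, fp = 0, 0
    for i, byte in enumerate(data):
        fp = ((fp << 1) + GEAR[byte]) & 0xFFFFFFFFFFFFFFFF
        size = i - start + 1
        if size < MIN_SIZE:
            continue                          # never cut below the minimum size
        if (fp & MASK) == 0 or size >= MAX_SIZE:
            yield data[start:i + 1]           # content-defined cut, or forced cut at max
            start, fp = i + 1, 0
    if start < len(data):
        yield data[start:]                    # trailing chunk may be shorter than MIN_SIZE


def put(store: dict, data: bytes) -> list:
    """store each chunk once; return the ordered list of 64-byte chunk digests."""
    refs = []
    for piece in chunk(data):
        key = digest64(piece)
        store.setdefault(key, piece)          # dedup: an existing key is left untouched
        refs.append(key)
    return refs
```

storing the same bytes twice leaves the store unchanged, which is the dedup property the text relies on; a file is then just the ordered list of digests returned by `put`.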

confidentiality model

public sees only an anonymised, weighted digraph: edges with commitments to weights and anonymised node identifiers.
authorship and per-account contributions stay hidden by default.
anyone can later reveal chosen subgraphs and prove they match the canonical state.

identities and node ids

real identity key: ed25519/secp256k1 (off-chain).
anonymised node id: poseidon2(pubkey ∥ salt) → field element.
disclosure path: reveal salt + pubkey to bind a node id back to an identity when desired.
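
a minimal sketch of the node-id derivation and the disclosure check. poseidon2 is not available in the python standard library, so sha-256 reduced into a prime field stands in for it here; the bn254 scalar field modulus, the 32-byte key and salt sizes, and the function names are all assumptions for illustration.

```python
import hashlib

# bn254 scalar field modulus (assumed field; the text does not fix one)
FIELD_P = 21888242871839275222246405745257275088548364400416034343698204186575808495617


def node_id(pubkey: bytes, salt: bytes) -> int:
    """anonymised node id: H(pubkey ∥ salt) mapped to a field element."""
    # stand-in for poseidon2; fixed-length pubkeys keep the concatenation unambiguous
    h = hashlib.sha256(pubkey + salt).digest()
    return int.from_bytes(h, "big") % FIELD_P


def verify_disclosure(published_id: int, pubkey: bytes, salt: bytes) -> bool:
    """check a voluntary disclosure: the revealed (pubkey, salt) must re-derive the id."""
    return node_id(pubkey, salt) == published_id
```

until the owner reveals the salt, an observer who knows a candidate pubkey cannot confirm the link, provided the salt has enough entropy to resist guessing.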

edge record (canonical on-chain/off-chain object)

u′, v′: anonymised node ids (field elements).
cₑ: pedersen commitment to total edge weight wₑ (additively homomorphic).
πₑ: range proof that 0 ≤ wₑ < 2⁶⁴ (bulletproof or halo2 equivalent).
leaf hash for storage: blake3-512(u′ ∥ v′ ∥ cₑ ∥ πₑ).

staking/deposit flow (confidential)

depositor chooses random rᵢ and computes cᵢ = g^{wᵢ} h^{rᵢ}.
relayer verifies a proof that the depositor knows an opening (wᵢ, rᵢ) of cᵢ and checks the deposit against policy.
relayer updates cₑ ← cₑ · cᵢ and republishes the updated leaf + merkle path.
no public link to the depositor is recorded; only anonymised node ids and commitments are visible.
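
the deposit flow can be illustrated with a toy pedersen commitment. the group below (p = 1019, q = 509, g = 4, h = 9) is far too small to be secure and anyone can compute log_g h, which a real deployment must prevent; it is chosen only so the arithmetic is easy to follow. the proof of knowledge that the relayer checks is omitted, and the function names are assumptions.

```python
import secrets

# toy parameters (NOT secure): p = 2q + 1 with p and q prime
P, Q = 1019, 509
G, H = 4, 9    # both are squares mod P, so each generates the order-Q subgroup


def commit(w: int, r: int) -> int:
    """pedersen commitment c = g^w * h^r in the order-Q subgroup."""
    return (pow(G, w % Q, P) * pow(H, r % Q, P)) % P


def opens_to(c: int, w: int, r: int) -> bool:
    """check that (w, r) is a valid opening of c."""
    return c == commit(w, r)


def deposit(c_e: int, w_i: int):
    """depositor picks random r_i; relayer folds c_i into the edge commitment."""
    r_i = secrets.randbelow(Q)
    c_i = commit(w_i, r_i)
    return (c_e * c_i) % P, c_i, r_i


# usage: start from the commitment to zero (c_e = 1) and add two deposits
c_e = 1
c_e, c_1, r_1 = deposit(c_e, 5)
c_e, c_2, r_2 = deposit(c_e, 7)
```

after both deposits, `c_e` opens to the total weight 12 with randomness `r_1 + r_2`, even though the relayer never saw either weight in the clear.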

ranking with zk correctness

inputs (public): anonymised adjacency list and per-edge commitments cₑ.
inputs (private witness): openings of commitments (weights), salts for any node-ids needed inside the circuit.
computation: run k power-iteration steps or chosen ranking algorithm using private weights.
outputs: rank vector r (quantised) + poseidon2 root of r + zk proof π that r was computed exactly from the hidden weights and published topology.
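
the computation the circuit would have to reproduce can be sketched in the clear. this example assumes a damped, weight-proportional random walk with α = 0.85 and 32-bit fixed-point quantisation; the text leaves the algorithm and its parameters open, so these are placeholders, and nothing here produces the zk proof or the poseidon2 root.

```python
def rank(n, edges, alpha=0.85, k=50):
    """run k power-iteration steps over weighted edges (u, v, w)."""
    out_weight = [0] * n
    for u, _, w in edges:
        out_weight[u] += w
    r = [1.0 / n] * n
    for _ in range(k):
        # mass held by nodes with no outgoing weight is spread uniformly
        dangling = sum(r[u] for u in range(n) if out_weight[u] == 0)
        nxt = [(1.0 - alpha) / n + alpha * dangling / n] * n
        for u, v, w in edges:
            if w == 0:
                continue                      # zero-weight edges carry no mass
            nxt[v] += alpha * r[u] * w / out_weight[u]
        r = nxt
    return r


def quantise(r, bits=32):
    """map each rank in [0, 1] to a fixed-point integer, as a circuit would need."""
    return [round(x * (1 << bits)) for x in r]


# usage: the weights here play the role of the private witness
edges = [(0, 1, 3), (1, 2, 1), (2, 0, 2), (0, 2, 1)]
r = rank(3, edges)
q = quantise(r)
```

inside a circuit the same steps would run over field elements with fixed-point arithmetic, so the quantisation width directly affects constraint count and rounding error.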

selective disclosure of subgraphs

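reveal a chosen edge by providing the pedersen openings (wᵢ, rᵢ) of its deposits; the verifier recomputes each cᵢ = g^{wᵢ} h^{rᵢ} and checks that ∏ cᵢ equals the current cₑ (equivalently, that cₑ opens to (Σ wᵢ, Σ rᵢ)).
reveal authorship by disclosing the pubkey and salt for the affected node ids, so anyone can recompute the id and bind it to the key.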
partial disclosure is zero-knowledge for everything not revealed; unopened edges remain perfectly hidden.
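
the disclosure check follows directly from the homomorphism used in the deposit flow. the identities below use the same g, h, wᵢ, rᵢ, cᵢ, cₑ as above; the revealed subset R and the name c_hidden are notation introduced here, and the second line describes a partial opening, which is an extension the text does not spell out.

```latex
\begin{aligned}
c_e &= \prod_{i} c_i
     = \prod_{i} g^{w_i} h^{r_i}
     = g^{\sum_i w_i}\, h^{\sum_i r_i}
     = g^{w_e} h^{r_e}, \\
c_{\mathrm{hidden}} &= c_e \cdot \prod_{i \in R} c_i^{-1}
     = g^{\,w_e - \sum_{i \in R} w_i}\; h^{\,r_e - \sum_{i \in R} r_i}.
\end{aligned}
```

the first line is what a verifier checks when every deposit of an edge is opened. the second shows that opening only the deposits in R leaves a valid commitment to the remaining weight, which stays hidden.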

hashing inside vs outside circuits

outside (fast path): blake3-xof-512 for all content-addressed storage and merkle nodes.
inside zk (constraint-minimal): poseidon2 for merkle steps and vector commitments.
glue: store poseidon2 tags of blake3 nodes where circuits need to traverse the same paths cheaply.

security notes

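integrity: a 64-byte digest leaves a generic pre-image search of ~2²⁵⁶ under grover, which only square-roots the 2⁵¹² output space. caveat: blake3's own documentation states that outputs longer than 32 bytes add no security, because internal chaining values stay 256 bits wide, so the effective margin should be confirmed against that limit.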
hiding: pedersen commitments are perfectly hiding; range proofs prevent invalid or extreme values.
anonymity: node ids derive from salted hashes; unlinkable until the salt is disclosed by the owner.
downgrade resistance: clients that understand commitments/zk must reject content missing required fields once the flag-day is reached.
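
the downgrade rule can be sketched as a small acceptance check. the text does not say how the flag-day is measured, so a block height is assumed here; `FLAG_DAY_HEIGHT` is a hypothetical value, and the field names follow the informal edge object in the appendix. verifying the range proof itself is out of scope for this sketch.

```python
FLAG_DAY_HEIGHT = 1_000_000                 # hypothetical activation height
REQUIRED_FIELDS = ("commit", "range_proof")  # names taken from the appendix edge object


def accept_edge(edge: dict, height: int) -> bool:
    """reject edges that lack required fields once the flag-day is reached."""
    if height < FLAG_DAY_HEIGHT:
        return True                          # before the flag-day, legacy records pass
    # after the flag-day every required field must be present and non-empty
    return all(edge.get(field) for field in REQUIRED_FIELDS)
```

a real client would also verify the commitment encoding and the range proof before accepting the record; presence is only the first gate.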

performance and overhead

hashing throughput: blake3 remains ~7–10 GB/s per core for large streams; 64-byte outputs add <5% cpu.
metadata growth: +32 bytes per commitment and ~50 bytes per aggregated range proof per edge (implementation-dependent).
indexes/outboards: double due to 64-byte digests; still <2% overhead relative to payload for typical chunk sizes.
zk proving: dominated by poseidon and linear algebra; practical for ~10⁵ edges per proof on today’s hardware; shard larger graphs across multiple proofs.
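
the index overhead figure can be checked with simple arithmetic. the sketch assumes a binary merkle tree over whole chunks, which needs about two digests per chunk for large files (n leaf digests plus n − 1 parent digests); the real outboard layout may differ, so treat the numbers as an estimate.

```python
DIGEST_BYTES = 64


def tree_overhead(chunk_size: int, digest: int = DIGEST_BYTES) -> float:
    """digest bytes per payload byte, assuming ~2 digests per chunk."""
    return 2 * digest / chunk_size


# overhead at the min, target and max chunk sizes from the chunking parameters
overheads = {size: tree_overhead(size) for size in (4 * 1024, 32 * 1024, 256 * 1024)}
```

under this assumption the overhead is about 0.39% at the 32 KiB target and stays below 2% for chunks larger than 6400 bytes, but reaches about 3.1% if every chunk sits at the 4 KiB minimum.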

interoperability and migration

genesis uses only 64-byte digests; no dual-hash compatibility path required.
if a new field-friendly hash supersedes poseidon2, add an auxiliary tag without touching blake3 roots.
commitments can migrate (pedersen ↔ kzg) via a zk equivalence proof without re-indexing storage.

implementation checklist

update hashing and bao to fixed 64-byte outputs.
implement fastcdc chunker and content-addressed store keyed by 64-byte blake3 digests.
define edge record schema and merkle layout.
integrate pedersen commitments and range proofs for weights.
build poseidon2 merkle helpers and rank circuit (groth16/plonk/stark as appropriate).
define selective disclosure api (openings, salt reveal, audit routines).
set a flag-day after which clients must enforce presence/validity of commitments and range proofs.

open questions / to decide

exact ranking algorithm parameters (α, iterations k, damping, normalisation).
choice of proof system (groth16 vs plonk vs stark) based on ecosystem and performance.
range-proof system (bulletproofs, plonkish custom gates, or halo2 native) and aggregation strategy.
privacy budget for node-id salts and rotation policy.
governance policy for subgraph disclosures and audit procedures.

appendix: example edge object (informal)

edge {
  uid: blake3-512(u′ ∥ v′ ∥ cₑ ∥ πₑ)
  from: u′            // poseidon2(pubkey ∥ salt)
  to:   v′            // poseidon2(pubkey ∥ salt)
  commit: cₑ          // pedersen commitment to total weight
  range_proof: πₑ     // non-negative, bounded
  tags: [poseidon2(uid)] // optional, for in-circuit traversal
}
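
the informal object above could be modelled as follows. this is a sketch under stated assumptions: blake2b (64-byte digest) stands in for blake3-512, sha-256 stands in for poseidon2, node ids and the commitment are serialised as fixed 32-byte values so the concatenation u′ ∥ v′ ∥ cₑ ∥ πₑ is unambiguous, and the class and helper names are invented for this example.

```python
import hashlib
from dataclasses import dataclass


def h64(data: bytes) -> bytes:
    # stand-in for blake3-512 (assumption)
    return hashlib.blake2b(data, digest_size=64).digest()


def circuit_tag(data: bytes) -> int:
    # stand-in for poseidon2(uid) (assumption)
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


@dataclass(frozen=True)
class Edge:
    src: int            # u′, anonymised node id (field element below 2^256)
    dst: int            # v′, anonymised node id
    commit: bytes       # cₑ, serialised pedersen commitment (32 bytes assumed)
    range_proof: bytes  # πₑ, opaque proof bytes (variable length, placed last)

    def encode(self) -> bytes:
        """u′ ∥ v′ ∥ cₑ ∥ πₑ with fixed-width ids."""
        return (self.src.to_bytes(32, "big") + self.dst.to_bytes(32, "big")
                + self.commit + self.range_proof)

    @property
    def uid(self) -> bytes:
        """64-byte leaf hash used for storage."""
        return h64(self.encode())

    @property
    def tags(self) -> list:
        """optional tag for in-circuit traversal."""
        return [circuit_tag(self.uid)]
```

because the uid covers the commitment and the proof, every deposit that updates cₑ produces a new leaf and a new merkle path, matching the republish step in the deposit flow.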
