Why 4 KB Chunks
Ten independent lines of evidence converge on 4096 bytes as the optimal chunk size for verified streaming.
1. Field alignment
4096 = 2^12. Each chunk requires 74 absorptions + 1 binding = 75 permutations per leaf. A power-of-two chunk size aligns cleanly with the field arithmetic.
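The permutation counts can be reproduced with a short calculation. This is a sketch under one assumption not stated in the text: a sponge rate of 56 bytes per absorption, inferred from 74 = ceil(4096 / 56), which also matches the other chunk sizes in the comparison table. The names `RATE_BYTES`, `absorptions`, and `perms_per_leaf` are illustrative.

```python
# Assumed rate: 56 bytes absorbed per permutation (inferred, not stated in the text).
RATE_BYTES = 56

def absorptions(chunk_bytes: int, rate: int = RATE_BYTES) -> int:
    """Number of rate-sized blocks needed to absorb one chunk (integer ceiling)."""
    return -(-chunk_bytes // rate)

def perms_per_leaf(chunk_bytes: int, rate: int = RATE_BYTES) -> int:
    """Absorptions plus one binding permutation, as counted in the text."""
    return absorptions(chunk_bytes, rate) + 1

for size in (256, 1024, 4096, 8192, 16384, 65536):
    print(f"{size:>6} B: {absorptions(size):>5} absorptions, "
          f"{perms_per_leaf(size):>5} permutations")
```

If the real rate differs, only `RATE_BYTES` needs to change.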
2. OS page alignment
4 KB is the base page size on x86, ARM, and RISC-V, although some ARM systems are configured with 16 KB or 64 KB pages. It matches the default block size of ext4, XFS, NTFS, and APFS, and it is the minimum NVMe transfer unit. Zero-copy mmap operates at page granularity, so 4 KB chunks require no partial-page bookkeeping.
3. L1 cache fit
4 KB fits entirely in L1 data cache (typically 32 to 64 KB). 8 KB increases cache pressure. 16 KB takes a quarter to half of a typical L1, leaving little room for the rest of the working set. Processing a chunk without evicting the working set from L1 marks a sharp performance boundary.
4. STARK proof granularity
75 permutations x ~1,200 constraints = ~90,000 constraints per leaf. This is large enough to amortize proof overhead and small enough to prove individual chunks without excessive trace length.
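The same multiplication gives the constraint counts for the other chunk sizes. A minimal sketch: the figure of about 1,200 constraints per permutation is from the text, the permutation counts are from the comparison table, and the function name `leaf_constraints` is hypothetical.

```python
# Approximate constraints per permutation, as stated in the text (~1,200).
CONSTRAINTS_PER_PERM = 1_200

def leaf_constraints(perms: int, per_perm: int = CONSTRAINTS_PER_PERM) -> int:
    """Approximate STARK constraints needed to prove one leaf hash."""
    return perms * per_perm

# Permutations per leaf for each chunk size, as listed in the comparison table.
perms_by_chunk = {
    "256B": 6, "1KB": 20, "4KB": 75, "8KB": 148, "16KB": 294, "64KB": 1172,
}

for label, perms in perms_by_chunk.items():
    print(f"{label:>5}: ~{leaf_constraints(perms):,} constraints")
```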
5. Tree depth and proof size
| Data size | Chunks | Tree depth | Proof size |
|---|---|---|---|
| 1 MB | 256 | 8 | 512 B |
| 1 GB | 262,144 | 18 | 1,152 B |
| 1 TB | 268M | 28 | 1,792 B |
| 1 PB | 274B | 38 | 2,432 B |
| 1 EB | 281T | 48 | 3,072 B |
| 1 YB | 295 x 10^18 | 68 | 4,352 B |
Even at 10^24 cyberlinks, MMR depth stays near 80 levels (log2(10^24) is about 79.7), which remains tractable.
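The depth and proof-size columns follow from two quantities: the number of 4 KB chunks and a 64-byte sibling hash per tree level (implied by 512 B / 8 levels in the first row). The sketch below assumes binary units (1 MB = 2^20 bytes) and a balanced binary tree. The helper names are illustrative.

```python
CHUNK_BYTES = 4096
SIBLING_BYTES = 64  # one sibling hash per level, implied by the table

def chunk_count(data_bytes: int, chunk_bytes: int = CHUNK_BYTES) -> int:
    """Number of chunks, rounding the last partial chunk up."""
    return -(-data_bytes // chunk_bytes)

def tree_depth(data_bytes: int, chunk_bytes: int = CHUNK_BYTES) -> int:
    """ceil(log2(chunks)) computed with integers only."""
    return max(chunk_count(data_bytes, chunk_bytes) - 1, 0).bit_length()

def proof_bytes(data_bytes: int, chunk_bytes: int = CHUNK_BYTES) -> int:
    """Merkle proof size: one sibling hash per level."""
    return tree_depth(data_bytes, chunk_bytes) * SIBLING_BYTES

sizes = {"1 MB": 2**20, "1 GB": 2**30, "1 TB": 2**40,
         "1 PB": 2**50, "1 EB": 2**60, "1 YB": 2**80}

for label, nbytes in sizes.items():
    print(f"{label}: depth {tree_depth(nbytes)}, proof {proof_bytes(nbytes)} B")
```

Proof size grows by 64 bytes each time the data doubles, which is why the column rises so slowly.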
6. Overhead ratio
64 bytes of metadata per 4096-byte chunk = ~1.6% overhead. At 256 B chunks: 25%. At 64 KB chunks: 0.1%. At 1.6%, overhead is already negligible while granularity is still useful.
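The ratios above are metadata size divided by chunk size. A small sketch, where the 64-byte figure is taken from the text and the function name is hypothetical:

```python
METADATA_BYTES = 64  # per-chunk metadata, as stated in the text

def overhead_ratio(chunk_bytes: int, metadata_bytes: int = METADATA_BYTES) -> float:
    """Metadata bytes as a fraction of chunk payload bytes."""
    return metadata_bytes / chunk_bytes

for size in (256, 1024, 4096, 8192, 16384, 65536):
    print(f"{size:>6} B chunks: {overhead_ratio(size):.2%} overhead")
```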
7. Deduplication quality
4 KB matches the page size of databases, VM disk images, and document container formats. Content-defined chunking at 4 KB aligns with existing storage deduplication infrastructure.
8. Streaming verification
A receiver buffers one chunk plus one Merkle proof, approximately 6 KB in total, and can verify and process data incrementally without buffering the entire file.
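The receiver-side check can be sketched as follows. This is an illustration, not the scheme's actual construction: SHA-512 stands in for the hash because it yields a 64-byte digest. The domain-separation prefixes, the power-of-two chunk count, and all function names are assumptions made for the example.

```python
import hashlib

CHUNK = 4096

def H(data: bytes) -> bytes:
    # Stand-in hash with a 64-byte digest; the real hash function is not assumed here.
    return hashlib.sha512(data).digest()

def leaf_hash(chunk: bytes) -> bytes:
    return H(b"\x00" + chunk)  # hypothetical leaf prefix

def node_hash(left: bytes, right: bytes) -> bytes:
    return H(b"\x01" + left + right)  # hypothetical inner-node prefix

def build_levels(chunks):
    """Return all tree levels, leaves first. Assumes a power-of-two chunk count."""
    level = [leaf_hash(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def make_proof(levels, index):
    """Collect the sibling hash at each level on the path from leaf to root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])
        index >>= 1
    return proof

def verify_chunk(root, chunk, index, proof):
    """Recompute the path to the root using only one chunk and its proof."""
    node = leaf_hash(chunk)
    for sibling in proof:
        if index % 2 == 0:
            node = node_hash(node, sibling)
        else:
            node = node_hash(sibling, node)
        index >>= 1
    return node == root

# Sender side: eight distinct 4 KB chunks.
chunks = [bytes([i]) * CHUNK for i in range(8)]
levels = build_levels(chunks)
root = levels[-1][0]

# Receiver side: each chunk is verified on arrival, holding only chunk + proof.
for i, chunk in enumerate(chunks):
    proof = make_proof(levels, i)
    assert verify_chunk(root, chunk, i, proof)
    buffered = len(chunk) + sum(len(s) for s in proof)
print("bytes buffered per chunk:", buffered)
```

With eight chunks the proof is only three hashes. At terabyte scale it is 28 hashes, so chunk plus proof stays under 6 KB.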
9. Network transport
At MTU 1500, a 4 KB chunk spans 3 TCP segments. At MTU 9000, it fits in a single jumbo frame.
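The segment counts follow from the payload each packet can carry. A rough sketch, assuming IPv4 with 40 bytes of IP and TCP headers and no per-chunk framing. The `headers` parameter and the function name are illustrative.

```python
IP_TCP_HEADERS = 40  # assumed: 20-byte IPv4 header + 20-byte TCP header, no options

def segments(payload_bytes: int, mtu: int = 1500, headers: int = IP_TCP_HEADERS) -> int:
    """Number of TCP segments needed to carry one chunk."""
    mss = mtu - headers  # maximum payload per segment
    return -(-payload_bytes // mss)

print(segments(4096))            # standard Ethernet MTU
print(segments(4096, mtu=9000))  # jumbo frames
```

The count for large chunks is sensitive to header assumptions. With 12 bytes of TCP options (for example timestamps), `segments(65536, headers=52)` gives 46 rather than 45, which is one way to arrive at the 46 shown for 64 KB in the comparison table.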
10. Bounded locality
Changing one byte requires rehashing 75 permutations for the affected chunk plus 2 x log2(N) permutations along the Merkle path to the root, where N is the number of chunks. The blast radius of a single-byte edit is bounded and predictable.
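As a worked instance of this bound, the sketch below evaluates it for a few file sizes. The costs of 75 permutations per leaf and 2 per tree level come from the text. The function name and the rounding of the depth up to a whole level are assumptions.

```python
def rehash_perms(num_chunks: int, perms_per_leaf: int = 75, perms_per_level: int = 2) -> int:
    """Permutations needed to update the root after a single-byte edit."""
    depth = max(num_chunks - 1, 0).bit_length()  # ceil(log2(num_chunks))
    return perms_per_leaf + perms_per_level * depth

for label, n in {"1 MB": 2**8, "1 GB": 2**18, "1 TB": 2**28}.items():
    print(f"{label}: {rehash_perms(n)} permutations")
```

Going from 1 GB to 1 TB multiplies the data by 1,024 but adds only 20 permutations to the cost of an edit.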
Comparison table
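| Metric | 256 B | 1 KB | 4 KB | 8 KB | 16 KB | 64 KB |
|---|---|---|---|---|---|---|
| Absorptions per chunk | 5 | 19 | 74 | 147 | 293 | 1,171 |
| Permutations per leaf | 6 | 20 | 75 | 148 | 294 | 1,172 |
| 1 GB tree depth | 22 | 20 | 18 | 17 | 16 | 14 |
| 1 GB proof size (bytes) | 1,408 | 1,280 | 1,152 | 1,088 | 1,024 | 896 |
| Overhead ratio | 25% | 6% | 1.6% | 0.8% | 0.4% | 0.1% |
| OS page aligned | no | no | yes | no | no | no |
| L1 cache fit | yes | yes | yes | marginal | no | no |
| STARK constraints | 7.2K | 24.0K | 90.0K | 178K | 353K | 1.4M |
| Streaming buffer | 256 B | 1 KB | 4 KB | 8 KB | 16 KB | 64 KB |
| Dedup quality | poor | fair | good | good | fair | poor |
| Network packets | 1 | 1 | 3 | 6 | 12 | 46 |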
4 KB is the only column with yes on both page alignment and L1 cache fit.