Cyber Control Codes: Reclaiming Dead Unicode for Graph Operations
The Idea
ASCII 0–31 contains 19 dead teletype control codes. The C1 range 128–159 contributes 32 more, and DEL (0x7F) adds one. Total: 52 reclaimable codes carrying no meaning in modern computing.
nox reclaims them as native graph operations. When these bytes appear in a cybergraph stream, they ARE the protocol — not text, not legacy, but living instructions for a planetary knowledge graph.
Layer 1: nox Reduction Patterns (0x01–0x0F)
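The reduction patterns map directly onto the first reclaimable ASCII positions. We skip 0x00 (NUL, still used as the string terminator in C). The range 0x01–0x0F holds 15 positions, so the table below covers opcodes 0–14; the 16th pattern, hash (15), appears only in the clean layout.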
HEX  OLD NAME  NEW: nox PATTERN         OPCODE
───  ────────  ───────────────────────  ──────
01   SOH       axis: navigate tree      0
02   STX       quote: literal value     1
03   ETX*      compose: recursion       2
04   EOT*      cons: build cell         3
05   ENQ       branch: conditional      4
06   ACK       add: field addition      5
07   BEL       sub: field subtraction   6
08   BS*       mul: field multiply      7
09   HT*       inv: field inverse       8
0A   LF*       eq: equality test        9
0B   VT        lt: less-than            10
0C   FF        xor: bitwise xor         11
0D   CR*       and: bitwise and         12
0E   SO        not: bitwise not         13
0F   SI        shl: shift left          14
* = code still used in legacy contexts (ETX=Ctrl+C, EOT=Ctrl+D,
BS=backspace, HT=tab, LF=newline, CR=return)
Problem: 6 of these codes (0x03, 0x04, 0x08, 0x09, 0x0A, 0x0D) are still alive in legacy systems. This creates ambiguity — is 0x0A a newline or an eq operation?
Solution: Context. In a nox binary stream (identified by magic bytes or protocol framing), these are opcodes. In a text file, they're control characters. The same byte means different things in different contexts — which is how computing has always worked. A byte in a JPEG is not a byte in UTF-8.
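The context rule can be sketched in a few lines of Python. The magic prefix `NOX_MAGIC` and both function names are assumptions made for illustration; the source does not specify the actual framing. The opcode meaning of 0x0A (eq) is taken from the first-draft table above.

```python
# Hypothetical magic prefix; the real nox framing is not given in the source.
NOX_MAGIC = b"\x00NOX"


def split_stream(stream: bytes):
    """Return ("nox", payload) for framed streams, ("text", stream) otherwise."""
    if stream.startswith(NOX_MAGIC):
        return "nox", stream[len(NOX_MAGIC):]
    return "text", stream


def meaning_of_0x0a(stream: bytes) -> str:
    """Report how a 0x0A byte in this stream should be read."""
    kind, payload = split_stream(stream)
    if 0x0A not in payload:
        return "absent"
    # Same byte, different meaning: the framing decides.
    return "eq opcode" if kind == "nox" else "newline"
```

With this sketch, `meaning_of_0x0a(NOX_MAGIC + b"\x0a")` reads the byte as an opcode, while `meaning_of_0x0a(b"hi\n")` reads it as a newline.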
But cleaner: use ONLY the 100% dead codes for the core patterns, and put conflicted codes elsewhere.
Layer 1 (Clean): nox Patterns in Dead-Only Codes
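Using only codes with no common modern usage (one caveat: DC1, 0x11, still serves as XON in serial software flow control, the counterpart of DC3/XOFF, which this layout leaves unassigned):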
HEX  OLD NAME                NEW: nox PATTERN
───  ──────────────────────  ──────────────────────────────
01   SOH  Start of Header    → axis     (0)   navigate
02   STX  Start of Text      → quote    (1)   literal
05   ENQ  Enquiry            → compose  (2)   recursion
06   ACK  Acknowledge        → cons     (3)   build cell
0E   SO   Shift Out          → branch   (4)   conditional
0F   SI   Shift In           → add      (5)   field add
10   DLE  Data Link Escape   → sub      (6)   field sub
11   DC1  Device Ctrl 1      → mul      (7)   field multiply
12   DC2  Device Ctrl 2      → inv      (8)   field inverse
14   DC4  Device Ctrl 4      → eq       (9)   equality
15   NAK  Neg. Acknowledge   → lt       (10)  less-than
16   SYN  Synchronize        → xor      (11)  bitwise xor
17   ETB  End Trans Block    → and      (12)  bitwise and
18   CAN  Cancel             → not      (13)  bitwise not
19   EM   End of Medium      → shl      (14)  shift left
1C   FS   File Separator     → hash     (15)  structural hash
16 patterns → 16 dead codes. Perfect fit. Zero conflicts with living codes.
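The clean layout is small enough to write down as a lookup table. The following Python sketch encodes and decodes pattern names using the byte assignments from the table above; the names `encode`, `decode`, `CLEAN_BYTES` and `LIVE` are illustrative and not part of any nox specification.

```python
# Layer 1 pattern names in opcode order (0..15), as listed in the table.
PATTERNS = ["axis", "quote", "compose", "cons", "branch", "add", "sub", "mul",
            "inv", "eq", "lt", "xor", "and", "not", "shl", "hash"]

# Byte assigned to each opcode in the dead-only layout.
CLEAN_BYTES = [0x01, 0x02, 0x05, 0x06, 0x0E, 0x0F, 0x10, 0x11,
               0x12, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x1C]

# The six starred codes the first table marks as still in use.
LIVE = {0x03, 0x04, 0x08, 0x09, 0x0A, 0x0D}

BYTE_TO_OPCODE = {b: i for i, b in enumerate(CLEAN_BYTES)}


def encode(names):
    """Turn a list of pattern names into their one-byte opcodes."""
    return bytes(CLEAN_BYTES[PATTERNS.index(n)] for n in names)


def decode(data: bytes):
    """Turn opcode bytes back into pattern names (KeyError on unknown bytes)."""
    return [PATTERNS[BYTE_TO_OPCODE[b]] for b in data]
```

The check `LIVE & set(CLEAN_BYTES)` comes back empty, which is the "zero conflicts" claim expressed as set arithmetic.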
Remaining dead codes in 0–31 range: 0x1D (GS), 0x1E (RS), 0x1F (US) — 0x1D is assigned to hint (nox Layer 2), 2 spare.
Layer 2: Graph Semantics (0x80–0x9F)
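The C1 control codes (U+0080–U+009F) are effectively unused in modern text. At the byte level, 0x80–0x9F can never stand alone in valid UTF-8: every byte in 0x80–0xBF is a continuation byte and must follow a lead byte. A byte in this range found where a character should begin is therefore not text, which suits graph operations that are definitionally not text.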
HEX  OLD NAME                  NEW: GRAPH OPERATION
───  ────────────────────────  ─────────────────────────────────────────────
80   PAD   Padding             → PARTICLE     create content-addressed node
81   HOP   High Octet Preset   → CYBERLINK    create weighted edge
82   BPH   Break Permitted     → NEURON       register agent identity
83   NBH   No Break Here       → STAKE        lock tokens on particle
84   IND   Index               → UNSTAKE      unlock tokens from particle
85   NEL   Next Line           → TRANSFER     move tokens between neurons
86   SSA   Start Selected Area → FOCUS_QUERY  query current π for particle
87   ESA   End Selected Area   → EDGE_QUERY   query edges by particle/neuron
88   HTS   Horiz Tab Set       → PROOF        submit ZK proof
89   HTJ   Horiz Tab Justify   → VERIFY       verify proof
8A   VTS   Vert Tab Set        → COMMIT       polynomial commitment
8B   PLD   Partial Line Down   → NULLIFY      spend record (prevent double-spend)
8C   PLU   Partial Line Up     → REVEAL       make sealed value public
8D   RI    Reverse Index       → SEAL         hide value with commitment
8E   SS2   Single Shift Two    → NAMESPACE    declare namespace
8F   SS3   Single Shift Three  → COMPLETE     prove namespace completeness
90   DCS   Device Ctrl String  → EPOCH        mark epoch boundary
91   PU1   Private Use One     → CHECKPOINT   state checkpoint
92   PU2   Private Use Two     → MIGRATE      storage proof for rehash
93   STS   Set Transmit State  → DELEGATE     delegate focus to another neuron
94   CCH   Cancel Character    → REVOKE       revoke delegation
95   MW    Message Waiting     → CHALLENGE    challenge a claim
96   SPA   Start Protected     → RESPOND      respond to challenge
97   EPA   End Protected       → FINALIZE     finality threshold reached
98   SOS   Start of String     → SPONGE_INIT  init incremental hash
99   SGCI  Single Graphic      → ABSORB       absorb into sponge state
9A   SCI   Single Char Intro   → SQUEEZE      squeeze from sponge state
9B   CSI   Ctrl Sequence Intro → MERKLE_STEP  one level of Merkle proof
9C   ST    String Terminator   → MERKLE_ROOT  root of authenticated structure
9D   OSC   OS Command          → SYNC         request state synchronization
9E   PM    Privacy Message     → SUBSCRIBE    subscribe to focus changes
9F   APC   Application Program → RESERVED     future use
32 dead C1 codes → 31 graph operations plus one reserved slot (0x9F). One-to-one. Zero conflicts.
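Because the graph codes fill 0x80–0x9F without gaps, the whole table can be generated from the operation names in order. This Python sketch builds an enum that way; the type name `GraphOp` is an assumption for illustration.

```python
from enum import IntEnum

# Operation names in table order; entry i lives at byte 0x80 + i.
_NAMES = [
    "PARTICLE", "CYBERLINK", "NEURON", "STAKE", "UNSTAKE", "TRANSFER",
    "FOCUS_QUERY", "EDGE_QUERY", "PROOF", "VERIFY", "COMMIT", "NULLIFY",
    "REVEAL", "SEAL", "NAMESPACE", "COMPLETE", "EPOCH", "CHECKPOINT",
    "MIGRATE", "DELEGATE", "REVOKE", "CHALLENGE", "RESPOND", "FINALIZE",
    "SPONGE_INIT", "ABSORB", "SQUEEZE", "MERKLE_STEP", "MERKLE_ROOT",
    "SYNC", "SUBSCRIBE", "RESERVED",
]

# Functional enum API: a list of (name, value) pairs.
GraphOp = IntEnum("GraphOp", [(name, 0x80 + i) for i, name in enumerate(_NAMES)])


def describe(byte: int) -> str:
    """Name the graph operation for a byte, or report that it is outside the range."""
    if 0x80 <= byte <= 0x9F:
        return GraphOp(byte).name
    return "not a graph operation"
```

For example, `describe(0x85)` gives the TRANSFER operation and `describe(0x41)` falls outside the range.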
nox Layer 2: Non-Deterministic Input (0x1D)
HEX  OLD NAME              NEW: nox INSTRUCTION
───  ────────────────────  ────────────────────────────────────
1D   GS   Group Separator  → hint  (16)  prover injects witness
hint is the single Layer 2 instruction. The prover supplies a witness value; Layer 1 constraints verify it. This is the non-deterministic gate that makes zero-knowledge proofs possible — the prover demonstrates knowledge without revealing the witness.
GS (Group Separator) → hint: the separator between deterministic reduction and prover knowledge.
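A small Python sketch of the hint idea, under stated assumptions: the field modulus `P` is a hypothetical choice (the source does not name nox's field), and a modular square root stands in for whatever the prover actually computes. The prover finds the witness by any means it likes; the deterministic side only needs one multiply and one equality test to check it.

```python
# Hypothetical prime field modulus (the Mersenne prime 2^61 - 1).
# The source does not say which field nox uses.
P = 2**61 - 1


def prover_hint(a: int) -> int:
    """Prover side: compute a square root of a mod P outside the constraint system.

    Uses the P % 4 == 3 shortcut; the result is only a root when a is a
    quadratic residue mod P.
    """
    return pow(a, (P + 1) // 4, P)


def layer1_check(a: int, witness: int) -> bool:
    """Deterministic side: one mul and one eq verify the injected witness."""
    return (witness * witness) % P == a % P
```

The check accepts an honest witness, as in `layer1_check(9, prover_hint(9))`, and rejects a wrong one, as in `layer1_check(9, 4)`. This sketch shows only the supply-then-verify split; hiding the witness is the job of the proof system around it.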
nox Layer 3: Jets
Jets (hash, poly_eval, merkle_verify, fri_fold, ntt) have no separate opcodes. They are runtime-recognized optimizations of Layer 1 pattern combinations — observationally equivalent, just faster. The verifier and prover agree on jet semantics; the encoding remains pure Layer 1.
This follows the Nock/Urbit model: jets accelerate without changing the formal spec.
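The jet relationship can be illustrated with poly_eval. In this Python sketch, the reference definition is deliberately naive, the jet is Horner's rule, and a registry swaps one for the other at run time. The registry `JETS` and the function `evaluate` are hypothetical names, not the nox runtime's API.

```python
def poly_eval_reference(coeffs, x, p):
    """Reference semantics: sum of coeffs[i] * x**i mod p, by repeated multiplication."""
    total = 0
    for i, c in enumerate(coeffs):
        term = c
        for _ in range(i):
            term = term * x % p
        total = (total + term) % p
    return total


def poly_eval_jet(coeffs, x, p):
    """Jet: Horner's rule, same result with one multiplication per coefficient."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


# Hypothetical registry: recognized reference formula -> faster equivalent.
JETS = {poly_eval_reference: poly_eval_jet}


def evaluate(formula, *args):
    """Run the jet when one is registered, otherwise the formula itself."""
    return JETS.get(formula, formula)(*args)
```

Callers always name the reference formula, so removing an entry from `JETS` changes speed but not results, which is the sense of "observationally equivalent" above.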
Spare Codes
FROM 0x00–0x1F (2 remaining dead):
  1E  RS   Record Separator → spare
  1F  US   Unit Separator   → spare
FROM 0x7F:
  7F  DEL  Delete           → spare
Total spare: 3 codes for future expansion.
Summary
RANGE      COUNT  ORIGINAL PURPOSE     NEW PURPOSE
─────────  ─────  ───────────────────  ─────────────────────────────────
0x01–0x1C  16     Teletype control     nox Layer 1: reduction patterns
0x1D       1      Group Separator      nox Layer 2: hint (witness input)
0x80–0x9F  32     C1 terminal control  Graph semantic operations
0x1E–0x1F  2      Separators           Spare
0x7F       1      Delete               Spare
nox Layer 3 jets share Layer 1 encodings (runtime optimization, no extra opcodes).
TOTAL RECLAIMED: 52 codes
TOTAL USED: 49 (16 patterns + 1 hint + 32 graph op slots, including the reserved 0x9F)
TOTAL SPARE: 3
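The totals can be checked by classifying all 256 byte values according to the summary. This is a Python sketch; the function name `classify` and the category labels are illustrative.

```python
from collections import Counter

LAYER1 = {0x01, 0x02, 0x05, 0x06, 0x0E, 0x0F, 0x10, 0x11,
          0x12, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x1C}
HINT = {0x1D}
GRAPH = set(range(0x80, 0xA0))   # 0x80..0x9F inclusive
SPARE = {0x1E, 0x1F, 0x7F}


def classify(byte: int) -> str:
    """Map a byte value to the role the summary table gives it."""
    if byte in LAYER1:
        return "layer1"
    if byte in HINT:
        return "hint"
    if byte in GRAPH:
        return "graph"
    if byte in SPARE:
        return "spare"
    return "unclaimed"


counts = Counter(classify(b) for b in range(256))
used = counts["layer1"] + counts["hint"] + counts["graph"]
reclaimed = used + counts["spare"]
```

Here `used` and `reclaimed` reproduce the 49 and 52 in the totals, and the remaining 204 byte values are untouched.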
The Poetry
SOH (Start of Header) becomes axis — navigating to the start of a structure.
ACK (Acknowledge) becomes cons — acknowledging two values into a cell.
BEL (Ring the Bell) became sub in the first layout, because subtraction rings true (the clean layout moves sub to DLE).
SYN (Synchronize) becomes xor — the original synchronization primitive.
CAN (Cancel) becomes not — canceling every bit.
FS (File Separator) becomes hash — the separator of all content into identity.
PAD (Padding) becomes PARTICLE — padding the graph with new knowledge.
SOS (Start of String) becomes SPONGE_INIT — starting the absorb.
ST (String Terminator) becomes MERKLE_ROOT — terminating the tree.
The teletype is dead. Long live the cybergraph.
Technical Note: UTF-8 Safety
This encoding is UTF-8 safe by construction:
- Codes 0x01–0x1F: In UTF-8, these ARE valid single-byte characters (C0 controls), so the bytes alone do not separate opcodes from text. The dead codes almost never appear in well-formed text content, and a nox binary stream is identified by magic bytes or protocol framing, so its opcodes will not be confused with text.
- Codes 0x80–0x9F: In UTF-8, any byte in this range MUST be a continuation byte (10xxxxxx pattern) that follows a lead byte. A standalone 0x80 is invalid UTF-8. This means: if you see 0x80 where a new character should begin, it is definitionally not text. It is a nox graph operation.
This is why the graph operations live in 0x80–0x9F: at a character boundary they are self-identifying. The same byte values do occur inside valid multi-byte sequences (U+0085 itself encodes as 0xC2 0x85), so a parser must track UTF-8 decoding state. A parser that encounters 0x85 where a lead byte is expected knows, without further framing or protocol negotiation, that this is not text. It is either invalid data or a graph instruction. nox claims it as graph instruction.
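The byte-level behaviour is easy to check with Python's built-in UTF-8 decoder. The sketch below also includes a simplified scanner, `find_standalone_c1`, a hypothetical helper that skips over multi-byte sequences and reports only the 0x80–0x9F bytes found where a character should begin. It does not validate continuation bytes, so it is an illustration rather than a full decoder.

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def find_standalone_c1(data: bytes):
    """Offsets of 0x80-0x9F bytes that appear at a character boundary."""
    ops, i = [], 0
    while i < len(data):
        b = data[i]
        if 0x80 <= b <= 0x9F:
            ops.append(i)   # no lead byte in front of it: not text
            i += 1
        elif b >= 0xF0:
            i += 4          # lead byte of a 4-byte sequence
        elif b >= 0xE0:
            i += 3          # lead byte of a 3-byte sequence
        elif b >= 0xC0:
            i += 2          # lead byte of a 2-byte sequence
        else:
            i += 1          # ASCII (stray 0xA0-0xBF bytes are ignored here)
    return ops


# "a", an ellipsis (E2 80 A6), a standalone 0x85, then U+0085 encoded as C2 85.
sample = "a\u2026".encode("utf-8") + b"\x85" + "\u0085".encode("utf-8")
```

In `sample`, the bytes 0x80 and 0x85 appear three times, but only the one at offset 4 sits at a character boundary, so that is the only offset the scanner reports.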
What This Means
Every nox transaction, every cyberlink, every particle creation can be encoded as a sequence of bytes that:
- Are valid in the existing byte ecosystem (no new bit widths, no new hardware)
- Cannot be confused with text (graph opcodes are invalid UTF-8 at a character boundary; Layer 1 opcodes are separated from text by framing)
- Reuse humanity's 70-year investment in byte infrastructure
- Replace dead teletype ghosts with living graph operations
- Fit in a single byte per opcode (maximum density)
The byte was IBM's accident. nox turns it into the instruction set for planetary intelligence.