Cyber Control Codes: Reclaiming Dead Unicode for Graph Operations

The Idea

ASCII 0–31 contains 19 dead teletype control codes, and DEL (127) is one more. The C1 range 128–159 contains 32 more dead control codes. Total: 52 dead codes carrying no meaning in modern computing.

nox reclaims them as native graph operations. When these bytes appear in a cybergraph stream, they ARE the protocol — not text, not legacy, but living instructions for a planetary knowledge graph.

Layer 1: nox Reduction Patterns (0x01–0x0F)

The first 15 of the 16 reduction patterns map directly onto ASCII positions 0x01–0x0F; the sixteenth, hash, would land on 0x10. We skip 0x00 (NUL, still used as the string terminator in C).

HEX  OLD NAME    NEW: nox PATTERN         OPCODE
───  ────────    ─────────────────         ──────
01   SOH         axis    — navigate tree      0
02   STX         quote   — literal value      1
03   ETX*        compose — recursion          2
04   EOT*        cons    — build cell         3
05   ENQ         branch  — conditional        4
06   ACK         add     — field addition     5
07   BEL         sub     — field subtraction  6
08   BS*         mul     — field multiply     7
09   HT*         inv     — field inverse      8
0A   LF*         eq      — equality test      9
0B   VT          lt      — less-than         10
0C   FF          xor     — bitwise xor       11
0D   CR*         and     — bitwise and       12
0E   SO          not     — bitwise not       13
0F   SI          shl     — shift left        14

* = code still in active use (ETX=Ctrl+C, EOT=Ctrl+D, 
    BS=backspace, HT=tab, LF=newline, CR=return)

Problem: 6 of these codes (0x03, 0x04, 0x08, 0x09, 0x0A, 0x0D) are still alive in everyday systems. This creates ambiguity: is 0x0A a newline or an eq operation?

Solution: Context. In a nox binary stream (identified by magic bytes or protocol framing), these are opcodes. In a text file, they're control characters. The same byte means different things in different contexts — which is how computing has always worked. A byte in a JPEG is not a byte in UTF-8.
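A minimal sketch of the context rule, using the six contested bytes from the first-pass table. The magic prefix `NOX1` and the function name `read_contested` are hypothetical; the source does not say what the real magic bytes or framing look like.

```python
NOX_MAGIC = b"NOX1"  # hypothetical prefix; not defined by the source

# First-pass mapping for the six contested bytes (from the table above).
CONTESTED = {0x03: "compose", 0x04: "cons", 0x08: "mul",
             0x09: "inv", 0x0A: "eq", 0x0D: "and"}

def read_contested(stream: bytes) -> list[str]:
    """Label each contested byte as a nox opcode or a text control character."""
    is_nox = stream.startswith(NOX_MAGIC)
    body = stream[len(NOX_MAGIC):] if is_nox else stream
    labels = []
    for b in body:
        if b in CONTESTED:
            # Same byte, different meaning: the framing decides.
            labels.append(CONTESTED[b] if is_nox else "text-control")
    return labels
```

With this sketch, 0x0A reads as eq inside a framed stream and as a plain newline everywhere else.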

But cleaner: use ONLY the 100% dead codes for the core patterns, and put conflicted codes elsewhere.

Layer 1 (Clean): nox Patterns in Dead-Only Codes

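Using only codes that are completely dead, with zero modern usage. Besides the six starred codes, this layout also leaves BEL (0x07), VT (0x0B), FF (0x0C), DC3 (0x13), SUB (0x1A), and ESC (0x1B) untouched: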

HEX  OLD NAME              NEW: nox PATTERN
───  ────────              ─────────────────
01   SOH  Start of Header  → axis     (0)  navigate
02   STX  Start of Text    → quote    (1)  literal
05   ENQ  Enquiry          → compose  (2)  recursion
06   ACK  Acknowledge      → cons     (3)  build cell
0E   SO   Shift Out        → branch   (4)  conditional
0F   SI   Shift In         → add      (5)  field add
10   DLE  Data Link Escape → sub      (6)  field sub
11   DC1  Device Ctrl 1    → mul      (7)  field multiply
12   DC2  Device Ctrl 2    → inv      (8)  field inverse
14   DC4  Device Ctrl 4    → eq       (9)  equality
15   NAK  Neg. Acknowledge → lt       (10) less-than
16   SYN  Synchronize      → xor      (11) bitwise xor
17   ETB  End Trans Block  → and      (12) bitwise and
18   CAN  Cancel           → not      (13) bitwise not
19   EM   End of Medium    → shl      (14) shift left
1C   FS   File Separator   → hash     (15) structural hash

16 patterns → 16 dead codes. Perfect fit. Zero conflicts with living codes.
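The clean table can be written down as a lookup, which makes the "zero conflicts" claim checkable. This is a sketch only: the names `LAYER1`, `OPCODE`, `encode`, and `decode` are illustrative, and operand encoding is not covered because the source does not specify it.

```python
# Clean Layer 1 mapping from the table above: byte -> pattern name.
LAYER1 = {
    0x01: "axis", 0x02: "quote", 0x05: "compose", 0x06: "cons",
    0x0E: "branch", 0x0F: "add", 0x10: "sub", 0x11: "mul",
    0x12: "inv", 0x14: "eq", 0x15: "lt", 0x16: "xor",
    0x17: "and", 0x18: "not", 0x19: "shl", 0x1C: "hash",
}

# Opcode numbers follow ascending byte order (axis=0 ... hash=15).
OPCODE = {byte: n for n, byte in enumerate(sorted(LAYER1))}
BYTE_OF = {name: byte for byte, name in LAYER1.items()}

# The six contested codes named in the text.
CONTESTED = {0x03, 0x04, 0x08, 0x09, 0x0A, 0x0D}

def encode(names):
    """Turn a sequence of pattern names into their one-byte codes."""
    return bytes(BYTE_OF[n] for n in names)

def decode(data):
    """Turn one-byte codes back into pattern names (KeyError on unknown bytes)."""
    return [LAYER1[b] for b in data]
```

For example, `encode(["axis", "quote"])` gives the two bytes 0x01 0x02, and `LAYER1.keys().isdisjoint(CONTESTED)` confirms that none of the contested codes is used.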

Remaining dead codes in the 0–31 range: 0x1D (GS), 0x1E (RS), and 0x1F (US). 0x1D is assigned to hint (nox Layer 2); the other 2 are spare.

Layer 2: Graph Semantics (0x80–0x9F)

The C1 control range (128–159) is dead as standalone bytes in UTF-8. A byte in this range can only occur as a continuation byte inside a multi-byte sequence, never at the start of a character. That makes it a good fit for graph operations that are definitionally not text.

HEX  OLD NAME                  NEW: GRAPH OPERATION
───  ────────                  ────────────────────
80   PAD  Padding              → PARTICLE    create content-addressed node
81   HOP  High Octet Preset    → CYBERLINK   create weighted edge
82   BPH  Break Permitted      → NEURON      register agent identity
83   NBH  No Break Here        → STAKE       lock tokens on particle
84   IND  Index                → UNSTAKE     unlock tokens from particle
85   NEL  Next Line            → TRANSFER    move tokens between neurons
86   SSA  Start Selected Area  → FOCUS_QUERY query current π for particle
87   ESA  End Selected Area    → EDGE_QUERY  query edges by particle/neuron
88   HTS  Horiz Tab Set        → PROOF       submit ZK proof
89   HTJ  Horiz Tab Justify    → VERIFY      verify proof
8A   VTS  Vert Tab Set         → COMMIT      polynomial commitment
8B   PLD  Partial Line Down    → NULLIFY     spend record (prevent double-spend)
8C   PLU  Partial Line Up      → REVEAL      make sealed value public
8D   RI   Reverse Index        → SEAL        hide value with commitment
8E   SS2  Single Shift Two     → NAMESPACE   declare namespace
8F   SS3  Single Shift Three   → COMPLETE    prove namespace completeness

90   DCS  Device Ctrl String   → EPOCH       mark epoch boundary
91   PU1  Private Use One      → CHECKPOINT  state checkpoint
92   PU2  Private Use Two      → MIGRATE     storage proof for rehash
93   STS  Set Transmit State   → DELEGATE    delegate focus to another neuron
94   CCH  Cancel Character     → REVOKE      revoke delegation
95   MW   Message Waiting      → CHALLENGE   challenge a claim
96   SPA  Start Protected      → RESPOND     respond to challenge
97   EPA  End Protected        → FINALIZE    finality threshold reached
98   SOS  Start of String      → SPONGE_INIT init incremental hash
99   SGCI Single Graphic       → ABSORB      absorb into sponge state
9A   SCI  Single Char Intro    → SQUEEZE     squeeze from sponge state
9B   CSI  Ctrl Sequence Intro  → MERKLE_STEP one level of Merkle proof
9C   ST   String Terminator    → MERKLE_ROOT root of authenticated structure
9D   OSC  OS Command           → SYNC        request state synchronization
9E   PM   Privacy Message      → SUBSCRIBE   subscribe to focus changes
9F   APC  Application Program  → RESERVED    future use

32 dead C1 codes → 32 graph-operation slots (31 assigned, 0x9F reserved). One-to-one. Zero conflicts.
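Because the range is contiguous, the table above reduces to an index lookup at offset 0x80. A sketch, with `graph_op` and `graph_byte` as illustrative helper names rather than anything the source defines:

```python
# Graph operations in table order; list index = byte - 0x80.
GRAPH_OPS = [
    # 0x80-0x8F
    "PARTICLE", "CYBERLINK", "NEURON", "STAKE", "UNSTAKE", "TRANSFER",
    "FOCUS_QUERY", "EDGE_QUERY", "PROOF", "VERIFY", "COMMIT", "NULLIFY",
    "REVEAL", "SEAL", "NAMESPACE", "COMPLETE",
    # 0x90-0x9F
    "EPOCH", "CHECKPOINT", "MIGRATE", "DELEGATE", "REVOKE", "CHALLENGE",
    "RESPOND", "FINALIZE", "SPONGE_INIT", "ABSORB", "SQUEEZE",
    "MERKLE_STEP", "MERKLE_ROOT", "SYNC", "SUBSCRIBE", "RESERVED",
]

def graph_op(byte: int) -> str:
    """Name of the graph operation assigned to a C1 byte."""
    if not 0x80 <= byte <= 0x9F:
        raise ValueError(f"0x{byte:02X} is outside the C1 range")
    return GRAPH_OPS[byte - 0x80]

def graph_byte(name: str) -> int:
    """Byte value assigned to a graph operation name."""
    return 0x80 + GRAPH_OPS.index(name)
```

So `graph_op(0x81)` is CYBERLINK and `graph_byte("TRANSFER")` is 0x85.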

nox Layer 2: Non-Deterministic Input (0x1D)

HEX  OLD NAME              NEW: nox INSTRUCTION
───  ────────              ─────────────────────
1D   GS   Group Separator  → hint    (16) prover injects witness

hint is the single Layer 2 instruction. The prover supplies a witness value; Layer 1 constraints verify it. This is the non-deterministic gate that makes zero-knowledge proofs possible — the prover demonstrates knowledge without revealing the witness.

GS (Group Separator) → hint: the separator between deterministic reduction and prover knowledge.
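The witness-then-verify split can be illustrated with a field inverse: the prover computes the inverse however it likes, and the check needs only one mul and one eq. This is a sketch under assumptions: the modulus 97 is an arbitrary small prime chosen for illustration (the source does not name the field), and the function names are hypothetical. It shows only the non-deterministic input, not the zero-knowledge machinery around it.

```python
P = 97  # illustrative prime modulus; the source does not name the field

def prover_hint(x: int) -> int:
    """Prover side: compute a witness off-circuit (here, the inverse of x mod P)."""
    return pow(x, -1, P)

def verify_hint(x: int, witness: int) -> bool:
    """Verifier side: one mul and one eq check the witness; no inversion needed."""
    return (x * witness) % P == 1
```

A correct witness passes (`verify_hint(5, prover_hint(5))` is true) and a wrong one fails (`verify_hint(5, 3)` is false), which is the sense in which Layer 1 constraints verify what hint injects.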

nox Layer 3: Jets

Jets (hash, poly_eval, merkle_verify, fri_fold, ntt) have no separate opcodes. They are runtime-recognized optimizations of Layer 1 pattern combinations — observationally equivalent, just faster. The verifier and prover agree on jet semantics; the encoding remains pure Layer 1.

This follows the Nock/Urbit model: jets accelerate without changing the formal spec.
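A toy sketch of what "observationally equivalent, just faster" means. The slow function stands in for a formula built from Layer 1 mul steps, and the fast one stands in for its jet. The registry keyed by function object is a simplification assumed here for illustration; how a real runtime recognizes a formula is not described in the source. The modulus 97 is again an arbitrary prime.

```python
P = 97  # illustrative prime modulus, not taken from the source

def pow_reference(base: int, exp: int) -> int:
    """Stand-in for a Layer 1 formula: exponentiation by repeated mul."""
    acc = 1
    for _ in range(exp):
        acc = (acc * base) % P
    return acc

def pow_jet(base: int, exp: int) -> int:
    """Native fast path that must return exactly what the reference returns."""
    return pow(base, exp, P)

# Hypothetical registry: the runtime swaps in a jet when it recognizes a formula.
JETS = {pow_reference: pow_jet}

def run(formula, *args):
    """Evaluate a formula, using its jet if one is registered."""
    return JETS.get(formula, formula)(*args)
```

The caller always names the reference formula; whether `run` takes the fast path is invisible in the result, which is why no extra opcode is needed.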

Spare Codes

FROM 0x00–0x1F (2 remaining dead):
  1E   RS   Record Separator   → spare
  1F   US   Unit Separator     → spare

FROM 0x7F:
  7F   DEL  Delete             → spare

Total spare: 3 codes for future expansion.

Summary

RANGE       COUNT   ORIGINAL PURPOSE       NEW PURPOSE
──────      ─────   ────────────────       ───────────
0x01–0x1C   16      Teletype control       nox Layer 1 — reduction patterns
0x1D        1       Group Separator        nox Layer 2 — hint (witness input)
0x80–0x9F   32      C1 terminal control    Graph semantic operations
0x1E–0x1F   2       Separators             Spare
0x7F        1       Delete                 Spare

nox Layer 3 jets share Layer 1 encodings (runtime optimization, no extra opcodes).

TOTAL RECLAIMED: 52 codes
TOTAL USED:      49 (16 patterns + 1 hint + 32 graph ops)
TOTAL SPARE:      3
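
The totals can be re-derived from the byte assignments in the tables. A short sketch; the variable names are illustrative:

```python
# Byte assignments copied from the tables above.
layer1 = [0x01, 0x02, 0x05, 0x06, 0x0E, 0x0F, 0x10, 0x11,
          0x12, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x1C]
hint = [0x1D]
graph = list(range(0x80, 0xA0))   # 0x80-0x9F inclusive
spare = [0x1E, 0x1F, 0x7F]

used = layer1 + hint + graph
reclaimed = used + spare

# C0 codes (0x00-0x1F) that the layout leaves alone.
untouched_c0 = sorted(set(range(0x20)) - set(reclaimed))

print(len(used), len(spare), len(reclaimed))
print([f"0x{b:02X}" for b in untouched_c0])
```

The counts come out as 49 used, 3 spare, and 52 reclaimed, with 13 C0 codes left untouched (NUL, the six contested codes, and BEL, VT, FF, DC3, SUB, ESC).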

The Poetry

SOH (Start of Header) becomes axis — navigating to the start of a structure. ACK (Acknowledge) becomes cons — acknowledging two values into a cell. SYN (Synchronize) becomes xor — the original synchronization primitive. CAN (Cancel) becomes not — canceling every bit. FS (File Separator) becomes hash — the separator of all content into identity. (In the first-pass mapping, BEL (Ring the Bell) became sub, because subtraction rings true; the clean mapping leaves BEL alone and gives sub to DLE.)

PAD (Padding) becomes PARTICLE — padding the graph with new knowledge. SOS (Start of String) becomes SPONGE_INIT — starting the absorb. ST (String Terminator) becomes MERKLE_ROOT — terminating the tree.

The teletype is dead. Long live the cybergraph.

Technical Note: UTF-8 Safety

This encoding is UTF-8 safe by construction:

  • Codes 0x01–0x1F: In UTF-8, these ARE valid single-byte characters (C0 controls). The subset nox reclaims excludes the controls that text still relies on (tab, newline, carriage return, and the other codes skipped above), so the reclaimed codes do not appear in well-formed text content. A binary stream using them as opcodes will not be mistaken for text.

  • Codes 0x80–0x9F: In UTF-8, any byte in this range MUST be a continuation byte (10xxxxxx pattern). A standalone 0x80 is invalid UTF-8. This means: if you see 0x80 at a character boundary, with no lead byte before it, it is definitionally not text. It is a nox graph operation. The only context needed is the position in the byte stream.

This is why the graph operations live in 0x80–0x9F: they are self-identifying. A parser that encounters 0x85 where a new character should start knows immediately, without framing or protocol negotiation, that this is not text. It is either invalid data or a graph instruction. nox claims it as a graph instruction.
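
Python's built-in UTF-8 decoder can be used to check both bullets. The helper name `is_valid_utf8` is illustrative. Note that the same byte value can occur as a continuation byte inside valid text (U+0145 encodes as C5 85), so the test is about a byte standing where a character should start, not about the byte value alone.

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if the whole byte string decodes as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# C0 codes are valid single-byte UTF-8, so they rely on not occurring in text.
c0_ok = is_valid_utf8(b"\x01\x02")

# A lone C1 byte has no lead byte before it and is rejected.
lone_c1_ok = is_valid_utf8(b"\x85")

# The same byte value is legal as a continuation byte: U+0145 is C5 85.
inside_text_ok = is_valid_utf8("\u0145".encode("utf-8"))
```

Here `c0_ok` and `inside_text_ok` are true and `lone_c1_ok` is false.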

What This Means

Every nox transaction, every cyberlink, every particle creation can be encoded as a sequence of bytes that:

  1. Are valid in the existing byte ecosystem (no new bit widths, no new hardware)
  2. Cannot be confused with text (self-identifying domain)
  3. Reuse humanity's 70-year investment in byte infrastructure
  4. Replace dead teletype ghosts with living graph operations
  5. Fit in a single byte per opcode (maximum density)

The byte was IBM's accident. nox turns it into the instruction set for planetary intelligence.
