Neural Optimizer

GNN encoder + Transformer decoder (~13M parameters) that compiles TIR to TASM. Operates at the TIR→TASM boundary — the only non-deterministic stage in the pipeline.

Architecture

TIR ops → TirGraph → GNN Encoder → node embeddings
                                       ↓
              TASM tokens ← Beam Search ← Transformer Decoder

Encoder: 4-layer GATv2 (Graph Attention Network v2), d=256, d_edge=32. Encodes a TirGraph (typed edges: DataDep, ControlFlow, MemOrder) into per-node embeddings + global context. ~3M params.

Decoder: 6-layer Transformer with self-attention + cross-attention to GNN node embeddings. Stack-aware: injects stack depth (max 65) and type window (8 slots) as additional features at each step. d=256, 8 heads, d_ff=1024, max_seq=256. ~10M params.

Vocabulary: 140 tokens (EOS + 139 TASM instructions). Covers the full Triton VM ISA: push constants, stack ops (dup/swap/pick/place 0-15), arithmetic, comparison, bitwise, control flow, I/O, memory, crypto, extension field. Token 0 = EOS.

Grammar mask: At each decoder step, a stack state machine restricts the vocabulary to syntactically valid next tokens. Prevents the model from emitting invalid TASM.
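
A minimal sketch of how such a mask can be applied to decoder logits, assuming each token has a known stack effect. `StackEffect`, `mask_logits`, and the `pops`/`pushes` fields are hypothetical names for illustration; the real state machine in grammar.rs and grammar_tables.rs also tracks types and may differ.

```rust
/// Hypothetical per-token stack effect (illustration only).
#[derive(Clone, Copy)]
pub struct StackEffect {
    pub pops: usize,
    pub pushes: usize,
}

/// Returns a copy of `logits` where every token that would underflow or
/// overflow the stack at the current `depth` is set to negative infinity,
/// so it can never be selected after softmax.
pub fn mask_logits(
    logits: &[f32],
    effects: &[StackEffect],
    depth: usize,
    max_depth: usize,
) -> Vec<f32> {
    logits
        .iter()
        .zip(effects)
        .map(|(&logit, eff)| {
            // Short-circuit keeps `depth - eff.pops` from underflowing.
            let valid = eff.pops <= depth && depth - eff.pops + eff.pushes <= max_depth;
            if valid { logit } else { f32::NEG_INFINITY }
        })
        .collect()
}
```

Masking before softmax, rather than rejecting after sampling, means every beam candidate stays valid by construction.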

Inference

Beam search with K=32, max 256 steps. Length normalization (alpha=0.7), repetition penalty (1.5x over 16-token window).

1. Build TirGraph from TIR ops
2. Encode graph → node features [N, 59], typed edges [E]
3. GNN forward → node embeddings [N, 256]
4. Beam search (K=32): autoregressive decoding with grammar mask
5. Validate candidates: stack verify + cost scoring
6. Return the cheapest valid candidate, or fall back to compiler output
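
The scoring knobs above can be sketched as follows. This assumes the common `sum_logprob / len^alpha` form of length normalization and a CTRL-style repetition penalty; the exact formulas in beam.rs are not stated here, and both function names are hypothetical.

```rust
/// Length-normalized beam score: sum of log-probabilities divided by
/// len^alpha (alpha = 0.7 in this README). Assumed form, not confirmed.
pub fn length_normalized(sum_logprob: f32, len: usize, alpha: f32) -> f32 {
    sum_logprob / (len.max(1) as f32).powf(alpha)
}

/// Penalizes tokens that appear in the last `window` entries of `history`
/// (window = 16, penalty = 1.5 in this README). Positive logits are divided
/// by the penalty and non-positive logits are multiplied by it, so a
/// repeated token always becomes less likely.
pub fn apply_repetition_penalty(
    logits: &mut [f32],
    history: &[usize],
    window: usize,
    penalty: f32,
) {
    let start = history.len().saturating_sub(window);
    let mut seen = vec![false; logits.len()];
    for &tok in &history[start..] {
        // Apply the penalty at most once per distinct token.
        if tok < logits.len() && !seen[tok] {
            seen[tok] = true;
            let l = logits[tok];
            logits[tok] = if l > 0.0 { l / penalty } else { l * penalty };
        }
    }
}
```

With alpha below 1, longer sequences are only partly compensated for their extra log-probability terms, which keeps beam search from always preferring the shortest output.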

Validation uses two layers:

  • Stack verifier (src/cost/stack_verifier.rs): executes straight-line TASM on concrete Goldilocks values and checks that the stack transformation matches the compiler output. Fast (~25 instructions modeled), used for training feedback.
  • Table profiler (src/cost/scorer.rs): counts actual table row increments across 6 Triton VM AETs. Cost = padded height (max table height rounded up to the next power of 2). This is the cliff function: cost stays flat until the tallest table crosses a power of 2, then doubles.
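
As a rough illustration of the padded-height cost, the sketch below takes the per-table row counts and rounds the maximum up to a power of two. `padded_height` is a hypothetical name; the real scorer.rs also has to derive the row counts themselves.

```rust
/// Padded height: the tallest table's row count rounded up to the next
/// power of two. An empty or all-zero input yields 1.
pub fn padded_height(table_heights: &[u64]) -> u64 {
    table_heights
        .iter()
        .copied()
        .max()
        .unwrap_or(0)
        .next_power_of_two()
}
```

For example, a tallest table of 1024 rows costs 1024, while 1025 rows costs 2048. Savings that do not move the tallest table below a power-of-two boundary leave the cost unchanged.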

Training

Three stages, run via trident train:

Stage 1: Supervised pre-training

Teacher forcing with cross-entropy loss on (TirGraph, TASM) pairs. Training corpus: all .tri files from vm/, std/, os/, compiled to TIR and split into per-function blocks.

  • Optimizer: AdamW (lr=3e-4, weight_decay=0.01)
  • Cosine LR decay to 1e-5
  • Gradient clipping at norm 1.0
  • Early stopping: patience 3 epochs
  • Checkpoint: model/general/v2/stage1_best.mpk

trident train --epochs 10           # default: 10 epochs
trident train --stage 1 --epochs 50 # explicit stage 1
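
The learning-rate schedule listed above can be sketched with the standard cosine annealing formula. Whether supervised.rs steps the schedule per batch or per epoch, and whether it uses warmup, is not stated here, so `cosine_lr` and its parameters are illustrative assumptions.

```rust
use std::f64::consts::PI;

/// Cosine decay from `lr_max` at step 0 to `lr_min` at `total_steps`.
/// Steps past `total_steps` stay at `lr_min`.
pub fn cosine_lr(step: usize, total_steps: usize, lr_max: f64, lr_min: f64) -> f64 {
    let t = step.min(total_steps) as f64 / total_steps.max(1) as f64;
    lr_min + 0.5 * (lr_max - lr_min) * (1.0 + (PI * t).cos())
}
```

With the values above, `cosine_lr(0, n, 3e-4, 1e-5)` starts at 3e-4 and `cosine_lr(n, n, 3e-4, 1e-5)` ends at 1e-5.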

Stage 2: GFlowNet fine-tuning

Trajectory Balance loss. The model samples TASM sequences and receives reward from actual cost improvement over compiler baseline.

  • Temperature annealing: tau 2.0 → 0.5 over 10K steps
  • Partial credit shaping for first 1K steps (then pure reward)
  • Checkpoint: model/general/v2/stage2_latest.mpk

trident train --stage 2 --epochs 20
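
A sketch of the two Stage 2 ingredients, under stated assumptions. The annealing is assumed to be linear (the actual schedule shape is not given here). The loss is the standard Trajectory Balance objective, simplified for append-only sequence generation where every partial sequence has a single parent, so the backward-policy term is zero. Function names are hypothetical.

```rust
/// Sampling temperature: assumed linear from 2.0 to 0.5 over 10K steps,
/// then held at 0.5.
pub fn temperature(step: u64) -> f64 {
    const TAU_START: f64 = 2.0;
    const TAU_END: f64 = 0.5;
    const ANNEAL_STEPS: u64 = 10_000;
    let t = step.min(ANNEAL_STEPS) as f64 / ANNEAL_STEPS as f64;
    TAU_START + (TAU_END - TAU_START) * t
}

/// Trajectory Balance loss for one sampled sequence:
/// (log Z + sum_t log P_F(token_t | prefix) - log R)^2.
/// `log_z` is the learned log-partition estimate and `log_reward` is the
/// log of the reward for the finished sequence.
pub fn trajectory_balance_loss(log_z: f64, log_pf: &[f64], log_reward: f64) -> f64 {
    let sum_log_pf: f64 = log_pf.iter().sum();
    (log_z + sum_log_pf - log_reward).powi(2)
}
```

The loss is zero exactly when the sequence's sampling probability is proportional to its reward, which is what pushes the model toward cheaper TASM rather than toward a single mode.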

Stage 3: Online learning

Micro-finetunes on new build results via replay buffer. Regression guard prevents deploying checkpoints worse than production.

  • Triggers after 50 new results or 24h
  • 200 GFlowNet gradient steps per micro-finetune
  • 10% historical samples mixed in (prevents forgetting)
  • Max validity regression allowed: 2 percentage points (2pp)

trident train --stage 3
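
The trigger and the regression guard can be sketched as two small predicates. This covers only the thresholds listed above; `should_finetune` and `passes_regression_guard` are hypothetical names, and the real online.rs may check further metrics.

```rust
use std::time::Duration;

/// Micro-finetune trigger: 50 new build results or 24 hours since the
/// last run, whichever comes first.
pub fn should_finetune(new_results: usize, since_last: Duration) -> bool {
    new_results >= 50 || since_last >= Duration::from_secs(24 * 60 * 60)
}

/// Regression guard: validity rates are fractions in [0, 1]. The candidate
/// may be at most 2 percentage points below production.
pub fn passes_regression_guard(prod_validity: f64, candidate_validity: f64) -> bool {
    prod_validity - candidate_validity <= 0.02
}
```

A candidate that improves validity always passes, since the difference is then negative.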

Reset

trident train reset    # deletes model weights + cached .neural.tasm

Data Flow

TirGraph (src/neural/data/tir_graph.rs): Converts Vec<TIROp> into a graph with 54 op kinds (4 tiers), 3 edge types, 59-dimensional node features (one-hot opcode + field type + structural flags).

Training pairs (src/neural/data/pairs.rs): Extracted by compiling each .tri file, splitting TIR into per-function blocks, lowering each to TASM. Each pair = (TIR ops, TASM lines).

Replay buffer (src/neural/data/replay.rs): Priority-based buffer at model/general/v2/replay.rkyv. Stores build results for online learning.

File Map

src/neural/
  mod.rs                Public API: compile(), load_model(), compile_with_model()
  checkpoint.rs         Save/load via burn NamedMpk format
  model/
    composite.rs        NeuralCompilerV2 = encoder + decoder (~13M params)
    encoder.rs          GATv2 GNN encoder (4 layers, d=256)
    decoder.rs          Stack-aware Transformer decoder (6 layers, 8 heads)
    vocab.rs            140-token TASM vocabulary
    grammar.rs          Stack state machine for grammar masking
    grammar_tables.rs   Precomputed grammar transition tables
    gnn_ops.rs          Scatter/gather ops for GNN message passing
  inference/
    beam.rs             Beam search (K=32, max_steps=256)
    execute.rs          Candidate validation and ranking
  training/
    supervised.rs       Stage 1: cross-entropy with teacher forcing
    gflownet.rs         Stage 2: Trajectory Balance fine-tuning
    online.rs           Stage 3: replay buffer micro-finetune
    augment.rs          Data augmentation
  data/
    pairs.rs            Training pair extraction from .tri corpus
    replay.rs           Priority replay buffer (rkyv serialized)
    tir_graph.rs        TIR → typed graph conversion (54 ops, 3 edge types)

src/cost/
  scorer.rs             Table profiler (6 AETs, cliff-aware cost)
  stack_verifier.rs     Concrete-value TASM execution for fast verification

Speculative Compilation

The neural path is strictly speculative. Classical lowering always runs. Neural output is accepted only when:

  1. Stack verifier confirms equivalent stack transformation
  2. Table cost is strictly less than compiler output
  3. Neural output is not identical to compiler output (no memorization)

This is enforced in src/cli/bench.rs:compile_neural_tasm_inline() and src/neural/mod.rs:compile().
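
The three acceptance conditions amount to a single predicate. The sketch below is illustrative only: `accept_neural` and its parameters are hypothetical and do not mirror the signatures in bench.rs or mod.rs.

```rust
/// Accept the neural candidate only if it is verified, strictly cheaper,
/// and not a copy of the compiler's output. Otherwise the caller keeps
/// the classical lowering.
pub fn accept_neural(
    stack_verified: bool,
    neural_cost: u64,
    compiler_cost: u64,
    neural_tasm: &[&str],
    compiler_tasm: &[&str],
) -> bool {
    stack_verified && neural_cost < compiler_cost && neural_tasm != compiler_tasm
}
```

Because the classical output is always available, a rejected candidate costs only the time spent on inference.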
