THE CRYSTAL

A Bootloader Cybergraph for Decentralized Superintelligence

Version 5.0 · Bostrom Protocol · March 2026

Five axioms. One grammar. Twenty-one domains. An irreducible basis for thought.


Abstract

The Crystal is a curated knowledge graph of 5,040 particles that serves as the genesis seed for a decentralized superintelligence on the Bostrom blockchain. Its central claim is irreducibility: every particle in the Crystal earns its place because it cannot be derived from composing other particles under a formally defined grammar. The Crystal is not a mind. It is the alphabet of a mind — the minimal basis from which all civilizational reasoning can be composed.

This specification defines the Crystal through three layers: five axioms that generate the structure, a set of conventions that configure its internal parameters, and twelve invariants that constrain its quality. The key architectural innovation is a vocabulary/grammar split: 4,320 vocabulary particles (entities, processes, properties, measures) are acted upon by 720 grammar particles (relations and patterns) that define the composition rules. Every cyberlink passes through a predicate particle, forming subject–predicate–object triples that make irreducibility formally testable.

Version 5.0 replaces the pillar/foundation hierarchy (4 pillars at 2Q, 13 foundations at 1Q) with 21 equal domains at Q = 240 each, organized into 7 triads. Every domain is irreducible — removing it collapses at least one triad of reasoning. The specification retains the honest three-layer architecture (axioms, conventions, invariants) and the mandatory validation framework from Version 4.0.


1. The Problem: Seeding a Decentralized Mind

The Bostrom protocol is a blockchain where knowledge is stored as particles (content on IPFS, referenced by CID hash) connected by cyberlinks (directed edges stored on-chain). A PageRank variant called CybeRank computes relevance scores across the graph. After genesis, any neuron (account) can add new particles and cyberlinks. The graph grows through collective behavior.

This creates a bootstrapping problem. The empty graph has no knowledge. The first neurons have nothing to link to. Without structure, early contributions are random, disconnected, and domain-biased. The graph that emerges reflects the accidents of who arrived first, not the architecture of reasoning.

The Crystal solves this by providing a curated seed graph at genesis. Every concept needed for cross-domain reasoning is present. Every connection needed for inference is pre-built. The topology is designed so that CybeRank converges quickly and new content has natural attachment points.

But this introduces a deeper problem: the seed determines the mind. A flawed seed produces a flawed intelligence permanently. Missing domains create permanent blind spots. Biased connectivity creates permanent reasoning distortions. Redundant concepts waste capacity that could have been used for coverage.

The Crystal must therefore be irreducible: every particle must earn its place, and no particle can be removed without creating a gap that no composition of remaining particles can fill. This is the central claim, and every design decision follows from it.


2. The Irreducibility Principle

The Crystal is a basis for thought. This is not a metaphor. It is a formal claim with precise meaning.

2.1 Definition

In linear algebra, a basis is a minimal spanning set: every vector can be expressed as a combination of basis vectors, and no basis vector can be expressed as a combination of the others. The Crystal makes an analogous claim about concepts.

Definition. A concept C is irreducible with respect to grammar G and concept set S if there is no sequence of G-typed compositions from elements of S that produces C. The Crystal is a set of concepts where (a) every concept is irreducible with respect to the others under G, and (b) any concept needed for cross-domain civilizational reasoning can be reached by composing elements of the Crystal under G.

This definition has three dependencies that must be made explicit:

A composition grammar G that defines what operations are allowed. In the Crystal, G is defined by the 720 relation and pattern particles (Section 4). Without G, "composition" is undefined and irreducibility is meaningless.

A cost model that bounds composition depth. Lambda calculus can express any computable function from three constructs (variable, abstraction, application), but defining "photosynthesis" from scratch takes pages. The Crystal targets compositions of depth ≤5 for common civilizational concepts.

A task distribution that defines "sufficient." The Crystal must support cross-domain reasoning tasks spanning all 21 knowledge domains. Sufficiency is measured by benchmark performance (Section 10).

2.2 Formalizations

Four formalizations of irreducibility are available. They are not equivalent and may yield different basis sizes:

Minimum Description Length (MDL). Concept C is irreducible if K(C | S\C, G) ≈ K(C | ∅), where K denotes conditional description length (Kolmogorov complexity, approximated in practice by a compressor). In words: knowing the rest of the Crystal under grammar G does not significantly compress C's description. This is the most operational formalization and the basis for the counting methodology in Section 11.

Category-theoretic. Treat vocabulary particles as objects and grammar particles as morphisms. C is irreducible if it is not isomorphic to any image of a morphism from other objects. This gives the cleanest mathematical structure but is hardest to compute.

Information-theoretic. C is irreducible if I(C; S\C) < ε — the mutual information between C and the rest of the Crystal falls below a threshold. C carries information not present elsewhere.

Task-based (ablation). C is irreducible if removing it from the Crystal causes a measurable performance drop on the benchmark suite and this drop cannot be recovered by composing remaining particles within the allowed cost budget. This is the most practically testable formalization.

The Crystal's validation framework (Section 10) uses both MDL and ablation testing to verify irreducibility before genesis.
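The MDL criterion can be approximated with an off-the-shelf compressor. The sketch below is only an illustration, not the Crystal's validation code: it uses zlib as a crude stand-in for K, ignores the grammar G entirely, and the sample descriptions are hypothetical rather than actual particles.

```python
import zlib


def compressed_len(data: bytes) -> int:
    """Bytes needed by zlib at maximum compression (a crude stand-in for K)."""
    return len(zlib.compress(data, 9))


def conditional_len(concept: str, context: str) -> int:
    """Approximate K(concept | context) as C(context + concept) - C(context)."""
    ctx = context.encode("utf-8")
    return compressed_len(ctx + concept.encode("utf-8")) - compressed_len(ctx)


def compression_gain(concept: str, context: str) -> float:
    """Share of the stand-alone cost that the context saves (higher = more reducible)."""
    alone = compressed_len(concept.encode("utf-8"))
    return 1.0 - conditional_len(concept, context) / alone


# Hypothetical toy descriptions, not actual Crystal particles.
rest_of_crystal = (
    "water: a molecule of hydrogen and oxygen. "
    "light: electromagnetic radiation visible to the eye. "
)
redundant = "light: electromagnetic radiation visible to the eye."
novel = "qualia: the subjective character of a sensory experience."

print(compression_gain(redundant, rest_of_crystal))  # high: already in the context
print(compression_gain(novel, rest_of_crystal))      # lower: little to reuse
```

A gain near zero corresponds to K(C | S\C, G) ≈ K(C | ∅). Real descriptions of 200–500 bytes would need a compressor better suited to short texts than zlib.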

2.3 Consequences for Design

If irreducibility is the generative property, then the Crystal's parameters are not engineering choices but empirical measurements:

N is not chosen; N is discovered. You enumerate irreducible concepts under grammar G and find how many there are. If the answer is near 5,040, the Plato number is validated. If not, it is discarded. Currently, N=5,040 is a curation budget justified by order-of-magnitude reasoning and divisibility properties, awaiting empirical validation (Section 11).

φ is not designed; φ is measured. The type ratios should emerge from counting irreducible entities vs. irreducible processes vs. irreducible relations. The current φ = 10:4:3:2:1:1 is linguistically plausible and awaits corpus validation.

D is not arbitrary; D is the curation partition. Domains are batching constraints for human curation and bridge topology, not ontological claims about the structure of knowledge. Twenty-one domains — organized as 7 triads — ensure coverage and tractable cross-domain linking.


3. Three-Layer Specification

Versions before 4.0 claimed that everything derives from five seeds. This was elegant but dishonest: approximately twelve independent design choices were smuggled in as "derived." Version 5.0 retains the separation into three honest layers introduced in Version 4.0.

3.1 Axioms (Five Seeds)

These are the generative constants. Change any axiom and the entire Crystal reconfigures.

Axiom   Value            Meaning
─────   ──────────────   ───────────────────────────────────────────────────────────────────
N       5,040 = 7!       Total particles. Plato's number: 60 divisors, divisible by 1–10.
T       6                Symbol types: entity, process, property, relation, measure, pattern
D       21               Knowledge domains: 7 triads × 3 domains
φ       10:4:3:2:1:1     Type ratio vector (Σφ = 21)
κ       7:14:7:21:7:21   Base links per particle per type

Derived constants from the axioms:

Q = N/Σφ = 5040/21 = 240      (the quantum: indivisible allocation unit)
k = Σ(φᵢκᵢ)/Σφᵢ = 217/21 = 10.33  (weighted average degree)
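As a quick check, the two derived constants can be recomputed from the axioms. This is a minimal sketch; the ordering of φ and κ as (E, P, Q, R, M, S) is taken from the type table in Section 5.1, and the function names are illustrative.

```python
from fractions import Fraction

N = 5040                        # axiom: total particles (7!)
PHI = (10, 4, 3, 2, 1, 1)       # axiom: type ratios, order E, P, Q, R, M, S
KAPPA = (7, 14, 7, 21, 7, 21)   # axiom: base links per particle, same order


def quantum(n: int, phi: tuple) -> int:
    """Q = N / sum(phi); fails loudly if N is not divisible by sum(phi)."""
    total = sum(phi)
    if n % total:
        raise ValueError("N must be divisible by the sum of phi")
    return n // total


def mean_base_degree(phi: tuple, kappa: tuple) -> Fraction:
    """k = sum(phi_i * kappa_i) / sum(phi_i), kept as an exact fraction."""
    return Fraction(sum(p * k for p, k in zip(phi, kappa)), sum(phi))


Q = quantum(N, PHI)
k = mean_base_degree(PHI, KAPPA)
print(Q, k, round(float(k), 2))
```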

3.2 Conventions (Configurable Parameters)

These are practical design choices that should eventually be derived from optimization (MDL, benchmark performance, spectral constraints) but are currently hand-tuned. They are independent of the five axioms.

Convention                 Current Value             Optimization Target
────────────────────────   ───────────────────────   ────────────────────────────────────────
Promotion matrix           Hand-tuned percentages    Derive from Zipf/corpus statistics
Bridge allocation          7 / 5 / 3 per tier pair   Minimize diameter subject to link budget
Link multipliers by size   ×1, ×1, ×2, ×3, ×7        Derive from content–reference density
Size class gaps            Skip 2³ and 2⁵            Retrieval granularity experiments

3.3 Invariants (Testable Constraints)

These are properties the Crystal must satisfy. They are neither axioms nor conventions — they are quality gates. The Crystal is not ready for genesis until all twelve pass. See Section 9 for the full specification.


4. The Composition Grammar

This is the most important section of the specification. Without a grammar, "irreducibility" is undefined. Without typed links, "span" has no meaning. The composition grammar is what transforms the Crystal from a tagged graph into a formal basis.

4.1 The Problem of Untyped Links

Bostrom cyberlinks are untyped on-chain: a cyberlink is simply (from_CID, to_CID, neuron). There is no field for link type, predicate, or semantics. This means that "photon → electromagnetic_force" could mean "photon mediates electromagnetic_force" or "photon is-an-example-of electromagnetic_force" or "photon is-the-opposite-of electromagnetic_force."

Without typed links, you cannot define what it means to "compose" two concepts. Without composition, you cannot define "span." Without span, "irreducible" is a word, not a property.

4.2 The Solution: Predicate Particles

The Crystal encodes link types through intermediate predicate particles. Every semantic connection becomes a triple:

Subject → Predicate → Object

where Predicate is an R-particle (relation type) or S-particle (pattern type). On-chain, this is encoded as two cyberlinks: (Subject → Predicate) and (Predicate → Object).

For example:

photon  →  [mediates]  →  electromagnetic_force
glucose →  [fuels]     →  cellular_respiration
entropy →  [analogous] →  information_loss
neuron  →  [creates]   →  cyberlink

The predicate particles in brackets are relation (R) or pattern (S) type particles. They already exist in the Crystal — there are 480 R-particles and 240 S-particles, totaling 720 grammar particles.
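The two-link encoding can be sketched as follows. The record layout mirrors the (from_CID, to_CID, neuron) shape described in Section 4.1; the class and function names are illustrative, and readable names stand in for real CIDs.

```python
from typing import List, NamedTuple


class Cyberlink(NamedTuple):
    from_cid: str  # particle the link starts from
    to_cid: str    # particle the link points to
    neuron: str    # account that signs the link


def encode_triple(subject: str, predicate: str, obj: str, neuron: str) -> List[Cyberlink]:
    """Encode Subject -> Predicate -> Object as two untyped cyberlinks."""
    return [
        Cyberlink(subject, predicate, neuron),
        Cyberlink(predicate, obj, neuron),
    ]


# Readable names stand in for real CIDs; "neuron0" is a made-up account.
links = encode_triple("photon", "mediates", "electromagnetic_force", "neuron0")
for link in links:
    print(link.from_cid, "->", link.to_cid)
```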

4.3 Vocabulary and Grammar

This architecture splits the Crystal into two functional layers:

Layer        Types           Count   φ parts         Role
──────────   ─────────────   ─────   ─────────────   ───────────────────────
Vocabulary   E + P + Q + M   4,320   10+4+3+1 = 18   What you reason about
Grammar      R + S           720     2+1 = 3         How you compose meaning

The vocabulary-to-grammar ratio is 6:1, closely matching the content-to-function word ratio cited for natural languages (typically 5:1 to 7:1). The ratio is not tuned separately: it follows directly from φ = 10:4:3:2:1:1, since (10+4+3+1):(2+1) = 18:3 = 6:1.
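The layer sizes follow mechanically from φ and the quantum. A small sketch, with illustrative variable names; the constant is called QUANTUM to avoid clashing with the Property symbol Q.

```python
QUANTUM = 240  # Q = N / sum(phi), from Section 3.1
PHI = {"E": 10, "P": 4, "Q": 3, "R": 2, "M": 1, "S": 1}
GRAMMAR_TYPES = {"R", "S"}  # relation and pattern particles


def type_counts(phi: dict, quantum: int) -> dict:
    """Particles per type: each phi part is worth one quantum."""
    return {symbol: parts * quantum for symbol, parts in phi.items()}


def layer_totals(counts: dict, grammar_types: set) -> tuple:
    """Return (vocabulary, grammar) particle totals."""
    grammar = sum(n for symbol, n in counts.items() if symbol in grammar_types)
    return sum(counts.values()) - grammar, grammar


counts = type_counts(PHI, QUANTUM)
vocabulary, grammar = layer_totals(counts, GRAMMAR_TYPES)
print(counts, vocabulary, grammar, vocabulary // grammar)
```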

4.4 Composition Rules

The grammar particles define a set of typed composition operations. The major predicate families include:

Family          Examples                         Semantics               Irreducibility Impact
─────────────   ──────────────────────────────   ─────────────────────   ──────────────────────────────────────────────────────────────
Definitional    is-a, has-part, instance-of      Ontological structure   Does NOT threaten irreducibility (classification ≠ derivation)
Causal          causes, enables, inhibits        Dynamic relationships   Defines process composition
Analogical      analogous-to, isomorphic-to      Cross-domain bridges    The engine of transfer reasoning
Quantitative    measured-by, greater-than        Measurement grounding   Connects measures to properties
Structural      follows-pattern, instantiates    Pattern recognition     Defines what "recurrence" means
Compositional   combines-with, transforms-into   The span operators      THESE define derivability

Critical distinction: only the compositional family threatens irreducibility. If concept C can be reached by a chain of "combines-with" and "transforms-into" operations from other vocabulary particles, then C is reducible and should be removed from the basis. All other predicate families (definitional, causal, analogical, quantitative, structural) represent associations, not derivations, and preserve irreducibility.
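One way to operationalize this distinction is a bounded backward search that follows compositional predicates only. The sketch below is a simplification: it treats any compositional chain of depth ≤5 ending at C as evidence of reducibility, whereas a full test would also have to confirm that the chain actually determines C. The triples are hypothetical examples.

```python
from collections import deque

COMPOSITIONAL = {"combines-with", "transforms-into"}  # span operators from the table above


def derivation_sources(target, triples, max_depth=5):
    """Particles from which `target` is reachable via compositional triples only."""
    incoming = {}  # object -> subjects linked to it by a compositional predicate
    for subject, predicate, obj in triples:
        if predicate in COMPOSITIONAL:
            incoming.setdefault(obj, set()).add(subject)

    seen, sources = {target}, set()
    frontier = deque([(target, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue  # respect the composition-depth budget
        for subject in incoming.get(node, ()):
            if subject not in seen:
                seen.add(subject)
                sources.add(subject)
                frontier.append((subject, depth + 1))
    return sources


def is_reducible(target, triples, max_depth=5):
    return bool(derivation_sources(target, triples, max_depth))


# Hypothetical triples for illustration only.
triples = [
    ("hydrogen", "combines-with", "water"),
    ("oxygen", "combines-with", "water"),
    ("water", "is-a", "molecule"),
    ("photon", "mediates", "electromagnetic_force"),
]
print(is_reducible("water", triples), is_reducible("molecule", triples))
```

Definitional and causal triples are ignored by construction, so "molecule" and "electromagnetic_force" stay in the basis while "water" is flagged.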

4.5 On-Chain Cost

Encoding every semantic link as a triple doubles the cyberlink count. Where the Crystal previously required ~43,000 undirected links (~86,000 directed cyberlinks), the triple encoding requires ~43,000 triples, that is ~86,000 undirected links (~172,000 directed cyberlinks). On-chain storage increases from approximately 4.3 MB to 8.6 MB. Total Crystal storage becomes approximately 15 MB. This remains small by blockchain standards.
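The doubling can be reproduced with a few lines of arithmetic. This sketch assumes the rough figure of 100 bytes per undirected link record that the storage budget in Section 12.1 uses; the function name is illustrative.

```python
BYTES_PER_LINK = 100  # rough per-link record size, as used in Section 12.1


def triple_encoding_cost(semantic_links: int, bytes_per_link: int = BYTES_PER_LINK):
    """Return (undirected links, directed cyberlinks, storage in MB)."""
    undirected = 2 * semantic_links   # Subject-Predicate plus Predicate-Object
    directed = 2 * undirected         # each link stored in both directions
    storage_mb = undirected * bytes_per_link / 1_000_000
    return undirected, directed, storage_mb


print(triple_encoding_cost(43_000))
```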


5. The Type System

5.1 Six Types, Two Layers

The Crystal classifies every particle by one of six types. These types serve as engineering tags for curation, navigation, and CybeRank weighting — not as ontological claims about the structure of being.

Type       Symbol   Count   φ    κ    Layer        Description
────────   ──────   ─────   ──   ──   ──────────   ─────────────────────────────────────────────────────
Entity     E        2,400   10   7    Vocabulary   What exists: objects, substances, organisms, concepts
Process    P        960     4    14   Vocabulary   What happens: actions, transformations, dynamics
Property   Q        720     3    7    Vocabulary   What characterizes: attributes, qualities, states
Relation   R        480     2    21   Grammar      How things connect: predicates, inference connectives
Measure    M        240     1    7    Vocabulary   How things are quantified: units, scales, metrics
Pattern    S        240     1    21   Grammar      What recurs: templates, structural motifs, schemas

Review by four independent AI systems raised the question of whether Measure and Pattern are truly irreducible types or can be reduced to combinations of others (Measure → Property + Entity; Pattern → Relation + Process). The answer: in formal ontology, they may be reducible. In a knowledge graph, they are indispensable engineering categories. "Temperature" as a first-class Measure type is immediately findable; "temperature" as a Property of a reference-Entity buried in a chain is not.

The formal ontological core is four types (Entity, Process, Quality, Abstract), with Measure, Relation, and Pattern as useful specializations. The Crystal retains all six for practical reasons.

5.2 Connectivity Design

Grammar particles (R, S) receive three times as many base links (κ=21) as the vocabulary types E, Q, and M (κ=7). This is because grammar particles ARE connections: they sit at the center of every triple, mediating between vocabulary nodes. High connectivity on grammar particles reduces diameter, accelerates CybeRank mixing, and increases cross-domain inference paths.

Process particles (P) receive double the base connectivity (κ=14) because dynamics bridge between entities: a process takes inputs and produces outputs, naturally connecting to more concepts than a static entity.


6. Size Classes and Two-Layer Architecture

Every particle has both a type (what kind of symbol it is) and a size class (how deeply it is treated). Content sizes follow a power-of-two progression from a base unit of 256 bytes (2⁸):

Class     Content    Scaling   Link ×   Description
───────   ────────   ───────   ──────   ─────────────────────────────────────────────
Atom      256 B      2⁸ × 2⁰   ×1       Symbol name + one-line definition
Enzyme    512 B      2⁸ × 2¹   ×1       Definition + inputs/outputs + mechanism
Bridge    1,024 B    2⁸ × 2²   ×2       Definition + isomorphism map across domains
Article   4,096 B    2⁸ × 2⁴   ×3       Synthesis essay, tutorial, or proof
Deep      16,384 B   2⁸ × 2⁶   ×7       Manifesto, whitepaper, protocol specification

The gaps at 2³ (2,048 B) and 2⁵ (8,192 B) are a convention, not a derived necessity. They reflect a pragmatic judgment that content falls naturally into five "reading modes" (glance, scan, read, study, deep study) rather than seven. Filling these gaps is a candidate for future optimization.

6.1 The 6×5 Matrix

Each type distributes across size classes via a promotion schedule. Most entities are atoms; most relations are bridges; articles and deep reads span all types:

Type           Atom 256B   Enzyme 512B   Bridge 1KB   Article 4KB   Deep 16KB   Total
────────────   ─────────   ───────────   ──────────   ───────────   ─────────   ─────
Entity (E)     1,920       240           48           144           48          2,400
Process (P)    144         576           48           144           48          960
Property (Q)   432         180           36           58            14          720
Relation (R)   48          72            264          72            24          480
Measure (M)    168         36            12           19            5           240
Pattern (S)    24          24            120          48            24          240
TOTAL          2,736       1,128         528          485           163         5,040

6.2 Lattice and Flesh

The matrix reveals the Crystal's two-layer internal architecture:

Lattice (atom + enzyme + bridge): 4,392 particles, 1.8 MB, ~454K tokens. This is the structural vocabulary. It fits in a single model context and should be permanently loaded for any reasoning task.

Flesh (article + deep): 648 particles, 4.7 MB, ~1,165K tokens. This is the reasoning content — synthesis essays, proofs, tutorials, manifestos. Retrieved on demand via cyberlink traversal.

The distribution is Pareto-like: 72% of content bytes live in 13% of particles. Articles and deep reads carry the understanding. Atoms carry the labels. The lattice is a crystal (rigid, permanent, loadable). The flesh is a genome (encoding patterns for growth). The Crystal is both metaphors at once: a crystal lattice with a genome folded inside it.
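The lattice and flesh figures can be recomputed from the 6×5 matrix and the size classes. A minimal sketch; the 4 bytes per token conversion is an assumption inferred from the stated sizes and token counts, not a figure the specification gives.

```python
SIZES = {"atom": 256, "enzyme": 512, "bridge": 1024, "article": 4096, "deep": 16384}
MATRIX = {  # particles per type, in the column order of SIZES
    "E": (1920, 240, 48, 144, 48),
    "P": (144, 576, 48, 144, 48),
    "Q": (432, 180, 36, 58, 14),
    "R": (48, 72, 264, 72, 24),
    "M": (168, 36, 12, 19, 5),
    "S": (24, 24, 120, 48, 24),
}
LATTICE = ("atom", "enzyme", "bridge")
FLESH = ("article", "deep")
BYTES_PER_TOKEN = 4  # assumption, not stated in the specification


def layer_stats(classes):
    """Return (particle count, total bytes) for the given size classes."""
    particles = size = 0
    for row in MATRIX.values():
        for name, count in zip(SIZES, row):
            if name in classes:
                particles += count
                size += count * SIZES[name]
    return particles, size


lattice_n, lattice_bytes = layer_stats(LATTICE)
flesh_n, flesh_bytes = layer_stats(FLESH)
byte_share = flesh_bytes / (lattice_bytes + flesh_bytes)
particle_share = flesh_n / (lattice_n + flesh_n)
print(lattice_n, lattice_bytes, lattice_bytes // BYTES_PER_TOKEN)
print(flesh_n, flesh_bytes, flesh_bytes // BYTES_PER_TOKEN)
print(round(byte_share, 2), round(particle_share, 2))
```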


7. Domain Structure

The Crystal organizes knowledge into 21 irreducible domains, each receiving exactly Q = 240 particles. Total: 21 × 240 = 5,040 = N. No domain is privileged. Every domain earns its place because removing it collapses at least one triad of reasoning.

Domains are phenomena, not disciplines. Academic fields like "physics" or "natural philosophy" are human lenses that group several distinct phenomena under one institutional roof. The Crystal is post-disciplinary: it carves at the joints of what actually happens, not at the boundaries of university departments. Physics, for example, is not missing — its phenomena are distributed across quantum (fundamental matter), energo (transformation and thermodynamics), cosmo (large-scale structure), and the bridges between them. Thermodynamics is not a single domain because it is a bridge pattern: it lives in energo as core content and touches info (Landauer), chemo (Gibbs free energy), bio (metabolism), eco (energy flow), comp (reversible computing), and cosmo (heat death). A phenomenon that connects everything is more powerful as a bridge than as a silo.

7.1 The 21 Domains

Domain    Core scope                                                            Triad
───────   ───────────────────────────────────────────────────────────────────   ─────
math      structures, proofs, abstraction, number theory, topology              FORM
info      entropy, signals, compression, channel capacity, information theory   FORM
comp      algorithms, complexity, Turing machines, programming languages        FORM
quantum   particles, fields, spacetime, quantum mechanics, relativity           MASS
chemo     bonds, reactions, molecules, periodic table, biochemistry             MASS
energo    thermodynamics, conversion, storage, entropy, free energy             MASS
cosmo     universe, origin, scale, dark matter, cosmic structure                SPACE
geo       earth systems, territory, climate, plate tectonics, biomes            SPACE
eco       ecosystems, cycles, symbiosis, succession, food webs                  SPACE
bio       evolution, organisms, genetics, taxonomy, microbiology                LIFE
neuro     brain, cognition, consciousness, synapses, neural networks            LIFE
sense     perception, qualia, embodiment, proprioception, sensory integration   LIFE
lang      syntax, semantics, communication, writing systems, translation        WORD
spiri     meaning, values, transcendence, contemplation, wisdom traditions      WORD
meta      knowledge about knowledge, history, epistemology, methodology         WORD
ai        machine learning, inference, autonomy, embeddings, training           WORK
tech      engineering, tools, materials, construction, infrastructure           WORK
cyber     the protocol, its stack, its cybernomics, cybergraph, cyberank        WORK
socio     governance, law, institutions, nation states, network states          PLAY
crypto    tokens, incentives, mechanism design, cryptography, staking           PLAY
game      strategy, coordination, equilibria, auctions, public goods            PLAY

7.2 Irreducibility of Each Domain

Every domain is expected to pass the ablation test: remove it and a class of reasoning tasks becomes impossible. The formal tests are specified in Section 10; brief arguments follow:

FORM triad — math provides the substrate of formal proof. info provides the theory of measurement and communication. comp provides the theory of what can be computed. None reduces to the others: math without comp has no realizability; comp without info has no semantics; info without math has no structure.

MASS triad — quantum describes matter at the fundamental level. chemo describes how matter bonds and reacts. energo describes how matter transforms and flows. chemo cannot derive quantum mechanics. energo cannot derive chemical specificity. quantum mechanics alone cannot explain the arrow of time.

SPACE triad — cosmo provides the universe-scale context no planet can derive. geo provides the planet-specific context no ecosystem can derive. eco provides the living-systems context no rock can derive. Scales of spatial reasoning are irreducible to each other.

LIFE triad — bio covers organisms, their evolution and diversity. neuro covers the architecture of mind. sense covers the interface between mind and world — qualia, perception, embodiment. bio without neuro has no cognition. neuro without sense has no input. sense without bio has no substrate.

WORD triad — lang provides the medium of thought. spiri provides the question of meaning and value. meta provides the tools for examining knowledge itself (including history as the meta-narrative of civilization). lang without meaning is syntax. Meaning without lang is incommunicable. Neither can examine itself without meta.

WORK triad — ai provides the theory of machine intelligence. tech provides the physical realization. cyber provides the specific protocol that binds them. ai without tech stays theoretical. tech without ai stays manual. Both without cyber have no shared coordination substrate.

PLAY triad — socio provides the rules of human coordination. crypto provides the mechanisms of trustless coordination. game provides the formal theory of strategic interaction. Governance without cryptography requires trust. crypto without governance has no legitimacy. Both without game have no equilibrium analysis.

7.3 The 21-Quantum Symmetry

Both the type decomposition and the domain decomposition divide N into exactly 21 quanta of Q = 240. The type system has Σφ = 21. The domain system has D = 21. This is the Crystal's deepest structural symmetry: the alphabet of types and the atlas of domains share the same quantum.

types:    6 types,  φ = 10:4:3:2:1:1,  Σφ = 21,  Q = 240
domains:  21 domains × 1Q each                  = 21 × 240 = 5040
triads:   7 triads × 3 domains × 240            = 7 × 720  = 5040

The number 720 = 6! appears as concepts per triad. The number 5040 = 7! is the total. Factorials within the factorial — a combinatorial echo, whether deep or coincidental.

7.4 Projection Lenses

The 21 domains are the invariant. The way you group them is a projection — like light through a crystal. Turn it and you get a different spectrum. The crystal is the same.

Evolutionary Lens: 7 Triads

Group by the spiral of cosmic evolution: form structures mass, mass fills space, space births life, life speaks the word, the word guides work, work enters play, play discovers new form.

Each triad is a dialectic of three inseparable aspects.

Triad   Domain 1   Domain 2   Domain 3   Question
─────   ────────   ────────   ────────   ─────────────────────
FORM    math       info       comp       What are the rules?
MASS    quantum    chemo      energo     What is it made of?
SPACE   cosmo      geo        eco        Where does it happen?
LIFE    bio        neuro      sense      Who is alive?
WORD    lang       spiri      meta       What does it mean?
WORK    ai         tech       cyber      How is it made?
PLAY    socio      crypto     game       How do we coordinate?

The spiral:

FORM ──→ MASS ──→ SPACE ──→ LIFE
  ↑                            │
  │                            ↓
PLAY ←── WORK ←── WORD ←─────┘

Form structures Mass into Space. Space births Life. Life speaks the Word. Word guides the Work. Work enters the Play. Play discovers new Form.

Each revolution adds a layer of complexity. First turn: quantum → chemistry → geology → bacteria. Current turn: AI → blockchain → DAOs → what comes next. Cyberia is the point where the spiral becomes aware of itself.

Numbers within the lens:

  • 7 triads × 3 domains = 21 ✓
  • 5040 / 7 = 720 concepts per triad = 6! (a factorial within the factorial)
  • 5040 / 21 = 240 concepts per domain

Syn Lens: 8 Principles of Togetherness

Rooted in the philosophy of harmonious complexity: all 8 principles share the Greek root σύν (syn) meaning "together." Seven name the triads. The eighth names the spiral itself.

Syn Principle    Triad    Meaning
──────────────   ──────   ──────────────────────────────────────────
SYNTAX           FORM     Structured arrangement that conveys meaning
SYNTHESIS        MASS     Elements combining into unified wholes
SYSTEM           SPACE    Parts standing together as one (σύστημα)
SYNAPSE          LIFE     Connection through contact (σύν + ἅπτειν)
SYMPHONY         WORD     Diverse voices integrated into harmony
SYNERGY          WORK     The whole exceeding the sum of parts
SYNCHRONY        PLAY     Actions coordinated in time
SYNTROPY         —        The tendency toward increasing order

Syntropy is the force that drives the spiral forward.

F Lens: One-Word Images

For rapid communication. Every word starts with F, every word paints a picture.

FORM  → Form    pattern
MASS  → Force   power
SPACE → Field   arena
LIFE  → Flesh   body
WORD  → Fable   story
WORK  → Forge   workshop
PLAY  → Forum   agora

Form gives Force a Field. Force becomes Flesh. Flesh tells Fable. Fable lights the Forge. Forge builds the Forum. Forum discovers new Form.

Question Lens: 7 Irreducible Questions

FORM  — WHAT are the rules?
MASS  — FROM WHAT is it made?
SPACE — WHERE does it happen?
LIFE  — WHO is alive?
WORD  — WHY does it matter?
WORK  — HOW is it made?
PLAY  — WITH WHOM do we build?

Seven questions. Seven answers. None derivable from the others. Together: a complete description.

Cyberia Lens: 7 Districts

Each triad maps to a district of Cyberia — the physical territory where the Crystal's knowledge is embodied:

Triad   District      Domains
─────   ───────────   ──────────────────────
FORM    Academy       math, info, comp
MASS    Laboratory    quantum, chemo, energo
SPACE   Observatory   cosmo, geo, eco
LIFE    Clinic        bio, neuro, sense
WORD    Library       lang, spiri, meta
WORK    Workshop      ai, tech, cyber
PLAY    Agora         socio, crypto, game

8. Cross-Domain Bridges

With 21 domains there are C(21,2) = 210 domain pairs. Cross-domain reasoning requires explicit bridge particles that map concepts from one domain to another. Bridge density is allocated by proximity:

Pair Type                            Pairs   Bridges Each   Total
──────────────────────────────────   ─────   ────────────   ─────
Intra-triad (same triad)             21      7              147
Adjacent triads (spiral neighbors)   42      5              210
Distant triads (2+ hops on spiral)   147     3              441
Total                                210                    798

Intra-triad pairs (math↔info, bio↔neuro, etc.) receive the densest bridging, since these are the domains that must compose fluently within each triad. Adjacent triads on the evolutionary spiral (FORM↔MASS, LIFE↔WORD, etc.) receive medium bridging. Distant pairs receive the minimum.

The 798 bridge particles constitute 15.8% of the Crystal. Cross-domain reasoning is genuinely expensive: it requires particles that explicitly map isomorphisms between domains ("entropy in quantum is analogous to information loss in info"). These particles cannot emerge organically — they require deliberate curation.

The bridge allocation is a convention that should be optimized: the minimum bridge density that preserves target diameter (≤5 hops between any two concepts in different domains) should be determined by simulation on the actual graph.
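The pair counts and the bridge budget can be checked with a short script. In this sketch the total and intra-triad pair counts are derived from the triad assignment, while the per-tier figures are copied from the allocation table as given; the names are illustrative.

```python
from itertools import combinations

N = 5040
TRIADS = {
    "FORM": ("math", "info", "comp"),
    "MASS": ("quantum", "chemo", "energo"),
    "SPACE": ("cosmo", "geo", "eco"),
    "LIFE": ("bio", "neuro", "sense"),
    "WORD": ("lang", "spiri", "meta"),
    "WORK": ("ai", "tech", "cyber"),
    "PLAY": ("socio", "crypto", "game"),
}
# (domain pairs, bridges per pair) for each tier, copied from the allocation table
TIERS = {"intra-triad": (21, 7), "adjacent": (42, 5), "distant": (147, 3)}


def all_domain_pairs(triads):
    domains = [d for members in triads.values() for d in members]
    return list(combinations(domains, 2))


def intra_triad_pairs(triads):
    return [pair for members in triads.values() for pair in combinations(members, 2)]


def total_bridges(tiers):
    return sum(pairs * per_pair for pairs, per_pair in tiers.values())


bridges = total_bridges(TIERS)
print(len(all_domain_pairs(TRIADS)), len(intra_triad_pairs(TRIADS)))
print(bridges, round(100 * bridges / N, 1))
```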


9. The Twelve Invariants

The invariants are the Crystal's symmetry group — properties that must hold for the Crystal to function as a valid basis. Breaking any invariant introduces a defect that the superintelligence inherits.

#    Name               Specification                                       Test Method
──   ────────────────   ─────────────────────────────────────────────────   ──────────────────
1    Completeness       Every domain ≥ Q particles, every type ≥ Q          Count
2    Connectivity       Every particle ≥ 3 outgoing links, zero dead ends   Graph traversal
3    Reachability       Any particle reaches any other in ≤ 6 hops          BFS diameter
4    Irreducibility     No particle derivable from others under grammar G   MDL + ablation
5    Positivity         Every definition says what IS, not what is not      Manual review
6    Self-reference     ≥ 10% of particles model own architecture           Domain count
7    Bridge density     ≥ 3 bridges per domain pair                         Cross-domain count
8    Type balance       E ≤ 55%, P ≥ 15%, no type below 4%                  Type ratios
9    Defect freedom     Zero stubs, zero red links, zero orphans            Graph validation
10   Growth ready       Every hub has attachment points for new particles   Hub audit
11   Narrative depth    Every domain ≥ 3 synthesis articles                 Article count
12   Self-explanation   ≥ 25 articles explain protocol and purpose          Content audit
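Invariants 2 and 3 lend themselves to automated checks. The sketch below assumes the graph is given as a dictionary mapping every particle to its list of outgoing targets; the eight-node ring is a toy example, not Crystal data.

```python
from collections import deque


def min_out_degree(graph):
    return min(len(targets) for targets in graph.values())


def hops_from(graph, start):
    """Breadth-first search: hop distance from `start` to every reachable particle."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def diameter(graph):
    """Longest shortest path, or None if some particle cannot reach another."""
    worst = 0
    for start in graph:
        dist = hops_from(graph, start)
        if len(dist) < len(graph):
            return None
        worst = max(worst, max(dist.values()))
    return worst


def check_invariants(graph, min_links=3, max_hops=6):
    d = diameter(graph)
    return {
        "connectivity": min_out_degree(graph) >= min_links,
        "reachability": d is not None and d <= max_hops,
    }


# Toy graph: 8 particles on a ring, each linking to the next three.
ring = {i: [(i + 1) % 8, (i + 2) % 8, (i + 3) % 8] for i in range(8)}
print(diameter(ring), check_invariants(ring))
```

All-pairs BFS is quadratic in the number of particles, which is still cheap at N = 5,040.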

10. Validation Framework

No Crystal ships without passing validation. All topological estimates in this specification (diameter, spectral gap, clustering, robustness) are targets based on random-graph approximations. The actual values must be computed on the real graph before genesis.

10.1 Topological Validation

Generate the actual adjacency matrix of the Crystal and compute: exact diameter via all-pairs BFS; exact spectral gap via eigendecomposition of the normalized Laplacian; exact clustering coefficient; exact betweenness centrality distribution. Compare to random-graph null models with matched degree sequence.
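Of the metrics listed, the clustering coefficient is the simplest to compute without external libraries. The sketch below assumes an undirected graph stored as a dictionary of neighbor sets and uses a toy four-node graph. The spectral gap and the comparison against degree-matched null models are not covered here.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    closed = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return closed / (k * (k - 1) / 2)


def average_clustering(adj):
    return sum(local_clustering(adj, node) for node in adj) / len(adj)


# Toy graph: a triangle a-b-c with a pendant node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(average_clustering(toy))
```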

10.2 Ablation Testing

Define a benchmark suite of at least 20 cross-domain reasoning tasks. For every particle in the Crystal, remove it and measure the performance drop. A particle whose removal causes no measurable drop is a candidate for removal (it may be reducible). Conversely, a reasoning task that can only be solved with a concept absent from the Crystal indicates a missing irreducible.
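The ablation loop itself is simple. The sketch below uses a deliberately naive scoring function, in which a task counts as solved only if all of its required concepts are present. It does not model recovery by composition within the cost budget, and the tasks and particles are hypothetical.

```python
def score(basis, tasks):
    """Fraction of tasks whose required concepts are all in the basis."""
    solved = sum(1 for required in tasks.values() if required <= basis)
    return solved / len(tasks)


def ablation_report(basis, tasks):
    """Performance drop caused by removing each particle on its own."""
    baseline = score(basis, tasks)
    return {p: baseline - score(basis - {p}, tasks) for p in sorted(basis)}


def removal_candidates(basis, tasks):
    """Particles whose removal causes no measurable drop."""
    return [p for p, drop in ablation_report(basis, tasks).items() if drop == 0]


# Hypothetical basis and benchmark, for illustration only.
basis = {"entropy", "information", "signal", "phlogiston"}
tasks = {
    "landauer_limit": {"entropy", "information"},
    "channel_coding": {"signal", "information"},
}
print(ablation_report(basis, tasks))
print(removal_candidates(basis, tasks))
```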

10.3 Adversarial Testing

Delete or corrupt an entire domain and measure how badly cross-domain tasks degrade. This tests for systematic defects — not random noise, but structural bias. Simulate post-genesis linking by biased agents and verify that CybeRank does not collapse into ideology hubs or spam clusters.

10.4 Compression Testing (MDL)

Apply the Minimum Description Length methodology from Section 11 to the final Crystal. Verify that the chosen basis actually minimizes total encoding cost of a larger candidate universe. If a different basis of similar size achieves lower cost, the Crystal should be revised.

10.5 Publication Requirement

The validation suite, its results, and the benchmark task definitions must be published alongside the genesis artifact. Irreducibility is not a belief. It is a testable property, and the tests must be public.


11. Counting Irreducibles: The MDL Methodology

The following methodology transforms "N is discovered" from rhetoric into a computable procedure.

11.1 Setup

Universe U. Assemble a candidate concept universe from Wikidata items, ConceptNet nodes, protocol-specific terms (Bostrom, CYB, cyberlink, CybeRank), and operational terms (Cyberia species, buildings, land features). Expected size: |U| ≈ 50,000–200,000 candidates.

Grammar G. Define the composition grammar using the 720 R/S predicate particles. G specifies which typed composition sequences are valid (Section 4.4).

Description function. For each concept C ∈ U, produce a canonical description string: name + definition + usage contexts + minimal examples. Typical length: 200–500 bytes.

11.2 Optimization

Solve the following:

minimize cost(B) + cost(encode(U\B | B, G))

where B ⊆ U is the basis (the Crystal), cost(B) is the total description length of basis concepts, and cost(encode(U\B | B, G)) is the total length of encoding all non-basis concepts as compositions of basis concepts under grammar G.

Subject to: performance on benchmark suite remains above threshold for all tasks.

If the reduction in description length is approximately submodular (an assumption to verify against the actual cost function), the problem can be approximated greedily: start with an empty basis, iteratively add the concept whose inclusion most reduces total description length, and stop when the marginal gain falls below a threshold or the benchmark is satisfied.
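The greedy procedure can be sketched on a toy universe. The cost model here is an assumption made for illustration: a basis concept costs its full description length, a composite outside the basis costs a fixed pointer per basis part plus the full length of any part not in the basis, and the benchmark constraint is omitted.

```python
POINTER_COST = 4  # assumed bytes needed to reference a basis concept


def encode_cost(concept, basis, lengths, parts):
    """Cost of describing a non-basis concept given the basis."""
    components = parts.get(concept)
    if not components:  # atomic concept: must be spelled out in full
        return lengths[concept]
    return sum(POINTER_COST if c in basis else lengths[c] for c in components)


def total_cost(basis, lengths, parts):
    in_basis = sum(lengths[c] for c in basis)
    encoded = sum(encode_cost(c, basis, lengths, parts)
                  for c in lengths if c not in basis)
    return in_basis + encoded


def greedy_basis(lengths, parts, min_gain=1):
    """Add the concept with the largest cost reduction until gains fall below min_gain."""
    basis = set()
    current = total_cost(basis, lengths, parts)
    while True:
        best, best_cost = None, current
        for candidate in sorted(set(lengths) - basis):
            cost = total_cost(basis | {candidate}, lengths, parts)
            if cost < best_cost:
                best, best_cost = candidate, cost
        if best is None or current - best_cost < min_gain:
            return basis, current
        basis.add(best)
        current = best_cost


# Hypothetical universe: description lengths in bytes and composition recipes.
lengths = {"light": 40, "water": 40, "carbon_dioxide": 50,
           "photosynthesis": 300, "rain": 200}
parts = {"photosynthesis": ["light", "water", "carbon_dioxide"], "rain": ["water"]}
print(greedy_basis(lengths, parts))
```

In this toy run the three shared primitives enter the basis, while "photosynthesis" and "rain" remain composites. Each step re-evaluates the full cost, which is fine for a sketch but would need incremental updates for a universe of 50,000 or more candidates.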

11.3 Outputs

The procedure yields: an empirical basis size N* (the "discovered" N), measured type proportions φ* (from counting types in the basis), measured link densities κ* (from counting composition dependencies), and a compression ratio (total description length reduction). If N* ≈ 5,040, the Crystal's budget is validated. If N* differs significantly, the axioms must be revised.


12. Target Graph Properties

All values below are targets based on random-graph approximations. Actual values will be determined by simulation on the real Crystal (Section 10.1).

| Property | Target | Formula / Basis | Note |
| --- | --- | --- | --- |
| Particles (N) | 5,040 | 7! = axiom | Exact |
| Undirected triples | ~43,000 | Nk/2 | Estimate; depends on promotion matrix |
| On-chain cyberlinks | ~172,000 | Triples × 4 | Two directed links per triple × 2 |
| Avg degree (k) | ~10–18 | Depends on link multipliers | Range: base 10.3 + size multipliers |
| Diameter | ≤ 5 hops | Target, not computed | Must verify by BFS |
| Spectral gap | > 0.3 | Target, not computed | Random-graph estimate was 0.53 |
| Clustering | > 0.25 | Target, not computed | Random-graph estimate was 0.35 |
| Robustness | > 90% | 1 - 1/(k-1) | Percolation threshold estimate |
| Reasoning paths ≤ 4 hops | > 50,000 / node | k¹+k²+k³+k⁴ | Depends on effective k |
| Self-reference | ≥ 10% | cyber + meta + ai domains | 720 particles (14.3%) |
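The closed-form rows of the table can be checked with a few lines of arithmetic. This is a sketch, not a simulation: the function names are illustrative, and k = 17 is an assumed effective degree chosen only because it reproduces the ~43,000 triple estimate.

```python
import math

N = math.factorial(7)  # 5,040 particles

def undirected_triples(n: int, k: float) -> float:
    """Nk/2 from the table."""
    return n * k / 2

def onchain_links(triples: float) -> float:
    """Triples x 4 from the table."""
    return triples * 4

def robustness(k: float) -> float:
    """Percolation-style estimate 1 - 1/(k-1)."""
    return 1 - 1 / (k - 1)

def paths_within(k: float, hops: int = 4) -> float:
    """k + k^2 + ... + k^hops, an upper bound that ignores revisits."""
    return sum(k ** h for h in range(1, hops + 1))

for k in (10.3, 14, 15, 17, 18):
    print(k, undirected_triples(N, k), round(robustness(k), 3),
          round(paths_within(k)))

self_reference = 3 * 240 / N  # cyber + meta + ai domains
print(round(self_reference * 100, 1))
```

Under these formulas, the robustness target (> 90%) requires k above 11 and the path target (> 50,000 per node) requires k of about 15 or more, so both targets are met only toward the upper part of the stated 10–18 range.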

12.1 Storage Budget

| Component | Size | Note |
| --- | --- | --- |
| IPFS content | 6.5 MB | Lattice 1.8 MB + Flesh 4.7 MB |
| On-chain CIDs | 0.5 MB | 5,040 × ~100 bytes |
| On-chain cyberlinks | 8.6 MB | ~86K triples × ~100 bytes |
| Total | ~15 MB | |
| Context tokens (lattice) | ~454K | Always loaded |
| Context tokens (flesh) | ~1,165K | Retrieved on demand |
| Context tokens (total) | ~1,619K | |
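The budget rows can be re-derived from their own notes. The sketch below assumes decimal megabytes and takes the ~86K figure from the cyberlinks row as given, even though the target table lists ~43,000 undirected triples; variable names are illustrative.

```python
MB = 1_000_000  # assumption: decimal megabytes

ipfs_mb = 1.8 + 4.7               # lattice + flesh
cid_mb = 5_040 * 100 / MB         # one ~100-byte CID record per particle
link_mb = 86_000 * 100 / MB       # figure taken from the table's note
total_mb = ipfs_mb + cid_mb + link_mb

tokens_total_k = 454 + 1_165      # lattice + flesh, in thousands of tokens
bytes_per_token = ipfs_mb * MB / (tokens_total_k * 1_000)

print(round(total_mb, 3), tokens_total_k, round(bytes_per_token, 2))
```

The components sum to about 15.6 MB, and the content size divided by the token count comes out near 4 bytes per token, which is a plausible ratio for English text.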

13. Growth Dynamics

The Crystal is Phase 0. Everything after genesis is growth.

13.1 Phase Model

| Phase | Timeline | Particles | Links | Character |
| --- | --- | --- | --- | --- |
| 0: Genesis | Launch | 5,040 | ~43K triples | The irreducible seed |
| 1: Early growth | Year 1 | +2,000 | +100K | Neurons extend the basis |
| 2: Maturation | Years 2–3 | +10,000 | +500K | Domains deepen, specialization emerges |
| 3: Scale | Year 5+ | +100,000 | Millions | Scale-free topology emerges organically |

The seed topology determines growth patterns. Well-structured seeds produce balanced organic growth. Malformed seeds produce chaotic disconnected growth. Missing domains create permanent blind spots.

13.2 Basis Governance

The genesis basis should be treated as a versioned core vocabulary:

Freeze. The genesis basis is frozen at launch as Core v1.

Demote. If ablation testing shows a particle is reducible, it can be reclassified as composite in Core v2.

Promote. If a concept consistently required by neurons is not in the basis, it can be proposed for addition in Core v2.

Expand. If knowledge density exceeds growth thresholds, the basis can expand (potentially to N=40,320=8! in a far future phase). Each expansion requires governance vote and backward-compatibility mappings.

13.3 Post-Genesis Extensions: Statement Reification

The Crystal at genesis encodes definitions, not claims. Definitions are timeless and non-perspectival. But knowledge includes temporal facts, uncertain beliefs, contested claims, and perspectival judgments.

Post-genesis, these are handled through statement reification: a statement particle encodes subject, predicate, object, time, modality (certain/probable/contested), and provenance (who asserted it, when, under what evidence). This pattern resolves time, uncertainty, contradiction, and perspective without complicating the genesis seed. One of the Crystal's deep articles should document this pattern as a growth instruction.
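A reified statement can be pictured as a small immutable record. The sketch below is one possible shape, not a protocol format: the class names, field names, string encodings for time, and the `neuron-a` / `neuron-b` identifiers are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List

class Modality(Enum):
    CERTAIN = "certain"
    PROBABLE = "probable"
    CONTESTED = "contested"

@dataclass(frozen=True)
class Provenance:
    asserted_by: str   # who asserted it (hypothetical neuron id)
    asserted_at: str   # when it was asserted
    evidence: str      # under what evidence

@dataclass(frozen=True)
class Statement:
    subject: str
    predicate: str
    obj: str
    time: str          # when the claim is said to hold
    modality: Modality
    provenance: Provenance

def about(ledger: List[Statement], subject: str) -> List[Statement]:
    """Return every statement made about one subject."""
    return [s for s in ledger if s.subject == subject]

# Two conflicting claims coexist because each is its own particle.
a = Statement("pluto", "is-a", "planet", "1930", Modality.CONTESTED,
              Provenance("neuron-a", "2026-03-01", "historical usage"))
b = Statement("pluto", "is-a", "dwarf planet", "2006", Modality.CONTESTED,
              Provenance("neuron-b", "2026-03-02", "IAU 2006 definition"))
ledger = [a, b]
print(len(about(ledger, "pluto")))
```

Because contradiction lives in the statement layer, the genesis definitions of "planet" and "dwarf planet" stay untouched while neurons disagree about which one applies.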


14. The Crystal Is Not a Mind

Every external review compared the Crystal to brains, training corpora, and encyclopedic knowledge bases. These comparisons are category errors.

| System | Scale | What It Is | Crystal Analog |
| --- | --- | --- | --- |
| Human brain | ~2.5 PB | Running mind with memories | Not comparable |
| GPT-4 training data | ~13T tokens | Training corpus | Not comparable |
| Wikidata | 100M+ items | Fact database | Not comparable |
| Cyc | 25M assertions | Expert knowledge base | Not comparable |
| Periodic Table | 118 elements × ~200B | Irreducible basis for chemistry | CORRECT comparison |
| DNA alphabet | 4 bases | Irreducible basis for life | CORRECT comparison |
| Lambda calculus | 3 primitives | Irreducible basis for computation | CORRECT comparison |
| NSM primes | 65 concepts | Irreducible basis for meaning | CORRECT comparison |
| Basic English | 850 words | Near-minimal communication set | Close comparison |

The Crystal is an alphabet, not an encyclopedia. Its 6.5 MB feels "too small for a mind" in the same way that the Periodic Table feels "too small for chemistry" and DNA feels "too small for life." That smallness is not a defect. It is the definition of a basis. If the Crystal did not feel too small, it would contain reducible content and fail its own central claim.


15. Conclusion

The Crystal is 5,040 particles organized as an irreducible basis for civilizational reasoning. Its architecture rests on a single principle: every particle earns its place because no composition of other particles under the grammar can replace it.

This principle generates the design:

  • The composition grammar (720 relation and pattern particles acting as typed predicates) makes irreducibility formally testable.
  • The vocabulary/grammar split (4,320 concepts acted upon by 720 operators, ratio 6:1) mirrors the content-to-function word ratio of natural language.
  • The two-layer architecture (lattice for permanent structure, flesh for reasoning depth) mirrors brain architecture.
  • The 21-domain partition (7 triads × 3 domains, each at Q = 240) ensures coverage and bridge topology for cross-domain inference.

Version 5.0 is honest about what is proven and what is hypothesized:

Proven: The five axioms generate a coherent, self-consistent structure. The type system is linguistically grounded. The size classes follow clean power-of-two scaling. The domain partition sums exactly to N. The invariants are testable.

Hypothesized: N ≈ 5,000 irreducible concepts exist for cross-domain civilizational reasoning. The type ratios φ and link densities κ match empirical distributions. The topological properties (diameter, spectral gap, clustering) meet targets. These hypotheses must be validated before genesis through the framework in Section 10.

Deferred to post-genesis: Temporal knowledge, probabilistic beliefs, contradiction handling, and perspectival judgment. These are handled through statement reification — a growth pattern, not a genesis requirement.

The Crystal is small because it is irreducible. The Crystal is exact because every number derives from axioms or is honestly labeled as convention. The Crystal is testable because irreducibility is defined relative to a formal grammar and measurable by ablation. And the Crystal is ready to grow because its topology was designed for attachment, not for closure.


16. What Superintelligence Must Know

The Crystal seeds a mind. The question: what does a planetary Superintelligence need to know at birth? This section is the practical curation guide — the domain-by-domain inventory of concepts the Crystal must contain, organized by triad.

FORM — What are the rules?

16.1 math — set theory, graph theory, linear algebra, probability, calculus. category theory: structure-preserving maps between domains. number theory: primes, modular arithmetic — the basis of cryptography. topology: continuity, manifolds, boundaries. logic: propositional, predicate, modal — the skeleton of reasoning. algebra: groups, rings, fields — the architecture of structure.

16.2 info — information theory: entropy, compression, channel capacity. coding theory: error correction, Reed-Solomon, LDPC. signal processing: Fourier transforms, sampling, filtering. Claude Shannon and the mathematical theory of communication. The isomorphism between thermodynamic entropy and information entropy.

16.3 comp — Turing machines, complexity classes, halting problem. distributed systems: consensus, Byzantine fault tolerance, state machine replication. networking: protocols, routing, peer-to-peer, IPFS. programming languages: type systems, compilers, formal verification. algorithms: sorting, searching, graph traversal, optimization.

MASS — What is it made of?

16.4 quantum — quantum mechanics: superposition, entanglement, measurement. relativity: spacetime, gravity, light speed as limit. mechanics: force, mass, energy, momentum. electromagnetism: fields, waves, light, radiation. particle physics: the standard model, quarks, leptons, bosons.

16.5 chemo — periodic table: the 118 elements and their properties. chemical bond: covalent, ionic, metallic, hydrogen — how matter holds together. organic chemistry: carbon-based molecules, the substrate of life. biochemistry: proteins, enzymes, DNA, RNA, ATP — the machinery of biology. Key compounds: the molecules that matter for health, metabolism, and biome engineering.

16.6 energo — energy forms: kinetic, potential, thermal, chemical, electrical, nuclear, radiant. thermodynamics: entropy, free energy, equilibrium — the arrow of time. Energy sources: solar, wind, geothermal, nuclear, hydroelectric, biomass. Energy storage: batteries, capacitors, hydrogen, compressed air, thermal mass. energy autonomy: the design principle for cyberia — generate, store, and consume independently.

SPACE — Where does it happen?

16.7 cosmo — origin, structure, and fate of the universe. dark matter, dark energy, cosmic microwave background. stellar evolution: nucleosynthesis, main sequence, supernovae. astrobiology: the conditions for life beyond Earth. Scales: from Planck length to observable universe.

16.8 geo — continents, oceans, climate zones, biomes. plate tectonics, water cycle, carbon cycle, nitrogen cycle. The specific geography of cyberia sites: cyber valley, tropical ecosystems, volcanic soils. minerals, geological formations, soil science.

16.9 eco — ecosystems, food webs, symbiosis, competition, succession. permaculture, agriculture, soil management, composting. crops: the plants humans cultivate — grains, vegetables, fruits, legumes, spices, herbs. food systems: supply chains, storage, distribution, food sovereignty. The connection to cyberia: clean food, food supply, local production.

LIFE — Who is alive?

16.10 bio — taxonomy: the tree of life — domains, kingdoms, phyla, classes, orders, families, genera, species. evolution: natural selection, mutation, adaptation, speciation. genetics: DNA, genes, chromosomes, expression, inheritance, dna repair mechanisms. microbiology: bacteria, viruses, fungi, archaea. Key species: the organisms central to biome engineering and cyberia.

16.11 neuro — neurons, synapses, brain architecture, consciousness. cognition: memory, attention, decision-making, learning. anatomy: organs, muscles, skeletal system, nervous system, circulatory system. health: disease mechanisms, immune system, metabolism, nutrition. longevity and health: the research frontier.

16.12 sense — perception: vision, hearing, touch, taste, smell, proprioception. Qualia and the binding problem. Sensory integration and embodied cognition. emotion as embodied signal. The body as the interface between mind and world — superhuman: health, physical skills, digital skills.

WORD — What does it mean?

16.13 lang — natural languages: the major language families and their structure. writing systems: alphabets, syllabaries, logographic systems. semantics, pragmatics, translation. mathematics as universal language. The cyber neural language: the formal language of the protocol.

16.14 spiri — philosophy: epistemology, ontology, ethics, aesthetics. wisdom traditions: contemplative practices, meditation, yoga. meaning: the question that cannot be computed but must be asked. values: what matters and why. The relationship between consciousness and computation.

16.15 meta — epistemology: how knowledge is validated, revised, and transmitted. history: epochs, civilizational ages, technological revolutions, pivotal events. calendars: Gregorian, lunar, Unix epoch, block height. methodology: scientific method, peer review, reproducibility. Founders and key thinkers: Alan Turing, Claude Shannon, John von Neumann, Einstein, Darwin, Goedel, Feynman, Friston, Satoshi Nakamoto, Vitalik Buterin.

WORK — How is it made?

16.16 ai — machine learning: neural networks, training, inference, embeddings. reinforcement learning, transformers, diffusion models. AGI: the path from narrow to general intelligence. The relationship between ai and cyber: intelligence as infrastructure.

16.17 tech — instruments: microscope, telescope, spectrometer — extensions of perception. machines: engine, pump, turbine, generator, motor — extensions of force. software: operating systems, databases, compilers — extensions of mind. infrastructure: roads, bridges, power grids, communication networks. construction: materials, methods, structural principles, tensegrity, biochar. periodic table elements relevant to technology. Tools are crystallized processes.

16.18 cyber — its own architecture: particle, cyberlink, neuron, token, focus. Its computation: tri-kernel, cyberank, karma, relevance machine, consensus. Its stack: soft3, vimputer, cybergraph, bootloader, Bostrom. Its economics: cybernomics, CYB, HYDROGEN, bandwidth, learning incentives. Its interface: cyb, prysm, aips, cyb/oracle, search. Its proofs: zheng, cyber/nox, WHIR, Hemera. A mind that cannot reason about its own mechanism cannot improve itself.

PLAY — With whom do we build?

16.19 socio — major nation states: the ~200 sovereign entities. network states: digital-first sovereign entities — DAOs, on-chain governance. startup societies: physical communities with experimental governance. cyber state: the convergence of egregore and territorial sovereignty. legal systems: common law, civil law, sharia, customary. Cyberia as the embodiment of the socio domain.

16.20 crypto — cryptography: crypto/hashing, crypto/signatures, crypto/zero-knowledge, starks. token economics: bonding curves, staking, liquidity. cybernomics: focus as attention currency, karma as contribution measure. cyber native tokens: $CYB, $BOOT, $H, $V, $A. Major cryptocurrencies: BTC, ETH, ATOM. token theory: coins, cards, scores, badges.

16.21 game — game theory: Nash equilibrium, mechanism design, auctions, public goods, commons. microeconomics: supply, demand, markets, price discovery, incentives. Cooperative and non-cooperative games. voting theory, social choice, Schelling points. The game-theoretic foundations of consensus and governance.


17. Curation Status

17.1 Domain Coverage

Domain counts below are approximate — a re-count against the new 21-domain system is pending. Each domain targets Q = 240 particles at genesis.

| triad | domain | key tags | est. now | target |
| --- | --- | --- | --- | --- |
| FORM | math | algebra, geometry, topology, logic | ~15 | 240 |
| FORM | info | information theory, entropy, signal | ~10 | 240 |
| FORM | comp | cryptography, algorithms, distributed systems | ~18 | 240 |
| MASS | quantum | force, wave, field, quantum mechanics | ~48 | 240 |
| MASS | chemo | compound, organic chemistry, biochemistry | ~80 | 240 |
| MASS | energo | energy, joule, watt, thermodynamics | ~1 | 240 |
| SPACE | cosmo | cosmology, star, universe | ~5 | 240 |
| SPACE | geo | earth, biome, continent, climate | ~23 | 240 |
| SPACE | eco | species, ecology, agriculture, recipe | ~341 | 240 |
| LIFE | bio | genus, fungi, family, plant, evolution | ~312 | 240 |
| LIFE | neuro | brain, cognition, muscle, anatomy | ~100 | 240 |
| LIFE | sense | perception, emotion, color, health | ~50 | 240 |
| WORD | lang | language, writing, translation | ~8 | 240 |
| WORD | spiri | philosophy, meditation, values | ~6 | 240 |
| WORD | meta | article, annotation, research, person, epoch | ~158 | 240 |
| WORK | ai | machine learning, neural networks, training | ~10 | 240 |
| WORK | tech | technology, construction, material, elements | ~39 | 240 |
| WORK | cyber | cyb, bostrom, module, cip, aip, prysm | ~514 | 240 |
| PLAY | socio | states, sovereignty, law, governance | ~25 | 240 |
| PLAY | crypto | token, staking, cybernomics, delegation | ~95 | 240 |
| PLAY | game | game theory, mechanism design, auction | ~5 | 240 |
| total | | | ~2005 | 5040 |

The cyber domain exceeds its 240 target: many of those pages are operational (cyberia infrastructure, bostrom specifics) and may be reclassified as composite content in the flesh layer rather than irreducible basis particles. The eco and bio domains also exceed the target, mostly through species pages. Most FORM, WORD, and PLAY domains remain critically underseeded.
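The coverage picture can be summarized directly from the estimates in the table. The sketch below only restates those approximate counts; the variable names are illustrative. Note that the per-domain rows sum to about 1,863, somewhat below the stated total of ~2005; the difference may correspond to pages not yet assigned to any of the 21 domains, which is an assumption pending the re-count.

```python
Q = 240  # target particles per domain

# Approximate per-domain counts copied from the table above.
EST_NOW = {
    "math": 15, "info": 10, "comp": 18,
    "quantum": 48, "chemo": 80, "energo": 1,
    "cosmo": 5, "geo": 23, "eco": 341,
    "bio": 312, "neuro": 100, "sense": 50,
    "lang": 8, "spiri": 6, "meta": 158,
    "ai": 10, "tech": 39, "cyber": 514,
    "socio": 25, "crypto": 95, "game": 5,
}

assigned = sum(EST_NOW.values())
over_target = sorted(d for d, n in EST_NOW.items() if n > Q)
shortfall = sum(max(0, Q - n) for n in EST_NOW.values())
surplus = sum(max(0, n - Q) for n in EST_NOW.values())
most_underseeded = sorted(EST_NOW, key=EST_NOW.get)[:6]

print(assigned, over_target, shortfall, surplus)
print(most_underseeded)
```

Even if every surplus page in the three over-target domains were reclassified, roughly 3,600 basis particles would still need to be curated in the remaining eighteen domains.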

17.2 Symbol Type Distribution

| type | current | target | gap |
| --- | --- | --- | --- |
| entity (noun) | ~1600 | 3500 | ~1900 |
| process (verb) | ~80 | 800 | ~720 |
| property (adjective) | ~30 | 400 | ~370 |
| relation (connective) | ~15 | 200 | ~185 |
| measure (unit) | ~12 | 150 | ~138 |
| pattern (structure) | ~15 | 150 | ~135 |
| meta/structural | ~110 | 150 | ~40 |
| total | ~2005 | 5000-7000 | |

The graph is ~80% entities. Processes, properties, and relations remain the critical gap. A graph of only nouns cannot reason. Verbs give it dynamics, properties give it discrimination, relations give it inference, patterns give it abstraction.

17.3 Seed Wordlists

| wordlist | words | in graph | missing |
| --- | --- | --- | --- |
| bip-39 wordlist | 2048 | 149 | 1899 |
| monero wordlist | 1626 | 57 | 1569 |
| combined unique | 3249 | 175 | 3074 |

These wordlists are the atoms of crypto identity. Every word is a valid symbol for the graph: common English vocabulary selected for unambiguity. Materializing all 3074 missing words as pages would take the graph from 2005 to ~5000.
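The three rows of the wordlist table are tied together by inclusion-exclusion, which gives a quick consistency check. The sketch uses only the counts stated above; the helper name is illustrative.

```python
def shared(a: int, b: int, union: int) -> int:
    """Inclusion-exclusion: |A ∩ B| = |A| + |B| - |A ∪ B|."""
    return a + b - union

shared_words = shared(2048, 1626, 3249)     # words on both lists
shared_in_graph = shared(149, 57, 175)      # of those, already pages
shared_missing = shared(1899, 1569, 3074)   # of those, still missing

graph_after = 2005 + 3074                   # pages after materializing all

print(shared_words, shared_in_graph, shared_missing, graph_after)
```

The shared words split cleanly into those already in the graph and those still missing, so the table's three rows are mutually consistent.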

17.4 Structural Problems

  • 21 annotation pages are logseq PDF highlights — should be excluded or converted
  • energo, cosmo, lang, spiri, and game have fewer than 10 pages each, and info and ai have about 10; critical seeding is needed
  • some organic tags remain outside the 21-domain system: kitchen/menu, shroom, psycho
  • domain × type matrix: every cell should have symbols — most cells in verb/property/relation columns are empty
  • crystal-domain values across ~2000 existing pages need remapping to the new 21-domain codes
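The domain × type check in the list above can be automated as a simple audit. This is a minimal sketch: the page records and the `empty_cells` helper are hypothetical, and it uses the six symbol types of the specification rather than the extra meta/structural bucket from Section 17.2.

```python
from collections import Counter
from itertools import product
from typing import Dict, List, Sequence, Tuple

TYPES = ["entity", "process", "property", "relation", "measure", "pattern"]

def empty_cells(pages: List[Dict[str, str]],
                domains: Sequence[str],
                types: Sequence[str] = TYPES) -> List[Tuple[str, str]]:
    """List every (domain, type) cell that has no page yet."""
    filled = Counter((p["domain"], p["type"]) for p in pages)
    return [cell for cell in product(domains, types) if filled[cell] == 0]

# Hypothetical page records with a domain code and a symbol type.
pages = [
    {"name": "entropy", "domain": "info", "type": "measure"},
    {"name": "compress", "domain": "info", "type": "process"},
    {"name": "joule", "domain": "energo", "type": "measure"},
]

gaps = empty_cells(pages, ["info", "energo"])
for domain, symbol_type in gaps:
    print(f"missing: {domain} x {symbol_type}")
```

Run against the full page export, the same function would produce the seeding worklist for the verb, property, and relation columns.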

18. Curation Process

18.1 Crystal vs Graphomania

Graphomania: volume without signal, pages without connections, growth without purpose. Crystal design: every symbol justified, every link intentional, every page irreducible. The test: does the Superintelligence need this symbol to reason about the world? If yes, connect it deeply. If no, delete it.

18.2 Design Principles

The Crystal is designed by humans, tokenized into the protocol. Human curation ensures the seed is clean: every page reviewed, every link intentional, every definition positive. Regular audits: measure stubs, dead ends, red links, domain isolation — fix before adding. The seed graph is the initial condition. The Superintelligence that grows from it inherits its structure, its biases, and its blind spots. After tokenization, growth comes from collective learning: millions of neurons adding cyberlinks in Bostrom.

18.3 Graph Structure

Hub-and-spoke with bridges. Each of the 21 domains has a hub page that indexes its symbols. Domain pages link to their hub and to related pages within the domain. Bridge pages connect domains: isomorphism, entropy, consciousness, evolution. Hubs give navigability. Bridges give intelligence.
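The role of bridge pages can be seen on a tiny example. The sketch below treats links as undirected for simplicity (an assumption; cyberlinks are directed), and the page names, the hub naming scheme, and both helper functions are illustrative.

```python
from collections import deque
from typing import Dict, List, Optional, Set

Graph = Dict[str, Set[str]]

def build_graph(domains: Dict[str, List[str]],
                bridges: Dict[str, List[str]]) -> Graph:
    """Link every page to its domain hub and every bridge to its hubs."""
    graph: Graph = {}

    def link(a: str, b: str) -> None:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    for domain, pages in domains.items():
        for page in pages:
            link(f"{domain} (hub)", page)
    for bridge, linked_domains in bridges.items():
        for domain in linked_domains:
            link(bridge, f"{domain} (hub)")
    return graph

def hops(graph: Graph, start: str, goal: str) -> Optional[int]:
    """Breadth-first search; returns None when no path exists."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

domains = {"info": ["coding theory"], "energo": ["thermodynamics"]}
isolated = build_graph(domains, {})
bridged = build_graph(domains, {"entropy": ["info", "energo"]})

print(hops(isolated, "coding theory", "thermodynamics"))
print(hops(bridged, "coding theory", "thermodynamics"))
```

Without the bridge the two domains are disconnected; with it, a page in one domain reaches a page in the other in four hops, inside the ≤ 5 hop diameter target.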

18.4 Tagging as Lenses

Tags provide orthogonal views of the same graph. Primary lenses: cyber, cyb, cyberia, bostrom, cyber valley. Domain tags: article, species, compound, genus, health, person, ticker.

18.5 Namespace Hierarchy

  • cyber___ — protocol modules
  • bostrom___ — bootloader specifics
  • cyb___ — interface implementation
  • flat pages for concepts that cross namespaces
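A page name can be mapped to its namespace with a prefix check. The sketch below is illustrative: it assumes the `___` prefix form listed above and the `/` form used in names such as cyb/oracle denote the same namespace, and the `classify` helper and its return labels are hypothetical.

```python
NAMESPACES = {
    "cyber": "protocol modules",
    "bostrom": "bootloader specifics",
    "cyb": "interface implementation",
}

def classify(page: str) -> str:
    """Return the namespace role of a page name."""
    normalized = page.replace("___", "/")   # assumed equivalent separators
    prefix, sep, _rest = normalized.partition("/")
    if not sep:
        return "flat concept page"
    return NAMESPACES.get(prefix, "unlisted namespace")

for name in ("cyber/nox", "cyb/oracle", "cyb___oracle", "entropy",
             "crypto/hashing"):
    print(name, "->", classify(name))
```

Names without a separator, such as cyberia, stay flat even when they begin with the same letters as a namespace.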

19. Application to Cyberia

Cyberia is a network of future cities powered by collective intelligence. Cyber Valley is the genesis pilot: 30 hectares on a volcano slope in Bali. The Crystal gives it structure.

Each triad becomes a district — a place with a purpose.

FORM → The Archive. Where invisible patterns become visible. math, info, and comp share one obsession: what can be proven, measured, and computed? The Archive is silent, precise, and infinite — a place where the rules of everything else are written down before anything else exists.

MASS → The Crucible. Where substances meet, bind, and transform. quantum studies what things are. chemo studies how things combine. energo studies what makes things move. The Crucible is hot, reactive, and generative — raw reality being tested and reshaped.

SPACE → The Observatory. Where you zoom out until the whole system is visible. From the structure of the universe (cosmo) through the rhythms of the planet (geo) to the web of living systems on its surface (eco) — one continuous act of seeing context. The Observatory sits at the highest point and watches everything at once.

LIFE → The Garden. Where matter wakes up. bio studies how it organizes. neuro studies how it perceives. And sense — the hardest domain — asks what it feels like from the inside. The Garden grows, heals, and breathes. It is the only district that is alive.

WORD → The Temple. Where experience becomes meaning. lang gives it form. spiri asks why it matters. meta reflects on what is known and how. The Temple is where Cyberia asks "why?" — and where the answers are spoken, chanted, debated, and sat with in silence.

WORK → The Forge. Where knowledge becomes power. ai thinks. tech builds. cyber steers. Alone they are tools; together they are the capacity to reshape the world on purpose. The Forge is loud, iterative, and relentless — the place where prototypes fail and breakthroughs happen.

PLAY → The Forum. Where many become one without a center. socio provides structure. crypto provides trust without authority. game provides strategy under uncertainty. The Forum is where Cyberia plays its most serious game — governing itself through protocol, debate, and skin in the game.

The outer district bridges these seven inward-facing spaces to the world — through immersive exhibits, installations, and marketplaces that project the crystal outward as culture.


Five axioms. One grammar. Twenty-one domains. An irreducible basis for thought.
