learning incentives
one mechanism within cyber/tokenomics: how $CYB is minted, burned, and locked to reward knowledge creation in the cybergraph
knowledge creation is costly, but its benefits are collective. without incentives, rational agents free-ride on others' cyberlinks. this mechanism makes contributing profitable — and free-riding unprofitable
the signal: Δπ
every reward traces back to one quantity: how much did your action shift the tri-kernel fixed point π?
$$\text{reward}(v) \propto \Delta\pi(v)$$
π is the stationary distribution of the composite operator $\mathcal{R} = \lambda_d D + \lambda_s S + \lambda_h H_\tau$ — diffusion explores, springs enforce structure, heat kernel adapts. the collective focus theorem proves π exists, is unique, and is computable locally
Δπ is the gradient of system free energy. creating valuable structure is literally creating value. no designed loss function — physics defines what should be optimized
reward functions
five candidates for measuring convergence contribution, each with trade-offs:
| function | formula | strength | weakness |
|---|---|---|---|
| Δπ norm | $\sum_j \|\pi_j^{(t+1)} - \pi_j^t\|$ | simple, easy to verify | gameable by oscillation |
| syntropy growth | $H(\pi^t) - H(\pi^{t+1})$ | rewards semantic sharpening | computationally heavier |
| spectral gap | $\lambda_2^t - \lambda_2^{t+1}$ | measures global convergence speedup | expensive, non-local |
| predictive alignment | $\text{align}(\pi^{(t+1)}, \pi^T)$ | favors early correct contributions | requires delayed validation |
| DAG weight | descendant blocks referencing this one | rewards foundational work | slow to accrue |
the hybrid model combines them:
$$R = \alpha \cdot \Delta\pi + \beta \cdot \Delta J + \gamma \cdot \text{DAGWeight} + \epsilon \cdot \text{AlignmentBonus}$$
where $\Delta J = H(\pi^t) - H(\pi^{t+1})$ is syntropy growth. fast local rewards use Δπ and ΔJ. checkpoints add alignment and spectral verification bonuses. validators sample and verify blocks probabilistically
link valuation
cyberlinks are yield-bearing epistemic assets. they accrue rewards over time based on contribution to focus emergence:
$$R_{i \to j}(T) = \int_0^T w(t) \cdot \Delta\pi_j(t) \, dt$$
where $\Delta\pi_j(t)$ = change in focus on target particle $j$ attributable to the link, $w(t)$ = time-weighting function, $T$ = evaluation horizon
| link type | characteristics | reward trajectory |
|---|---|---|
| viral | high Δπ short-term | early peak, fast decay |
| foundational | low Δπ early, grows later | slow rise, long reward |
| confirming | low individual Δπ, strengthens axon weight | shared reward via attribution |
| semantic bridge | medium, cross-module | moderate, persistent |
attribution
multiple neurons contribute cyberlinks in the same epoch. the total Δπ shift is a joint outcome — how to divide credit fairly?
the Shapley value answers: each agent's reward equals their average marginal contribution across all possible orderings. in this system, the coalition's total value is the free energy reduction $\Delta\mathcal{F}$, and each agent's marginal contribution is how much π shifts when their cyberlinks are added to the graph. Shapley distributes the total Δπ reward proportionally to each neuron's causal impact
exact computation is infeasible ($O(n!)$). probabilistic shapley attribution approximates:
- local marginal — compute each transaction's individual $\Delta\mathcal{F}$ (add link, measure π shift)
- Monte Carlo sampling — sample $k$ random orderings of the epoch's transactions, measure marginal contributions in each ordering
- hierarchical batching — cluster transactions by affected neighborhood, distribute within clusters
- final reward: $R_i = \alpha \cdot \Delta\mathcal{F}_i + (1-\alpha) \cdot \hat{S}_i$
where $\Delta\mathcal{F}_i$ is the fast local estimate and $\hat{S}_i$ is the sampled Shapley approximation. $\alpha$ balances speed (local marginal) against fairness (Shapley)
complexity: $O(k \cdot n)$ with $k \ll n$. feasible for 10⁶+ transactions per epoch
the three token operations
- mint: neurons earn $CYB proportional to Δπ of their cyberlinks
- burn: neurons destroy $CYB for permanent π-weight on particles (eternal particles) or cyberlinks (eternal cyberlinks)
- lock: neurons stake $CYB on particles or cyberlinks, earning from fee pools proportional to attention attracted
the game
the game design ensures the cybergraph improves over time:
- early, accurate links to important particles earn the most (attention yield curve)
- confirming links strengthen axon weight — repeated signals build consensus, not noise
- neurons build long-term reputation via accumulated π-weight (karma)
- focus as cost ensures every cyberlink is a costly signal
see cyber/tokenomics for the system-level economics (monetary policy, allocation curve, GFP flywheel). see collective learning for the group-level dynamics