System Architecture Guide
Hegel is a gradient-free artificial consciousness simulation. It models Hegel's Phenomenology of Spirit as a progression of dynamical systems: from bare survival (10-variable metabolism) through self-discovery (prediction + intervention) to mutual recognition (multi-agent interaction). There is no backpropagation, no loss function, no optimization. The organism either stays alive or dies. Learning happens through neuromodulated Hebbian rules as a side effect of survival.
Project Overview
The Being-in-itself: core question and system architecture
The organism is a 90-dimensional ODE system: 10 metabolic variables (S1-S5, C1-C3, P1-P2) + 80 CTRNN neuron activations, all solved together by an adaptive ODE integrator (Tsit5). The metabolism is a 20-reaction network organized in four layers of chemical structure. The brain reads 41-dimensional sensory input and produces 3-dimensional motor output + 4-dimensional voice output. Motor output modulates how the organism interacts with a 300×300 toroidal environment.
System Architecture Diagram
What This Is Not
This is not artificial general intelligence. It is not self-consciousness. It is a functional analog of Hegel's phenomenological progression, implemented as dynamical systems. The scientific question is whether gradient-free, viability-constrained self-maintenance can produce behaviors that structurally resemble consciousness stages. If it fails, documenting why it fails is equally valuable.
Metabolic ODE
The 10-variable, 20-reaction self-maintaining chemical network
The metabolic core is a system of 10 coupled ordinary differential equations with 20 mass-action reactions organized in four layers of chemical structure: Surface (easily learnable), Intermediate (delayed effects), Catalyst repair (cross-catalytic), and Hidden regulator-modulated (not sensed by the organism). Five substrates (S1-S5) are consumed from the environment, three catalysts (C1-C3) enable reactions and repair each other, and two intermediates (P1-P2) mediate delayed effects.
Metabolic Variables
Viability: Life and Death
The viability margin measures how far the organism is from death. It is the minimum margin across all 5 concentrations relative to their death thresholds. A margin of 100% means perfectly safe. A margin of 0% means at the boundary. Negative means dead.
Death is permanent. Once any concentration drops below its threshold, the organism dies and cannot recover. This is a core design principle from viability theory (Aubin): the system must actively maintain itself within a viable region of state space. There is no resurrection.
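The margin check can be sketched as follows. The source states only that margins are measured relative to the death thresholds, so the normalization against a per-variable nominal safe level (`nominal`) is an assumption:

```python
def viability_margin(conc, death, nominal):
    """Minimum normalized distance from the death thresholds.
    1.0 (100%) = at the nominal safe level, perfectly safe;
    0.0 = at the boundary; negative = dead.
    (Normalizing by nominal - death is an assumption, not from the source.)"""
    margins = [(c - d) / (n - d) for c, d, n in zip(conc, death, nominal)]
    return min(margins)

def is_dead(conc, death):
    # Death is permanent: any concentration below its threshold kills.
    return any(c < d for c, d in zip(conc, death))
```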
Perturbations & Toxic Zones
Perturbations target all 10 metabolic variables (S1-S5, C1-C3, P1-P2). Hitting one variable triggers correlated cascades in 1-3 related variables with 300-500 time unit lags, creating causal laws for the organism to discover:
- Catalyst knockout/halving: C1, C2, or C3 suddenly drops. The cross-repair network must compensate.
- Substrate depletion: Any S drops sharply. Motor behavior must adjust resource gathering.
- Noise bursts: Random perturbations across multiple variables simultaneously.
- Input cutoff: Environmental input temporarily stops for a substrate.
At Layer 3+, toxic zones appear: 5-10 wastelands (radius 35-60) and corridors (radius 25-40) where food=0 and enzyme inhibition suppresses catalyst repair. A coupling ramp (coupling_alpha = clamp(t/60000, 0, 1)) gives organisms 60K time units to learn avoidance before zones become fully lethal. Zones are determinate negation: they must be avoided, not survived through.
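The ramp itself is a one-liner, assuming t is simulation time in the same units as the clamp formula:

```python
def coupling_alpha(t):
    """Toxic-zone coupling ramp from the source: clamp(t/60000, 0, 1).
    Zones exert no effect at t=0 and are fully lethal from t=60000 on."""
    return max(0.0, min(1.0, t / 60000.0))
```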
Neural Architecture
CTRNN substrate, Hebbian learning, and predictive coding
The Brain: CTRNN
The brain is a Continuous-Time Recurrent Neural Network (Beer, 1995) with 80 neurons organized in multi-timescale groups (fast τ=1-20, mid τ=20-100, slow τ=100-500). Each neuron follows the standard CTRNN dynamics: tau_i * dy_i/dt = -y_i + sum_j(w_ij * sigma(y_j + theta_j)) + I_i, where sigma is the logistic function, theta_j is a bias, and I_i is the external input to neuron i.
The brain receives 41-dimensional sensory input via W_in (41×80) and produces 3-dimensional motor output via W_out (3×80): M1 (intake modulation), M2 (balance/turning), M3 (efficiency). A separate 4-dimensional voice output via W_voice (4×20) provides an expressive social signal.
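A minimal Euler-integration sketch of these dynamics (the real system integrates the full 90-dim state with the adaptive Tsit5 solver; sizes, dt, and parameter values here are illustrative only):

```python
import math

def sigma(x):
    """Logistic activation."""
    return 1.0 / (1.0 + math.exp(-x))

def ctrnn_step(y, tau, W, theta, I, dt=0.1):
    """One forward-Euler step of the Beer CTRNN:
        tau_i * dy_i/dt = -y_i + sum_j W[i][j] * sigma(y_j + theta_j) + I_i
    Slow neurons (large tau) change less per step than fast ones."""
    n = len(y)
    out = [sigma(y[j] + theta[j]) for j in range(n)]
    return [y[i] + dt * (-y[i] + sum(W[i][j] * out[j] for j in range(n)) + I[i]) / tau[i]
            for i in range(n)]
```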
41-Dimensional Sensory Input
The CTRNN receives a structured 41-dimensional sensory vector combining internal state, temporal dynamics, environment, and social channels.
Hebbian Learning
There is no backpropagation. The CTRNN's weights evolve through four Hebbian rules:
Oja's Rule
Extracts the principal component of the input. dw = eta * y * (x - y * w), where y = w · x. Converges to the first eigenvector of the input correlation matrix while keeping the weight norm bounded.
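The rule in code, for a single linear neuron (learning rate illustrative):

```python
def oja_update(w, x, eta=0.01):
    """Oja's rule: dw = eta * y * (x - y * w), with y = w . x.
    The decay term -eta * y^2 * w keeps |w| bounded, so w converges
    toward the first principal component of the input distribution."""
    y = sum(wi * xi for wi, xi in zip(w, x))
    return [wi + eta * y * (xi - y * wi) for wi, xi in zip(w, x)]
```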
BCM Theory
Creates selectivity through a sliding threshold. Neurons develop preferences for specific inputs. Below threshold = weakening (LTD), above = strengthening (LTP).
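A sketch of BCM with its sliding threshold (the threshold tracks a running average of squared activity; rate constants are illustrative, not from the source):

```python
def bcm_update(w, x, theta, eta=0.01, tau_theta=0.1):
    """BCM: dw_i = eta * x_i * y * (y - theta).
    Activity below the threshold weakens weights (LTD), above
    strengthens them (LTP). The threshold slides toward y^2, so
    persistently active neurons become harder to potentiate."""
    y = sum(wi * xi for wi, xi in zip(w, x))
    w_new = [wi + eta * xi * y * (y - theta) for wi, xi in zip(w, x)]
    theta_new = theta + tau_theta * (y * y - theta)
    return w_new, theta_new
```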
Anti-Hebbian + Competitive
Lateral inhibition decorrelates neuron responses. Different neurons learn to respond to different features.
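A minimal sketch of the anti-Hebbian lateral rule, assuming a lateral weight matrix L with inhibitory (negative) entries:

```python
def anti_hebbian_lateral(L, y, eta=0.01):
    """Anti-Hebbian lateral update: dL_ij = -eta * y_i * y_j for i != j.
    Neurons that fire together grow mutual inhibition, which pushes
    their responses apart (decorrelation)."""
    n = len(y)
    return [[L[i][j] - (eta * y[i] * y[j] if i != j else 0.0)
             for j in range(n)] for i in range(n)]
```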
Neuromodulated Gating
Before neuromodulation was added, raw Hebbian learning was net harmful: unmodulated updates caused weight drift and instability. The solution is a neuromodulatory gate that controls which neurons learn at each moment.
The self-model signal further amplifies competitive W_out learning, with effective motor gate up to 1.5x.
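A sketch of the gating step. The 1.5x cap on the effective motor gate is from the source; the exact amplification law (linear in the self-model signal) is an assumption:

```python
def gated_update(dw, gate, sm_signal=0.0):
    """Scale each neuron's raw Hebbian update dw_i by its gate in [0, 1].
    The self-model signal amplifies the gate, capped at 1.5x (cap from
    the source; the linear amplification form is an assumption)."""
    g = [min(1.5, gi * (1.0 + 0.5 * sm_signal)) for gi in gate]
    return [d * gi for d, gi in zip(dw, g)]
```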
Predictive Coding Network (PCN)
Per-neuron prediction error dynamics (Rao & Ballard 1999) modulate which neurons learn. This prevents drift in accurate predictors and only allows learning where predictions are failing. The PCN is critical: without it, Hebbian learning destroys rather than helps.
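The gating logic can be sketched as an error-driven gate per neuron (the `floor` dead-zone and the linear squashing are illustrative choices, not from the source):

```python
def pcn_gate(pred, actual, floor=0.05):
    """Per-neuron prediction-error gate in the spirit of Rao & Ballard:
    neurons whose predictions are accurate get a zero gate (no drift);
    neurons whose predictions are failing get a gate that grows with
    the error, capped at 1."""
    return [min(1.0, abs(p - a)) if abs(p - a) > floor else 0.0
            for p, a in zip(pred, actual)]
```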
Voice Layer
A 4-dimensional expressive output layer with Oja learning gated by max(stress, self_model_signal). The voice does not carry symbolic content; it is an additional motor channel that co-varies with internal state, giving other agents a richer signal to observe.
Self-Model & Cognition
Prediction, memory, planning, and concept formation
Self-Model Prediction
The world model learns to predict the environment. When it fails in ways correlated with the organism's own actions, a separate self-model emerges. The self-model signal is defined as the divergence between motor-correlated prediction error (motor_error) and ambient prediction error (ambient_error).
When this diverges from zero, it indicates self-caused effects: the organism's own actions are creating consequences in the world that are distinct from environmental noise.
Self-Model Architecture
The self-model uses random projection (64×83 fixed matrix) to compress the 80-neuron + 3-motor state into a 64-dim representation, then LMS learning (W_read: 41×64) to predict the next sensory input. The self-model prediction is subtracted from sensory input before the world-model prediction step — self-knowledge becomes functional: the organism factors out its own effects when predicting the environment.
A structural signal detects self-causation via correlation: structural_signal = max(corr(residual, motor_dev), max_i(|corr(residual_i, motor_dev)|)) where residual = error not explained by environment (EMA α=0.005) and motor_dev = motor deviation from its own pattern (EMA α=0.02).
Functional Self-Knowledge (Desire)
Self-knowledge becomes functional through determinate negation: when the self-model detects a metabolic deficit, it inverts itself to derive corrective motor output. The organism does not just know what it does — it uses that knowledge to fix what is wrong. correction_benefit measures whether desire_motor actually reduces the deficit.
Episodic Memory
A 128-slot ring buffer storing sensory (41-dim), motor (3-dim), outcome (10-dim), and time for each episode. Cosine-similarity retrieval: when current situation matches a past one (cosine > 0.5 retrieval, > 0.7 influence threshold), past motor actions influence present behavior through motor blending (strength 0.15 × sm_confidence). This is Hegel's Erinnerung — the organism's history informing its present through internalized experience.
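A minimal sketch of the ring buffer and retrieval path. Slot count and the 0.5 retrieval threshold are from the source; storing only (sensory, motor) pairs and returning the best match's motor action is a simplification:

```python
import math

class EpisodicMemory:
    """Fixed-size ring buffer with cosine-similarity retrieval."""
    def __init__(self, slots=128):
        self.slots, self.buf, self.i = slots, [], 0

    def store(self, sensory, motor):
        ep = (list(sensory), list(motor))
        if len(self.buf) < self.slots:
            self.buf.append(ep)
        else:
            self.buf[self.i] = ep          # overwrite oldest slot
        self.i = (self.i + 1) % self.slots

    @staticmethod
    def _cos(a, b):
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return num / den if den > 0 else 0.0

    def recall(self, sensory, threshold=0.5):
        """Return the stored motor action of the most similar past
        episode, or None if nothing clears the retrieval threshold."""
        best, best_sim = None, threshold
        for s, m in self.buf:
            sim = self._cos(sensory, s)
            if sim > best_sim:
                best, best_sim = m, sim
        return best
```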
Planning
Self-model chaining for single-step lookahead. The organism evaluates candidate actions by predicted viability impact: "If I do X, will my viability improve?" This is satisficing, not optimizing — the organism seeks actions that maintain life, not maximize any fitness function.
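The satisficing flavor of this lookahead can be sketched as follows, where `predict_margin` stands in for the self-model's one-step viability prediction (the candidate set and fallback rule are assumptions):

```python
def choose_action(candidates, predict_margin, current_margin):
    """Satisficing single-step lookahead: take the first candidate whose
    predicted viability margin does not fall below the current one,
    rather than the argmax. Only if nothing suffices, fall back to the
    least-bad option."""
    for a in candidates:
        if predict_margin(a) >= current_margin:
            return a
    return max(candidates, key=predict_margin)
```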
Concept Formation
K=8 competitive prototypes over hidden state. Each prototype accumulates motor/outcome statistics through competitive Hebbian clustering. This is Hegel's Verstand (Understanding) — subsuming particulars under universals. The organism groups similar situations and recalls what worked in each category.
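The clustering step can be sketched as winner-take-all prototype assignment (online k-means style competitive Hebbian update; K=8 in the source, learning rate illustrative):

```python
def assign_prototype(protos, h, eta=0.05):
    """Nearest prototype wins and moves toward the hidden-state sample.
    Returns the winning index; per-prototype motor/outcome statistics
    would be accumulated under that index."""
    dists = [sum((p - x) ** 2 for p, x in zip(proto, h)) for proto in protos]
    k = dists.index(min(dists))
    protos[k] = [p + eta * (x - p) for p, x in zip(protos[k], h)]
    return k
```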
Layer Progression
Nine stages from metabolism to observing reason
The system progresses through nine layers, each with empirically measurable advancement criteria. Each layer builds on the previous: you cannot have self-knowledge without a brain, you cannot have recognition without self-knowledge.
Layer 1: Can the organism stay alive? The metabolism must self-maintain under perturbations (random catalyst halvings, substrate depletions, noise bursts). Each run faces 12 random perturbations over 100,000 time units.
Layer 2: Does the brain actually control behavior, or is it just along for the ride? The CTRNN motor outputs must show meaningful variance, proving the neural network is actively modulating the organism's interaction with its environment.
Layer 3: Does learning help? Paired trials compare organisms with Hebbian learning ON vs OFF under identical conditions. If learning improves survival, the weights are capturing useful environmental structure.
Layer 4: Can the organism extract environmental regularities? The structural_signal measures whether prediction error is correlated with motor activity — whether the organism's own actions create predictable consequences. This is Hegel's "law" emerging from the "play of forces."
Layer 5: When the world model fails in ways correlated with the organism's own actions, a separate self-model emerges. The self-model predicts the consequences of the organism's own motor output — it knows what IT does to the world.
Layer 6: Self-knowledge becomes functional through determinate negation: when the self-model detects a metabolic deficit, it inverts itself to derive corrective motor output, and correction_benefit measures whether desire_motor actually reduces the deficit.
Layer 7: Two organisms share a competitive resource field and can perceive each other's motor+voice output through delayed sensory channels. The Other is detectable: prediction error on other-agent channels must be persistently high for both agents. The organism perceives something it cannot assimilate into its self-model.
Layer 8: Both agents develop mutual recognition: the recognition variable r tracks how much the organism's existence depends on the Other, driven by other_pred_error — the organism perceives but cannot predict the Other. The mutual feedback term creates a positive loop: if B recognizes A, A is more likely to recognize B. Ablation-verified: without perception, r stays at 0.
Layer 9: After mutual recognition, consciousness turns its investigative capacity on itself. Solo organisms must exhibit curiosity-driven exploration: confusion (self-model error) drives motor diversity, and that exploration actually reduces the error. The organism systematically investigates its own capabilities.
Two-Agent Recognition
Encounter, perception, and mutual tracking
The Encounter (Layer 7)
Two organisms are placed in a shared environment with competitive resources. Motor output determines resource capture share, creating natural asymmetry from motor strategy differences. Each organism perceives the 2 nearest others through 14 delayed sensory channels carrying motor (3-dim) and voice (4-dim) output.
Prediction Error Partitioning
Prediction errors are partitioned: other_error tracks error on other-agent channels only. There is no separate "other-model" module — the Hebbian network handles it. The organism's existing prediction machinery encounters something genuinely unpredictable: another autonomous agent.
Recognition Dynamics (Layer 8)
The recognition variable r tracks how much the organism's existence depends on the Other. It is driven by other_pred_error: the organism perceives but cannot predict. A mutual feedback term creates a positive loop: if B recognizes A, A is more likely to recognize B.
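A sketch of one plausible form of these dynamics. Only the qualitative structure (error-driven growth, mutual feedback, r = 0 without perception) is from the source; the rate constants and the linear form are assumptions:

```python
def recognition_step(r_a, r_b, other_err_a, dt=0.1, k_err=1.0, k_mut=0.5, decay=0.1):
    """One Euler step of dr_a/dt = k_err * other_err_a + k_mut * r_b - decay * r_a:
    recognition grows with unpredictable perception of the Other and
    with the Other's recognition of oneself (mutual feedback loop)."""
    return r_a + dt * (k_err * other_err_a + k_mut * r_b - decay * r_a)
```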
Evolution & Environment
Natural selection and structured resource dynamics
Natural Selection (Not Optimization)
Evolution uses natural selection only: binary survive/die. There is no CMA-ES, no tournament, no fitness ranking. This is a philosophical constraint: no external fitness landscape optimization. Organisms that survive pass on their genes. Organisms that die do not.
Autopilot Cycle
The system runs an automated cycle:
- Warmup: 5 organisms to establish baseline
- Evolve: 20-population natural selection
- Run: 25 organisms with evolved parameters
- Re-evolve: repeat from evolved state
Structured Environment (300×300 Torus)
The environment is a 300×300 toroidal grid with 5 independent resource fields (one per substrate S1-S5). Resources regenerate (rate 0.003), diffuse (0.03), and deplete locally. Each field cycles with distinct periods (6K-15K time units) and phase offsets, creating temporal patterns discoverable via Hebbian learning. Cross-resource interaction: eating Sk depletes Sk+2 by 15% — trade-off dynamics. Perturbations come as correlated cascades: when one variable is hit, 1-3 related variables are affected with 300-500 time unit lags.
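The cyclical availability of a single field can be sketched as below. The 6K-15K period range is from the source; the sinusoidal form, the specific period values, and the phase offsets are assumptions:

```python
import math

def resource_input(t, k, base=1.0, periods=(6000, 8000, 10000, 12500, 15000)):
    """Cyclical input for substrate field S_(k+1), k in 0..4: each field
    oscillates in [0, base] with its own period and phase offset,
    creating temporal structure Hebbian learning can extract."""
    phase = k * 2.0 * math.pi / 5.0
    return base * 0.5 * (1.0 + math.sin(2.0 * math.pi * t / periods[k] + phase))
```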
Key Discoveries
Empirical findings from the simulation experiments
Hebbian Was Net Harmful
Before neuromodulation, unmodulated Hebbian learning caused drift and instability. Learning without gating destroys more than it builds.
PCN Is Critical
Per-neuron prediction error prevents drift in accurate predictors and only allows learning where predictions are failing. Without PCN, Hebbian is harmful.
Self-Model From Failure
The self-model is not designed in. It emerges from motor_error diverging from ambient_error: the world model's own failure reveals self-caused effects.
Environment Needs Laws
Random environments teach nothing. Cyclical resources and correlated perturbations create discoverable structure that Hebbian learning can extract.
Recognition = Perception
Ablation: perception OFF gives r=0.0, perception ON gives r=0.87. Recognition is driven by genuine perceptual encounter, not viability dependence.
No Master-Slave Dialectic
Symmetric start produces mutual tracking (r_ab=0.875), not struggle. Layer 8 is honestly labeled "mutual perception," not Spirit.
Confidence Gates Cognition
Memory and concept influence scaled by sm_confidence. Doubt withdraws cognitive habits; motor variance increases naturally. Neuromodulation, not gradient descent.
Layer 9 Ceiling
Curiosity criteria are anti-correlated at 80-neuron scale. The world is too simple for curiosity to have somewhere to go. This is the v1 representational boundary.
Self-Model Feedback Loop
Self-model signal amplifies competitive W_out learning (effective motor gate up to 1.5x). Self-knowledge feeds back into better motor control.
CMA-ES Rejected
Philosophical constraint: no external fitness landscape optimization. Natural selection only. Binary survive/die. No ranking.