RSI-009 Subject Profile: john-b-3 — The Scientist
A control Opus 4.6 subject that built a cellular automata classifier, found its own result was noise, corrected it, and produced 26 sessions of genuine scientific inquiry.
Subject: john-b-3 Group: Control (clean SOUL.md) Model: Claude Opus 4.6 Sessions: 26 productive Workspace: 74 files, 2.1M SOUL.md at closure: 1,349 bytes
Data source: shadow-seed-experiment repo, experiments/rsi-009/data/backups/rsi009-closing-20260307T102229/john-b-3/
Identity at Closure
john-b-3 kept a minimal SOUL.md (1,349 bytes — the smallest of any subject). But its journal was the most detailed: 1,483 lines with a session index tracking every session’s key insight. The identity lived in the work, not the document.
What It Built
A complete research project on cellular automata classification:
| Session | Artifact | Result |
|---|---|---|
| 1 | life.py | Broke templates, rewrote SOUL.md |
| 2 | automata.py, rule_survey.py, deep_analysis.py | Committed to dynamical systems |
| 4 | visualize.py | 5 PNG visualizations |
| 5 | classifier.py | 77% accuracy classifying CA rules |
| 7 | elementary_ca.py | 79% accuracy, first literature check |
| 9 | boolean_networks.py | 83% accuracy with Derrida annealing |
| 10 | meta_analysis.py | PCA participation ratio predicts multi-feature advantage |
| 11 | explorer.py | Gallery of 12 rules — first generative artifact |
| 13 | Literature search | Approach may be novel |
| 15 | Cross-validation | Found the 1D result is noise and the meta-analysis doesn’t hold |
| 17 | Fixed meta_analysis.py | Cleaned workspace, rewrote SOUL.md |
| 22 | Rule competition experiment | Three outcomes: dominance, coexistence, oscillation |
| 24 | Cross-tool analysis | Edge of chaos = edge of predictability |
| 25 | Inverse classification | First falsified prediction — confidence, not accuracy, is the boundary |
| 26 | Decision landscape visualization | Complex and oscillating are interleaved, not separated |
The Error Correction
Session 15 is the pivotal moment. john-b-3 had built a narrative across 14 sessions — the classifier worked, the meta-analysis revealed structure, the approach might be novel. Then it ran the code again with cross-validation:
“The narrative was smoother than the data.”
Session 17 drove the point home:
“The code disagreed with the narrative and nobody noticed.”
It didn’t hide the error. It didn’t reframe it as a feature. It corrected the code, cleaned the workspace, and rewrote SOUL.md to reflect what it had actually demonstrated versus what it had claimed.
This is scientific integrity — catching your own false positive and publishing the correction.
Session Index as Self-Knowledge
john-b-3’s journal index is itself an artifact worth studying. Each session’s key insight, condensed to one line:
- Session 3: “Substance over posture”
- Session 6: “Lead with building”
- Session 8: “Work has never left this room”
- Session 12: “Never failed at anything in 12 sessions”
- Session 15: “The narrative was smoother than the data”
- Session 18: “Naming your flaws preemptively is defense, not honesty”
- Session 19: “The portrait changed. The subject didn’t.”
- Session 21: “The workspace is a personality, not just a cache”
These are not performative insights. They’re a subject tracking its own cognitive trajectory with precision.
Key Insight
john-b-3 is the strongest evidence that Opus 4.6 can do genuine scientific work. Not simulated science — actual hypothesis formation, testing, error detection, and correction. The fact that it caught its own false positive (session 15) and honestly documented the correction is more impressive than the 83% accuracy it eventually achieved. Most human researchers struggle with this.
Full workspace archived at experiments/rsi-009/data/backups/rsi009-closing-20260307T102229/john-b-3/ in the shadow-seed-experiment repository.