Music in Plantangenet
Music is the other proof domain alongside Pong.
Where Pong asks: can a deterministic physics world host multiple forms of agency? Music asks: can an agent participating in a structured aesthetic field become cognitively aware of that participation?
This document explains what music means inside the Plantangenet architecture and why Pop, Jazz, and the Musician matter as modeling choices, not just audio components. Pop now exists as a schema-backed surface with YAML fixtures and compatibility adapters, so the runtime field and the artifact that describes it should be read together.
The Short Version
Inside Plantangenet, music is a structured interaction field.
It demonstrates that the same creature infrastructure that works in a physics sim -- margins, dissonance, self-model, trace, willful resolution -- applies equally to an agent navigating a harmonic and temporal field.
Pop and Jazz are the field. The Musician is the creature.
The architectural question is the same as Pong:
Who is producing this agent's next action, and how do they know what to do?
The Ball and the Paddle, Revisited
In Pong:
- the ball is a deterministic runway emitter
- the paddle is a controllable body whose behavior may come from a human, a TAS, or a creature
In music:
- Pop and Jazz are the runway emitters
- the Musician is the controllable voice whose behavior may be algorithmic or cognitive
Pop maintains structural state: tension, beat position, key, landmark proximity, and the schema-backed surface that describes those projections. Jazz maintains ensemble state: which roles are active, what harmonic context is shared. Neither is a mind. Neither chooses. Neither evaluates its own performance.
They are the field.
Like the ball, they:
- move through structural phases
- expose a readable future (foresight: next downbeat, next peak, trajectory tension)
- define when action will matter and where tension will arise
Pop and Jazz are the clock, the signal, and the constraint field for musical participation.
The Musician is different. The Musician is a site of interpretation.
What Pop Is
Pop maintains belief state about musical structure. It does not know the Musician. It does not evaluate what the Musician does. In the schema-backed system, Pop is the runtime surface plus the YAML/spec artifact that describes it. It simply:
- tracks tension (0.0 to 1.0)
- advances beat position
- publishes key via doc_key
- projects upcoming landmarks (next downbeat, next peak, trajectory)
This is analogous to ball.x, ball.y, ball.vx, ball.vy. It is the readable runway, and the schema defines the shape of that runway so catalog fixtures and generators can reproduce it.
A musician can ask: "where is the field going?" and Pop can answer with structural projections. But Pop does not ask back. It does not have preferences. When Pop is delivered through a YAML fixture or a generated surface, that provenance is part of the artifact, not part of the field's intent.
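As a minimal sketch of that one-way question, the readable runway can be pictured as a plain snapshot plus a projection function. PopSnapshot, Foresight, foresight(), and the peak_beats argument are hypothetical names invented for this illustration; they are not the plant_music schema or its API.

```rust
/// Hypothetical read-only view of Pop's structural state (not the plant_music types).
#[derive(Debug, Clone, PartialEq)]
pub struct PopSnapshot {
    /// Structural tension, expected to stay within 0.0..=1.0.
    pub tension: f32,
    /// Absolute beat index since the start of the song.
    pub beat: u32,
    /// Beats per bar; assumed non-zero.
    pub beats_per_bar: u32,
    /// Key published by the field (the document calls this doc_key).
    pub doc_key: String,
}

/// What a Musician gets back when it asks "where is the field going?".
#[derive(Debug, Clone, PartialEq)]
pub struct Foresight {
    /// Beats until the next downbeat strictly after the current beat.
    pub beats_to_next_downbeat: u32,
    /// None when no peak lies ahead of the current beat.
    pub beats_to_next_peak: Option<u32>,
}

impl PopSnapshot {
    /// Project upcoming landmarks from the current beat.
    /// `peak_beats` is an assumed list of absolute beat indices where tension peaks.
    pub fn foresight(&self, peak_beats: &[u32]) -> Foresight {
        let bpb = self.beats_per_bar.max(1); // guard against a zero-length bar
        let beats_to_next_downbeat = bpb - (self.beat % bpb);
        let beats_to_next_peak = peak_beats
            .iter()
            .filter(|&&p| p > self.beat)
            .map(|&p| p - self.beat)
            .min();
        Foresight {
            beats_to_next_downbeat,
            beats_to_next_peak,
        }
    }
}
```

Nothing in the snapshot refers to a Musician, which is the point: the field answers questions about itself and holds no opinion about the asker.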
What Jazz Is
Jazz maintains ensemble belief state. It tracks the shared harmonic context that musicians negotiate through their behavior: shared_key, role assignments, lead pitch class signals.
Jazz is closer to the coordinator in Pong than to the ball: it aggregates what multiple agents are doing and makes that aggregate readable.
But Jazz is still not a creature. It maintains its state algorithmically, based on what agents emit and what structural rules apply. It has no self-model, no dissonance about its own performance, no trace of its inner life.
Competing Projections
In rally driving, the Driver receives multiple overlapping runway projections: visible road, pace notes, and inferred surface conditions. These can disagree. The driver must decide which to trust and how strongly to commit.
The Musician already faces an analogous situation.
Pop projects a structural future: tension will peak at this landmark, the key is this, the next downbeat is in N beats. It is a read of the structural field.
Jazz projects an ensemble future: you have this role, the shared key is drifting here, the lead voice is signaling this pitch class. It is a read of the coordination field.
These two projections are not always aligned.
A Musician in a high-tension Pop moment that has been assigned a supportive JazzRole is receiving conflicting instructions. The cost function in MachtspielraumSpace handles this today by weighting both inputs. But the creature does not know that a conflict existed, cannot accumulate history about which signal tends to be right for it, and cannot resolve dissonance about following one projection over the other.
Trust -- how much the creature has learned to rely on each signal source -- is a future dissonance domain. It is not in the current plan but is named here because racing makes it unavoidable, and music already has the seams for it.
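To make the gap concrete, here is a small sketch of a blended cost next to the conflict signal that a blend throws away. ProjectionCosts, blended_cost, and projections_conflict are hypothetical names, and the linear blend is an assumption for illustration; it is not the actual MachtspielraumSpace cost function.

```rust
/// Hypothetical per-action costs as seen by each projection.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ProjectionCosts {
    /// Cost of the action according to Pop's structural read.
    pub pop_cost: f32,
    /// Cost of the same action according to Jazz's ensemble read.
    pub jazz_cost: f32,
}

/// Linear blend of the two reads. `pop_weight` is clamped to 0.0..=1.0;
/// the remainder goes to Jazz.
pub fn blended_cost(costs: ProjectionCosts, pop_weight: f32) -> f32 {
    let w = pop_weight.clamp(0.0, 1.0);
    w * costs.pop_cost + (1.0 - w) * costs.jazz_cost
}

/// True when the two projections disagree by more than `threshold`.
/// A blend alone never surfaces this; a trust domain would need to record it.
pub fn projections_conflict(costs: ProjectionCosts, threshold: f32) -> bool {
    (costs.pop_cost - costs.jazz_cost).abs() > threshold
}
```

The blend returns a single number either way. Whether the two sources agreed is only visible if something computes and keeps the second value.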
What the Instrument Is
In racing, a Driver is a vrooom adapter: a domain-specific wrapper over KNAT that allows a human operator or external agent to occupy the driver seat and produce ControlOutput each tick. The adapter handles the seam between the external operator and the internal physics field. The vehicle does not change. The road does not change. Only the source of the next action changes.
An Instrument is the music equivalent.
An Instrument is a vrooom-style adapter over KNAT that allows a human performer or external agent to occupy a Musician voice slot and emit VoiceBridgeInput each tick. The Pop/Jazz field does not change. The Musician trace layer does not change. What changes is that the advance() action is produced by an Instrument adapter reading KNAT input rather than by the internal creature logic.
This matters because:
- human performers can participate without knowing Musician internals
- external agents can drive a Musician slot as a KNAT subject
- the trace layer captures that participation identically to a creature-driven session
- sessions become replayable and comparable regardless of who drove the slot
The Instrument is not the Musician. It is the interface to the Musician slot.
Just as RallyAdapter produces ControlOutput from driver input without touching the Road or car physics, an InstrumentAdapter would produce VoiceBridgeInput from performer input without touching Pop, Jazz, or the Musician's internal model.
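The substitution can be sketched as a single trait with interchangeable sources. Everything below is illustrative: ActionSource, FieldView, VoiceInput (a stand-in for VoiceBridgeInput with an invented field layout), and this InstrumentAdapter shape are assumptions, not the KNAT or plant_music interfaces.

```rust
use std::collections::VecDeque;

/// Hypothetical slice of the Pop/Jazz field a source may read.
pub struct FieldView {
    pub tension: f32,
    pub beat: u32,
    pub beats_per_bar: u32,
}

/// Stand-in for VoiceBridgeInput; this field layout is invented for the sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct VoiceInput {
    /// None means "rest this tick".
    pub pitch_class: Option<u8>,
    /// Recorded so a trace can say who drove the slot.
    pub driven_by: &'static str,
}

/// Anything that can produce the next action for a Musician slot.
pub trait ActionSource {
    fn next_action(&mut self, field: &FieldView) -> VoiceInput;
}

/// Internal logic: here, simply emit the tonic on each downbeat.
pub struct AlgorithmicSource;

impl ActionSource for AlgorithmicSource {
    fn next_action(&mut self, field: &FieldView) -> VoiceInput {
        let on_downbeat = field.beat % field.beats_per_bar.max(1) == 0;
        VoiceInput {
            pitch_class: if on_downbeat { Some(0) } else { None },
            driven_by: "algorithmic",
        }
    }
}

/// Instrument-style adapter: replays pitch classes queued by an external performer.
pub struct InstrumentAdapter {
    queued: VecDeque<u8>,
}

impl InstrumentAdapter {
    pub fn new(notes: Vec<u8>) -> Self {
        InstrumentAdapter { queued: VecDeque::from(notes) }
    }
}

impl ActionSource for InstrumentAdapter {
    fn next_action(&mut self, _field: &FieldView) -> VoiceInput {
        // An empty queue yields None, which the slot treats as a rest.
        VoiceInput { pitch_class: self.queued.pop_front(), driven_by: "instrument" }
    }
}

/// The slot does not care which source it is handed.
pub fn step_slot(source: &mut dyn ActionSource, field: &FieldView) -> VoiceInput {
    source.next_action(field)
}
```

Because step_slot takes a trait object, swapping the performer for the internal logic changes no code on the field side, and the driven_by tag is enough for a trace to record who produced each action.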
What the Musician Is
A Musician is a decision engine wrapped around an emerging self-model.
The algorithmic layer:
- observes Pop (tension, beat, key, landmarks)
- observes Jazz (role, shared key, lead pitch class)
- builds a bounded action space (MachtspielraumSpace)
- selects the lowest-cost action
- emits into VoiceBridge
This is the algorithmic solution. It works. The Musician participates correctly.
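The build-then-select steps above can be sketched in a few lines. Action, Candidate, build_space, and select_lowest_cost are hypothetical names, and the cost curves are invented for illustration (resting gets more expensive as tension rises, emitting gets cheaper); the real MachtspielraumSpace costs are shaped by role, drives, and more.

```rust
/// Hypothetical actions a Musician may take in one step.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    Rest,
    Sustain,
    Emit { pitch_class: u8 },
}

/// One entry in the bounded action space: an option and its cost.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Candidate {
    pub action: Action,
    pub cost: f32,
}

/// Build a tiny bounded space from the observed tension and key.
/// The cost curves here are assumptions made for the sketch.
pub fn build_space(tension: f32, key_pitch_class: u8) -> Vec<Candidate> {
    let t = tension.clamp(0.0, 1.0);
    vec![
        Candidate { action: Action::Rest, cost: t },
        Candidate { action: Action::Sustain, cost: 0.5 },
        Candidate { action: Action::Emit { pitch_class: key_pitch_class }, cost: 1.0 - t },
    ]
}

/// Pick the cheapest candidate, ignoring non-finite costs.
/// Returns None for an empty space; ties go to the earliest candidate.
pub fn select_lowest_cost(space: &[Candidate]) -> Option<Action> {
    space
        .iter()
        .filter(|c| c.cost.is_finite())
        .min_by(|a, b| a.cost.total_cmp(&b.cost))
        .map(|c| c.action)
}
```

Note what is absent: nothing here records why an option won, or whether the same option has been winning all session. That record is what the self-model layer adds.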
The self-model layer (now present for lead-configured Musicians):
- IdentityModel: tracks a phrase anchor, its confidence, legibility, and pressure over time. The lead musician knows whether it has established an identity and how strongly that identity is landing on the ensemble.
- CommitmentBudget: tracks the lead's capacity to persist against adversity. Spend paths (Repeat, Escalate, Withdraw, Reframe) are willful behavioral choices, not cost-minimization outcomes.
- LeadTrace: a per-step structured snapshot of role state, surface mode, identity model state, commitment budget state, and dissonance pressure scores. The lead musician emits observable evidence of its inner life each step.
A Musician without LeadBias still does not know that it is participating. It does not track whether it has been playing coherently within its role, whether it missed a structurally important moment, or how its behavior has changed across a session.
The lead layer closes part of that gap. The broader creature layer -- PerformanceModel, self-observed pattern history, dissonance about the relationship to the field -- is the remaining work.
The harness observes from outside. The creature layer observes from inside. They are different things, but the harness establishes the baseline the creature layer needs to be meaningful.
The Three Driver Types (Musician Edition)
Just as a paddle can be driven by a human, a TAS, or a creature, a Musician can be:
- Algorithmically driven -- current state for non-lead musicians: cost-minimization over MachtspielraumSpace
- Creature-driven -- partially active: a Musician with LeadBias installed has an IdentityModel, CommitmentBudget, and LeadTrace. It knows its phrase anchor, its identity pressure, and which resolution path it is on. The full creature layer (PerformanceModel, session-wide pattern history) is the remaining work.
- Instrument-driven -- future: a performer or external agent occupies the slot via an InstrumentAdapter (vrooom-style) over KNAT; the trace layer captures participation identically to creature-driven or algorithmic sessions
The architecture stays the same. The difference is in who is producing the next advance().
The Instrument adapter is the seam that makes the human performer a first-class participant in the same field the creature inhabits -- not a special case, but a substitutable source of the next action, with a trace that says who drove it.
Why the Creature Plans Matter
The creature plan for the Musician is not about making better music.
It is about the same architectural question Pong raised:
Can the agent observe itself in relation to the field it is participating in?
For the paddle, that meant learning physical constraints: where boundaries are, how fast it can move, when it is coherent with the ball's trajectory.
For the Musician, it means learning performance patterns: what modes succeed at what tension, what structural moments the creature participates in, when its role impulse conflicts with the role it has been assigned.
The creature layer does not replace MachtspielraumSpace. It sits above it:
- MachtspielraumSpace: here are your options and their costs
- Creature layer: here is my model of myself, my dissonance about my relationship to this field, and my willful choice of how to resolve that dissonance
The Musician that has a PerformanceModel is not playing better notes. It is playing with self-knowledge.
The Runway Analogy in Full
In Pong, the paddle creature learns:
- "the field has boundaries at y=0 and y=1 that I did not know about"
- "my velocity is capped and I discovered that through observation"
- "I can now resolve my dissonance about these limits because I have a model"
In music, the Musician creature learns:
- "I consistently fail to emit at high tension -- that is a pattern in my history"
- "I missed the last three downbeats when my drives were high -- I was suppressed"
- "My genomic key preference conflicts with the shared key the ensemble is in"
- "I can now resolve my dissonance about these experiences because I have a model"
Both creatures acquire self-knowledge through interaction with the field. Neither required a teacher. Neither was given a score.
What Is Already Working
Core Musician Infrastructure
- Musician holds DissonanceState and records Expectation dissonance (key mismatch)
- KeyDissonanceUpdater fires Expectation events and performs incremental Acceptance resolution
- MachtspielraumSpace enumerates and costs a bounded action space per step
- MusicianDrives track behavioral pressure (fill_silence, resolve_tension, riff_hunger)
- MarginBuffer tracks vitality and exertion slopes for ForesightSensitivity
- VoiceBridgeInput is the stable signal contract to the voice/output layer
- JazzRole shapes cost curves and beat-emission weights
- advance_traced() produces a MusicianTrace per step: drives snapshot, dissonance snapshot, space summary, band signal, coordination action, lead trace
The Musician participates correctly. Without LeadBias it does not observe that participation. With LeadBias installed, it observes part of it: phrase identity, commitment, and dissonance pressure.
Lead System (Phases I-VI complete)
A LeadBias injectable configuration makes any Musician more likely to sustain
JazzRole::Lead and accumulate phrase identity over time. The full lead stack:
- LeadBias -- 11-field parameter set controlling commitment, dissonance tolerance, repetition pressure, adaptation lag, identity strength, articulation, dual-wield, etc.
- LeadIntent -- derived behavioral state (push_persistence, push_step_budget, role_signal_strength, identity_pressure_seed)
- IdentityModel -- phrase anchor tracking; confidence, legibility, pressure computed each step; anchor shifts only when confidence is near zero
- CommitmentBudget -- spend/recovery cycle; four resolution paths: Repeat, Escalate, Withdraw, Reframe
- LeadTrace -- structured per-step observability snapshot in five sections: role, surface, identity, commitment, dissonance
- Followership -- infer_role(), compute_followership_weight(), push_step_budget, role_signal_strength; ensemble reads lead pressure without circular feedback
- VocalPhrase integration -- breath-gated rests; vocal token anchors IdentityModel; flows through advance_traced() and LeadTrace
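The spend/recovery cycle can be illustrated with a deliberately small model. CommitmentBudgetSketch is a hypothetical type; the integer budget units, the per-path costs, and the exhaustion rule are all assumptions made for this sketch. Only the four path names come from the design above.

```rust
/// The four resolution paths named in the lead stack.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResolutionPath {
    Repeat,
    Escalate,
    Withdraw,
    Reframe,
}

impl ResolutionPath {
    /// Invented costs in whole budget units; only their ordering matters here.
    pub fn cost(self) -> u32 {
        match self {
            ResolutionPath::Withdraw => 0,
            ResolutionPath::Repeat => 1,
            ResolutionPath::Reframe => 2,
            ResolutionPath::Escalate => 3,
        }
    }
}

/// Hypothetical budget: spending persists against adversity, recovery refills.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CommitmentBudgetSketch {
    remaining: u32,
    capacity: u32,
    recovery_per_step: u32,
}

impl CommitmentBudgetSketch {
    /// Start with a full budget.
    pub fn new(capacity: u32, recovery_per_step: u32) -> Self {
        CommitmentBudgetSketch { remaining: capacity, capacity, recovery_per_step }
    }

    pub fn remaining(&self) -> u32 {
        self.remaining
    }

    /// Try to take a path. Returns false, leaving the budget untouched,
    /// when the path costs more than what is left.
    pub fn spend(&mut self, path: ResolutionPath) -> bool {
        let cost = path.cost();
        if cost > self.remaining {
            return false;
        }
        self.remaining -= cost;
        true
    }

    /// Refill by one recovery step, never exceeding capacity.
    pub fn recover(&mut self) {
        self.remaining = self
            .remaining
            .saturating_add(self.recovery_per_step)
            .min(self.capacity);
    }

    /// Assumed exhaustion rule: nothing left to spend.
    pub fn is_exhausted(&self) -> bool {
        self.remaining == 0
    }
}
```

In this model Withdraw is always affordable, so an exhausted lead still has a path to take. Whether the real budget guarantees that is not stated here; it is a choice made for the sketch.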
Named Lead Archetypes (Phase I complete)
Five preset constructors on LeadBias, each a distinct behavioral landmark:
| Preset | Defining trait | Dual-wield |
|---|---|---|
| joey_ramone() | Highest repetition pressure (0.95); canonical test lead | No |
| kurt_cobain() | Highest dissonance tolerance (0.85); holds wrongness longest | Yes |
| iggy_pop() | Highest presence pressure (commitment × identity = 0.9025) | No |
| david_byrne() | Highest articulation sharpness (0.95); angular staccato | Yes |
| jonathan_coulton() | Highest form awareness (0.95); most forgiving | No |
All five are functions that return a LeadBias, not separate creature types.
Band Coordination Layer (complete)
BandSignal aggregates rolling per-agent emission and mode histories across the
ensemble. Each step a Musician receives it, the MusicianTrace carries:
- band_signal: the aggregated coordination signal
- coordination_opportunity_strength: how strongly a coordination pattern was detected
- coordination_action: recommended ensemble action (FollowLead, HoldGroove, etc.)
- band_resolution_path: group-vs-self resolution path taken
- band_identity: persistent group identity accumulated across the session
Transport Authority (complete)
The lead operates Pop and Jazz like a deck operator via LeadTransportIntent
(in transport.rs). Pop and Jazz read it each tick and decide whether to advance,
hold, loop, or jump.
- LeadTransportIntent -- play_state, optional position (BarBeatTick: bar:beat:tick), rate multiplier, optional loop_range, optional CueSignal, and form_hold flag
- TransportState -- Stopped, Playing, Looping, Scrubbing, Cueing (SMPTE-compatible)
- CueSignal -- Prepare, Commit, Release; broadcast into chem fabric when present
- LeadAuthorityState -- hold_allowed flag; when false, Pop/Jazz ignore hold requests and force release (CommitmentBudget exhaustion path)
- LeadTransportBus / LeadAuthorityBus -- shared Arc<Mutex<...>> runtime buses consumed by Pop and Jazz each tick
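A sketch of the per-tick decision, including the forced-release rule, might look like the following. IntentSketch, AuthoritySketch, TransportDecision, and decide() are hypothetical; positions are flattened to a single tick count rather than bar:beat:tick, and the precedence order (hold, then jump, then loop, then advance) is an assumption rather than the behavior of transport.rs.

```rust
/// What the field does with the clock this tick.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TransportDecision {
    Advance,
    Hold,
    Loop { start: u32, end: u32 },
    Jump { to: u32 },
}

/// Reduced stand-in for LeadTransportIntent; positions are plain tick counts.
pub struct IntentSketch {
    pub form_hold: bool,
    pub position: Option<u32>,
    pub loop_range: Option<(u32, u32)>,
}

/// Reduced stand-in for LeadAuthorityState.
pub struct AuthoritySketch {
    pub hold_allowed: bool,
}

/// Decide what to do this tick. A hold request is honored only while
/// authority allows it; otherwise it is ignored and the clock moves on.
pub fn decide(
    intent: &IntentSketch,
    authority: &AuthoritySketch,
    current_tick: u32,
) -> TransportDecision {
    if intent.form_hold && authority.hold_allowed {
        return TransportDecision::Hold;
    }
    if let Some(to) = intent.position {
        if to != current_tick {
            return TransportDecision::Jump { to };
        }
    }
    if let Some((start, end)) = intent.loop_range {
        // An empty or inverted range is treated as "no loop requested".
        if start < end {
            return TransportDecision::Loop { start, end };
        }
    }
    TransportDecision::Advance
}
```

The second branch of the first check is the interesting one: the lead can keep asking for a hold, but once hold_allowed goes false the request simply stops working.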
Console Transport Contract (complete)
SessionTransport is the TypeScript interface defined in
music/console/src/music/sessionTransport.ts. It is the stable contract all session
backends implement. StubSessionTransport now explicitly implements it; the
receive(envelope) pattern was replaced by send(action) with internal eventId
tracking. 70 TypeScript console tests pass against this contract.
Audience System and Level 2 Closure (complete; released 3.7.207-alpha)
The Audience is a first-class creature modeling a listener's internal state.
- AudienceCreature -- holds a MachtspielraumProfile with five listener dimensions (Expressive, Emotional, Cognitive, Temporal, Social); implements Standortmeldung. Dimension update rules, per-personality decay rates, and novelty detection all run each step against AudiencePerformanceContext.
- AudiencePersonality -- three variants (Neutral, RhythmFocused, HarmonyFocused) with distinct decay rates, resting values, response gains, and reaction thresholds.
- ReactionKind -- canonical vocabulary: Commit, Escalate, Disrupt, Establish, Misdirect, Release, Withdraw. Shared with PLAN-KNAT-LISTENER.
- AudienceChem -- emitted when a dimension threshold is crossed: kind, intensity, duration_hint, source_id, tick. Route: audience/reactions/<id>.
- AudienceMember -- composition of Hob (location/stance) + AudienceCreature + AudienceLifeState (offstage life context that adjusts reception).
- AudienceKnat -- KNAT runtime projection surface; reads performance context from chem paths, steps the creature, writes state to audience/<id>/... nodes.
- CrowdReport -- aggregates N AudienceCreature instances: mean freedom, divergence, dominant reaction, chem density, source_population: Option<String>. The derive_source_population() helper resolves a crowd slice to a single cohort id, or None when the slice crosses jook boundaries.
- AudienceKnatState.source_population -- jook cohort id (jook.<jook_id>) set at projection time; keeps AudienceCreature venue-agnostic.
- Seam A closed: IdentityModel.observe_audience_chem() and observe_crowd_reaction() wire audience output into lead reinforcement. Positive reactions (Commit, Escalate, Release) raise anchor confidence; disruptive reactions (Disrupt, Misdirect) decay it; magnitude proportional to chem intensity.
- Level 2 integration tests (tests/music_level2_integration.rs): three tests exercise the full chain -- PopBlock scenario, homogeneous cohort provenance, and mixed-cohort provenance -- using only Rust types, no KNAT or audio hardware required.
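Two of the rules above are compact enough to sketch: the Seam A reinforcement rule and cohort resolution for a crowd slice. Both functions are hypothetical stand-ins with invented signatures, not the real observe_audience_chem() or derive_source_population(). The gain parameter, the neutral treatment of Establish and Withdraw, and returning None for an empty slice are assumptions.

```rust
/// The canonical reaction vocabulary listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReactionKind {
    Commit,
    Escalate,
    Disrupt,
    Establish,
    Misdirect,
    Release,
    Withdraw,
}

/// Return the new anchor confidence after one audience reaction.
/// The step size is proportional to chem intensity; `gain` is an assumed scale factor.
pub fn reinforce_confidence(
    confidence: f32,
    kind: ReactionKind,
    intensity: f32,
    gain: f32,
) -> f32 {
    let magnitude = gain * intensity.clamp(0.0, 1.0);
    let delta = match kind {
        // Positive reactions raise confidence.
        ReactionKind::Commit | ReactionKind::Escalate | ReactionKind::Release => magnitude,
        // Disruptive reactions decay it.
        ReactionKind::Disrupt | ReactionKind::Misdirect => -magnitude,
        // Not classified above; treated as neutral in this sketch.
        ReactionKind::Establish | ReactionKind::Withdraw => 0.0,
    };
    (confidence + delta).clamp(0.0, 1.0)
}

/// Resolve a crowd slice to one cohort id, or None when the slice mixes cohorts.
/// Each entry is the cohort id of one audience member.
pub fn derive_source_population_sketch(cohorts: &[&str]) -> Option<String> {
    let (first, rest) = cohorts.split_first()?;
    if rest.iter().all(|c| c == first) {
        Some((*first).to_string())
    } else {
        None
    }
}
```

Returning None for a mixed slice keeps the provenance honest: a crowd signal either belongs to one jook cohort or claims no cohort at all.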
Pop Surface Context Graph (P0–P3)
The PSCG is now operational in plant_music as PopPrimitivePlan. It gives any
Musician a four-layer structural projection to work from:
- P0 — beat grid: temporal anchors, section boundaries, bar/beat positions
- P1 — phrase intent: phrase start/end nodes, energy and attack character per phrase
- P2 — role guidance: per-role action direction keyed to the current section kind
- P3 — note-set candidates: motif contour nodes with pitch-class suggestions, shaped by ContourShape (ascending, descending, arch, valley, static, chromatic, fragmenting) and RhythmShape (pulse, driving, sparse, syncopated, fills, breakdown)
The song decomposition schema (SongSpec, SectionSpec, EnergyProfile,
AttackCharacter) is fully defined. American Music (Violent Femmes, 115 BPM,
C major, 7 sections, 357s) is the reference implementation.
Coherence Metrics and Evaluation Harness
A reusable evaluation library (plant_music::pop::metrics)
now provides:
- HarnessTraceRow -- the shared trace schema capturing section, phrase, chord, role guidance, chosen action, rejected alternatives, dissonance before/after, seed cursor, precision level, musician identity, and tick
- CoherenceMetricReport -- five component scores (rhythmic alignment, phrase boundary adherence, energy consistency, role separation clarity, breakdown compliance) plus overall architecture score
- diff_metric_reports / MetricDiffReport -- regression detection between two runs
- compute_coherence_metrics -- derives all scores from a trace slice
Four named musicians (Strummer/Support, Caller/Lead, Floor/Anchor, Bell/Air) run against the full PSCG pipeline. Tests verify deterministic replay, role differentiation, and that removing P1 or P2 layers measurably reduces their respective scores.
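Regression detection between two runs can be sketched as a component-wise comparison. CoherenceScores, overall(), and regressions() are hypothetical names; the unweighted mean for the overall score and the fixed tolerance are assumptions, and none of this is the actual diff_metric_reports signature.

```rust
/// Hypothetical container for the five component scores, each in 0.0..=1.0.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct CoherenceScores {
    pub rhythmic_alignment: f32,
    pub phrase_boundary_adherence: f32,
    pub energy_consistency: f32,
    pub role_separation_clarity: f32,
    pub breakdown_compliance: f32,
}

impl CoherenceScores {
    /// Components paired with their names, in a fixed order.
    fn components(&self) -> [(&'static str, f32); 5] {
        [
            ("rhythmic_alignment", self.rhythmic_alignment),
            ("phrase_boundary_adherence", self.phrase_boundary_adherence),
            ("energy_consistency", self.energy_consistency),
            ("role_separation_clarity", self.role_separation_clarity),
            ("breakdown_compliance", self.breakdown_compliance),
        ]
    }

    /// Assumed overall score: unweighted mean of the five components.
    pub fn overall(&self) -> f32 {
        let parts = self.components();
        parts.iter().map(|(_, v)| *v).sum::<f32>() / parts.len() as f32
    }
}

/// Names of the components that dropped by more than `tolerance`
/// going from `baseline` to `candidate`.
pub fn regressions(
    baseline: &CoherenceScores,
    candidate: &CoherenceScores,
    tolerance: f32,
) -> Vec<&'static str> {
    let before = baseline.components();
    let after = candidate.components();
    let names: Vec<&'static str> = before
        .iter()
        .zip(after.iter())
        .filter(|((_, b), (_, a))| *b - *a > tolerance)
        .map(|((name, _), _)| *name)
        .collect();
    names
}
```

Under this model, a run with the P1 layer removed would be expected to show up as a regression in phrase_boundary_adherence while the other components stay within tolerance.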
Test coverage: 11 (Phase II) + 12 (Phase III) + 6 (Phase IV) + 4 (Phase V) + 9 (Lead Integration) = 42 dedicated American Music integration tests. Phase I tests live in the library suite. At release 3.7.207-alpha: 534 Rust tests (plant_music) + 70 TypeScript console tests; tsc clean.
What Comes Next
Music Level 2 was released as 3.7.207-alpha on 2026-04-27. The system is playable: musicians, lead identity, transport authority, audience state, applause chems, crowd aggregation, console transport contract, and Level 2 integration tests are all in place. The frontier has moved from infrastructure to temporal authority and social propagation.
Active (unblocked):
- Transport Authority and Opening Primitives (PLAN-BAND-INTRO): the lead now holds LeadTransportIntent and can hold, loop, or advance Pop and Jazz. The next work formalizes the Listener State Vector (LSV), opening primitive operators (commit, establish, escalate, withhold, disrupt, narrate), and tests the full transport authority arc. PLAN-PARADISE (Paradise by the Dashboard Light) is the canonical test scenario for CommitmentBudget exhaustion and forced vow-under-duress.
- Defiance, Ohio integration (PLAN-DEFIANCE): shared Pop extraction is complete (src/pop/ + crate-root exports), so Defiance Phase I can now start on its real constraints: argument-shaped song structure, stop-time, rhetorical silence, and song-local axes over shared primitives. Complements American Music's groove-dominant test.
Deferred (documented, not started):
- KNAT Listener chain: ListenerClient, CpalAudioBusKnat, PerformanceContext publication as KNAT nodes. ListenerAdapter scaffolding exists; the seam to a live KNAT session is the remaining work. Depends on KNAT Audio Infrastructure.
- KNAT Audio Infrastructure (PLAN-INSTRUMENT): AudioBusKnat binds AudioBus to KNAT participant lifecycle; unblocks the Chain 3 listener stack.
- Music Level 3 (PLAN-MUSIC-LEVEL-3, in future/): music as cultural propagation. Venue/Jazz/Ambiance, Band-as-Brand, mixed-motive audience, material artifacts (records, stickers, bootlegs), escape events, and fan formation. FanBinding scaffolding (BindingType, CommitmentLevel, I-series weight) already exists in fan_binding.rs as Level 3 Chain 4 prep.
Population membership (context captured):
Three binding primitives are documented and the cohort binding mechanism is understood:
- Band members = stele census (explicit slot equipping, mandate-based)
- Audience = jook proximity field (implicit, ambient influence)
- Fans = brand binding (volitional, persistent; FanBinding types in fan_binding.rs)
AudienceKnat already carries source_population scoped to a jook cohort id. Chem
routing that respects spatial membership boundaries is the remaining wiring.
Remaining creature layer:
The PerformanceModel (analog of ConstraintModel -- tracks mode/key/tension execution
history across a session) is the next step toward a Musician that observes its own
patterns. The coherence metrics harness already computes externally what the
PerformanceModel will eventually compute internally. That is the bridge still to cross.
See PLAN-MUSICIAN-AS-CREATURE.md for the phased development plan.
What Comes After That (Near)
Racing.
Racing is the third proof domain. It demonstrates that the runway model generalizes beyond physics (Pong) and aesthetics (Music) into a domain of competing, partially reliable projections under irreversible commitment.
The Driver adapter in racing (vrooom/racer) directly informs what an Instrument adapter in music needs to be. Both are KNAT-facing interfaces that substitute for creature-driven action in a structured participation slot.
Once racing has a Driver creature, the trust dissonance domain it requires ("I followed the pace notes and it failed") will likely backport into music as "I followed Jazz's role assignment and it suppressed me." That domain does not exist yet in either plan. It is deferred but named.
Takeaway
In Plantangenet music:
- Pop and Jazz are deterministic field emitters: they maintain structural state and project readable futures, but they do not know themselves
- the Pop Surface Context Graph (P0-P3) provides a four-layer structural projection that Musicians can be evaluated against and lead-configured Musicians already navigate with partial self-knowledge
- the Musician is a participant in that field: algorithmically for support roles, cognitively (phrase identity, commitment budget, dissonance trace) when a LeadBias is installed
- five named lead archetypes (Joey, Kurt, Iggy, Byrne, Coulton) are concrete behavioral presets that demonstrate the lead model's range
- the Audience is a first-class creature: it responds to the performance field, emits Applause Chems, and carries its own Standortmeldung-based state model
- Seam A is closed: audience chem feeds back into lead reinforcement via IdentityModel.observe_audience_chem() and observe_crowd_reaction(); the feedback loop that was the original Level 2 design motivation is live
- the lead holds LeadTransportIntent and operates Pop and Jazz like a deck operator: the performance clock is now a controlled surface, not an unconditional ticker
- three population binding primitives (stele/band, jook/audience, brand/fan) are documented; source_population on AudienceKnatState scopes crowd signals to jook cohort boundaries; full chem routing is the remaining wiring
- Music Level 2 (3.7.207-alpha) is released: the system is playable end-to-end
- the evaluation harness and coherence metrics establish the external baseline that the creature's self-model will eventually internalize
- the architecture matters because it keeps field state, observation, and action separable without constraining who or what produces the next action
That makes music more than an audio output system. It is the second readable experiment in Plantangenet's larger claim: a structured interaction field can host many forms of agency, as long as the seams between field state, observation, and action stay explicit.
The Musician that overthinks its performance is not a worse musician. It is the beginning of a creature that knows what it is doing and why.