Music in Plantangenet

Music is the other proof domain alongside Pong.

Where Pong asks: can a deterministic physics world host multiple forms of agency? Music asks: can an agent participating in a structured aesthetic field become cognitively aware of that participation?

This document explains what music means inside the Plantangenet architecture and why Pop, Jazz, and the Musician matter as modeling choices, not just as audio components. Pop now exists as a schema-backed surface with YAML fixtures and compatibility adapters, so the runtime field and the artifact that describes it should be read together.

The Short Version

Inside Plantangenet, music is a structured interaction field.

It demonstrates that the same creature infrastructure that works in a physics sim -- margins, dissonance, self-model, trace, willful resolution -- applies equally to an agent navigating a harmonic and temporal field.

Pop and Jazz are the field. The Musician is the creature.

The architectural question is the same as Pong:

Who is producing this agent's next action, and how do they know what to do?

The Ball and the Paddle, Revisited

In Pong:

  • the ball is a deterministic runway emitter
  • the paddle is a controllable body whose behavior may come from a human, a TAS, or a creature

In music:

  • Pop and Jazz are the runway emitters
  • the Musician is the controllable voice whose behavior may be algorithmic or cognitive

Pop maintains structural state: tension, beat position, key, landmark proximity, and the schema-backed surface that describes those projections. Jazz maintains ensemble state: which roles are active, what harmonic context is shared. Neither is a mind. Neither chooses. Neither evaluates its own performance.

They are the field.

Like the ball, they:

  • move through structural phases
  • expose a readable future (foresight: next downbeat, next peak, trajectory tension)
  • define when action will matter and where tension will arise

Pop and Jazz are the clock, the signal, and the constraint field for musical participation.

The Musician is different. The Musician is a site of interpretation.

What Pop Is

Pop maintains belief state about musical structure. In the schema-backed system, Pop is the runtime surface plus the YAML/spec artifact that describes it. It does not know the Musician, and it does not evaluate what the Musician does. It simply:

  • tracks tension (0.0 to 1.0)
  • advances beat position
  • publishes key via doc_key
  • projects upcoming landmarks (next downbeat, next peak, trajectory)

This is analogous to ball.x, ball.y, ball.vx, ball.vy. It is the readable runway, and the schema defines the shape of that runway so catalog fixtures and generators can reproduce it.

A musician can ask: "where is the field going?" and Pop can answer with structural projections. But Pop does not ask back. It does not have preferences. When Pop is delivered through a YAML fixture or a generated surface, that provenance is part of the artifact, not part of the field's intent.
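
As a rough illustration of what "readable runway" means, here is a minimal Rust sketch of a Pop-like snapshot that can answer "when is the next downbeat?". The PopSnapshot type, every field except the doc_key name, and both methods are hypothetical stand-ins chosen for this example; they are not the plant_music API.

```rust
/// Hypothetical read-only view of Pop's structural state (not the real type).
#[derive(Debug, Clone, PartialEq)]
pub struct PopSnapshot {
    /// Structural tension in 0.0..=1.0.
    pub tension: f32,
    /// Absolute beat counter; beat 0 is assumed to fall on a downbeat.
    pub beat: u32,
    /// Beats per bar; must be non-zero.
    pub beats_per_bar: u32,
    /// Published key, mirroring the doc_key name used in the text.
    pub doc_key: String,
}

impl PopSnapshot {
    /// Beats until the next downbeat; a full bar when already on one.
    pub fn beats_to_next_downbeat(&self) -> u32 {
        self.beats_per_bar - (self.beat % self.beats_per_bar)
    }

    /// The field one beat later. Nothing about the Musician is consulted.
    pub fn advanced(&self) -> PopSnapshot {
        PopSnapshot { beat: self.beat + 1, ..self.clone() }
    }
}
```

The point of the sketch is the direction of the dependency: a Musician can read the snapshot, but nothing in the snapshot reads the Musician.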

What Jazz Is

Jazz maintains ensemble belief state. It tracks the shared harmonic context that musicians negotiate through their behavior: shared_key, role assignments, lead pitch class signals.

Jazz is closer to the coordinator in Pong than to the ball: it aggregates what multiple agents are doing and makes that aggregate readable.

But Jazz is still not a creature. It maintains its state algorithmically, based on what agents emit and what structural rules apply. It has no self-model, no dissonance about its own performance, no trace of its inner life.

Competing Projections

In rally driving, the Driver receives multiple overlapping runway projections: visible road, pace notes, and inferred surface conditions. These can disagree. The driver must decide which to trust and how strongly to commit.

The Musician already faces an analogous situation.

Pop projects a structural future: tension will peak at this landmark, the key is this, the next downbeat is in N beats. It is a read of the structural field.

Jazz projects an ensemble future: you have this role, the shared key is drifting here, the lead voice is signaling this pitch class. It is a read of the coordination field.

These two projections are not always aligned.

A Musician that has been assigned a supportive JazzRole during a high-tension Pop moment is receiving conflicting instructions. The cost function in MachtspielraumSpace handles this today by weighting both inputs. But the creature does not know that a conflict existed, cannot accumulate history about which signal tends to be right for it, and cannot resolve dissonance about following one projection over the other.
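
To make "weighting both inputs" concrete, here is a small Rust sketch of a blended cost. Everything in it is an assumption made for illustration: the Action enum, the intensity values, the cost formulas, and the weights are invented and do not describe the actual MachtspielraumSpace cost function.

```rust
/// Hypothetical action set, ordered by how loudly the voice asserts itself.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    Rest,
    Comp,
    Solo,
}

impl Action {
    /// Illustrative intensity of each action in 0.0..=1.0.
    fn intensity(self) -> f32 {
        match self {
            Action::Rest => 0.0,
            Action::Comp => 0.5,
            Action::Solo => 1.0,
        }
    }
}

/// Blend the two projections into one number.
/// Pop's cost is lowest when intensity matches tension; a supportive
/// Jazz role makes intensity itself costly.
pub fn blended_cost(
    action: Action,
    pop_tension: f32,
    supportive_role: bool,
    pop_weight: f32,
    jazz_weight: f32,
) -> f32 {
    let intensity = action.intensity();
    let pop_cost = (pop_tension - intensity).abs();
    let jazz_cost = if supportive_role { intensity } else { 1.0 - intensity };
    pop_weight * pop_cost + jazz_weight * jazz_cost
}
```

With equal weights, a supportive Musician at tension 0.9 finds Solo more expensive than Comp; shift the weight toward Pop and the ordering flips. In both cases only a number comes out, which is exactly the limitation described here: the disagreement between the two projections is resolved without ever being recorded.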

Trust -- how much the creature has learned to rely on each signal source -- is a future dissonance domain. It is not in the current plan but is named here because racing makes it unavoidable, and music already has the seams for it.

What the Instrument Is

In racing, a Driver is a vrooom adapter: a domain-specific wrapper over KNAT that allows a human operator or external agent to occupy the driver seat and produce ControlOutput each tick. The adapter handles the seam between the external operator and the internal physics field. The vehicle does not change. The road does not change. Only the source of the next action changes.

An Instrument is the music equivalent.

An Instrument is a vrooom-style adapter over KNAT that allows a human performer or external agent to occupy a Musician voice slot and emit VoiceBridgeInput each tick. The Pop/Jazz field does not change. The Musician trace layer does not change. What changes is that the advance() action is produced by an Instrument adapter reading KNAT input rather than by the internal creature logic.

This matters because:

  • human performers can participate without knowing Musician internals
  • external agents can drive a Musician slot as a KNAT subject
  • the trace layer captures that participation identically to a creature-driven session
  • sessions become replayable and comparable regardless of who drove the slot

The Instrument is not the Musician. It is the interface to the Musician slot.

Just as RallyAdapter produces ControlOutput from driver input without touching the Road or car physics, an InstrumentAdapter would produce VoiceBridgeInput from performer input without touching Pop, Jazz, or the Musician's internal model.

What the Musician Is

A Musician is a decision engine wrapped around an emerging self-model.

The algorithmic layer:

  • observes Pop (tension, beat, key, landmarks)
  • observes Jazz (role, shared key, lead pitch class)
  • builds a bounded action space (MachtspielraumSpace)
  • selects the lowest-cost action
  • emits into VoiceBridge

This is the algorithmic solution. It works. The Musician participates correctly.
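
The selection step in that list can be sketched in a few lines of Rust. CostedAction and select_lowest_cost are hypothetical names for this example; the real MachtspielraumSpace types are not shown in this document.

```rust
/// Hypothetical entry in a bounded action space: a label plus its cost.
#[derive(Debug, Clone, PartialEq)]
pub struct CostedAction {
    pub name: &'static str,
    pub cost: f32,
}

/// Pick the cheapest action. Returns None for an empty space.
/// f32::total_cmp gives a total order, so a NaN cost cannot panic the sort;
/// on ties, the first minimum in the slice wins.
pub fn select_lowest_cost(space: &[CostedAction]) -> Option<&CostedAction> {
    space.iter().min_by(|a, b| a.cost.total_cmp(&b.cost))
}
```

Note what is absent: the function keeps no memory of what it chose last time or why. That absence is the gap the self-model layer is meant to fill.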

The self-model layer (now present for lead-configured Musicians):

  • IdentityModel: tracks a phrase anchor, its confidence, legibility, and pressure over time. The lead musician knows whether it has established an identity and how strongly that identity is landing on the ensemble.
  • CommitmentBudget: tracks the lead's capacity to persist against adversity. Spend paths (Repeat, Escalate, Withdraw, Reframe) are willful behavioral choices, not cost-minimization outcomes.
  • LeadTrace: a per-step structured snapshot of role state, surface mode, identity model state, commitment budget state, and dissonance pressure scores. The lead musician emits observable evidence of its inner life each step.
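
A spend/recovery budget over the four named paths might look like the Rust sketch below. The path names come from the text; the BudgetSketch type, the integer point costs, and the recovery rule are assumptions for illustration only.

```rust
/// The four resolution paths named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResolutionPath {
    Repeat,
    Escalate,
    Withdraw,
    Reframe,
}

impl ResolutionPath {
    /// Hypothetical cost in budget points; real costs are not documented here.
    fn cost(self) -> u32 {
        match self {
            ResolutionPath::Repeat => 1,
            ResolutionPath::Escalate => 3,
            ResolutionPath::Withdraw => 0,
            ResolutionPath::Reframe => 2,
        }
    }
}

/// Hypothetical stand-in for a commitment budget.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BudgetSketch {
    remaining: u32,
    capacity: u32,
}

impl BudgetSketch {
    pub fn new(capacity: u32) -> Self {
        Self { remaining: capacity, capacity }
    }

    pub fn remaining(&self) -> u32 {
        self.remaining
    }

    /// Try to take a path. Returns false and spends nothing when the path
    /// is not affordable.
    pub fn spend(&mut self, path: ResolutionPath) -> bool {
        let cost = path.cost();
        if cost > self.remaining {
            return false;
        }
        self.remaining -= cost;
        true
    }

    /// Recover points, never exceeding the original capacity.
    pub fn recover(&mut self, points: u32) {
        self.remaining = self.remaining.saturating_add(points).min(self.capacity);
    }
}
```

In this sketch Withdraw is free, so an exhausted lead always has one path left. Whether the real budget behaves that way is not stated in this document.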

A Musician without LeadBias still does not know that it is participating. It does not track whether it has been playing coherently within its role, whether it missed a structurally important moment, or how its behavior has changed across a session.

The lead layer closes part of that gap. The broader creature layer -- PerformanceModel, self-observed pattern history, dissonance about the relationship to the field -- is the remaining work.

The evaluation harness (see Coherence Metrics and Evaluation Harness) observes from outside. The creature layer observes from inside. They are different things, but the harness establishes the baseline the creature layer needs to be meaningful.

The Three Driver Types (Musician Edition)

Just as a paddle can be driven by a human, a TAS, or a creature, a Musician can be:

  1. Algorithmically driven -- current state for non-lead musicians: cost-minimization over MachtspielraumSpace
  2. Creature-driven -- partially active: a Musician with LeadBias installed has an IdentityModel, CommitmentBudget, and LeadTrace. It knows its phrase anchor, its identity pressure, and which resolution path it is on. The full creature layer (PerformanceModel, session-wide pattern history) is the remaining work.
  3. Instrument-driven -- future: a performer or external agent occupies the slot via an InstrumentAdapter (vrooom-style) over KNAT; the trace layer captures participation identically to creature-driven or algorithmic sessions

The architecture stays the same. The difference is in who is producing the next advance().

The Instrument adapter is the seam that makes the human performer a first-class participant in the same field the creature inhabits -- not a special case, but a substitutable source of the next action, with a trace that says who drove it.
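
One way to picture "a substitutable source of the next action, with a trace that says who drove it" is a small trait, sketched here in Rust. All names are hypothetical: VoiceInput is a two-field stand-in for VoiceBridgeInput (whose real fields are not given in this document), and ActionSource, DriverKind, TraceRow, and step are invented for the example.

```rust
use std::collections::VecDeque;

/// Who produced the action on a given step.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DriverKind {
    Algorithmic,
    Creature,
    Instrument,
}

/// Hypothetical stand-in for the voice signal contract.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct VoiceInput {
    pub emit: bool,
    pub pitch_class: u8,
}

/// Anything that can occupy a Musician slot.
pub trait ActionSource {
    fn kind(&self) -> DriverKind;
    fn advance(&mut self, tension: f32) -> VoiceInput;
}

/// Toy algorithmic driver: emits whenever tension is at least 0.5.
pub struct AlgorithmicSource;

impl ActionSource for AlgorithmicSource {
    fn kind(&self) -> DriverKind {
        DriverKind::Algorithmic
    }
    fn advance(&mut self, tension: f32) -> VoiceInput {
        VoiceInput { emit: tension >= 0.5, pitch_class: 0 }
    }
}

/// Toy instrument driver: replays queued performer input, then rests.
pub struct InstrumentSource {
    queued: VecDeque<VoiceInput>,
}

impl InstrumentSource {
    pub fn new(inputs: Vec<VoiceInput>) -> Self {
        Self { queued: inputs.into() }
    }
}

impl ActionSource for InstrumentSource {
    fn kind(&self) -> DriverKind {
        DriverKind::Instrument
    }
    fn advance(&mut self, _tension: f32) -> VoiceInput {
        self.queued
            .pop_front()
            .unwrap_or(VoiceInput { emit: false, pitch_class: 0 })
    }
}

/// One trace row per step, identical in shape for every driver.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TraceRow {
    pub driver: DriverKind,
    pub output: VoiceInput,
}

/// The field calls this without knowing which kind of driver it holds.
pub fn step(source: &mut dyn ActionSource, tension: f32) -> TraceRow {
    TraceRow { driver: source.kind(), output: source.advance(tension) }
}
```

The step function never branches on the driver kind; it only records it. That is the property that would keep sessions replayable and comparable regardless of who drove the slot.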

Why the Creature Plans Matter

The creature plan for the Musician is not about making better music.

It is about the same architectural question Pong raised:

Can the agent observe itself in relation to the field it is participating in?

For the paddle, that meant learning physical constraints: where boundaries are, how fast it can move, when it is coherent with the ball's trajectory.

For the Musician, it means learning performance patterns: what modes succeed at what tension, what structural moments the creature participates in, when its role impulse conflicts with the role it has been assigned.

The creature layer does not replace MachtspielraumSpace. It sits above it:

  • MachtspielraumSpace: here are your options and their costs
  • Creature layer: here is my model of myself, my dissonance about my relationship to this field, and my willful choice of how to resolve that dissonance

The Musician that has a PerformanceModel is not playing better notes. It is playing with self-knowledge.

The Runway Analogy in Full

In Pong, the paddle creature learns:

  • "the field has boundaries at y=0 and y=1 that I did not know about"
  • "my velocity is capped and I discovered that through observation"
  • "I can now resolve my dissonance about these limits because I have a model"

In music, the Musician creature learns:

  • "I consistently fail to emit at high tension -- that is a pattern in my history"
  • "I missed the last three downbeats when my drives were high -- I was suppressed"
  • "My genomic key preference conflicts with the shared key the ensemble is in"
  • "I can now resolve my dissonance about these experiences because I have a model"

Both creatures acquire self-knowledge through interaction with the field. Neither required a teacher. Neither was given a score.

What Is Already Working

Core Musician Infrastructure

  • Musician holds DissonanceState and records Expectation dissonance (key mismatch)
  • KeyDissonanceUpdater fires Expectation events and performs incremental Acceptance resolution
  • MachtspielraumSpace enumerates and costs a bounded action space per step
  • MusicianDrives track behavioral pressure (fill_silence, resolve_tension, riff_hunger)
  • MarginBuffer tracks vitality and exertion slopes for ForesightSensitivity
  • VoiceBridgeInput is the stable signal contract to the voice/output layer
  • JazzRole shapes cost curves and beat-emission weights
  • advance_traced() produces a MusicianTrace per step: drives snapshot, dissonance snapshot, space summary, band signal, coordination action, lead trace

The Musician participates correctly. Without LeadBias it does not observe that participation. With LeadBias installed, it does, within the limits of the lead layer described above.

Lead System (Phases I-VI complete)

LeadBias is an injectable configuration that makes any Musician more likely to sustain JazzRole::Lead and accumulate phrase identity over time. The full lead stack:

  • LeadBias -- 11-field parameter set controlling commitment, dissonance tolerance, repetition pressure, adaptation lag, identity strength, articulation, dual-wield, etc.
  • LeadIntent -- derived behavioral state (push_persistence, push_step_budget, role_signal_strength, identity_pressure_seed)
  • IdentityModel -- phrase anchor tracking; confidence, legibility, pressure computed each step; anchor shifts only when confidence is near zero
  • CommitmentBudget -- spend/recovery cycle; four resolution paths: Repeat, Escalate, Withdraw, Reframe
  • LeadTrace -- structured per-step observability snapshot in five sections: role, surface, identity, commitment, dissonance
  • Followership -- infer_role(), compute_followership_weight(), push_step_budget, role_signal_strength; ensemble reads lead pressure without circular feedback
  • VocalPhrase integration -- breath-gated rests; vocal token anchors IdentityModel; flows through advance_traced() and LeadTrace
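
The IdentityModel rule that the anchor shifts only when confidence is near zero can be sketched as follows. AnchorTracker, the 0.25 step size, and the 0.05 near-zero threshold are assumptions for this example; the real IdentityModel also tracks legibility and pressure, which are omitted here.

```rust
/// Hypothetical, reduced stand-in for phrase anchor tracking.
#[derive(Debug, Clone, PartialEq)]
pub struct AnchorTracker {
    anchor: Option<u32>,
    confidence: f32,
}

impl AnchorTracker {
    /// Illustrative gain and decay per observation.
    const STEP: f32 = 0.25;
    /// Illustrative "near zero" threshold below which the anchor may move.
    const NEAR_ZERO: f32 = 0.05;

    pub fn new() -> Self {
        Self { anchor: None, confidence: 0.0 }
    }

    pub fn anchor(&self) -> Option<u32> {
        self.anchor
    }

    pub fn confidence(&self) -> f32 {
        self.confidence
    }

    /// Observe the phrase played this step, identified by an opaque id.
    pub fn observe(&mut self, phrase_id: u32) {
        match self.anchor {
            // Matching phrase: reinforce the existing anchor.
            Some(current) if current == phrase_id => {
                self.confidence = (self.confidence + Self::STEP).min(1.0);
            }
            // Different phrase: decay first, shift only once confidence is gone.
            Some(_) => {
                self.confidence = (self.confidence - Self::STEP).max(0.0);
                if self.confidence <= Self::NEAR_ZERO {
                    self.anchor = Some(phrase_id);
                    self.confidence = Self::STEP;
                }
            }
            // No anchor yet: adopt the first phrase seen.
            None => {
                self.anchor = Some(phrase_id);
                self.confidence = Self::STEP;
            }
        }
    }
}
```

The effect is inertia: one stray phrase weakens an established identity without replacing it, and a new anchor takes over only after the old one has been fully eroded.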

Named Lead Archetypes (Phase I complete)

Five preset constructors on LeadBias, each a distinct behavioral landmark:

| Preset | Defining trait | Dual-wield |
| joey_ramone() | Highest repetition pressure (0.95); canonical test lead | No |
| kurt_cobain() | Highest dissonance tolerance (0.85); holds wrongness longest | Yes |
| iggy_pop() | Highest presence pressure (commitment × identity = 0.9025) | No |
| david_byrne() | Highest articulation sharpness (0.95); angular staccato | Yes |
| jonathan_coulton() | Highest form awareness (0.95); most forgiving | No |

All five are functions that return a LeadBias, not separate creature types.
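
The "preset as constructor" pattern can be sketched in Rust as below. LeadBiasSketch carries only five of the eleven fields and is not the real LeadBias. The values 0.95 and 0.85 and the dual-wield flags come from the table; the 0.5 placeholder for every other field is invented, and splitting Iggy's 0.9025 into 0.95 × 0.95 is an inference from the product, not a documented pair of values.

```rust
/// Hypothetical five-field subset of the 11-field LeadBias.
#[derive(Debug, Clone, PartialEq)]
pub struct LeadBiasSketch {
    pub commitment: f32,
    pub identity_strength: f32,
    pub repetition_pressure: f32,
    pub dissonance_tolerance: f32,
    pub dual_wield: bool,
}

impl LeadBiasSketch {
    /// Invented filler for fields whose preset values are not documented.
    const PLACEHOLDER: f32 = 0.5;

    fn base() -> Self {
        Self {
            commitment: Self::PLACEHOLDER,
            identity_strength: Self::PLACEHOLDER,
            repetition_pressure: Self::PLACEHOLDER,
            dissonance_tolerance: Self::PLACEHOLDER,
            dual_wield: false,
        }
    }

    /// Highest repetition pressure (0.95), per the table.
    pub fn joey_ramone() -> Self {
        Self { repetition_pressure: 0.95, ..Self::base() }
    }

    /// Highest dissonance tolerance (0.85) and dual-wield, per the table.
    pub fn kurt_cobain() -> Self {
        Self { dissonance_tolerance: 0.85, dual_wield: true, ..Self::base() }
    }

    /// Presence pressure 0.9025; the 0.95 x 0.95 split is an assumption.
    pub fn iggy_pop() -> Self {
        Self { commitment: 0.95, identity_strength: 0.95, ..Self::base() }
    }

    /// Presence pressure as the table defines it: commitment times identity.
    pub fn presence_pressure(&self) -> f32 {
        self.commitment * self.identity_strength
    }
}
```

Because each preset returns the same type, a test can swap archetypes without touching the Musician that receives them.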

Band Coordination Layer (complete)

BandSignal aggregates rolling per-agent emission and mode histories across the ensemble. On each step in which a Musician receives a BandSignal, the MusicianTrace carries:

  • band_signal: the aggregated coordination signal
  • coordination_opportunity_strength: how strongly a coordination pattern was detected
  • coordination_action: recommended ensemble action (FollowLead, HoldGroove, etc.)
  • band_resolution_path: group-vs-self resolution path taken
  • band_identity: persistent group identity accumulated across the session

Transport Authority (complete)

The lead operates Pop and Jazz like a deck operator via LeadTransportIntent (in transport.rs). Pop and Jazz read it each tick and decide whether to advance, hold, loop, or jump.

  • LeadTransportIntent -- play_state, optional position (BarBeatTick: bar:beat:tick), rate multiplier, optional loop_range, optional CueSignal, and form_hold flag
  • TransportState -- Stopped, Playing, Looping, Scrubbing, Cueing (SMPTE-compatible)
  • CueSignal -- Prepare, Commit, Release; broadcast into chem fabric when present
  • LeadAuthorityState -- hold_allowed flag; when false, Pop/Jazz ignore hold requests and force release (CommitmentBudget exhaustion path)
  • LeadTransportBus / LeadAuthorityBus -- shared Arc<Mutex<...>> runtime buses consumed by Pop and Jazz each tick
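
The interaction between a lead's request and the hold_allowed flag can be sketched as a pure function. TransportRequest, FieldStep, and resolve_step are hypothetical names, and treating an empty loop range as a plain advance is an assumption; only the "ignore hold requests when hold_allowed is false" rule comes from the text.

```rust
/// Hypothetical reduction of what a lead may ask of the field on one tick.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TransportRequest {
    Play,
    Hold,
    Loop { start_bar: u32, end_bar: u32 },
    Jump { bar: u32 },
}

/// What the field actually does on that tick.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FieldStep {
    Advance,
    Hold,
    Loop { start_bar: u32, end_bar: u32 },
    Jump { bar: u32 },
}

/// Decide the field's step from the lead's request and its authority.
pub fn resolve_step(request: TransportRequest, hold_allowed: bool) -> FieldStep {
    match request {
        TransportRequest::Play => FieldStep::Advance,
        TransportRequest::Hold if hold_allowed => FieldStep::Hold,
        // Authority withdrawn: the hold is ignored and the field releases.
        TransportRequest::Hold => FieldStep::Advance,
        TransportRequest::Loop { start_bar, end_bar } if start_bar < end_bar => {
            FieldStep::Loop { start_bar, end_bar }
        }
        // Assumption: an empty or inverted range is treated as no loop.
        TransportRequest::Loop { .. } => FieldStep::Advance,
        TransportRequest::Jump { bar } => FieldStep::Jump { bar },
    }
}
```

Keeping the decision on the field side means the lead can request a hold indefinitely, but cannot enforce one after its authority has lapsed.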

Console Transport Contract (complete)

SessionTransport is the TypeScript interface defined in music/console/src/music/sessionTransport.ts. It is the stable contract all session backends implement. StubSessionTransport now explicitly implements it; the receive(envelope) pattern was replaced by send(action) with internal eventId tracking. 70 TypeScript console tests pass against this contract.

Audience System and Level 2 Closure (complete; released 3.7.207-alpha)

The Audience is a first-class creature modeling a listener's internal state.

  • AudienceCreature -- holds a MachtspielraumProfile with five listener dimensions (Expressive, Emotional, Cognitive, Temporal, Social); implements Standortmeldung. Dimension update rules, per-personality decay rates, and novelty detection all run each step against AudiencePerformanceContext.
  • AudiencePersonality -- three variants (Neutral, RhythmFocused, HarmonyFocused) with distinct decay rates, resting values, response gains, and reaction thresholds.
  • ReactionKind -- canonical vocabulary: Commit, Escalate, Disrupt, Establish, Misdirect, Release, Withdraw. Shared with PLAN-KNAT-LISTENER.
  • AudienceChem -- emitted when a dimension threshold is crossed: kind, intensity, duration_hint, source_id, tick. Route: audience/reactions/<id>.
  • AudienceMember -- composition of Hob (location/stance) + AudienceCreature + AudienceLifeState (offstage life context that adjusts reception).
  • AudienceKnat -- KNAT runtime projection surface; reads performance context from chem paths, steps the creature, writes state to audience/<id>/... nodes.
  • CrowdReport -- aggregates N AudienceCreature instances: mean freedom, divergence, dominant reaction, chem density, source_population: Option<String>. The derive_source_population() helper resolves a crowd slice to a single cohort id, or None when the slice crosses jook boundaries.
  • AudienceKnatState.source_population -- jook cohort id (jook.<jook_id>) set at projection time; keeps AudienceCreature venue-agnostic.
  • Seam A closed: IdentityModel.observe_audience_chem() and observe_crowd_reaction() wire audience output into lead reinforcement. Positive reactions (Commit, Escalate, Release) raise anchor confidence; disruptive reactions (Disrupt, Misdirect) decay it; magnitude proportional to chem intensity.
  • Level 2 integration tests (tests/music_level2_integration.rs): three tests exercise the full chain -- PopBlock scenario, homogeneous cohort provenance, and mixed-cohort provenance -- using only Rust types, no KNAT or audio hardware required.
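
The Seam A rule (positive reactions raise anchor confidence, disruptive ones decay it, magnitude proportional to intensity) can be sketched as one function. The ReactionKind variants are the vocabulary listed above; the function name, the 0.25 gain, and the choice to treat Establish and Withdraw as neutral are assumptions, since the text does not say how those two are handled.

```rust
/// Reaction vocabulary as listed in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReactionKind {
    Commit,
    Escalate,
    Disrupt,
    Establish,
    Misdirect,
    Release,
    Withdraw,
}

/// Return the anchor confidence after one audience reaction.
/// `intensity` is expected in 0.0..=1.0; the result is clamped to 0.0..=1.0.
pub fn reinforced_confidence(confidence: f32, kind: ReactionKind, intensity: f32) -> f32 {
    // Illustrative maximum change per reaction.
    const GAIN: f32 = 0.25;
    let delta = match kind {
        ReactionKind::Commit | ReactionKind::Escalate | ReactionKind::Release => {
            GAIN * intensity
        }
        ReactionKind::Disrupt | ReactionKind::Misdirect => -GAIN * intensity,
        // Assumption: not classified in the text, so left neutral here.
        ReactionKind::Establish | ReactionKind::Withdraw => 0.0,
    };
    (confidence + delta).clamp(0.0, 1.0)
}
```

A full-intensity Commit moves confidence from 0.5 to 0.75 in this sketch, while a half-intensity Disrupt moves it from 0.5 to 0.375.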

Pop Surface Context Graph (P0–P3)

The PSCG is now operational in plant_music as PopPrimitivePlan. It gives any Musician a four-layer structural projection to work from:

  • P0 — beat grid: temporal anchors, section boundaries, bar/beat positions
  • P1 — phrase intent: phrase start/end nodes, energy and attack character per phrase
  • P2 — role guidance: per-role action direction keyed to the current section kind
  • P3 — note-set candidates: motif contour nodes with pitch-class suggestions, shaped by ContourShape (ascending, descending, arch, valley, static, chromatic, fragmenting) and RhythmShape (pulse, driving, sparse, syncopated, fills, breakdown)

The song decomposition schema (SongSpec, SectionSpec, EnergyProfile, AttackCharacter) is fully defined. American Music (Violent Femmes, 115 BPM, C major, 7 sections, 357s) is the reference implementation.

Coherence Metrics and Evaluation Harness

A reusable evaluation library (plant_music::pop::metrics) now provides:

  • HarnessTraceRow — the shared trace schema capturing section, phrase, chord, role guidance, chosen action, rejected alternatives, dissonance before/after, seed cursor, precision level, musician identity, and tick
  • CoherenceMetricReport — five component scores (rhythmic alignment, phrase boundary adherence, energy consistency, role separation clarity, breakdown compliance) plus overall architecture score
  • diff_metric_reports / MetricDiffReport — regression detection between two runs
  • compute_coherence_metrics — derives all scores from a trace slice
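
Regression detection between two runs can be sketched as a component-wise comparison. The five score names follow the CoherenceMetricReport components listed above; CoherenceScores, regressions, and the tolerance parameter are hypothetical, and this is not the signature of diff_metric_reports.

```rust
/// Hypothetical container for the five component scores.
#[derive(Debug, Clone, PartialEq)]
pub struct CoherenceScores {
    pub rhythmic_alignment: f32,
    pub phrase_boundary_adherence: f32,
    pub energy_consistency: f32,
    pub role_separation_clarity: f32,
    pub breakdown_compliance: f32,
}

impl CoherenceScores {
    /// Components as (name, value) pairs in a fixed order.
    fn components(&self) -> [(&'static str, f32); 5] {
        [
            ("rhythmic_alignment", self.rhythmic_alignment),
            ("phrase_boundary_adherence", self.phrase_boundary_adherence),
            ("energy_consistency", self.energy_consistency),
            ("role_separation_clarity", self.role_separation_clarity),
            ("breakdown_compliance", self.breakdown_compliance),
        ]
    }
}

/// Names of components that dropped by more than `tolerance` between runs.
pub fn regressions(
    baseline: &CoherenceScores,
    candidate: &CoherenceScores,
    tolerance: f32,
) -> Vec<&'static str> {
    let base = baseline.components();
    let cand = candidate.components();
    let mut dropped = Vec::new();
    for ((name, b), (_, c)) in base.iter().zip(cand.iter()) {
        if b - c > tolerance {
            dropped.push(*name);
        }
    }
    dropped
}
```

A test that removes one PSCG layer could then assert that the matching component, and ideally only that component, appears in the returned list.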

Four named musicians (Strummer/Support, Caller/Lead, Floor/Anchor, Bell/Air) run against the full PSCG pipeline. Tests verify deterministic replay, role differentiation, and that removing P1 or P2 layers measurably reduces their respective scores.

Test coverage: 11 (Phase II) + 12 (Phase III) + 6 (Phase IV) + 4 (Phase V) + 9 (Lead Integration) = 42 dedicated American Music integration tests. Phase I tests live in the library suite. At release 3.7.207-alpha: 534 Rust tests (plant_music) + 70 TypeScript console tests; tsc clean.

What Comes Next

Music Level 2 was released as 3.7.207-alpha on 2026-04-27. The system is playable: musicians, lead identity, transport authority, audience state, applause chems, crowd aggregation, console transport contract, and Level 2 integration tests are all in place. The frontier has moved from infrastructure to temporal authority and social propagation.

Active (unblocked):

  • Transport Authority and Opening Primitives (PLAN-BAND-INTRO): the lead now holds LeadTransportIntent and can hold, loop, or advance Pop and Jazz. The next work formalizes the Listener State Vector (LSV), opening primitive operators (commit, establish, escalate, withhold, disrupt, narrate), and tests the full transport authority arc. PLAN-PARADISE (Paradise by the Dashboard Light) is the canonical test scenario for CommitmentBudget exhaustion and forced vow-under-duress.
  • Defiance, Ohio integration (PLAN-DEFIANCE): shared Pop extraction is complete (src/pop/ + crate-root exports), so Defiance Phase I can now start on its real constraints: argument-shaped song structure, stop-time, rhetorical silence, and song-local axes over shared primitives. Complements American Music's groove-dominant test.

Deferred (documented, not started):

  • KNAT Listener chain: ListenerClient, CpalAudioBusKnat, PerformanceContext publication as KNAT nodes. ListenerAdapter scaffolding exists; the seam to a live KNAT session is the remaining work. Depends on KNAT Audio Infrastructure.
  • KNAT Audio Infrastructure (PLAN-INSTRUMENT): AudioBusKnat binds AudioBus to KNAT participant lifecycle; unblocks the Chain 3 listener stack.
  • Music Level 3 (PLAN-MUSIC-LEVEL-3, in future/): music as cultural propagation. Venue/Jazz/Ambiance, Band-as-Brand, mixed-motive audience, material artifacts (records, stickers, bootlegs), escape events, and fan formation. FanBinding scaffolding (BindingType, CommitmentLevel, I-series weight) already exists in fan_binding.rs as Level 3 Chain 4 prep.

Population membership (context captured):

Three binding primitives are documented and the cohort binding mechanism is understood:

  • Band members = stele census (explicit slot equipping, mandate-based)
  • Audience = jook proximity field (implicit, ambient influence)
  • Fans = brand binding (volitional, persistent; FanBinding types in fan_binding.rs)

AudienceKnat already carries source_population scoped to a jook cohort id. Chem routing that respects spatial membership boundaries is the remaining wiring.
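
The cohort-scoping rule described for derive_source_population() (a single cohort id, or None when a crowd slice crosses jook boundaries) can be sketched as below. The function name single_cohort and its input shape are assumptions for this example, as is returning None for an empty slice or for any member without a cohort.

```rust
/// Resolve a crowd slice to one cohort id, or None if the slice is empty,
/// contains a member with no cohort, or mixes cohorts.
pub fn single_cohort(members: &[Option<String>]) -> Option<String> {
    let mut iter = members.iter();
    // The first member fixes the candidate cohort; bail out if it has none.
    let first = iter.next()?.as_ref()?;
    if iter.all(|m| m.as_ref() == Some(first)) {
        Some(first.clone())
    } else {
        None
    }
}
```

Chem routing that respects membership boundaries could use a check like this as its gate: a reaction aggregated from a mixed slice would carry no cohort id and so could not be attributed to a single jook.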

Remaining creature layer:

The PerformanceModel (analog of ConstraintModel -- tracks mode/key/tension execution history across a session) is the next step toward a Musician that observes its own patterns. The coherence metrics harness already computes externally what the PerformanceModel will eventually compute internally. That is the bridge still to cross.

See PLAN-MUSICIAN-AS-CREATURE.md for the phased development plan.

What Comes After That (Near)

Racing.

Racing is the third proof domain. It demonstrates that the runway model generalizes beyond physics (Pong) and aesthetics (Music) into a domain of competing, partially reliable projections under irreversible commitment.

The Driver adapter in racing (vrooom/racer) directly informs what an Instrument adapter in music needs to be. Both are KNAT-facing interfaces that substitute for creature-driven action in a structured participation slot.

Once racing has a Driver creature, the trust dissonance domain it requires ("I followed the pace notes and it failed") will likely backport into music as "I followed Jazz's role assignment and it suppressed me." That domain does not exist yet in either plan. It is deferred but named.

Takeaway

In Plantangenet music:

  • Pop and Jazz are deterministic field emitters: they maintain structural state and project readable futures, but they do not know themselves
  • the Pop Surface Context Graph (P0–P3) provides a four-layer structural projection: Musicians can be evaluated against it, and lead-configured Musicians already navigate it with partial self-knowledge
  • the Musician is a participant in that field: algorithmically for support roles, cognitively (phrase identity, commitment budget, dissonance trace) when a LeadBias is installed
  • five named lead archetypes (joey_ramone, kurt_cobain, iggy_pop, david_byrne, jonathan_coulton) are concrete behavioral presets that demonstrate the lead model's range
  • the Audience is a first-class creature: it responds to the performance field, emits Applause Chems, and carries its own Standortmeldung-based state model
  • Seam A is closed: audience chem feeds back into lead reinforcement via IdentityModel.observe_audience_chem() and observe_crowd_reaction(); the feedback loop that was the original Level 2 design motivation is live
  • the lead holds LeadTransportIntent and operates Pop and Jazz like a deck operator: the performance clock is now a controlled surface, not an unconditional ticker
  • three population binding primitives (stele/band, jook/audience, brand/fan) are documented; source_population on AudienceKnatState scopes crowd signals to jook cohort boundaries; full chem routing is the remaining wiring
  • Music Level 2 (3.7.207-alpha) is released: the system is playable end-to-end
  • the evaluation harness and coherence metrics establish the external baseline that the creature's self-model will eventually internalize
  • the architecture matters because it keeps field state, observation, and action separable without constraining who or what produces the next action

That makes music more than an audio output system. It is the second readable experiment in Plantangenet's larger claim: a structured interaction field can host many forms of agency, as long as the seams between field state, observation, and action stay explicit.

The Musician that overthinks its performance is not a worse musician. It is the beginning of a creature that knows what it is doing and why.