GRPO Combat

Lorewright GRPO combat is now a single enrichment pipeline over the 36-example working set. The only supported CLI entrypoint is poetry run lorewright grpo-combat-enrich.

Active Workflow

grpo-combat-enrich executes engine/grpo/combat_enrichment.py and:

  1. Reads the 36-example input file (data/grpo_combat_sample36_levels_hp_obstacles.json by default).
  2. Rehydrates actor/combatant context and joins TL;DR summaries.
  3. Loads actor snapshots from the app DB (data/lorewright-app.sqlite by default via DEFAULT_APP_DB_PATH).
  4. Emits an enrichment dataset artifact (data/grpo_combat_sample36_levels_hp_obstacles_enriched.json by default) validated against CombatEnrichmentDataset.
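
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in engine/grpo/combat_enrichment.py: the helper names (load_snapshots, enrich_examples), the column names (character_id, snapshot_json), and the example keys (id, actor_id, actor, tldr) are all hypothetical. Only the character_snapshots table name comes from this document.

```python
import json
import sqlite3


def load_snapshots(conn):
    """Return {character_id: decoded snapshot} from character_snapshots.

    The column names here are assumptions made for illustration.
    """
    rows = conn.execute(
        "SELECT character_id, snapshot_json FROM character_snapshots"
    ).fetchall()
    return {character_id: json.loads(blob) for character_id, blob in rows}


def enrich_examples(examples, snapshots, summaries):
    """Attach an actor snapshot and a TL;DR summary to each input example.

    `examples` is a list of dicts with hypothetical "id" and "actor_id" keys;
    `summaries` maps example id -> summary text.
    """
    enriched = []
    for example in examples:
        actor_id = example["actor_id"]
        if actor_id not in snapshots:
            # Actor selection is DB-backed, so a missing snapshot is fatal.
            raise KeyError(f"no snapshot for actor {actor_id!r}")
        enriched.append(
            {
                **example,
                "actor": snapshots[actor_id],
                "tldr": summaries.get(example["id"]),
            }
        )
    return enriched
```

In a real run the examples would come from the input JSON file and the result would be validated against CombatEnrichmentDataset before being written out; both steps are omitted here because their exact shape is not described in this document.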

Schema

engine/grpo/schemas.py now defines enrichment-only models:

  • CombatEnrichmentDataset
  • CombatEnrichmentRecord
  • nested actor/action/combatant/output-schema records
  • enrichment schema version constants:
    • COMBAT_ENRICHMENT_SCHEMA_VERSION
    • COMBAT_ENRICHMENT_DATASET_VERSION
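
To give a feel for how a versioned dataset model might reject mismatched artifacts, here is a minimal stand-in built on standard-library dataclasses. The class and constant names mirror the list above, but everything else is an assumption: the fields (example_id, actor, combatants, schema_version, dataset_version, records), the placeholder version strings, and the validation rule are hypothetical and do not reflect the real definitions in engine/grpo/schemas.py.

```python
from dataclasses import dataclass, field
from typing import Any

# Placeholder values; the real constants live in engine/grpo/schemas.py.
COMBAT_ENRICHMENT_SCHEMA_VERSION = "example-schema-version"
COMBAT_ENRICHMENT_DATASET_VERSION = "example-dataset-version"


@dataclass
class CombatEnrichmentRecord:
    # Hypothetical fields, chosen only to illustrate nesting.
    example_id: str
    actor: dict[str, Any]
    combatants: list[dict[str, Any]] = field(default_factory=list)


@dataclass
class CombatEnrichmentDataset:
    schema_version: str
    dataset_version: str
    records: list[CombatEnrichmentRecord]

    def __post_init__(self):
        # Refuse artifacts written under a different schema version.
        if self.schema_version != COMBAT_ENRICHMENT_SCHEMA_VERSION:
            raise ValueError(
                f"unexpected schema version: {self.schema_version!r}"
            )
```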

Current Dependency

The enrichment path currently depends on the app DB containing 12 persisted characters (one floor mapping per class bucket used by the sample set), each with an accessible snapshot in character_snapshots. Because actor selection is DB-backed, any change to this DB contract immediately changes the composition of grpo-combat-enrich output.
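
A pre-flight check along these lines could catch a broken DB contract before an enrichment run. It is a sketch only: check_db_contract and EXPECTED_CHARACTERS are hypothetical names, and the character_id column is an assumption. The character_snapshots table name and the count of 12 come from the paragraph above.

```python
import sqlite3

EXPECTED_CHARACTERS = 12  # from the dependency described above


def check_db_contract(conn):
    """Raise if the number of characters with snapshots is not the expected 12.

    Assumes character_snapshots has a character_id column (hypothetical).
    """
    (count,) = conn.execute(
        "SELECT COUNT(DISTINCT character_id) FROM character_snapshots"
    ).fetchone()
    if count != EXPECTED_CHARACTERS:
        raise RuntimeError(
            f"expected {EXPECTED_CHARACTERS} characters with snapshots, found {count}"
        )
    return count
```

Usage would be something like check_db_contract(sqlite3.connect("data/lorewright-app.sqlite")) ahead of the enrichment command, assuming the default DB path.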

Validation Surface

GRPO regression coverage is intentionally focused on:

  • tests/test_grpo_combat_enrichment.py

No canonical migration, self-actor Phase 1, or self-actor Phase 2 validation path remains in the supported workflow.