MVP Prototype Backlog

The prototype target is a single-user tabletop loop where Lorewright can act as a rules-aware Dungeon Master for one SRD 2014 campaign, remember campaign facts across turns through Mem0, and call a low-cost model many times to generate and refine world state. The prototype should prove the product loop before more GRPO combat dataset work resumes.

MVP Definition

The minimally viable prototype is:

  • One saved campaign with one or more saved characters.
  • A DM chat turn that receives player input, retrieves relevant Mem0 campaign memory and SRD context, calls the model, and returns a response.
  • Durable Mem0 memory for campaign canon: locations, NPCs, active quests, scene state, unresolved promises, and session notes.
  • Rule-grounded assistance for common 5e SRD 2014 situations, especially checks, saves, attacks, spell/rules lookup, rest, XP or milestone notes, inventory notes, and simple encounter state.
  • Deterministic combat resolution where the LLM proposes structured NPC moves and engine code validates/applies 5e mechanics.
  • An audit trail that records what the model said, what memory changed, what tools ran, and what source facts were used.

The MVP is not full tactical AI, not adversarial combat optimization, not multiclass support, not 2024 rules support, and not a polished campaign authoring suite.

Current State

Lorewright has strong foundations for character and rules infrastructure:

  • Character persistence already stores immutable snapshots, current character metadata, and decision plans in data/lorewright-app.sqlite through engine/characters/store.py.
  • The SRD 2014 level-up engine already performs deterministic preview, validation, and apply behavior in engine/srd_level_up/engine.py.
  • The web app already supports character creation, selection, preview/apply, snapshots, feature catalog display, ASI/feat edits, and decision-plan persistence through web/src/App.tsx.
  • TL;DR builders and sync logic already create compact spell, feat, feature, and character-level summaries for model context through engine/tldr/ and engine/tldr/store.py.
  • DM orchestration exists as a small shell in engine/dm/: a model gateway protocol, tool registry, single-turn orchestrator, and placeholder rules/encounter tools.
  • mem0ai is already declared as a dependency in pyproject.toml, but no Mem0 adapter or campaign memory namespace exists yet.

The project is not yet at playable prototype because:

  • OpenAIModelGateway.complete() is still a placeholder echo path, not a real API-backed model call.
  • SessionOrchestrator.run_turn() is stateless and does not persist session turns, conversation history, or world facts.
  • Tools are advertised in a prompt but are not automatically invoked or fed back into the model.
  • lookup_rule and balance_encounter return placeholder results.
  • There are no campaign/session schemas, Mem0 memory adapter, app-db memory audit tables, or API routes.
  • The web UI has no DM chat, campaign memory view, or turn audit surface.
  • Combat has no canonical state schema, deterministic rules resolver, or structured NPC move contract.
  • The GRPO combat dataset work is useful later, but it does not unlock the first playable loop.

Product Principle

Build the prototype as a narrow vertical slice: Mem0 campaign memory plus one model-backed DM turn plus deterministic rules tools. Memory retrieval may be probabilistic, but combat state changes must be deterministic and auditable. Every task should either make that loop playable, make the model safer against rules drift, or make state recoverable after a bad response.

Architecture boundary: Decision: Mem0 Memory and Deterministic Combat Boundary.

Backlog

P0 - Playable Spine

MVP-01: Real model gateway

  • Replace the placeholder gateway with an API-backed OpenAI adapter behind the existing ModelGateway protocol.
  • Keep a stub gateway for tests.
  • Make model name configurable, with the user-intended cheap model as the default only after availability is verified in implementation.
  • Acceptance: a test can inject a fake client and assert request shape, system prompt handling, response parsing, and error behavior.

MVP-02: Campaign/session persistence and Mem0 adapter

  • Add app-db tables and Pydantic schemas for campaigns, sessions, turns, Mem0 memory operation audit rows, and tool-call audit rows.
  • Add a CampaignMemory adapter boundary around Mem0 so tests can use an in-memory fake.
  • Namespace memories by campaign, with optional session, character, NPC, and location metadata.
  • Acceptance: create/list/get campaign, append turn, add/search memory through the adapter, and retrieve recent turns work against an isolated temp SQLite database with a fake memory backend.

MVP-03: DM turn API

  • Add POST /api/campaigns/{campaign_id}/turns that loads campaign state, recent turns, relevant Mem0 memories, and selected character summaries, then runs the orchestrator.
  • Persist the user input, model response, Mem0 memory operations, and used tools.
  • Acceptance: FastAPI test proves one turn is persisted and returned with audit metadata.

MVP-04: Minimal DM chat UI

  • Add a campaign selector and DM chat panel to web/.
  • Show conversation, active scene memories, and a compact audit drawer for tool and memory updates.
  • Acceptance: local UI can create/select a campaign, send one turn, refresh, and still show the prior turn.

P1 - Rule Grounding

MVP-05: Rules lookup tool with real retrieval

  • Replace lookup_rule placeholder with lookup against SQLite SRD rows and synced TL;DR tables.
  • Scope lookup to spells, feats, features, conditions, equipment/actions if present in the source database.
  • Return source identifiers and compact quoted facts or summaries, not freeform model prose.
  • Acceptance: tests cover spell, feature, and missing-query behavior.

MVP-06: Tool-call loop

  • Extend the orchestrator from one completion to a bounded tool loop.
  • Use a structured model output contract for final_response, tool_requests, and memory_updates.
  • Limit iterations and fail closed when a requested tool is unknown.
  • Acceptance: orchestrator test proves a fake model can request lookup_rule, receive the result, and produce a final response with used_tools.

MVP-07: Rules-first DM prompt contract

  • Create the system prompt as a versioned template that tells the DM to separate narrative, rulings, mechanics, and memory updates.
  • Require the model to cite tool results when it makes mechanical claims.
  • Acceptance: prompt builder tests cover required sections and source context inclusion.

MVP-08: Deterministic combat state schema

  • Add canonical combat records: encounter, participants, HP, AC, initiative, speed/position if modeled, conditions, resources, round, turn, and event log.
  • Keep combat state in app-db snapshots/events, not in Mem0. Mem0 may summarize combat context after the fact.
  • Acceptance: tests can create an encounter, add participants, advance initiative, and persist a combat snapshot/event log.

MVP-09: Structured NPC move contract

  • Define Pydantic schemas for model-proposed NPC moves: actor id, intent, action type, target ids, movement, chosen attack/spell/feature, resource request, and narrative note.
  • The contract should support "ask for legal options" as a safe fallback when the model is uncertain.
  • Acceptance: schema tests reject unknown fields, missing actors, unrecognized actions, and ambiguous target references.

MVP-10: Combat rules resolver

  • Implement deterministic validation and application for the first useful combat subset: initiative, movement bookkeeping, attack roll, damage roll, saving throw, HP changes, death/downed state, conditions, and spell slot/resource consumption for a narrow SRD spell/action set.
  • The engine owns dice rolling or consumes explicit dice inputs for reproducible tests; the LLM never decides hits or damage.
  • Acceptance: golden combat tests prove the same input state plus dice seed/input always produces the same output state and audit events.

P2 - Memory Quality

MVP-11: Mem0 retrieval policy

  • Define how Lorewright queries Mem0: campaign namespace, active scene filters, character/NPC/location metadata, max memories, and fallback behavior.
  • Store local audit rows for every add/search/update/delete call so memory behavior is inspectable.
  • Acceptance: fake Mem0 tests prove campaign-scoped retrieval and metadata filters prevent cross-campaign memory bleed.

MVP-12: Memory write review

  • Treat model-proposed memory changes as structured deltas with confidence and reason.
  • Auto-apply low-risk facts for MVP, but expose them in the UI audit drawer so the user can correct bad memory.
  • Acceptance: Mem0 writes show source turn, reason, and local audit status, and can be corrected or deactivated through a follow-up operation.

MVP-13: Session recap generation

  • Add an endpoint/CLI that summarizes a session into durable memory facts after play.
  • Use the model for prose condensation but require structured output and preserve raw turns.
  • Acceptance: recap records are linked to source turns and can be regenerated.

P3 - Prototype Hardening

MVP-14: Golden DM and combat scenarios

  • Add 5-10 fixed prototype transcripts that cover tavern roleplay, ability check, spell lookup, simple combat exchange, rest, inventory note, and quest memory.
  • Acceptance: tests assert response shape, tool-use audit, Mem0 memory operations, combat state transitions, and structured NPC move handling, not exact prose.

MVP-15: Cost and latency guardrails

  • Track model calls, estimated tokens, retries, and elapsed time per turn.
  • Add a per-turn max-call limit and visible failure response.
  • Acceptance: a test proves the orchestrator stops after the configured tool/model loop limit.

MVP-16: Prototype demo seed

  • Add a CLI or script that seeds one campaign, one character, and one opening scene.
  • Acceptance: a fresh local setup can run the seed, start the API/UI, and play the first turn without manual database editing.

Explicitly Parked Until After MVP

  • Additional GRPO combat enrichment beyond preserving the existing dataset and tests.
  • Full tactical combat automation beyond the deterministic MVP subset.
  • Multiclassing.
  • 2024 rules parity.
  • Custom vector memory infrastructure outside Mem0.
  • Multiplayer/authentication.
  • Deployment hardening beyond local/single-user operation.

Suggested Task Order

  1. Implement MVP-01 and keep the test stub path solid.
  2. Implement MVP-02 and MVP-03 together so state exists before UI work.
  3. Implement MVP-05 before broadening model freedom; the first useful tool should be real.
  4. Implement MVP-06 and MVP-07 to make the DM loop mechanically grounded.
  5. Implement MVP-04 as soon as a persisted turn endpoint exists.
  6. Implement MVP-08, MVP-09, and MVP-10 as the first deterministic combat slice.
  7. Add MVP-11 and MVP-12 when the chat loop starts producing durable facts.
  8. Add MVP-14 and MVP-15 before calling the prototype viable.

Prototype Exit Criteria

The prototype is real when a user can start a campaign, send 10-20 turns across at least two browser sessions, have the DM retrieve and update Mem0-backed campaign memory, use SRD-backed rules lookup for mechanical claims, run a small deterministic combat where NPCs submit structured moves, preserve character/combat state, and leave behind an inspectable audit trail of model calls, tools, memory operations, and combat events.