# Memory Architecture Roadmap (2025-10-15)

## Current Snapshot
- `PgDal` now prefers Qt6/QSql (`QPSQL`) with an in-memory fallback for `stub://` DSNs; schema migrations live in `db/init/`.
- `kompanion --init` guides DSN detection (psql socket probing), applies migrations, and persists config via `~/.config/kompanionrc`.
- MCP handlers still parse JSON manually but leverage the shared DAL; resource descriptors under `resources/memory/kom.memory.v1/` capture episodic/semantic contracts.
- Contract tests (`contract_memory`, `contract_mcp_tools`, `mcp_memory_exchange`) validate the Qt-backed DAL and MCP handlers.

## 1. CTest Target: `contract_memory`
1. Keep `contract_memory.cpp` focused on exercising `PgDal` write/read surfaces; expand as DAL features land.
2. Ensure the executable runs without Postgres by defaulting to `stub://memory` when `PG_DSN` is absent.
3. Layer follow-up assertions once the QSql path is exercised end-to-end (CI can target the packaged test database).

## 2. DAL (Qt6/QSql) Evolution
**Dependencies**
- Qt6 (Core, Sql) with the `QPSQL` driver available at runtime.
- KDE Frameworks `ConfigCore` for persisting DSNs in `kompanionrc`.

**Implementation Steps**
1. Parse libpq-style DSNs with `QUrl`, open `QSqlDatabase` connections when the DSN is not `stub://`, and maintain the existing in-memory fallback for tests.
2. Use `QSqlQuery` `INSERT ... RETURNING` statements for namespaces, items, chunks, and embeddings; emit vector literals (`[0.1,0.2]`) when targeting pgvector columns.
3. Surface detailed `QSqlError` messages (throwing `std::runtime_error`) so MCP handlers and the CLI can report actionable failures.
4. Share configuration between CLI and MCP runners via KConfig (`Database/PgDsn`), seeded through the new `kompanion --init` wizard.

## 3. MCP `resources/*` & Episodic→Semantic Sync
**Directory Layout**
- Create `resources/memory/kom.memory.v1/` for tool descriptors and schema fragments:
  - `episodic.json` – raw conversation timeline.
  - `semantic.json` – chunked embeddings metadata.
  - `jobs/semantic_sync.json` – background job contract.

**Design Highlights**
1. Episodic resource fields: `namespace`, `thread_id`, `speaker`, `content`, `sensitivity`, `tags`, `created_at`.
2. Semantic resource fields reference episodic items (`episodic_id`, `chunk_id`, `model`, `dim`, `vector_ref`).
3. DAL sync job flow:
   - Locate episodic rows with `embedding_status='pending'` (and `sensitivity!='secret'`).
   - Batch-call the embedder(s); write `memory_chunks` + `embeddings`.
   - Mark episodic rows as `embedding_status='done'` and capture audit entries (e.g., ledger append).
4. Expose a placeholder MCP tool `kom.memory.v1.sync_semantic` that enqueues or executes the job.
5. Honor TTL and privacy requirements: skip items with `expires_at` in the past or flagged secret.

**Ξlope Alignment Notes (2025-10-15)**
- Episodic resources capture resonance links and identity hints so the Librarian layer (see `elope/doc/architecture_memory.md`) can strengthen cross-agent patterns without raw content sharing.
- Semantic resources surface `identity_vector` and `semantic_weight`, enabling supersemantic indexing once crystallization occurs.
- `jobs/semantic_sync` maintains `cursor_event_id` and skips `sensitivity=secret`, mirroring the elope crystallization guidance in `/tmp/mem-elope.txt`.

## 4. `hybrid_search_v1` with `pgvector`
**SQL Components**
1. Update migrations (`sql/pg/001_init.sql`) to include:
   - `tsvector` generated column or expression for lexical search.
   - `GIN` index on the lexical field (either `to_tsvector` or `pg_trgm`).
   - Per-model `ivfflat` index on `embeddings.vector`, built with the operator class that matches the query operator (`vector_cosine_ops` for `<=>`).
2. Prepared statements:
   - Text: `SELECT id, ts_rank_cd(...) AS score FROM memory_items ... WHERE namespace_id=$1 AND text_query=$2 ORDER BY score DESC LIMIT $3`.
   - Vector: `SELECT item_id, 1 - (vector <=> $2::vector) AS score FROM embeddings ... WHERE namespace_id=$1 ORDER BY vector <=> $2::vector LIMIT $3` (the ordering uses the same cosine-distance operator as the score, so the top rows are also the highest-scoring ones).
3. Merge results in C++ with Reciprocal Rank Fusion or weighted sum, ensuring deterministic ordering on ties.

**Handler Integration**
1. Ensure `PgDal::hybridSearch` delegates to SQL-based lexical/vector search when a database connection is active, reusing the in-memory fallback only for `stub://`.
2. Return richer matches (id, score, optional chunk text) to satisfy MCP response schema.
3. Update `HandlersMemory::search_memory` to surface the new scores and annotate whether lexical/vector contributed (optional metadata).
4. Exercise hybrid queries in contract tests against the packaged test database (`db/scripts/create-test-db.sh`).

## 5. Secret Handling, Snapshots, and CLI Hooks
- **Secret propagation**: episodic `sensitivity` + `embeddable` flags gate embedding generation. DAL queries will add predicates (`metadata->>'sensitivity' != 'secret'`) before hybrid search.
- **Snapshots**: episodic entries with `content_type = snapshot` reference durable artifacts; sync summarises them into semantic text while retaining `snapshot_ref` for CLI inspection.
- **Hybrid policy**: `pgSearchVector` will filter by caller capability (namespace scope, secret clearance) before ranking; contract tests must assert omission of secret-tagged items.
- **CLI sketch**: plan for a Qt `QCoreApplication` tool (`kom_mctl`) exposing commands to list namespaces, tail episodic streams, trigger `sync_semantic`, and inspect resonance graphs, all wired through the new prepared statements.
- **Observability**: CLI should read the `jobs/semantic_sync` state block to display cursors, pending counts, and last error logs; dry-run mode estimates embeddings without committing.
- **Activation parity**: Long term, mirror the KDE `akonadiclient`/`akonadi-console` pattern: the Kompanion CLI doubles as an MCP surface today and later as a DBus-activated helper so tools can be socket-triggered into the memory service.
- **KConfig defaults**: `kom_mcp` and `kompanion` load `Database/PgDsn` from `~/.config/kompanionrc` (see `docs/configuration.md`) when `PG_DSN` is unset, keeping deployments kioskable.
- **CLI UX**: `kompanion --init` guides first-run setup (auto-detects databases, applies schemas); `-I/--interactive` keeps a JSON REPL open, and `-V/--verbose` echoes request/response streams for future HTTP transport parity.

## Next-Step Checklist
- [x] Promote Qt6/QSql backend (QPSQL) as default DAL; retain `stub://` fallback for tests.
- [x] Normalize contract_memory CTest target and remove stale library target.
- [ ] Author `resources/memory/` descriptors and sync job outline.
- [ ] Extend DAL header to expose richer query structs (filters, pagination, secret handling).
- [x] Update `docs/mcp-memory-api.md` to mention episodic sync + hybrid search fields.
- [ ] Create follow-up acf subtasks when concrete implementation begins (pgvector migration, scheduler hook, runtime wiring).
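
The merge step in section 4 (Reciprocal Rank Fusion with deterministic ordering on ties) could look roughly like the sketch below. It is illustrative only: `FusedHit`, `fuseRrf`, and the constant `k = 60` are assumptions and not names or values from the Kompanion code base, and each input list is assumed to contain no duplicate ids.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical result type; the real DAL match struct may differ.
struct FusedHit {
    std::string id;
    double score;
};

// Reciprocal Rank Fusion: every ranked list contributes 1 / (k + rank)
// for each id it contains, with rank starting at 1. The default k = 60
// is a commonly used value, not one taken from this roadmap.
std::vector<FusedHit> fuseRrf(const std::vector<std::string>& lexicalIds,
                              const std::vector<std::string>& vectorIds,
                              double k = 60.0) {
    std::unordered_map<std::string, double> scores;
    auto accumulate = [&](const std::vector<std::string>& ranked) {
        for (std::size_t i = 0; i < ranked.size(); ++i) {
            scores[ranked[i]] += 1.0 / (k + static_cast<double>(i + 1));
        }
    };
    accumulate(lexicalIds);
    accumulate(vectorIds);

    std::vector<FusedHit> fused;
    fused.reserve(scores.size());
    for (const auto& [id, score] : scores) {
        fused.push_back({id, score});
    }
    // Sort by score descending and break ties on id, so the output never
    // depends on hash-map iteration order.
    std::sort(fused.begin(), fused.end(),
              [](const FusedHit& a, const FusedHit& b) {
                  if (a.score != b.score) return a.score > b.score;
                  return a.id < b.id;
              });
    return fused;
}
```

Because RRF only looks at ranks, the lexical `ts_rank_cd` score and the vector similarity never need to be normalized onto a common scale; a weighted sum would need that extra step.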
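
Section 2 mentions emitting vector literals such as `[0.1,0.2]` for pgvector columns. A minimal sketch of such a formatter follows; `toVectorLiteral` is a hypothetical helper name, and the resulting string is assumed to be bound as a text parameter and cast with `::vector` on the SQL side.

```cpp
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: renders floats in pgvector's text form "[v1,v2,...]".
std::string toVectorLiteral(const std::vector<float>& values) {
    std::ostringstream out;
    // Use the classic "C" locale so the decimal separator is always '.',
    // regardless of the process-wide locale.
    out.imbue(std::locale::classic());
    out << '[';
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i != 0) out << ',';
        out << values[i];
    }
    out << ']';
    return out.str();
}
```

The stream's default precision of six significant digits keeps literals short but does not round-trip every `float`; raising it with `std::setprecision(std::numeric_limits<float>::max_digits10)` would trade longer literals for exactness.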
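
The selection rule for the semantic sync job in section 3 (pending rows only, nothing secret, nothing already expired) can be expressed as a small predicate. The sketch below works on an in-memory struct purely for illustration: `EpisodicRow`, `eligibleForSync`, and `selectForSync` are hypothetical names, and a real implementation would more likely push the same conditions into the SQL `WHERE` clause.

```cpp
#include <chrono>
#include <optional>
#include <string>
#include <vector>

using Clock = std::chrono::system_clock;

// Hypothetical in-memory view of an episodic row. Field names follow the
// roadmap text; the struct itself is an assumption.
struct EpisodicRow {
    std::string id;
    std::string embeddingStatus;                 // "pending" or "done"
    std::string sensitivity;                     // e.g. "secret"
    std::optional<Clock::time_point> expiresAt;  // empty means no TTL
};

// True when the row should be handed to the embedder.
bool eligibleForSync(const EpisodicRow& row, Clock::time_point now) {
    if (row.embeddingStatus != "pending") return false;      // already processed
    if (row.sensitivity == "secret") return false;           // never embed secrets
    if (row.expiresAt && *row.expiresAt < now) return false; // TTL has elapsed
    return true;
}

// Returns the ids of all eligible rows, preserving input order.
std::vector<std::string> selectForSync(const std::vector<EpisodicRow>& rows,
                                       Clock::time_point now) {
    std::vector<std::string> ids;
    for (const auto& row : rows) {
        if (eligibleForSync(row, now)) ids.push_back(row.id);
    }
    return ids;
}
```

Passing `now` in explicitly, rather than reading the clock inside the predicate, keeps the rule easy to exercise from a contract test.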