# Target Database Schema for Kompanion Memory (v0)
**Primary**: Postgres 14+ with `pgvector` (v0.6+) for embeddings.
**Alt**: SQLite 3 + FAISS (local dev / fallback).
## Design Principles
- **Namespaces** (`project:user:thread`) partition memory and enable scoped retrieval.
- **Separation of items vs chunks**: items are logical notes/contexts; chunks are embedding units.
- **Metadata-first**: JSONB metadata with selective indexed keys; tags array.
- **Retention**: TTL via `expires_at`; soft-delete via `deleted_at`.
- **Versioning**: every write bumps a monotonically increasing `revision`; an upsert keeps the latest revision as the current view.
- **Observability**: created/updated audit timestamps; embedding model and dimension (`model`, `dim`) recorded with each vector.
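
The namespace and retention principles above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Namespace` class, its `contains` method, and the `is_visible` helper are hypothetical names, and treating a shorter namespace (for example `project:user`) as a scope that contains its threads is an assumption, not something the schema prescribes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Namespace:
    """A `project:user:thread` scope; user and thread may be omitted (assumption)."""
    project: str
    user: Optional[str] = None
    thread: Optional[str] = None

    @classmethod
    def parse(cls, raw: str) -> "Namespace":
        parts = raw.split(":")
        # Accept 1 to 3 non-empty segments; reject things like "a::c".
        if not 1 <= len(parts) <= 3 or not all(parts):
            raise ValueError(f"invalid namespace: {raw!r}")
        return cls(*parts)

    def contains(self, other: "Namespace") -> bool:
        """True if `other` falls inside this scope (unset fields match anything)."""
        return (
            self.project == other.project
            and self.user in (None, other.user)
            and self.thread in (None, other.thread)
        )


def is_visible(
    expires_at: Optional[datetime],
    deleted_at: Optional[datetime],
    now: Optional[datetime] = None,
) -> bool:
    """Retention check: soft-deleted or expired rows are hidden from retrieval."""
    now = now or datetime.now(timezone.utc)
    if deleted_at is not None:
        return False
    return expires_at is None or expires_at > now
```

A scoped query would first resolve the requested namespace, then keep only rows whose namespace it contains and for which `is_visible` holds.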
## Entities
- `namespaces` – registry of logical scopes.
- `threads` – optional conversational threads within a namespace.
- `users` – optional association to user identity.
- `memory_items` – logical items with rich metadata and raw content.
- `memory_chunks` – embedding-bearing chunks derived from items.
- `embeddings` – embedding vectors (one per chunk + model info).
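
To make the relationships between these entities concrete, here is a minimal sketch that creates them in an in-memory SQLite database (chosen so it runs without a server) and shows a revision-bumping upsert. Columns not named elsewhere in this document, such as `item_key`, `external_id`, `seq`, `chunk_text`, and `updated_at`, are hypothetical. In Postgres the same layout would presumably use `JSONB` for `metadata`, an array for `tags`, and a pgvector column on `embeddings`.

```python
import sqlite3

SCHEMA = """
CREATE TABLE namespaces (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE                 -- e.g. project:user:thread
);
CREATE TABLE users (
    id          INTEGER PRIMARY KEY,
    external_id TEXT UNIQUE                   -- hypothetical identity link
);
CREATE TABLE threads (
    id           INTEGER PRIMARY KEY,
    namespace_id INTEGER NOT NULL REFERENCES namespaces(id),
    user_id      INTEGER REFERENCES users(id)
);
CREATE TABLE memory_items (
    id           INTEGER PRIMARY KEY,
    namespace_id INTEGER NOT NULL REFERENCES namespaces(id),
    thread_id    INTEGER REFERENCES threads(id),
    item_key     TEXT NOT NULL,               -- hypothetical stable key for upserts
    content      TEXT NOT NULL,
    metadata     TEXT NOT NULL DEFAULT '{}',  -- JSON text here, JSONB in Postgres
    tags         TEXT NOT NULL DEFAULT '[]',  -- JSON array here, array in Postgres
    revision     INTEGER NOT NULL DEFAULT 1,
    created_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    expires_at   TEXT,                        -- TTL
    deleted_at   TEXT,                        -- soft delete
    UNIQUE (namespace_id, item_key)
);
CREATE TABLE memory_chunks (
    id         INTEGER PRIMARY KEY,
    item_id    INTEGER NOT NULL REFERENCES memory_items(id),
    seq        INTEGER NOT NULL,              -- position of the chunk in its item
    chunk_text TEXT NOT NULL
);
CREATE TABLE embeddings (
    chunk_id INTEGER NOT NULL REFERENCES memory_chunks(id),
    model    TEXT NOT NULL,
    dim      INTEGER NOT NULL,
    PRIMARY KEY (chunk_id, model)             -- the vector itself lives elsewhere
);
"""

UPSERT_ITEM = """
INSERT INTO memory_items (namespace_id, item_key, content)
VALUES (?, ?, ?)
ON CONFLICT (namespace_id, item_key) DO UPDATE SET
    content    = excluded.content,
    revision   = revision + 1,
    updated_at = CURRENT_TIMESTAMP
"""


def upsert_item(conn: sqlite3.Connection, namespace_id: int, item_key: str, content: str) -> int:
    """Insert or update an item and return its current revision."""
    conn.execute(UPSERT_ITEM, (namespace_id, item_key, content))
    row = conn.execute(
        "SELECT revision FROM memory_items WHERE namespace_id = ? AND item_key = ?",
        (namespace_id, item_key),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO namespaces (id, name) VALUES (1, 'demo:alice:t1')")
first = upsert_item(conn, 1, "note-1", "first draft")    # new row, revision 1
second = upsert_item(conn, 1, "note-1", "second draft")  # same key, revision 2
```

The upsert keeps a single row per `(namespace_id, item_key)`, so reads always see the latest revision. Keeping a history of older revisions would need a separate table, which this sketch leaves out.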
## Retrieval Flow
1) Query text → embed → ANN search on `embeddings.vector` (filtered by namespace/thread/tags/metadata).
2) Join the matching chunks back to `memory_items` (via `memory_chunks`) to assemble content and metadata.
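
The two steps above can be sketched in plain Python. This is a toy illustration, not the intended implementation: an exhaustive cosine-similarity scan stands in for the ANN search, the query is assumed to be embedded already, and the `Item`, `Chunk`, and `retrieve` names are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Item:
    id: int
    namespace: str
    content: str
    tags: set = field(default_factory=set)


@dataclass
class Chunk:
    id: int
    item_id: int
    vector: list  # embedding of this chunk


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query_vec, items, chunks, namespace, tags=frozenset(), k=3):
    """Return up to k (item, score) pairs, best first."""
    by_id = {it.id: it for it in items}
    # Step 1: filter by namespace and tags, then rank chunks by similarity.
    scored = []
    for ch in chunks:
        it = by_id[ch.item_id]
        if it.namespace != namespace or not set(tags) <= it.tags:
            continue
        scored.append((cosine(query_vec, ch.vector), ch))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Step 2: join back to items, keeping each item's best-scoring chunk.
    results, seen = [], set()
    for score, ch in scored:
        if ch.item_id in seen:
            continue
        seen.add(ch.item_id)
        results.append((by_id[ch.item_id], score))
        if len(results) == k:
            break
    return results


items = [
    Item(1, "demo:alice:t1", "Postgres notes", {"db"}),
    Item(2, "demo:alice:t1", "Cooking notes", {"food"}),
    Item(3, "demo:bob:t9", "Another user's db note", {"db"}),
]
chunks = [
    Chunk(10, 1, [1.0, 0.0]),
    Chunk(11, 1, [0.8, 0.6]),
    Chunk(20, 2, [0.0, 1.0]),
    Chunk(30, 3, [1.0, 0.0]),
]
hits = retrieve([1.0, 0.1], items, chunks, namespace="demo:alice:t1")
```

Here filtering happens before ranking, which mirrors a filtered ANN query. Because several chunks can belong to one item, the join step deduplicates by item so a long note does not crowd out the rest of the results.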
## Indexing
- `embeddings`: `USING ivfflat (vector) WITH (lists = 100)` as a starting point (tune `lists` to the table size), plus btree on `(model, dim)`.
- `memory_items`: GIN on `metadata`, GIN on `tags`, btree on `(namespace_id, thread_id, created_at)`.
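
As a rough aid for the "tune" note, the sketch below generates the index statements with a `lists` value derived from the row count. The heuristic follows the pgvector README's suggested starting point (rows / 1000 up to one million rows, the square root of the row count beyond that). The index names and the `vector_cosine_ops` operator class are assumptions; this document does not say which distance metric is intended.

```python
import math


def suggest_ivfflat_lists(row_count: int) -> int:
    """Starting value for `lists`; measure recall and latency before settling."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return int(math.sqrt(row_count))


def index_ddl(row_count: int) -> list[str]:
    """Build CREATE INDEX statements for the Postgres variant of the schema."""
    lists = suggest_ivfflat_lists(row_count)
    return [
        # Operator class is an assumption; pick the one matching the query operator.
        "CREATE INDEX embeddings_vector_ivfflat ON embeddings "
        f"USING ivfflat (vector vector_cosine_ops) WITH (lists = {lists});",
        "CREATE INDEX embeddings_model_dim ON embeddings (model, dim);",
        "CREATE INDEX memory_items_metadata_gin ON memory_items USING gin (metadata);",
        "CREATE INDEX memory_items_tags_gin ON memory_items USING gin (tags);",
        "CREATE INDEX memory_items_scope ON memory_items "
        "(namespace_id, thread_id, created_at);",
    ]


for statement in index_ddl(100_000):
    print(statement)
```

With 100,000 rows the heuristic lands on `lists = 100`, the value quoted above. An IVFFlat index is best built after the table holds representative data, since the list centroids are computed at build time.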
## SQLite Mapping
- Same tables minus the vector column; store vectors in a sidecar FAISS index keyed by `chunk_id`. Keep the two stores consistent with trigger-like hooks in the application layer on every write and delete.
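
One way to picture the application-layer hooks is a small wrapper that owns both writes. In this sketch a dictionary stands in for the FAISS index so the example stays self-contained; a real setup would presumably wrap an ID-mapped FAISS index instead. `SidecarIndex`, `ChunkStore`, and the `chunk_text` column are hypothetical names.

```python
import sqlite3


class SidecarIndex:
    """Stand-in for a FAISS index keyed by chunk_id (illustrative only)."""

    def __init__(self):
        self.vectors = {}

    def add(self, chunk_id, vector):
        self.vectors[chunk_id] = list(vector)

    def remove(self, chunk_id):
        self.vectors.pop(chunk_id, None)


class ChunkStore:
    """Routes every chunk write through one place so both stores stay in step."""

    def __init__(self, conn: sqlite3.Connection, index: SidecarIndex):
        self.conn, self.index = conn, index
        conn.execute(
            "CREATE TABLE IF NOT EXISTS memory_chunks ("
            "id INTEGER PRIMARY KEY, item_id INTEGER NOT NULL, chunk_text TEXT NOT NULL)"
        )

    def add_chunk(self, item_id: int, text: str, vector) -> int:
        with self.conn:  # commits on success, rolls back if anything raises
            cur = self.conn.execute(
                "INSERT INTO memory_chunks (item_id, chunk_text) VALUES (?, ?)",
                (item_id, text),
            )
            self.index.add(cur.lastrowid, vector)  # a failure here undoes the row
            return cur.lastrowid

    def delete_chunk(self, chunk_id: int) -> None:
        with self.conn:
            self.conn.execute("DELETE FROM memory_chunks WHERE id = ?", (chunk_id,))
        self.index.remove(chunk_id)  # only after the row delete has committed

    def orphans(self) -> set:
        """chunk_ids present in only one store; input for a periodic repair pass."""
        rows = {r[0] for r in self.conn.execute("SELECT id FROM memory_chunks")}
        return rows ^ set(self.index.vectors)


store = ChunkStore(sqlite3.connect(":memory:"), SidecarIndex())
chunk_id = store.add_chunk(item_id=1, text="hello", vector=[0.1, 0.2])
```

The two stores cannot share a transaction, so a crash between the row write and the index write can still leave them out of step. A reconciliation check such as `orphans()` is one way to detect and repair that drift.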
## Open Questions
- Hybrid search strategy (BM25 + vector): deferred to v1.
- Eventing for cache warms and eviction.
- Encryption at rest and PII handling.