metal-kompanion/docs/db-schema.md

Target Database Schema for Kompanion Memory (v0)

Primary: Postgres 14+ with pgvector (v0.6+) for embeddings. Alternative: SQLite 3 with a FAISS sidecar index (local development or fallback).

Design Principles

  • Namespaces (project:user:thread) partition memory and enable scoped retrieval.
  • Separation of items vs chunks: items are logical notes/contexts; chunks are embedding units.
  • Metadata-first: JSONB metadata with selectively indexed keys, plus a tags array.
  • Retention: TTL via expires_at; soft-delete via deleted_at.
  • Versioning: monotonically increasing revision; the latest revision is maintained via upsert.
  • Observability: created/updated audit timestamps; model and dimension (dim) recorded for each embedding.
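The retention and versioning principles can be sketched against SQLite with Python's standard `sqlite3` module. This is a minimal illustration under stated assumptions, not the project's implementation: the reduced `memory_items` table, the `item_key` column, the epoch-seconds timestamps, and the helper names `upsert_item` and `live_items` are all hypothetical. It reads "latest view via upsert" as one row per logical item whose revision is bumped on every write, and it needs SQLite 3.24 or newer for `ON CONFLICT ... DO UPDATE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical table: only the columns needed to show TTL,
# soft-delete, and revision bumping.
conn.execute("""
    CREATE TABLE memory_items (
        namespace  TEXT NOT NULL,     -- e.g. 'project:user:thread'
        item_key   TEXT NOT NULL,     -- assumed stable key per logical item
        revision   INTEGER NOT NULL,
        content    TEXT NOT NULL,
        expires_at REAL,              -- TTL as epoch seconds, NULL = no expiry
        deleted_at REAL,              -- soft-delete marker, NULL = live
        PRIMARY KEY (namespace, item_key)
    )
""")

def upsert_item(namespace, item_key, content, expires_at=None):
    """Insert a new item, or overwrite the existing one and bump its revision."""
    conn.execute(
        """
        INSERT INTO memory_items (namespace, item_key, revision, content, expires_at)
        VALUES (?, ?, 1, ?, ?)
        ON CONFLICT (namespace, item_key) DO UPDATE SET
            revision   = memory_items.revision + 1,
            content    = excluded.content,
            expires_at = excluded.expires_at,
            deleted_at = NULL
        """,
        (namespace, item_key, content, expires_at),
    )

def live_items(namespace, now):
    """Return (item_key, revision, content) rows that are neither expired nor soft-deleted."""
    return conn.execute(
        """
        SELECT item_key, revision, content FROM memory_items
        WHERE namespace = ?
          AND deleted_at IS NULL
          AND (expires_at IS NULL OR expires_at > ?)
        ORDER BY item_key
        """,
        (namespace, now),
    ).fetchall()
```

Filtering on `deleted_at` and `expires_at` at read time keeps deletion cheap; a separate sweep could physically remove expired rows later.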

Entities

  • namespaces: registry of logical scopes.
  • threads: optional conversational threads within a namespace.
  • users: optional association with a user identity.
  • memory_items: logical items with rich metadata and raw content.
  • memory_chunks: embedding-bearing chunks derived from items.
  • embeddings: embedding vectors (one per chunk, plus model info).
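One possible shape for these six tables, written as SQLite DDL and loaded through Python's `sqlite3` module so it can be run locally. Only the table names come from this document; every column name, type, and constraint below is an assumption made for illustration. The `embeddings` table deliberately has no vector column, matching the SQLite variant; under Postgres it would additionally carry a pgvector column.

```python
import sqlite3

# Hypothetical column layout; only the table names are taken from the schema notes.
SCHEMA = """
CREATE TABLE namespaces (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE            -- e.g. 'project:user:thread'
);
CREATE TABLE users (
    id          INTEGER PRIMARY KEY,
    external_id TEXT UNIQUE              -- link to an outside identity, optional
);
CREATE TABLE threads (
    id           INTEGER PRIMARY KEY,
    namespace_id INTEGER NOT NULL REFERENCES namespaces(id)
);
CREATE TABLE memory_items (
    id           INTEGER PRIMARY KEY,
    namespace_id INTEGER NOT NULL REFERENCES namespaces(id),
    thread_id    INTEGER REFERENCES threads(id),
    user_id      INTEGER REFERENCES users(id),
    content      TEXT NOT NULL,
    metadata     TEXT NOT NULL DEFAULT '{}',   -- JSON text (JSONB in Postgres)
    tags         TEXT NOT NULL DEFAULT '[]',   -- JSON array (text[] in Postgres)
    revision     INTEGER NOT NULL DEFAULT 1,
    created_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    expires_at   TEXT,
    deleted_at   TEXT
);
CREATE TABLE memory_chunks (
    id      INTEGER PRIMARY KEY,
    item_id INTEGER NOT NULL REFERENCES memory_items(id),
    ord     INTEGER NOT NULL,            -- position of the chunk within its item
    text    TEXT NOT NULL
);
CREATE TABLE embeddings (
    chunk_id INTEGER PRIMARY KEY REFERENCES memory_chunks(id),  -- one per chunk
    model    TEXT NOT NULL,
    dim      INTEGER NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# List the tables that were created, sorted by name.
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```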

Retrieval Flow

  1. Query text → embed → ANN search on embeddings.vector (filtered by namespace/thread/tags/metadata).
  2. Join the matching embeddings back through memory_chunks to memory_items to assemble content and metadata.
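The two steps can be sketched in plain Python. This is an illustration only: it assumes the query has already been embedded, uses an exhaustive cosine-similarity scan as a stand-in for the ANN index, and filters by namespace alone. The toy rows and the `retrieve` helper are made up for the example.

```python
import math

# Toy in-memory rows standing in for the tables (all values are made up).
memory_items = {
    1: {"namespace": "proj:alice:t1", "content": "Deploy notes for the API"},
    2: {"namespace": "proj:alice:t1", "content": "Grocery list"},
    3: {"namespace": "proj:bob:t9", "content": "Deploy notes for the worker"},
}
memory_chunks = {10: {"item_id": 1}, 20: {"item_id": 2}, 30: {"item_id": 3}}
embeddings = {10: [1.0, 0.0], 20: [0.0, 1.0], 30: [0.9, 0.1]}  # keyed by chunk_id

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vector, namespace, k=5):
    """Score every chunk in the namespace, then join back to its item.

    Exhaustive scan, not ANN: a real deployment would let the vector index
    do the ranking and apply the namespace filter in the same query.
    """
    scored = []
    for chunk_id, vector in embeddings.items():
        item = memory_items[memory_chunks[chunk_id]["item_id"]]
        if item["namespace"] != namespace:
            continue  # scoped retrieval: skip other namespaces
        scored.append((cosine(query_vector, vector), chunk_id, item["content"]))
    scored.sort(key=lambda row: row[0], reverse=True)
    return scored[:k]
```

Calling `retrieve([1.0, 0.1], "proj:alice:t1", k=1)` ranks chunk 10 first, and chunk 30 is never considered because it belongs to another namespace.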

Indexing

  • embeddings: USING ivfflat (vector) WITH (lists=100) as a starting point (tune lists to the table size), plus btree on (model, dim).
  • memory_items: GIN on metadata, GIN on tags, btree on (namespace_id, thread_id, created_at).
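A small helper that emits the corresponding Postgres DDL can make the tuning knob explicit. This is a sketch under assumptions: the index names and the `vector_cosine_ops` operator class are choices made here (the distance metric is not specified in this document), and `suggest_lists` follows the starting-point rule of thumb from the pgvector README (rows / 1000 up to 1M rows, sqrt(rows) beyond that). The statements are only built as strings, not executed.

```python
import math

def suggest_lists(row_count):
    """Starting value for ivfflat `lists`, per the pgvector README rule of thumb."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return max(1, round(math.sqrt(row_count)))

def index_ddl(row_count):
    """Return CREATE INDEX statements for the indexes listed above.

    Index names and the cosine operator class are assumptions.
    """
    lists = suggest_lists(row_count)
    return [
        "CREATE INDEX embeddings_vector_idx ON embeddings "
        f"USING ivfflat (vector vector_cosine_ops) WITH (lists = {lists});",
        "CREATE INDEX embeddings_model_dim_idx ON embeddings (model, dim);",
        "CREATE INDEX memory_items_metadata_idx ON memory_items USING gin (metadata);",
        "CREATE INDEX memory_items_tags_idx ON memory_items USING gin (tags);",
        "CREATE INDEX memory_items_scope_idx ON memory_items "
        "(namespace_id, thread_id, created_at);",
    ]

for statement in index_ddl(100_000):
    print(statement)
```

With roughly 100,000 embedding rows the helper lands on lists = 100, which matches the default quoted above; an ivfflat index is best built after the table holds representative data.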

SQLite Mapping

  • Same tables without the vector column; store vectors in a sidecar FAISS index keyed by chunk_id. Maintain consistency with trigger-like hooks in the application layer, since the FAISS index lives outside the database.
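One way to keep the sidecar in step with the table is to route every write through a single application-level wrapper. The sketch below is hypothetical: `SidecarIndex` is a plain dict standing in for a FAISS index wrapped with chunk_id keys (it does not call FAISS), and `ChunkStore` with its methods is invented for illustration.

```python
import sqlite3

class SidecarIndex:
    """Stand-in for a FAISS index keyed by chunk_id (a dict, not FAISS)."""

    def __init__(self):
        self.vectors = {}

    def add(self, chunk_id, vector):
        self.vectors[chunk_id] = list(vector)

    def remove(self, chunk_id):
        self.vectors.pop(chunk_id, None)

class ChunkStore:
    """Writes rows and vectors together so the two stores do not drift apart."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE memory_chunks (id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
        )
        self.index = SidecarIndex()

    def add_chunk(self, text, vector):
        """Insert the row and its vector; the row is rolled back if indexing raises."""
        with self.conn:  # commits on success, rolls back on exception
            cursor = self.conn.execute(
                "INSERT INTO memory_chunks (text) VALUES (?)", (text,)
            )
            chunk_id = cursor.lastrowid
            self.index.add(chunk_id, vector)
        return chunk_id

    def delete_chunk(self, chunk_id):
        """Delete the row first, then drop the vector from the sidecar."""
        with self.conn:
            self.conn.execute("DELETE FROM memory_chunks WHERE id = ?", (chunk_id,))
        self.index.remove(chunk_id)

    def is_consistent(self):
        """True when the table and the sidecar hold exactly the same chunk ids."""
        row_ids = {row[0] for row in self.conn.execute("SELECT id FROM memory_chunks")}
        return row_ids == set(self.index.vectors)
```

A periodic `is_consistent`-style check is a cheap safety net for crashes that land between the database commit and the index update.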

Open Questions

  • Hybrid search strategy (BM25 + vector): defer to v1.
  • Eventing for cache warming and eviction.
  • Encryption at rest and PII handling.