Design Decision: Local-First Personal Store with Optional Federated Services
Decision: Kompanion adopts a two-tier architecture. A personal, local store (Akonadi-like) is the authoritative home of a user's data and operates fully offline. An optional federated layer provides encrypted backups, multi-device sync, and paid cloud conveniences (e.g., hosted search/rerank). Users can run purely local, or selectively enable cloud features.
Encryption Note: We deliberately leave the exact cryptography suite open to allow hardware/OS keychains, libsodium, AES-GCM, or XChaCha20-Poly1305. The guardrails below assume end-to-end encryption (E2EE) with keys controlled by the user.
1) Personal Store (Local Core) — kom.local.v1
- Runs entirely on-device; no network required.
- DB: SQLite (+ FTS/trigram for "rgrep" feel) + FAISS for vectors.
- Embeddings/Reranker: local (Ollama + optional local reranker).
- Privacy defaults: do-not-embed secrets; private-vault items are never vectorized/FTS'd; E2EE for backups/exports.
- Backup tools: backup.export_encrypted, backup.import_encrypted (E2EE blobs).
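The privacy defaults above can be sketched as a small gate that runs before any local indexing. This is a minimal illustration, not Kompanion's actual pipeline: the `Item` type, the `private_vault` flag, the `index_plan` helper, and the single regex are all hypothetical. The source does not say how `private` items are indexed, so this sketch treats them like `normal` ones.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern; a real pipeline would cover far more secret shapes.
SECRET_RE = re.compile(r"(?i)\b(password|passwd|api[_-]?key|token)(\s*[:=]\s*)\S+")


@dataclass
class Item:
    key: str
    text: str
    sensitivity: str = "normal"   # secret | private | normal
    private_vault: bool = False   # assumed flag for private-vault mode


def redact(text: str) -> str:
    """Mask the value in `name=value` / `name: value` style secrets."""
    return SECRET_RE.sub(r"\1\2[REDACTED]", text)


def index_plan(item: Item) -> dict:
    """Decide which local indexes an item may join, and with what text."""
    if item.sensitivity == "secret" or item.private_vault:
        # Key-only retrieval: stored encrypted, never FTS'd or embedded.
        return {"fts": False, "embed": False, "text": None}
    # Everything else is indexed, but only after redaction.
    return {"fts": True, "embed": True, "text": redact(item.text)}
```

Keeping the decision in one function means the FTS writer and the embedder cannot disagree about whether an item is indexable.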
2) Federated Services (Optional) — kom.cloud.v1
- Adds encrypted sync, cloud backup, micropayment-backed hosted compute (e.g., heavy reranking), and optional hosted pgvector search.
- Server sees ciphertext plus minimal metadata; hosted search is opt-in and may store embeddings either encrypted or, only by explicit consent, in plaintext.
- Per-namespace tenancy and isolation (RLS when using Postgres).
3) Key & Auth Model
- Users need only retain access to their authentication credentials and secret store; Kompanion handles day-to-day operations.
- Device enrollment shares/wraps keys securely (mechanism TBD; QR/device handoff).
- Key rotation and export are first-class; backups are always encrypted client-side.
4) Search Modes
- Lexical: FTS + trigram, scoped to namespace/thread/user; grep-like snippets.
- Semantic: vector ANN with local reranker by default.
- Hybrid: configurable orchestration; always respects scope and privacy flags.
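One way to read "configurable orchestration" is a weighted blend of a lexical and a semantic score, applied only after scope and privacy filtering. The sketch below is an assumption-heavy illustration: `Doc`, `hybrid_search`, and the `alpha` weight are hypothetical names, term overlap stands in for FTS/trigram ranking, and brute-force cosine similarity stands in for the ANN index and reranker.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    doc_id: str
    namespace: str
    text: str
    vector: List[float]
    sensitivity: str = "normal"


def lexical_score(query: str, text: str) -> float:
    """Fraction of query terms present in the text (stand-in for FTS rank)."""
    terms = query.lower().split()
    words = set(text.lower().split())
    return sum(t in words for t in terms) / len(terms) if terms else 0.0


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(docs, query, query_vec, namespace, alpha=0.5, k=5):
    """Return up to k doc ids from one namespace, best score first."""
    # Scope and privacy filtering happens before any scoring.
    candidates = [d for d in docs
                  if d.namespace == namespace and d.sensitivity != "secret"]
    scored = [(alpha * lexical_score(query, d.text)
               + (1 - alpha) * cosine(query_vec, d.vector), d.doc_id)
              for d in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]
```

Setting `alpha=1.0` degrades to purely lexical ranking and `alpha=0.0` to purely semantic ranking, which makes the three modes a single code path in this sketch.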
5) Privacy Controls
- Sensitivity flags: metadata.sensitivity = secret|private|normal. secret items: E2EE only (no FTS, no embeddings).
- Server-side scope injection (namespace/user) in all handlers; default-deny posture.
- Purge policy: soft-delete + scheduled hard-delete; cascades to chunks/embeddings and remote copies.
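The purge policy can be illustrated on the local SQLite store with foreign-key cascades. This is a sketch under assumptions: the table and column names (`items`, `chunks`, `embeddings`, `deleted_at`) and the grace period are hypothetical, and propagating the delete to remote copies is out of scope here.

```python
import sqlite3

SCHEMA = """
CREATE TABLE items (
    id INTEGER PRIMARY KEY,
    namespace TEXT NOT NULL,
    body TEXT NOT NULL,
    deleted_at REAL            -- NULL means live; set on soft-delete
);
CREATE TABLE chunks (
    id INTEGER PRIMARY KEY,
    item_id INTEGER NOT NULL REFERENCES items(id) ON DELETE CASCADE,
    text TEXT NOT NULL
);
CREATE TABLE embeddings (
    chunk_id INTEGER PRIMARY KEY REFERENCES chunks(id) ON DELETE CASCADE,
    vector BLOB NOT NULL
);
"""


def open_store(path=":memory:"):
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys (and cascades) when this is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def soft_delete(conn, item_id, now):
    """Mark an item deleted; it stays on disk until the grace period ends."""
    conn.execute(
        "UPDATE items SET deleted_at = ? WHERE id = ? AND deleted_at IS NULL",
        (now, item_id),
    )
    conn.commit()


def hard_delete_expired(conn, now, grace_seconds):
    """Remove soft-deleted items past the grace period; cascades do the rest."""
    cur = conn.execute(
        "DELETE FROM items WHERE deleted_at IS NOT NULL AND deleted_at <= ?",
        (now - grace_seconds,),
    )
    conn.commit()
    return cur.rowcount  # counts items only, not cascaded child rows
```

A scheduler would call `hard_delete_expired` periodically. Read paths would also need a `deleted_at IS NULL` filter so soft-deleted items disappear from results immediately.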
6) Compatibility with Postgres+pgvector
- When cloud search is enabled, a hosted Postgres+pgvector instance enforces isolation via RLS and per-namespace session GUCs.
- Local SQLite store remains the source of truth unless user opts to delegate search to cloud.
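For the cloud path, the combination of RLS and a per-namespace session GUC might look roughly like the following. The SQL is held in Python strings so it can sit next to the handler code. The `items` table, the `namespace_id` column, the `kom.namespace_id` setting name, and the `scope_statement` helper are assumptions for illustration, and the `%s` placeholder assumes a psycopg-style driver.

```python
# DDL a migration could apply once (hypothetical table/column/GUC names).
RLS_SETUP = """
ALTER TABLE items ENABLE ROW LEVEL SECURITY;
ALTER TABLE items FORCE ROW LEVEL SECURITY;
CREATE POLICY ns_isolation ON items
    USING (namespace_id::text = current_setting('kom.namespace_id', true))
    WITH CHECK (namespace_id::text = current_setting('kom.namespace_id', true));
"""


def scope_statement(namespace_id: str):
    """Build the statement a handler runs first inside its transaction.

    The third set_config argument (is_local = true) limits the setting to
    the current transaction, so pooled connections do not leak scope.
    """
    if not namespace_id or not namespace_id.strip():
        # Default-deny: never run a handler without an explicit scope.
        raise ValueError("refusing to run a handler without a namespace scope")
    return "SELECT set_config('kom.namespace_id', %s, true)", (namespace_id,)
```

With `current_setting(..., true)`, an unset GUC yields NULL, so the policy comparison is never true and no rows are visible. That matches the default-deny posture without extra code.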
Action List (from privacy review)
- DB hardening (cloud path): add RLS policies; add FTS + pg_trgm; unique (namespace_id, key); partial ANN indexes per model.
- Server enforcement: inject namespace/user via session context (GUCs); default-deny widening; rate limits.
- Redaction pipeline: protect secrets before embedding; skip embedding/FTS for secret items.
- Private vault mode: key-only retrieval paths for sensitive items (no index participation).
- Backups: define E2EE export/import format; provider adapters (e.g., Google Drive) use pre-encrypted blobs.
- Sync: event-log format (append-only); conflict rules; device enrollment + key wrapping; later CRDT if needed.
- Purging: scheduled hard-deletes; admin "nuke namespace/user" procedure.
- Tests: cross-tenant leakage, redaction invariants, purge/TTL, hybrid-vs-lexical, hosted-vs-local parity.
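As a starting point for the sync item above, an append-only event log with a simple conflict rule could be sketched like this. The conflict rule is not specified in this document; last-writer-wins ordered by `(counter, device_id)` is an assumption chosen for brevity, and `Event` and `EventLog` are hypothetical names. Payloads would be ciphertext in practice; plain strings are used here for readability.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    counter: int            # logical clock, incremented per write
    device_id: str          # tie-breaker when counters collide
    key: str
    value: Optional[str]    # None acts as a delete tombstone


class EventLog:
    def __init__(self):
        self._events = []

    def append(self, event: Event) -> None:
        """Events are only ever added, never edited or removed."""
        self._events.append(event)

    def merge(self, other: "EventLog") -> None:
        """Pull in events from another device; safe to repeat."""
        seen = set(self._events)
        for ev in other._events:
            if ev not in seen:
                self._events.append(ev)
                seen.add(ev)

    def __len__(self) -> int:
        return len(self._events)

    def materialize(self) -> dict:
        """Replay the log; the highest (counter, device_id) wins per key."""
        winners = {}
        for ev in sorted(self._events, key=lambda e: (e.counter, e.device_id)):
            winners[ev.key] = ev
        return {k: ev.value for k, ev in winners.items() if ev.value is not None}
```

Because replay sorts events before applying them, two devices that hold the same set of events materialize the same state regardless of merge order. That property is what a later CRDT design would also need to preserve.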
Files to Watch
- docs/db-schema.md, sql/pg/001_init.sql (cloud path)
- src/mcp/ToolSchemas.json and MCP handlers (scope + sensitivity gates)
- kom.local.v1.backup.*, kom.cloud.v1.* (new tool surfaces)