# Design Decision: Local-First Personal Store with Optional Federated Services
**Decision**: Kompanion adopts a **two-tier architecture**. A personal, local store (Akonadi-like) is the *authoritative home* of a user's data and operates fully offline. An optional federated layer provides encrypted backups, multi-device sync, and paid cloud conveniences (e.g., hosted search/rerank). Users can run **purely local**, or selectively enable cloud features.
**Encryption Note**: We deliberately leave the *exact cryptography suite* open to allow hardware/OS keychains, libsodium, AES-GCM, or XChaCha20-Poly1305. The guardrails below assume **end-to-end encryption (E2EE)** with keys controlled by the user.
---
## 1) Personal Store (Local Core) — `kom.local.v1`
- Runs entirely on-device; no network required.
- DB: SQLite (+ FTS/trigram for "rgrep" feel) + FAISS for vectors.
- Embeddings/Reranker: local (Ollama + optional local reranker).
- Privacy defaults: do-not-embed secrets; private-vault items are never vectorized/FTS'd; E2EE for backups/exports.
- Backup tools: `backup.export_encrypted`, `backup.import_encrypted` (E2EE blobs).
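
To make the indexing rule concrete, here is a minimal sketch of a local store that keeps `secret` items out of the full-text index. It assumes an SQLite build with FTS5 and uses the default tokenizer rather than trigram. The table names, column names, and the `open_store`/`put`/`search` helpers are hypothetical, and the sketch assumes callers pass already-encrypted content for `secret` items.

```python
import sqlite3

# Assumption: only `secret` is excluded from indexing; what `private`
# implies for indexing is not settled here.
INDEXABLE = {"normal", "private"}

def open_store(path=":memory:"):
    """Create the item table and a separate FTS5 index (hypothetical schema)."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS items (
            id INTEGER PRIMARY KEY,
            namespace TEXT NOT NULL,
            key TEXT NOT NULL,
            body TEXT NOT NULL,
            sensitivity TEXT NOT NULL DEFAULT 'normal',
            UNIQUE (namespace, key)
        );
        CREATE VIRTUAL TABLE IF NOT EXISTS items_fts USING fts5(body);
    """)
    return db

def put(db, namespace, key, body, sensitivity="normal"):
    """Store an item; index it for lexical search only if its flag allows."""
    cur = db.execute(
        "INSERT INTO items (namespace, key, body, sensitivity) VALUES (?, ?, ?, ?)",
        (namespace, key, body, sensitivity),
    )
    if sensitivity in INDEXABLE:
        db.execute(
            "INSERT INTO items_fts (rowid, body) VALUES (?, ?)",
            (cur.lastrowid, body),
        )
    return cur.lastrowid

def search(db, namespace, query):
    """Lexical search scoped to one namespace, best match first."""
    rows = db.execute(
        "SELECT i.key FROM items_fts JOIN items AS i ON i.id = items_fts.rowid "
        "WHERE items_fts MATCH ? AND i.namespace = ? ORDER BY rank",
        (query, namespace),
    )
    return [row[0] for row in rows]
```

Because `secret` rows never enter `items_fts`, they can only be reached by key, which matches the key-only retrieval idea for vault items.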
## 2) Federated Services (Optional) — `kom.cloud.v1`
- Adds encrypted sync, cloud backup, micropayment-backed hosted compute (e.g., heavy reranking), and optional hosted pgvector search.
- Server sees ciphertext plus minimal metadata; hosted search is opt-in. Embeddings stored for hosted search are encrypted by default and kept in plaintext **only by explicit consent**.
- Per-namespace tenancy and isolation (RLS when using Postgres).
## 3) Key & Auth Model
- Users may choose to **retain only authentication/secret-store access**; Kompanion handles day-to-day operations.
- Device enrollment shares/wraps keys securely (mechanism TBD; candidates include QR code or device-to-device handoff).
- Key rotation and export are first-class; backups are always encrypted client-side.
## 4) Search Modes
- **Lexical**: FTS + trigram, scoped to namespace/thread/user; grep-like snippets.
- **Semantic**: vector ANN with local reranker by default.
- **Hybrid**: configurable orchestration; always respects scope and privacy flags.
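
One possible orchestration for the hybrid mode is reciprocal rank fusion (RRF). This is an assumption for illustration, since the fusion method is not fixed here. The sketch takes two ranked lists of item ids, drops anything outside the caller's allowed scope before ranking, and merges the rest. The `rrf_merge` name, the `allowed` parameter, and the `k=60` default are hypothetical choices.

```python
from collections import defaultdict

def rrf_merge(lexical, semantic, k=60, allowed=None):
    """Fuse two ranked id lists with reciprocal rank fusion.

    `allowed` is an optional set of ids the caller may see; ids outside
    it are removed before ranks are assigned, so they cannot influence
    the result.
    """
    scores = defaultdict(float)
    for ranking in (lexical, semantic):
        visible = [d for d in ranking if allowed is None or d in allowed]
        for rank, doc_id in enumerate(visible, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Filtering before ranking, rather than after merging, keeps out-of-scope items from shifting the ranks of visible ones.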
## 5) Privacy Controls
- Sensitivity flags: `metadata.sensitivity = secret|private|normal`.
- `secret` items: E2EE only (no FTS, no embeddings).
- Server-side scope injection (namespace/user) in all handlers; default-deny posture.
- Purge policy: soft-delete + scheduled hard-delete; cascades to chunks/embeddings and remote copies.
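
The purge policy above could look roughly like the following on the local SQLite side. This is a sketch under assumptions: the `items` and `chunks` tables, the `deleted_at` column, and the grace-period parameter are all hypothetical, and propagating the delete to remote copies is not shown.

```python
import sqlite3

def open_db():
    """In-memory store with chunks that cascade when their item is deleted."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # SQLite enforces cascades only when enabled
    db.executescript("""
        CREATE TABLE items (
            id INTEGER PRIMARY KEY,
            body TEXT NOT NULL,
            deleted_at REAL  -- NULL while live; Unix time once soft-deleted
        );
        CREATE TABLE chunks (
            id INTEGER PRIMARY KEY,
            item_id INTEGER NOT NULL REFERENCES items(id) ON DELETE CASCADE,
            text TEXT NOT NULL
        );
    """)
    return db

def soft_delete(db, item_id, now):
    """Mark an item as deleted without removing any data yet."""
    db.execute(
        "UPDATE items SET deleted_at = ? WHERE id = ? AND deleted_at IS NULL",
        (now, item_id),
    )

def hard_delete_expired(db, now, grace_seconds):
    """Remove items soft-deleted at least `grace_seconds` ago.

    Returns the number of items removed; their chunks go with them
    through the foreign-key cascade.
    """
    cur = db.execute(
        "DELETE FROM items WHERE deleted_at IS NOT NULL AND deleted_at <= ?",
        (now - grace_seconds,),
    )
    db.commit()
    return cur.rowcount
```

A scheduler would call `hard_delete_expired` periodically; embeddings held outside SQLite (for example in a FAISS index) would need their own removal step keyed on the same item ids.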
## 6) Compatibility with Postgres+pgvector
- When cloud search is enabled, a hosted Postgres+pgvector instance enforces isolation via RLS and per-namespace session GUCs.
- Local SQLite store remains the source of truth unless user opts to delegate search to cloud.
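
As a rough illustration of the RLS-plus-GUC approach, the helper below generates the PostgreSQL statements a cloud deployment might apply. The GUC name `kom.namespace_id`, the `namespace_id` column, the policy name, and both helper functions are assumptions made for this sketch, not the actual schema. The statements are built as strings here and would be executed through whatever Postgres driver the server uses.

```python
def rls_statements(table):
    """Return DDL that restricts `table` to the namespace set in the session."""
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    # current_setting(..., true) yields NULL when the GUC is unset, so the
    # comparison is never true and no rows are visible (default deny).
    predicate = "namespace_id::text = current_setting('kom.namespace_id', true)"
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY",
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY",
        f"CREATE POLICY {table}_ns_isolation ON {table} "
        f"USING ({predicate}) WITH CHECK ({predicate})",
    ]

def scope_statement(namespace_id):
    """Return (sql, params) that pins the namespace for the current transaction.

    The third argument `true` makes the setting transaction-local, so it
    does not leak to other requests sharing a pooled connection.
    """
    return ("SELECT set_config('kom.namespace_id', %s, true)", (str(namespace_id),))
```

A handler would run `scope_statement(...)` at the start of each transaction, before any query touches tenant data.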
---
## Action List (from privacy review)
1. **DB hardening (cloud path)**: add RLS policies; add FTS + pg_trgm; unique `(namespace_id, key)`; partial ANN indexes per model.
2. **Server enforcement**: inject namespace/user via session context (GUCs); deny scope widening by default; rate limits.
3. **Redaction pipeline**: protect secrets before embedding; skip embedding/FTS for `secret` items.
4. **Private vault mode**: key-only retrieval paths for sensitive items (no index participation).
5. **Backups**: define E2EE export/import format; provider adapters (e.g., Google Drive) use pre-encrypted blobs.
6. **Sync**: event-log format (append-only); conflict rules; device enrollment + key wrapping; later CRDT if needed.
7. **Purging**: scheduled hard-deletes; admin "nuke namespace/user" procedure.
8. **Tests**: cross-tenant leakage, redaction invariants, purge/TTL, hybrid-vs-lexical, hosted-vs-local parity.
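
Item 3 (redaction before embedding) could start from something like the sketch below. The patterns are hypothetical examples only and would need a vetted list in practice. The item shape (`text` plus `metadata.sensitivity`) follows the sensitivity flag defined above, while the function names are assumptions.

```python
import re

# Hypothetical patterns for illustration; not a complete secret detector.
SECRET_PATTERNS = [
    # PEM-style private key blocks
    re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.S,
    ),
    # "password: value", "api_key=value", "token = value", and similar
    re.compile(r"(?i)\b(password|passwd|api[_-]?key|token)\s*[:=]\s*\S+"),
]

def redact(text):
    """Replace anything that looks like a credential with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

def prepare_for_embedding(item):
    """Return text that is safe to embed or index, or None to skip the item.

    Items flagged `secret` are never embedded; everything else is redacted
    first. Items with no flag are treated as `normal` in this sketch.
    """
    sensitivity = item.get("metadata", {}).get("sensitivity", "normal")
    if sensitivity == "secret":
        return None
    return redact(item["text"])
```

The redaction-invariant tests in item 8 could then assert that no string matching these patterns ever reaches the embedder or the FTS index.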
## Files to Watch
- `docs/db-schema.md`, `sql/pg/001_init.sql` (cloud path)
- `src/mcp/ToolSchemas.json` and MCP handlers (scope + sensitivity gates)
- `kom.local.v1.backup.*`, `kom.cloud.v1.*` (new tool surfaces)