# Design Decision: Local-First Personal Store with Optional Federated Services
**Decision**: Kompanion adopts a **two-tier architecture**. A personal, local store (Akonadi-like) is the *authoritative home* of a user's data and operates fully offline. An optional federated layer provides encrypted backups, multi-device sync, and paid cloud conveniences (e.g., hosted search/rerank). Users can run **purely local**, or selectively enable cloud features.

**Encryption Note**: We deliberately leave the *exact cryptography suite* open to allow hardware/OS keychains, libsodium, AES-GCM, or XChaCha20-Poly1305. The guardrails below assume **end-to-end encryption (E2EE)** with keys controlled by the user.

---
## 1) Personal Store (Local Core) — `kom.local.v1`
- Runs entirely on-device; no network required.
- DB: SQLite (+ FTS/trigram for "rgrep" feel) + FAISS for vectors.
- Embeddings/Reranker: local (Ollama + optional local reranker).
- Privacy defaults: secrets are never embedded; private-vault items are never embedded or full-text indexed; backups and exports are E2EE.
- Backup tools: `backup.export_encrypted`, `backup.import_encrypted` (E2EE blobs).
## 2) Federated Services (Optional) — `kom.cloud.v1`
- Adds encrypted sync, cloud backup, micropayment-backed hosted compute (e.g., heavy reranking), and optional hosted pgvector search.
- The server sees ciphertext plus minimal metadata. Hosted search is opt-in, and embeddings are stored server-side, whether encrypted or in plaintext, **only by explicit consent**.
- Per-namespace tenancy and isolation (RLS when using Postgres).
## 3) Key & Auth Model
- Users may choose to retain **only authentication/secret-store access**; Kompanion handles day-to-day operations.
- Device enrollment shares/wraps keys securely (mechanism TBD, e.g., QR code or device handoff).
- Key rotation and export are first-class; backups are always encrypted client-side.
## 4) Search Modes
- **Lexical**: FTS + trigram, scoped to namespace/thread/user; grep-like snippets.
- **Semantic**: vector ANN with local reranker by default.
- **Hybrid**: configurable orchestration; always respects scope and privacy flags.
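
The lexical mode above can be prototyped with SQLite's FTS5 extension. The following is a minimal sketch, assuming the Python build of SQLite includes FTS5; the table name `notes_fts`, its columns, and the helper functions are hypothetical and not part of any Kompanion schema. It uses the default tokenizer; on SQLite 3.34 or later, `tokenize='trigram'` could be added for substring matching.

```python
import sqlite3


def open_index() -> sqlite3.Connection:
    """Create an in-memory FTS5 index (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # namespace and thread are stored but not tokenized; they act as scope filters.
    conn.execute(
        "CREATE VIRTUAL TABLE notes_fts USING fts5("
        "body, namespace UNINDEXED, thread UNINDEXED)"
    )
    return conn


def add_note(conn: sqlite3.Connection, namespace: str, thread: str, body: str) -> None:
    conn.execute(
        "INSERT INTO notes_fts (body, namespace, thread) VALUES (?, ?, ?)",
        (body, namespace, thread),
    )


def lexical_search(conn: sqlite3.Connection, namespace: str, query: str, limit: int = 10):
    """Return (thread, snippet) pairs, always restricted to one namespace."""
    if not namespace:
        # Default-deny: never run an unscoped search.
        raise PermissionError("namespace scope is required")
    # Quote the user text as a single FTS5 phrase so operators are not interpreted.
    phrase = '"' + query.replace('"', '""') + '"'
    rows = conn.execute(
        "SELECT thread, snippet(notes_fts, 0, '[', ']', '...', 8) "
        "FROM notes_fts "
        "WHERE notes_fts MATCH ? AND namespace = ? "
        "ORDER BY rank LIMIT ?",
        (phrase, namespace, limit),
    )
    return rows.fetchall()
```

Passing the namespace as a required argument, rather than as an optional filter, keeps the scope check in one place; a real handler would presumably also scope by thread and user.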
## 5) Privacy Controls
- Sensitivity flags: `metadata.sensitivity = secret|private|normal`.
- `secret` items: E2EE only (no FTS, no embeddings).
- Server-side scope injection (namespace/user) in all handlers; default-deny posture.
- Purge policy: soft-delete + scheduled hard-delete; cascades to chunks/embeddings and remote copies.
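
One way to express the sensitivity rules above is a single gate that every indexing path consults. This is a sketch under stated assumptions: `IndexPlan` and `plan_indexing` are hypothetical names, and the treatment of `private` (indexed locally, never sent to a hosted index) is an assumption, since the rules above only pin down `secret`. Missing or unknown values fall back to `secret`, in keeping with the default-deny posture.

```python
from dataclasses import dataclass
from enum import Enum


class Sensitivity(Enum):
    SECRET = "secret"
    PRIVATE = "private"
    NORMAL = "normal"


@dataclass(frozen=True)
class IndexPlan:
    fts: bool          # may enter the local full-text index
    embed: bool        # may be embedded locally
    cloud_index: bool  # may be sent to a hosted index


def plan_indexing(metadata: dict, cloud_search_enabled: bool = False) -> IndexPlan:
    """Decide which indexes an item may participate in."""
    try:
        level = Sensitivity(metadata.get("sensitivity"))
    except ValueError:
        # Missing or unrecognized flag: treat as secret (default-deny).
        level = Sensitivity.SECRET

    if level is Sensitivity.SECRET:
        # E2EE only: no FTS, no embeddings, no hosted index.
        return IndexPlan(fts=False, embed=False, cloud_index=False)

    # Assumption: only `normal` items may leave the device for hosted search.
    allow_cloud = cloud_search_enabled and level is Sensitivity.NORMAL
    return IndexPlan(fts=True, embed=True, cloud_index=allow_cloud)
```

Keeping the decision in one pure function makes the redaction invariants easy to test: no caller can index an item without first asking for a plan.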
## 6) Compatibility with Postgres+pgvector
- When cloud search is enabled, a hosted Postgres+pgvector instance enforces isolation via RLS and per-namespace session GUCs.
- The local SQLite store remains the source of truth unless the user opts to delegate search to the cloud.
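
The RLS-plus-GUC isolation described above might look roughly like the sketch below. Everything named here is an assumption for illustration: the `items` table, a text `namespace_id` column, the `kom.namespace_id` and `kom.user_id` settings, and the policy name. The Postgres pieces it relies on (`CREATE POLICY`, `current_setting(name, missing_ok)`, `set_config(name, value, is_local)`) are standard. The Python helper assumes a DB-API cursor with `%s` placeholders, as psycopg uses, and must run inside the same transaction as the queries it protects because the settings are transaction-local. With the setting unset, the policy comparison matches no rows, which gives default-deny.

```python
# One-time DDL for the cloud path (hypothetical table and policy names).
RLS_SETUP_SQL = """
ALTER TABLE items ENABLE ROW LEVEL SECURITY;
ALTER TABLE items FORCE ROW LEVEL SECURITY;
CREATE POLICY items_namespace_isolation ON items
    USING (namespace_id = current_setting('kom.namespace_id', true));
"""

# Third argument `true` makes the setting local to the current transaction.
SET_SCOPE_SQL = "SELECT set_config(%s, %s, true)"


def apply_scope(cursor, namespace_id: str, user_id: str) -> None:
    """Inject the caller's scope before any handler query runs."""
    if not namespace_id or not user_id:
        # Refuse to run unscoped rather than fall back to a wider scope.
        raise PermissionError("namespace and user scope are required")
    cursor.execute(SET_SCOPE_SQL, ("kom.namespace_id", namespace_id))
    cursor.execute(SET_SCOPE_SQL, ("kom.user_id", user_id))
```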

---

## Action List (from privacy review)
1. **DB hardening (cloud path)**: add RLS policies; add FTS + pg_trgm; unique `(namespace_id, key)`; partial ANN indexes per model.
2. **Server enforcement**: inject namespace/user via session context (GUCs); deny scope widening by default; rate limits.
3. **Redaction pipeline**: protect secrets before embedding; skip embedding/FTS for `secret` items.
4. **Private vault mode**: key-only retrieval paths for sensitive items (no index participation).
5. **Backups**: define E2EE export/import format; provider adapters (e.g., Google Drive) use pre-encrypted blobs.
6. **Sync**: event-log format (append-only); conflict rules; device enrollment + key wrapping; later CRDT if needed.
7. **Purging**: scheduled hard-deletes; admin "nuke namespace/user" procedure.
8. **Tests**: cross-tenant leakage, redaction invariants, purge/TTL, hybrid-vs-lexical, hosted-vs-local parity.
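
As a starting point for item 7, the soft-delete plus scheduled hard-delete flow can be sketched against a local SQLite store. The schema below (`items`, `chunks`, `embeddings`) and the function names are hypothetical; the `embeddings` table stands in for whatever vector store is actually used, and remote copies are out of scope for this sketch. Cascading relies on SQLite foreign keys, which must be switched on per connection.

```python
import sqlite3
import time

SCHEMA = """
CREATE TABLE items (
    id INTEGER PRIMARY KEY,
    namespace TEXT NOT NULL,
    body TEXT NOT NULL,
    deleted_at REAL  -- NULL while live; set by soft_delete()
);
CREATE TABLE chunks (
    id INTEGER PRIMARY KEY,
    item_id INTEGER NOT NULL REFERENCES items(id) ON DELETE CASCADE,
    text TEXT NOT NULL
);
CREATE TABLE embeddings (
    chunk_id INTEGER PRIMARY KEY REFERENCES chunks(id) ON DELETE CASCADE,
    vector BLOB NOT NULL
);
"""


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # Foreign keys are off by default in SQLite; cascades need them on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def soft_delete(conn: sqlite3.Connection, item_id: int, now=None) -> None:
    """Mark an item as deleted without removing any data yet."""
    stamp = time.time() if now is None else now
    conn.execute(
        "UPDATE items SET deleted_at = ? WHERE id = ? AND deleted_at IS NULL",
        (stamp, item_id),
    )
    conn.commit()


def hard_delete_expired(conn: sqlite3.Connection, grace_seconds: float, now=None) -> int:
    """Remove items soft-deleted at least grace_seconds ago; cascades to chunks and embeddings."""
    cutoff = (time.time() if now is None else now) - grace_seconds
    cur = conn.execute(
        "DELETE FROM items WHERE deleted_at IS NOT NULL AND deleted_at <= ?",
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount  # counts items only, not cascaded rows
```

A scheduler would call `hard_delete_expired` periodically; a "nuke namespace" procedure could reuse the same cascade with a `WHERE namespace = ?` delete.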
## Files to Watch
- `docs/db-schema.md`, `sql/pg/001_init.sql` (cloud path)
- `src/mcp/ToolSchemas.json` and MCP handlers (scope + sensitivity gates)
- `kom.local.v1.backup.*`, `kom.cloud.v1.*` (new tool surfaces)