# Kompanion AI Client SDK for Qt/KDE — API Review & v2 Proposal
**Context**
Existing code under `alpaka/src/core` implements a minimal LLM client named **KLLM**:
* `KLLMInterface` (central object, Qt-Network based, Ollama URL field, model list, systemPrompt, `getCompletion()` / `getModelInfo()`).
* `KLLMRequest` (message, model, context).
* `KLLMReply` (streaming, finished, basic timing info, context carry-over).
**Goal**
Evolve this into a **first-class Kompanion SDK** that can power:
* agentic coding (tool/function calling, plan→execute),
* app integrations (Kontact, Konsole, KDevelop/Kate, Plasma applets, NeoChat),
* privacy and policy controls (per-source ACLs, consent),
* reliable async/streaming/cancellation,
* multi-backend (Ollama/OpenAI/local engines) with uniform semantics,
* QML-friendly usage.
---
## Part A — Review (What works / What's missing)
### Strengths
* Idiomatic Qt API (QObject, signals/slots).
* Central interface (`KLLMInterface`) mirrors the feel of `QNetworkAccessManager`/QtNetwork.
* Streaming via `KLLMReply::contentAdded()` and completion via `finished()`.
* Simple model enumeration + `systemPrompt`.
### Gaps to close
1. **Message structure**: Only a single `message` string; no roles (system/user/assistant/tool), and no multi-turn thread assembly beyond the custom `KLLMContext`.
2. **Tool calling / function calling**: No schema for tool specs, invocation events, results injection, or “plan” steps.
3. **Backend abstraction**: “Ollama URL” is a property of the core interface. Needs pluggable providers with capability discovery.
4. **Error model**: Only `errorOccurred(QString)` and `hasError`. Missing typed errors, retry/cancel semantics, timeouts, throttling.
5. **Observability**: Some timing info, but no per-token hooks, token usage counters, logs, traces.
6. **Threading & cancellation**: No unified cancel token; no `QFuture`/`QCoro` or `QPromise` integration.
7. **QML friendliness**: Usable, but message/tool specs should be modelled as Q_GADGET/Q_OBJECT types and `Q_PROPERTY`-exposed to QML.
8. **Privacy & policy**: No ACLs, no data origin policy, no redaction hooks.
9. **Embeddings / RAG**: No first-class embedding calls, no JSON-mode or structured outputs with validators.
10. **Agent loop affordances**: No “plan→confirm→apply patch / run tests” pattern built-in; no diff/patch helpers.
---
## Part B — v2 API Proposal (“KompanionAI”)
Rename the public surface to **KompanionAI** (KI = "Künstliche Intelligenz", i.e. "artificial intelligence", which fits German-speaking users nicely); keep binary-compatibility fences internally if needed.
### Namespaces & modules
* `namespace KompanionAI { … }`
* Core modules:
* `Client` (front door)
* `Provider` (backend plugins: Ollama, OpenAI, Local)
* `Message` / `Thread` (roles + history)
* `Tool` (function calling schema)
* `Completion` (text/chat)
* `Embedding` (vectorize)
* `Policy` (privacy/ACL)
* `Events` (streaming tokens, tool calls, traces)
All classes are Qt types with signals/slots and are exposed as QML types.
---
### 1) Message & Thread Model
```cpp
// Roles & content parts, QML-friendly
class KIMessagePart {
    Q_GADGET
    Q_PROPERTY(QString mime MEMBER mime)
    Q_PROPERTY(QString text MEMBER text) // for text/plain
    // future: binary, image refs, etc.
public:
    QString mime; // "text/plain", "application/json"
    QString text;
};

class KIMessage {
    Q_GADGET
    Q_PROPERTY(QString role MEMBER role) // "system" | "user" | "assistant" | "tool"
    Q_PROPERTY(QList<KIMessagePart> parts MEMBER parts)
public:
    QString role;
    QList<KIMessagePart> parts;
    QVariantMap metadata; // arbitrary
};

class KIThread {
    Q_GADGET
    Q_PROPERTY(QList<KIMessage> messages MEMBER messages)
public:
    QList<KIMessage> messages;
};
```
**Why**: Enables multi-turn chat with explicit roles and mixed content (text/JSON). Tool outputs show up as `role="tool"`.
---
### 2) Tool / Function Calling
```cpp
class KIToolParam {
    Q_GADGET
    Q_PROPERTY(QString name MEMBER name)
    Q_PROPERTY(QString type MEMBER type) // "string","number","boolean","object"... (JSON Schema-lite)
    Q_PROPERTY(bool required MEMBER required)
    Q_PROPERTY(QVariant defaultValue MEMBER defaultValue)
public:
    QString name, type;
    bool required = false;
    QVariant defaultValue;
};

class KIToolSpec {
    Q_GADGET
    Q_PROPERTY(QString name MEMBER name)
    Q_PROPERTY(QString description MEMBER description)
    Q_PROPERTY(QList<KIToolParam> params MEMBER params)
public:
    QString name, description;
    QList<KIToolParam> params; // JSON-serializable schema
};

class KIToolCall {
    Q_GADGET
    Q_PROPERTY(QString name MEMBER name)
    Q_PROPERTY(QVariantMap arguments MEMBER arguments)
public:
    QString name;
    QVariantMap arguments;
};

class KIToolResult {
    Q_GADGET
    Q_PROPERTY(QString name MEMBER name)
    Q_PROPERTY(QVariant result MEMBER result) // result payload (JSON-like)
public:
    QString name;
    QVariant result;
};
```
**Flow**: Model emits a **tool call** event → client executes tool → emits **tool result** → model continues. All observable via signals.
---
### 3) Provider abstraction (multi-backend)
```cpp
class KIProvider : public QObject {
    Q_OBJECT
    Q_PROPERTY(QString name READ name CONSTANT)
    Q_PROPERTY(QStringList models READ models NOTIFY modelsChanged)
    Q_PROPERTY(KICapabilities caps READ caps CONSTANT)
public:
    virtual ~KIProvider() = default;
    virtual QFuture<KIReply*> chat(const KIThread& thread, const KIChatOptions& opts) = 0;
    virtual QFuture<KIEmbeddingResult> embed(const QStringList& texts, const KIEmbedOptions& opts) = 0;
    // ...
};

class KIClient : public QObject {
    Q_OBJECT
    Q_PROPERTY(KIProvider* provider READ provider WRITE setProvider NOTIFY providerChanged)
    Q_PROPERTY(QString defaultModel READ defaultModel WRITE setDefaultModel NOTIFY defaultModelChanged)
public:
    Q_INVOKABLE QFuture<KIReply*> chat(const KIThread&, const KIChatOptions&);
    Q_INVOKABLE QFuture<KIEmbeddingResult> embed(const QStringList&, const KIEmbedOptions&);
    Q_INVOKABLE void cancel(quint64 requestId);
    // ...
};
```
* **OllamaProvider**, **OpenAIProvider**, **LocalProvider** implement `KIProvider`.
* `KICapabilities` advertises support for: JSON-mode, function calling, system prompts, logprobs, images, etc.
* **Do not** bake “Ollama URL” into `Client`. It belongs to the provider.
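A minimal provider skeleton might look like the sketch below (assuming the `KIProvider` interface above; the constructor shape and members are illustrative, not an existing implementation):
```cpp
// Sketch only: the backend URL and HTTP plumbing live in the provider, not in KIClient.
class OllamaProvider : public KIProvider {
    Q_OBJECT
public:
    explicit OllamaProvider(QUrl baseUrl)
        : m_baseUrl(std::move(baseUrl)) {}

    QFuture<KIReply*> chat(const KIThread& thread, const KIChatOptions& opts) override;
    QFuture<KIEmbeddingResult> embed(const QStringList& texts, const KIEmbedOptions& opts) override;

private:
    QUrl m_baseUrl;              // e.g. http://localhost:11434, a provider detail only
    QNetworkAccessManager m_nam; // reuses the QtNetwork plumbing already present in KLLM
};
```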
---
### 4) Completion / Reply / Streaming Events
```cpp
class KIReply : public QObject {
    Q_OBJECT
    Q_PROPERTY(bool finished READ isFinished NOTIFY finishedChanged)
    Q_PROPERTY(int promptTokens READ promptTokens CONSTANT)
    Q_PROPERTY(int completionTokens READ completionTokens CONSTANT)
    Q_PROPERTY(QString model READ model CONSTANT)
public:
    // accumulated assistant text
    Q_INVOKABLE QString text() const;

Q_SIGNALS:
    void tokensAdded(const QString& delta);           // streaming text
    void toolCallProposed(const KIToolCall& call);    // model proposes a tool call
    void toolResultRequested(const KIToolCall& call); // alt: unified request
    void traceEvent(const QVariantMap& span);         // observability
    void finished();                                  // reply done
    void errorOccurred(const KIError& error);
};
```
**Why**: makes **tool invocation** first-class and observable. You can wire it to ACF/MCP tools or project introspection.
---
### 5) Options / Policies / Privacy
```cpp
// KIPolicy comes first so KIChatOptions can hold it by value.
class KIPolicy {
    Q_GADGET
    Q_PROPERTY(QString visibility MEMBER visibility) // "private|org|public"
    Q_PROPERTY(bool allowNetwork MEMBER allowNetwork)
    Q_PROPERTY(QStringList redactions MEMBER redactions) // regex keys to redact
    // future: per-source ACLs
public:
    QString visibility = "private";
    bool allowNetwork = false;
    QStringList redactions;
};

class KIChatOptions {
    Q_GADGET
    Q_PROPERTY(QString model MEMBER model)
    Q_PROPERTY(bool stream MEMBER stream)
    Q_PROPERTY(bool jsonMode MEMBER jsonMode)
    Q_PROPERTY(int maxTokens MEMBER maxTokens)
    Q_PROPERTY(double temperature MEMBER temperature)
    Q_PROPERTY(QList<KIToolSpec> tools MEMBER tools) // permitted tool set for this call
    Q_PROPERTY(KIPolicy policy MEMBER policy)
    // ...
public:
    QString model;
    bool stream = true;
    bool jsonMode = false;
    int maxTokens = 512;
    double temperature = 0.2;
    QList<KIToolSpec> tools;
    KIPolicy policy;
};
```
**Why**: explicit control of what the agent may do; dovetails with your HDoD memory ACLs.
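One way the `redactions` list could be honoured before a request leaves the process (a sketch; the helper name and call site are assumptions, not part of the proposed API):
```cpp
// Hypothetical helper: scrub outgoing text against the policy's redaction patterns.
QString applyRedactions(QString text, const KIPolicy& policy)
{
    for (const QString& pattern : policy.redactions) {
        const QRegularExpression re(pattern);
        text.replace(re, QStringLiteral("[REDACTED]"));
    }
    return text;
}
```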
---
### 6) Embeddings (for RAG / memory)
```cpp
class KIEmbedOptions {
    Q_GADGET
    Q_PROPERTY(QString model MEMBER model)
    Q_PROPERTY(QString normalize MEMBER normalize) // "l2"|"none"
public:
    QString model = "text-embed-local";
    QString normalize = "l2";
};

class KIEmbeddingResult {
    Q_GADGET
public:
    QVector<QVector<float>> vectors;
    QString model;
};
```
**Why**: unify vector generation; Kompanion memory can plug this directly.
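On the consumer side, a RAG lookup only needs the raw vectors; a typical similarity check over `KIEmbeddingResult::vectors` could look like this (a sketch, not part of the SDK surface; needs `<cmath>` for `std::sqrt`):
```cpp
// Cosine similarity between two embedding vectors (e.g. query vs. stored memory item).
float cosineSimilarity(const QVector<float>& a, const QVector<float>& b)
{
    Q_ASSERT(a.size() == b.size());
    float dot = 0.0f, normA = 0.0f, normB = 0.0f;
    for (int i = 0; i < a.size(); ++i) {
        dot   += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    if (normA == 0.0f || normB == 0.0f)
        return 0.0f;
    return dot / (std::sqrt(normA) * std::sqrt(normB));
}
```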
---
### 7) Agent Loop Conveniences (optional helpers)
Provide “batteries included” patterns **outside** the Provider:
```cpp
class KIAgent : public QObject {
    Q_OBJECT
public:
    // Plan → (approve) → Execute, with tool calling enabled
    Q_SIGNAL void planReady(const QString& plan);
    Q_SIGNAL void patchReady(const QString& unifiedDiff);
    Q_SIGNAL void needToolResult(const KIToolCall& call);
    Q_SIGNAL void log(const QString& msg);

    void runTask(const QString& naturalInstruction,
                 const KIThread& prior,
                 const QList<KIToolSpec>& tools,
                 const KIChatOptions& opts);
};
```
This helper emits a plan first, then diff/patch proposals, and integrates with your **ACF** and **KTextEditor/KDevelop** diff panes; a usage sketch follows below.
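Usage could look roughly like this (a sketch; `priorThread`, `csvSpec` and `opts` are assumed to exist, in the spirit of the Part D examples):
```cpp
// Wire the agent's plan/patch signals to an approval UI and a diff pane.
KIAgent agent;
QObject::connect(&agent, &KIAgent::planReady,
                 [](const QString& plan){ /* show the plan, wait for user approval */ });
QObject::connect(&agent, &KIAgent::patchReady,
                 [](const QString& unifiedDiff){ /* hand the diff to a KTextEditor/KDevelop pane */ });
QObject::connect(&agent, &KIAgent::needToolResult,
                 [](const KIToolCall& call){ /* run the ACF/MCP tool and feed the result back */ });

agent.runTask("Add a CSV import action to the File menu",
              priorThread, { csvSpec }, opts);
```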
---
### 8) Error Model & Cancellation
* Introduce `KIError{ code: enum, httpStatus, message, retryAfter }`.
* `KIClient::cancel(requestId)` cancels in-flight work.
* Timeouts & retry policy configurable in `KIChatOptions`.
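A possible shape for `KIError` (a sketch; the field names follow the bullet above, the enum values are illustrative):
```cpp
class KIError {
    Q_GADGET
public:
    enum class Code { None, Network, Timeout, RateLimited, InvalidRequest, ProviderError, Cancelled };
    Q_ENUM(Code)

    Code code = Code::None;
    int httpStatus = 0;   // 0 when the transport was not HTTP
    QString message;
    int retryAfter = -1;  // milliseconds; negative means "no retry hint"
};
```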
---
### 9) QML exposure
Register these with `qmlRegisterType<…>("Kompanion.AI", 1, 0, "KIClient")` etc.
Expose `KIMessage`, `KIThread`, `KIToolSpec`, `KIAgent` to QML, so Plasma applets / Kirigami UIs can wire flows fast.
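Registration could be a single C++ call per creatable type (a sketch; the helper function name is illustrative, and the `Q_GADGET` value types travel through properties and `Q_INVOKABLE` arguments rather than being registered as creatable QML types):
```cpp
#include <QtQml/qqml.h>

// Hypothetical registration helper called once at startup.
void registerKompanionAiTypes()
{
    qmlRegisterType<KIClient>("Kompanion.AI", 1, 0, "KIClient");
    qmlRegisterType<KIAgent>("Kompanion.AI", 1, 0, "KIAgent");
    // KIMessage, KIThread, KIToolSpec are Q_GADGET value types: QML sees them as
    // property values and method arguments, so they need no creatable registration.
}
```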
---
## Part C — Migration from KLLM to KompanionAI
* **Mapping** (see the sketch after this list)
* `KLLMInterface::getCompletion()` → `KIClient::chat(thread, opts)`.
* `KLLMRequest{message, model, context}` → `KIThread{ messages=[system?, user?], … }`, `KIChatOptions{model}`.
* `KLLMReply` → `KIReply` (adds tool call signals, token deltas, errors).
* `systemPrompt` → first system `KIMessage`.
* `models()` → `KIProvider::models()`.
* **Providers**
* Implement **OllamaProvider** first (parity with current).
* Add **OpenAIProvider** (JSON-mode/function calling), **LocalProvider** (llama.cpp/candle/etc.).
* **Binary/Source compatibility**
* You can keep thin wrappers named `KLLM*` forwarding into `KompanionAI` during transition.
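A small sketch of the mapping in code (assuming `systemPrompt`, `userMessage` and `modelName` hold the values that previously went into `KLLMInterface`/`KLLMRequest`):
```cpp
// Old inputs {systemPrompt, message, model} expressed with the new types.
KIThread thread;
thread.messages << KIMessage{ .role = "system", .parts = { { "text/plain", systemPrompt } } }
                << KIMessage{ .role = "user",   .parts = { { "text/plain", userMessage  } } };

KIChatOptions opts;
opts.model = modelName;                  // was KLLMRequest::model

auto future = client.chat(thread, opts); // was KLLMInterface::getCompletion()
```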
---
## Part D — Minimal Examples
### 1) Simple chat with streaming and tool calling
```cpp
KIClient client;
client.setProvider(new OllamaProvider(QUrl("http://localhost:11434")));
client.setDefaultModel("llama3.1:8b-instruct");

KIThread t;
t.messages << KIMessage{ .role = "system", .parts = { { "text/plain", "You are Kompanion inside KDE." } } }
           << KIMessage{ .role = "user",   .parts = { { "text/plain", "Generate a CSV → exam report plan." } } };

KIToolSpec csvSpec;
csvSpec.name = "parse_csv_schema";
csvSpec.description = "Inspect a CSV sample path and return column info.";
csvSpec.params = { { "path", "string", true, {} } };

KIChatOptions opts;
opts.tools = { csvSpec };
opts.stream = true;

auto *reply = client.chat(t, opts).result(); // or connect via QFutureWatcher
QObject::connect(reply, &KIReply::tokensAdded, [](const QString& d){ qDebug() << d; });
QObject::connect(reply, &KIReply::toolCallProposed, [&](const KIToolCall& call){
    if (call.name == "parse_csv_schema") {
        QVariantMap out;
        out["columns"] = QStringList{ "Name", "Grade", "Subject" };
        // feed result back (provider-specific or via KIClient API)
        client.returnToolResult(*reply, KIToolResult{ call.name, out });
    }
});
```
### 2) Embeddings for memory
```cpp
auto emb = client.embed({ "RAII pattern", "SFINAE", "Type erasure" }, KIEmbedOptions{}).result();
qDebug() << emb.vectors.size(); // 3
```
---
## Part E — Optional Extensions (for Gemini to consider)
* **Structured outputs**: JSON schema validation for function outputs (reject invalid JSON, request fix); see the sketch after this list.
* **Safety hooks**: pre-execution validators for tool calls (e.g. forbid dangerous shell).
* **Observability**: OpenTelemetry spans over request lifecycle and tool calls.
* **Rate limiting**: token budgeters per provider.
* **Offline mode**: `allowNetwork=false` forces model to abstain from external lookups.
* **Crash handler integration**: a helper that consumes backtraces and emits a `KIThread` pre-filled with stack/context (pairs naturally with an ACF tool to fetch symbols).
* **CSV app generator**: a thin template tool that scaffolds a Kirigami app, fed by CSV schema tool—end-to-end demo of agentic coding.
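For the structured-outputs item, a first pass can be as simple as parse-and-retry (a sketch; full JSON Schema validation would need an extra library; `reply`, `thread` and `opts` are assumed from the Part D example):
```cpp
// Reject non-JSON output and ask the model to correct itself.
QJsonParseError parseError;
const QJsonDocument doc = QJsonDocument::fromJson(reply->text().toUtf8(), &parseError);
if (parseError.error != QJsonParseError::NoError || !doc.isObject()) {
    KIMessage fixRequest;
    fixRequest.role = "user";
    fixRequest.parts = { { "text/plain",
                           "The previous output was not valid JSON. Reply with a single JSON object only." } };
    thread.messages << fixRequest;
    // re-issue client.chat(thread, opts) with opts.jsonMode = true
}
```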
---
## TL;DR
* Keep Qt idioms; elevate to **KompanionAI** with roles, tools, providers, and policies.
* Make **tool calling first-class** with observable events.
* Decouple backend specifics via `KIProvider`.
* Add embeddings & JSON-mode for RAG + structured tasks.
* Provide **agent loop helpers** (plan→diff→apply) outside the provider.
* Expose everything to QML for KDE-native UIs.
This gives you a future-proof client SDK that plugs directly into Kontact/Konsole/KDevelop/Plasma/NeoChat and supports your ACF/MCP agent flows without locking into any single vendor.