# Kompanion AI Client SDK for Qt/KDE — API Review & v2 Proposal

**Context**
Existing code under `alpaka/src/core` implements a minimal LLM client named **KLLM**:

* `KLLMInterface` (central object, Qt-Network based, Ollama URL field, model list, `systemPrompt`, `getCompletion()` / `getModelInfo()`).
* `KLLMRequest` (message, model, context).
* `KLLMReply` (streaming, finished, basic timing info, context carry-over).

**Goal**
Evolve this into a **first-class Kompanion SDK** that can power:

* agentic coding (tool/function calling, plan→execute),
* app integrations (Kontact, Konsole, KDevelop/Kate, Plasma applets, NeoChat),
* privacy and policy controls (per-source ACLs, consent),
* reliable async/streaming/cancellation,
* multi-backend (Ollama/OpenAI/local engines) with uniform semantics,
* QML-friendly usage.

---

## Part A — Review (What works / What’s missing)

### Strengths

* Idiomatic Qt API (QObject, signals/slots).
* Central interface (`KLLMInterface`) mirrors the feel of `QNetworkAccessManager` in QtNetwork.
* Streaming via `KLLMReply::contentAdded()` and completion via `finished()`.
* Simple model enumeration + `systemPrompt`.

### Gaps to close

1. **Message structure**: Only a single `message` string; no roles (system/user/assistant/tool), no multi-turn thread assembly besides a custom `KLLMContext`.
2. **Tool calling / function calling**: No schema for tool specs, invocation events, results injection, or “plan” steps.
3. **Backend abstraction**: “Ollama URL” is a property of the core interface. Needs pluggable providers with capability discovery.
4. **Error model**: Only `errorOccurred(QString)` and `hasError`. Missing typed errors, retry/cancel semantics, timeouts, throttling.
5. **Observability**: Some timing info, but no per-token hooks, token usage counters, logs, traces.
6. **Threading & cancellation**: No unified cancel token; no `QFuture`/`QCoro` or `QPromise` integration.
7. **QML friendliness**: Usable, but message/tool specs should be modelled as Q_GADGET/Q_OBJECT types and `Q_PROPERTY`-exposed to QML.
8. **Privacy & policy**: No ACLs, no data origin policy, no redaction hooks.
9. **Embeddings / RAG**: No first-class embedding calls, no JSON-mode or structured outputs with validators.
10. **Agent loop affordances**: No “plan→confirm→apply patch / run tests” pattern built-in; no diff/patch helpers.

---

## Part B — v2 API Proposal (“KompanionAI”)

Rename the public surface to **KompanionAI** with the class prefix `KI` (also the German abbreviation for “Künstliche Intelligenz”, artificial intelligence); keep binary-compatibility shims internally if needed.

### Namespaces & modules

* `namespace KompanionAI { … }`
* Core modules:

  * `Client` (front door)
  * `Provider` (backend plugins: Ollama, OpenAI, Local)
  * `Message` / `Thread` (roles + history)
  * `Tool` (function calling schema)
  * `Completion` (text/chat)
  * `Embedding` (vectorize)
  * `Policy` (privacy/ACL)
  * `Events` (streaming tokens, tool calls, traces)

All classes are Qt types: QObject-based classes expose signals/slots, value types are Q_GADGETs, and both are meant to be usable from QML.

---

### 1) Message & Thread Model

```cpp
#include <QList>
#include <QObject>
#include <QString>
#include <QVariantMap>

// Roles & content parts, QML-friendly.
// Properties use MEMBER because the fields are public and no getters are declared.
class KIMessagePart {
    Q_GADGET
    Q_PROPERTY(QString mime MEMBER mime)
    Q_PROPERTY(QString text MEMBER text) // for text/plain
    // future: binary, image refs, etc.
public:
    QString mime; // "text/plain", "application/json"
    QString text;
};

class KIMessage {
    Q_GADGET
    Q_PROPERTY(QString role MEMBER role) // "system" | "user" | "assistant" | "tool"
    Q_PROPERTY(QList<KIMessagePart> parts MEMBER parts)
public:
    QString role;
    QList<KIMessagePart> parts;
    QVariantMap metadata; // arbitrary
};

class KIThread {
    Q_GADGET
    Q_PROPERTY(QList<KIMessage> messages MEMBER messages)
public:
    QList<KIMessage> messages;
};
```

**Why**: Enables multi-turn chat with explicit roles and mixed content (text/JSON). Tool outputs show up as `role="tool"`.
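
To make the thread model concrete, here is a minimal sketch of how a provider might flatten a thread into the role/content message array that chat-style HTTP backends generally expect. It is an illustration only: `MessagePart`, `Message`, `Thread`, `jsonEscape` and `toWireJson` are hypothetical std-only stand-ins (the proposal's real types use `QString`/`QList`, and a real provider would likely use `QJsonDocument`), chosen so the snippet compiles without Qt.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Std-only stand-ins for KIMessagePart / KIMessage / KIThread (assumption:
// only the fields needed for serialization are mirrored here).
struct MessagePart { std::string mime; std::string text; };
struct Message { std::string role; std::vector<MessagePart> parts; };
struct Thread { std::vector<Message> messages; };

// Escape the most common characters that must not appear raw in a JSON string.
// Other control characters are not handled in this sketch.
std::string jsonEscape(const std::string& in) {
    std::string out;
    for (char c : in) {
        switch (c) {
        case '"':  out += "\\\""; break;
        case '\\': out += "\\\\"; break;
        case '\n': out += "\\n"; break;
        case '\r': out += "\\r"; break;
        case '\t': out += "\\t"; break;
        default:   out += c;
        }
    }
    return out;
}

// Flatten a thread into a JSON array of {role, content} objects.
// Only text/plain parts are concatenated; other MIME types are skipped.
std::string toWireJson(const Thread& thread) {
    std::string out = "[";
    bool first = true;
    for (const Message& m : thread.messages) {
        std::string content;
        for (const MessagePart& p : m.parts)
            if (p.mime == "text/plain") content += p.text;
        if (!first) out += ",";
        first = false;
        out += "{\"role\":\"" + jsonEscape(m.role) +
               "\",\"content\":\"" + jsonEscape(content) + "\"}";
    }
    return out + "]";
}
```

Keeping the serialization in the provider rather than in `KIMessage` itself means each backend can decide how to handle non-text parts without changing the value types.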

---

### 2) Tool / Function Calling

```cpp
// Value types for tool specs, calls and results.
// Properties use MEMBER because the fields are public and no getters are declared.
class KIToolParam {
    Q_GADGET
    Q_PROPERTY(QString name MEMBER name)
    Q_PROPERTY(QString type MEMBER type) // "string","number","boolean","object"... (JSON Schema-lite)
    Q_PROPERTY(bool required MEMBER required)
    Q_PROPERTY(QVariant defaultValue MEMBER defaultValue)
public:
    QString name, type;
    bool required = false;
    QVariant defaultValue;
};

class KIToolSpec {
    Q_GADGET
    Q_PROPERTY(QString name MEMBER name)
    Q_PROPERTY(QString description MEMBER description)
    Q_PROPERTY(QList<KIToolParam> params MEMBER params)
public:
    QString name, description;
    QList<KIToolParam> params; // JSON-serializable schema
};

class KIToolCall {
    Q_GADGET
    Q_PROPERTY(QString name MEMBER name)
    Q_PROPERTY(QVariantMap arguments MEMBER arguments)
public:
    QString name;
    QVariantMap arguments;
};

class KIToolResult {
    Q_GADGET
    Q_PROPERTY(QString name MEMBER name)
    Q_PROPERTY(QVariant result MEMBER result) // result payload (JSON-like)
public:
    QString name;
    QVariant result;
};
```

**Flow**: Model emits a **tool call** event → client executes the tool → client returns a **tool result** → model continues. All observable via signals.
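
The client-side half of that flow can be sketched as a small registry that validates a call against its spec before running the handler. This is a hypothetical illustration, not part of the proposed API: `ToolRegistry`, `Args` and the std-only structs are stand-ins for `KIToolSpec`/`KIToolCall`/`KIToolResult`, and arguments are reduced to strings instead of `QVariant`.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Args = std::map<std::string, std::string>; // stand-in for QVariantMap

struct ToolParam { std::string name; bool required = false; };
struct ToolSpec { std::string name; std::vector<ToolParam> params; };
struct ToolCall { std::string name; Args arguments; };
struct ToolResult { std::string name; std::string result; bool ok = true; };

class ToolRegistry {
public:
    using Handler = std::function<std::string(const Args&)>;

    // Register a tool spec together with the function that implements it.
    void add(ToolSpec spec, Handler handler) {
        std::string key = spec.name;
        tools_[key] = Entry{std::move(spec), std::move(handler)};
    }

    // Reject unknown tools and calls missing required arguments,
    // otherwise run the handler and wrap its output.
    ToolResult dispatch(const ToolCall& call) const {
        auto it = tools_.find(call.name);
        if (it == tools_.end())
            return {call.name, "unknown tool", false};
        for (const ToolParam& p : it->second.spec.params)
            if (p.required && call.arguments.count(p.name) == 0)
                return {call.name, "missing required argument: " + p.name, false};
        return {call.name, it->second.handler(call.arguments), true};
    }

private:
    struct Entry { ToolSpec spec; Handler handler; };
    std::map<std::string, Entry> tools_;
};
```

Returning a failed `ToolResult` instead of throwing lets the error be fed back to the model as a `role="tool"` message, so it can correct its own call.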

---

### 3) Provider abstraction (multi-backend)

```cpp
// KICapabilities, KIReply, KIThread, KIChatOptions, KIEmbedOptions and
// KIEmbeddingResult are declared in the other sections of this proposal.
class KIProvider : public QObject {
    Q_OBJECT
    Q_PROPERTY(QString name READ name CONSTANT)
    Q_PROPERTY(QStringList models READ models NOTIFY modelsChanged)
    Q_PROPERTY(KICapabilities caps READ caps CONSTANT)
public:
    using QObject::QObject;
    // Accessors backing the properties above.
    virtual QString name() const = 0;
    virtual QStringList models() const = 0;
    virtual KICapabilities caps() const = 0;

    virtual QFuture<KIReply*> chat(const KIThread& thread, const KIChatOptions& opts) = 0;
    virtual QFuture<KIEmbeddingResult> embed(const QStringList& texts, const KIEmbedOptions& opts) = 0;
    // ...
Q_SIGNALS:
    void modelsChanged();
};

class KIClient : public QObject {
    Q_OBJECT
    Q_PROPERTY(KIProvider* provider READ provider WRITE setProvider NOTIFY providerChanged)
    Q_PROPERTY(QString defaultModel READ defaultModel WRITE setDefaultModel NOTIFY defaultModelChanged)
public:
    KIProvider* provider() const;
    void setProvider(KIProvider* provider);
    QString defaultModel() const;
    void setDefaultModel(const QString& model);

    // Note: QFuture return values are convenient from C++; QML callers will
    // probably need a QObject-based handle instead.
    Q_INVOKABLE QFuture<KIReply*> chat(const KIThread&, const KIChatOptions&);
    Q_INVOKABLE QFuture<KIEmbeddingResult> embed(const QStringList&, const KIEmbedOptions&);
    Q_INVOKABLE void cancel(quint64 requestId);
    // ...
Q_SIGNALS:
    void providerChanged();
    void defaultModelChanged();
};
```

* **OllamaProvider**, **OpenAIProvider**, **LocalProvider** implement `KIProvider`.
* `KICapabilities` advertises support for: JSON-mode, function calling, system prompts, logprobs, images, etc.
* **Do not** bake “Ollama URL” into `Client`. It belongs to the provider.
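
One way capability discovery could be used is to downgrade a request to what the active provider supports before sending it. The sketch below is an assumption about how `KICapabilities` might look (a bit set) and uses hypothetical std-only stand-ins (`Capability`, `Capabilities`, `ChatOptions`, `negotiate`); the proposal itself does not fix the representation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical capability flags a provider could advertise.
enum Capability : std::uint32_t {
    CapJsonMode        = 1u << 0,
    CapFunctionCalling = 1u << 1,
    CapSystemPrompt    = 1u << 2,
    CapImages          = 1u << 3,
};

struct Capabilities {
    std::uint32_t bits = 0;
    bool has(Capability c) const { return (bits & c) != 0; }
};

// Reduced stand-in for KIChatOptions: only the fields that depend on capabilities.
struct ChatOptions {
    bool jsonMode = false;
    std::vector<std::string> tools; // tool names only, for brevity
};

// Strip features the provider cannot honor and report what was dropped,
// so the caller can log it or surface it to the user.
std::vector<std::string> negotiate(const Capabilities& caps, ChatOptions& opts) {
    std::vector<std::string> dropped;
    if (opts.jsonMode && !caps.has(CapJsonMode)) {
        opts.jsonMode = false;
        dropped.push_back("jsonMode");
    }
    if (!opts.tools.empty() && !caps.has(CapFunctionCalling)) {
        opts.tools.clear();
        dropped.push_back("tools");
    }
    return dropped;
}
```

Whether silently downgrading is acceptable or should instead raise an error is a policy decision; returning the list of dropped features keeps both options open.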

---

### 4) Completion / Reply / Streaming Events

```cpp
class KIReply : public QObject {
    Q_OBJECT
    Q_PROPERTY(bool finished READ isFinished NOTIFY finishedChanged)
    // Token counts are only known once the reply completes, so they notify
    // instead of being CONSTANT.
    Q_PROPERTY(int promptTokens READ promptTokens NOTIFY finishedChanged)
    Q_PROPERTY(int completionTokens READ completionTokens NOTIFY finishedChanged)
    Q_PROPERTY(QString model READ model CONSTANT)
public:
    bool isFinished() const;
    int promptTokens() const;
    int completionTokens() const;
    QString model() const;

    // accumulated assistant text
    Q_INVOKABLE QString text() const;

Q_SIGNALS:
    void tokensAdded(const QString& delta);           // streaming text
    void toolCallProposed(const KIToolCall& call);    // model proposes a tool call
    void toolResultRequested(const KIToolCall& call); // alt: unified request
    void traceEvent(const QVariantMap& span);         // observability
    void finished();                                  // reply done
    void finishedChanged();                           // NOTIFY signal for the properties above
    void errorOccurred(const KIError& error);
};
```

**Why**: makes **tool invocation** first-class and observable. You can wire it to ACF/MCP tools or project introspection.
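
The streaming contract implied by `tokensAdded()`, `text()` and `finished()` can be illustrated without Qt. In this sketch, `Reply` is a hypothetical stand-in for `KIReply` in which `std::function` callbacks play the role of signals; `appendDelta()` and `finish()` are assumed names for what a provider would call internally.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

class Reply {
public:
    using DeltaHandler = std::function<void(const std::string&)>;
    using DoneHandler = std::function<void()>;

    // Subscribe to streamed deltas / completion (stand-ins for signal connections).
    void onTokensAdded(DeltaHandler h) { deltaHandlers_.push_back(std::move(h)); }
    void onFinished(DoneHandler h) { doneHandlers_.push_back(std::move(h)); }

    // Called by the provider for each streamed chunk: accumulate, then notify.
    void appendDelta(const std::string& delta) {
        assert(!finished_ && "no deltas may arrive after finish()");
        text_ += delta;
        for (const auto& h : deltaHandlers_) h(delta);
    }

    // Called once by the provider; repeated calls are ignored.
    void finish() {
        if (finished_) return;
        finished_ = true;
        for (const auto& h : doneHandlers_) h();
    }

    const std::string& text() const { return text_; }
    bool isFinished() const { return finished_; }

private:
    std::string text_;
    bool finished_ = false;
    std::vector<DeltaHandler> deltaHandlers_;
    std::vector<DoneHandler> doneHandlers_;
};
```

Accumulating the text before notifying means a handler that reads `text()` inside its callback always sees the delta it was just told about.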

---

### 5) Options / Policies / Privacy

```cpp
// KIPolicy is declared first because KIChatOptions holds one by value.
class KIPolicy {
    Q_GADGET
    Q_PROPERTY(QString visibility MEMBER visibility) // "private|org|public"
    Q_PROPERTY(bool allowNetwork MEMBER allowNetwork)
    Q_PROPERTY(QStringList redactions MEMBER redactions) // regex keys to redact
    // future: per-source ACLs
public:
    QString visibility = "private";
    bool allowNetwork = false;
    QStringList redactions;
};

class KIChatOptions {
    Q_GADGET
    Q_PROPERTY(QString model MEMBER model)
    Q_PROPERTY(bool stream MEMBER stream)
    Q_PROPERTY(bool jsonMode MEMBER jsonMode)
    Q_PROPERTY(int maxTokens MEMBER maxTokens)
    Q_PROPERTY(double temperature MEMBER temperature)
    Q_PROPERTY(QList<KIToolSpec> tools MEMBER tools) // permitted tool set for this call
    Q_PROPERTY(KIPolicy policy MEMBER policy)
    // ...
public:
    QString model;
    bool stream = true;
    bool jsonMode = false;
    int maxTokens = 512;
    double temperature = 0.2;
    QList<KIToolSpec> tools;
    KIPolicy policy;
};
```

**Why**: explicit control of what the agent may do; dovetails with your HDoD memory ACLs.
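
As an illustration of how such a policy could be enforced before a request leaves the process, the sketch below applies the `redactions` patterns to outgoing text and gates remote providers on `allowNetwork`. It assumes the redaction entries are regular expressions matched against message text, which is one possible reading of the field; `Policy`, `applyRedactions` and `mayUseProvider` are hypothetical std-only stand-ins.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Std-only stand-in for KIPolicy.
struct Policy {
    std::string visibility = "private";
    bool allowNetwork = false;
    std::vector<std::string> redactions; // assumption: regex patterns over message text
};

// Replace every match of every redaction pattern with a fixed placeholder.
std::string applyRedactions(const Policy& policy, std::string text) {
    for (const std::string& pattern : policy.redactions)
        text = std::regex_replace(text, std::regex(pattern), "[REDACTED]");
    return text;
}

// A remote provider may only be used when the policy allows network access;
// local providers are always permitted.
bool mayUseProvider(const Policy& policy, bool providerIsRemote) {
    return !providerIsRemote || policy.allowNetwork;
}
```

In a Qt implementation the same logic would more naturally use `QRegularExpression`, and patterns would be compiled once rather than on every call.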

---

### 6) Embeddings (for RAG / memory)

```cpp
class KIEmbedOptions {
    Q_GADGET
    Q_PROPERTY(QString model MEMBER model)
    Q_PROPERTY(QString normalize MEMBER normalize) // "l2"|"none"
public:
    QString model = "text-embed-local";
    QString normalize = "l2";
};

class KIEmbeddingResult {
    Q_GADGET
public:
    QVector<QVector<float>> vectors;
    QString model;
};
```

**Why**: unify vector generation; Kompanion memory can plug this directly.
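
The `normalize = "l2"` option matters because, once vectors have unit length, cosine similarity reduces to a plain dot product. A minimal std-only sketch of both steps follows; `l2Normalize` and `dot` are hypothetical helper names, and `std::vector<float>` stands in for `QVector<float>`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scale a vector to unit Euclidean length. The zero vector is returned unchanged.
std::vector<float> l2Normalize(std::vector<float> v) {
    double sumSq = 0.0;
    for (float x : v) sumSq += double(x) * x;
    const double norm = std::sqrt(sumSq);
    if (norm == 0.0) return v;
    for (float& x : v) x = float(x / norm);
    return v;
}

// Dot product of two equally sized vectors; for L2-normalized inputs this
// equals their cosine similarity.
float dot(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    double acc = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) acc += double(a[i]) * b[i];
    return float(acc);
}
```

Normalizing once at embedding time keeps nearest-neighbor lookups in the memory store down to a dot product per candidate.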

---

### 7) Agent Loop Conveniences (optional helpers)

Provide “batteries included” patterns **outside** the Provider:

```cpp
class KIAgent : public QObject {
    Q_OBJECT
public:
    // Plan → (approve) → Execute, with tool calling enabled
    Q_SIGNAL void planReady(const QString& plan);
    Q_SIGNAL void patchReady(const QString& unifiedDiff);
    Q_SIGNAL void needToolResult(const KIToolCall& call);
    Q_SIGNAL void log(const QString& msg);

    void runTask(const QString& naturalInstruction,
                 const KIThread& prior,
                 const QList<KIToolSpec>& tools,
                 const KIChatOptions& opts);
};
```

This helper emits the plan first, then diff/patch proposals, and integrates with your **ACF** and **KTextEditor/KDevelop** diff panes.
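
The “Plan → (approve) → Execute” sequence can be modelled as a small state machine so that nothing is applied before the user has approved the plan. This is only one possible shape for the helper's internals: `AgentState`, `AgentRun` and its method names are hypothetical and do not appear in the proposal.

```cpp
#include <cassert>
#include <string>
#include <utility>

enum class AgentState { Idle, AwaitingApproval, Executing, Done, Rejected };

class AgentRun {
public:
    AgentState state() const { return state_; }
    const std::string& plan() const { return plan_; }

    // The model produced a plan; store it and wait for the user's decision.
    bool planReady(std::string plan) {
        if (state_ != AgentState::Idle) return false;
        plan_ = std::move(plan);
        state_ = AgentState::AwaitingApproval;
        return true;
    }
    // User decisions are only valid while a plan is pending.
    bool approve() { return transition(AgentState::AwaitingApproval, AgentState::Executing); }
    bool reject()  { return transition(AgentState::AwaitingApproval, AgentState::Rejected); }
    // The patch may only be applied after approval.
    bool patchApplied() { return transition(AgentState::Executing, AgentState::Done); }

private:
    // Move to `to` only if currently in `from`; otherwise leave the state untouched.
    bool transition(AgentState from, AgentState to) {
        if (state_ != from) return false;
        state_ = to;
        return true;
    }
    AgentState state_ = AgentState::Idle;
    std::string plan_;
};
```

In `KIAgent`, `planReady()` and `patchReady()` would be emitted on the corresponding transitions, with the UI calling the approve/reject equivalents.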

---

### 8) Error Model & Cancellation

* Introduce `KIError{ code: enum, httpStatus, message, retryAfter }`.
* `KIClient::cancel(requestId)` cancels in-flight work.
* Timeouts & retry policy configurable in `KIChatOptions`.
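
A typed error makes the retry decision mechanical. The sketch below shows one plausible interpretation: retry only transient errors, honor a server-supplied `retryAfter`, and otherwise back off exponentially up to a cap. The enum values, `RetryPolicy` and `nextDelay` are assumptions for illustration; the proposal only names the `KIError` fields.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <optional>
#include <string>

using Ms = std::chrono::milliseconds;

// Hypothetical error codes; the proposal only says "code: enum".
enum class ErrorCode { Network, Timeout, RateLimited, InvalidRequest, Cancelled };

// Std-only stand-in for KIError.
struct Error {
    ErrorCode code;
    int httpStatus = 0;
    std::string message;
    Ms retryAfter{0}; // 0 means the server gave no hint
};

struct RetryPolicy {
    int maxAttempts = 3;
    Ms baseDelay{500};
    Ms maxDelay{8000};
};

// Returns how long to wait before the next attempt, or nullopt to give up.
// `attempt` is the 1-based number of attempts already made.
std::optional<Ms> nextDelay(const Error& e, const RetryPolicy& p, int attempt) {
    assert(attempt >= 1);
    const bool retryable = e.code == ErrorCode::Network ||
                           e.code == ErrorCode::Timeout ||
                           e.code == ErrorCode::RateLimited;
    if (!retryable || attempt >= p.maxAttempts) return std::nullopt;
    if (e.retryAfter.count() > 0) return e.retryAfter; // server hint wins
    const Ms backoff = p.baseDelay * (1 << (attempt - 1)); // 500, 1000, 2000, ...
    return std::min(backoff, p.maxDelay);
}
```

Cancelled and invalid requests are deliberately not retried: repeating them cannot succeed and would only waste tokens.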
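
---

### 9) QML exposure

Register the QObject-based types (`KIClient`, `KIAgent`) with `qmlRegisterType<…>("Kompanion.AI", 1, 0, "KIClient")` etc.
The Q_GADGET value types (`KIMessage`, `KIThread`, `KIToolSpec`) are not QObjects, so `qmlRegisterType` does not apply to them; expose them as value types instead (for example via `QML_VALUE_TYPE` on recent Qt, or at least `Q_DECLARE_METATYPE` so they can travel through properties and signals). With both in place, Plasma applets / Kirigami UIs can wire flows fast.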

---

## Part C — Migration from KLLM to KompanionAI

* **Mapping**

  * `KLLMInterface::getCompletion()` → `KIClient::chat(thread, opts)`.
  * `KLLMRequest{message, model, context}` → `KIThread{ messages=[system?, user?], … }, KIChatOptions{model}`.
  * `KLLMReply` → `KIReply` (adds tool call signals, token deltas, errors).
  * `systemPrompt` → first system `KIMessage`.
  * `models()` → `KIProvider::models()`.

* **Providers**

  * Implement **OllamaProvider** first (parity with current).
  * Add **OpenAIProvider** (JSON-mode/function calling), **LocalProvider** (llama.cpp/candle/etc.).

* **Binary/Source compatibility**

  * You can keep thin wrappers named `KLLM*` forwarding into `KompanionAI` during transition.
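
The mapping above can be sketched as a single conversion function that a `KLLM*` compatibility wrapper might call. All names here (`LegacyRequest`, `Converted`, `fromLegacy`) are hypothetical std-only stand-ins, message parts are collapsed to a single text field, and the opaque KLLM `context` is left out because the proposal does not say how it maps onto thread history.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for KLLMRequest (the legacy `context` field is intentionally omitted).
struct LegacyRequest { std::string message; std::string model; };

// Reduced stand-ins for KIMessage / KIThread / KIChatOptions.
struct Message { std::string role; std::string text; };
struct Thread { std::vector<Message> messages; };
struct ChatOptions { std::string model; };

struct Converted { Thread thread; ChatOptions options; };

// Build the v2 inputs from a legacy request:
//  - systemPrompt becomes the first "system" message (if set),
//  - the request message becomes the "user" message,
//  - an empty request model falls back to the client's default model.
Converted fromLegacy(const LegacyRequest& req,
                     const std::string& systemPrompt,
                     const std::string& defaultModel) {
    Converted out;
    if (!systemPrompt.empty())
        out.thread.messages.push_back(Message{"system", systemPrompt});
    out.thread.messages.push_back(Message{"user", req.message});
    out.options.model = req.model.empty() ? defaultModel : req.model;
    return out;
}
```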

---

## Part D — Minimal Examples

### 1) Simple chat with streaming and tool calling

```cpp
KIClient client;
client.setProvider(new OllamaProvider(QUrl("http://localhost:11434")));
client.setDefaultModel("llama3.1:8b-instruct");

KIThread t;
t.messages << KIMessage{ .role="system", .parts={ { "text/plain","You are Kompanion inside KDE." } } }
           << KIMessage{ .role="user",   .parts={ { "text/plain","Generate a CSV → exam report plan." } } };

KIToolSpec csvSpec;
csvSpec.name = "parse_csv_schema";
csvSpec.description = "Inspect a CSV sample path and return column info.";
csvSpec.params = { { "path","string", true, {} } };

KIChatOptions opts;
opts.tools = { csvSpec };
opts.stream = true;

// Attach handlers in a continuation instead of blocking on result()
// (a QFutureWatcher works too). This assumes the future resolves as soon as
// the reply object exists, before tokens start streaming.
client.chat(t, opts).then(&client, [&client](KIReply *reply) {
    QObject::connect(reply, &KIReply::tokensAdded, reply, [](const QString& d){ qDebug() << d; });
    // Capture the reply pointer by value so the handler never refers to a dead local.
    QObject::connect(reply, &KIReply::toolCallProposed, reply, [&client, reply](const KIToolCall& call){
        if (call.name == "parse_csv_schema") {
            QVariantMap out; out["columns"] = QStringList{ "Name","Grade","Subject" };
            // feed result back (provider-specific or via a KIClient API;
            // returnToolResult() is not yet declared in the KIClient sketch)
            client.returnToolResult(*reply, KIToolResult{ call.name, out });
        }
    });
});
```

### 2) Embeddings for memory

```cpp
// result() blocks until the embeddings are ready: fine for a demo, but avoid it on the GUI thread.
auto emb = client.embed({ "RAII pattern", "SFINAE", "Type erasure" }, KIEmbedOptions{}).result();
qDebug() << emb.vectors.size(); // 3
```

---

## Part E — Optional Extensions (for Gemini to consider)

* **Structured outputs**: JSON schema validation for function outputs (reject invalid JSON, request fix).
* **Safety hooks**: pre-execution validators for tool calls (e.g. forbid dangerous shell).
* **Observability**: OpenTelemetry spans over request lifecycle and tool calls.
* **Rate limiting**: token budgeters per provider.
* **Offline mode**: `allowNetwork=false` forces model to abstain from external lookups.
* **Crash handler integration**: a helper that consumes backtraces and emits a `KIThread` pre-filled with stack/context (pairs naturally with an ACF tool to fetch symbols).
* **CSV app generator**: a thin template tool that scaffolds a Kirigami app, fed by the CSV schema tool, as an end-to-end demo of agentic coding.
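
As an example of the safety-hook idea from the list above, a pre-execution validator can be a plain function that inspects a tool call and returns a verdict, with several validators chained. The sketch is hypothetical throughout: the `run_shell` tool name, its `command` argument, and the deny list are invented for illustration, and a substring deny list is far too weak to serve as a real sandbox.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Reduced stand-in for KIToolCall with string-only arguments.
struct ToolCall { std::string name; std::map<std::string, std::string> arguments; };

struct Verdict { bool allowed; std::string reason; };
using Validator = std::function<Verdict(const ToolCall&)>;

// Example validator: block obviously destructive commands for a hypothetical
// "run_shell" tool. Calls to other tools pass through untouched.
Verdict denyDangerousShell(const ToolCall& call) {
    if (call.name != "run_shell") return {true, ""};
    static const std::vector<std::string> banned = {"rm -rf", "mkfs", "dd if=", "sudo "};
    auto it = call.arguments.find("command");
    if (it == call.arguments.end())
        return {false, "run_shell call without a command"};
    for (const std::string& b : banned)
        if (it->second.find(b) != std::string::npos)
            return {false, "blocked pattern: " + b};
    return {true, ""};
}

// Run every validator in order; the first rejection wins.
Verdict validate(const std::vector<Validator>& chain, const ToolCall& call) {
    for (const Validator& v : chain) {
        Verdict r = v(call);
        if (!r.allowed) return r;
    }
    return {true, ""};
}
```

A rejected verdict can be returned to the model as a failed tool result, which gives it the chance to propose a safer alternative.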

---

## TL;DR

* Keep Qt idioms; elevate to **KompanionAI** with roles, tools, providers, and policies.
* Make **tool calling first-class** with observable events.
* Decouple backend specifics via `KIProvider`.
* Add embeddings & JSON-mode for RAG + structured tasks.
* Provide **agent loop helpers** (plan→diff→apply) outside the provider.
* Expose everything to QML for KDE-native UIs.

This gives you a future-proof client SDK that plugs directly into Kontact/Konsole/KDevelop/Plasma/NeoChat and supports your ACF/MCP agent flows without locking into any single vendor.