Kompanion AI Client SDK for Qt/KDE — API Review & v2 Proposal
Context
Existing code under alpaka/src/core implements a minimal LLM client named KLLM:
- KLLMInterface (central object, Qt-Network based, Ollama URL field, model list, systemPrompt, getCompletion()/getModelInfo()).
- KLLMRequest (message, model, context).
- KLLMReply (streaming, finished, basic timing info, context carry-over).
Goal
Evolve this into a first-class Kompanion SDK that can power:
- agentic coding (tool/function calling, plan→execute),
- app integrations (Kontact, Konsole, KDevelop/Kate, Plasma applets, NeoChat),
- privacy and policy controls (per-source ACLs, consent),
- reliable async/streaming/cancellation,
- multi-backend (Ollama/OpenAI/local engines) with uniform semantics,
- QML-friendly usage.
Part A — Review (What works / What’s missing)
Strengths
- Idiomatic Qt API (QObject, signals/slots).
- Central interface (KLLMInterface) mirrors the QNetworkAccessManager/QtNetwork feel.
- Streaming via KLLMReply::contentAdded() and completion via finished().
- Simple model enumeration + systemPrompt.
Gaps to close
- Message structure: Only a single message string; no roles (system/user/assistant/tool), no multi-turn thread assembly besides a custom KLLMContext.
- Tool calling / function calling: No schema for tool specs, invocation events, results injection, or “plan” steps.
- Backend abstraction: “Ollama URL” is a property of the core interface. Needs pluggable providers with capability discovery.
- Error model: Only errorOccurred(QString) and hasError. Missing typed errors, retry/cancel semantics, timeouts, throttling.
- Observability: Some timing info, but no per-token hooks, token usage counters, logs, traces.
- Threading & cancellation: No unified cancel token; no QFuture/QCoro or QPromise integration.
- QML friendliness: Usable, but message/tool specs should be modelled as Q_GADGET/Q_OBJECT types and Q_PROPERTY-exposed to QML.
- Privacy & policy: No ACLs, no data origin policy, no redaction hooks.
- Embeddings / RAG: No first-class embedding calls, no JSON-mode or structured outputs with validators.
- Agent loop affordances: No “plan→confirm→apply patch / run tests” pattern built-in; no diff/patch helpers.
Part B — v2 API Proposal (“KompanionAI”)
Rename the public surface to KompanionAI (KI = “Künstliche Intelligenz”, which fits German nicely), and keep binary compatibility fences internally if needed.
Namespaces & modules
- namespace KompanionAI { … }
- Core modules:
  - Client (front door)
  - Provider (backend plugins: Ollama, OpenAI, Local)
  - Message / Thread (roles + history)
  - Tool (function calling schema)
  - Completion (text/chat)
  - Embedding (vectorize)
  - Policy (privacy/ACL)
  - Events (streaming tokens, tool calls, traces)
All classes are Qt types with signals/slots & QML types.
1) Message & Thread Model
// Roles & content parts, QML-friendly
class KIMessagePart {
Q_GADGET
Q_PROPERTY(QString mime MEMBER mime) // MEMBER: the fields below are plain data members, not getters
Q_PROPERTY(QString text MEMBER text) // for text/plain
// future: binary, image refs, etc.
public:
QString mime; // "text/plain", "application/json"
QString text;
};
class KIMessage {
Q_GADGET
Q_PROPERTY(QString role MEMBER role) // "system" | "user" | "assistant" | "tool"
Q_PROPERTY(QList<KIMessagePart> parts MEMBER parts)
public:
QString role;
QList<KIMessagePart> parts;
QVariantMap metadata; // arbitrary
};
class KIThread {
Q_GADGET
Q_PROPERTY(QList<KIMessage> messages MEMBER messages)
public:
QList<KIMessage> messages;
};
Why: Enables multi-turn chat with explicit roles and mixed content (text/JSON). Tool outputs show up as role="tool".
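To make the role/part structure concrete, here is a minimal sketch that uses standard-library stand-ins instead of the Qt gadgets so it compiles on its own. The names MessagePart, Message, Thread, plainText and flatten are hypothetical and not part of the proposal. It shows how mixed content stays separable: only text/plain parts are rendered, while JSON parts are skipped.

```cpp
#include <string>
#include <vector>

// Standard-library stand-ins for KIMessagePart / KIMessage / KIThread (assumed shapes).
struct MessagePart {
    std::string mime;  // e.g. "text/plain", "application/json"
    std::string text;
};

struct Message {
    std::string role;  // "system" | "user" | "assistant" | "tool"
    std::vector<MessagePart> parts;
};

struct Thread {
    std::vector<Message> messages;
};

// Concatenate only the text/plain parts of one message.
std::string plainText(const Message& m) {
    std::string out;
    for (const auto& p : m.parts) {
        if (p.mime == "text/plain") out += p.text;
    }
    return out;
}

// Render the thread as "role: text" lines, e.g. for logging or for a
// backend that only accepts a single prompt string.
std::string flatten(const Thread& t) {
    std::string out;
    for (const auto& m : t.messages) {
        out += m.role + ": " + plainText(m) + "\n";
    }
    return out;
}
```

A helper like flatten() could also serve as the fallback path for providers that lack a native chat format.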
2) Tool / Function Calling
class KIToolParam {
Q_GADGET
Q_PROPERTY(QString name MEMBER name)
Q_PROPERTY(QString type MEMBER type) // "string","number","boolean","object"... (JSON Schema-lite)
Q_PROPERTY(bool required MEMBER required)
Q_PROPERTY(QVariant defaultValue MEMBER defaultValue)
public:
QString name, type;
bool required = false;
QVariant defaultValue;
};
class KIToolSpec {
Q_GADGET
Q_PROPERTY(QString name MEMBER name)
Q_PROPERTY(QString description MEMBER description)
Q_PROPERTY(QList<KIToolParam> params MEMBER params)
public:
QString name, description;
QList<KIToolParam> params; // JSON-serializable schema
};
class KIToolCall {
Q_GADGET
Q_PROPERTY(QString name MEMBER name)
Q_PROPERTY(QVariantMap arguments MEMBER arguments)
public:
QString name;
QVariantMap arguments;
};
class KIToolResult {
Q_GADGET
Q_PROPERTY(QString name MEMBER name)
Q_PROPERTY(QVariant result MEMBER result) // result payload (JSON-like)
public:
QString name;
QVariant result;
};
Flow: Model emits a tool call event → client executes tool → emits tool result → model continues. All observable via signals.
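The call → execute → result step of this flow can be sketched as a small registry that maps tool names to handlers. This is an illustration in plain C++ under stated assumptions: ToolRegistry, Args and the handler signature are hypothetical, and a string map stands in for QVariantMap.

```cpp
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>

using Args = std::map<std::string, std::string>;  // stand-in for QVariantMap

struct ToolCall   { std::string name; Args arguments; };
struct ToolResult { std::string name; std::string result; };

class ToolRegistry {
public:
    using Handler = std::function<std::string(const Args&)>;

    void add(const std::string& name, Handler h) { handlers_[name] = std::move(h); }

    // Runs the matching handler; returns nothing if the model asked for
    // a tool that was never registered.
    std::optional<ToolResult> dispatch(const ToolCall& call) const {
        auto it = handlers_.find(call.name);
        if (it == handlers_.end()) return std::nullopt;
        return ToolResult{call.name, it->second(call.arguments)};
    }

private:
    std::map<std::string, Handler> handlers_;
};
```

Returning an empty optional for unknown tools keeps the decision about what to tell the model (error message, retry, abort) with the caller.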
3) Provider abstraction (multi-backend)
class KIProvider : public QObject {
Q_OBJECT
Q_PROPERTY(QString name READ name CONSTANT)
Q_PROPERTY(QStringList models READ models NOTIFY modelsChanged)
Q_PROPERTY(KICapabilities caps READ caps CONSTANT)
public:
virtual QFuture<KIReply*> chat(const KIThread& thread, const KIChatOptions& opts) = 0;
virtual QFuture<KIEmbeddingResult> embed(const QStringList& texts, const KIEmbedOptions& opts) = 0;
// ...
};
class KIClient : public QObject {
Q_OBJECT
Q_PROPERTY(KIProvider* provider READ provider WRITE setProvider NOTIFY providerChanged)
Q_PROPERTY(QString defaultModel READ defaultModel WRITE setDefaultModel NOTIFY defaultModelChanged)
public:
Q_INVOKABLE QFuture<KIReply*> chat(const KIThread&, const KIChatOptions&);
Q_INVOKABLE QFuture<KIEmbeddingResult> embed(const QStringList&, const KIEmbedOptions&);
Q_INVOKABLE void cancel(quint64 requestId);
// ...
};
- OllamaProvider, OpenAIProvider, LocalProvider implement KIProvider.
- KICapabilities advertises support for: JSON-mode, function calling, system prompts, logprobs, images, etc.
- Do not bake “Ollama URL” into Client. It belongs to the provider.
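Capability discovery could look roughly like the sketch below, written in plain C++ so it stands alone. LocalStub, RemoteStub, pickForTools and the capability values are invented for illustration and say nothing about what any real backend supports. Note that the endpoint URL lives in the provider, as recommended above.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Capabilities {
    bool jsonMode = false;
    bool functionCalling = false;
    bool embeddings = false;
};

class Provider {
public:
    virtual ~Provider() = default;
    virtual std::string name() const = 0;
    virtual Capabilities caps() const = 0;
};

// Hypothetical local backend: owns its endpoint URL; capability values are made up.
class LocalStub : public Provider {
public:
    explicit LocalStub(std::string baseUrl) : baseUrl_(std::move(baseUrl)) {}
    std::string name() const override { return "local-stub"; }
    Capabilities caps() const override { return {false, false, true}; }
    const std::string& baseUrl() const { return baseUrl_; }
private:
    std::string baseUrl_;
};

// Hypothetical remote backend that advertises function calling.
class RemoteStub : public Provider {
public:
    std::string name() const override { return "remote-stub"; }
    Capabilities caps() const override { return {true, true, true}; }
};

// Pick the first provider that supports function calling; nullptr if none does.
const Provider* pickForTools(const std::vector<std::unique_ptr<Provider>>& providers) {
    for (const auto& p : providers) {
        if (p->caps().functionCalling) return p.get();
    }
    return nullptr;
}
```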
4) Completion / Reply / Streaming Events
class KIReply : public QObject {
Q_OBJECT
Q_PROPERTY(bool finished READ isFinished NOTIFY finished) // notify via the finished() signal declared below
Q_PROPERTY(int promptTokens READ promptTokens NOTIFY finished) // usage is only known once the reply completes
Q_PROPERTY(int completionTokens READ completionTokens NOTIFY finished)
Q_PROPERTY(QString model READ model CONSTANT)
public:
// accumulated assistant text
Q_INVOKABLE QString text() const;
Q_SIGNALS:
void tokensAdded(const QString& delta); // streaming text
void toolCallProposed(const KIToolCall& call); // model proposes a tool call
void toolResultRequested(const KIToolCall& call); // alt: unified request
void traceEvent(const QVariantMap& span); // observability
void finished(); // reply done
void errorOccurred(const KIError& error);
};
Why: makes tool invocation first-class and observable. You can wire it to ACF/MCP tools or project introspection.
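The streaming contract (deltas arrive, text accumulates, nothing changes after completion) can be sketched without Qt as follows. Reply, onTokensAdded and pushDelta are hypothetical stand-ins for KIReply, its tokensAdded signal, and whatever the provider calls internally. Dropping late chunks is one possible policy, not something the proposal specifies.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

class Reply {
public:
    using DeltaHandler = std::function<void(const std::string&)>;

    // Stand-in for connecting to a tokensAdded(delta) signal.
    void onTokensAdded(DeltaHandler h) { handlers_.push_back(std::move(h)); }

    // Called by the provider for each streamed chunk.
    void pushDelta(const std::string& delta) {
        if (finished_) return;  // assumed policy: ignore chunks after completion
        text_ += delta;
        for (const auto& h : handlers_) h(delta);
    }

    void finish() { finished_ = true; }

    const std::string& text() const { return text_; }  // accumulated assistant text
    bool isFinished() const { return finished_; }

private:
    std::string text_;
    bool finished_ = false;
    std::vector<DeltaHandler> handlers_;
};
```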
5) Options / Policies / Privacy
// KIPolicy is declared first because KIChatOptions stores one by value.
class KIPolicy {
Q_GADGET
Q_PROPERTY(QString visibility MEMBER visibility) // "private|org|public"
Q_PROPERTY(bool allowNetwork MEMBER allowNetwork)
Q_PROPERTY(QStringList redactions MEMBER redactions) // regex keys to redact
// future: per-source ACLs
public:
QString visibility = "private";
bool allowNetwork = false;
QStringList redactions;
};
class KIChatOptions {
Q_GADGET
Q_PROPERTY(QString model MEMBER model)
Q_PROPERTY(bool stream MEMBER stream)
Q_PROPERTY(bool jsonMode MEMBER jsonMode)
Q_PROPERTY(int maxTokens MEMBER maxTokens)
Q_PROPERTY(double temperature MEMBER temperature)
Q_PROPERTY(QList<KIToolSpec> tools MEMBER tools) // permitted tool set for this call
Q_PROPERTY(KIPolicy policy MEMBER policy)
// ...
public:
QString model; bool stream = true; bool jsonMode = false;
int maxTokens = 512; double temperature = 0.2;
QList<KIToolSpec> tools;
KIPolicy policy;
};
Why: explicit control of what the agent may do; dovetails with your HDoD memory ACLs.
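A redaction hook driven by the policy might look like this sketch. It assumes that each entry in redactions is a regular expression applied to outgoing prompt text, which is one reading of "regex keys to redact". Policy and redact are stand-in names, and std::regex replaces Qt's regex classes so the example compiles alone.

```cpp
#include <regex>
#include <string>
#include <vector>

// Stand-in for KIPolicy with the same three fields.
struct Policy {
    std::string visibility = "private";
    bool allowNetwork = false;
    std::vector<std::string> redactions;  // assumed: ECMAScript regex patterns
};

// Replace every match of every redaction pattern before text leaves the process.
std::string redact(const std::string& input, const Policy& policy) {
    std::string out = input;
    for (const auto& pattern : policy.redactions) {
        out = std::regex_replace(out, std::regex(pattern), "[REDACTED]");
    }
    return out;
}
```

In a real client the patterns would probably be compiled once per policy rather than on every call.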
6) Embeddings (for RAG / memory)
class KIEmbedOptions {
Q_GADGET
Q_PROPERTY(QString model MEMBER model)
Q_PROPERTY(QString normalize MEMBER normalize) // "l2"|"none"
public:
QString model = "text-embed-local";
QString normalize = "l2";
};
class KIEmbeddingResult {
Q_GADGET
public:
QVector<QVector<float>> vectors;
QString model;
};
Why: unify vector generation; Kompanion memory can plug this directly.
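For the normalize = "l2" option, a minimal sketch of what normalization means and why it is convenient for memory lookups: after L2 normalization, a plain dot product equals cosine similarity. The function names are hypothetical and std::vector stands in for QVector.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Scale a vector to unit Euclidean length; a zero vector is returned unchanged.
std::vector<float> l2Normalize(const std::vector<float>& v) {
    double sumSq = 0.0;
    for (float x : v) sumSq += static_cast<double>(x) * x;
    if (sumSq == 0.0) return v;
    const double norm = std::sqrt(sumSq);
    std::vector<float> out;
    out.reserve(v.size());
    for (float x : v) out.push_back(static_cast<float>(x / norm));
    return out;
}

// Dot product over the common prefix of a and b. For two L2-normalized
// vectors of equal length this is their cosine similarity.
double dot(const std::vector<float>& a, const std::vector<float>& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i) s += static_cast<double>(a[i]) * b[i];
    return s;
}
```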
7) Agent Loop Conveniences (optional helpers)
Provide “batteries included” patterns outside the Provider:
class KIAgent : public QObject {
Q_OBJECT
public:
// Plan → (approve) → Execute, with tool calling enabled
Q_SIGNAL void planReady(const QString& plan);
Q_SIGNAL void patchReady(const QString& unifiedDiff);
Q_SIGNAL void needToolResult(const KIToolCall& call);
Q_SIGNAL void log(const QString& msg);
void runTask(const QString& naturalInstruction,
const KIThread& prior,
const QList<KIToolSpec>& tools,
const KIChatOptions& opts);
};
This helper emits “plan first”, then “diff/patch proposals”, integrates with your ACF and KTextEditor/KDevelop diff panes.
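The plan → (approve) → execute sequence can be modelled as a small state machine. The sketch below is Qt-free and purely illustrative: AgentLoop, AgentState and the two callbacks are hypothetical, with the callbacks standing in for model calls that would produce the plan and the unified diff.

```cpp
#include <functional>
#include <string>
#include <utility>

enum class AgentState { Idle, AwaitingApproval, Done, Rejected };

class AgentLoop {
public:
    using Step = std::function<std::string(const std::string&)>;

    // planner: instruction -> plan text; executor: plan -> unified diff.
    AgentLoop(Step planner, Step executor)
        : planner_(std::move(planner)), executor_(std::move(executor)) {}

    // Phase 1: produce a plan, then wait for a decision.
    void runTask(const std::string& instruction) {
        plan_ = planner_(instruction);
        patch_.clear();
        state_ = AgentState::AwaitingApproval;
    }

    // Phase 2: the executor runs only after explicit approval.
    void approve(bool accepted) {
        if (state_ != AgentState::AwaitingApproval) return;
        if (!accepted) { state_ = AgentState::Rejected; return; }
        patch_ = executor_(plan_);
        state_ = AgentState::Done;
    }

    AgentState state() const { return state_; }
    const std::string& plan() const { return plan_; }
    const std::string& patch() const { return patch_; }

private:
    Step planner_, executor_;
    std::string plan_, patch_;
    AgentState state_ = AgentState::Idle;
};
```

Keeping approval as a separate call means a UI can show the plan in a dialog or diff pane before anything is executed.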
8) Error Model & Cancellation
- Introduce KIError{ code: enum, httpStatus, message, retryAfter }.
- KIClient::cancel(requestId) cancels in-flight work.
- Timeouts & retry policy configurable in KIChatOptions.
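One way a typed error could drive retries is sketched below in plain C++17. The ErrorCode values, the retryDelay helper, and the backoff numbers (500 ms base, 8 s cap, 3 attempts) are assumptions made for illustration; the proposal only names the KIError fields.

```cpp
#include <algorithm>
#include <chrono>
#include <optional>
#include <string>

enum class ErrorCode { Network, Timeout, RateLimited, InvalidRequest, Cancelled };

// Stand-in for KIError{ code, httpStatus, message, retryAfter }.
struct Error {
    ErrorCode code;
    int httpStatus = 0;
    std::string message;
    std::optional<std::chrono::milliseconds> retryAfter;
};

// Returns how long to wait before retry number `attempt` (0-based),
// or nothing if the request should not be retried.
std::optional<std::chrono::milliseconds> retryDelay(const Error& e, int attempt,
                                                    int maxAttempts = 3) {
    using std::chrono::milliseconds;
    if (attempt >= maxAttempts) return std::nullopt;
    switch (e.code) {
    case ErrorCode::InvalidRequest:
    case ErrorCode::Cancelled:
        return std::nullopt;  // retrying cannot help
    case ErrorCode::RateLimited:
        if (e.retryAfter) return *e.retryAfter;  // honor the server hint
        [[fallthrough]];
    case ErrorCode::Network:
    case ErrorCode::Timeout:
        // Exponential backoff: 500 ms, 1 s, 2 s, ... capped at 8 s.
        return std::min(milliseconds(500) * (1 << attempt), milliseconds(8000));
    }
    return std::nullopt;
}
```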
9) QML exposure
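Register the QObject-based classes (KIClient, KIAgent) with qmlRegisterType<…>("Kompanion.AI", 1, 0, "KIClient") etc.
The Q_GADGET types (KIMessage, KIThread, KIToolSpec) are value types rather than QObjects, so expose them to QML as value types instead (for example with QML_VALUE_TYPE in Qt 6). With both in place, Plasma applets / Kirigami UIs can wire flows fast.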
Part C — Migration from KLLM to KompanionAI
- Mapping
  - KLLMInterface::getCompletion() → KIClient::chat(thread, opts).
  - KLLMRequest{message, model, context} → KIThread{ messages=[system?, user?], … }, KIChatOptions{model}.
  - KLLMReply → KIReply (adds tool call signals, token deltas, errors).
  - systemPrompt → first system KIMessage.
  - models() → KIProvider::models().
- Providers
  - Implement OllamaProvider first (parity with current).
  - Add OpenAIProvider (JSON-mode/function calling), LocalProvider (llama.cpp/candle/etc.).
- Binary/Source compatibility
  - You can keep thin wrappers named KLLM* forwarding into KompanionAI during transition.
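The request mapping can be sketched as a single conversion function that a thin KLLM* wrapper might call. This is plain C++ with stand-in types: LegacyRequest, Msg, ChatOptions, ChatCall and fromLegacy are hypothetical names, and the KLLMContext carry-over is deliberately left out because its contents are not described here.

```cpp
#include <string>
#include <vector>

struct LegacyRequest { std::string message; std::string model; };  // stand-in for KLLMRequest
struct Msg { std::string role; std::string text; };                // simplified KIMessage
struct ChatOptions { std::string model; bool stream = true; };     // simplified KIChatOptions
struct ChatCall { std::vector<Msg> thread; ChatOptions opts; };

// systemPrompt becomes the first system message (if set), the legacy message
// becomes the user turn, and an empty model falls back to the client default.
ChatCall fromLegacy(const LegacyRequest& req,
                    const std::string& systemPrompt,
                    const std::string& defaultModel) {
    ChatCall call;
    if (!systemPrompt.empty()) call.thread.push_back(Msg{"system", systemPrompt});
    call.thread.push_back(Msg{"user", req.message});
    call.opts.model = req.model.empty() ? defaultModel : req.model;
    return call;
}
```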
Part D — Minimal Examples
1) Simple chat with streaming and tool calling
KIClient client;
client.setProvider(new OllamaProvider(QUrl("http://localhost:11434")));
client.setDefaultModel("llama3.1:8b-instruct");
KIThread t;
t.messages << KIMessage{ .role="system", .parts={ { "text/plain","You are Kompanion inside KDE." } } }
<< KIMessage{ .role="user", .parts={ { "text/plain","Generate a CSV → exam report plan." } } };
KIToolSpec csvSpec;
csvSpec.name = "parse_csv_schema";
csvSpec.description = "Inspect a CSV sample path and return column info.";
csvSpec.params = { { "path","string", true, {} } };
KIChatOptions opts;
opts.tools = { csvSpec };
opts.stream = true;
// result() blocks until the reply object exists; in GUI code prefer QFutureWatcher or QFuture::then()
auto *reply = client.chat(t, opts).result();
QObject::connect(reply, &KIReply::tokensAdded, [](const QString& d){ qDebug() << d; });
QObject::connect(reply, &KIReply::toolCallProposed, [&](const KIToolCall& call){
if (call.name == "parse_csv_schema") {
QVariantMap out; out["columns"] = QStringList{ "Name","Grade","Subject" };
// feed result back (provider-specific or via a KIClient API; returnToolResult() is assumed here and is not declared in the KIClient sketch)
client.returnToolResult(*reply, KIToolResult{ call.name, out });
}
});
2) Embeddings for memory
auto emb = client.embed({ "RAII pattern", "SFINAE", "Type erasure" }, KIEmbedOptions{}).result();
qDebug() << emb.vectors.size(); // 3
Part E — Optional Extensions (for Gemini to consider)
- Structured outputs: JSON schema validation for function outputs (reject invalid JSON, request fix).
- Safety hooks: pre-execution validators for tool calls (e.g. forbid dangerous shell).
- Observability: OpenTelemetry spans over request lifecycle and tool calls.
- Rate limiting: token budgeters per provider.
- Offline mode: allowNetwork=false forces model to abstain from external lookups.
- Crash handler integration: a helper that consumes backtraces and emits a KIThread pre-filled with stack/context (pairs naturally with an ACF tool to fetch symbols).
- CSV app generator: a thin template tool that scaffolds a Kirigami app, fed by the CSV schema tool, as an end-to-end demo of agentic coding.
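As an illustration of the safety-hook idea, a pre-execution validator for a shell tool could be as small as the sketch below. The function name, the allow-list approach and the set of rejected metacharacters are all assumptions, and a check like this is a first filter rather than a sandbox.

```cpp
#include <set>
#include <sstream>
#include <string>

// Returns true only if the command contains no shell metacharacters and
// its first word is an allow-listed program name.
bool isShellCallAllowed(const std::string& command,
                        const std::set<std::string>& allowedPrograms) {
    // Reject chaining, piping, redirection and substitution outright.
    if (command.find_first_of(";|&`$><") != std::string::npos) return false;

    std::istringstream in(command);
    std::string program;
    in >> program;  // first whitespace-delimited word; stays empty for blank input
    return !program.empty() && allowedPrograms.count(program) > 0;
}
```

An allow-list is used instead of a deny-list because unknown commands then fail closed.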
TL;DR
- Keep Qt idioms; elevate to KompanionAI with roles, tools, providers, and policies.
- Make tool calling first-class with observable events.
- Decouple backend specifics via KIProvider.
- Add embeddings & JSON-mode for RAG + structured tasks.
- Provide agent loop helpers (plan→diff→apply) outside the provider.
- Expose everything to QML for KDE-native UIs.
This gives you a future-proof client SDK that plugs directly into Kontact/Konsole/KDevelop/Plasma/NeoChat and supports your ACF/MCP agent flows without locking into any single vendor.