Use fox for AI summaries with cube as fallback #3

Closed
opened 2026-05-30 01:09:16 +02:00 by arne · 0 comments
Owner

AI summaries currently go through a single Ollama backend (cube, qwen3:14b). We want fox (gemma4:26b, served by llama-swap over an OpenAI-compatible API) as the primary summarizer because its output is tighter and at least as fast, while keeping cube as a fallback so summarization keeps working when fox is unreachable.

The wrinkle is that the two speak different protocols: cube uses Ollama-native /api/chat, fox is OpenAI-only /v1/chat/completions (it 404s on /api/chat). So this is not a config-only switch — the summarizer needs an abstraction over both protocols plus per-request fallback that always prefers fox and retries on cube when fox errors.

Scope: introduce an analyzer abstraction with an OpenAI-compatible client and the existing Ollama client, a fox-primary/cube-fallback wrapper, config for both backends, and tests. Out of scope: circuit-breaking/cooldown on fox failure (per-request retry is enough for now), and any change to the summary prompt itself.

AI summaries currently go through a single Ollama backend (cube, qwen3:14b). We want fox (gemma4:26b, served by llama-swap over an OpenAI-compatible API) as the primary summarizer because its output is tighter and at least as fast, while keeping cube as a fallback so summarization keeps working when fox is unreachable. The wrinkle is that the two speak different protocols: cube uses Ollama-native /api/chat, fox is OpenAI-only /v1/chat/completions (it 404s on /api/chat). So this is not a config-only switch — the summarizer needs an abstraction over both protocols plus per-request fallback that always prefers fox and retries on cube when fox errors. Scope: introduce an analyzer abstraction with an OpenAI-compatible client and the existing Ollama client, a fox-primary/cube-fallback wrapper, config for both backends, and tests. Out of scope: circuit-breaking/cooldown on fox failure (per-request retry is enough for now), and any change to the summary prompt itself.
arne closed this issue 2026-05-30 01:16:49 +02:00
Sign in to join this conversation.
No labels
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
arne/news#3
No description provided.