Use fox for AI summaries with cube as fallback

arne commented

2026-05-30 01:09:16 +02:00

Owner

AI summaries currently go through a single Ollama backend (cube, qwen3:14b). We want fox (gemma4:26b, served by llama-swap over an OpenAI-compatible API) as the primary summarizer because its output is tighter and at least as fast, while keeping cube as a fallback so summarization keeps working when fox is unreachable.

The wrinkle is that the two speak different protocols: cube uses Ollama-native /api/chat, fox is OpenAI-only /v1/chat/completions (it 404s on /api/chat). So this is not a config-only switch — the summarizer needs an abstraction over both protocols plus per-request fallback that always prefers fox and retries on cube when fox errors.

Scope: introduce an analyzer abstraction with an OpenAI-compatible client and the existing Ollama client, a fox-primary/cube-fallback wrapper, config for both backends, and tests. Out of scope: circuit-breaking/cooldown on fox failure (per-request retry is enough for now), and any change to the summary prompt itself.

AI summaries currently go through a single Ollama backend (cube, qwen3:14b). We want fox (gemma4:26b, served by llama-swap over an OpenAI-compatible API) as the primary summarizer because its output is tighter and at least as fast, while keeping cube as a fallback so summarization keeps working when fox is unreachable. The wrinkle is that the two speak different protocols: cube uses Ollama-native /api/chat, fox is OpenAI-only /v1/chat/completions (it 404s on /api/chat). So this is not a config-only switch — the summarizer needs an abstraction over both protocols plus per-request fallback that always prefers fox and retries on cube when fox errors. Scope: introduce an analyzer abstraction with an OpenAI-compatible client and the existing Ollama client, a fox-primary/cube-fallback wrapper, config for both backends, and tests. Out of scope: circuit-breaking/cooldown on fox failure (per-request retry is enough for now), and any change to the summary prompt itself.

arne referenced this issue from a commit

2026-05-30 01:15:04 +02:00

Use fox for AI summaries with cube as fallback

arne referenced this issue from a pull request that will close it,

2026-05-30 01:15:14 +02:00

Use fox for AI summaries with cube as fallback #4

arne closed this issue

2026-05-30 01:16:49 +02:00

Rows
Columns

Use fox for AI summaries with cube as fallback #3