# AI / LLM Integration
Connect multiple LLM providers and use them for text generation and multi-turn chat from a unified API.
Use case: Add AI-powered features to your application — content generation, Q&A, summarization — using whichever provider you have access to.
## How It Works
- Providers (OpenAI, Anthropic, Gemini, Ollama) are configured via the admin panel; API keys are stored in `ai.config.json`.
- `POST /api/ai/generate` and `POST /api/ai/chat` proxy requests to the selected provider using native `fetch`.
- Each call records token usage to `ai.usage.json` for persistent tracking.
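Because the endpoints are plain JSON-over-HTTP, any client with `fetch` can call them. A minimal sketch of building a generate request (the `provider`, `model`, `prompt`, `temperature`, and `maxTokens` field names are assumptions inferred from the playground controls, not a confirmed schema):

```typescript
// Hypothetical request body for POST /api/ai/generate -- field names
// are assumptions, not a documented schema.
interface GenerateRequest {
  provider: string;      // e.g. "ollama"
  model: string;         // e.g. "llama3.2"
  prompt: string;
  temperature?: number;
  maxTokens?: number;
}

// Build the fetch options for a generate call.
function buildGenerateRequest(body: GenerateRequest) {
  return {
    method: "POST" as const,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// Usage (requires the server to be running; host/port are assumptions):
// const res = await fetch("http://localhost:3000/api/ai/generate",
//   buildGenerateRequest({ provider: "ollama", model: "llama3.2", prompt: "Hello" }));
```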
## Supported Providers
| Provider ID | Label | Requires API Key | Default Models |
|---|---|---|---|
| `openai` | OpenAI | Yes | gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo |
| `anthropic` | Anthropic | Yes | claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5-20251001 |
| `gemini` | Google Gemini | Yes | gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash |
| `ollama` | Ollama (Local) | No | llama3.2, mistral, codellama, phi3 |
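The table above can also be expressed as a lookup, e.g. for populating a model dropdown. This is an illustrative sketch; the shape of the real provider registry is an assumption:

```typescript
// Default models per provider, mirroring the table above.
const DEFAULT_MODELS: Record<string, string[]> = {
  openai: ["gpt-4o", "gpt-4o-mini", "gpt-4-turbo", "gpt-3.5-turbo"],
  anthropic: ["claude-opus-4-6", "claude-sonnet-4-6", "claude-haiku-4-5-20251001"],
  gemini: ["gemini-2.0-flash", "gemini-1.5-pro", "gemini-1.5-flash"],
  ollama: ["llama3.2", "mistral", "codellama", "phi3"],
};

// Providers that cannot be enabled without an API key.
const REQUIRES_API_KEY = new Set(["openai", "anthropic", "gemini"]);
```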
## Configuration
Providers are configured via:
- Admin panel → AI Integration → Providers tab: Enable provider, paste API key, select default model.
- Direct file edit: edit `ai.config.json` in the project root.
Ollama requires a running local Ollama server. Default base URL: `http://localhost:11434`.
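A plausible shape for `ai.config.json` is shown below. This is illustrative only; the actual keys are not documented here, and the values are placeholders:

```json
{
  "activeProvider": "ollama",
  "providers": {
    "openai": { "enabled": true, "apiKey": "<your-key>", "defaultModel": "gpt-4o-mini" },
    "ollama": { "enabled": true, "baseUrl": "http://localhost:11434", "defaultModel": "llama3.2" }
  }
}
```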
## Token Usage Tracking
Every call to `POST /api/ai/generate`, `POST /api/ai/chat`, and `POST /api/agents/:id/run` records:
- Input token count
- Output token count
- Per-provider breakdown
Data is persisted to `ai.usage.json` and survives server restarts. Totals are visible in the admin dashboard.
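The per-provider breakdown can be computed with a simple accumulator over the stored records. A sketch, assuming a record shape that is not documented in this section:

```typescript
// Assumed shape of one usage record in ai.usage.json; the actual
// field names are an assumption for illustration.
interface UsageRecord {
  provider: string;
  inputTokens: number;
  outputTokens: number;
}

// Sum input/output tokens per provider, as surfaced in the dashboard.
function aggregateUsage(
  records: UsageRecord[],
): Map<string, { input: number; output: number }> {
  const totals = new Map<string, { input: number; output: number }>();
  for (const r of records) {
    const t = totals.get(r.provider) ?? { input: 0, output: 0 };
    t.input += r.inputTokens;
    t.output += r.outputTokens;
    totals.set(r.provider, t);
  }
  return totals;
}
```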
## AI Playground (Admin Panel)
- Generate mode: single-prompt generation with controls for provider, model, system prompt, temperature, and max tokens. Shows the token count after each request.
- Chat mode: Multi-turn conversation with scrollable history and persistent context.
- Session totals: Cumulative token counts for the current browser session with a reset button.
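Chat mode's persistent context is simply the accumulated message history resent on each turn. A sketch of maintaining that history client-side (the `system`/`user`/`assistant` role convention is an assumption; this section does not specify the chat message schema):

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Append one user turn and the assistant's reply, preserving the full
// history so the next POST /api/ai/chat request carries all prior context.
function addTurn(
  history: ChatMessage[],
  user: string,
  assistant: string,
): ChatMessage[] {
  return [
    ...history,
    { role: "user", content: user },
    { role: "assistant", content: assistant },
  ];
}
```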