Overview
The Chat API provides AI-powered conversations with Retrieval-Augmented Generation (RAG) over your sources.
How It Works
- Your message is analyzed for intent and entities
- Relevant source chunks are retrieved via hybrid search
- Context is assembled with source citations
- LLM generates a grounded response
- Citations link back to original sources
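The retrieval-and-assembly steps above can be sketched in miniature. This is an illustrative sketch, not the service's actual implementation: `scoreChunk` and `assembleContext` are hypothetical names, and the hybrid score here is a toy blend of lexical overlap and a precomputed vector-similarity score.

```javascript
// Toy hybrid score: keyword overlap blended with a stored vector score.
// (Hypothetical helper — not part of the Chat API itself.)
function scoreChunk(query, chunk) {
  const terms = new Set(query.toLowerCase().split(/\s+/));
  const words = chunk.text.toLowerCase().split(/\s+/);
  const overlap = words.filter((w) => terms.has(w)).length / words.length;
  return 0.5 * overlap + 0.5 * chunk.vectorScore;
}

// Rank chunks and assemble a cited context block for the LLM prompt.
function assembleContext(query, chunks, topK = 2) {
  const ranked = [...chunks].sort(
    (a, b) => scoreChunk(query, b) - scoreChunk(query, a)
  );
  return ranked
    .slice(0, topK)
    .map((c, i) => `[${i + 1}] (${c.source}) ${c.text}`)
    .join('\n');
}

const chunks = [
  { source: 'latency.md', text: 'Claude median latency is low', vectorScore: 0.9 },
  { source: 'pricing.md', text: 'GPT-4 pricing tiers', vectorScore: 0.4 },
];
console.log(assembleContext('Claude latency', chunks, 1));
```

The numbered citations (`[1]`, `[2]`) are what lets the generated response link each claim back to its original source.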
Features
- RAG Integration: Responses grounded in your sources
- Streaming: Real-time token streaming via SSE
- Context Awareness: Optionally include scenario/stack context
- Multi-Provider: Works with Anthropic, OpenAI, Google, Ollama
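A request body exercising these features might look like the following sketch. Only `message` and `use_sources` appear in the streaming example in these docs; the `context` and `provider` field names are assumptions for illustration.

```javascript
// Hypothetical request body — `context` and `provider` are assumed
// field names; only `message` and `use_sources` are documented here.
const body = {
  message: 'Compare latency of Claude vs GPT-4',
  use_sources: true,                       // ground the answer in your sources (RAG)
  context: { scenario: 'benchmarking' },   // optional scenario/stack context
  provider: 'anthropic',                   // or 'openai', 'google', 'ollama'
};
console.log(JSON.stringify(body));
```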
Streaming Example
const response = await fetch('/api/workspaces/{id}/chat/stream', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    message: 'Compare latency of Claude vs GPT-4',
    use_sources: true
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // `value` is a Uint8Array; decode it before parsing SSE events
  const text = decoder.decode(value, { stream: true });
  // Process SSE events in `text`
}
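The "process SSE events" step can be filled in by buffering decoded text and splitting on blank lines, which is the standard SSE frame delimiter. The `data:` framing below follows the SSE specification; the shape of the chat endpoint's payloads is an assumption.

```javascript
// Minimal SSE frame parser: feed it decoded text chunks, get back the
// `data:` payloads of any complete frames. Partial frames stay buffered.
function createSSEParser() {
  let buffer = '';
  return function parse(textChunk) {
    buffer += textChunk;
    const events = [];
    let sep;
    // A blank line (\n\n) terminates each SSE frame.
    while ((sep = buffer.indexOf('\n\n')) !== -1) {
      const frame = buffer.slice(0, sep);
      buffer = buffer.slice(sep + 2);
      for (const line of frame.split('\n')) {
        if (line.startsWith('data: ')) events.push(line.slice(6));
      }
    }
    return events;
  };
}

const parse = createSSEParser();
// Network chunks rarely align with frame boundaries, so the parser
// must tolerate a frame split across two reads:
console.log(parse('data: {"token":"Hel"}\n\ndata: {"to'));
console.log(parse('ken":"lo"}\n\n'));
```

Calling `parse` with each decoded chunk inside the read loop yields complete token payloads regardless of how the network splits the stream.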