Overview

The Chat API provides AI-powered conversations with Retrieval-Augmented Generation (RAG) over your sources.

How It Works

  1. Your message is analyzed for intent and entities
  2. Relevant source chunks are retrieved via hybrid search
  3. Context is assembled with source citations
  4. LLM generates a grounded response
  5. Citations link back to original sources
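Steps 2–3 above can be sketched as a small helper that joins retrieved chunks into a cited prompt context. This is an illustrative sketch only: the real pipeline runs server-side, and the chunk shape (`sourceId`, `text`) and the `[n]` citation format here are assumptions, not the API's actual internals.

```javascript
// Assemble retrieved chunks into an LLM context with inline citation
// markers (step 3), so the response can link back to sources (step 5).
// Chunk shape and citation format are assumed for illustration.
function assembleContext(chunks) {
  const citations = chunks.map((c) => c.sourceId);
  const context = chunks
    .map((c, i) => `[${i + 1}] ${c.text}`)
    .join("\n\n");
  return { context, citations };
}
```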

Features

  • RAG Integration: Responses grounded in your sources
  • Streaming: Real-time token streaming via SSE
  • Context Awareness: Optionally include scenario/stack context
  • Multi-Provider: Works with Anthropic, OpenAI, Google, Ollama

Streaming Example

const response = await fetch('/api/workspaces/{id}/chat/stream', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    message: 'Compare latency of Claude vs GPT-4',
    use_sources: true
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // Decode the raw bytes, then process the SSE events in `chunk`
  const chunk = decoder.decode(value, { stream: true });
}