Overview

The Chat API provides AI-powered conversations with Retrieval-Augmented Generation (RAG) over your sources.

How It Works

  1. Your message is analyzed for intent and entities
  2. Relevant source chunks are retrieved via hybrid search
  3. Context is assembled with source citations
  4. LLM generates a grounded response
  5. Citations link back to original sources
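Steps 2–3 above can be sketched as a small helper that joins retrieved chunks into a cited prompt context. This is an illustrative sketch only: the real pipeline runs server-side, and the chunk shape (`sourceId`, `text`) and the `[n]` citation format here are assumptions, not the API's actual internals.

```javascript
// Assemble retrieved chunks into an LLM context with inline citation
// markers (step 3), so the response can link back to sources (step 5).
// Chunk shape and citation format are assumed for illustration.
function assembleContext(chunks) {
  const citations = chunks.map((c) => c.sourceId);
  const context = chunks
    .map((c, i) => `[${i + 1}] ${c.text}`)
    .join("\n\n");
  return { context, citations };
}
```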

Features

  • RAG Integration: Responses grounded in your sources
  • Streaming: Real-time token streaming via SSE
  • Context Awareness: Optionally include scenario/stack context
  • Multi-Provider: Works with Anthropic, OpenAI, Google, Ollama

Streaming Example

const response = await fetch('/api/workspaces/{id}/chat/stream', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    message: 'Compare latency of Claude vs GPT-4',
    use_sources: true
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // Decode the raw bytes, then process the SSE events in `chunk`
  const chunk = decoder.decode(value, { stream: true });
}