Overview

A scenario defines the requirements for an AI workload: its workload configuration, performance SLOs, budget constraints, and compliance needs.

Scenario Structure

Workload Configuration

  • Category: inference, training, comparison
  • Type: llm_inference, embedding, fine_tuning, etc.
  • Traffic Profile: steady, bursty, spiky
  • Token Sizes: Expected request and response sizes, in tokens
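
As a rough illustration, the workload fields above could be modeled as a small validated record. This is only a sketch: the class name `WorkloadConfig` and its field names are assumptions made for the example, not the product's schema. The allowed category and traffic profile values are taken from the list above; the type is left unvalidated because that list is open-ended ("etc.").

```python
from dataclasses import dataclass

# Allowed values copied from the list above. Field names are hypothetical.
CATEGORIES = {"inference", "training", "comparison"}
TRAFFIC_PROFILES = {"steady", "bursty", "spiky"}


@dataclass(frozen=True)
class WorkloadConfig:
    category: str
    type: str              # e.g. "llm_inference", "embedding", "fine_tuning"
    traffic_profile: str
    request_tokens: int    # expected tokens per request
    response_tokens: int   # expected tokens per response

    def __post_init__(self):
        # Reject values outside the documented sets early.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        if self.traffic_profile not in TRAFFIC_PROFILES:
            raise ValueError(f"unknown traffic profile: {self.traffic_profile!r}")
        if self.request_tokens < 0 or self.response_tokens < 0:
            raise ValueError("token sizes must be non-negative")


# Example values are illustrative only.
workload = WorkloadConfig("inference", "llm_inference", "bursty", 512, 256)
```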

Service Level Objectives (SLOs)

  • Latency: P50, P95, P99 targets in milliseconds
  • Throughput: Requests per second
  • Availability: Uptime percentage (e.g., 99.9%)
  • Error Rate: Acceptable error percentage
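
To make the SLO fields concrete, the sketch below checks measured latencies against P50/P95/P99 targets and converts an availability percentage into allowed downtime. The `SLO` class, its field names, and the helper functions are assumptions for illustration; the nearest-rank percentile used here is one common definition and may differ from how the product computes percentiles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    # Hypothetical field names mirroring the list above.
    p50_ms: float
    p95_ms: float
    p99_ms: float
    min_rps: float
    availability_pct: float
    max_error_rate_pct: float


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding float rounding.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def latency_violations(slo, latencies_ms):
    """Return the percentile labels whose measured value exceeds the target."""
    targets = {50: slo.p50_ms, 95: slo.p95_ms, 99: slo.p99_ms}
    return [f"P{p}" for p, target in targets.items()
            if percentile(latencies_ms, p) > target]


def allowed_downtime_minutes(availability_pct, days=30):
    """Downtime budget implied by an availability target over `days` days."""
    return (100 - availability_pct) / 100 * days * 24 * 60


# Illustrative data: latencies of 1..100 ms, one sample each.
slo = SLO(p50_ms=60, p95_ms=90, p99_ms=100, min_rps=50,
          availability_pct=99.9, max_error_rate_pct=0.1)
print(latency_violations(slo, list(range(1, 101))))  # ['P95']
```

Under these assumptions, a 99.9% availability target over a 30-day month leaves roughly 43 minutes of downtime.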

Budget Constraints

  • Monthly Limit: Total spend cap in USD
  • Cost Per Request: Target cost, expressed per 1K requests
  • Compute Allocation: Percentage of the budget allocated to compute rather than other costs
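
The arithmetic linking these three budget fields can be sketched as follows. The `Budget` class and its method names are hypothetical, and the projection assumes a constant request rate over a 30-day month, which a bursty or spiky traffic profile would not satisfy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    # Hypothetical field names mirroring the list above.
    monthly_limit_usd: float
    cost_per_1k_requests_usd: float
    compute_allocation_pct: float

    def compute_budget_usd(self):
        """Share of the monthly limit reserved for compute."""
        return self.monthly_limit_usd * self.compute_allocation_pct / 100

    def max_monthly_requests(self):
        """Requests affordable if the whole limit is spent at the target cost."""
        return int(self.monthly_limit_usd / self.cost_per_1k_requests_usd * 1000)

    def projected_monthly_cost_usd(self, rps, days=30):
        """Spend at a constant request rate (a simplifying assumption)."""
        requests = rps * 86_400 * days
        return requests / 1000 * self.cost_per_1k_requests_usd


# Illustrative numbers only.
budget = Budget(monthly_limit_usd=10_000,
                cost_per_1k_requests_usd=0.5,
                compute_allocation_pct=70)
over_budget = budget.projected_monthly_cost_usd(rps=10) > budget.monthly_limit_usd
```

With these example numbers, a steady 10 requests per second projects to more than the monthly limit, so either the throughput target or the cost target would need to change.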

Compliance Requirements

  • Regions: Allowed deployment regions
  • Data Residency: Data storage location requirements
  • Certifications: Required compliance certifications (SOC2, HIPAA, etc.)
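
One way to read the compliance requirements is as a filter over candidate deployments. The sketch below assumes hypothetical `Compliance` and `Candidate` records and treats data residency as a single required storage location; the region and location strings are example values, not identifiers defined by the product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compliance:
    # Hypothetical field names mirroring the list above.
    allowed_regions: frozenset
    data_residency: str
    required_certifications: frozenset


@dataclass(frozen=True)
class Candidate:
    # A hypothetical deployment option to check against the requirements.
    name: str
    region: str
    storage_location: str
    certifications: frozenset


def is_compliant(req, candidate):
    """True when the candidate meets every compliance requirement."""
    return (
        candidate.region in req.allowed_regions
        and candidate.storage_location == req.data_residency
        # Every required certification must be held by the candidate.
        and req.required_certifications <= candidate.certifications
    )


req = Compliance(frozenset({"eu-west", "eu-central"}), "EU",
                 frozenset({"SOC2", "HIPAA"}))
candidates = [
    Candidate("a", "eu-west", "EU", frozenset({"SOC2", "HIPAA", "ISO27001"})),
    Candidate("b", "us-east", "US", frozenset({"SOC2", "HIPAA"})),
    Candidate("c", "eu-central", "EU", frozenset({"SOC2"})),
]
compliant = [c.name for c in candidates if is_compliant(req, c)]
```

Here only candidate "a" passes: "b" is outside the allowed regions and "c" lacks a required certification.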

Document Extraction

Extract scenarios from existing requirements documents:

POST /api/workspaces/{id}/scenarios/extract
Content-Type: application/json

{
  "document_content": "base64-encoded-content",
  "document_type": "pdf",
  "extraction_mode": "full"
}
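
A client might assemble this request as sketched below, using only the Python standard library. The base URL, workspace ID, and the helper name `build_extract_request` are placeholders invented for the example. Authentication and the response format are not described here, so the sketch builds the request without sending it.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_extract_request(base_url, workspace_id, document_bytes,
                          document_type="pdf", extraction_mode="full"):
    """Build (but do not send) the extraction request shown above."""
    payload = {
        # The document is sent base64-encoded inside the JSON body.
        "document_content": base64.b64encode(document_bytes).decode("ascii"),
        "document_type": document_type,
        "extraction_mode": extraction_mode,
    }
    url = "{}/api/workspaces/{}/scenarios/extract".format(
        base_url.rstrip("/"), urllib.parse.quote(workspace_id, safe=""))
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host, workspace ID, and document bytes for illustration.
request = build_extract_request("https://api.example.com", "ws_123",
                                b"%PDF-1.7 sample")
# Sending would be urllib.request.urlopen(request), plus whatever
# authentication headers the API requires.
```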