Scenarios

Scenarios define your AI workload requirements, including performance SLOs, budget constraints, and compliance needs.

Scenario Structure

Workload Configuration

Category: inference, training, comparison
Type: llm_inference, embedding, fine_tuning, etc.
Traffic Profile: steady, bursty, spiky
Token Sizes: Expected request and response token counts

Service Level Objectives (SLOs)

Latency: P50, P95, P99 targets in milliseconds
Throughput: Requests per second
Availability: Uptime percentage (e.g., 99.9%)
Error Rate: Acceptable error percentage
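
To make the SLO fields concrete, here is a minimal sketch of checking measured traffic against latency and error-rate targets. The dictionary keys (`latency_ms`, `max_error_rate_pct`), the helper names, and the sample numbers are assumptions for illustration and are not part of the documented scenario schema. Throughput and availability are left out for brevity.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def check_slos(latencies_ms, total_requests, failed_requests, slo):
    """Return a dict mapping each SLO name to True (met) or False (missed)."""
    results = {}
    for name, pct in (("p50", 50), ("p95", 95), ("p99", 99)):
        results[name] = percentile(latencies_ms, pct) <= slo["latency_ms"][name]
    error_rate_pct = 100 * failed_requests / total_requests
    results["error_rate"] = error_rate_pct <= slo["max_error_rate_pct"]
    return results

# Hypothetical targets and measurements, for illustration only.
slo = {
    "latency_ms": {"p50": 250, "p95": 800, "p99": 1500},
    "max_error_rate_pct": 1.0,
}
latencies = [120, 150, 180, 210, 240, 300, 450, 480, 700, 1200]

print(check_slos(latencies, total_requests=1000, failed_requests=5, slo=slo))
```

With only ten samples, P95 and P99 both resolve to the slowest request, so a real check would need a much larger sample before the tail percentiles mean anything.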

Budget Constraints

Monthly Limit: Total spend cap in USD
Cost Per Request: Target cost in USD per 1K requests
Compute Allocation: Percentage of the budget allocated to compute versus other costs

Compliance Requirements

Regions: Allowed deployment regions
Data Residency: Data storage location requirements
Certifications: Required compliance certifications (SOC 2, HIPAA, etc.)
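
The four groups above can be pictured as a single record. The sketch below is one possible in-memory representation, not the platform's actual schema: the class name, every field name, and the validation rules are assumptions. Only the allowed values for category and traffic profile come from the lists above.

```python
from dataclasses import dataclass, field

# Allowed values taken from the lists above; workload type is open-ended ("etc."),
# so it is not validated here.
CATEGORIES = {"inference", "training", "comparison"}
TRAFFIC_PROFILES = {"steady", "bursty", "spiky"}

@dataclass
class Scenario:
    # Workload configuration
    category: str
    workload_type: str
    traffic_profile: str
    request_tokens: int
    response_tokens: int
    # Service level objectives
    latency_ms: dict            # e.g. {"p50": 200, "p95": 500, "p99": 1000}
    throughput_rps: float
    availability_pct: float
    max_error_rate_pct: float
    # Budget constraints
    monthly_limit_usd: float
    cost_per_1k_requests_usd: float
    compute_allocation_pct: float
    # Compliance requirements
    regions: list = field(default_factory=list)
    data_residency: str = ""
    certifications: list = field(default_factory=list)

    def problems(self):
        """Return a list of human-readable validation problems (empty if none)."""
        issues = []
        if self.category not in CATEGORIES:
            issues.append(f"unknown category: {self.category!r}")
        if self.traffic_profile not in TRAFFIC_PROFILES:
            issues.append(f"unknown traffic profile: {self.traffic_profile!r}")
        for name in ("availability_pct", "max_error_rate_pct", "compute_allocation_pct"):
            value = getattr(self, name)
            if not 0 <= value <= 100:
                issues.append(f"{name} must be between 0 and 100, got {value}")
        return issues

# Hypothetical example values.
scenario = Scenario(
    category="inference", workload_type="llm_inference", traffic_profile="bursty",
    request_tokens=512, response_tokens=256,
    latency_ms={"p50": 200, "p95": 500, "p99": 1000},
    throughput_rps=50, availability_pct=99.9, max_error_rate_pct=0.5,
    monthly_limit_usd=10000, cost_per_1k_requests_usd=2.0, compute_allocation_pct=70,
    regions=["us-east-1"], data_residency="US", certifications=["SOC2"],
)
print(scenario.problems())
```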

Document Extraction

Extract scenarios from existing requirements documents:

POST /api/workspaces/{id}/scenarios/extract
Content-Type: application/json

{
  "document_content": "base64-encoded-content",
  "document_type": "pdf",
  "extraction_mode": "full"
}
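
As a rough illustration, the request above could be assembled in Python using only the standard library. The base URL, workspace ID, and function name are placeholders. Authentication and the response format are not described in this section, so the sketch builds the request without sending it.

```python
import base64
import json
import urllib.request

def build_extract_request(base_url, workspace_id, document_bytes,
                          document_type="pdf", extraction_mode="full"):
    """Build (but do not send) the POST request for scenario extraction."""
    payload = {
        # The endpoint expects the document as a base64-encoded string.
        "document_content": base64.b64encode(document_bytes).decode("ascii"),
        "document_type": document_type,
        "extraction_mode": extraction_mode,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/workspaces/{workspace_id}/scenarios/extract",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Placeholder host, workspace ID, and document bytes.
req = build_extract_request("https://example.invalid", "ws-123", b"%PDF-1.7 sample bytes")
print(req.get_method(), req.full_url)

# To send it, pass the request to urllib.request.urlopen(req) after adding
# whatever authentication headers your deployment requires.
```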