Scenarios
Scenarios define your AI workload requirements including performance SLOs, budget constraints, and compliance needs.
Scenario Structure
Workload Configuration
- Category: inference, training, comparison
- Type: llm_inference, embedding, fine_tuning, etc.
- Traffic Profile: steady, bursty, spiky
- Token Sizes: Request/response token expectations
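As a rough illustration, the workload fields listed above could be modeled as a small validated record. This is only a sketch: the class name, the field names (`workload_type`, `request_tokens`, `response_tokens`), and the validation behavior are assumptions made for the example, not the product's actual schema. Only the category and traffic-profile values come from the list above.

```python
from dataclasses import dataclass

# Values taken from the list above.
CATEGORIES = {"inference", "training", "comparison"}
TRAFFIC_PROFILES = {"steady", "bursty", "spiky"}


@dataclass(frozen=True)
class WorkloadConfig:
    """Hypothetical container for a scenario's workload configuration."""

    category: str
    workload_type: str    # e.g. "llm_inference"; the full type list is not given, so it is not validated
    traffic_profile: str
    request_tokens: int   # hypothetical name for the expected request size
    response_tokens: int  # hypothetical name for the expected response size

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        if self.traffic_profile not in TRAFFIC_PROFILES:
            raise ValueError(f"unknown traffic profile: {self.traffic_profile!r}")
        if self.request_tokens < 0 or self.response_tokens < 0:
            raise ValueError("token sizes must be non-negative")


workload = WorkloadConfig(
    category="inference",
    workload_type="llm_inference",
    traffic_profile="bursty",
    request_tokens=512,
    response_tokens=256,
)
```

Validating at construction time means a misspelled category or profile fails immediately rather than later in the pipeline.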
Service Level Objectives (SLOs)
- Latency: P50, P95, P99 targets in milliseconds
- Throughput: Requests per second
- Availability: Uptime percentage (e.g., 99.9%)
- Error Rate: Acceptable error percentage
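To make these targets concrete, the sketch below checks latency samples against P50/P95/P99 targets and converts an availability percentage into allowed downtime. The function names and the nearest-rank percentile method are choices made for this example; the product may compute percentiles differently.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_latency_slos(samples_ms, targets_ms):
    """Map each target name (e.g. "p95") to (observed_ms, target_ms, met)."""
    results = {}
    for name, target in targets_ms.items():
        observed = percentile(samples_ms, float(name.lstrip("pP")))
        results[name] = (observed, target, observed <= target)
    return results


def allowed_downtime_minutes(availability_pct, days=30):
    """Downtime budget implied by an availability target over a period of `days` days."""
    return (100 - availability_pct) / 100 * days * 24 * 60


# Illustrative data: 100 samples of 1..100 ms.
latency_report = check_latency_slos(
    list(range(1, 101)),
    {"p50": 60, "p95": 90, "p99": 100},
)
# latency_report["p95"] is (95, 90, False): the observed P95 exceeds the 90 ms target.
downtime = allowed_downtime_minutes(99.9)  # about 43.2 minutes over 30 days
```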
Budget Constraints
- Monthly Limit: Total spend cap in USD
- Cost Per Request: Target cost per 1K requests
- Compute Allocation: Percentage of the budget allocated to compute versus other costs
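The arithmetic linking these three constraints can be sketched as follows. The function and key names are hypothetical, the cost target is read as USD per 1,000 requests (as the description above states), and the compute allocation is assumed to be a percentage of the monthly limit.

```python
def budget_summary(monthly_limit_usd, cost_per_1k_usd, compute_pct, days=30):
    """Derive request volume and the compute/other split implied by a budget."""
    max_requests = monthly_limit_usd / cost_per_1k_usd * 1000
    return {
        "max_requests_per_month": max_requests,
        # Average request rate that would exactly exhaust the monthly limit.
        "sustained_rps_at_cap": max_requests / (days * 86_400),
        "compute_budget_usd": monthly_limit_usd * compute_pct / 100,
        "other_budget_usd": monthly_limit_usd * (100 - compute_pct) / 100,
    }


# Illustrative numbers only: $10,000/month, $0.50 per 1K requests, 70% compute.
summary = budget_summary(10_000, 0.50, 70)
# summary["max_requests_per_month"] is 20,000,000 requests
# summary["compute_budget_usd"] is 7000.0
```

Comparing `sustained_rps_at_cap` with the throughput SLO is a quick way to spot a scenario whose performance targets cannot fit inside its spend cap.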
Compliance Requirements
- Regions: Allowed deployment regions
- Data Residency: Data storage location requirements
- Certifications: Required compliance (SOC2, HIPAA, etc.)
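One way to picture how these requirements constrain a deployment is a simple filter over candidate options. Everything here is an assumption made for illustration: the class and field names, the region identifiers, and the idea that a candidate declares a deployment region, a storage region, and a set of certifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComplianceRequirements:
    """Hypothetical representation of a scenario's compliance section."""

    regions: frozenset          # allowed deployment regions
    data_residency: frozenset   # regions where data may be stored
    certifications: frozenset   # certifications every candidate must hold


def is_compliant(req, candidate):
    """Check a candidate dict with 'region', 'storage_region', and 'certifications' keys."""
    return (
        candidate["region"] in req.regions
        and candidate["storage_region"] in req.data_residency
        and req.certifications <= set(candidate["certifications"])
    )


requirements = ComplianceRequirements(
    regions=frozenset({"eu-west-1", "eu-central-1"}),  # example region names
    data_residency=frozenset({"eu-west-1"}),
    certifications=frozenset({"SOC2", "HIPAA"}),
)

candidates = [
    {"name": "a", "region": "eu-west-1", "storage_region": "eu-west-1",
     "certifications": ["SOC2", "HIPAA", "ISO27001"]},
    {"name": "b", "region": "us-east-1", "storage_region": "us-east-1",
     "certifications": ["SOC2", "HIPAA"]},
    {"name": "c", "region": "eu-central-1", "storage_region": "eu-west-1",
     "certifications": ["SOC2"]},
]

compliant_names = [c["name"] for c in candidates if is_compliant(requirements, c)]
# Only candidate "a" passes: "b" is in a disallowed region, "c" lacks HIPAA.
```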
Document Extraction
Extract scenarios from existing requirements documents:
POST /api/workspaces/{id}/scenarios/extract
Content-Type: application/json

{
  "document_content": "base64-encoded-content",
  "document_type": "pdf",
  "extraction_mode": "full"
}
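A minimal sketch of building this request from Python with only the standard library. The base URL and workspace ID are placeholders, the helper name is hypothetical, and authentication is omitted because it is not described here. The request is constructed but not sent.

```python
import base64
import json
import urllib.request


def build_extract_request(base_url, workspace_id, document_bytes,
                          document_type="pdf", extraction_mode="full"):
    """Build (but do not send) the scenario extraction request."""
    payload = {
        # The endpoint expects the document as base64-encoded content.
        "document_content": base64.b64encode(document_bytes).decode("ascii"),
        "document_type": document_type,
        "extraction_mode": extraction_mode,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/workspaces/{workspace_id}/scenarios/extract",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host, workspace ID, and document bytes, for illustration only.
request = build_extract_request("https://example.invalid", "ws-123", b"%PDF-1.7 sample")

# To send it, pass the request to urllib.request.urlopen(request).
# The response format is not documented here, so parsing is left out.
```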