Run evaluation
POST /workspaces/{workspace_id}/evaluations/{evaluation_id}/run
Start a run of the specified evaluation.
Authorizations
Parameters
Path Parameters
workspace_id (string, format: uuid, required): Workspace ID
evaluation_id (string, format: uuid, required): Evaluation ID
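For illustration, a minimal sketch of starting a run with Python's requests library. The base URL, the bearer-token Authorization header, the environment variable name, and the example UUIDs are assumptions for the sketch, not part of this reference.

    import os
    import requests

    BASE_URL = "https://api.example.com"  # assumed base URL
    TOKEN = os.environ["API_TOKEN"]       # assumed bearer-token auth scheme

    # Example path parameters (both are UUIDs)
    workspace_id = "3f1c2d44-9b1e-4c7a-8f2a-1d2e3f4a5b6c"
    evaluation_id = "7a8b9c0d-1e2f-4a3b-8c4d-5e6f7a8b9c0d"

    # POST /workspaces/{workspace_id}/evaluations/{evaluation_id}/run
    resp = requests.post(
        f"{BASE_URL}/workspaces/{workspace_id}/evaluations/{evaluation_id}/run",
        headers={"Authorization": f"Bearer {TOKEN}"},
    )
    resp.raise_for_status()

    # The "Evaluation started" body described under Responses
    evaluation = resp.json()
    print(evaluation["status"], evaluation.get("progress_percent"))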
Responses
Evaluation started
object
  id: string (uuid)
  workspace_id: string (uuid)
  name: string
  description: string
  evaluation_type: string
  targets: Array<object>
    target_type: string
    target_id: string (uuid) - ID of stack or scenario (if applicable)
    model_provider: string - Model provider for direct model evaluation
    model_id: string - Model ID for direct model evaluation
    label: string - Display label for this target
  benchmarks: Array<object>
    benchmark_type: string
    subset: string - Specific benchmark subset
    num_samples: integer - Number of samples to evaluate
  custom_eval: object
    prompt_template: string - Prompt template for evaluation
    criteria: Array<object>
      name: string
      weight: number
      description: string
  judge_config: object
    provider: string
    model: string
  methodology: object
    sample_size: integer
    confidence_level: number
    random_seed: integer
  status: string
  progress_percent: number
  started_at: string (date-time)
  completed_at: string (date-time)
  error_message: string
  results: object
    target_results: Array<object>
      target_label: string
      metrics: object (additional properties: number)
      benchmark_scores: object (additional properties: number)
    comparisons: Array<object>
      target_a: string
      target_b: string
      metric: string
      delta: number
      winner: string
    overall_summary: string
  created_at: string (date-time)
  updated_at: string (date-time)
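As a sketch of consuming a payload of this shape, the helper below prints per-target metrics, benchmark scores, and pairwise comparisons from the results object. It assumes results and its sub-fields are only populated once status reports completion; that lifecycle detail is an assumption, not something this schema guarantees.

    def summarize(evaluation: dict) -> None:
        """Print per-target metrics and pairwise comparisons from an evaluation object."""
        results = evaluation.get("results") or {}

        for target in results.get("target_results", []):
            label = target.get("target_label", "<unnamed target>")
            # metrics and benchmark_scores are maps of string keys to numbers
            for name, value in (target.get("metrics") or {}).items():
                print(f"{label} metric {name}: {value}")
            for name, score in (target.get("benchmark_scores") or {}).items():
                print(f"{label} benchmark {name}: {score}")

        for cmp in results.get("comparisons", []):
            print(
                f"{cmp['target_a']} vs {cmp['target_b']} on {cmp['metric']}: "
                f"delta={cmp['delta']}, winner={cmp['winner']}"
            )

        if results.get("overall_summary"):
            print(results["overall_summary"])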