SvelteBench Visualization

Note: OpenAI thinking models (o3, o4) do not support temperature adjustments. o1-pro models use the "medium" reasoning effort setting.

OpenAI

o1-pro-2025-03-19

Test         Status   pass@1  pass@10  Passing Samples  Errors
counter      ❌ FAIL  0.0000  0.0000   0/1              1
derived      ❌ FAIL  0.0000  0.0000   0/1              1
derived-by   ❌ FAIL  0.0000  0.0000   0/1              3
each         ❌ FAIL  0.0000  0.0000   0/1              1
effect       ❌ FAIL  0.0000  0.0000   0/1              1
hello-world  ✅ PASS  1.0000  1.0000   1/1              0
inspect      ❌ FAIL  0.0000  0.0000   0/1              1
props        ❌ FAIL  0.0000  0.0000   0/1              1
snippets     ❌ FAIL  0.0000  0.0000   0/1              1
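
For readers unfamiliar with the pass@1 and pass@10 columns, the following is a minimal sketch of the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of samples and c the number that passed. This page does not state how SvelteBench computes these values, so the function name `pass_at_k` and the choice to clamp k to n when fewer than k samples exist are assumptions made for illustration only. With that clamp, the sketch gives 0.0 for a 0/1 row and 1.0 for a 1/1 row.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n samples, of which c passed.

    Uses the standard unbiased estimator 1 - C(n-c, k) / C(n, k).
    Assumption (not taken from this page): when fewer than k samples
    were drawn, k is clamped to n so the estimate stays defined.
    """
    if n <= 0 or k <= 0 or not 0 <= c <= n:
        raise ValueError("require n > 0, k > 0 and 0 <= c <= n")

    k = min(k, n)  # hypothetical handling of the n < k case

    # If there are fewer than k failing samples, every size-k draw
    # must contain at least one passing sample.
    if n - c < k:
        return 1.0

    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Rows with a single sample, as in the table above.
    print(f"{pass_at_k(n=1, c=0, k=10):.4f}")
    print(f"{pass_at_k(n=1, c=1, k=10):.4f}")
```

With only one sample per test, pass@1 and pass@10 collapse to the same value under this clamp; the two columns only diverge once a run draws several samples per test.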