SvelteBench Visualization

Note: OpenAI thinking models (o3, o4) do not support temperature adjustments.


Google

gemini-2.5-pro-preview-05-06

Test         Status       pass@1  pass@10  Passing Samples  Errors
counter      ✅ PASS      1.0000  1.0000   10/10            0
derived      ✅ PASS      1.0000  1.0000   10/10            0
derived-by   ⚠️ PARTIAL   0.9000  1.0000   9/10             1
each         ✅ PASS      1.0000  1.0000   10/10            0
effect       ⚠️ PARTIAL   0.8000  1.0000   8/10             4
hello-world  ✅ PASS      1.0000  1.0000   10/10            0
inspect      ❌ FAIL      0.0000  0.0000   0/10             13
props        ⚠️ PARTIAL   0.9000  1.0000   9/10             1
snippets     ⚠️ PARTIAL   0.6000  1.0000   6/10             4
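
The pass@1 and pass@10 columns are consistent with the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of samples generated and c the number that passed. The page does not state how the values were computed, so the following is only a sketch under that assumption; the function name `pass_at_k` and the `rows` dictionary are hypothetical, and the sample counts are copied from the Passing Samples column.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k samples passes.

    n: total samples generated, c: samples that passed, k: samples drawn.
    Assumes the standard unbiased estimator; not confirmed by the page.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples failed,
    # so the estimate becomes 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Passing samples out of 10, taken from a few rows of the table.
rows = {"counter": 10, "derived-by": 9, "effect": 8, "inspect": 0, "snippets": 6}

for name, passed in rows.items():
    p1 = pass_at_k(10, passed, 1)
    p10 = pass_at_k(10, passed, 10)
    print(f"{name:12s} pass@1={p1:.4f} pass@10={p10:.4f}")
```

With n = 10, pass@1 reduces to the fraction of passing samples, and pass@10 is 1.0 whenever at least one sample passed, which matches every row shown, including the 0.0000 values for inspect.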