SvelteBench Visualization

Note: OpenAI thinking models (o3, o4) do not support temperature adjustment. o1-pro models use the "medium" reasoning effort setting.

OpenRouter

meta-llama/llama-4-maverick

| Test | Status | pass@1 | pass@10 | Passing Samples | Errors |
|---|---|---|---|---|---|
| counter | ✅ PASS | 1.0000 | 1.0000 | 10/10 | 0 |
| derived | ✅ PASS | 1.0000 | 1.0000 | 10/10 | 0 |
| derived-by | ✅ PASS | 1.0000 | 1.0000 | 10/10 | 0 |
| each | ✅ PASS | 1.0000 | 1.0000 | 10/10 | 0 |
| effect | ❌ FAIL | 0.0000 | 0.0000 | 0/10 | 10 |
| hello-world | ⚠️ PARTIAL | 0.8000 | 1.0000 | 8/10 | 2 |
| inspect | ❌ FAIL | 0.0000 | 0.0000 | 0/10 | 13 |
| props | ✅ PASS | 1.0000 | 1.0000 | 10/10 | 0 |
| snippets | ❌ FAIL | 0.0000 | 0.0000 | 0/10 | 10 |
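
The pass@1 and pass@10 columns above are consistent with the commonly used unbiased pass@k estimator, 1 - C(n - c, k) / C(n, k), where n is the number of samples and c the number that passed. The page does not say which formula SvelteBench actually uses, so the following is only a minimal sketch under that assumption; the function name `pass_at_k` and the `samples` dictionary are hypothetical and exist only for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n samples of which c passed.

    Assumes the standard unbiased estimator 1 - C(n - c, k) / C(n, k);
    the benchmark's own implementation may differ.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples failed,
    # in which case every draw of k samples contains a passing one.
    return 1.0 - comb(n - c, k) / comb(n, k)


# (passed, total) pairs copied from the "Passing Samples" column.
samples = {
    "counter": (10, 10),
    "effect": (0, 10),
    "hello-world": (8, 10),
}

for name, (passed, total) in samples.items():
    p1 = pass_at_k(total, passed, 1)
    p10 = pass_at_k(total, passed, 10)
    print(f"{name}: pass@1={p1:.4f} pass@10={p10:.4f}")
```

Under this assumption, the hello-world row works out to 8/10 = 0.8 for pass@1, and to 1.0 for pass@10 because any draw of all ten samples includes at least one that passed.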