SvelteBench Visualization

Anthropic

claude-3-5-haiku-20241022

Test          Status    Tests Passed   Errors
counter       ✅ PASS   4/4            0
derived       ✅ PASS   2/2            0
derived-by    ✅ PASS   3/3            0
each          ❌ FAIL   1/2            1
effect        ✅ PASS   2/2            0
hello-world   ✅ PASS   2/2            0
inspect       ❌ FAIL   0/4            4
snippets      ❌ FAIL   0/0            1

claude-3-7-sonnet-20250219

Test          Status    Tests Passed   Errors
counter       ✅ PASS   4/4            0
derived       ✅ PASS   2/2            0
derived-by    ✅ PASS   3/3            0
each          ❌ FAIL   0/2            2
effect        ✅ PASS   2/2            0
hello-world   ✅ PASS   2/2            0
inspect       ❌ FAIL   0/4            4
snippets      ❌ FAIL   0/0            1

OpenAI

gpt-4o

Test          Status    Tests Passed   Errors
counter       ❌ FAIL   0/0            1
derived       ❌ FAIL   0/0            1
derived-by    ❌ FAIL   0/0            1
each          ❌ FAIL   1/2            1
effect        ❌ FAIL   0/0            1
hello-world   ✅ PASS   2/2            0
inspect       ❌ FAIL   0/0            1
snippets      ❌ FAIL   0/0            1

o3-mini

Test          Status    Tests Passed   Errors
counter       ✅ PASS   4/4            0
derived       ❌ FAIL   0/0            1
derived-by    ❌ FAIL   0/0            1
each          ❌ FAIL   1/2            1
effect        ❌ FAIL   0/0            1
hello-world   ✅ PASS   2/2            0
inspect       ❌ FAIL   0/4            4
snippets      ❌ FAIL   0/0            1
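
To compare the four models at a glance, the per-test statuses above can be tallied into a pass count per model. The following is a minimal sketch and not part of SvelteBench itself: the `results` dictionary is transcribed by hand from the tables on this page, and the names `results` and `pass_summary` are illustrative assumptions.

```python
# Per-test status transcribed from the tables above (True = PASS, False = FAIL).
# The dictionary layout and all names here are illustrative, not SvelteBench's own format.
TESTS = ["counter", "derived", "derived-by", "each",
         "effect", "hello-world", "inspect", "snippets"]

results = {
    "claude-3-5-haiku-20241022": [True, True, True, False, True, True, False, False],
    "claude-3-7-sonnet-20250219": [True, True, True, False, True, True, False, False],
    "gpt-4o": [False, False, False, False, False, True, False, False],
    "o3-mini": [True, False, False, False, False, True, False, False],
}


def pass_summary(results):
    """Return {model: (tests_passed, tests_total)} for each model."""
    return {model: (sum(statuses), len(statuses))
            for model, statuses in results.items()}


summary = pass_summary(results)
for model, (passed, total) in summary.items():
    print(f"{model}: {passed}/{total} tests passed")
# Prints:
# claude-3-5-haiku-20241022: 5/8 tests passed
# claude-3-7-sonnet-20250219: 5/8 tests passed
# gpt-4o: 1/8 tests passed
# o3-mini: 2/8 tests passed
```

This counts a test as passed only when its Status column reads PASS, so partial results such as 1/2 on `each` count as failures.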