These benchmark results were produced with an error in the "inspect" test prompt, which may affect the accuracy of the reported scores. The prompt used incorrect quotation marks in its Svelte binding syntax examples, which could confuse language models and lead to inconsistent results.
The updated results with corrected prompts are available separately.

| Rank | Model | Score |
|---|---|---|
| 1 | claude-opus-4-1-20250805 (Anthropic) | 88.9% |
| 2 | claude-opus-4-20250514 (Anthropic) | 88.9% |
| 3 | claude-sonnet-4-20250514 (Anthropic) | 88.9% |
| 4 | x-ai/grok-4 (OpenRouter) | 87.8% |
| 5 | @preset/kimi-k2-0905-moonshotai (OpenRouter) | 84.4% |

Note: Certain OpenAI thinking models (o3, o4) and gpt-5 do not support temperature adjustments; only the default value of 1 is supported. Models with a "-reasoning-" suffix (e.g., gpt-5-2025-08-07-reasoning-medium) use the reasoning effort named in the suffix.
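
As a minimal sketch of the naming rule in the note above, a benchmark runner could split a model identifier on the "-reasoning-" marker and leave out the temperature for models that only accept the default. The helper `resolve_model_settings`, the `FIXED_TEMPERATURE_PREFIXES` tuple, and the keys of the returned dictionary are hypothetical names chosen for illustration; they are not taken from the benchmark's own code, and the prefix list simply mirrors the models named in the note.

```python
# Hypothetical prefix list mirroring the models named in the note (o3, o4, gpt-5).
FIXED_TEMPERATURE_PREFIXES = ("o3", "o4", "gpt-5")
REASONING_MARKER = "-reasoning-"


def resolve_model_settings(model_id: str, temperature: float) -> dict:
    """Split a model identifier into request settings (illustrative only)."""
    # partition() returns (model_id, "", "") when the marker is absent.
    base, marker, effort = model_id.partition(REASONING_MARKER)
    settings: dict = {"model": base}
    if marker and effort:
        settings["reasoning_effort"] = effort
    # Assumption: models limited to the default temperature get no explicit value.
    if not base.startswith(FIXED_TEMPERATURE_PREFIXES):
        settings["temperature"] = temperature
    return settings


print(resolve_model_settings("gpt-5-2025-08-07-reasoning-medium", 0.7))
print(resolve_model_settings("claude-opus-4-1-20250805", 0.7))
```

Under these assumptions, the first call yields a base model name plus a `reasoning_effort` of `medium` and no temperature, while the second keeps the requested temperature.
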
Errata: The "inspect" test has known correctness issues but is retained in the benchmark suite to maintain consistency and fairness in scoring across all evaluated models.
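
The per-test tables report pass@1 and pass@10 alongside the number of passing samples. The page does not state how these columns are computed; the sketch below assumes the standard unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), for n samples of which c passed. The function name `pass_at_k` and the choice to clamp k to n (so that a run with a single sample still yields a value) are assumptions made for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n samples, c of which passed."""
    if n < 1 or k < 1 or not 0 <= c <= n:
        raise ValueError("require n >= 1, k >= 1 and 0 <= c <= n")
    # Assumption: clamp k to n so runs with fewer than k samples still yield a value.
    k = min(k, n)
    # comb(n - c, k) counts the size-k subsets that contain no passing sample;
    # math.comb returns 0 when k exceeds n - c.
    return 1.0 - comb(n - c, k) / comb(n, k)


# A row with 6 passing samples out of 10:
print(round(pass_at_k(10, 6, 1) * 100))   # 60
print(round(pass_at_k(10, 6, 10) * 100))  # 100
```

With 10 samples per test, this reduces to pass@1 = c/10, and pass@10 is 100% whenever at least one sample passes and 0% otherwise.
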
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 60% | 100% | 6/10 | 4 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 0% | 0% | 0/10 | 10 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 10% | 100% | 1/10 | 9 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 100% | 100% | 10/10 | 0 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 100% | 100% | 10/10 | 0 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 100% | 100% | 10/10 | 0 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 40% | 100% | 4/10 | 6 | |
| effect | 80% | 100% | 8/10 | 4 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 20% | 100% | 2/10 | 8 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 70% | 100% | 7/10 | 3 | |
| derived | 50% | 100% | 5/10 | 10 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 10% | 100% | 1/10 | 9 | |
| effect | 80% | 100% | 8/10 | 4 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 16 | |
| props | 30% | 100% | 3/10 | 7 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 60% | 100% | 6/10 | 4 | |
| each | 90% | 100% | 9/10 | 1 | |
| effect | 70% | 100% | 7/10 | 6 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 13 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 10% | 100% | 1/10 | 9 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 60% | 100% | 6/10 | 4 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 13 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 50% | 100% | 5/10 | 5 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 90% | 100% | 9/10 | 1 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 80% | 100% | 8/10 | 4 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 13 | |
| props | 90% | 100% | 9/10 | 1 | |
| snippets | 60% | 100% | 6/10 | 4 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 90% | 100% | 9/10 | 2 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 40% | 100% | 4/10 | 6 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 0% | 0% | 0/10 | 10 | |
| derived | 0% | 0% | 0/10 | 10 | |
| derived-by | 0% | 0% | 0/10 | 10 | |
| each | 0% | 0% | 0/10 | 10 | |
| effect | 0% | 0% | 0/10 | 10 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 70% | 100% | 7/10 | 3 | |
| derived | 0% | 0% | 0/10 | 10 | |
| derived-by | 10% | 100% | 1/10 | 9 | |
| each | 40% | 100% | 4/10 | 6 | |
| effect | 0% | 0% | 0/10 | 13 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 13 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 10% | 100% | 1/10 | 9 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 70% | 100% | 7/10 | 3 | |
| derived | 0% | 0% | 0/10 | 10 | |
| derived-by | 0% | 0% | 0/10 | 10 | |
| each | 20% | 100% | 2/10 | 9 | |
| effect | 10% | 100% | 1/10 | 10 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 19 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 0% | 0% | 0/10 | 13 | |
| derived | 0% | 0% | 0/10 | 13 | |
| derived-by | 0% | 0% | 0/10 | 12 | |
| each | 10% | 100% | 1/10 | 10 | |
| effect | 0% | 0% | 0/10 | 16 | |
| hello-world | 0% | 0% | 0/10 | 10 | |
| inspect | 0% | 0% | 0/10 | 28 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 0% | 0% | 0/10 | 14 | |
| derived | 0% | 0% | 0/10 | 13 | |
| derived-by | 0% | 0% | 0/10 | 10 | |
| each | 0% | 0% | 0/10 | 10 | |
| effect | 0% | 0% | 0/10 | 13 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 0% | 0% | 0/10 | 13 | |
| derived | 0% | 0% | 0/10 | 14 | |
| derived-by | 40% | 100% | 4/10 | 14 | |
| each | 0% | 0% | 0/10 | 11 | |
| effect | 20% | 100% | 2/10 | 14 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 13 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 14 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 90% | 100% | 9/10 | 1 | |
| derived-by | 70% | 100% | 7/10 | 5 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 70% | 100% | 7/10 | 3 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 25 | |
| props | 90% | 100% | 9/10 | 4 | |
| snippets | 90% | 100% | 9/10 | 3 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 70% | 100% | 7/10 | 3 | |
| derived-by | 60% | 100% | 6/10 | 6 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 70% | 100% | 7/10 | 3 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 16 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 100% | 100% | 10/10 | 0 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 40% | 100% | 4/10 | 6 | |
| derived | 0% | 0% | 0/10 | 11 | |
| derived-by | 0% | 0% | 0/10 | 12 | |
| each | 50% | 100% | 5/10 | 5 | |
| effect | 10% | 100% | 1/10 | 12 | |
| hello-world | 90% | 100% | 9/10 | 1 | |
| inspect | 0% | 0% | 0/10 | 13 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 50% | 100% | 5/10 | 8 | |
| derived | 0% | 0% | 0/10 | 13 | |
| derived-by | 0% | 0% | 0/10 | 10 | |
| each | 0% | 0% | 0/10 | 10 | |
| effect | 0% | 0% | 0/10 | 10 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 0% | 0% | 0/10 | 13 | |
| snippets | 0% | 0% | 0/10 | 24 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 0% | 0% | 0/1 | 1 | |
| derived | 0% | 0% | 0/1 | 1 | |
| derived-by | 0% | 0% | 0/1 | 3 | |
| each | 0% | 0% | 0/1 | 1 | |
| effect | 0% | 0% | 0/1 | 1 | |
| hello-world | 100% | 100% | 1/1 | 0 | |
| inspect | 0% | 0% | 0/1 | 1 | |
| props | 0% | 0% | 0/1 | 1 | |
| snippets | 0% | 0% | 0/1 | 1 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 60% | 100% | 6/10 | 4 | |
| derived | 10% | 100% | 1/10 | 11 | |
| derived-by | 20% | 100% | 2/10 | 8 | |
| each | 60% | 100% | 6/10 | 4 | |
| effect | 20% | 100% | 2/10 | 11 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 13 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 30% | 100% | 3/10 | 13 | |
| derived | 0% | 0% | 0/10 | 11 | |
| derived-by | 10% | 100% | 1/10 | 9 | |
| each | 10% | 100% | 1/10 | 9 | |
| effect | 0% | 0% | 0/10 | 17 | |
| hello-world | 90% | 100% | 9/10 | 1 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 10% | 100% | 1/10 | 12 | |
| derived | 0% | 0% | 0/10 | 12 | |
| derived-by | 0% | 0% | 0/10 | 10 | |
| each | 20% | 100% | 2/10 | 8 | |
| effect | 0% | 0% | 0/10 | 11 | |
| hello-world | 90% | 100% | 9/10 | 1 | |
| inspect | 0% | 0% | 0/10 | 13 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 90% | 100% | 9/10 | 1 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 90% | 100% | 9/10 | 1 | |
| inspect | 0% | 0% | 0/10 | 19 | |
| props | 90% | 100% | 9/10 | 1 | |
| snippets | 90% | 100% | 9/10 | 2 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 0% | 0% | 0/10 | 10 | |
| derived | 0% | 0% | 0/10 | 11 | |
| derived-by | 0% | 0% | 0/10 | 10 | |
| each | 0% | 0% | 0/10 | 10 | |
| effect | 0% | 0% | 0/10 | 10 | |
| hello-world | 80% | 100% | 8/10 | 2 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 0% | 0% | 0/10 | 10 | |
| derived | 0% | 0% | 0/10 | 10 | |
| derived-by | 0% | 0% | 0/10 | 10 | |
| each | 0% | 0% | 0/10 | 10 | |
| effect | 0% | 0% | 0/10 | 10 | |
| hello-world | 80% | 100% | 8/10 | 2 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 0% | 0% | 0/10 | 13 | |
| derived | 0% | 0% | 0/10 | 11 | |
| derived-by | 0% | 0% | 0/10 | 10 | |
| each | 0% | 0% | 0/10 | 10 | |
| effect | 0% | 0% | 0/10 | 11 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 90% | 100% | 9/10 | 1 | |
| derived-by | 90% | 100% | 9/10 | 1 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 80% | 100% | 8/10 | 3 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 31 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 0% | 0% | 0/10 | 16 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 60% | 100% | 6/10 | 8 | |
| effect | 90% | 100% | 9/10 | 2 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 16 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 90% | 100% | 9/10 | 1 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 30% | 100% | 3/10 | 22 | |
| derived | 60% | 100% | 6/10 | 6 | |
| derived-by | 90% | 100% | 9/10 | 1 | |
| each | 20% | 100% | 2/10 | 10 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 40% | 100% | 4/10 | 6 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 0% | 0% | 0/10 | 19 | |
| derived | 0% | 0% | 0/10 | 10 | |
| derived-by | 0% | 0% | 0/10 | 10 | |
| each | 0% | 0% | 0/10 | 10 | |
| effect | 0% | 0% | 0/10 | 10 | |
| hello-world | 80% | 100% | 8/10 | 2 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 0% | 0% | 0/10 | 10 | |
| hello-world | 80% | 100% | 8/10 | 2 | |
| inspect | 0% | 0% | 0/10 | 13 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 90% | 100% | 9/10 | 1 | |
| derived | 20% | 100% | 2/10 | 8 | |
| derived-by | 0% | 0% | 0/10 | 11 | |
| each | 0% | 0% | 0/10 | 11 | |
| effect | 0% | 0% | 0/10 | 10 | |
| hello-world | 30% | 100% | 3/10 | 11 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 60% | 100% | 6/10 | 4 | |
| each | 70% | 100% | 7/10 | 4 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 90% | 100% | 9/10 | 1 | |
| inspect | 0% | 0% | 0/10 | 22 | |
| props | 10% | 100% | 1/10 | 9 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 80% | 100% | 8/10 | 4 | |
| derived-by | 90% | 100% | 9/10 | 1 | |
| each | 10% | 100% | 1/10 | 9 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 90% | 100% | 9/10 | 1 | |
| inspect | 0% | 0% | 0/10 | 19 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 70% | 100% | 7/10 | 6 | |
| derived | 10% | 100% | 1/10 | 13 | |
| derived-by | 10% | 100% | 1/10 | 13 | |
| each | 0% | 0% | 0/10 | 10 | |
| effect | 0% | 0% | 0/10 | 10 | |
| hello-world | 30% | 100% | 3/10 | 7 | |
| inspect | 0% | 0% | 0/10 | 13 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 80% | 100% | 8/10 | 2 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 90% | 100% | 9/10 | 1 | |
| inspect | 0% | 0% | 0/10 | 16 | |
| props | 80% | 100% | 8/10 | 2 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 50% | 100% | 5/10 | 5 | |
| each | 20% | 100% | 2/10 | 8 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 0% | 0% | 0/10 | 10 | |
| inspect | 0% | 0% | 0/10 | 13 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 0% | 0% | 0/10 | 10 | |
| derived | 0% | 0% | 0/10 | 10 | |
| derived-by | 0% | 0% | 0/10 | 10 | |
| each | 0% | 0% | 0/10 | 10 | |
| effect | 0% | 0% | 0/10 | 10 | |
| hello-world | 30% | 100% | 3/10 | 8 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 90% | 100% | 9/10 | 1 | |
| inspect | 0% | 0% | 0/10 | 22 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 70% | 100% | 7/10 | 4 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 70% | 100% | 7/10 | 3 | |
| derived | 0% | 0% | 0/10 | 10 | |
| derived-by | 0% | 0% | 0/10 | 10 | |
| each | 0% | 0% | 0/10 | 10 | |
| effect | 0% | 0% | 0/10 | 13 | |
| hello-world | 20% | 100% | 2/10 | 15 | |
| inspect | 0% | 0% | 0/10 | 22 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 10% | 100% | 1/10 | 9 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 10% | 100% | 1/10 | 9 | |
| derived | 0% | 0% | 0/10 | 10 | |
| derived-by | 0% | 0% | 0/10 | 12 | |
| each | 0% | 0% | 0/10 | 11 | |
| effect | 0% | 0% | 0/10 | 13 | |
| hello-world | 50% | 100% | 5/10 | 6 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 80% | 100% | 8/10 | 2 | |
| derived | 0% | 0% | 0/10 | 14 | |
| derived-by | 30% | 100% | 3/10 | 7 | |
| each | 50% | 100% | 5/10 | 6 | |
| effect | 60% | 100% | 6/10 | 4 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 19 | |
| props | 0% | 0% | 0/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 14 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 20% | 100% | 2/10 | 8 | |
| derived | 0% | 0% | 0/10 | 15 | |
| derived-by | 0% | 0% | 0/10 | 14 | |
| each | 60% | 100% | 6/10 | 5 | |
| effect | 0% | 0% | 0/10 | 16 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 13 | |
| props | 0% | 0% | 0/10 | 16 | |
| snippets | 0% | 0% | 0/10 | 12 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 50% | 100% | 5/10 | 5 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 19 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 90% | 100% | 9/10 | 1 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 90% | 100% | 9/10 | 2 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 80% | 100% | 8/10 | 4 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 40 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 90% | 100% | 9/10 | 1 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 90% | 100% | 9/10 | 1 | |
| derived-by | 80% | 100% | 8/10 | 2 | |
| each | 90% | 100% | 9/10 | 1 | |
| effect | 50% | 100% | 5/10 | 5 | |
| hello-world | 90% | 100% | 9/10 | 1 | |
| inspect | 0% | 0% | 0/10 | 31 | |
| props | 20% | 100% | 2/10 | 20 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 80% | 100% | 8/10 | 2 | |
| derived | 80% | 100% | 8/10 | 3 | |
| derived-by | 60% | 100% | 6/10 | 4 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 80% | 100% | 8/10 | 3 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 16 | |
| props | 20% | 100% | 2/10 | 11 | |
| snippets | 0% | 0% | 0/10 | 14 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 60% | 100% | 6/10 | 7 | |
| derived | 30% | 100% | 3/10 | 8 | |
| derived-by | 30% | 100% | 3/10 | 9 | |
| each | 90% | 100% | 9/10 | 1 | |
| effect | 30% | 100% | 3/10 | 11 | |
| hello-world | 70% | 100% | 7/10 | 5 | |
| inspect | 0% | 0% | 0/10 | 13 | |
| props | 0% | 0% | 0/10 | 22 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 0% | 0% | 0/10 | 10 | |
| derived | 0% | 0% | 0/10 | 10 | |
| derived-by | 0% | 0% | 0/10 | 12 | |
| each | 0% | 0% | 0/10 | 10 | |
| effect | 0% | 0% | 0/10 | 10 | |
| hello-world | 90% | 100% | 9/10 | 1 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 0% | 0% | 0/10 | 16 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 30% | 100% | 3/10 | 21 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 90% | 100% | 9/10 | 2 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 37 | |
| props | 90% | 100% | 9/10 | 1 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 80% | 100% | 8/10 | 2 | |
| inspect | 0% | 0% | 0/10 | 34 | |
| props | 90% | 100% | 9/10 | 1 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 50% | 100% | 5/10 | 5 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 60% | 100% | 6/10 | 7 | |
| each | 90% | 100% | 9/10 | 1 | |
| effect | 70% | 100% | 7/10 | 6 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 40 | |
| props | 80% | 100% | 8/10 | 2 | |
| snippets | 10% | 100% | 1/10 | 11 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 90% | 100% | 9/10 | 3 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 100% | 100% | 10/10 | 0 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 70% | 100% | 7/10 | 5 | |
| derived-by | 90% | 100% | 9/10 | 1 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 16 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 60% | 100% | 6/10 | 4 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 90% | 100% | 9/10 | 1 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 80% | 100% | 8/10 | 4 | |
| each | 0% | 0% | 0/10 | 10 | |
| effect | 20% | 100% | 2/10 | 12 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 25 | |
| props | 30% | 100% | 3/10 | 7 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 80% | 100% | 8/10 | 2 | |
| derived | 40% | 100% | 4/10 | 9 | |
| derived-by | 80% | 100% | 8/10 | 2 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 80% | 100% | 8/10 | 4 | |
| hello-world | 90% | 100% | 9/10 | 1 | |
| inspect | 0% | 0% | 0/10 | 25 | |
| props | 80% | 100% | 8/10 | 5 | |
| snippets | 50% | 100% | 5/10 | 5 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 90% | 100% | 9/10 | 3 | |
| derived | 70% | 100% | 7/10 | 4 | |
| derived-by | 90% | 100% | 9/10 | 3 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 40% | 100% | 4/10 | 7 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 13 | |
| props | 30% | 100% | 3/10 | 7 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 100% | 100% | 10/10 | 0 | |
| derived-by | 60% | 100% | 6/10 | 12 | |
| each | 0% | 0% | 0/10 | 10 | |
| effect | 20% | 100% | 2/10 | 12 | |
| hello-world | 90% | 100% | 9/10 | 1 | |
| inspect | 0% | 0% | 0/10 | 25 | |
| props | 30% | 100% | 3/10 | 10 | |
| snippets | 0% | 0% | 0/10 | 10 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 70% | 100% | 7/10 | 6 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 90% | 100% | 9/10 | 2 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 50% | 100% | 5/10 | 5 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 60% | 100% | 6/10 | 5 | |
| derived-by | 80% | 100% | 8/10 | 2 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 60% | 100% | 6/10 | 4 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 20% | 100% | 2/10 | 8 | |
| snippets | 10% | 100% | 1/10 | 9 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 40% | 100% | 4/10 | 8 | |
| derived-by | 90% | 100% | 9/10 | 1 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 40% | 100% | 4/10 | 7 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 10 | |
| props | 10% | 100% | 1/10 | 9 | |
| snippets | 20% | 100% | 2/10 | 8 |
| Test | pass@1 | pass@10 | Passing Samples | Errors | Actions |
|---|---|---|---|---|---|
| counter | 100% | 100% | 10/10 | 0 | |
| derived | 50% | 100% | 5/10 | 10 | |
| derived-by | 100% | 100% | 10/10 | 0 | |
| each | 100% | 100% | 10/10 | 0 | |
| effect | 100% | 100% | 10/10 | 0 | |
| hello-world | 100% | 100% | 10/10 | 0 | |
| inspect | 0% | 0% | 0/10 | 16 | |
| props | 100% | 100% | 10/10 | 0 | |
| snippets | 70% | 100% | 7/10 | 3 |