
An open-weight multimodal reasoning model from Microsoft Research that combines Phi-4's reasoning with SigLIP-2 vision encoding. It is aimed at visual reasoning tasks and scores 17% higher than Gemma 3 12B on MathVista.
Released: March 4, 2026
Parameters: 15B
Context: 16K
Pricing: Free

| Benchmark | Category | Score | Performance |
|---|---|---|---|
| MMLU | knowledge | 79.8% | 80 |
| MATH | reasoning | 82.1% | 82 |

Last updated: March 15, 2026
Benchmark scores may vary based on evaluation methodology and conditions.