Xiaomi (Official)
Xiaomi: MiMo-V2-Omni
MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step...
Pricing
Input
$0.40/1M tokens
Output
$2.00/1M tokens
Cache Read
$0.08/1M tokens
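As a rough illustration of how the per-million-token rates above combine, the sketch below estimates the cost of a single request. The function name `estimate_cost` and its parameters are hypothetical, and it assumes that cached tokens are billed at the Cache Read rate in place of the Input rate, which the listing does not state explicitly.

```python
# Prices in USD per 1M tokens, copied from the Pricing listing.
PRICE_PER_MILLION = {
    "input": 0.40,
    "output": 2.00,
    "cache_read": 0.08,
}


def estimate_cost(input_tokens: int, output_tokens: int, cache_read_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: `input_tokens` counts only uncached input tokens, and
    `cache_read_tokens` are billed at the cache-read rate instead.
    """
    usage = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read_tokens,
    }
    # Each rate is quoted per 1,000,000 tokens, so scale the total down.
    total = sum(usage[kind] * PRICE_PER_MILLION[kind] for kind in usage)
    return total / 1_000_000


if __name__ == "__main__":
    # 100K uncached input tokens, 10K output tokens, 50K cached tokens.
    print(f"${estimate_cost(100_000, 10_000, 50_000):.4f}")
```

Under these assumptions, output tokens dominate the bill for generation-heavy workloads, since the Output rate is five times the Input rate.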
Capabilities
Context Window: 262K tokens
Max Output: 66K tokens
Speed: Medium
Release: 2026-03
Vision
Tool Use
API Access
Local / Open