TokenCenter
Xiaomi (Official)

Xiaomi: MiMo-V2-Omni

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step...

Pricing

Input: $0.40/1M tokens
Output: $2.00/1M tokens
Cache Read: $0.08/1M tokens
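
As a rough illustration of how these per-token rates turn into a request cost, the sketch below multiplies token counts by the listed prices. The function name `estimate_cost` and its parameters are hypothetical, and it assumes that cached input tokens are billed at the Cache Read rate instead of the Input rate; the listing does not state the exact billing rule, so treat the result as an estimate only.

```python
# Prices in USD per 1M tokens, copied from the pricing listing.
INPUT_PER_M = 0.40
OUTPUT_PER_M = 2.00
CACHE_READ_PER_M = 0.08


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: `cached_tokens` is the part of `input_tokens` served from
    cache, and it is billed at the cache-read rate instead of the input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 100K input tokens (half of them cached) and 10K output tokens.
    print(f"${estimate_cost(100_000, 10_000, cached_tokens=50_000):.4f}")
```

Output tokens cost five times as much as uncached input tokens at these rates, so long generations tend to dominate the bill more than long prompts do.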

Capabilities

Context Window: 262K tokens
Max Output: 66K tokens
Speed: Medium
Release: 2026-03
Vision
Tool Use
API Access
Local / Open
