Model Release
2026-06-02 mimo-v2.5-asr Released
Model Introduction:
- Bilingual & Dialects: Supports Chinese, English, code-switching, and various regional dialects (Wu, Cantonese, Minnan, Sichuanese).
- Lyrics Transcription: High-accuracy Chinese/English lyrics transcription in mixed vocal-instrumental tracks.
- Robust in Complex Audio: Excels in challenging environments (high noise, far-field, multi-speaker).
- Knowledge-Intensive AI: Pinpoint accuracy for classical poetry, jargon, and proper nouns, with auto-punctuation.
2026-04-23 mimo-v2.5-pro Released
Model Introduction:
- Trillion parameters, efficient architecture: 1T total parameters | 42B active parameters | 1M-token ultra-long context
- Ultimate Agent Performance: In high-intensity agent scenarios, it performs comparably to Claude Opus 4.6.
2026-04-23 mimo-v2.5 Released
Model Introduction:
- Native full-modal perception + 1M context: Natively understands images, video, audio, and text, enabling precise cross-modal perception and long-range reasoning, with overall perception capabilities among the best in the industry.
- Powerful full-modal Agent capabilities: Has native Agent execution capabilities and efficiently completes complex tasks such as browsing, understanding, reasoning, and operation; its performance on daily tasks is comparable to that of mimo-v2.5-pro.
- Combining Performance and Efficiency: Maintains leading capabilities while achieving superior token efficiency, placing it on the Pareto frontier of performance and efficiency.
2026-04-23 MiMo-V2.5-TTS Series Release
Model Introduction:
- Premium Voice TTS: Ships with multiple high-quality built-in voices, understands and follows style instructions closely, and supports fine-grained control over speech rate, emotion, tone, and more, covering the expressive needs of many scenarios.
- Timbre Design: Defines and generates a new timbre from a single sentence, making timbre creation more intuitive and efficient.
- Timbre Cloning: Reproduces the target timbre with high fidelity from a small number of audio samples, keeping timbre characteristics consistent while generalizing well and remaining stable.
2026-03-18 mimo-v2-pro Release
Model overview:
- Uses a hybrid architecture with a 1:7 ratio of Global Attention to Sliding Window Attention (SWA);
- 1T total parameters, with 42B active parameters;
- Supports an ultra-long context window of 1M tokens.
Model details: https://platform.xiaomimimo.com/#/docs/news/v2-pro-release
2026-03-18 mimo-v2-omni Release
Model overview:
- Supports up to 256K context length;
- Supports text, vision, and speech modalities.
Model details: https://platform.xiaomimimo.com/#/docs/news/v2-omni-release
2026-03-18 mimo-v2-tts Release
Model overview:
- Pretrained on over 100 million hours of data, using a self-developed multi-codebook speech modeling architecture;
- Offers unique capabilities such as style control, singing, and voice cloning.
Pricing: free for a limited time.
Model details: https://platform.xiaomimimo.com/#/docs/news/v2-tts-release
2026-02-04 mimo-v2-flash Update
- Upgraded Coding Capabilities in Thinking Mode: Optimized specifically for programming scenarios, Thinking Mode now scores 78.6 on SWE-Bench Verified. Both the resolution rate and the quality of generated code have improved significantly.
- Substantial Boost in Tool Calling Accuracy: Stability issues in tool usage have been resolved. Tool calling accuracy in Thinking Mode has risen from 64% to 97.0%, greatly improving execution reliability in Agent scenarios.
- Enhanced Instruction Following & Reduced Hallucinations:
  - Instruction Following: Improved adherence to specific instructions, achieving an AA-IFBench score of 72.
  - Factuality: More rigorous factual responses, with the Non-Hallucination Rate rising to 52%.
- Optimized Handling of Complex Tasks: The Arena-Hard (Hard Prompt) score in Thinking Mode has risen to 60.6, and the model performs better on high-difficulty logic problems.
- More Efficient Chain-of-Thought (CoT): Optimized CoT generation strategies significantly reduce redundant tokens. On benchmarks such as AIME25 and HMMT, the average generation length has decreased by roughly 13% to 30%, which lowers latency and token costs while keeping model performance broadly comparable.
| Benchmark | mimo-v2-flash-0204 | mimo-v2-flash-0112 | mimo-v2-flash |
|---|---|---|---|
| SWE-Bench Verified (Non-Thinking) | 73.7 | 73.3 | 73.4 |
| SWE-Bench Verified (Thinking) | 78.6 | 74.2 | - |
| Arena-Hard (Hard Prompt, Non-Thinking) | 49.3 | 52.7 | 46.0 |
| Arena-Hard (Creative Writing, Non-Thinking) | 85.0 | 86.0 | 78.3 |
| Arena-Hard (Hard Prompt, Thinking) | 60.6 | 58.3 | 54.1 |
| Arena-Hard (Creative Writing, Thinking) | 85.8 | 90.4 | 86.2 |
| AA-IFBench | 72 | - | 64 |
| AA-Omniscience Accuracy | 19 | - | 27 |
| AA-Omniscience Non-Hallucination Rate | 52% | - | 9% |
| Tool call success rate (Thinking) | 97.0% | 64% | 44% |
| Benchmark | mimo-v2-flash (Acc) | mimo-v2-flash (Avg Tokens) | mimo-v2-flash-0204 (Acc) | mimo-v2-flash-0204 (Avg Tokens) | Length Reduction Ratio (%) |
|---|---|---|---|---|---|
| AIME25 | 94.8 | 26984 | 91.1 | 18879 | 30.04% |
| HMMT_Feb_25 | 94.2 | 29294 | 92.9 | 21470 | 26.71% |
| LiveCodeBench-AA | 83.2 | 21488 | 84.9 | 18335 | 14.67% |
| GPQA-Diamond | 83.7 | 15862 | 83.8 | 13659 | 13.89% |
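The Length Reduction Ratio column can be reproduced from the two average-token columns. The sketch below assumes the ratio is defined as the relative drop from the mimo-v2-flash average to the mimo-v2-flash-0204 average; the helper name `length_reduction` is illustrative and not part of any MiMo tooling.

```python
def length_reduction(baseline_tokens: int, updated_tokens: int) -> float:
    """Percentage drop in average generation length relative to the baseline."""
    return (baseline_tokens - updated_tokens) / baseline_tokens * 100


# Average token counts copied from the table above:
# (mimo-v2-flash, mimo-v2-flash-0204)
avg_tokens = {
    "AIME25": (26984, 18879),
    "HMMT_Feb_25": (29294, 21470),
    "LiveCodeBench-AA": (21488, 18335),
    "GPQA-Diamond": (15862, 13659),
}

reductions = {
    name: round(length_reduction(old, new), 2)
    for name, (old, new) in avg_tokens.items()
}

for name, pct in reductions.items():
    print(f"{name}: {pct:.2f}%")
```

Under that assumed definition, the computed values match the table's last column to two decimal places.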
Note: The API call method and model name remain unchanged.
2026-01-12 mimo-v2-flash Update
- Enhanced general capabilities: Improved the model’s performance on a wide range of general-purpose tasks.
- Upgraded coding performance in Thinking mode: Strengthened code generation quality in Thinking mode, especially for programming scenarios.
- Deep integration with Claude Code: Fully supports using Thinking mode in Claude Code.
  - Best practice: Set Thinking as the default mode to achieve more stable, higher-quality code generation.
- Optimized Experience for Other Code Agents: Brought the same improvements in interaction experience and generation quality to code assistant tools (Code Scaffolds) such as Kilo, Cline, and Roo.
- Improved Stability & Instruction Following: Enhanced output stability and significantly improved adherence to specific output formats.
| Benchmark | mimo-v2-flash-0112 | mimo-v2-flash |
|---|---|---|
| SWE-Bench Verified (Non-Thinking) | 73.3 | 73.4 |
| SWE-Bench Verified (Thinking) | 74.2 | - |
| Arena-Hard (Hard Prompt, Non-Thinking) | 52.7 | 46.0 |
| Arena-Hard (Creative Writing, Non-Thinking) | 86.0 | 78.3 |
| Arena-Hard (Hard Prompt, Thinking) | 58.3 | 54.1 |
| Arena-Hard (Creative Writing, Thinking) | 90.4 | 86.2 |
Note: The API call method and model name remain unchanged.
2025-12-16 mimo-v2-flash Release
Model overview:
- Uses a hybrid architecture with a 1:5 ratio of Global Attention to Sliding Window Attention (SWA), a window size of 128, native 32K context, and extended training up to 256K;
- Introduces 3 MTP layers, delivering 2.5 to 3.7× faster inference.
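To make the 1:5 Global-to-SWA ratio concrete, here is a minimal sketch of a layer schedule and a causal sliding-window mask. The layer ordering (one global layer after every five SWA layers), the 12-layer depth, and the convention that the window includes the current token are assumptions for illustration only; the release note does not specify them. A window of 3 is used instead of 128 so the mask stays readable.

```python
def layer_schedule(num_layers: int, swa_per_global: int = 5) -> list[str]:
    """Label each layer as 'swa' or 'global'.

    Assumed ordering: every block of `swa_per_global` SWA layers is
    followed by one global-attention layer, giving a 1:5 ratio by default.
    """
    period = swa_per_global + 1
    return ["global" if (i + 1) % period == 0 else "swa" for i in range(num_layers)]


def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Causal mask: query i may attend to key j only if i - window < j <= i."""
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Hypothetical 12-layer stack; the real layer count is not given in the source.
schedule = layer_schedule(12)
print(schedule)

# Small window for readability; the release note states a window size of 128.
mask = sliding_window_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

In an SWA layer each token looks back over a fixed number of positions, so attention cost grows linearly with sequence length, while the sparser global layers retain access to the full context.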
Pricing: input 0.1/M tokens, output 0.3/M tokens.
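As a rough illustration of how the listed rates translate into a bill, the sketch below estimates the cost of a request from its token counts. The currency is not stated in the source, so the result is in whatever unit the listed prices use; the function name and the sample workload are hypothetical.

```python
# Listed prices per million tokens (currency unit not stated in the source).
INPUT_PRICE_PER_M = 0.1
OUTPUT_PRICE_PER_M = 0.3


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost for one workload, in the same unit as the listed prices."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Hypothetical workload: 2M input tokens and 1M output tokens.
cost = estimate_cost(input_tokens=2_000_000, output_tokens=1_000_000)
print(f"Estimated cost: {cost:.2f}")
```

Output tokens are priced at three times the input rate, so long generations dominate the total for reasoning-heavy workloads.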
Model details: mimo-v2-flash: High-Efficiency Inference, Code & Agent Foundation Model
Usage guide: First API call