Models and Rate Limits
This page lists all the models currently supported by the Xiaomi MiMo API Open Platform, including model capabilities, length limits, and rate-limiting quotas, to help you select the appropriate model based on your usage scenario.
Rate Limiting Instructions
The platform sets a per-model concurrency limit for each account. When server load is high, responses may be delayed or requests may fail with a 429 error. We recommend planning your request frequency and, in high-concurrency scenarios, implementing request retry with backoff to avoid triggering rate limits.
- RPM (Requests Per Minute): The maximum number of requests that can be sent per minute. The limit applies to the combined requests from all API Keys under a single account that call the same model.
- TPM (Tokens Per Minute): The maximum number of tokens that can be requested per minute. The limit applies to the combined tokens requested by all API Keys under a single account that call the same model.
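The retry-and-backoff recommendation above could be sketched as follows. This is a minimal illustration, not the platform's official client: `call_with_backoff`, `backoff_delay`, and the `send` callable (which stands in for whatever function issues your HTTP request and returns a status code and body) are hypothetical names, and the delay values are assumptions rather than documented settings.

```python
import random
import time
from typing import Callable, Tuple


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Upper bound (seconds) of the wait before retry `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],  # hypothetical: returns (status, body)
    max_retries: int = 5,                 # assumed default, tune as needed
    base: float = 1.0,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send`, retrying on HTTP 429 with exponential backoff and jitter."""
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    for attempt in range(max_retries + 1):
        status, body = send()
        # Return on any non-429 response, or when retries are exhausted.
        if status != 429 or attempt == max_retries:
            return status, body
        # Full jitter: wait a random time between 0 and the exponential bound,
        # so that many clients do not retry at the same instant.
        sleep(random.uniform(0.0, backoff_delay(attempt, base, cap)))
    raise AssertionError("unreachable")


if __name__ == "__main__":
    # Simulated server: two 429 responses, then success. No real request is made.
    responses = iter([(429, "busy"), (429, "busy"), (200, "ok")])
    print(call_with_backoff(lambda: next(responses), sleep=lambda s: None))
```

Injecting `sleep` keeps the function easy to test without real waiting. If retries are exhausted, the last 429 response is returned so the caller can decide how to proceed.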
Text Generation Models
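| Model Series | Model ID | Capability Support | Length Limit (tokens) | Rate Limiting |
|---|---|---|---|---|
| Pro Series | mimo-v2.5-pro | Text Generation, Deep Thinking, Streaming Output, Function Call, Structured Output, Web Search | Context Window: 1M; Maximum Output: 128K | Maximum RPM: 100; Maximum TPM: 10M |
| Pro Series | mimo-v2-pro | Text Generation, Deep Thinking, Streaming Output, Function Call, Structured Output, Web Search | Context Window: 1M; Maximum Output: 128K | Maximum RPM: 100; Maximum TPM: 10M |
| Omni Series | mimo-v2.5 | Text Generation, Full-modal Understanding, Deep Thinking, Streaming Output, Function Call, Structured Output, Web Search | Context Window: 1M; Maximum Output: 128K | Maximum RPM: 100; Maximum TPM: 10M |
| Omni Series | mimo-v2-omni | Text Generation, Full-modal Understanding, Deep Thinking, Streaming Output, Function Call, Structured Output, Web Search | Context Window: 256K; Maximum Output: 128K | Maximum RPM: 100; Maximum TPM: 10M |
| Flash Series | mimo-v2-flash | Text Generation, Deep Thinking, Streaming Output, Function Call, Structured Output, Web Search | Context Window: 256K; Maximum Output: 64K | Maximum RPM: 100; Maximum TPM: 10M |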
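To stay under the RPM and TPM quotas listed in the table above, a client can track its own recent usage before sending each request. The sketch below is one possible approach, not something the platform provides: `SlidingWindowLimiter` is a hypothetical class, the 60-second sliding window is an assumption about how the server counts, `10M` is assumed to mean 10,000,000 tokens, and the token count per request is an estimate supplied by the caller.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class SlidingWindowLimiter:
    """Tracks requests and tokens sent during the last `window` seconds."""

    def __init__(
        self,
        max_rpm: int = 100,           # from the table: Maximum RPM
        max_tpm: int = 10_000_000,    # from the table: Maximum TPM (10M assumed = 10^7)
        window: float = 60.0,         # assumption: limits are counted per 60 s
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_rpm = max_rpm
        self.max_tpm = max_tpm
        self.window = window
        self.clock = clock
        self.events: Deque[Tuple[float, int]] = deque()  # (timestamp, tokens)
        self.tokens_in_window = 0

    def _evict(self, now: float) -> None:
        # Drop requests that are older than the window.
        while self.events and now - self.events[0][0] >= self.window:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens: int) -> bool:
        """Record the request and return True if it fits both budgets."""
        now = self.clock()
        self._evict(now)
        if len(self.events) + 1 > self.max_rpm:
            return False
        if self.tokens_in_window + tokens > self.max_tpm:
            return False
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True
```

Because the quotas are summed over all API Keys under one account for the same model, a limiter like this only helps if every process that calls that model shares one instance (or an equivalent shared counter). When `try_acquire` returns `False`, the caller would wait and try again rather than send the request.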
Text-to-Speech (TTS) Models
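| Model ID | Capability Support | Length Limit (tokens) | Rate Limiting |
|---|---|---|---|
| mimo-v2.5-tts | Speech Synthesis | Context Window: 8K; Maximum Output: 8K | Maximum RPM: 100; Maximum TPM: 10M |
| mimo-v2.5-tts-voiceclone | Speech Synthesis, Timbre Cloning | Context Window: 8K; Maximum Output: 8K | Maximum RPM: 100; Maximum TPM: 10M |
| mimo-v2.5-tts-voicedesign | Speech Synthesis, Timbre Design | Context Window: 8K; Maximum Output: 8K | Maximum RPM: 100; Maximum TPM: 10M |
| mimo-v2-tts | Speech Synthesis | Context Window: 8K; Maximum Output: 8K | Maximum RPM: 100; Maximum TPM: 10M |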
Quick Selection Guide
| Scenario | Recommended Model |
|---|---|
| Complex reasoning, in-depth analysis, long document processing | mimo-v2.5-pro |
| Understanding of image, audio, and video content | mimo-v2.5 or mimo-v2-omni |
| High concurrency, low cost, and fast response | mimo-v2-flash |
| Text-to-Speech (Standard Preset Voice) | mimo-v2.5-tts |
| Voice Cloning (Upload Audio Sample) | mimo-v2.5-tts-voiceclone |
| Custom Timbre Design | mimo-v2.5-tts-voicedesign |
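If the model is chosen in code rather than by hand, the guide above can be encoded as a simple lookup. This is only an illustrative sketch: the scenario keys and the `recommend` function are hypothetical names introduced here, while the model IDs are the ones listed on this page.

```python
from typing import Dict, List

# Scenario keys are made up for this example; model IDs come from this page.
RECOMMENDED_MODELS: Dict[str, List[str]] = {
    "complex_reasoning": ["mimo-v2.5-pro"],
    "multimodal_understanding": ["mimo-v2.5", "mimo-v2-omni"],
    "high_concurrency": ["mimo-v2-flash"],
    "tts_preset_voice": ["mimo-v2.5-tts"],
    "voice_cloning": ["mimo-v2.5-tts-voiceclone"],
    "voice_design": ["mimo-v2.5-tts-voicedesign"],
}


def recommend(scenario: str) -> List[str]:
    """Return the candidate model IDs for a scenario key, in listed order."""
    try:
        return list(RECOMMENDED_MODELS[scenario])
    except KeyError:
        known = ", ".join(sorted(RECOMMENDED_MODELS))
        raise ValueError(f"unknown scenario {scenario!r}; expected one of: {known}")


if __name__ == "__main__":
    print(recommend("multimodal_understanding"))
```

Returning a list keeps both options visible for scenarios where the guide names more than one model.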
Last updated: May 22, 2026