音频理解

音频理解模型可以根据您传入的音频进行回答,支持音频 URL 和 Base64 编码两种传入方式,适用于音频分析等场景。

快速开始

注意:获取 API Key 等准备工作,请参考 首次调用API

通过音频 URL 方式传入模型快速体验音频理解效果,示例代码如下。

Curl

curl --location --request POST 'https://api.xiaomimimo.com/v1/chat/completions' \
--header "api-key: $MIMO_API_KEY" \
--header "Content-Type: application/json" \
--data-raw '{
    "model": "mimo-v2.5",
    "messages": [
        {
            "role": "system",
            "content": "You are MiMo, an AI assistant developed by Xiaomi. Today is date: Tuesday, December 16, 2025. Your knowledge cutoff date is December 2024."
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "input_audio",
                    "input_audio": {
                        "data": "https://example-files.cnbj1.mi-fds.com/example-files/audio/audio_example.wav"
                    }
                },
                {
                    "type": "text",
                    "text": "please describe the content of the audio"
                }
            ]
        }
    ],
    "max_completion_tokens": 1024
}'

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("MIMO_API_KEY"),
    base_url="https://api.xiaomimimo.com/v1"
)

completion = client.chat.completions.create(
    model="mimo-v2.5",
    messages=[
        {
            "role": "system",
            "content": "You are MiMo, an AI assistant developed by Xiaomi. Today is date: Tuesday, December 16, 2025. Your knowledge cutoff date is December 2024."
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "input_audio",
                    "input_audio": {
                        "data": "https://example-files.cnbj1.mi-fds.com/example-files/audio/audio_example.wav"
                    }
                },
                {
                    "type": "text",
                    "text": "please describe the content of the audio"
                }
            ]
        }
    ],
    max_completion_tokens=1024
)

print(completion.model_dump_json())

响应结果

{
    "id": "550a678a6c2046a29128883eaaf849e7",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "",
                "role": "assistant",
                "tool_calls": null,
                "reasoning_content": "Good morning. Could you tell me what the weather will be like today?"
            }
        }
    ],
    "created": 1776850627,
    "model": "mimo-v2.5",
    "object": "chat.completion",
    "usage": {
        "completion_tokens": 17,
        "prompt_tokens": 86,
        "total_tokens": 103,
        "completion_tokens_details": {
            "reasoning_tokens": 15
        },
        "prompt_tokens_details": {
            "audio_tokens": 25,
            "cached_tokens": 82
        }
    }
}

支持的模型列表

当前仅支持 mimo-v2.5mimo-v2-omni 模型。

音频传入方式

支持的音频传入方式如下:

  • 音频 URL 传入:需提供公网可访问的音频 URL 地址。

  • Base64 编码传入:将音频转换为 Base64 编码字符串后再传入。

音频 URL 传入

通过公网可访问的音频 URL 地址直接传入音频文件,适用于音频文件已存储在公网可访问环境的场景。单个音频文件大小不能超过 100 MB。

Curl

curl --location --request POST 'https://api.xiaomimimo.com/v1/chat/completions' \
--header "api-key: $MIMO_API_KEY" \
--header "Content-Type: application/json" \
--data-raw '{
    "model": "mimo-v2.5",
    "messages": [
        {
            "role": "system",
            "content": "You are MiMo, an AI assistant developed by Xiaomi. Today is date: Tuesday, December 16, 2025. Your knowledge cutoff date is December 2024."
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "input_audio",
                    "input_audio": {
                        "data": "https://example-files.cnbj1.mi-fds.com/example-files/audio/audio_example.wav"
                    }
                },
                {
                    "type": "text",
                    "text": "please describe the content of the audio"
                }
            ]
        }
    ],
    "max_completion_tokens": 1024
}'

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("MIMO_API_KEY"),
    base_url="https://api.xiaomimimo.com/v1"
)

completion = client.chat.completions.create(
    model="mimo-v2.5",
    messages=[
        {
            "role": "system",
            "content": "You are MiMo, an AI assistant developed by Xiaomi. Today is date: Tuesday, December 16, 2025. Your knowledge cutoff date is December 2024."
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "input_audio",
                    "input_audio": {
                        "data": "https://example-files.cnbj1.mi-fds.com/example-files/audio/audio_example.wav"
                    }
                },
                {
                    "type": "text",
                    "text": "please describe the content of the audio"
                }
            ]
        }
    ],
    max_completion_tokens=1024
)

print(completion.model_dump_json())

Base64 编码传入

将音频文件转换为 Base64 编码字符串后传入,适用于音频文件无法通过公网 URL 访问的场景。转换后的 Base64 编码的字符串大小不能超过 50 MB。

请在 Base64 编码前携带前缀:data:{MIME_TYPE};base64,$BASE64_AUDIO

  • {MIME_TYPE}:音频的 MIME 类型(媒体类型),用于标识音频格式,需替换为实际音频对应的 MIME 值。
  • $BASE64_AUDIO:音频文件的纯 Base64 编码字符串(不含任何前缀)。

Curl

curl --location --request POST 'https://api.xiaomimimo.com/v1/chat/completions' \
--header "api-key: $MIMO_API_KEY" \
--header "Content-Type: application/json" \
--data-raw '{
    "model": "mimo-v2.5",
    "messages": [
        {
            "role": "system",
            "content": "You are MiMo, an AI assistant developed by Xiaomi. Today is date: Tuesday, December 16, 2025. Your knowledge cutoff date is December 2024."
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "input_audio",
                    "input_audio": {
                        "data": "data:{MIME_TYPE};base64,$BASE64_AUDIO"
                    }
                },
                {
                    "type": "text",
                    "text": "please describe the content of the audio"
                }
            ]
        }
    ],
    "max_completion_tokens": 1024
}'

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("MIMO_API_KEY"),
    base_url="https://api.xiaomimimo.com/v1"
)

completion = client.chat.completions.create(
    model="mimo-v2.5",
    messages=[
        {
            "role": "system",
            "content": "You are MiMo, an AI assistant developed by Xiaomi. Today is date: Tuesday, December 16, 2025. Your knowledge cutoff date is December 2024."
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "input_audio",
                    "input_audio": {
                        "data": "data:{MIME_TYPE};base64,$BASE64_AUDIO"
                    }
                },
                {
                    "type": "text",
                    "text": "please describe the content of the audio"
                }
            ]
        }
    ],
    max_completion_tokens=1024
)

print(completion.model_dump_json())

音频限制

  • 音频格式:MP3,WAV,FLAC,M4A,OGG。

音频文件格式变种较多,不能保证所有文件都能被识别,请通过测试验证文件能够被正常识别。

  • 音频大小:

    • 以 URL 方式传入时:单个音频文件大小不超过 100 MB。

    • 以 Base64 编码传入时:单个音频的 Base64 编码字符串大小不超过 50 MB。

  • 音频数量:传入多个音频时,音频数量受模型上下文长度限制,所有音频和文本的总 Token 数必须小于模型的上下文长度。

注:计算音频的 Token 请参考 音频 Token 用量说明。模型上下文长度请参考 定价与限速

音频 Token 用量说明

音频的 Token 转化请参考以下代码。估算结果仅供参考,实际用量以 API 响应为准。

总 Tokens 数 ≈ 音频时长(单位:秒,例如:10.6 秒)* 6.25

计费说明

  • 计费:总费用根据输入、输入(命中缓存)和输出 Token 数计算;价格请参考 定价与限速

    • 可通过 音频 Token 用量说明 计算音频的 Token 消耗。估算结果仅供参考,实际用量以 API 响应为准。
  • 查看账单:您可以在控制台的 账单明细 页面查看账单及用量。

常见问题

是否支持本地文件上传?

mimo-v2.5mimo-v2-omni 模型暂不支持音频本地文件上传。支持的上传方式请参考 音频传入方式

更新时间 2026 年 04 月 22 日
We use cookies and similar technologies of our own to ensure the proper functioning of the website, customize content according to user preferences and analyze users' interactions on the website, as well as their browsing habits. You can find more information in our Cookie Policy. Select an option or go to Cookie Settings to manage your preferences. Learn More.