MiniMax-M3 - OpenAI-Compatible API
- Use the OpenAI Chat Completions protocol to call the MiniMax-M3 model
- Multi-turn conversation: Supports single-turn or multi-turn contextual dialogue
- System prompts: Customize AI role and behavior via
role=systemmessages - Multimodal input:
contentsupports mixed text + image / video - Thinking mode: Controlled via
thinking.type; thinking content is returned viareasoning_content - Streaming output: Supports SSE streaming responses
- Tool calling: Supports Function Calling
Documentation Index
Fetch the complete documentation index at: https://docs.evolink.ai/llms.txt
Use this file to discover all available pages before exploring further.
https://direct.evolink.ai, which has better support for text models and long-lived connections. https://api.evolink.ai is the primary endpoint for multimodal services and serves as a fallback address for text models.Authorizations
##All APIs require Bearer Token authentication##
Get API Key:
Visit API Key Management Page to get your API Key
Add to request header:
Authorization: Bearer YOUR_API_KEYBody
Chat model name
MiniMax-M3 "MiniMax-M3"
List of conversation messages, supports multi-turn dialogue
Messages with different roles have different field structures; select the corresponding role to view
1- System Message
- User Message
- Assistant Message
- Tool Message
Controls deep thinking
Notes:
- Defaults to
adaptive: The model adaptively decides whether to engage in deep thinking based on problem difficulty - By default, thinking content is inlined in the response
content(wrapped in<think>...</think>tags); to separate it into a dedicated field, usereasoning_split
Whether to split thinking content into a separate field
false(default): Thinking content is inlined incontent, wrapped in<think>...</think>tagstrue: Thinking content is split intochoices[].message.reasoning_contentandreasoning_details
Sampling temperature, controls output randomness
Notes:
- Lower values (e.g. 0.2): More deterministic, focused output
- Higher values (e.g. 1.5): More random, creative output
- Range:
[0, 2], default 1
0 <= x <= 21
Nucleus Sampling parameter
Notes:
- Controls sampling from tokens with cumulative probability
- e.g. 0.95 means selecting from tokens reaching 95% cumulative probability
- Range:
[0, 1], MiniMax-M3 default 0.95
Recommendation: Do not adjust temperature and top_p simultaneously
0 <= x <= 10.95
Upper limit for generated content length (in tokens)
Notes:
- MiniMax-M3 recommended 131,072 (128K), maximum 524,288 (512K)
- Tokens generated by thinking also count toward this limit
- If generation is interrupted due to
length, try increasing this value
1 <= x <= 524288131072
Whether to return the response in streaming mode
true: Streaming response, returns content in real-time chunks via SSE (Server-Sent Events)false: Wait for complete response before returning (default)
false
Streaming response options
Only effective when stream=true
Tool definition list for Function Calling
Each tool requires a name, description, and parameter schema
Legacy generation length limit parameter
Note: Deprecated, please use max_completion_tokens instead
x >= 1Response
Chat completion successful
Unique identifier for the chat completion
"0668a381bdc3c0ded310e27c9a46d16e7"
Model name actually used
"MiniMax-M3"
Response type
chat.completion "chat.completion"
Creation timestamp (Unix seconds)
1777026807
List of chat completion choices
Token usage statistics
Whether the input content triggered a sensitive word filter. If the input severely violates policies, the API will return a content violation error with empty response content
Type of sensitive word triggered by input (returned when input_sensitive is true): 1 severe violation; 2 pornography; 3 advertising; 4 prohibited content; 5 abusive language; 6 violence/terrorism; 7 other
Whether the output content triggered a sensitive word filter
Type of sensitive word triggered by output
Status code and error details