POST /v1/chat/completions
curl --request POST \
  --url https://direct.evolink.ai/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "MiniMax-M3",
  "messages": [
    {
      "role": "user",
      "content": "Please introduce yourself"
    }
  ]
}
'
{
  "id": "066b36619b147e326d17053cccdef70f",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "<think>\nThe user is asking about the capital of France, which is a common knowledge question. The answer is Paris.\n</think>\nThe capital of France is **Paris**.",
        "role": "assistant",
        "name": "MiniMax AI",
        "audio_content": ""
      }
    }
  ],
  "created": 1777026807,
  "model": "MiniMax-M3",
  "object": "chat.completion",
  "usage": {
    "total_tokens": 60,
    "total_characters": 0,
    "prompt_tokens": 7,
    "completion_tokens": 53,
    "prompt_tokens_details": {
      "cached_tokens": 0
    }
  },
  "input_sensitive": false,
  "output_sensitive": false,
  "input_sensitive_type": 0,
  "output_sensitive_type": 0,
  "base_resp": {
    "status_code": 0,
    "status_msg": ""
  }
}


BaseURL: The default BaseURL is https://direct.evolink.ai, which has better support for text models and long-lived connections. https://api.evolink.ai is the primary endpoint for multimodal services and serves as a fallback address for text models.
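
The primary/fallback relationship described above can be sketched as an ordered list of endpoints for a text-model client to try. This is a minimal sketch: the two base URLs and the path come from this page, while the helper name `candidate_endpoints` and the idea of trying them in order are assumptions, not documented client behavior.

```python
# Base URLs are taken from the note above; the helper itself is hypothetical.
PRIMARY_BASE_URL = "https://direct.evolink.ai"   # default for text models
FALLBACK_BASE_URL = "https://api.evolink.ai"     # fallback for text models
CHAT_PATH = "/v1/chat/completions"


def candidate_endpoints(path: str = CHAT_PATH) -> list[str]:
    """Return full endpoint URLs in the order a text-model client might try them."""
    bases = (PRIMARY_BASE_URL, FALLBACK_BASE_URL)
    return [base.rstrip("/") + path for base in bases]


for url in candidate_endpoints():
    print(url)
```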

Authorizations

Authorization
string
header
required

All APIs require Bearer Token authentication.

Get API Key:

Visit the API Key Management page to get your API key.

Add to request header:

Authorization: Bearer YOUR_API_KEY
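
As an illustration, the header above can be attached to a request using only the Python standard library. This is a sketch under assumptions: the function name `build_request` is hypothetical, and the request object is only constructed here, not sent.

```python
import json
import urllib.request

DEFAULT_URL = "https://direct.evolink.ai/v1/chat/completions"


def build_request(api_key: str, payload: dict, url: str = DEFAULT_URL) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST request."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    "YOUR_API_KEY",
    {"model": "MiniMax-M3", "messages": [{"role": "user", "content": "Please introduce yourself"}]},
)
```

Passing `request` to `urllib.request.urlopen` would perform the actual call.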

Body

application/json
model
enum<string>
required

Chat model name

Available options:
MiniMax-M3
Example:

"MiniMax-M3"

messages
(System Message · object | User Message · object | Assistant Message · object | Tool Message · object)[]
required

List of conversation messages. Supports multi-turn dialogue.

Messages with different roles have different field structures.

Minimum array length: 1
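
A multi-turn conversation might be assembled as follows. This sketch covers only plain-text messages with `role` and `content` fields, as in the request sample on this page. The role string `"system"` and the helper `validate_messages` are assumptions, and tool messages are left out because their field structure is not shown here.

```python
# Roles that carry plain text; the exact "system" spelling is assumed.
TEXT_ROLES = {"system", "user", "assistant"}


def validate_messages(messages: list[dict]) -> list[dict]:
    """Check the minimum length and the basic shape of text messages."""
    if len(messages) < 1:
        raise ValueError("messages must contain at least one entry")
    for message in messages:
        if message.get("role") not in TEXT_ROLES:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError("content must be a string")
    return messages


# Earlier turns are resent in order, followed by the new user turn.
conversation = validate_messages([
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is **Paris**."},
    {"role": "user", "content": "And of Italy?"},
])
```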
thinking
object

Controls deep thinking

Notes:

  • Defaults to adaptive: The model adaptively decides whether to engage in deep thinking based on problem difficulty
  • By default, thinking content is inlined in the response content (wrapped in <think>...</think> tags); to separate it into a dedicated field, use reasoning_split
reasoning_split
boolean

Whether to split thinking content into a separate field

  • false (default): Thinking content is inlined in content, wrapped in <think>...</think> tags
  • true: Thinking content is split into choices[].message.reasoning_content and reasoning_details
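
When reasoning_split is left at false, a client that wants only the final answer has to strip the inline tags itself. The following is a minimal sketch of one way to do that with a regular expression. The helper name `split_thinking` is hypothetical, and it assumes well-formed, non-nested <think>...</think> pairs.

```python
import re

# Non-greedy match so multiple <think> blocks are handled separately.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(content: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from inline message content."""
    reasoning = "\n".join(part.strip() for part in THINK_RE.findall(content))
    answer = THINK_RE.sub("", content).strip()
    return reasoning, answer


sample = (
    "<think>\nThe user is asking about the capital of France, which is a "
    "common knowledge question. The answer is Paris.\n</think>\n"
    "The capital of France is **Paris**."
)
reasoning, answer = split_thinking(sample)
print(answer)
```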
temperature
number
default:1

Sampling temperature, controls output randomness

Notes:

  • Lower values (e.g. 0.2): More deterministic, focused output
  • Higher values (e.g. 1.5): More random, creative output
  • Range: [0, 2], default 1
Required range: 0 <= x <= 2
Example:

1

top_p
number
default:0.95

Nucleus Sampling parameter

Notes:

  • Restricts sampling to the smallest set of most likely tokens whose cumulative probability reaches top_p
  • e.g. 0.95 means sampling only from the most likely tokens that together account for 95% of the probability mass
  • Range: [0, 1], MiniMax-M3 default 0.95

Recommendation: Do not adjust temperature and top_p simultaneously

Required range: 0 <= x <= 1
Example:

0.95
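
The documented ranges for temperature and top_p can be checked on the client before sending a request. This is a sketch only: `sampling_params` is a hypothetical helper, and turning the recommendation into a warning is one possible design choice, not something the API enforces.

```python
import warnings


def sampling_params(temperature=None, top_p=None) -> dict:
    """Validate sampling settings and return only the ones that were set."""
    params = {}
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must satisfy 0 <= x <= 2")
        params["temperature"] = temperature
    if top_p is not None:
        if not 0 <= top_p <= 1:
            raise ValueError("top_p must satisfy 0 <= x <= 1")
        params["top_p"] = top_p
    if len(params) == 2:
        # Mirrors the recommendation above; the request would still be valid.
        warnings.warn("adjusting temperature and top_p together is not recommended")
    return params


payload = {"model": "MiniMax-M3", "messages": [{"role": "user", "content": "Hi"}]}
payload.update(sampling_params(temperature=0.2))
```

Omitted parameters are left out of the payload so the server-side defaults (1 and 0.95) apply.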

max_completion_tokens
integer

Upper limit for generated content length (in tokens)

Notes:

  • Recommended value for MiniMax-M3: 131,072 (128K); maximum: 524,288 (512K)
  • Tokens generated by thinking also count toward this limit
  • If generation is interrupted due to length, try increasing this value
Required range: 1 <= x <= 524288
Example:

131072
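
The advice to raise the limit after a truncated generation could be automated along these lines. This sketch assumes that truncation is reported as finish_reason "length", which is a common convention but is not stated on this page. The helper name `next_limit` and the doubling strategy are also assumptions.

```python
from typing import Optional

MAX_COMPLETION_TOKENS_CAP = 524_288  # documented maximum


def next_limit(response: dict, current_limit: int) -> Optional[int]:
    """Return a larger max_completion_tokens for a retry, or None if no retry applies."""
    truncated = any(
        choice.get("finish_reason") == "length"  # assumed truncation marker
        for choice in response.get("choices", [])
    )
    if not truncated or current_limit >= MAX_COMPLETION_TOKENS_CAP:
        return None
    return min(current_limit * 2, MAX_COMPLETION_TOKENS_CAP)
```

Because thinking tokens count toward the limit, a response can be cut off even when the visible answer is short.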

stream
boolean
default:false

Whether to return the response in streaming mode

  • true: Streaming response, returns content in real-time chunks via SSE (Server-Sent Events)
  • false: Wait for complete response before returning (default)
Example:

false
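
With stream set to true, the client reads SSE lines instead of a single JSON body. The sketch below shows how already-received lines might be reassembled into text. The chunk layout (`choices[].delta.content`) and the `[DONE]` end marker are assumptions based on common chat-completion streaming formats and are not specified on this page. It also assumes each event carries a single `data:` line.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the payload of each 'data:' line until the assumed end marker."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        yield data


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the text fragments from all streamed chunks."""
    parts = []
    for data in iter_sse_data(lines):
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)
```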

stream_options
object

Streaming response options

Only effective when stream=true

tools
object[]

Tool definition list for Function Calling

Each tool requires a name, description, and parameter schema
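
A tool entry might be built as shown below. This page states only that a tool needs a name, a description, and a parameter schema. The `{"type": "function", "function": {...}}` wrapper, the field names, and the `get_weather` example are assumptions borrowed from common function-calling formats, so check them against the expanded tools schema before relying on them.

```python
def make_tool(name: str, description: str, parameters: dict) -> dict:
    """Wrap the three required pieces in an assumed function-calling envelope."""
    return {
        "type": "function",  # assumed wrapper, not confirmed on this page
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters,  # JSON Schema describing the arguments
        },
    }


# Hypothetical tool used purely for illustration.
weather_tool = make_tool(
    name="get_weather",
    description="Look up the current weather for a city",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)
tools = [weather_tool]
```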

max_tokens
integer
deprecated

Legacy generation length limit parameter

Note: Deprecated. Use max_completion_tokens instead.

Required range: x >= 1

Response

Chat completion successful

id
string

Unique identifier for the chat completion

Example:

"0668a381bdc3c0ded310e27c9a46d16e7"

model
string

Model name actually used

Example:

"MiniMax-M3"

object
enum<string>

Response type

Available options:
chat.completion
Example:

"chat.completion"

created
integer

Creation timestamp (Unix seconds)

Example:

1777026807

choices
object[]

List of chat completion choices

usage
object

Token usage statistics

input_sensitive
boolean

Whether the input content triggered a sensitive word filter. If the input severely violates policies, the API will return a content violation error with empty response content

input_sensitive_type
integer

Type of sensitive word triggered by input (returned when input_sensitive is true): 1 severe violation; 2 pornography; 3 advertising; 4 prohibited content; 5 abusive language; 6 violence/terrorism; 7 other
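
The numeric codes listed above can be mapped to readable labels on the client side. This is a small sketch: the helper name `describe_input_flag` is hypothetical, and the labels are copied from the list above.

```python
from typing import Optional

SENSITIVE_TYPES = {
    1: "severe violation",
    2: "pornography",
    3: "advertising",
    4: "prohibited content",
    5: "abusive language",
    6: "violence/terrorism",
    7: "other",
}


def describe_input_flag(response: dict) -> Optional[str]:
    """Return a label for the input filter hit, or None if nothing was flagged."""
    if not response.get("input_sensitive"):
        return None
    code = response.get("input_sensitive_type", 0)
    return SENSITIVE_TYPES.get(code, f"unknown type {code}")
```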

output_sensitive
boolean

Whether the output content triggered a sensitive word filter

output_sensitive_type
integer

Type of sensitive word triggered by output

base_resp
object

Status code and error details
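
A client might check base_resp before reading choices. In the sample response on this page, a successful call carries status_code 0 and an empty status_msg. Treating every non-zero code as a failure is an assumption, and `ApiStatusError` and `ensure_ok` are hypothetical names.

```python
class ApiStatusError(RuntimeError):
    """Raised when base_resp reports a non-zero status code."""


def ensure_ok(response: dict) -> dict:
    """Return the response unchanged if base_resp indicates success."""
    base = response.get("base_resp") or {}
    code = base.get("status_code", 0)
    if code != 0:  # assumed: any non-zero code signals an error
        raise ApiStatusError(f"status_code={code}: {base.get('status_msg', '')}")
    return response
```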