POST /v1/messages
curl --request POST \
  --url https://direct.evolink.ai/v1/messages \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "deepseek-v4-flash",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": "Hello, world"
    }
  ]
}
'
{
  "id": "53ee6690-e14a-4e6b-890b-a135100d51c7",
  "type": "message",
  "role": "assistant",
  "model": "deepseek-v4-flash",
  "content": [
    {
      "type": "thinking",
      "thinking": "The user is asking about Japan's capital — a basic geography question. The answer is Tokyo, just give it directly.",
      "signature": "53ee6690-e14a-4e6b-890b-a135100d51c7"
    },
    {
      "type": "text",
      "text": "The capital of Japan is **Tokyo**."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 7,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0,
    "output_tokens": 77,
    "service_tier": "standard"
  }
}
BaseURL: The default BaseURL is https://direct.evolink.ai, which has better support for text models and long-lived connections. https://api.evolink.ai is the primary endpoint for multimodal services and serves as a fallback address for text models.

Authorizations

Authorization
string
header
required

All APIs require Bearer Token authentication

Get API Key:

Visit the API Key Management Page to obtain your API Key

Add to request header:

Authorization: Bearer YOUR_API_KEY

Note: Although Anthropic's native API uses the x-api-key header, EvoLink's /v1/messages uniformly uses Bearer Token authentication.
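
The header setup above can be sketched in Python with only the standard library (`YOUR_API_KEY` is a placeholder; substitute the key from the API Key Management Page):

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder

def build_request(payload: dict) -> urllib.request.Request:
    """Build a POST request for EvoLink's /v1/messages with Bearer auth."""
    return urllib.request.Request(
        "https://direct.evolink.ai/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Bearer token, not Anthropic's native x-api-key header.
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending is then a single `urllib.request.urlopen(req)` call; only the header shape matters here.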

Body

application/json
model
enum<string>
default:deepseek-v4-flash
required

Model to invoke

  • deepseek-v4-flash: Fast general-purpose
  • deepseek-v4-pro: Deep reasoning

Tip: Both models have thinking enabled by default, and responses always contain a type="thinking" content block; set thinking.type="disabled" explicitly to turn it off. An unspecified or unsupported model is automatically mapped to deepseek-v4-flash.

Available options:
deepseek-v4-flash,
deepseek-v4-pro
Example:

"deepseek-v4-flash"

max_tokens
integer
required

Maximum number of tokens to generate

Notes:

  • The V4 series can reach up to 384,000
  • Tokens produced by thinking also count toward the max_tokens limit
Required range: 1 <= x <= 384000
Example:

1024

messages
object[]
required

List of conversation messages, alternating between user / assistant turns

Notes:

  • At least one message must be included
  • The final message is typically role=user
  • image / document content types are not yet supported
Minimum array length: 1
system

System prompt, used to define the AI's role and behavior

Notes:

  • Accepts a string or an array of strings
  • Unlike the system message on the OpenAI endpoint, the Anthropic endpoint uses a top-level system field
Example:

"You are a helpful assistant."

temperature
number
default:1

Sampling temperature

Notes:

  • Range [0.0, 2.0]
  • Default 1; higher values are more diverse, lower values are more deterministic
Required range: 0 <= x <= 2
Example:

1

top_p
number
default:1

Nucleus sampling threshold

Notes:

  • Range [0, 1]
  • Do not adjust temperature and top_p simultaneously
Required range: 0 <= x <= 1
Example:

1

stop_sequences
string[]

Custom stop sequences

Notes:

  • Generation stops when the model encounters any of these strings
  • Up to 4 entries (following Anthropic's specification)
Maximum array length: 4
stream
boolean
default:false

Whether to return via SSE streaming

  • true: Server-Sent Events streaming return
  • false: Return the complete response all at once (default)
Example:

false
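
With stream=true the body arrives as Server-Sent Events. A minimal parser sketch, assuming each event's payload fits on one `data: <json>` line (this page does not specify event names or a terminal sentinel, so only the `data:` prefix is handled):

```python
import json

def parse_sse_lines(lines):
    """Yield the JSON payload of each 'data:' line in an SSE stream.

    Blank lines and other SSE fields (e.g. 'event:') are skipped.
    """
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())

# Hypothetical two-event stream for illustration:
sample = [
    'event: content_block_delta',
    'data: {"type": "content_block_delta", "delta": {"text": "Hel"}}',
    '',
    'data: {"type": "content_block_delta", "delta": {"text": "lo"}}',
]
events = list(parse_sse_lines(sample))
```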

thinking
object

Thinking mode control (V4)

Notes:

  • Enabled by default on both models (type=enabled)
  • When enabled, the response content array includes a type="thinking" reasoning block (billed as output tokens)
  • Note: The API ignores Anthropic's native budget_tokens field; use output_config.effort to control depth
  • In multi-turn conversations, place the thinking block from the previous response back into the assistant content array verbatim. The Anthropic protocol is lenient here: a missing thinking block will not cause an error, but preserving the signature helps context consistency
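
Echoing the previous thinking block back into the next turn can be sketched as follows (assuming the response shape shown at the top of this page):

```python
def next_turn_messages(history, response, user_text):
    """Append the assistant's full content array (thinking block and
    signature intact) plus the new user message to the history."""
    return history + [
        {"role": "assistant", "content": response["content"]},
        {"role": "user", "content": user_text},
    ]

# Using the example response from the top of this page:
response = {
    "role": "assistant",
    "content": [
        {
            "type": "thinking",
            "thinking": "(reasoning elided)",
            "signature": "53ee6690-e14a-4e6b-890b-a135100d51c7",
        },
        {"type": "text", "text": "The capital of Japan is **Tokyo**."},
    ],
}
history = [{"role": "user", "content": "What is the capital of Japan?"}]
messages = next_turn_messages(history, response, "And its population?")
```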
output_config
object

Output configuration (V4 extension)

Notes: DeepSeek only supports the effort field
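
A request carrying the field might look like the sketch below. Note the effort value "high" is an assumption: this page does not enumerate the accepted values, only that DeepSeek supports the field.

```python
payload = {
    "model": "deepseek-v4-pro",
    "max_tokens": 2048,
    "messages": [{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    # Anthropic's native budget_tokens is ignored by this API;
    # output_config.effort controls reasoning depth instead.
    # "high" is a hypothetical value, not confirmed by this page.
    "output_config": {"effort": "high"},
}
```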

tools
object[]

List of tool definitions

Notes:

  • Follows the Anthropic tool definition specification
  • input_schema uses a JSON Schema object
tool_choice
object

Controls tool-calling behavior

Available types:

  • auto: Model decides automatically (default when tools are provided)
  • any: Must call some tool (without specifying which)
  • tool: Must call the specified name
  • none: Forbid tool calls
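
A minimal tool definition following the Anthropic shape (the `get_weather` tool and its schema are illustrative, not part of this API):

```python
payload = {
    "model": "deepseek-v4-flash",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Weather in Tokyo?"}],
    "tools": [
        {
            "name": "get_weather",  # illustrative tool
            "description": "Get current weather for a city.",
            "input_schema": {  # JSON Schema object
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }
    ],
    # Force a call to this specific tool; use {"type": "auto"} to let the
    # model decide, {"type": "any"} to require some tool, or
    # {"type": "none"} to forbid tool calls.
    "tool_choice": {"type": "tool", "name": "get_weather"},
}
```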

Response

Message object

Anthropic-style message response

id
string

Unique message ID

type
enum<string>

Response object type

Available options:
message
role
enum<string>
Available options:
assistant
model
string

Model actually used

Example:

"deepseek-v4-pro"

content
object[]

List of response content blocks

Possible block types:

  • thinking: Reasoning process (only when thinking is enabled)
  • text: Final answer text
  • tool_use: Tool call initiated by the model
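
Splitting a response's content array by block type can be sketched as:

```python
def split_blocks(content):
    """Group content blocks by their 'type' field."""
    out = {"thinking": [], "text": [], "tool_use": []}
    for block in content:
        out.setdefault(block["type"], []).append(block)
    return out

# Content array from the example response at the top of this page:
content = [
    {
        "type": "thinking",
        "thinking": "(reasoning elided)",
        "signature": "53ee6690-e14a-4e6b-890b-a135100d51c7",
    },
    {"type": "text", "text": "The capital of Japan is **Tokyo**."},
]
blocks = split_blocks(content)
answer = "".join(b["text"] for b in blocks["text"])
```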
stop_reason
enum<string>

Stop reason

  • end_turn: Natural completion
  • max_tokens: Reached the max_tokens limit
  • stop_sequence: Hit a stop_sequences entry
  • tool_use: Model triggered a tool call
Available options:
end_turn,
max_tokens,
stop_sequence,
tool_use
stop_sequence
string | null

The specific sequence hit when stop_reason=stop_sequence; otherwise null

usage
object

Token usage statistics (Anthropic specification)