GLM-5.2 - Anthropic Compatible API

Authorizations

Authorization

string

header

required

##All interfaces require authentication using a Bearer Token##

Get an API Key:

Visit the API Key management page to obtain your API Key

Add it to the request header when using:

Authorization: Bearer YOUR_API_KEY

Note: EvoLink uses Bearer Token authentication uniformly for /v1/messages.

Body

application/json

model

enum<string>

required

The model to call

Available options:

glm-5.2

Example:

"glm-5.2"

messages

object[]

required

The list of conversation messages, alternating between user / assistant by turn

Notes:

Must contain at least 1 message
The last message is usually role=user
Multi-turn context is supported, and the model references prior messages

Minimum array length: 1

Show child attributes

max_tokens

integer

The upper limit on the length of generated content (number of tokens)

Notes:

Tokens produced by thinking also count toward this limit
When the limit is reached, content is truncated and the response returns stop_reason=max_tokens

Required range: x >= 1

Example:

1024

system

System prompt, used to set the AI's role and behavior

Notes:

Supports a string or an array of content blocks
Passed via the top-level system field (do not place it inside messages)
The model follows the system constraints
⚠️ An overly long system may be truncated: For long context, place it in messages rather than piling everything into system

Example:

"You are a helpful assistant."

temperature

number

Sampling temperature

Notes:

Higher values make output more varied, lower values more deterministic
Recommended range [0, 1]

Required range: 0 <= x <= 1

Example:

1

top_p

number

Nucleus sampling threshold

Notes:

Range [0, 1]
It is recommended not to adjust temperature and top_p at the same time

Required range: 0 <= x <= 1

Example:

0.9

top_k

integer

Sample only from the K highest-probability tokens (an Anthropic-specific parameter)

Notes:

Smaller values make output more deterministic, larger values make candidates more diverse

Required range: x >= 0

Example:

10

stop_sequences

string[]

Custom stop sequences: generation stops when it hits any of these strings

Notes:

Hitting one truncates output, and content before the hit is returned normally
⚠️ Note: When a stop sequence is hit, GLM-5.2 returns stop_reason as end_turn (rather than the Anthropic-standard stop_sequence), and the response does not include a stop_sequence field. If your client relies on stop_reason=="stop_sequence" to detect a hit, special handling is required

Example:

["\n\n"]

stream

boolean

default:false

Whether to return via SSE streaming

true: Server-Sent Events streaming (standard Anthropic event sequence: message_start / content_block_start / content_block_delta / message_delta / message_stop)
false: Returns the complete response all at once (default)

Example:

false

thinking

object

Controls deep thinking

Notes:

GLM-5.2 is a reasoning model, and thinking is enabled by default when this field is not passed
When enabled, the response content array includes a type="thinking" reasoning-process block (billed as output tokens, and signature may be an empty string)
Pass {"type":"disabled"} to turn off thinking, significantly reducing output tokens
⚠️ Only the binary type toggle takes effect: thinking budget / level parameters such as budget_tokens and effort have no effect (they are ignored), so the amount of thinking cannot be finely controlled

Show child attributes

tools

object[]

The list of tool definitions

Notes:

Follows the Anthropic tool definition spec
input_schema uses a JSON Schema object
The model returns standard tool_use blocks with stop_reason=tool_use

Show child attributes

tool_choice

object

Tool selection strategy

Show child attributes

metadata

object

Request metadata

Show child attributes

Response

Message object

Anthropic-style message response

string

The message's unique ID (format: msg_<uuid>)

type

enum<string>

Response object type

Available options:

message

role

enum<string>

Available options:

assistant

model

string

The model actually used

Example:

"glm-5.2"

content

object[]

The list of response content blocks

Possible block types:

thinking: the reasoning process (when thinking is enabled, which is the default)
text: the final answer text
tool_use: a tool call initiated by the model

Show child attributes

stop_reason

enum<string>

Stop reason

end_turn: natural completion (⚠️ also returned when stop_sequences is hit)
max_tokens: reached the max_tokens limit
tool_use: the model triggered a tool call

Available options:

end_turn,

max_tokens,

tool_use

usage

object

Token usage statistics (Anthropic spec)

Show child attributes