Authorizations
All endpoints require Bearer Token authentication.
Get your API Key:
Visit the API Key Management page to get your API Key.
Add it to your request headers:
Authorization: Bearer YOUR_API_KEY

Body
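As a sketch, the header can be assembled like this in Python (the helper name and the `MOONSHOT_API_KEY` environment variable are illustrative, not part of the API; the HTTP client and endpoint URL are omitted):

```python
import os

def auth_headers(api_key: str) -> dict:
    # The API expects exactly this header shape:
    # Authorization: Bearer YOUR_API_KEY
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

# Illustrative: read the key from an environment variable of your choosing.
headers = auth_headers(os.environ.get("MOONSHOT_API_KEY", "YOUR_API_KEY"))
```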
model
Model name for chat completion
Available options: kimi-k2-thinking, kimi-k2-thinking-turbo
Example: "kimi-k2-thinking"
messages
List of messages for the conversation; supports multi-turn dialogue and multimodal input.
Minimum length: 1

stream
Whether to stream the response.
- true: stream the response, returning content chunk by chunk in real time
- false: wait for the complete response and return it all at once
Default: false
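A minimal request body using the fields above might be assembled as follows (the prompt content is illustrative; sending the request is omitted):

```python
import json

payload = {
    "model": "kimi-k2-thinking",
    "messages": [{"role": "user", "content": "Hello"}],
    # False: wait for the complete response; True: receive chunks in real time.
    "stream": False,
}
body = json.dumps(payload)
```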
max_tokens
Maximum number of tokens to generate in the response.
Note:
- Too small a value may cause a truncated response
- If the max token limit is reached, finish_reason will be "length"; otherwise "stop"
Required range: x >= 1
Example: 2000
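The finish_reason behavior above can be checked on a parsed choice; a small sketch (the helper is hypothetical, not part of the API):

```python
def was_truncated(choice: dict) -> bool:
    # "length" means the max token limit was hit; "stop" means a natural end.
    return choice.get("finish_reason") == "length"
```

If it returns True, consider retrying with a larger max_tokens.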
temperature
Sampling temperature; controls the randomness of the output.
Note:
- Lower values (e.g., 0.2): more deterministic, focused output
- Higher values (e.g., 1.5): more random, creative output
- Recommended value for the kimi-k2-thinking series: 1.0
Required range: 0 <= x <= 2
Example: 1
top_p
Nucleus sampling parameter.
Note:
- Restricts sampling to tokens within a cumulative probability mass
- For example, 0.9 means sampling from the tokens that make up the top 90% of cumulative probability
- Default: 1.0 (considers all tokens)
Suggestion: do not adjust temperature and top_p at the same time.
Required range: 0 <= x <= 1
Example: 0.9
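A sketch of a client-side check that enforces the documented ranges and the advice to adjust only one of the two knobs (the helper is illustrative, not part of the API):

```python
def sampling_errors(params: dict) -> list:
    """Return a list of problems with the temperature/top_p settings."""
    errors = []
    t = params.get("temperature")
    p = params.get("top_p")
    if t is not None and not 0 <= t <= 2:
        errors.append("temperature must be in [0, 2]")
    if p is not None and not 0 <= p <= 1:
        errors.append("top_p must be in [0, 1]")
    if t is not None and p is not None:
        # The doc suggests not adjusting both simultaneously.
        errors.append("adjust only one of temperature and top_p")
    return errors
```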
top_k
Top-K sampling parameter.
Note:
- For example, 10 limits sampling to the 10 highest-probability tokens
- Smaller values make the output more focused
- Default: no limit
Required range: x >= 1
Example: 40
n
Number of completions to generate for each input message.
Note:
- Default: 1; maximum: 5
- When temperature is very close to 0, only one result can be returned
Required range: 1 <= x <= 5
Example: 1
presence_penalty
Presence penalty; a number between -2.0 and 2.0.
Note:
- Positive values penalize new tokens based on whether they have already appeared in the text, increasing the likelihood of discussing new topics
Required range: -2 <= x <= 2
Example: 0
frequency_penalty
Frequency penalty; a number between -2.0 and 2.0.
Note:
- Positive values penalize new tokens based on their frequency in the text so far, decreasing the likelihood of repeating the same phrases verbatim
Required range: -2 <= x <= 2
Example: 0
response_format
Response format settings.
Note:
- Set to {"type": "json_object"} to enable JSON mode, which ensures the model generates valid JSON
- When using response_format with {"type": "json_object"}, explicitly instruct the model to output JSON in your prompt
- Default: {"type": "text"}
- Warning: do not mix partial mode with response_format={"type": "json_object"}
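For example, a JSON-mode request body might look like this; note that the prompt itself asks for JSON, as recommended above (the prompt wording is illustrative):

```python
payload = {
    "model": "kimi-k2-thinking",
    "messages": [
        # JSON mode works best when the prompt explicitly requests JSON.
        {"role": "user",
         "content": 'List two primary colors as a JSON object with key "colors".'}
    ],
    # Enables JSON mode: the model is constrained to emit valid JSON.
    "response_format": {"type": "json_object"},
}
```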
stop
Stop sequences; generation stops when any of these sequences is matched.
Note:
- The stop sequences themselves are not included in the output
- Maximum 5 strings, each no longer than 32 bytes
- A single stop word (string) is also accepted
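These limits can be validated client-side before sending; a sketch (the helper name is illustrative, not part of the API):

```python
def validate_stop(stop) -> list:
    # Accept a single stop word or a list, per the parameter description.
    seqs = [stop] if isinstance(stop, str) else list(stop)
    if len(seqs) > 5:
        raise ValueError("at most 5 stop sequences are allowed")
    for s in seqs:
        # The 32-byte limit is measured on the encoded string.
        if len(s.encode("utf-8")) > 32:
            raise ValueError("each stop sequence must be at most 32 bytes")
    return seqs
```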
tools
List of tools for Tool Use or Function Calling.
Note:
- Each tool must include a type
- The function structure must include name, description, and parameters
- Maximum of 128 functions in the tools array
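A single tool entry satisfying these requirements might look like this (the weather function and its schema are hypothetical examples, not a built-in tool):

```python
weather_tool = {
    "type": "function",  # each tool must include a type
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        # parameters is a JSON Schema object describing the arguments
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}
tools = [weather_tool]  # up to 128 entries
```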
Maximum length: 128

Response
Chat completion successful
id
Unique identifier for the chat completion
Example: "cmpl-04ea926191a14749b7f2c7a48a68abc6"

model
The model used for the completion
Example: "kimi-k2-thinking"

object
Response type
Available option: chat.completion
Example: "chat.completion"

created
Unix timestamp when the completion was created
Example: 1698999496

choices
List of completion choices

usage
Token usage statistics
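Putting the response fields together, a returned object has roughly this shape (values taken from the examples above; the contents of choices and usage are omitted here):

```python
response = {
    "id": "cmpl-04ea926191a14749b7f2c7a48a68abc6",
    "model": "kimi-k2-thinking",
    "object": "chat.completion",
    "created": 1698999496,  # Unix timestamp
    "choices": [],  # list of completion choices (contents omitted)
    "usage": {},    # token usage statistics (contents omitted)
}
```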