##All endpoints require Bearer Token authentication##
Get API Key:
Visit API Key Management Page to get your API Key
Add to request headers:
Authorization: Bearer YOUR_API_KEYModel name for chat completion
kimi-k2-thinking, kimi-k2-thinking-turbo "kimi-k2-thinking"
List of messages for the conversation, supports multi-turn dialogue and multimodal input
1Whether to stream the response
true: Stream response, returns content chunk by chunk in real-timefalse: Wait for complete response and return all at oncefalse
Maximum number of tokens to generate in the response
Note:
x >= 12000
Sampling temperature, controls randomness of output
Note:
0 <= x <= 21
Nucleus sampling parameter
Note:
Suggestion: Do not adjust both temperature and top_p simultaneously
0 <= x <= 10.9
Top-K sampling parameter
Note:
x >= 140
Number of completions to generate for each input message
Note:
1 <= x <= 51
Presence penalty, number between -2.0 and 2.0
Note:
-2 <= x <= 20
Frequency penalty, number between -2.0 and 2.0
Note:
-2 <= x <= 20
Response format settings
Note:
Stop sequences, generation stops when these sequences are matched
Note:
List of tools for Tool Use or Function Calling
Note:
128Chat completion successful
Unique identifier for the chat completion
"cmpl-04ea926191a14749b7f2c7a48a68abc6"
The model used for completion
"kimi-k2-thinking"
Response type
chat.completion "chat.completion"
Unix timestamp when the completion was created
1698999496
List of completion choices
Token usage statistics