> ## Documentation Index
> Fetch the complete documentation index at: https://docs.evolink.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Kimi K2 - Complete API Reference

> - Use OpenAI SDK format to call Kimi-K2 model
- Synchronous processing mode, real-time response
- **Text conversation**: Single or multi-turn contextual dialogue, see simple_text and multi_turn examples
- **System prompts**: Customize AI role and behavior, see system_prompt example
- **Multimodal input**: Supports text + image mixed input, see vision example
- **Tool calling**: Supports Function Calling, see tool_use example
- **Partial Mode**: Supports prefill mode, see partial_mode example

<Note>
  **BaseURL**: The default BaseURL is `https://direct.evolink.ai`, which has better support for text models and long-lived connections. `https://api.evolink.ai` is the primary endpoint for multimodal services and serves as a fallback address for text models.
</Note>


## OpenAPI

````yaml /en/api-manual/language-series/kimi-k2/Kimi-K2-api.json POST /v1/chat/completions
openapi: 3.1.0
info:
  title: Kimi-K2 Complete API Reference
  description: >-
    Complete API reference for Kimi-K2 chat interface, including all parameters
    and advanced features
  license:
    name: MIT
  version: 1.0.0
servers:
  - url: https://direct.evolink.ai
    description: Production (recommended)
  - url: https://api.evolink.ai
    description: Alternative URL
security:
  - bearerAuth: []
tags:
  - name: Chat Completion
    description: AI chat completion related endpoints
paths:
  /v1/chat/completions:
    post:
      tags:
        - Chat Completion
      summary: Kimi-K2 Chat Interface
      description: >-
        - Use OpenAI SDK format to call Kimi-K2 model

        - Synchronous processing mode, real-time response

        - **Text conversation**: Single or multi-turn contextual dialogue, see
        simple_text and multi_turn examples

        - **System prompts**: Customize AI role and behavior, see system_prompt
        example

        - **Multimodal input**: Supports text + image mixed input, see vision
        example

        - **Tool calling**: Supports Function Calling, see tool_use example

        - **Partial Mode**: Supports prefill mode, see partial_mode example
      operationId: createChatCompletion
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ChatCompletionRequest'
            examples:
              simple_text:
                summary: Single-turn text conversation
                value:
                  model: kimi-k2-thinking
                  messages:
                    - role: user
                      content: Please introduce yourself
                  temperature: 1
              multi_turn:
                summary: Multi-turn conversation (context understanding)
                value:
                  model: kimi-k2-thinking
                  messages:
                    - role: user
                      content: What is Python?
                    - role: assistant
                      content: Python is a high-level programming language...
                    - role: user
                      content: What are its advantages?
                  temperature: 1
              system_prompt:
                summary: Using system prompts
                value:
                  model: kimi-k2-thinking
                  messages:
                    - role: system
                      content: >-
                        You are Kimi, an AI assistant provided by Moonshot AI.
                        You excel at conversations in both Chinese and English.
                        You provide safe, helpful, and accurate answers. You
                        will reject any questions involving terrorism, racial
                        discrimination, or violence.
                    - role: user
                      content: What is 1+1?
                  temperature: 1
              vision:
                summary: Multimodal input (text + image)
                value:
                  model: kimi-k2-thinking
                  messages:
                    - role: user
                      content:
                        - type: text
                          text: >-
                            Please describe the scene and main elements in this
                            image in detail.
                        - type: image_url
                          image_url:
                            url: >-
                              data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg==
                  temperature: 1
              tool_use:
                summary: Tool calling (Function Calling)
                value:
                  model: kimi-k2-thinking
                  messages:
                    - role: user
                      content: Write code to determine if 3214567 is a prime number.
                  tools:
                    - type: function
                      function:
                        name: CodeRunner
                        description: >-
                          Code executor that supports running python and
                          javascript code
                        parameters:
                          type: object
                          properties:
                            language:
                              type: string
                              enum:
                                - python
                                - javascript
                            code:
                              type: string
                              description: Write your code here
                  temperature: 1
              partial_mode:
                summary: Partial Mode (prefill mode)
                value:
                  model: kimi-k2-thinking
                  messages:
                    - role: system
                      content: >-
                        Extract name, size, price and colors from the product
                        description and output as a JSON object.
                    - role: user
                      content: >-
                        The SmartHome Mini is a compact smart home assistant,
                        available in black and silver, priced at $149, with
                        dimensions of 256 x 128 x 128mm.
                    - role: assistant
                      content: '{'
                      partial: true
                  temperature: 1
      responses:
        '200':
          description: Chat completion successful
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ChatCompletionResponse'
        '400':
          description: Invalid request parameters
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: 400
                  message: Invalid request parameters
                  type: invalid_request_error
        '401':
          description: Unauthorized, invalid or expired token
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: 401
                  message: Invalid or expired token
                  type: authentication_error
        '402':
          description: Insufficient quota, recharge required
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: 402
                  message: Insufficient quota
                  type: insufficient_quota_error
                  fallback_suggestion: https://evolink.ai/dashboard/billing
        '403':
          description: Access denied
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: 403
                  message: Access denied for this model
                  type: permission_error
                  param: model
        '404':
          description: Resource not found
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: 404
                  message: Specified model not found
                  type: not_found_error
                  param: model
                  fallback_suggestion: kimi-k2-thinking
        '413':
          description: Request payload too large
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: 413
                  message: Image file too large
                  type: request_too_large_error
                  param: content
                  fallback_suggestion: compress image to under 10MB
        '429':
          description: Rate limit exceeded
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: 429
                  message: Rate limit exceeded
                  type: rate_limit_error
                  fallback_suggestion: retry after 60 seconds
        '500':
          description: Internal server error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: 500
                  message: Internal server error
                  type: internal_server_error
                  fallback_suggestion: try again later
        '502':
          description: Upstream service error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: 502
                  message: Upstream AI service unavailable
                  type: upstream_error
                  fallback_suggestion: try different model
        '503':
          description: Service temporarily unavailable
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: 503
                  message: Service temporarily unavailable
                  type: service_unavailable_error
                  fallback_suggestion: retry after 30 seconds
components:
  schemas:
    ChatCompletionRequest:
      type: object
      required:
        - model
        - messages
      properties:
        model:
          type: string
          description: Model name for chat completion
          enum:
            - kimi-k2-thinking
            - kimi-k2-thinking-turbo
          example: kimi-k2-thinking
        messages:
          type: array
          description: >-
            List of messages for the conversation, supports multi-turn dialogue
            and multimodal input
          items:
            $ref: '#/components/schemas/Message'
          minItems: 1
        stream:
          type: boolean
          description: >-
            Whether to stream the response


            - `true`: Stream response, returns content chunk by chunk in
            real-time

            - `false`: Wait for complete response and return all at once
          default: false
          example: false
        max_tokens:
          type: integer
          description: >-
            Maximum number of tokens to generate in the response


            **Note**:

            - Too small value may cause truncated response

            - If max tokens is reached, finish_reason will be "length",
            otherwise "stop"
          minimum: 1
          example: 2000
        temperature:
          type: number
          description: |-
            Sampling temperature, controls randomness of output

            **Note**:
            - Lower values (e.g., 0.2): More deterministic and focused output
            - Higher values (e.g., 1.5): More random and creative output
            - **Recommended value for kimi-k2-thinking series: 1.0**
          minimum: 0
          maximum: 2
          default: 1
          example: 1
        top_p:
          type: number
          description: >-
            Nucleus sampling parameter


            **Note**:

            - Controls sampling from tokens with cumulative probability

            - For example, 0.9 means sampling from tokens with top 90%
            cumulative probability

            - Default: 1.0 (considers all tokens)


            **Suggestion**: Do not adjust both temperature and top_p
            simultaneously
          minimum: 0
          maximum: 1
          default: 1
          example: 0.9
        top_k:
          type: integer
          description: >-
            Top-K sampling parameter


            **Note**:

            - For example, 10 limits sampling to the top 10 highest probability
            tokens

            - Smaller values make output more focused

            - Default: no limit
          minimum: 1
          example: 40
        'n':
          type: integer
          description: |-
            Number of completions to generate for each input message

            **Note**:
            - Default: 1, maximum: 5
            - When temperature is very close to 0, only 1 result can be returned
          minimum: 1
          maximum: 5
          default: 1
          example: 1
        presence_penalty:
          type: number
          description: >-
            Presence penalty, number between -2.0 and 2.0


            **Note**:

            - Positive values penalize new tokens based on whether they appear
            in the text, increasing likelihood of discussing new topics
          minimum: -2
          maximum: 2
          default: 0
          example: 0
        frequency_penalty:
          type: number
          description: >-
            Frequency penalty, number between -2.0 and 2.0


            **Note**:

            - Positive values penalize new tokens based on their frequency in
            the text, decreasing likelihood of repeating same phrases verbatim
          minimum: -2
          maximum: 2
          default: 0
          example: 0
        response_format:
          type: object
          description: >-
            Response format settings


            **Note**:

            - Set to {"type": "json_object"} to enable JSON mode, ensuring model
            generates valid JSON

            - When using response_format with {"type": "json_object"},
            explicitly guide the model to output JSON format in your prompt

            - Default: {"type": "text"}

            - **Warning**: Do not mix partial mode with
            response_format=json_object
          properties:
            type:
              type: string
              enum:
                - text
                - json_object
              description: Response format type
              default: text
        stop:
          oneOf:
            - type: string
              description: Single stop word
            - type: array
              description: List of stop words, maximum 5
              items:
                type: string
              maxItems: 5
          description: |-
            Stop sequences, generation stops when these sequences are matched

            **Note**:
            - The stop sequences themselves will not be included in the output
            - Maximum 5 strings, each no longer than 32 bytes
        tools:
          type: array
          description: >-
            List of tools for Tool Use or Function Calling


            **Note**:

            - Each tool must include a type

            - The function structure must include name, description, and
            parameters

            - Maximum 128 functions in tools array
          items:
            $ref: '#/components/schemas/Tool'
          maxItems: 128
    ChatCompletionResponse:
      type: object
      properties:
        id:
          type: string
          description: Unique identifier for the chat completion
          example: cmpl-04ea926191a14749b7f2c7a48a68abc6
        model:
          type: string
          description: The model used for completion
          example: kimi-k2-thinking
        object:
          type: string
          enum:
            - chat.completion
          description: Response type
          example: chat.completion
        created:
          type: integer
          description: Unix timestamp when the completion was created
          example: 1698999496
        choices:
          type: array
          description: List of completion choices
          items:
            $ref: '#/components/schemas/Choice'
        usage:
          $ref: '#/components/schemas/Usage'
    ErrorResponse:
      type: object
      properties:
        error:
          type: object
          properties:
            code:
              type: integer
              description: HTTP status error code
            message:
              type: string
              description: Error message
            type:
              type: string
              description: Error type
            param:
              type: string
              description: Related parameter name
            fallback_suggestion:
              type: string
              description: Suggestion for handling the error
    Message:
      type: object
      required:
        - role
        - content
      properties:
        role:
          type: string
          description: |-
            Message role

            - `user`: User message
            - `assistant`: AI assistant message (for multi-turn conversation)
            - `system`: System prompt (defines AI's role and behavior)
          enum:
            - user
            - assistant
            - system
          example: user
        content:
          oneOf:
            - type: string
              description: Plain text message content
              example: Please introduce yourself
            - type: array
              description: Multimodal message content, supports text and image mixed input
              items:
                $ref: '#/components/schemas/ContentPart'
          description: >-
            Message content. Supports two formats:


            **1. Plain text string**: Can directly pass a string, e.g.,
            `"content":"Please introduce yourself"`


            **2. Object array** (supports multimodal input): See ContentPart
            definition below
        name:
          type: string
          description: >-
            Message name


            **Note**:

            - Used for role-playing scenarios, the name field can be viewed as
            part of the output prefix
          example: Kal'tsit
        partial:
          type: boolean
          description: >-
            Whether to enable Partial Mode


            **Note**:

            - Only set in the last message with role=assistant

            - Set to true to enable partial mode (prefill mode)

            - **Warning**: Do not mix partial mode with
            response_format=json_object
          default: false
    Tool:
      type: object
      required:
        - type
        - function
      properties:
        type:
          type: string
          enum:
            - function
          description: Tool type
        function:
          type: object
          required:
            - name
            - description
            - parameters
          properties:
            name:
              type: string
              description: |-
                Function name

                **Note**:
                - Must follow regex pattern: ^[a-zA-Z_][a-zA-Z0-9-_]{0,63}$
                - Using clear English names will be better accepted by the model
              pattern: ^[a-zA-Z_][a-zA-Z0-9-_]{0,63}$
              example: CodeRunner
            description:
              type: string
              description: >-
                Function description, explains what the function does to help
                the model judge and select
              example: Code executor that supports running python and javascript code
            parameters:
              type: object
              description: |-
                Function parameter definition

                **Note**:
                - The root of parameters must be an object
                - Content is a subset of JSON Schema
    Choice:
      type: object
      properties:
        index:
          type: integer
          description: Index of this choice
          example: 0
        message:
          $ref: '#/components/schemas/AssistantMessage'
        finish_reason:
          type: string
          description: |-
            Reason why the completion finished

            - `stop`: Natural completion
            - `length`: Reached maximum token limit
            - `content_filter`: Content was filtered
          enum:
            - stop
            - length
            - content_filter
          example: stop
    Usage:
      type: object
      description: Token usage statistics
      properties:
        prompt_tokens:
          type: integer
          description: Number of tokens in the input
          example: 8
        completion_tokens:
          type: integer
          description: Number of tokens in the output
          example: 292
        total_tokens:
          type: integer
          description: Total number of tokens used
          example: 300
        prompt_tokens_details:
          type: object
          description: Detailed breakdown of input tokens
          properties:
            cached_tokens:
              type: integer
              description: Number of cached tokens
              example: 8
    ContentPart:
      oneOf:
        - $ref: '#/components/schemas/TextContent'
        - $ref: '#/components/schemas/ImageContent'
    AssistantMessage:
      type: object
      properties:
        role:
          type: string
          description: Role of the message sender
          enum:
            - assistant
          example: assistant
        content:
          type: string
          description: AI's response content
          example: Hi there! How can I help you?
        reasoning_content:
          type: string
          description: >-
            Reasoning process content (only returned by kimi-k2-thinking series
            models)


            **Note**:

            - Signature feature of Kimi K2 thinking series models

            - Shows the model's thinking and reasoning process

            - Helps understand how the model arrives at its final answer
          example: >-
            The user just said "hi". This is a very simple greeting. I should be
            friendly, helpful, and professional in my response...
    TextContent:
      title: Text content
      type: object
      required:
        - type
        - text
      properties:
        type:
          type: string
          enum:
            - text
          description: Content type
        text:
          type: string
          description: Text content
          example: Please describe this image in detail
    ImageContent:
      title: Image content
      type: object
      required:
        - type
        - image_url
      properties:
        type:
          type: string
          enum:
            - image_url
          description: Content type
        image_url:
          type: object
          required:
            - url
          properties:
            url:
              type: string
              format: uri
              description: >-
                Image URL or Base64 encoding


                **Format**:

                - URL format: `https://example.com/image.png`

                - Base64 format: `data:image/<format>;base64,<Base64 encoding>`
                (format must be lowercase, e.g., png, jpg, jpeg, webp)


                **Limits**:

                - Maximum image size: `10MB`

                - Supported formats: `.jpeg`, `.jpg`, `.png`, `.webp`
              example: data:image/png;base64,iVBORw0KGgo...
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      description: >-
        ##All APIs require Bearer Token authentication##


        **Get API Key:**


        Visit [API Key Management Page](https://evolink.ai/dashboard/keys) to
        get your API Key


        **Add to request header:**

        ```

        Authorization: Bearer YOUR_API_KEY

        ```

````