
# Wan2.7 Reference Video

- The WAN2.7 (`wan2.7-reference-video`) model supports reference-to-video generation: it uses people or objects as protagonists to produce single-character performances or multi-character interactions
- Multimodal inputs: starting frame (`image_start`), multiple reference images (`image_urls`), multiple reference videos (`video_urls`), and per-character voice bindings
- **At least one** reference image (`image_urls`) or reference video (`video_urls`) must be provided; passing only `image_start` does not satisfy this. The total of `image_urls` + `video_urls` must be ≤ 5
- **Character indexing in prompt:** in Chinese use "图1, 图2 / 视频1, 视频2"; in English use "Image 1", "Video 1". The numbers are 1-based and follow the order of `image_urls` / `video_urls`. Images and videos are counted independently, so "Image 1" and "Video 1" can coexist
- **Multi-character voice binding:** prefer `model_params.voice_bindings` (precise binding); the legacy `audio_urls` (positional alignment) is also supported
- Processing is asynchronous: use the returned task ID to [query status](/en/api-manual/task-management/get-task-detail)
- Generated video links are valid for 24 hours; save them promptly
- **Billing:** charged based on "input video duration + output video duration"; only successful generations are billed, and failed tasks are free

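The constraints listed above can be checked on the client before a task is submitted. The following is a minimal Python sketch, not part of the API: `validate_payload` and its messages are hypothetical names chosen for illustration. The duration bounds (2 to 15 seconds, capped at 10 when `video_urls` is present) and the `image{N}` / `video{N}` key format are taken from the schema in this page. The server remains the authority on what is accepted.

```python
import re

MAX_REFERENCES = 5  # image_urls + video_urls must total at most 5
# Keys look like image1 / video2: 1-based, no leading zeros.
KEY_PATTERN = re.compile(r"(image|video)([1-9][0-9]*)")


def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    images = payload.get("image_urls") or []
    videos = payload.get("video_urls") or []

    # image_start alone does not satisfy the reference requirement.
    if not images and not videos:
        problems.append("provide at least one of image_urls / video_urls")
    if len(images) + len(videos) > MAX_REFERENCES:
        problems.append("image_urls + video_urls must total at most 5")

    # duration defaults to 5 and is capped at 10 when video_urls is present.
    duration = payload.get("duration", 5)
    max_duration = 10 if videos else 15
    if not 2 <= duration <= max_duration:
        problems.append(f"duration must be between 2 and {max_duration}")

    bindings = (payload.get("model_params") or {}).get("voice_bindings") or {}
    if len(bindings) > MAX_REFERENCES:
        problems.append("at most 5 voice bindings are allowed")
    for key, url in bindings.items():
        match = KEY_PATTERN.fullmatch(key)
        if match is None:
            problems.append(f"bad voice_bindings key: {key}")
            continue
        kind, index = match.group(1), int(match.group(2))
        available = len(images) if kind == "image" else len(videos)
        if index > available:
            problems.append(f"voice_bindings.{key} is out of range")
        if not url:
            problems.append(f"voice_bindings.{key} has an empty value")
    return problems


example = {
    "model": "wan2.7-reference-video",
    "prompt": "Image 1 waves at Video 1",
    "image_urls": ["https://example.com/girl.jpg"],
    "video_urls": ["https://example.com/reference.mp4"],
    "model_params": {
        "voice_bindings": {"image1": "https://example.com/girl_voice.mp3"}
    },
    "duration": 10,
}
print(validate_payload(example))  # prints []
```

The sketch does not inspect file formats, sizes, or media durations, since those require fetching the referenced files.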

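When migrating from the legacy `audio_urls` field to `model_params.voice_bindings`, the positional rule in the `audio_urls` description (videos are matched first, then images, each in array order) can be turned into explicit keys. The helper below is a hypothetical sketch of that mapping. Raising an error when there are more audio URLs than reference materials is an assumption made here, not documented server behaviour.

```python
def audio_urls_to_voice_bindings(video_urls, image_urls, audio_urls):
    """Map positional audio_urls onto explicit voice_bindings keys."""
    # The positional form binds to video1..videoN first, then image1..imageM.
    keys = [f"video{i}" for i in range(1, len(video_urls) + 1)]
    keys += [f"image{i}" for i in range(1, len(image_urls) + 1)]
    if len(audio_urls) > len(keys):
        # Assumption: surplus audio has nothing to bind to, so treat it as an error.
        raise ValueError("more audio_urls than reference materials")
    # zip stops at the shorter list, so trailing references stay unbound.
    return dict(zip(keys, audio_urls))


bindings = audio_urls_to_voice_bindings(
    ["https://example.com/reference.mp4"],
    ["https://example.com/ref1.jpg", "https://example.com/ref2.jpg"],
    ["https://example.com/voice1.mp3", "https://example.com/voice2.mp3"],
)
print(bindings)
# {'video1': 'https://example.com/voice1.mp3', 'image1': 'https://example.com/voice2.mp3'}
```

Explicit keys also allow skipping a character in the middle of the list, which the positional form cannot express.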

## OpenAPI

````yaml /en/api-manual/video-series/wan2.7/wan2.7-reference-video.json POST /v1/videos/generations
openapi: 3.1.0
info:
  title: wan2.7-reference-video API
  description: >-
    Generate videos using the WAN2.7 model with reference images, reference
    videos, and audio inputs
  license:
    name: MIT
  version: 1.0.0
servers:
  - url: https://api.evolink.ai
    description: Production Environment
security:
  - bearerAuth: []
paths:
  /v1/videos/generations:
    post:
      tags:
        - Video Generation
      summary: wan2.7-reference-video API
      description: >-
        - The WAN2.7 (`wan2.7-reference-video`) model supports
        reference-to-video generation: it uses people or objects as
        protagonists to produce single-character performances or
        multi-character interactions

        - Multimodal inputs: starting frame (`image_start`), multiple reference
        images (`image_urls`), multiple reference videos (`video_urls`), and
        per-character voice bindings

        - **At least one** reference image (`image_urls`) or reference video
        (`video_urls`) must be provided; passing only `image_start` does not
        satisfy this. The total of `image_urls` + `video_urls` must be ≤ 5

        - **Character indexing in prompt:** in Chinese use "图1, 图2 / 视频1, 视频2";
        in English use "Image 1", "Video 1". The numbers are 1-based and follow
        the order of `image_urls` / `video_urls`. Images and videos are counted
        independently, so "Image 1" and "Video 1" can coexist

        - **Multi-character voice binding:** prefer
        `model_params.voice_bindings` (precise binding); the legacy `audio_urls`
        (positional alignment) is also supported

        - Processing is asynchronous: use the returned task ID to [query
        status](/en/api-manual/task-management/get-task-detail)

        - Generated video links are valid for 24 hours; save them promptly

        - **Billing:** charged based on "input video duration + output video
        duration"; only successful generations are billed, and failed tasks are
        free
      operationId: createWan27ReferenceVideoGeneration
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/Wan27ReferenceVideoRequest'
            examples:
              single_reference_video:
                summary: Single reference video
                value:
                  model: wan2.7-reference-video
                  prompt: The character from the reference video dancing on a meadow
                  video_urls:
                    - https://example.com/reference.mp4
              multi_subject_with_voice_bindings:
                summary: Multi-subject reference + precise voice binding (recommended)
                value:
                  model: wan2.7-reference-video
                  prompt: >-
                    Image 1 holds Image 2 and plays a soft country folk song on
                    the chair in Image 3, saying: "What lovely sunshine today"
                  image_urls:
                    - https://example.com/girl.jpg
                    - https://example.com/object.png
                    - https://example.com/chair.png
                  model_params:
                    voice_bindings:
                      image1: https://example.com/girl_voice.mp3
                  duration: 10
              multi_grid_storyboard:
                summary: Single reference image (multi-grid storyboard)
                value:
                  model: wan2.7-reference-video
                  prompt: >-
                    Reference image, 3D cartoon adventure film style.
                    Storyboard: 1. Wide shot of the fantasy forest; 2. The boy
                    parts vines to scout; 3. The little robot scans ahead; 4.
                    Close-up of treasure map; 5. The boy's excited face; 6. They
                    leap over roots and venture deeper
                  image_urls:
                    - https://example.com/storyboard.png
                  duration: 10
      responses:
        '200':
          description: Video task created successfully
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/VideoGenerationResponse'
        '400':
          description: Invalid request parameters
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: invalid_request
                  message: Invalid request parameters
                  type: invalid_request_error
        '401':
          description: Unauthenticated, invalid or expired token
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: unauthorized
                  message: Invalid or expired token
                  type: authentication_error
        '402':
          description: Insufficient quota, top-up required
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: insufficient_quota
                  message: Insufficient quota. Please top up your account.
                  type: insufficient_quota
        '403':
          description: No access permission
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: model_access_denied
                  message: 'Token does not have access to model: wan2.7-reference-video'
                  type: invalid_request_error
        '429':
          description: Rate limit exceeded
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: rate_limit_exceeded
                  message: Too many requests, please try again later
                  type: rate_limit_error
        '500':
          description: Internal server error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: internal_error
                  message: Internal server error
                  type: api_error
components:
  schemas:
    Wan27ReferenceVideoRequest:
      required:
        - model
        - prompt
      type: object
      properties:
        model:
          type: string
          description: Model name, must be `wan2.7-reference-video`
          enum:
            - wan2.7-reference-video
          example: wan2.7-reference-video
        prompt:
          type: string
          description: >-
            Text prompt for video generation. Supports Chinese and English; each
            character / letter / punctuation counts as 1, with overflow
            auto-truncated. Maximum length 5000 characters


            **Character indexing rules:**

            - Chinese: use "图1, 图2 / 视频1, 视频2"; the numbers are 1-based and
            follow the order of `image_urls` / `video_urls`

            - English: use "Image 1", "Video 1" (capitalised, with a space
            between word and digit)

            - Images and videos are counted independently, so "Image 1" and
            "Video 1" can coexist

            - If only one reference image or one reference video is provided,
            you can simply write "the reference image" or "the reference video"


            **Multi-grid (storyboard) image:** when one multi-grid image is
            provided, describe key shots in storyboard form; the model
            recognises the grid layout and fills in the missing transitions
          maxLength: 5000
          example: >-
            Video 1 holds Image 3 and plays a soft country folk tune on the
            chair in Image 4
        negative_prompt:
          type: string
          description: >-
            Negative prompt describing what should not appear in the video.
            Supports both Chinese and English. Maximum length 500 characters;
            overflow is auto-truncated
          maxLength: 500
          example: Blurry, low quality
        image_start:
          type: string
          format: uri
          description: >-
            Starting-frame image URL, used as the first frame of the generated
            video. **Does not count** toward the `image_urls` + `video_urls` ≤ 5
            limit. **Does not accept voice binding** (the starting frame itself
            is not assigned a voice)


            **Use cases:**

            - Subject already appears in the starting frame: combine with
            reference materials to reinforce identity consistency

            - Subject not in the starting frame: reference materials define new
            subjects appearing as the video progresses


            **Image limits:**

            - Formats: JPEG, JPG, PNG (transparency not supported), BMP, WEBP

            - Resolution: width and height in `[240, 8000]` pixels

            - Aspect ratio: 1:8 ~ 8:1

            - File size: up to `20MB`
          example: https://example.com/first_frame.jpg
        image_urls:
          type: array
          items:
            type: string
            format: uri
          description: >-
            Reference image URL array. Can supply subjects (people / animals /
            objects) or scene backgrounds; when a subject is included, each
            image should contain a **single** character


            **Quantity limits:**

            - `image_urls` + `video_urls` total ≤ 5

            - At least one of `image_urls` / `video_urls` must be provided
            (passing only `image_start` is not enough)


            **Image limits:**

            - Formats: JPEG, JPG, PNG (transparency not supported), BMP, WEBP

            - Resolution: width and height in `[240, 8000]` pixels

            - Aspect ratio: 1:8 ~ 8:1

            - File size: up to `20MB`
          example:
            - https://example.com/ref1.jpg
            - https://example.com/ref2.jpg
        video_urls:
          type: array
          items:
            type: string
            format: uri
          description: >-
            Reference video URL array. The video should ideally feature a
            subject (person / animal / object); empty or pure-background footage
            is discouraged. When a subject is included, each video should
            contain a **single** character. Audio in the video can be used as a
            voice reference


            **Quantity limits:**

            - `image_urls` + `video_urls` total ≤ 5

            - At least one of `image_urls` / `video_urls` must be provided


            **Video limits:**

            - Formats: mp4, mov

            - Duration: `1 ~ 30` seconds

            - Resolution: width and height in `[240, 4096]` pixels

            - Aspect ratio: 1:8 ~ 8:1

            - File size: up to `100MB`


            **Note:** when `video_urls` is provided, `duration` is capped at 10
            seconds
          example:
            - https://example.com/reference.mp4
        audio_urls:
          type: array
          items:
            type: string
            format: uri
          maxItems: 5
          description: >-
            **[Compatibility field — prefer `model_params.voice_bindings`]**


            Reference voice URL array. Bound positionally to reference materials
            in this order: first match against `video_urls`, then against
            `image_urls` (in their array order, one-to-one). Up to 5 elements


            **Priority:**

            - When both `model_params.voice_bindings` and `audio_urls` are
            supplied, only `voice_bindings` is used and this field is ignored

            - If a video in `video_urls` carries audio and no voice binding is
            set for it, the original audio is used; an explicit voice binding
            overrides the original audio


            **Audio limits:**

            - Supported formats: `wav`, `mp3`

            - Duration range: `1 ~ 10` seconds

            - File size: up to `15MB`
          example:
            - https://example.com/voice1.mp3
            - https://example.com/voice2.mp3
        model_params:
          type: object
          description: Container for advanced parameters such as `voice_bindings` (recommended)
          properties:
            voice_bindings:
              type: object
              description: >-
                **Precise voice binding for multiple characters (recommended).**
                Has priority over `audio_urls`


                **Binding rules:**

                - Keys are of the form `image{N}` or `video{N}` (1-based, no
                leading zeros, e.g. `image1`, `video2`)

                - The N in the key matches "图N / Image N" or "视频N / Video N" in
                the prompt and corresponds to `image_urls[N-1]` /
                `video_urls[N-1]`

                - Each value is the voice audio URL for that character

                - Total bindings ≤ 5

                - `image_start` (starting frame) **does not accept** voice
                bindings

                - You may skip some characters (e.g. bind only image1 and
                image3, leaving image2 unbound)


                **Audio limits:**

                - Supported formats: `wav`, `mp3`

                - Duration range: `1 ~ 10` seconds

                - File size: up to `15MB`


                **Validation errors:**

                - Bad key format: `voice_bindings key "img1"/"image01" invalid`

                - Index out of range: `voice_bindings.image5 out of range`

                - An empty value, a duplicate binding, or more than 5 bindings
                is also rejected
              additionalProperties:
                type: string
                format: uri
              example:
                image1: https://example.com/voice_a.mp3
                image3: https://example.com/voice_c.mp3
                video1: https://example.com/voice_v.mp3
        quality:
          type: string
          description: |-
            Video quality, defaults to `720p`

            **Options:**
            - `720p`: Standard definition at the standard price (default)
            - `1080p`: High definition, higher price
          enum:
            - 720p
            - 1080p
          default: 720p
          example: 720p
        aspect_ratio:
          type: string
          description: >-
            Video aspect ratio, defaults to `16:9`


            **Behavior:**

            - `image_start` not provided: video is generated using the specified
            `aspect_ratio`

            - `image_start` provided: this field is **ignored**; the video uses
            an aspect ratio close to the starting-frame image


            **Output resolution per quality tier:**


            | Quality | 16:9 | 9:16 | 1:1 | 4:3 | 3:4 |

            | --- | --- | --- | --- | --- | --- |

            | 720p | 1280×720 | 720×1280 | 960×960 | 1104×832 | 832×1104 |

            | 1080p | 1920×1080 | 1080×1920 | 1440×1440 | 1648×1248 | 1248×1648 |
          enum:
            - '16:9'
            - '9:16'
            - '1:1'
            - '4:3'
            - '3:4'
          default: '16:9'
          example: '16:9'
        duration:
          type: number
          description: |-
            Video duration in seconds (integer)

            **Range:**
            - Without `video_urls`: `2 ~ 15`, default `5`
            - With `video_urls`: `2 ~ 10` (capped at 10 seconds)

            **Billing:** based on the actual generated video duration
          minimum: 2
          maximum: 15
          default: 5
          example: 5
        seed:
          type: integer
          description: >-
            Random seed, defaults to random


            **Notes:**

            - Range: `1` ~ `2147483647`

            - Fixing the seed reduces variation when iterating on prompts and
            improves reproducibility
          minimum: 1
          maximum: 2147483647
          example: 42
        prompt_extend:
          type: boolean
          description: >-
            Whether to enable intelligent prompt rewriting. When enabled, a
            large model will optimize the prompt, which significantly improves
            results for simple or insufficiently descriptive prompts.


            **Note:** Default is `false`. Omitting the field or sending `false`
            will not trigger rewriting; explicitly send `true` to enable.
          default: false
          example: false
        callback_url:
          type: string
          description: >-
            HTTPS callback URL for task completion


            **Callback Timing:**

            - Triggered when the task is completed, failed, or cancelled

            - Sent after billing confirmation


            **Security Restrictions:**

            - Only HTTPS protocol is supported

            - Callbacks to internal IP addresses are prohibited (127.0.0.1,
            10.x.x.x, 172.16-31.x.x, 192.168.x.x, etc.)

            - URL length must not exceed `2048` characters


            **Callback Mechanism:**

            - Timeout: `10` seconds

            - Up to `3` retries after a failed delivery, sent `1`, `2`, and `4`
            seconds after the failure

            - Callback response format is consistent with the task query API
            response

            - 2xx status codes are considered successful; any other status code
            triggers a retry
          format: uri
          example: https://your-domain.com/webhooks/video-task-completed
    VideoGenerationResponse:
      type: object
      properties:
        created:
          type: integer
          description: Task creation time as a Unix timestamp (seconds)
          example: 1757169743
        id:
          type: string
          description: Task ID
          example: task-unified-1757169743-7cvnl5zw
        model:
          type: string
          description: Actual model name used
          example: wan2.7-reference-video
        object:
          type: string
          enum:
            - video.generation.task
          description: Specific task type
        progress:
          type: integer
          description: Task progress percentage (0-100)
          minimum: 0
          maximum: 100
          example: 0
        status:
          type: string
          description: Task status
          enum:
            - pending
            - processing
            - completed
            - failed
          example: pending
        task_info:
          $ref: '#/components/schemas/VideoTaskInfo'
          description: Detailed video task information
        type:
          type: string
          enum:
            - text
            - image
            - audio
            - video
          description: Task output type
          example: video
        usage:
          $ref: '#/components/schemas/Usage'
          description: Usage and billing information
    ErrorResponse:
      type: object
      properties:
        error:
          type: object
          properties:
            code:
              type: string
              description: Error code identifier
            message:
              type: string
              description: Error description
            type:
              type: string
              description: Error type
    VideoTaskInfo:
      type: object
      properties:
        can_cancel:
          type: boolean
          description: Whether the task can be cancelled
          example: true
        estimated_time:
          type: integer
          description: Estimated completion time (seconds)
          minimum: 0
          example: 120
    Usage:
      type: object
      description: Usage and billing information
      properties:
        billing_rule:
          type: string
          description: Billing rule
          enum:
            - per_call
            - per_token
            - per_second
          example: per_call
        credits_reserved:
          type: number
          description: Estimated credits to be consumed
          minimum: 0
          example: 5
        user_group:
          type: string
          description: User group category
          enum:
            - default
            - vip
          example: default
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      description: >-
        ## All APIs require Bearer Token authentication ##


        **Get your API Key:**


        Visit the [API Key management page](https://evolink.ai/dashboard/keys)
        to obtain your API Key


        **Add to request headers:**

        ```

        Authorization: Bearer YOUR_API_KEY

        ```

````
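As a rough illustration of how the pieces of the spec fit together, the sketch below prepares a `POST /v1/videos/generations` request with the Bearer header and reads the task ID from the response. It uses only the Python standard library. The function names are hypothetical, error handling for the 4xx/5xx responses is omitted, and `submit` has not been run against the live service.

```python
import json
import urllib.request

# Server URL and path as given in the spec above.
API_URL = "https://api.evolink.ai/v1/videos/generations"


def build_generation_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for a generation task."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(api_key: str, payload: dict) -> str:
    """Send the request and return the `id` field of the response body."""
    request = build_generation_request(api_key, payload)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["id"]


request = build_generation_request(
    "YOUR_API_KEY",
    {
        "model": "wan2.7-reference-video",
        "prompt": "The character from the reference video dancing on a meadow",
        "video_urls": ["https://example.com/reference.mp4"],
    },
)
print(request.get_method(), request.full_url)
```

The returned task ID is what the task query endpoint expects. Because generated video links expire after 24 hours, a caller would normally download the result as soon as the task reports `completed`.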