Wan2.7 Reference Video

Authorizations

Authorization

string

header

required

All APIs require Bearer Token authentication

Get your API Key:

Visit the API Key management page to obtain your API Key

Add to request headers:

Authorization: Bearer YOUR_API_KEY
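As a minimal sketch, the header above could be built in Python as follows. The helper name build_headers and the environment variable EXAMPLE_API_KEY are illustrative assumptions, not part of the API; the Content-Type value follows from the application/json body described below.

```python
import os


def build_headers(api_key: str) -> dict[str, str]:
    """Return request headers carrying the Bearer token."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# EXAMPLE_API_KEY is a hypothetical environment variable name.
headers = build_headers(os.environ.get("EXAMPLE_API_KEY", "YOUR_API_KEY"))
```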

Body

application/json

model

enum<string>

required

Model name, must be wan2.7-reference-video

Available options:

wan2.7-reference-video

Example:

"wan2.7-reference-video"

prompt

string

required

Text prompt for video generation. Supports Chinese and English. Every character, letter, and punctuation mark counts as 1 toward the maximum length of 5000 characters; text beyond the limit is truncated automatically

Character indexing rules:

Chinese: use "图1", "图2" for images and "视频1", "视频2" for videos; the numbers are 1-based and follow the order of image_urls / video_urls
English: use "Image 1", "Video 1" (capitalised, with a space between word and digit)
Images and videos are counted independently, so "Image 1" and "Video 1" can coexist
If only one reference image or one reference video is provided, you can simply write "the reference image" or "the reference video"
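The indexing rules above can be sketched as a small helper that maps each prompt label to the URL it refers to. The function name reference_labels is a hypothetical convenience for checking prompts on the client side; the API itself only sees the prompt text and the two arrays.

```python
def reference_labels(image_urls, video_urls, chinese=False):
    """Map prompt labels ("Image 1", "Video 1", ...) to their reference URLs.

    Images and videos are numbered independently, both starting at 1.
    """
    if chinese:
        image_word, video_word, sep = "图", "视频", ""
    else:
        image_word, video_word, sep = "Image", "Video", " "

    labels = {}
    for i, url in enumerate(image_urls, start=1):
        labels[f"{image_word}{sep}{i}"] = url
    for i, url in enumerate(video_urls, start=1):
        labels[f"{video_word}{sep}{i}"] = url
    return labels


labels = reference_labels(
    ["https://example.com/ref1.jpg", "https://example.com/ref2.jpg"],
    ["https://example.com/reference.mp4"],
)
```

Here "Image 1" and "Video 1" are both present in the result, because the two counters do not share numbers.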

Multi-grid (storyboard) image: when one multi-grid image is provided, describe key shots in storyboard form; the model recognises the grid layout and fills in the missing transitions

Maximum string length: 5000

Example:

"Video 1 holds Image 3 and plays a soft country folk tune on the chair in Image 4"
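A request body matching this example prompt might be assembled as below. This is a sketch only: the example.com URLs are placeholders, and the prompt's "Image 3" and "Image 4" resolve only because image_urls holds four entries. With one reference video, the total of five references sits exactly at the documented limit.

```python
import json

# Placeholder reference materials: four images and one video.
image_urls = [f"https://example.com/ref{i}.jpg" for i in range(1, 5)]
video_urls = ["https://example.com/reference.mp4"]

payload = {
    "model": "wan2.7-reference-video",
    "prompt": (
        "Video 1 holds Image 3 and plays a soft country folk tune "
        "on the chair in Image 4"
    ),
    "image_urls": image_urls,
    "video_urls": video_urls,
}

# ensure_ascii=False keeps Chinese prompt text readable in the JSON body.
body = json.dumps(payload, ensure_ascii=False)
```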

negative_prompt

string

Negative prompt describing what should not appear in the video. Supports both Chinese and English. Maximum length 500 characters; overflow is auto-truncated

Maximum string length: 500

Example:

"Blurry, low quality"

image_start

string<uri>

Starting-frame image URL, used as the first frame of the generated video. Does not count toward the image_urls + video_urls ≤ 5 limit. Does not accept voice binding (the starting frame itself is not assigned a voice)

Use cases:

Subject already appears in the starting frame: combine with reference materials to reinforce identity consistency
Subject not in the starting frame: reference materials define new subjects appearing as the video progresses

Image limits:

Formats: JPEG, JPG, PNG (transparency not supported), BMP, WEBP
Resolution: width and height in [240, 8000] pixels
Aspect ratio: 1:8 ~ 8:1
File size: up to 20MB
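The image limits above can be checked on the client before uploading. The sketch below assumes the caller already knows the format, pixel dimensions, and byte size of the file; the name check_image is hypothetical, and treating 1 MB as 1024 × 1024 bytes is an assumption, since the unit is not defined here.

```python
ALLOWED_IMAGE_FORMATS = {"jpeg", "jpg", "png", "bmp", "webp"}
MAX_IMAGE_BYTES = 20 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def check_image(fmt, width, height, size_bytes):
    """Return a list of limit violations; an empty list means the image passes."""
    problems = []
    if fmt.lower() not in ALLOWED_IMAGE_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    for name, value in (("width", width), ("height", height)):
        if not 240 <= value <= 8000:
            problems.append(f"{name} {value} outside [240, 8000]")
    # Aspect ratio must lie between 1:8 and 8:1; integer math avoids rounding.
    if width * 8 < height or height * 8 < width:
        problems.append(f"aspect ratio {width}:{height} outside 1:8 ~ 8:1")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append("file larger than 20MB")
    return problems
```

Note that an image can satisfy the resolution range and still fail the aspect-ratio rule, for example 240 × 8000. PNG transparency is not checked here because it requires reading the file itself.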

Example:

"https://example.com/first_frame.jpg"

image_urls

string<uri>[]

Reference image URL array. Can supply subjects (people / animals / objects) or scene backgrounds; when a subject is included, each image should contain a single character

Quantity limits:

image_urls + video_urls total ≤ 5
At least one of image_urls / video_urls must be provided (passing only image_start is not enough)
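A minimal client-side check for the quantity limits above might look like this. The function name check_reference_counts is hypothetical; image_start is deliberately left out because it does not count toward the total.

```python
def check_reference_counts(image_urls=None, video_urls=None):
    """Validate the combined reference count and return it."""
    images = image_urls or []
    videos = video_urls or []
    total = len(images) + len(videos)
    if total == 0:
        raise ValueError("provide at least one of image_urls / video_urls")
    if total > 5:
        raise ValueError(f"image_urls + video_urls total is {total}, limit is 5")
    return total
```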

Image limits:

Formats: JPEG, JPG, PNG (transparency not supported), BMP, WEBP
Resolution: width and height in [240, 8000] pixels
Aspect ratio: 1:8 ~ 8:1
File size: up to 20MB

Example:

[
  "https://example.com/ref1.jpg",
  "https://example.com/ref2.jpg"
]

video_urls

string<uri>[]

Reference video URL array. The video should ideally feature a subject (person / animal / object); empty or pure-background footage is discouraged. When a subject is included, each video should contain a single character. Audio in the video can be used as a voice reference

Quantity limits:

image_urls + video_urls total ≤ 5
At least one of image_urls / video_urls must be provided

Video limits:

Formats: mp4, mov
Duration: 1 ~ 30 seconds
Resolution: width and height in [240, 4096] pixels
Aspect ratio: 1:8 ~ 8:1
File size: up to 100MB

Note: when video_urls is provided, the duration parameter (length of the generated video) is capped at 10 seconds

Example:

["https://example.com/reference.mp4"]

audio_urls

string<uri>[]

[Compatibility field; prefer model_params.voice_bindings]

Reference voice URL array. Bound positionally to reference materials in this order: first match against video_urls, then against image_urls (in their array order, one-to-one). Up to 5 elements

Priority:

When both model_params.voice_bindings and audio_urls are supplied, only voice_bindings is used and this field is ignored
If a video in video_urls carries audio and no voice binding is set for it, the original audio is used; an explicit voice binding overrides the original audio
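One reading of the positional rule is sketched below: voices are paired first with the entries of video_urls, then with the entries of image_urls, each in array order. The function bind_voices is purely illustrative of that ordering and is an interpretation of the description above, not a confirmed server behavior.

```python
def bind_voices(audio_urls, video_urls, image_urls):
    """Pair each voice with a reference: videos first, then images."""
    targets = [(f"Video {i}", url) for i, url in enumerate(video_urls, start=1)]
    targets += [(f"Image {i}", url) for i, url in enumerate(image_urls, start=1)]
    # zip stops at the shorter sequence, so surplus references get no voice.
    return [(label, ref, voice) for (label, ref), voice in zip(targets, audio_urls)]


bindings = bind_voices(
    ["https://example.com/voice1.mp3", "https://example.com/voice2.mp3"],
    ["https://example.com/reference.mp4"],
    ["https://example.com/ref1.jpg", "https://example.com/ref2.jpg"],
)
```

Under this reading, voice1 goes to Video 1 and voice2 to Image 1, while Image 2 receives no voice.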

Audio limits:

Supported formats: wav, mp3
Duration range: 1 ~ 10 seconds
File size: up to 15MB

Maximum array length: 5

Example:

[
  "https://example.com/voice1.mp3",
  "https://example.com/voice2.mp3"
]

model_params

object

Advanced parameter container (recommended)


quality

enum<string>

default:720p

Video quality, defaults to 720p

Options:

720p: standard definition at the standard price (default)
1080p: High definition, higher price

Available options:

720p, 1080p

Example:

"720p"

aspect_ratio

enum<string>

default:16:9

Video aspect ratio, defaults to 16:9

Behavior:

image_start not provided: video is generated using the specified aspect_ratio
image_start provided: this field is ignored; the video uses an aspect ratio close to the starting-frame image

Output resolution per quality tier:

Quality	16:9	9:16	1:1	4:3	3:4
720p	1280×720	720×1280	960×960	1104×832	832×1104
1080p	1920×1080	1080×1920	1440×1440	1648×1248	1248×1648
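The table above can be expressed as a lookup, which is handy for predicting output dimensions when image_start is not provided. The names RESOLUTIONS and output_resolution are hypothetical; the values are copied from the table.

```python
# (width, height) in pixels, keyed by quality tier and aspect ratio.
RESOLUTIONS = {
    "720p": {
        "16:9": (1280, 720),
        "9:16": (720, 1280),
        "1:1": (960, 960),
        "4:3": (1104, 832),
        "3:4": (832, 1104),
    },
    "1080p": {
        "16:9": (1920, 1080),
        "9:16": (1080, 1920),
        "1:1": (1440, 1440),
        "4:3": (1648, 1248),
        "3:4": (1248, 1648),
    },
}


def output_resolution(quality="720p", aspect_ratio="16:9"):
    """Look up the output size; raises KeyError for unsupported values."""
    return RESOLUTIONS[quality][aspect_ratio]
```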

Available options:

16:9, 9:16, 1:1, 4:3, 3:4

Example:

"16:9"

duration

number

default:5

Duration of the generated video in seconds; must be a whole number

Range:

Without video_urls: 2 ~ 15, default 5
With video_urls: 2 ~ 10 (capped at 10 seconds)
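The range rule above depends on whether video_urls is present. A small validation sketch, with the hypothetical name check_duration, could be:

```python
def check_duration(duration, video_urls=None):
    """Validate duration against the 2 ~ 15 (or 2 ~ 10 with video_urls) range."""
    if int(duration) != duration:
        raise ValueError("duration must be a whole number of seconds")
    upper = 10 if video_urls else 15
    if not 2 <= duration <= upper:
        raise ValueError(f"duration {duration} outside 2 ~ {upper}")
    return int(duration)
```

For example, a duration of 12 is accepted for an image-only request but rejected as soon as a reference video is included.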

Billing: based on the actual generated video duration

Required range: 2 <= x <= 15

Example:

5

seed

integer

Random seed. If omitted, a random seed is chosen

Notes:

Range: 1 ~ 2147483647
Fixing the seed reduces variation when iterating on prompts and improves reproducibility

Required range: 1 <= x <= 2147483647

Example:

42

prompt_extend

boolean

default:false

Whether to enable intelligent prompt rewriting. When enabled, a large model will optimize the prompt, which significantly improves results for simple or insufficiently descriptive prompts.

Note: Default is false. Omitting the field or sending false will not trigger rewriting; explicitly send true to enable.

Example:

false

callback_url

string<uri>

HTTPS callback URL for task completion

Callback Timing:

Triggered when the task is completed, failed, or cancelled
Sent after billing confirmation

Security Restrictions:

Only HTTPS protocol is supported
Callbacks to internal IP addresses are prohibited (127.0.0.1, 10.x.x.x, 172.16-31.x.x, 192.168.x.x, etc.)
URL length must not exceed 2048 characters
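The restrictions above can be pre-checked before submitting a task. The sketch below uses only the Python standard library; the name check_callback_url is hypothetical, and it inspects literal IP addresses only. It does not resolve hostnames, so a domain that points at an internal address would still pass this check.

```python
import ipaddress
from urllib.parse import urlsplit


def check_callback_url(url):
    """Return a list of restriction violations; empty means the URL passes."""
    problems = []
    if len(url) > 2048:
        problems.append("URL longer than 2048 characters")
    parts = urlsplit(url)
    if parts.scheme != "https":
        problems.append("only HTTPS is supported")
    host = parts.hostname or ""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # a hostname, not a literal IP; DNS is not resolved here
    if ip is not None and (ip.is_private or ip.is_loopback):
        problems.append(f"internal IP address not allowed: {host}")
    return problems
```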

Callback Mechanism:

Timeout: 10 seconds
Up to 3 retries after a failed delivery, at 1, 2, and 4 seconds after the failure
Callback response format is consistent with the task query API response
2xx status codes are considered successful; any other status code triggers a retry
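To make the retry behavior concrete, the sketch below simulates the sender's side under one interpretation: an initial attempt, then up to three retries with waits of 1, 2, and 4 seconds between attempts. Whether the delays are measured between attempts or from the first failure is not stated, so this is an assumption. The names deliver, send, and sleep are hypothetical; send stands in for one HTTP delivery and returns a status code.

```python
RETRY_DELAYS = (1, 2, 4)  # seconds; assumed to be waits between attempts


def deliver(send, sleep):
    """Try one delivery plus up to three retries.

    Returns (success, attempts). A 2xx status counts as success; any other
    status or a timeout (raised as OSError/TimeoutError) triggers a retry.
    """
    attempts = 0
    for delay in (0,) + RETRY_DELAYS:
        if delay:
            sleep(delay)
        attempts += 1
        try:
            status = send()
        except OSError:
            continue
        if 200 <= status < 300:
            return True, attempts
    return False, attempts
```

For a receiver, the practical consequence is that the endpoint should answer with a 2xx status within the 10-second timeout and tolerate the same notification arriving more than once.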

Example:

"https://your-domain.com/webhooks/video-task-completed"

Response

Video task created successfully

created

integer

Task creation timestamp

Example:

1757169743

string

Task ID

Example:

"task-unified-1757169743-7cvnl5zw"

model

string

Actual model name used

Example:

"wan2.7-reference-video"

object

enum<string>

Specific task type

Available options:

video.generation.task

progress

integer

Task progress percentage (0-100)

Required range: 0 <= x <= 100

Example:

0

status

enum<string>

Task status

Available options:

pending, processing, completed, failed

Example:

"pending"
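When handling a response, it can help to separate in-flight statuses from final ones. The sketch below assumes that completed and failed are the final states among the four listed; the helper names are hypothetical, and the response is treated as an already-parsed dict.

```python
# Assumption: these two statuses mean the task will not change further.
TERMINAL_STATUSES = {"completed", "failed"}


def is_finished(task):
    """Return True once the task has reached a final status."""
    return task.get("status") in TERMINAL_STATUSES


def summarize(task):
    """Build a short human-readable progress line from a task response."""
    return f"{task.get('status', 'unknown')} ({task.get('progress', 0)}%)"


task = {"status": "pending", "progress": 0, "model": "wan2.7-reference-video"}
```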

task_info

object

Detailed video task information


type

enum<string>

Task output type

Available options:

text,

image,

audio,

video

Example:

"video"

usage

object

Usage and billing information

