- Supports a starting frame (image_start), multiple reference images (image_urls), multiple reference videos (video_urls), and per-character voice bindings
- At least one reference image (image_urls) or reference video (video_urls) must be provided; passing only image_start does not satisfy this. The total of image_urls + video_urls must be ≤ 5
- Prompts refer to characters by their order in image_urls / video_urls. Images and videos are counted independently, so “Image 1” and “Video 1” can coexist
- Voices are bound through model_params.voice_bindings (precise binding); the legacy audio_urls (positional alignment) is also supported

Documentation Index
Fetch the complete documentation index at: https://docs.evolink.ai/llms.txt
Use this file to discover all available pages before exploring further.
Get your API Key:
Visit the API Key management page to obtain your API Key
Add to request headers:
Authorization: Bearer YOUR_API_KEY

Model name, must be wan2.7-reference-video
Available options: wan2.7-reference-video
Example: "wan2.7-reference-video"
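The header format and fixed model name above could be assembled as in the sketch below. The `model` field name, the JSON content type, and both helper names are assumptions, since this section gives only the header format and the allowed model value. No endpoint URL is used because none appears here.

```python
MODEL_NAME = "wan2.7-reference-video"  # the only model value this page allows


def build_headers(api_key: str) -> dict[str, str]:
    """Return request headers carrying the Bearer token."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Authorization": f"Bearer {api_key}",
        # Assumption: the request body is sent as JSON.
        "Content-Type": "application/json",
    }


def base_payload() -> dict[str, str]:
    """Start a request body; 'model' is an assumed field name."""
    return {"model": MODEL_NAME}
```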
Text prompt for video generation. Supports Chinese and English; each character / letter / punctuation counts as 1, with overflow auto-truncated. Maximum length 5000 characters
Character indexing rules:
- Refer to each reference as “Image N” or “Video N”, numbered by its order in image_urls / video_urls
- Multi-grid (storyboard) image: when one multi-grid image is provided, describe key shots in storyboard form; the model recognises the grid layout and fills in the missing transitions
Maximum length: 5000
Example: "Video 1 holds Image 3 and plays a soft country folk tune on the chair in Image 4"
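A prompt can be checked locally for references that point past the supplied arrays. The sketch below assumes numbering is 1-based and follows array order, with images and videos counted independently. The function name and the exact "Image N" / "Video N" spelling it matches are illustrative, taken from the example prompt rather than from a formal grammar.

```python
import re

# Matches references such as "Image 3" or "Video 1", as written in the example prompt.
_REFERENCE = re.compile(r"\b(Image|Video)\s+(\d+)\b")


def find_bad_references(prompt: str, n_images: int, n_videos: int) -> list[str]:
    """Return prompt references that point past the supplied arrays.

    Assumption: numbering is 1-based, and images and videos are numbered
    independently in the order they appear in image_urls / video_urls.
    """
    limits = {"Image": n_images, "Video": n_videos}
    bad = []
    for kind, number in _REFERENCE.findall(prompt):
        index = int(number)
        if index < 1 or index > limits[kind]:
            bad.append(f"{kind} {index}")
    return bad
```

With the example prompt, four images and one video yield no problems, while supplying only two images flags "Image 3" and "Image 4".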
Negative prompt describing what should not appear in the video. Supports both Chinese and English. Maximum length 500 characters; overflow is auto-truncated
Maximum length: 500
Example: "Blurry, low quality"
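Because overflow is truncated silently, it can help to preview locally what would be dropped. This is a rough sketch. It assumes the service counts characters the way Python's `len()` counts code points and cuts from the end, neither of which is confirmed here. The constant and function names are hypothetical.

```python
PROMPT_LIMIT = 5000           # maximum prompt length stated above
NEGATIVE_PROMPT_LIMIT = 500   # maximum negative prompt length stated above


def preview_truncation(text: str, limit: int) -> tuple[str, int]:
    """Return the text clipped to `limit` and the number of characters dropped.

    len() counts Unicode code points, so a Chinese character, a Latin
    letter and a punctuation mark each count as 1.
    Assumption: the service truncates from the end of the string.
    """
    dropped = max(0, len(text) - limit)
    return text[:limit], dropped
```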
Starting-frame image URL, used as the first frame of the generated video. Does not count toward the image_urls + video_urls ≤ 5 limit. Does not accept voice binding (the starting frame itself is not assigned a voice)
Use cases:
Image limits:
- Dimensions: [240, 8000] pixels
- File size: up to 20MB
Example: "https://example.com/first_frame.jpg"
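The image limits can be pre-checked before upload. The sketch below rests on two assumptions: the [240, 8000] pixel range applies to both width and height, and 20MB means 20 × 1024 × 1024 bytes. The function takes the dimensions and size as plain numbers, so how they are obtained is left to the caller.

```python
MIN_SIDE, MAX_SIDE = 240, 8000        # pixel range stated above
MAX_IMAGE_BYTES = 20 * 1024 * 1024    # assumption: 20MB is read as 20 MiB


def image_within_limits(width: int, height: int, size_bytes: int) -> bool:
    """Check an image against the stated limits.

    Assumption: the pixel range applies to width and height alike.
    """
    sides_ok = all(MIN_SIDE <= side <= MAX_SIDE for side in (width, height))
    return sides_ok and 0 < size_bytes <= MAX_IMAGE_BYTES
```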
Reference image URL array. Can supply subjects (people / animals / objects) or scene backgrounds; when a subject is included, each image should contain a single character
Quantity limits:
- image_urls + video_urls total ≤ 5
- At least one of image_urls / video_urls must be provided (passing only image_start is not enough)
Image limits:
- Dimensions: [240, 8000] pixels
- File size: up to 20MB
Example:
[
"https://example.com/ref1.jpg",
"https://example.com/ref2.jpg"
]

Reference video URL array. The video should ideally feature a subject (person / animal / object); empty or pure-background footage is discouraged. When a subject is included, each video should contain a single character. Audio in the video can be used as a voice reference
Quantity limits:
- image_urls + video_urls total ≤ 5
- At least one of image_urls / video_urls must be provided
Video limits:
- Duration: 1 ~ 30 seconds
- Dimensions: [240, 4096] pixels
- File size: up to 100MB
Note: when video_urls is provided, the generated video's duration is capped at 10 seconds
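The quantity rules for reference materials can be expressed as a small pre-flight check. This is a minimal sketch with a hypothetical function name. It encodes only the two rules stated here: at least one reference image or video, and a combined total of at most 5, with image_start not counted.

```python
from typing import Optional


def validate_references(
    image_urls: Optional[list[str]] = None,
    video_urls: Optional[list[str]] = None,
    image_start: Optional[str] = None,
) -> list[str]:
    """Return a list of rule violations (an empty list means the inputs pass)."""
    images = image_urls or []
    videos = video_urls or []
    errors = []
    # image_start is accepted but deliberately ignored: it neither satisfies
    # the "at least one reference" rule nor counts toward the total of 5.
    if not images and not videos:
        errors.append("provide at least one entry in image_urls or video_urls")
    if len(images) + len(videos) > 5:
        errors.append("image_urls + video_urls must total 5 or fewer")
    return errors
```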
Example: ["https://example.com/reference.mp4"]

[Compatibility field; prefer model_params.voice_bindings]
Reference voice URL array. Bound positionally to reference materials in this order: first match against video_urls, then against image_urls (in their array order, one-to-one). Up to 5 elements
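The positional order described above (videos first, then images, each in array order) can be illustrated as follows. This sketch only mirrors the documented ordering on the client side. The function name is hypothetical, and leaving surplus audio URLs unpaired is an assumption, since the behaviour for extra entries is not stated.

```python
def bind_voices_positionally(
    video_urls: list[str],
    image_urls: list[str],
    audio_urls: list[str],
) -> list[tuple[str, str, str]]:
    """Pair each audio URL with a reference: videos first, then images."""
    if len(audio_urls) > 5:
        raise ValueError("audio_urls accepts at most 5 elements")
    targets = [("video", url) for url in video_urls]
    targets += [("image", url) for url in image_urls]
    # zip stops at the shorter sequence; treating surplus audio URLs as
    # unbound is an assumption, not documented behaviour.
    return [(kind, ref, audio) for (kind, ref), audio in zip(targets, audio_urls)]
```

For one video and two images with two audio URLs, the first voice goes to the video and the second to the first image.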
Priority:
- When both model_params.voice_bindings and audio_urls are supplied, only voice_bindings is used and this field is ignored
- When a video in video_urls carries audio and no voice binding is set for it, the original audio is used; an explicit voice binding overrides the original audio
Audio limits:
- Format: wav, mp3
- Duration: 1 ~ 10 seconds
- File size: up to 15MB
Maximum array length: 5
Example:
[
"https://example.com/voice1.mp3",
"https://example.com/voice2.mp3"
]

Advanced parameter container (recommended)
Video quality, defaults to 720p
Options:
- 720p: Standard definition, standard price, this is the default
- 1080p: High definition, higher price
Available options: 720p, 1080p
Example: "720p"
Video aspect ratio, defaults to 16:9
Behavior:
- image_start not provided: video is generated using the specified aspect_ratio
- image_start provided: this field is ignored; the video uses an aspect ratio close to the starting-frame image

Output resolution per quality tier:
| Quality | 16:9 | 9:16 | 1:1 | 4:3 | 3:4 |
|---|---|---|---|---|---|
| 720p | 1280×720 | 720×1280 | 960×960 | 1104×832 | 832×1104 |
| 1080p | 1920×1080 | 1080×1920 | 1440×1440 | 1648×1248 | 1248×1648 |

Available options: 16:9, 9:16, 1:1, 4:3, 3:4
Example: "16:9"
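The resolution table can be kept as a lookup for sizing players or thumbnails ahead of time. The values below are copied from the table. The function name and the `quality` parameter name are hypothetical. The lookup applies only when image_start is not provided, because otherwise aspect_ratio is ignored.

```python
# (width, height) per quality tier and aspect ratio, copied from the table above.
RESOLUTIONS = {
    "720p": {
        "16:9": (1280, 720), "9:16": (720, 1280), "1:1": (960, 960),
        "4:3": (1104, 832), "3:4": (832, 1104),
    },
    "1080p": {
        "16:9": (1920, 1080), "9:16": (1080, 1920), "1:1": (1440, 1440),
        "4:3": (1648, 1248), "3:4": (1248, 1648),
    },
}


def output_resolution(quality: str = "720p", aspect_ratio: str = "16:9") -> tuple[int, int]:
    """Look up the documented output size; defaults mirror the stated defaults."""
    try:
        return RESOLUTIONS[quality][aspect_ratio]
    except KeyError:
        raise ValueError(f"unsupported combination: {quality!r}, {aspect_ratio!r}") from None
```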
Video duration in seconds (integer)
Range:
- Without video_urls: 2 ~ 15, default 5
- With video_urls: 2 ~ 10 (capped at 10 seconds)
Billing: based on the actual generated video duration
Required range: 2 <= x <= 15
Example: 5
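The duration range depends on whether reference videos are supplied, which a client-side check could capture as below. This is a sketch with a hypothetical function name, and it raises on out-of-range values rather than guessing how the service would handle them.

```python
def validate_duration(duration: int = 5, has_reference_video: bool = False) -> int:
    """Return the duration if it lies in the documented range, else raise."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int):
        raise TypeError("duration must be an integer number of seconds")
    upper = 10 if has_reference_video else 15  # capped at 10 with video_urls
    if not 2 <= duration <= upper:
        raise ValueError(f"duration must be between 2 and {upper} seconds")
    return duration
```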
Random seed, defaults to random
Notes:
- Range: 1 ~ 2147483647
Required range: 1 <= x <= 2147483647
Example: 42
Whether to enable intelligent prompt rewriting. When enabled, a large model will optimize the prompt, which significantly improves results for simple or insufficiently descriptive prompts.
Note: Default is false. Omitting the field or sending false will not trigger rewriting; explicitly send true to enable.
Example: false
HTTPS callback URL for task completion
Callback Timing:
Security Restrictions:
- URL length: up to 2048 characters
Callback Mechanism:
- Timeout: 10 seconds
- 3 retries after failure (retries at 1/2/4 seconds after failure)
Example: "https://your-domain.com/webhooks/video-task-completed"
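A callback URL can be screened before submission against the two constraints that are recoverable from this section: an HTTPS scheme and a length of at most 2048 characters. This is a minimal sketch with a hypothetical function name. Any further security restrictions the service applies are not covered.

```python
from urllib.parse import urlsplit

MAX_CALLBACK_URL_LENGTH = 2048  # limit stated above


def callback_url_problems(url: str) -> list[str]:
    """Return reasons the URL would likely be rejected (empty list if none found)."""
    problems = []
    if len(url) > MAX_CALLBACK_URL_LENGTH:
        problems.append("longer than 2048 characters")
    parts = urlsplit(url)
    if parts.scheme != "https":
        problems.append("scheme must be https")
    if not parts.hostname:
        problems.append("missing host")
    return problems
```

Since the stated timeout is 10 seconds, a receiving endpoint would presumably need to acknowledge quickly and defer heavy processing.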
Video task created successfully
Task creation timestamp
Example: 1757169743
Task ID
Example: "task-unified-1757169743-7cvnl5zw"
Actual model name used
Example: "wan2.7-reference-video"
Specific task type
Available options: video.generation.task
Task progress percentage (0-100)
Required range: 0 <= x <= 100
Example: 0
Task status
Available options: pending, processing, completed, failed
Example: "pending"
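Client code that polls or receives callbacks usually needs to know when to stop waiting. The sketch below treats completed and failed as terminal states. That is an assumption based on the status names, not something this section states. The function name is hypothetical.

```python
KNOWN_STATUSES = {"pending", "processing", "completed", "failed"}
# Assumption: these two statuses are final and will not change again.
TERMINAL_STATUSES = {"completed", "failed"}


def is_finished(status: str) -> bool:
    """Return True once a task has reached a (presumed) terminal status."""
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected task status: {status!r}")
    return status in TERMINAL_STATUSES
```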
Detailed video task information
Task output type
Available options: text, image, audio, video
Example: "video"
Usage and billing information