Wan2.7 Image to Video
- The Wan2.7 model (wan2.7-image-to-video) supports image-to-video generation with multimodal inputs (image, audio, video)
- Choose one of three generation modes via the generation_mode parameter:
  - first_frame: First-frame to video. Generates a video starting from the given first frame, with optional driving audio
  - first_last_frame: First-and-last-frame to video. Generates a video by interpolating between the first and last frames, with optional driving audio
  - video_continuation: Video continuation. Continues an input video clip; an optional ending frame is allowed (no driving audio)
- generation_mode is optional (backward compatible); if omitted, an appropriate mode is selected automatically based on the materials in the request
- Valid material combinations (any other combination is rejected):
  - image_start (first frame)
  - image_start + audio_urls (first frame + driving audio)
  - image_start + image_end (first + last frame)
  - image_start + image_end + audio_urls (first + last frame + driving audio)
  - video_urls (video continuation)
  - video_urls + image_end (video continuation + last frame)
- Processing is asynchronous: use the returned task ID to query the task status
- Generated video links are valid for 24 hours; save them promptly
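The valid material combinations listed above can be checked on the client before a request is sent. The following is a minimal sketch, not the service's actual selection logic: the function name `infer_generation_mode` and the lookup table are hypothetical, and the mapping from each combination to a mode is inferred from the labels in the list.

```python
from typing import Optional, Sequence

# Hypothetical lookup table: each documented material combination mapped to
# the mode its label suggests. The server's own auto-selection may differ.
_VALID_COMBINATIONS = {
    frozenset({"image_start"}): "first_frame",
    frozenset({"image_start", "audio_urls"}): "first_frame",
    frozenset({"image_start", "image_end"}): "first_last_frame",
    frozenset({"image_start", "image_end", "audio_urls"}): "first_last_frame",
    frozenset({"video_urls"}): "video_continuation",
    frozenset({"video_urls", "image_end"}): "video_continuation",
}


def infer_generation_mode(
    image_start: Optional[str] = None,
    image_end: Optional[str] = None,
    video_urls: Optional[Sequence[str]] = None,
    audio_urls: Optional[Sequence[str]] = None,
) -> str:
    """Return the mode implied by the supplied materials, or raise ValueError."""
    materials = {
        "image_start": image_start,
        "image_end": image_end,
        "video_urls": video_urls,
        "audio_urls": audio_urls,
    }
    # Only fields with a non-empty value count as "present".
    present = frozenset(name for name, value in materials.items() if value)
    try:
        return _VALID_COMBINATIONS[present]
    except KeyError:
        raise ValueError(
            f"unsupported material combination: {sorted(present)}"
        ) from None
```

For example, `infer_generation_mode(video_urls=["https://example.com/clip.mp4"])` selects `video_continuation`, while combining `video_urls` with `audio_urls` raises `ValueError`.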
Authorizations
All APIs require Bearer Token authentication
Get your API Key:
Visit the API Key management page to obtain your API Key
Add to request headers:
Authorization: Bearer YOUR_API_KEY
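As a small illustration of the header format above, the sketch below builds the request headers from an API key. The helper name `build_auth_headers` is hypothetical; only the `Authorization: Bearer ...` format comes from this page.

```python
from typing import Dict


def build_auth_headers(api_key: str) -> Dict[str, str]:
    """Build the Authorization header in the documented Bearer format."""
    key = (api_key or "").strip()
    if not key:
        raise ValueError("API key must be a non-empty string")
    return {"Authorization": f"Bearer {key}"}
```

The returned dictionary can be passed as the headers of whichever HTTP client is in use.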
Body
Model name, must be wan2.7-image-to-video
Allowed value: wan2.7-image-to-video
Example: "wan2.7-image-to-video"
Generation mode that determines which material combinations are valid. Explicitly specifying it is recommended
Values:
- first_frame: First-frame to video. Required: image_start. Optional: audio_urls. Not accepted: image_end, video_urls
- first_last_frame: First-and-last-frame to video. Required: image_start + image_end. Optional: audio_urls. Not accepted: video_urls
- video_continuation: Video continuation. Required: video_urls[0]. Optional: image_end (used as ending frame). Not accepted: image_start, audio_urls
Backward-compatible behavior: when generation_mode is omitted, an appropriate mode will be selected automatically based on the materials in the request; explicit specification is recommended to avoid ambiguity
Allowed values: first_frame, first_last_frame, video_continuation
Example: "first_frame"
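When generation_mode is specified explicitly, the required, optional, and not-accepted fields for each value can be checked before submitting. The sketch below encodes those rules in a table. The names `MODE_RULES` and `check_materials` are hypothetical, and the request body is assumed to be a plain dictionary keyed by the field names used on this page.

```python
from typing import Dict, List

# Required and optional material fields per mode, as listed under "Values".
MODE_RULES: Dict[str, Dict[str, set]] = {
    "first_frame": {"required": {"image_start"}, "optional": {"audio_urls"}},
    "first_last_frame": {
        "required": {"image_start", "image_end"},
        "optional": {"audio_urls"},
    },
    "video_continuation": {"required": {"video_urls"}, "optional": {"image_end"}},
}
MATERIAL_FIELDS = ("image_start", "image_end", "video_urls", "audio_urls")


def check_materials(body: dict) -> List[str]:
    """Return a list of rule violations; an empty list means the body passes."""
    mode = body.get("generation_mode")
    if mode not in MODE_RULES:
        return [f"unknown generation_mode: {mode!r}"]
    rules = MODE_RULES[mode]
    present = {name for name in MATERIAL_FIELDS if body.get(name)}
    errors = []
    for name in sorted(rules["required"] - present):
        errors.append(f"{mode} requires {name}")
    for name in sorted(present - rules["required"] - rules["optional"]):
        errors.append(f"{mode} does not accept {name}")
    return errors
```

This sketch covers only the explicit case; a body without generation_mode is reported as unknown rather than auto-detected.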
Text prompt for video generation. Supports both Chinese and English; each character/letter counts as 1, with overflow auto-truncated. Maximum length: 5000 characters
Maximum length: 5000
Example: "A cat playing piano"
Negative prompt describing what should not appear in the video. Supports both Chinese and English. Maximum length 500 characters; overflow is auto-truncated
Maximum length: 500
Example: "Blurry, low quality"
First-frame image URL
Mode constraints:
- first_frame mode: required
- first_last_frame mode: required
- video_continuation mode: not allowed
Image limits:
- Formats: JPEG, JPG, PNG (transparency not supported), BMP, WEBP
- Resolution: width and height in [240, 8000] pixels
- Aspect ratio: 1:8 ~ 8:1
- File size: up to 20MB
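The image limits above lend themselves to a local pre-check. The following sketch assumes the caller already knows the image's width, height, file size, and extension; the function name `check_image_limits` is hypothetical, and treating 20MB as 20 × 1024 × 1024 bytes is an assumption, since the page does not define the unit precisely. It does not check PNG transparency.

```python
from typing import List

ALLOWED_IMAGE_EXTENSIONS = {"jpeg", "jpg", "png", "bmp", "webp"}
# Assumption: 1 MB = 1024 * 1024 bytes; the page only says "20MB".
MAX_IMAGE_BYTES = 20 * 1024 * 1024


def check_image_limits(
    width: int, height: int, size_bytes: int, extension: str
) -> List[str]:
    """Return a list of violated limits; an empty list means the image passes."""
    problems = []
    if extension.lower().lstrip(".") not in ALLOWED_IMAGE_EXTENSIONS:
        problems.append(f"unsupported format: {extension}")
    for name, value in (("width", width), ("height", height)):
        if not 240 <= value <= 8000:
            problems.append(f"{name} {value} outside [240, 8000]")
    # Aspect ratio must lie between 1:8 and 8:1; integer math avoids rounding.
    if width * 8 < height or height * 8 < width:
        problems.append("aspect ratio outside 1:8 ~ 8:1")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append("file larger than 20MB")
    return problems
```

A 1280×720 JPEG of about 1 MB passes, whereas an 8000×240 image satisfies the pixel range but fails the aspect-ratio check.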
"https://example.com/first_frame.jpg"
Ending-frame image URL
Mode constraints:
- first_last_frame mode: required
- video_continuation mode: optional (acts as the ending frame for the continuation)
- first_frame mode: not allowed (use first_last_frame if both first and last frames are needed)
Image limits:
- Formats: JPEG, JPG, PNG (transparency not supported), BMP, WEBP
- Resolution: width and height in [240, 8000] pixels
- Aspect ratio: 1:8 ~ 8:1
- File size: up to 20MB
"https://example.com/last_frame.jpg"
Video continuation URL array. Only 1 element is supported
Mode constraints:
- video_continuation mode: required
- first_frame / first_last_frame mode: not allowed
- Cannot be combined with audio_urls
Video limits:
- Formats: mp4, mov
- Duration: 2 ~ 10 seconds (length of the input clip itself)
- Resolution: width and height in [240, 4096] pixels
- Aspect ratio: 1:8 ~ 8:1
- File size: up to 100MB
Continuation duration rules:
- duration represents the total final output video length (input clip + model-generated continuation)
- Generated continuation length = duration − input video length
- duration must be ≥ input video length
- Billing is based on the total final output length (i.e. duration)
Examples:
| Input clip length | duration | Continuation generated | Final output | Billed |
|---|---|---|---|---|
| 3s | 15 | 12s | 15s | 15s |
| 5s | 10 | 5s | 10s | 10s |
| 8s | 8 | 0s (input only) | 8s | 8s |
["https://example.com/clip.mp4"]
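The continuation duration rules and the example table above reduce to simple arithmetic. The sketch below reproduces it; the function name `continuation_plan` and the returned dictionary keys are hypothetical, while the 2 ~ 10 second input range and the 2 ~ 15 second duration range are taken from this page.

```python
def continuation_plan(input_seconds: int, duration: int) -> dict:
    """Compute generated, output, and billed seconds for video_continuation."""
    if not 2 <= input_seconds <= 10:
        raise ValueError("input clip must be 2 ~ 10 seconds long")
    if not 2 <= duration <= 15:
        raise ValueError("duration must be in the range 2 ~ 15")
    if duration < input_seconds:
        raise ValueError("duration must be >= input video length")
    return {
        # Only the part beyond the input clip is generated by the model.
        "generated_seconds": duration - input_seconds,
        # The final output and the billed length both equal duration.
        "output_seconds": duration,
        "billed_seconds": duration,
    }
```

With a 3-second clip and `duration` set to 15, this gives 12 generated seconds and 15 billed seconds, matching the first row of the table.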
Driving audio URL array. Currently supports only 1 element. The model will use this audio as the driving source for video generation (e.g. lip sync, motion alignment)
Mode constraints:
- first_frame mode: optional
- first_last_frame mode: optional
- video_continuation mode: not allowed (cannot be combined with video_urls)
Format requirements:
- Supported formats: wav, mp3
- Duration range: 2 ~ 30 seconds
- File size: up to 15MB
Truncation handling:
- If the audio is longer than duration, only the first duration seconds of audio are used and the rest is discarded
- If the audio is shorter than the video duration, the remaining portion is silent. For example: if the audio is 3s and the video duration is 5s, the first 3s have sound and the last 2s are silent
Array length: 1 element
Example: ["https://example.com/audio.mp3"]
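The truncation handling described for the driving audio can be expressed as a short calculation. This is a sketch with a hypothetical function name, `audio_coverage`, which returns how many seconds of audio are used, how many seconds of the video are silent, and how many seconds of audio are discarded.

```python
from typing import Tuple


def audio_coverage(audio_seconds: float, duration: float) -> Tuple[float, float, float]:
    """Return (used, silent, discarded) seconds for a given audio and duration."""
    used = min(audio_seconds, duration)   # audio beyond duration is cut off
    silent = duration - used              # tail of the video without sound
    discarded = audio_seconds - used      # audio that never plays
    return used, silent, discarded
```

For the example in the text, `audio_coverage(3, 5)` yields 3 seconds with sound and 2 silent seconds.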
Video quality, defaults to 720p
Options:
- 720p: Standard definition, standard price (default)
- 1080p: High definition, higher price
Allowed values: 720p, 1080p
Example: "720p"
Video duration in seconds (integer). Range 2 ~ 15, default 5
Meaning:
- first_frame / first_last_frame modes: total length of the generated video
- video_continuation mode: total length of the final output video (= original input clip + model-generated continuation)
Additional constraints in video_continuation mode:
- duration must be ≥ input video length (otherwise an error is returned)
- Generated continuation length = duration − input video length
- When duration equals the input video length, no continuation is generated and the input clip is returned as-is
- See the continuation duration rules and examples in the video_urls field for details
Billing: based on the actual generated video duration
Range: 2 <= x <= 15
Example: 5
Random seed, defaults to random
Notes:
- Range: 1 ~ 2147483647
- Fixing the seed reduces variation when iterating on prompts and improves reproducibility
Range: 1 <= x <= 2147483647
Example: 42
Whether to enable intelligent prompt rewriting. When enabled, a large model will optimize the prompt, which significantly improves results for simple or insufficiently descriptive prompts.
Note: Default is false. Omitting the field or sending false will not trigger rewriting; explicitly send true to enable.
false
HTTPS callback URL for task completion
Callback Timing:
- Triggered when task is completed, failed, or cancelled
- Sent after billing confirmation
Security Restrictions:
- Only HTTPS protocol is supported
- Callbacks to internal IP addresses are prohibited (127.0.0.1, 10.x.x.x, 172.16-31.x.x, 192.168.x.x, etc.)
- URL length must not exceed 2048 characters
Callback Mechanism:
- Timeout: 10 seconds
- Up to 3 retries after failure (retries at 1/2/4 seconds after failure)
- Callback response format is consistent with the task query API response
- 2xx status codes are considered successful, other status codes trigger retries
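The security restrictions on the callback URL can be approximated locally before a task is submitted. The sketch below uses only the Python standard library; the function name `check_callback_url` is hypothetical, and it inspects literal IP addresses only, so a hostname that resolves to an internal address is not detected here. The service's own check may be stricter.

```python
import ipaddress
from typing import List
from urllib.parse import urlsplit


def check_callback_url(url: str) -> List[str]:
    """Return a list of violated restrictions; an empty list means the URL passes."""
    problems = []
    if len(url) > 2048:
        problems.append("URL longer than 2048 characters")
    parts = urlsplit(url)
    if parts.scheme != "https":
        problems.append("only HTTPS is supported")
    host = parts.hostname
    if not host:
        problems.append("URL has no host")
        return problems
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Host is a name, not a literal IP; DNS resolution is not checked.
        return problems
    if address.is_private or address.is_loopback or address.is_link_local:
        problems.append("callbacks to internal IP addresses are prohibited")
    return problems
```

On the receiving side, returning a 2xx status within the 10-second timeout avoids triggering the retries described above.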
"https://your-domain.com/webhooks/video-task-completed"
Response
Video task created successfully
Task creation timestamp
1757169743
Task ID
"task-unified-1757169743-7cvnl5zw"
Actual model name used
"wan2.7-image-to-video"
Specific task type
video.generation.task
Task progress percentage (0-100)
Range: 0 <= x <= 100
Example: 0
Task status
Allowed values: pending, processing, completed, failed
Example: "pending"
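Because processing is asynchronous, a client that polls the task status needs to know when to stop. The sketch below classifies the four status values listed above. The function name `should_keep_polling` is hypothetical, and treating completed and failed as the only final states is an assumption based on that list.

```python
# Status values listed for the task status field.
IN_PROGRESS_STATUSES = {"pending", "processing"}
# Assumption: these two values are final and will not change again.
TERMINAL_STATUSES = {"completed", "failed"}


def should_keep_polling(status: str) -> bool:
    """Return True while the task is still running, False once it is final."""
    if status in IN_PROGRESS_STATUSES:
        return True
    if status in TERMINAL_STATUSES:
        return False
    raise ValueError(f"unexpected task status: {status!r}")
```

Raising on an unknown value, rather than looping forever, makes it visible if the service introduces a status that this sketch does not know about.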
Detailed video task information
Task output type
Allowed values: text, image, audio, video
Example: "video"
Usage and billing information