Gemini Omni Flash Image-to-Video
- The Gemini Omni Flash (gemini-omni-flash-image-to-video) model supports image-to-video mode, producing a video with native audio from an input image and a text prompt
- Image input: Provided via the image_urls parameter; currently only 1 image is supported
- Duration control: Set an integer duration of 3 to 10 seconds via the duration parameter, or pass auto to let the model decide
- Aspect ratio: Choose 16:9, 9:16, or auto via the aspect_ratio parameter
- Native audio: The model automatically generates synchronized audio for the video; no extra parameters are required
- Negative description: Write it directly into prompt (e.g. No dialogue); this model does not provide a separate negative prompt parameter
- Asynchronous processing mode; use the returned task ID to query the result
- The generated video link is valid for 24 hours; save the video promptly
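As a rough illustration of how the parameters listed above fit together, the following sketch assembles the request headers and JSON body. The field name "model", the Content-Type header, and the helper name build_request are assumptions made for this example; the endpoint URL is not given in this section, so the sketch stops short of sending anything.

```python
MODEL = "gemini-omni-flash-image-to-video"


def build_request(api_key, prompt, image_url, duration=10, aspect_ratio="16:9"):
    """Return (headers, body) for an image-to-video task.

    Hypothetical helper: the "model" key and the Content-Type header are
    assumptions; prompt, image_urls, duration and aspect_ratio are the
    parameter names used on this page.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",  # assumed
    }
    body = {
        "model": MODEL,  # assumed field name
        "prompt": prompt,
        "image_urls": [image_url],  # exactly one image is currently supported
        "duration": duration,  # integer 3..10 or "auto"
        "aspect_ratio": aspect_ratio,  # "16:9", "9:16" or "auto"
    }
    return headers, body
```

The body can then be serialized with json.dumps and posted with any HTTP client once the endpoint is known.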
Authorizations
All endpoints require authentication using a Bearer Token.
Get your API Key:
Visit the API Key management page to obtain your API Key
Add it to the request header:
Authorization: Bearer YOUR_API_KEY
Body
Model name, fixed to gemini-omni-flash-image-to-video
gemini-omni-flash-image-to-video
"gemini-omni-flash-image-to-video"
Text prompt for video generation, supports both English and Chinese
Usage tips:
- Describe the subject's actions, camera movement, mood changes, etc.; the more specific, the more stable the result
- Write negative requirements directly into the prompt (e.g. No dialogue, no text on screen); this model does not provide a separate negative prompt parameter
"Have the person in the image slowly turn their head and smile while the leaves in the background sway gently in the breeze"
Array of input images; currently only 1 is supported
Supported forms:
- HTTP/HTTPS image URL
- Data URL in the form data:image/...;base64,...
- Plain base64 image string
Format requirements: png, jpeg, webp are supported
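For local files, the Data URL form above can be produced with the standard library. This is a minimal sketch: image_to_data_url is a hypothetical helper, and the MIME type is inferred from the file extension only, not from the file contents.

```python
import base64
from pathlib import Path

# Extensions for the three supported formats (png, jpeg, webp).
MIME_BY_SUFFIX = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
}


def image_to_data_url(path):
    """Encode a local image file as a data:image/...;base64,... string."""
    path = Path(path)
    mime = MIME_BY_SUFFIX.get(path.suffix.lower())
    if mime is None:
        raise ValueError(f"unsupported image format: {path.suffix!r}")
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

The returned string can be used as the single element of image_urls.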
1 element
["https://example.com/portrait.jpg"]
Video duration (seconds), default 10
Value notes:
- Integer: range 3 ~ 10 seconds
- auto: the model decides the output duration
Billing note: The actual charge is based on the usage of the generated video
3 <= x <= 10
6
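A client can check the duration value before submitting a task. The sketch below mirrors the rule stated above (an integer from 3 to 10, or auto); normalize_duration is a hypothetical helper, and the service may apply its own validation regardless.

```python
def normalize_duration(value):
    """Return value unchanged if it is "auto" or an integer in 3..10."""
    if value == "auto":
        return value
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("duration must be an integer or 'auto'")
    if not 3 <= value <= 10:
        raise ValueError("duration must be between 3 and 10 seconds")
    return value
```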
Video aspect ratio, default 16:9
Value notes:
- 16:9: landscape
- 9:16: portrait
- auto: the model decides the aspect ratio
16:9, 9:16, auto
"16:9"
HTTPS callback URL to notify when the task completes
Callback timing:
- Triggered when the task completes (completed), fails (failed), or is cancelled (cancelled)
- Sent after billing is confirmed
Security restrictions:
- HTTPS protocol only
- Callbacks to internal IP addresses are forbidden (127.0.0.1, 10.x.x.x, 172.16-31.x.x, 192.168.x.x, etc.)
- URL length must not exceed 2048 characters
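The restrictions above can be approximated with a local pre-check before a task is submitted. This is only a sketch: check_callback_url is a hypothetical helper, it inspects IP literals only (hostnames are not resolved), and the service's own checks may be stricter.

```python
import ipaddress
from urllib.parse import urlsplit

MAX_URL_LENGTH = 2048


def check_callback_url(url):
    """Return a list of problems found; an empty list means no problem was found."""
    problems = []
    if len(url) > MAX_URL_LENGTH:
        problems.append("URL longer than 2048 characters")
    parts = urlsplit(url)
    if parts.scheme != "https":
        problems.append("scheme must be https")
    try:
        ip = ipaddress.ip_address(parts.hostname or "")
    except ValueError:
        ip = None  # a hostname rather than an IP literal; DNS is not resolved here
    if ip is not None and (ip.is_private or ip.is_loopback):
        problems.append("internal IP addresses are not allowed")
    return problems
```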
Callback mechanism:
- Timeout: 10 seconds
- Up to 3 retries on failure (retries are sent 1 s, 2 s, and 4 s after each successive failure)
- The callback body format matches the response of the task query endpoint
- A 2xx status code from the callback URL is treated as success; other status codes trigger a retry
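Because a slow or failed response leads to a retry, the same callback may plausibly arrive more than once, so a receiver benefits from being idempotent. The sketch below shows one way to do that, independent of any web framework. The field names "id" and "status" are assumptions about the callback body, and handle_callback is a hypothetical helper.

```python
def handle_callback(body, processed, on_result):
    """Process one callback body and return the HTTP status code to send back.

    body: the decoded JSON callback payload ("id" and "status" are assumed keys).
    processed: a set of (task id, status) pairs already handled; a real service
        would keep this in persistent storage rather than in memory.
    on_result: a callable invoked once per new (task id, status) pair.
    """
    key = (body.get("id"), body.get("status"))
    if key in processed:
        return 200  # duplicate delivery: acknowledge without reprocessing
    on_result(body)  # keep this fast; the sender times out after 10 seconds
    processed.add(key)
    return 200
```

If on_result raises, the pair is not recorded, so a later retry can be processed again.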
"https://your-domain.com/webhooks/video-task-completed"
Response
Video task created successfully
Task creation timestamp
1757169743
Task ID
"task-unified-1757169743-7cvnl5zw"
The model name actually used
"gemini-omni-flash-image-to-video"
The specific type of the task
video.generation.task
Task progress percentage (0-100)
0 <= x <= 100
0
Task status
pending, processing, completed, failed
"pending"
Detailed video task information
The output type of the task
text, image, audio, video
"video"
Usage and billing information
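Since processing is asynchronous, a client that does not use callbacks has to poll until the task reaches a final status. The sketch below assumes the task object exposes its status under a "status" key and takes the actual query call as a parameter (fetch_task is hypothetical, because the task query endpoint is not described in this section).

```python
import time

# "cancelled" is included because the callback description mentions it.
FINAL_STATUSES = ("completed", "failed", "cancelled")


def wait_for_task(fetch_task, task_id, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call fetch_task(task_id) repeatedly until the task reaches a final status.

    fetch_task: a caller-supplied function returning the task as a dict.
    Raises TimeoutError if no final status is seen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_task(task_id)
        if task.get("status") in FINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish in {timeout} seconds")
        sleep(interval)
```

The polling interval and timeout shown are arbitrary defaults chosen for the example, not values recommended by the service.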