AiHubMix Documentation Hub

Quick Start

Video generation is an asynchronous operation. The whole process is divided into three steps:

1. Submit task → get video_id
2. Poll status → wait until status becomes completed
3. Download video → get the MP4 file

Minimal Example

# Step 1: Submit the video generation task
curl -X POST https://aihubmix.com/v1/videos \
  -H "Authorization: Bearer $AIHUBMIX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "wan2.6-t2v",
    "prompt": "A cat playing jazz on a piano, warm lighting, cinematic shot",
    "seconds": "5",
    "size": "1280x720"
  }'

# Example response:
# {
#   "id": "eyJtb2RlbCI6IndhbjI...",
#   "object": "video",
#   "status": "in_progress",
#   "model": "wan2.6-t2v",
#   "duration": 5,
#   "width": 1280,
#   "height": 720,
#   ...
# }

# Step 2: Poll the status (query every 15 seconds until status is completed)
curl https://aihubmix.com/v1/videos/{video_id} \
  -H "Authorization: Bearer $AIHUBMIX_API_KEY"

# Step 3: Download the video
curl https://aihubmix.com/v1/videos/{video_id}/content \
  -H "Authorization: Bearer $AIHUBMIX_API_KEY" \
  --output video.mp4

API Overview

Endpoint	Method	Path	Description
Create Video	POST	`/v1/videos`	Submit a video generation task
Query Status	GET	`/v1/videos/{video_id}`	Query task status and progress
Download Video	GET	`/v1/videos/{video_id}/content`	Download the generated MP4 video
Delete Task	DELETE	`/v1/videos/{video_id}`	Delete a video task

Base URL: https://aihubmix.com
Authentication: Bearer Token

Authorization: Bearer $AIHUBMIX_API_KEY

Supported Models

Text-to-Video

Vendor	Model Name	Features
OpenAI	`sora-2`	Standard video generation, supports audio-video sync
OpenAI	`sora-2-pro`	High-quality version, more refined and stable visuals
Google	`veo-3.1-generate-preview`	Latest Veo 3.1, native audio, supports 4K
Google	`veo-3.1-fast-generate-preview`	Veo 3.1 fast version, faster generation speed
Google	`veo-3.0-generate-preview`	Veo 3.0, high-fidelity video
Google	`veo-2.0-generate-001`	Veo 2.0, stable version
Alibaba	`wan2.6-t2v`	Latest Tongyi Wanxiang, audio-video sync
Alibaba	`wan2.5-t2v-preview`	Tongyi Wanxiang 2.5, optimized for Chinese
Alibaba	`wan2.2-t2v-plus`	Tongyi Wanxiang 2.2
ByteDance	`jimeng-3.0-pro`	Jimeng 3.0 Pro, 1080P HD
ByteDance	`jimeng-3.0-1080p`	Jimeng 3.0 1080P
ByteDance	`doubao-seedance-2-0-260128`	Professional-grade multimodal creative video model Seedance 2.0
ByteDance	`doubao-seedance-2-0-fast-260128`	Seedance 2.0 fast version
Kuaishou	`kling-v3`, `kling-v2-6`, `kling-v2-5-turbo`, `kling-v2-1`	Kling text-to-video / image-to-video, newer versions support 3–15 seconds
Kuaishou	`kling-v3-omni`, `kling-video-o1`	Kling OmniVideo multimodal, supports reference video, native audio, multi-shot

Image-to-Video

Vendor	Model Name	Features
Alibaba	`wan2.6-i2v`	Latest Tongyi Wanxiang image-to-video
Alibaba	`wan2.5-i2v-preview`	Tongyi Wanxiang 2.5 image-to-video
Alibaba	`wan2.2-i2v-plus`	Tongyi Wanxiang 2.2 image-to-video
ByteDance	`doubao-seedance-2-0-260128`	Multimodal reference inputs, supports image/video/audio
ByteDance	`doubao-seedance-2-0-fast-260128`	Seedance 2.0 fast version
Kuaishou	`kling-v1-6`, etc.	Kling image-to-video, supports end frame and multi-image reference (up to 4 images)

Image-to-video models receive their reference images in different ways. Alibaba Tongyi Wanxiang takes the image via the input_reference parameter. Doubao Seedance takes it via the extra_body.content array, which supports image, video, and audio reference types. Kling uses image / image_tail / image_list; see the Kling section below for details.

API Details

Request Headers

Authorization: Bearer $AIHUBMIX_API_KEY
Content-Type: application/json

Create a Video Generation Task

POST /v1/videos

Request Body

Parameter	Type	Required	Description
`model`	string	Yes	Model name, e.g. `wan2.6-t2v`, `sora-2`
`prompt`	string	Yes	Video description text
`seconds`	string	No	Video duration (seconds), always passed as a string, e.g. `"5"`, `"8"` (see per-model details)
`size`	string	No	Resolution, format `widthxheight`, e.g. `1920x1080` (supported values vary by model)
`input_reference`	string/object	No	Reference image (image-to-video), supports URL or base64
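
The request body above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; `build_create_request` is a hypothetical helper name, and the request is only built here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://aihubmix.com"


def build_create_request(api_key, model, prompt,
                         seconds=None, size=None, input_reference=None):
    """Build (but do not send) a POST /v1/videos request.

    Hypothetical helper: only the parameters from the table above are handled.
    """
    body = {"model": model, "prompt": prompt}
    if seconds is not None:
        body["seconds"] = str(seconds)  # documented as always a string
    if size is not None:
        body["size"] = size
    if input_reference is not None:
        body["input_reference"] = input_reference
    return urllib.request.Request(
        f"{BASE_URL}/v1/videos",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would submit the task; optional parameters that are left as `None` are omitted from the body so the model defaults apply.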

Response formats vary slightly across models, but all include the id (video_id) and status fields. Just use status to determine task progress.

Example Response (Tongyi Wanxiang / Veo / Jimeng AI)

{
  "id": "eyJtb2RlbCI6IndhbjI...",
  "object": "video",
  "created": 1772460274,
  "model": "wan2.6-t2v",
  "status": "in_progress",
  "prompt": "A cat watching the rain on a windowsill",
  "duration": 5,
  "width": 1920,
  "height": 1080,
  "url": null,
  "error": null
}

Example Response (Sora)

{
  "id": "eyJtb2RlbCI6InNvcmEtMi...",
  "object": "video",
  "created_at": 1772451930,
  "status": "queued",
  "model": "sora-2",
  "progress": 0,
  "prompt": "A cinematic drone shot over mountains",
  "seconds": "8",
  "size": "1280x720"
}

Common Status Values

Status	Description
`queued`	Queued (Sora-specific)
`in_progress`	Generating
`completed`	Generation complete, ready to download
`failed`	Generation failed
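
The status table can be reduced to a small check that tells a polling loop when to stop. This is an illustrative sketch; `is_terminal` and `TERMINAL_STATUSES` are hypothetical names, not part of the API.

```python
# Statuses after which the task will not change any more (from the table above).
TERMINAL_STATUSES = {"completed", "failed"}


def is_terminal(status: str) -> bool:
    """Return True once polling can stop: the task either completed or failed."""
    return status in TERMINAL_STATUSES
```

Treating every other value (including `queued` and any status not listed here) as "keep polling" keeps the loop tolerant of model-specific intermediate states.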

Query Video Status

GET /v1/videos/{video_id}

Poll this endpoint to check whether the task is complete. We recommend querying every 15 seconds.
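
A polling loop usually also needs an upper time limit so that a stuck task does not block forever. The sketch below assumes a 10-minute timeout, which is an arbitrary choice rather than a documented limit; `wait_for_video` is a hypothetical helper, and `fetch_status` stands for any callable that returns the parsed JSON of GET /v1/videos/{video_id}.

```python
import time
from typing import Callable


def wait_for_video(fetch_status: Callable[[], dict],
                   interval: float = 15.0,
                   timeout: float = 600.0,
                   sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the task is completed or failed, or raise TimeoutError.

    fetch_status and sleep are injected so the loop can be tested offline.
    The 600-second timeout is an assumption, not a documented value.
    """
    deadline = time.monotonic() + timeout
    while True:
        data = fetch_status()
        if data.get("status") in ("completed", "failed"):
            return data
        # Give up if the next wait would run past the deadline
        if time.monotonic() + interval > deadline:
            raise TimeoutError("video task did not finish in time")
        sleep(interval)
```

The caller still has to inspect the returned `status`, since the function returns for both `completed` and `failed`.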

Example Response (Generation Complete - Tongyi Wanxiang)

{
  "id": "eyJtb2RlbCI6IndhbjI...",
  "object": "video",
  "status": "completed",
  "model": "wan2.5-t2v-preview",
  "duration": 5,
  "width": 1920,
  "height": 1080,
  "url": "https://aihubmix.com/v1/videos/eyJtb2RlbCI6IndhbjI.../content",
  "error": null
}

Example Response (Generation Complete - Sora)

{
  "id": "eyJtb2RlbCI6InNvcmEtMi...",
  "object": "video",
  "created_at": 1772451930,
  "status": "completed",
  "completed_at": 1772452114,
  "expires_at": 1772538330,
  "model": "sora-2",
  "progress": 100,
  "prompt": "A cinematic drone shot over mountains",
  "seconds": "8",
  "size": "1280x720"
}

All models use status == "completed" to determine the completion state, then call the /content endpoint to download.

Download Video Content

GET /v1/videos/{video_id}/content

Once the status is completed, call this endpoint to download the MP4 video file. The response body is the raw video binary stream (Content-Type: video/mp4).

curl https://aihubmix.com/v1/videos/{video_id}/content \
  -H "Authorization: Bearer $AIHUBMIX_API_KEY" \
  --output my_video.mp4

Note: Video download links are usually valid for 24 hours, so download and save the video promptly.
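
Because the endpoint returns a binary stream, larger videos can be written to disk chunk by chunk instead of being held in memory. A minimal sketch, assuming the caller supplies an iterable of byte chunks; `save_chunks` is a hypothetical helper.

```python
from typing import BinaryIO, Iterable


def save_chunks(chunks: Iterable[bytes], out: BinaryIO) -> int:
    """Write byte chunks to an open binary file and return the bytes written."""
    total = 0
    for chunk in chunks:
        if chunk:  # skip empty chunks
            out.write(chunk)
            total += len(chunk)
    return total
```

With the `requests` library, the chunks could come from a response opened with `stream=True`, for example `response.iter_content(chunk_size=1 << 20)`.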

Delete a Video Task

This endpoint is used to delete an already-created video task.

DELETE /v1/videos/{video_id}

Per-Model Parameter Details

OpenAI Sora

Parameter	Supported Values
Model	`sora-2`, `sora-2-pro`
Duration (seconds)	`"4"` (default), `"8"`, `"12"`
Resolution (size)	`720x1280` (default), `1280x720`, `1024x1792`, `1792x1024`
Image-to-Video	Supported, pass the image via `input_reference`

Tip: The seconds parameter for all models is always passed as a string (e.g. "8").

Example

curl -X POST https://aihubmix.com/v1/videos \
  -H "Authorization: Bearer $AIHUBMIX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sora-2",
    "prompt": "A cinematic drone shot soaring over a misty mountain range at sunrise, golden light filtering through the clouds",
    "seconds": "8",
    "size": "1280x720"
  }'

Google Veo

Parameter	Supported Values
Model	`veo-3.1-generate-preview` (recommended), `veo-3.1-fast-generate-preview` (fast), `veo-3.0-generate-preview`, `veo-2.0-generate-001`
Duration (seconds)	Veo 3/3.1: `"4"`, `"6"`, `"8"`; Veo 2: `"5"`~`"8"` (default `"8"`)
Resolution (size)	`720p` (default), `1080p`, `4k` (4K only for Veo 3+), or pixel format such as `1280x720`, `1920x1080`
Aspect Ratio	16:9 (default), 9:16
Image-to-Video	Supported, pass the first-frame image via `input_reference` (Veo 3.1); when used, `seconds` is fixed at `"8"`

Example

curl -X POST https://aihubmix.com/v1/videos \
  -H "Authorization: Bearer $AIHUBMIX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veo-3.1-generate-preview",
    "prompt": "A tranquil Japanese garden, cherry blossom petals slowly drifting down, koi swimming in the pond, with the melodious sound of wind chimes in the background",
    "seconds": "8",
    "size": "1280x720"
  }'

Tip: Veo supports native audio generation; you can describe sound effects in the prompt, such as “the sound of birds chirping in the background” or “a piano melody”.

Tongyi Wanxiang

Parameter	Supported Values
Text-to-Video Models	`wan2.6-t2v` (recommended), `wan2.5-t2v-preview`, `wan2.2-t2v-plus`
Image-to-Video Models	`wan2.6-i2v` (recommended), `wan2.5-i2v-preview`, `wan2.2-i2v-plus`
Duration (seconds)	Varies by model (see below), default `"5"`
Resolution (size)	See the table below; both `x` and `*` separators are accepted (e.g. `1920x1080` or `1920*1080`)
Image-to-Video	Pass the image URL or base64 via `input_reference`

Duration Supported by Each Model

Model	seconds Allowed Values	Default
`wan2.6-t2v` / `wan2.6-i2v`	`"2"`~`"15"` (any integer value)	`"5"`
`wan2.5-t2v-preview` / `wan2.5-i2v-preview`	`"5"` or `"10"`	`"5"`
`wan2.2-t2v-plus` / `wan2.2-i2v-plus`	`"5"` (fixed)	`"5"`

Supported Resolutions (width*height)

Clarity	Available Resolutions
480P	`832x480`, `480x832`, `624x624`
720P	`1280x720` (default), `720x1280`, `960x960`, `1088x832` (4:3), `832x1088` (3:4)
1080P	`1920x1080`, `1080x1920`, `1440x1440`, `1632x1248` (4:3), `1248x1632` (3:4)

Note: wan2.6 supports only 720P and 1080P; wan2.5 supports 480P, 720P, and 1080P; wan2.2 supports only 480P and 1080P.
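
The resolution table and the per-version note can be combined into a client-side check before submitting a task. This is a sketch built only from the values listed above; `wan_size_supported` is a hypothetical helper, and treating `*` as an alternative separator follows the width*height notation of this section.

```python
# Resolution tiers copied from the table above.
WAN_RESOLUTIONS = {
    "480P": {"832x480", "480x832", "624x624"},
    "720P": {"1280x720", "720x1280", "960x960", "1088x832", "832x1088"},
    "1080P": {"1920x1080", "1080x1920", "1440x1440", "1632x1248", "1248x1632"},
}

# Tiers available to each model family, from the note above.
WAN_TIERS = {
    "wan2.6": ("720P", "1080P"),
    "wan2.5": ("480P", "720P", "1080P"),
    "wan2.2": ("480P", "1080P"),
}


def wan_size_supported(model: str, size: str) -> bool:
    """Return True if the size is listed for the model's family."""
    family = model.split("-")[0]  # "wan2.6-t2v" -> "wan2.6"
    normalized = size.replace("*", "x")
    tiers = WAN_TIERS.get(family, ())
    return any(normalized in WAN_RESOLUTIONS[tier] for tier in tiers)
```

For example, `wan_size_supported("wan2.2-t2v-plus", "1280x720")` is false because wan2.2 does not list the 720P tier.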

Example

curl -X POST https://aihubmix.com/v1/videos \
  -H "Authorization: Bearer $AIHUBMIX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "wan2.6-t2v",
    "prompt": "A winding stream flows through an autumn forest, golden fallen leaves drifting on the water surface, sunlight casting dappled light and shadow through the leaves",
    "seconds": "5",
    "size": "1920x1080"
  }'

Tip: wan2.5 and above generate videos with sound by default (automatic dubbing); Chinese prompts work better.

Jimeng AI

Parameter	Supported Values
Model	`jimeng-3.0-pro` (recommended), `jimeng-3.0-1080p`
Duration (seconds)	`"5"` or `"10"` (default `"5"`)
Resolution (size)	Supports aspect ratio format or pixel format
Image-to-Video	Supported, pass the image URL or base64 via `input_reference`

Supported Aspect Ratios and Corresponding Resolutions

Aspect Ratio (size)	Actual Resolution
`16:9` or `1920x1080`	1920×1088
`9:16` or `1080x1920`	1088×1920
`4:3` or `1664x1248`	1664×1248
`3:4` or `1248x1664`	1248×1664
`1:1` or `1440x1440`	1440×1440
`21:9` or `2176x928`	2176×928
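
Note that the actual output can differ slightly from the requested pixel size (for example 1920x1080 is rendered as 1920×1088). The lookup below restates the table so a client can predict the output dimensions; `jimeng_actual_resolution` is a hypothetical helper.

```python
# Requested size (aspect ratio or pixel alias) -> actual output (width, height),
# copied from the table above.
JIMENG_OUTPUT = {
    "16:9": (1920, 1088), "1920x1080": (1920, 1088),
    "9:16": (1088, 1920), "1080x1920": (1088, 1920),
    "4:3": (1664, 1248), "1664x1248": (1664, 1248),
    "3:4": (1248, 1664), "1248x1664": (1248, 1664),
    "1:1": (1440, 1440), "1440x1440": (1440, 1440),
    "21:9": (2176, 928), "2176x928": (2176, 928),
}


def jimeng_actual_resolution(size: str):
    """Return the (width, height) Jimeng produces, or None if size is not listed."""
    return JIMENG_OUTPUT.get(size)
```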

Example

curl -X POST https://aihubmix.com/v1/videos \
  -H "Authorization: Bearer $AIHUBMIX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "jimeng-3.0-pro",
    "prompt": "A young woman in Hanfu dances gracefully amid a bamboo forest, her long dress flowing in the wind, with a faint morning mist in the background",
    "seconds": "5",
    "size": "16:9"
  }'

Doubao Seedance

Parameter	Supported Values
Model	`doubao-seedance-2-0-260128`, `doubao-seedance-2-0-fast-260128`
Resolution (resolution)	`"480p"`, `"720p"` (default)
Duration (duration)	Integer, range `4`~`15`, or `-1` (model decides automatically)
Aspect Ratio (ratio)	`"adaptive"` (default, auto-adapts), `"16:9"`, `"9:16"`, `"1:1"`, `"4:3"`, `"3:4"`, `"21:9"`
Audio Video (generate_audio)	Defaults to `true`; set to `false` to generate a silent video
Watermark (watermark)	Defaults to `false`
Multimodal Reference	Supports image, video, and audio

Reference Types Supported by extra_body.content

Type	`type` Value	`role` Value	Description
Reference Image	`image_url`	`reference_image`	Visual/style reference image
Reference Video	`video_url`	`reference_video`	Camera movement/composition reference video
Reference Audio	`audio_url`	`reference_audio`	Background music audio file

Example

Seedance 2.0 / 2.0 Fast

curl -X POST "https://aihubmix.com/v1/videos" \
  -H "Authorization: Bearer $AIHUBMIX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "doubao-seedance-2-0-260128",
    "prompt": "Use the first-person POV framing from Video 1 throughout, and use Audio 1 as the background music for the entire clip. Create a first-person fruit tea commercial featuring the Seedance brand limited-edition apple fruit tea, \"Ping Ping An An.\"\n\nOpening frame: Image 1. From a first-person perspective, your hand picks a dew-covered Aksu red apple, accompanied by a crisp, satisfying bite-like tapping sound.\n\nSeconds 2–4: Fast-paced cuts. Your hand drops freshly cut apple chunks into a shaker, adds ice and tea base, then shakes vigorously. The sound of ice clinking and shaking syncs with upbeat percussion. Background voiceover: \"Freshly cut, freshly shaken.\"\n\nSeconds 4–6: First-person close-up of the finished drink. The layered fruit tea is poured into a clear cup. Your hand gently squeezes a creamy topping across the surface. A pink label is placed on the cup. The camera pushes in to highlight the rich texture and layering.\n\nSeconds 6–8: First-person hand holding the drink. You raise the fruit tea from Image 2 toward the camera, as if offering it directly to the viewer. The label is clearly visible. Background voiceover: \"Take a refreshing sip.\"\n\nFinal frame: Freeze on Image 2.\n\nAll background voiceovers should be in a female voice.",
    "extra_body": {
      "content": [
        {
          "type": "image_url",
          "image_url": {
            "url": "https://ark-project.tos-cn-beijing.volces.com/doc_image/r2v_tea_pic1.jpg"
          },
          "role": "reference_image"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://ark-project.tos-cn-beijing.volces.com/doc_image/r2v_tea_pic2.jpg"
          },
          "role": "reference_image"
        },
        {
          "type": "video_url",
          "video_url": {
            "url": "https://ark-project.tos-cn-beijing.volces.com/doc_video/r2v_tea_video1.mp4"
          },
          "role": "reference_video"
        },
        {
          "type": "audio_url",
          "audio_url": {
            "url": "https://ark-project.tos-cn-beijing.volces.com/doc_audio/r2v_tea_audio1.mp3"
          },
          "role": "reference_audio"
        }
      ],
      "ratio": "16:9",
      "duration": 11,
      "watermark": false
    }
  }'

Kling

Kling supports four types of capabilities: text-to-video, image-to-video, multi-image reference video, and OmniVideo multimodal. All are invoked through the unified /v1/videos endpoint, and the gateway automatically routes to the corresponding Kling endpoint based on “model name + input form”, with no need for the caller to differentiate.

Capability	Models
Text-to-Video / Image-to-Video	`kling-v1`, `kling-v1-5`, `kling-v1-6`, `kling-v2-1`, `kling-v2-5-turbo`, `kling-v2-6`, `kling-v3`
Multi-image Reference	`kling-v1-6`
OmniVideo Multimodal	`kling-video-o1`, `kling-v3-omni`

Parameters

Parameter	Type	Description
`model`	string	Required, `kling-*`, determines capability and version
`prompt`	string	Text prompt
`negative_prompt`	string	Negative prompt
`mode`	string	Generation mode: `std` (720P) / `pro` (1080P) / `4k`, default `std`
`duration` / `seconds`	string	Duration (seconds); older models `5`/`10`, newer models `3`~`15`, default `5`
`aspect_ratio`	string	Frame: `16:9` / `9:16` / `1:1` (required for omni pure text-to-video and video reference; defaults to `16:9` if omitted)
`cfg_scale`	float	Prompt relevance `[0, 1]`, default `0.5` (not supported by `kling-v2.x`)
`image`	string	Image-to-Video: single image, image URL or Base64 (Base64 without the `data:image/...;base64,` prefix)
`image_tail`	string	Image-to-Video: end-frame image (optional)
`image_list`	array	Multi-image Reference: array of image URLs, up to 4 images
`sound`	string	omni: `on`/`off`, whether to generate native audio, default `off`
`video_list`	array	omni: reference video `[{ "video_url": "...", "refer_type": "feature" }]`; `refer_type` takes `feature` (video reference) / `base` (video editing)

Unsupported or unmapped key parameters will raise an explicit error rather than being silently dropped. Other native Kling parameters can be placed in extra_body to pass through to the upstream.
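
For image-to-video, the `image` and `image_tail` values must not carry the `data:image/...;base64,` prefix. The sketch below shows one way to normalize inputs before building the payload; both function names are hypothetical, and only parameters from the table above are used.

```python
def strip_data_url_prefix(value: str) -> str:
    """Drop a leading 'data:...;base64,' prefix; URLs and raw Base64 pass through."""
    if value.startswith("data:") and ";base64," in value:
        return value.split(";base64,", 1)[1]
    return value


def build_kling_i2v_payload(model, prompt, image,
                            image_tail=None, mode="std", duration="5"):
    """Assemble a Kling image-to-video request body (illustrative only)."""
    payload = {
        "model": model,
        "prompt": prompt,
        "mode": mode,
        "duration": str(duration),
        "image": strip_data_url_prefix(image),
    }
    if image_tail is not None:  # the end frame is optional
        payload["image_tail"] = strip_data_url_prefix(image_tail)
    return payload
```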

Example

curl https://aihubmix.com/v1/videos \
  -H "Authorization: Bearer $AIHUBMIX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kling-v1-6",
    "prompt": "An orange cat running on a sunlit grassy meadow",
    "mode": "std",
    "duration": "5"
  }'

Notes

Three asynchronous steps: submit to get video_id → poll GET /v1/videos/{video_id} until status is completed → GET /v1/videos/{video_id}/content to download the MP4. Status values: in_progress / completed / failed.
Video output usually takes 1–3 minutes; result video URLs are cleaned up after 30 days, so transfer and save them promptly.
Delete task: Kling has no delete endpoint; DELETE /v1/videos/{video_id} returns 501 not_supported.
Billing: charged by model × mode × duration × capability (with or without reference video / audio); no charge for failed generation, and queries and downloads are not billed.

Complete Invocation Examples

import os
import time

import requests

# Read the key from the environment instead of hard-coding it
API_KEY = os.environ["AIHUBMIX_API_KEY"]
BASE_URL = "https://aihubmix.com"
HEADERS = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

# Step 1: Create the video generation task
response = requests.post(
    f"{BASE_URL}/v1/videos",
    headers=HEADERS,
    json={
        "model": "wan2.6-t2v",
        "prompt": "A desert under a starry sky, a meteor streaking across the night sky, the glow of a distant campfire flickering in the breeze",
        "seconds": "5",
        "size": "1920x1080"
    }
)
response.raise_for_status()  # Stop early on HTTP errors (bad key, invalid parameters)
result = response.json()
video_id = result["id"]
print(f"Task created, video_id: {video_id}")

# Step 2: Poll the status
while True:
    status_response = requests.get(
        f"{BASE_URL}/v1/videos/{video_id}",
        headers=HEADERS
    )
    status_data = status_response.json()
    current_status = status_data["status"]
    print(f"Current status: {current_status}")

    if current_status == "completed":
        print("Video generation complete!")
        break
    elif current_status == "failed":
        error_msg = status_data.get("error") or {}
        if isinstance(error_msg, dict):
            error_msg = error_msg.get("message", "Unknown error")
        # Exit here so the download step below does not run for a failed task
        raise SystemExit(f"Generation failed: {error_msg}")

    time.sleep(15)  # Query every 15 seconds

# Step 3: Download the video
video_response = requests.get(
    f"{BASE_URL}/v1/videos/{video_id}/content",
    headers=HEADERS
)
video_response.raise_for_status()  # Avoid saving an error body as output.mp4
with open("output.mp4", "wb") as f:
    f.write(video_response.content)
print(f"Video saved as output.mp4 ({len(video_response.content) / 1024 / 1024:.1f} MB)")

FAQ

How long does video generation take?

Video generation usually takes 1-5 minutes, depending on the model, resolution, and duration. We recommend setting a 15-second polling interval.

How do I use the `input_reference` parameter?

input_reference is used in image-to-video scenarios and supports three ways of passing input:

// Method 1: Pass the image URL directly
"input_reference": "https://example.com/image.jpg"

// Method 2: Pass a base64-encoded image (object format)
"input_reference": {
  "mime_type": "image/jpeg",
  "data": "<BASE64_ENCODED_IMAGE>"
}

// Method 3: Pass a data URL
"input_reference": "data:image/jpeg;base64,<BASE64_ENCODED_IMAGE>"
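
Methods 2 and 3 both require Base64-encoding the image first. A minimal sketch of producing either form from raw image bytes with the Python standard library; the two function names are hypothetical.

```python
import base64


def input_reference_object(image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Method 2: object format with mime_type and Base64 data."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"mime_type": mime_type, "data": encoded}


def input_reference_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Method 3: a data URL carrying the same Base64 payload."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

The bytes would typically come from reading a local file in binary mode, for example `open("frame.jpg", "rb").read()`.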

How long is the video download link valid?

Generated video download links are usually valid for 24 hours, so download and save the video promptly.

What are the differences in the `seconds` parameter across models?

Model	Allowed Values	Default
Sora (`sora-2` / `sora-2-pro`)	`"4"`, `"8"`, `"12"`	`"4"`
Veo 3/3.1 (`veo-3.1-generate-preview`, etc.)	`"4"`, `"6"`, `"8"`	`"8"`
Veo 2 (`veo-2.0-generate-001`)	`"5"`~`"8"`	`"8"`
Tongyi Wanxiang `wan2.6`	`"2"`~`"15"`	`"5"`
Tongyi Wanxiang `wan2.5`	`"5"`, `"10"`	`"5"`
Tongyi Wanxiang `wan2.2`	`"5"` (fixed)	`"5"`
Jimeng AI (`jimeng-3.0-pro`, etc.)	`"5"`, `"10"`	`"5"`
Doubao Seedance (`doubao-seedance-2-0-*`)	integer `duration`: `4`~`15` or `-1`	`5`
Kling new versions (`kling-v2-x` / `kling-v3`, etc.)	`"3"`~`"15"`	`"5"`
Kling old versions (`kling-v1` / `kling-v1-5` / `kling-v1-6`)	`"5"`, `"10"`	`"5"`

Tip: The seconds parameter is always passed as a string (e.g. "8"), and the API handles it automatically. Doubao Seedance is the exception: its duration is an integer passed inside extra_body.
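
The table above can be encoded as a lookup for validating `seconds` before a request is sent. This is a sketch under stated assumptions: models are matched by name prefix, ranges written with `~` are taken to mean every integer in the range, and Doubao Seedance is left out because it uses an integer `duration`. `allowed_seconds` is a hypothetical helper.

```python
from typing import Optional, Set


def _span(low: int, high: int) -> Set[str]:
    """All integers from low to high inclusive, as strings."""
    return {str(n) for n in range(low, high + 1)}


# (model name prefix, allowed seconds values), taken from the table above.
_RULES = [
    ("sora-2", {"4", "8", "12"}),
    ("veo-3", {"4", "6", "8"}),
    ("veo-2", _span(5, 8)),
    ("wan2.6", _span(2, 15)),
    ("wan2.5", {"5", "10"}),
    ("wan2.2", {"5"}),
    ("jimeng-", {"5", "10"}),
    ("kling-v1", {"5", "10"}),
    ("kling-v2", _span(3, 15)),
    ("kling-v3", _span(3, 15)),
]


def allowed_seconds(model: str) -> Optional[Set[str]]:
    """Return the allowed seconds strings for a model, or None if unknown."""
    for prefix, values in _RULES:
        if model.startswith(prefix):
            return values
    return None
```

A return value of `None` means the table gives no rule for that model, so the value should be passed through and left to the API to validate.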

What are the differences in the `size` parameter format across models?

Model	Supported size Values
Sora	`1280x720`, `720x1280`, `1024x1792`, `1792x1024`
Veo	pixel format (`1280x720`, etc.) or resolution labels (`720p`, `1080p`, `4k`)
Tongyi Wanxiang	pixel format, both `x` and `*` accepted (e.g. `1920x1080` or `1920*1080`)
Jimeng AI	aspect ratio format (`16:9`, `9:16`, etc.) or pixel format
Doubao Seedance	aspect ratio format (`"adaptive"`, `"16:9"`, `"9:16"`, etc.)
Kling	does not use `size`; uses `mode` (`std`/`pro`/`4k` controls clarity) + `aspect_ratio` (`16:9`/`9:16`/`1:1` controls frame)

What is the difference between `seconds` and `duration`?

The two have the same meaning, both representing the video duration. The API supports both parameter names (except Sora, which only accepts seconds). We recommend using seconds consistently.

How do I write better prompts?

Describe specific scenes: include subject, action, environment, lighting, atmosphere
Specify camera language: such as “close-up”, “aerial shot”, “push-in shot”, “slow motion”
Describe style: such as “cinematic”, “documentary style”, “animation style”
Chinese models work better with Chinese prompts: Tongyi Wanxiang is optimized for Chinese
Veo supports audio descriptions: you can describe sounds in the prompt, such as “birds chirping” or “a piano melody”

How do I handle a failed task?

When status is failed, the error field in the response contains error information:

{
  "status": "failed",
  "error": {
    "message": "Video generation failed due to content policy violation",
    "type": "video_generation_error"
  }
}

Common failure reasons include: content violations, prompt too long, unsupported image format, etc. Adjust based on the error message and retry.
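
Since `error` is null for tasks that have not failed and an object for those that have, a client can extract the message defensively. A minimal sketch; `failure_message` is a hypothetical helper, and the "Unknown error" fallback text is an assumption.

```python
from typing import Optional


def failure_message(task: dict) -> Optional[str]:
    """Return a readable error for a failed task, or None if it has not failed."""
    if task.get("status") != "failed":
        return None
    error = task.get("error")
    if isinstance(error, dict):
        return error.get("message") or "Unknown error"
    if error:  # some responses may carry a plain string
        return str(error)
    return "Unknown error"
```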

Last updated: 2026-06-01
