# Seed-Audio 1.0 Audio Generation

- Multimodal audio generation with three modes: **text-to-audio**, **reference-audio (voice cloning)**, and **reference-image**
- Up to `120` seconds of audio per request
- Asynchronous mode — use the returned task ID to [query the result](/en/api-manual/task-management/get-task-detail)
- Generated audio links are valid for 24 hours; save them promptly
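
The asynchronous flow above starts with a single authenticated POST. As a minimal sketch using only Python's standard library, and assuming the server URL, path, and Bearer header documented in the OpenAPI section of this page, the request could be built and submitted roughly as follows. The helper names `build_generation_request` and `submit` are illustrative and not part of any official SDK.

```python
import json
import urllib.request

# Server URL and path as listed in the OpenAPI section of this page.
API_URL = "https://api.evolink.ai/v1/audios/generations"


def build_generation_request(api_key: str, prompt: str, **options) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates an audio task."""
    payload = {"model": "doubao-seed-audio-1-0", "prompt": prompt}
    # Optional documented fields (format, audio_references, ...) pass through unchanged.
    payload.update(options)
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(request: urllib.request.Request) -> str:
    """Send the request and return the task ID from the response's `id` field."""
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return body["id"]
```

Calling `submit(build_generation_request("YOUR_API_KEY", "Hello", format="mp3"))` would return the task ID to pass to the task-query endpoint. Non-2xx responses surface as `urllib.error.HTTPError`, whose body carries the documented error object.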



## OpenAPI

````yaml en/api-manual/audio-series/doubao-seed-audio/doubao-seed-audio-1-0.json POST /v1/audios/generations
openapi: 3.1.0
info:
  title: Seed-Audio 1.0 Audio Generation API
  description: >-
    Seed-Audio 1.0 multimodal audio generation API. Supports three modes —
    text-to-audio, reference-audio (voice cloning), and reference-image
    generation — producing up to 120 seconds of audio per request. Ideal for
    audiobooks, dubbing, gaming, and more.
  license:
    name: MIT
  version: 1.0.0
servers:
  - url: https://api.evolink.ai
    description: Production
security:
  - bearerAuth: []
tags:
  - name: Audio Generation
    description: Seed-Audio 1.0 audio generation endpoints
paths:
  /v1/audios/generations:
    post:
      tags:
        - Audio Generation
      summary: Seed-Audio 1.0 Audio Generation
      description: >-
        - Multimodal audio generation with three modes: **text-to-audio**,
        **reference-audio (voice cloning)**, and **reference-image**

        - Up to `120` seconds of audio per request

        - Asynchronous mode — use the returned task ID to [query the
        result](/en/api-manual/task-management/get-task-detail)

        - Generated audio links are valid for 24 hours; save them
        promptly
      operationId: createSeedAudio10
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/SeedAudioRequest'
            examples:
              basic:
                summary: Text-to-audio
                value:
                  model: doubao-seed-audio-1-0
                  prompt: >-
                    Welcome to the audio generation service. The weather is
                    lovely today.
                  format: mp3
              with_voice:
                summary: Generate with a specified voice
                value:
                  model: doubao-seed-audio-1-0
                  prompt: Good evening, everyone, and welcome to the evening news.
                  audio_references:
                    - zh_female_vv_uranus_bigtts
                  speech_rate: 1.25
              voice_clone:
                summary: Reference-audio generation (voice cloning)
                value:
                  model: doubao-seed-audio-1-0
                  prompt: '@audio1 Hi there, nice to meet you.'
                  audio_references:
                    - https://example.com/ref-voice.mp3
              multi_voice:
                summary: Mixed voices (voice ID + audio URL)
                value:
                  model: doubao-seed-audio-1-0
                  prompt: '@audio1 Hi there! @audio2 How''s your day going?'
                  audio_references:
                    - zh_female_vv_uranus_bigtts
                    - https://example.com/ref-voice.mp3
              image_ref:
                summary: Reference-image generation
                value:
                  model: doubao-seed-audio-1-0
                  prompt: Synthesize a voiceover that matches the mood of the image.
                  image_urls:
                    - https://example.com/scene.jpg
      responses:
        '200':
          description: Audio generation task created successfully
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/SeedAudioResponse'
        '400':
          description: Invalid request parameters
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: missing_text
                  message: 'Missing required parameter: prompt'
                  type: invalid_request_error
        '401':
          description: Unauthenticated; token invalid or expired
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: unauthorized
                  message: Invalid or expired token
                  type: authentication_error
        '402':
          description: Insufficient quota; top-up required
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: insufficient_quota
                  message: Insufficient quota. Please top up your account.
                  type: insufficient_quota
        '403':
          description: No access permission
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: model_access_denied
                  message: 'Token does not have access to model: doubao-seed-audio-1-0'
                  type: invalid_request_error
        '429':
          description: Rate limit exceeded
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: rate_limit_exceeded
                  message: Too many requests, please try again later
                  type: rate_limit_error
        '500':
          description: Internal server error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              example:
                error:
                  code: internal_error
                  message: Internal server error
                  type: api_error
components:
  schemas:
    SeedAudioRequest:
      type: object
      required:
        - model
        - prompt
      properties:
        model:
          type: string
          description: Model name
          enum:
            - doubao-seed-audio-1-0
          default: doubao-seed-audio-1-0
          example: doubao-seed-audio-1-0
        prompt:
          type: string
          description: >-
            The prompt or text to synthesize into audio


            **Three generation modes (auto-detected by which reference resources
            you pass):**

            - **Text-to-audio**: pass only `prompt` to generate audio directly
            from the prompt

            - **Reference-audio (voice cloning)**: pair with `audio_references`;
            use the literal marker `@audioN` to reference the Nth item (numbered
            from `1`, in array order)

            - **Reference-image**: pair with `image_urls`; `prompt` only needs
            the text to synthesize


            > Audio references (`audio_references`) and image references
            (`image_urls`) are **mutually exclusive** — only one may be used per
            request.


            **Constraints:**

            - Up to `1500` characters
          maxLength: 1500
          example: >-
            Welcome to the audio generation service. The weather is lovely
            today.
        audio_references:
          type: array
          description: >-
            List of reference resources. Each item can be a **voice ID** or a
            **reference-audio URL**, and the two may be **mixed** within the
            same array


            - **Voice ID**: the `voice_type` of a preset voice — see the full
            list in [Seed-Audio 1.0 Voice
            List](/en/api-manual/audio-series/doubao-seed-audio/doubao-seed-audio-1-0-voices)

            - **Audio URL**: the URL of a reference audio clip to use for voice
            cloning

            - **Mutually exclusive with `image_urls`**: reference audio and
            reference image are either-or; they cannot be sent together in one
            request

            - Use the literal marker `@audioN` in `prompt` to reference the Nth
            item (numbered from `1`, in array order)

            - If omitted, the model generates a voice freely based on `prompt`


            **Quantity limit:**

            - Up to `3` items total in the array (voice IDs and audio URLs
            combined)


            **Audio URL constraints:**

            - Each reference clip ≤ `30` seconds and ≤ `10 MB`

            - Formats: `wav` / `mp3` / `pcm` / `ogg_opus`
          items:
            type: string
          maxItems: 3
          example:
            - zh_female_vv_uranus_bigtts
        image_urls:
          type: array
          description: >-
            List of reference-image URLs; generates audio matching the mood of
            the image


            - When using an image reference, `prompt` only needs the text to
            synthesize

            - **Mutually exclusive with `audio_references`**: reference image
            and reference audio are either-or; they cannot be sent together in
            one request


            **Constraints:**

            - Currently only `1` image, ≤ `10 MB`

            - Formats: `jpeg` / `png` / `webp`
          items:
            type: string
            format: uri
          maxItems: 1
          example:
            - https://example.com/scene.jpg
        format:
          type: string
          description: Output audio format
          enum:
            - wav
            - mp3
            - pcm
            - ogg_opus
          default: wav
          example: mp3
        sample_rate:
          type: integer
          description: Output sample rate (Hz)
          enum:
            - 8000
            - 16000
            - 24000
            - 32000
            - 44100
            - 48000
          default: 24000
          example: 24000
        speech_rate:
          type: number
          description: |-
            Speech-rate multiplier (supports two decimal places)

            - `1.0`: normal speed (default)
            - `2.0`: 2x speed; `0.5`: half speed

            Range `0.5` to `2.0`
          minimum: 0.5
          maximum: 2
          multipleOf: 0.01
          default: 1
          example: 1.25
        loudness_rate:
          type: number
          description: |-
            Loudness multiplier (supports two decimal places)

            - `1.0`: normal loudness (default)
            - `2.0`: 2x loudness; `0.5`: half loudness

            Range `0.5` to `2.0`
          minimum: 0.5
          maximum: 2
          multipleOf: 0.01
          default: 1
          example: 0.85
        pitch_rate:
          type: integer
          description: >-
            Pitch adjustment, in **semitones**


            - `0`: default pitch (no change)

            - **Positive values raise the pitch**: the larger the value, the
            higher and sharper the voice; `12` raises it by one octave

            - **Negative values lower the pitch**: the more negative the value,
            the lower and deeper the voice; `-12` lowers it by one octave


            Range `-12` to `12`
          minimum: -12
          maximum: 12
          default: 0
          example: 0
        callback_url:
          type: string
          description: >-
            HTTPS callback URL invoked when the task finishes


            **When it fires:**

            - Triggered when the task is completed, failed, or cancelled

            - Sent after billing is finalized


            **Security restrictions:**

            - HTTPS only

            - Callbacks to internal IP addresses are forbidden (127.0.0.1,
            10.x.x.x, 172.16-31.x.x, 192.168.x.x, etc.)

            - URL length must not exceed `2048` characters


            **Callback mechanism:**

            - Timeout: `10` seconds

            - Up to `3` retries on failure, sent `1`, `2`, and `4` seconds after
            the first, second, and third failed attempt respectively

            - The callback body has the same format as the task-query response

            - A 2xx response is treated as success; other status codes trigger a
            retry
          format: uri
          example: https://your-domain.com/webhooks/audio-completed
    SeedAudioResponse:
      type: object
      properties:
        created:
          type: integer
          description: Task creation timestamp
          example: 1775200000
        id:
          type: string
          description: Task ID
          example: task-unified-1775200000-abcd1234
        model:
          type: string
          description: The model actually used
          example: doubao-seed-audio-1-0
        object:
          type: string
          enum:
            - audio.generation.task
          description: Specific task type
        progress:
          type: integer
          description: Task progress percentage (0-100)
          minimum: 0
          maximum: 100
          example: 0
        status:
          type: string
          description: Task status
          enum:
            - pending
            - processing
            - completed
            - failed
          example: pending
        task_info:
          $ref: '#/components/schemas/AudioTaskInfo'
          description: Detailed audio task information
        type:
          type: string
          enum:
            - audio
          description: Task output type
          example: audio
        usage:
          $ref: '#/components/schemas/AudioUsage'
          description: Usage and billing information
    ErrorResponse:
      type: object
      properties:
        error:
          type: object
          properties:
            code:
              type: string
              description: Error code identifier
            message:
              type: string
              description: Error description
            type:
              type: string
              description: Error type
    AudioTaskInfo:
      type: object
      properties:
        can_cancel:
          type: boolean
          description: Whether the task can be cancelled
          example: true
        estimated_time:
          type: integer
          description: Estimated time to completion (seconds)
          minimum: 0
          example: 15
        audio_type:
          type: string
          description: Audio task type
          example: audio_generation
    AudioUsage:
      type: object
      description: Usage information
      properties:
        credits_reserved:
          type: number
          description: >-
            Estimated credits to be consumed (reserved based on the maximum
            duration, then settled based on the actual duration when the task
            finishes)
          minimum: 0
          example: 9.6
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      description: >-
        **All endpoints require Bearer Token authentication**


        **Get your API Key:**


        Visit the [API Key management page](https://evolink.ai/dashboard/keys)
        to obtain your API Key


        **Add it to the request header:**

        ```

        Authorization: Bearer YOUR_API_KEY

        ```

````
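
The request schema above states several constraints that can be checked on the client before a request is sent. The following is a minimal sketch of such a pre-flight check, not the server's actual validation logic. The function names are hypothetical, and treating an out-of-range `@audioN` marker as an error is an assumption. It does not download references to check their duration, size, or format, and it does not resolve hostnames when checking `callback_url`.

```python
import ipaddress
import re
from urllib.parse import urlparse

AUDIO_MARKER = re.compile(r"@audio(\d+)")
FORMATS = {"wav", "mp3", "pcm", "ogg_opus"}
SAMPLE_RATES = {8000, 16000, 24000, 32000, 44100, 48000}


def check_callback_url(url: str) -> list[str]:
    """Check the documented callback restrictions for IP-literal and plain hosts."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme != "https":
        problems.append("callback_url must use HTTPS")
    if len(url) > 2048:
        problems.append("callback_url exceeds 2048 characters")
    try:
        address = ipaddress.ip_address(parsed.hostname or "")
    except ValueError:
        address = None  # a hostname, not an IP literal; DNS is not resolved here
    if address is not None and (address.is_private or address.is_loopback):
        problems.append("callback_url must not target an internal IP address")
    return problems


def validate_request(body: dict) -> list[str]:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    prompt = body.get("prompt")
    audio_refs = body.get("audio_references") or []
    image_urls = body.get("image_urls") or []

    if body.get("model") != "doubao-seed-audio-1-0":
        problems.append("model must be doubao-seed-audio-1-0")
    if not isinstance(prompt, str) or not prompt:
        problems.append("prompt is required")
        prompt = ""
    elif len(prompt) > 1500:
        problems.append("prompt exceeds 1500 characters")

    if audio_refs and image_urls:
        problems.append("audio_references and image_urls are mutually exclusive")
    if len(audio_refs) > 3:
        problems.append("audio_references allows at most 3 items")
    if len(image_urls) > 1:
        problems.append("image_urls allows at most 1 image")

    # Assumption: every @audioN marker should point at an existing item (1-based).
    for match in AUDIO_MARKER.finditer(prompt):
        index = int(match.group(1))
        if not 1 <= index <= len(audio_refs):
            problems.append(f"@audio{index} has no matching audio_references item")

    if body.get("format", "wav") not in FORMATS:
        problems.append("format must be one of wav, mp3, pcm, ogg_opus")
    if body.get("sample_rate", 24000) not in SAMPLE_RATES:
        problems.append("sample_rate is not a supported value")
    for name in ("speech_rate", "loudness_rate"):
        value = body.get(name, 1.0)
        if not isinstance(value, (int, float)) or not 0.5 <= value <= 2.0:
            problems.append(f"{name} must be between 0.5 and 2.0")
    pitch = body.get("pitch_rate", 0)
    if not isinstance(pitch, int) or not -12 <= pitch <= 12:
        problems.append("pitch_rate must be an integer from -12 to 12")

    if body.get("callback_url") is not None:
        problems.extend(check_callback_url(body["callback_url"]))
    return problems
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem at once. The API's own `400` response remains the authoritative check.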