POST /v1/chat/completions
Smart Model Routing
Example request:
curl --request POST \
  --url https://api.evolink.ai/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "evolink/auto",
  "messages": [
    {
      "role": "user",
      "content": "Introduce the history of artificial intelligence"
    }
  ],
  "temperature": 0.7,
  "top_p": 0.9,
  "top_k": 40,
  "stream": false
}
'

Example response:
{
  "id": "chatcmpl-20260308112637503180122ABCD1234",
  "model": "gpt-5.4",
  "object": "chat.completion",
  "created": 1741428397,
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The history of artificial intelligence dates back to the 1950s..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 120,
    "total_tokens": 135
  }
}

EvoLink Auto is an intelligent model-routing feature that automatically selects a suitable AI model based on the content of your request, so you do not have to specify a model manually.

Key Benefits

  • Smart Matching: Automatically analyzes request content and selects a suitable model
  • Cost Optimization: Prioritizes cost-effective models while maintaining quality
  • Load Balancing: Automatically distributes requests across multiple models for improved stability
  • Transparency: Returns the actual model name used in the response for tracking and optimization

How It Works

The system selects the best-fit model from the model pool based on request complexity, length, and type.
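
EvoLink does not publish the routing algorithm itself. Purely as an illustration of length- and type-based selection, a toy router might look like the sketch below (the `route_model` function, the model-tier names, and the thresholds are all hypothetical, not EvoLink's real model pool):

```python
def route_model(messages):
    """Toy router: pick a model tier from prompt length and content type.

    Purely illustrative -- the real EvoLink router and its model pool
    are not public; the names and thresholds here are made up.
    """
    text = " ".join(m["content"] for m in messages)
    # Reasoning-heavy keywords suggest a complex task -> strongest tier
    if any(k in text.lower() for k in ("prove", "derive", "refactor")):
        return "large-reasoning-model"
    # Very long inputs -> a large-context tier
    if len(text) > 2000:
        return "long-context-model"
    # Everything else -> the cost-optimized default tier
    return "fast-cheap-model"

print(route_model([{"role": "user", "content": "Hi"}]))
# fast-cheap-model
```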

Supported Models

EvoLink Auto intelligently routes between mainstream AI models including GPT-4, GPT-3.5, Claude, Gemini, and more.

Limitations

  • Not suitable for scenarios requiring specific model capabilities (e.g., GPT-4 vision features)
  • Does not guarantee the same model for every request

Use Cases

Ideal for scenarios where you’re unsure which model to use, or want the system to automatically optimize model selection.
Simply set the model parameter to evolink/auto, and the system will automatically select a suitable model for you.
Alternative URL: For long-running tasks, you can switch the BaseURL to https://direct.evolink.ai, which is optimized for extended operations.
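
The curl request at the top of this page can be reproduced with only the Python standard library; the sketch below builds the same payload and request object (the actual send is commented out because it requires a valid API key):

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # replace with a key from the API Key Management Page

payload = {
    "model": "evolink/auto",  # enables smart routing
    "messages": [
        {"role": "user", "content": "Introduce the history of artificial intelligence"}
    ],
    "temperature": 0.7,
    "stream": False,
}

req = urllib.request.Request(
    "https://api.evolink.ai/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# Uncomment to actually send the request:
# with urllib.request.urlopen(req) as resp:
#     body = json.loads(resp.read())
#     print(body["model"])  # the model the router actually used
```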

Authorizations

Authorization
string
header
required

All endpoints require Bearer Token authentication

Get your API Key:

Visit the API Key Management Page to get your API Key

Add the following to your request headers:

Authorization: Bearer YOUR_API_KEY
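
In client code, the header can be built once and reused across endpoints; a minimal helper (the `auth_headers` name is our own, not part of the API):

```python
def auth_headers(api_key: str) -> dict:
    """Build the headers every EvoLink endpoint expects.

    `api_key` comes from the API Key Management Page.
    """
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

print(auth_headers("sk-test")["Authorization"])
# Bearer sk-test
```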

Body

application/json
model
enum<string>
default:evolink/auto
required

Use smart routing

Available options:
evolink/auto
Example:

"evolink/auto"

messages
object[]
required

List of conversation messages

Minimum array length: 1
Example:
[
  {
    "role": "user",
    "content": "Introduce the history of artificial intelligence"
  }
]
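
Multi-turn conversations are expressed by appending earlier turns to the array, oldest first. The sketch below uses the "assistant" role, which follows the common chat-completions convention (this page only documents a single "user" turn):

```python
# A conversation is a list of role/content objects, oldest first.
# The "assistant" role for prior model replies follows the common
# chat-completions convention; this page shows only "user" turns.
messages = [
    {"role": "user", "content": "Introduce the history of artificial intelligence"},
    {"role": "assistant", "content": "The history of AI dates back to the 1950s..."},
    {"role": "user", "content": "Focus on the 1956 Dartmouth workshop."},
]

assert len(messages) >= 1  # the API requires at least one message
```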
temperature
number

Sampling temperature, controls the randomness of the output

Notes:

  • Lower values (e.g., 0.2): More deterministic and focused output
  • Higher values (e.g., 1.5): More random and creative output
Required range: 0 <= x <= 2
Example:

0.7
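
Temperature rescales the model's next-token probability distribution before sampling. The provider applies this inside the model; the self-contained sketch below only illustrates the effect:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply softmax.

    Lower temperature sharpens the distribution toward the top token;
    higher temperature flattens it toward uniform.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
low = softmax_with_temperature(logits, 0.2)
high = softmax_with_temperature(logits, 1.5)
print(round(low[0], 3), round(high[0], 3))
# low[0] is close to 1.0 (near-deterministic); high[0] is much smaller
```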

top_p
number

Nucleus Sampling parameter

Notes:

  • Controls sampling from the top tokens by cumulative probability
  • For example, 0.9 means sampling from tokens whose cumulative probability reaches 90%
  • Default: 1.0 (considers all tokens)

Recommendation: Do not adjust both temperature and top_p simultaneously

Required range: 0 <= x <= 1
Example:

0.9
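
The cumulative-probability cutoff described above can be sketched in a few lines; this illustrates nucleus sampling's filtering step, not EvoLink's internal implementation:

```python
def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability >= p.

    `probs` maps token -> probability. Illustration of the nucleus
    sampling cutoff, not the provider's internal code.
    """
    kept, cumulative = {}, 0.0
    for token, prob in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(kept.values())
    return {t: pr / total for t, pr in kept.items()}  # renormalize

print(sorted(top_p_filter({"a": 0.5, "b": 0.4, "c": 0.1}, 0.9)))
# ['a', 'b']  -- "c" falls outside the 90% nucleus
```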

top_k
integer

Top-K sampling parameter

Notes:

  • For example, 10 means only the top 10 highest-probability tokens are considered during each sampling step
  • Smaller values make the output more focused
  • No limit by default
Required range: x >= 1
Example:

40
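
Top-K is the simpler cousin of top_p: it keeps a fixed number of candidates rather than a probability mass. Again, an illustration only; the provider applies this inside the model:

```python
def top_k_filter(probs, k):
    """Keep only the k highest-probability tokens, renormalized.

    Illustrative only; not the provider's internal code.
    """
    kept = dict(sorted(probs.items(), key=lambda kv: -kv[1])[:k])
    total = sum(kept.values())
    return {t: p / total for t, p in kept.items()}

print(sorted(top_k_filter({"a": 0.5, "b": 0.3, "c": 0.2}, 2)))
# ['a', 'b']  -- only the top 2 tokens survive
```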

stream
boolean
default:false

Whether to return the response in streaming mode

  • true: Stream response, returning content in real-time chunks
  • false: Wait for the complete response before returning
Example:

false
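
With stream set to true, chat-completions-style APIs typically deliver Server-Sent Events lines of the form `data: {json}` terminated by `data: [DONE]`. This page does not document EvoLink's exact chunk schema, so the parser below is a convention-based sketch that assumes `choices[0].delta.content` deltas:

```python
import json

def iter_stream_content(lines):
    """Yield content fragments from SSE-style chat-completion chunks.

    Assumes the common `data: {...}` / `data: [DONE]` wire format with
    `choices[0].delta.content` deltas -- a convention-based sketch, since
    this page does not document the exact chunk schema.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip keep-alives and blank separator lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]

sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    'data: [DONE]',
]
print("".join(iter_stream_content(sample)))
# Hello
```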

Response

Request successful

id
string

Unique identifier for the chat completion

Example:

"chatcmpl-20260308112637503180122ABCD1234"

model
string

The model actually used

Example:

"gpt-5.4"

object
enum<string>

Response type

Available options:
chat.completion
Example:

"chat.completion"

created
integer

Creation timestamp

Example:

1741428397

choices
object[]

List of generated choices

usage
object

Token usage statistics
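
The fields above can be read straight off the example response at the top of this page; a short sketch that extracts the routed model and checks the usage arithmetic (total_tokens = prompt_tokens + completion_tokens):

```python
import json

# The example response from this page, parsed as a client would receive it.
response = json.loads("""
{
  "id": "chatcmpl-20260308112637503180122ABCD1234",
  "model": "gpt-5.4",
  "object": "chat.completion",
  "created": 1741428397,
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The history of artificial intelligence dates back to the 1950s..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 15, "completion_tokens": 120, "total_tokens": 135}
}
""")

usage = response["usage"]
# total_tokens is the sum of prompt and completion tokens
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
print(response["model"], usage["total_tokens"])
# gpt-5.4 135
```

Because `model` reports the model the router actually selected, logging it alongside usage is the easiest way to track which models `evolink/auto` is choosing for your workload.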