EvoLink Auto - Smart Model Routing
The system automatically selects the most suitable model to process the request
Smart Model Routing
EvoLink Auto is an intelligent model routing feature that automatically selects a suitable AI model based on your request content, so you do not need to specify a model manually.
Key Benefits
- Smart Matching: Automatically analyzes request content and selects a suitable model
- Cost Optimization: Prioritizes cost-effective models while maintaining quality
- Load Balancing: Automatically distributes requests across multiple models for improved stability
- Transparency: Returns the actual model name used in the response for tracking and optimization
How It Works
The system selects the best-fit model from the model pool based on request complexity, length, and type.
Supported Models
EvoLink Auto routes between mainstream AI models including GPT-4, GPT-3.5, Claude, Gemini, and more.
Limitations
- Not suitable for scenarios requiring specific model capabilities (e.g., GPT-4 vision features)
- Does not guarantee the same model for every request
Use Cases
Ideal for scenarios where you’re unsure which model to use, or want the system to automatically optimize model selection.
Set the model parameter to evolink/auto, and the system will automatically select a suitable model for you.
Text model requests are best sent to https://direct.evolink.ai, which has better support for text models and long-lived connections. https://api.evolink.ai is the primary endpoint for multimodal services and serves as a fallback address for text models.
Authorizations
All endpoints require Bearer Token authentication
Get your API Key:
Visit the API Key Management Page to get your API Key
Add the following to your request headers:
Authorization: Bearer YOUR_API_KEY
Body
Use smart routing
Allowed value: evolink/auto
Example: "evolink/auto"
List of conversation messages
Example:
[
  {
    "role": "user",
    "content": "Introduce the history of artificial intelligence"
  }
]
Sampling temperature, controls the randomness of the output
Notes:
- Lower values (e.g., 0.2): More deterministic and focused output
- Higher values (e.g., 1.5): More random and creative output
Range: 0 <= x <= 2
Example: 0.7
Nucleus Sampling parameter
Notes:
- Controls sampling from the top tokens by cumulative probability
- For example, 0.9 means sampling from tokens whose cumulative probability reaches 90%
- Default: 1.0 (considers all tokens)
Recommendation: Do not adjust both temperature and top_p simultaneously
Range: 0 <= x <= 1
Example: 0.9
Top-K sampling parameter
Notes:
- For example, 10 means only the top 10 highest-probability tokens are considered during each sampling step
- Smaller values make the output more focused
- No limit by default
Range: x >= 1
Example: 40
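The three sampling parameters described above can be illustrated on a toy distribution. The sketch below is not EvoLink's or any routed model's actual sampler: the function name filter_distribution, the example logits, and the order in which the filters are applied (temperature, then Top-K, then nucleus) are assumptions chosen for illustration, and real implementations may differ in detail.

```python
import math

def filter_distribution(logits, temperature=1.0, top_k=None, top_p=1.0):
    """Return the renormalized token distribution left after the three filters."""
    if temperature <= 0:
        # Treat temperature 0 as greedy decoding: all mass on the top token.
        best = max(logits, key=logits.get)
        return {best: 1.0}

    # Temperature scaling followed by a numerically stable softmax.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(
        ((tok, weight / total) for tok, weight in weights.items()),
        key=lambda item: item[1],
        reverse=True,
    )

    # Top-K: keep only the k most probable tokens.
    if top_k is not None:
        ranked = ranked[:top_k]

    # Nucleus (top_p): keep the smallest prefix whose cumulative
    # probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(prob for _, prob in kept)
    return {tok: prob / norm for tok, prob in kept}

# Hypothetical logits for four candidate tokens.
toy_logits = {"cat": 2.0, "dog": 1.0, "fish": 0.0, "bird": -1.0}
print(filter_distribution(toy_logits, top_k=2))
print(filter_distribution(toy_logits, top_p=0.9))
print(filter_distribution(toy_logits, temperature=0.2))
```

With these logits, a lower temperature concentrates probability on the top token, top_k=2 keeps only the two most likely tokens, and top_p=0.9 keeps the three tokens needed to cover 90% of the probability mass.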
Whether to return the response in streaming mode
true: Stream response, returning content in real-time chunks
false: Wait for the complete response before returning
Example: false
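The body parameters above can be assembled into a request along the following lines. This is a minimal sketch under stated assumptions: the path /v1/chat/completions and the field names messages, top_k, and stream are not given in this document and are assumed from the common chat-completion convention, and build_chat_request is a hypothetical helper. The range checks mirror the ranges listed above.

```python
import json
import urllib.request

BASE_URL = "https://direct.evolink.ai"   # address mentioned in this document
CHAT_PATH = "/v1/chat/completions"       # assumption: path is not stated here

def build_chat_request(api_key, user_text, temperature=None,
                       top_p=None, top_k=None, stream=False):
    """Build (but do not send) a chat request that uses smart routing."""
    # Validate against the documented ranges before building the body.
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must satisfy 0 <= x <= 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must satisfy 0 <= x <= 1")
    if top_k is not None and top_k < 1:
        raise ValueError("top_k must satisfy x >= 1")

    payload = {
        "model": "evolink/auto",  # lets the system pick the model
        "messages": [{"role": "user", "content": user_text}],  # assumed field name
        "stream": stream,         # assumed field name
    }
    # Only include sampling parameters the caller actually set.
    for name, value in (("temperature", temperature),
                        ("top_p", top_p),
                        ("top_k", top_k)):
        if value is not None:
            payload[name] = value

    return urllib.request.Request(
        BASE_URL + CHAT_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

request = build_chat_request(
    "YOUR_API_KEY",
    "Introduce the history of artificial intelligence",
    temperature=0.7,
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out so the sketch runs without a valid API key.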
Response
Request successful
Unique identifier for the chat completion
Example: "chatcmpl-20260308112637503180122ABCD1234"
The model actually used
Example: "gpt-5.4"
Response type
Allowed value: chat.completion
Example: "chat.completion"
Creation timestamp
Example: 1741428397
List of generated choices
Token usage statistics
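Because the routed model can change between requests, it can be useful to record which model served each response. The sketch below assumes the response is JSON with the field names id, model, object, created, choices, and usage; those names are not stated in this document and follow the common chat-completion convention. The helper record_routed_model is hypothetical, and the contents of choices and usage are left empty because their structure is not described here.

```python
import json
from collections import Counter

# Sample built from the example values listed above; field names are assumed.
sample_response = json.loads("""
{
  "id": "chatcmpl-20260308112637503180122ABCD1234",
  "model": "gpt-5.4",
  "object": "chat.completion",
  "created": 1741428397,
  "choices": [],
  "usage": {}
}
""")

def record_routed_model(tally, response):
    """Count which model actually served a response and return its name."""
    model = response.get("model", "unknown")
    tally[model] += 1
    return model

tally = Counter()
served_by = record_routed_model(tally, sample_response)
print(served_by)
print(dict(tally))
```

Accumulating the tally over many requests gives a rough picture of how traffic is distributed across models.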