Authorizations
##All APIs require Bearer Token authentication##
Get API Key:
Visit API Key Management Page to get your API Key
Add to request header:
Authorization: Bearer YOUR_API_KEYBody
Chat model name
gemini-2.5-flash "gemini-2.5-flash"
List of chat messages, supports multi-turn dialogue and multimodal input
1Whether to return response in streaming mode
true: Streaming return, receives content in real-time chunksfalse: Returns complete response at once
false
Maximum number of tokens for generated response
Description:
- Too small value may cause response truncation
x >= 12000
Sampling temperature, controls output randomness
Description:
- Lower values (e.g., 0.2): More deterministic, focused output
- Higher values (e.g., 1.5): More random, creative output
0 <= x <= 20.7
Nucleus Sampling parameter
Description:
- Controls sampling from tokens with cumulative probability
- For example, 0.9 means selecting from tokens with cumulative probability up to 90%
- Default: 1.0 (considers all tokens)
Recommendation: Do not adjust temperature and top_p simultaneously
0 <= x <= 10.9
Top-K sampling parameter
Description:
- For example, 10 means limiting sampling to consider only the top 10 most probable tokens
- Smaller values make output more focused
- Default: no limit
x >= 140
Response
Chat completion generated successfully
Unique identifier for the chat completion
"chatcmpl-20251010015944503180122WJNB8Eid"
Model name actually used
"gemini-2.5-flash"
Response type
chat.completion "chat.completion"
Creation timestamp
1760032810
List of chat completion choices
Token usage statistics