All the knobs and dials to make AI models do exactly what you want.
array
string
string
Type: integer; range [1, context_length)
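temperature (number; range [0, 2]; default 1.0)
The creativity dial. This is probably the most important parameter you’ll use. Low values give focused, predictable output; high values give more varied output.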
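To make the dial concrete, here is a minimal sketch of how temperature is commonly applied: the model's raw scores are divided by the temperature before being turned into probabilities. The function name and the example scores are made up for illustration, and real providers may handle a temperature of 0 differently.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        # Assumption: treat 0 as greedy decoding (all mass on the top score).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
low = softmax_with_temperature(logits, 0.2)   # sharply favors the top token
high = softmax_with_temperature(logits, 2.0)  # spreads probability more evenly
```

With a low temperature the top token takes almost all of the probability; with a high one the alternatives get a real chance of being picked.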
stream (boolean; default false)
Get results as they’re generated instead of waiting for the full response.
top_p (number; range (0, 1]; default 1.0)
Alternative to temperature. Controls randomness by sampling only from the most probable next tokens whose combined probability adds up to top_p.
Use temperature OR top_p, not both. They don’t play well together.
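As a rough illustration of what top_p does, the sketch below keeps the most probable tokens until their combined probability reaches the threshold, then renormalizes. The function name and the probabilities are hypothetical; actual implementations work on full vocabularies and may break ties differently.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    # Renormalize so the surviving tokens sum to 1 again.
    return {i: probs[i] / total for i in kept}

probs = [0.5, 0.3, 0.15, 0.05]        # made-up next-token probabilities
nucleus = top_p_filter(probs, 0.75)   # keeps only tokens 0 and 1
```

Lower top_p values cut off the long tail of unlikely tokens, which is why it overlaps with what temperature already does.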
Type: integer; range [1, ∞)
frequency_penalty (number; range [-2, 2]; default 0)
Reduces word repetition. Positive values make the model avoid repeating words it’s already used.
presence_penalty (number; range [-2, 2]; default 0)
Encourages talking about new topics. Positive values push the model to explore different subjects.
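The two penalties are easy to mix up, so here is a small sketch of the commonly described scheme: the frequency penalty grows with every repeat of a token, while the presence penalty is a flat deduction once a token has appeared at all. The function name, token strings, and scores are illustrative assumptions, not any provider's actual implementation.

```python
from collections import Counter

def apply_penalties(logits, generated_tokens,
                    frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the scores of tokens that were already generated."""
    counts = Counter(generated_tokens)
    adjusted = dict(logits)
    for token, count in counts.items():
        if token in adjusted:
            adjusted[token] -= count * frequency_penalty  # scales with repeats
            adjusted[token] -= presence_penalty           # flat, applied once
    return adjusted

logits = {"cat": 2.0, "dog": 2.0, "fish": 2.0}  # made-up scores
history = ["cat", "cat", "dog"]
penalized = apply_penalties(logits, history,
                            frequency_penalty=0.5, presence_penalty=0.25)
```

Here "cat" (used twice) drops the most, "dog" (used once) drops a little, and "fish" is untouched. Negative values flip the sign and reward repetition instead.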
repetition_penalty (number; range (0, 2]; default 1.0)
Another way to fight repetition. Values > 1.0 discourage repeating, < 1.0 encourage it.
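One widely used formulation of this penalty is multiplicative rather than additive: scores of already-seen tokens are divided by the penalty when positive and multiplied by it when negative. The sketch below assumes that formulation; the function name and scores are hypothetical, and individual providers may implement it differently.

```python
def apply_repetition_penalty(logits, generated_tokens, repetition_penalty=1.0):
    """Scale the scores of already-generated tokens by the penalty."""
    adjusted = dict(logits)
    for token in set(generated_tokens):
        if token not in adjusted:
            continue
        score = adjusted[token]
        if score > 0:
            adjusted[token] = score / repetition_penalty  # shrink positive scores
        else:
            adjusted[token] = score * repetition_penalty  # push negatives lower
    return adjusted

logits = {"a": 2.0, "b": -1.0, "c": 2.0}  # made-up scores
discouraged = apply_repetition_penalty(logits, ["a", "b"], 2.0)
encouraged = apply_repetition_penalty(logits, ["a", "b"], 0.5)
```

A penalty of exactly 1.0 leaves every score unchanged, which matches the default.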
Type: string | array; default null
Tell the model when to shut up. Generation stops when it hits these sequences.
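The effect of stop sequences can be imitated client-side. This sketch accepts either a single string or a list, as the type above suggests, and cuts the text at the earliest match. The helper name is made up, and dropping the stop sequence itself from the output is an assumption; providers may differ on that detail.

```python
def truncate_at_stop(text, stop):
    """Cut text at the earliest occurrence of any stop sequence."""
    sequences = [stop] if isinstance(stop, str) else list(stop or [])
    cut = len(text)
    for seq in sequences:
        if not seq:
            continue  # ignore empty strings, which would match at position 0
        idx = text.find(seq)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

reply = truncate_at_stop("Hello there\nUser: hi again", ["\nUser:", "END"])
```

In this example `reply` ends right before the "\nUser:" marker, which is the typical way to stop a model from writing the next turn of a conversation on its own.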
integer
response_format (object; default null)
Force the model to output in specific formats.
{"type": "json_object"}: Forces valid JSON
{"type": "text"}: Regular text (default)

Type: array; default null
Give the model superpowers by defining functions it can call.
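For a feel of what defining a function looks like, here is a sketch that assumes the widely used JSON-schema style of tool definition (the same {"type": "function", "function": {...}} shape used for forcing a specific tool). The get_weather function, the registry, and the shape of the incoming call (a name plus a JSON string of arguments) are all hypothetical.

```python
import json

def get_weather(city: str) -> str:
    """Hypothetical local function the model may ask us to call."""
    return f"Sunny in {city}"

# Assumed schema shape: a name, a description, and JSON-schema parameters.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

REGISTRY = {"get_weather": get_weather}

def run_tool_call(name: str, arguments_json: str) -> str:
    """Dispatch a tool call whose arguments arrive as a JSON string."""
    if name not in REGISTRY:
        raise ValueError(f"unknown tool: {name}")
    return REGISTRY[name](**json.loads(arguments_json))

result = run_tool_call("get_weather", '{"city": "Paris"}')
```

The model never runs code itself: it only names a function and supplies arguments, and your application decides whether and how to execute it.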
Type: string | object; default "auto"
Control how the model uses tools.
"auto": Model decides when to use tools
"none": No tools allowed
{"type": "function", "function": {"name": "function_name"}}: Force a specific tool

Type: array; default []
Apply smart transformations to your messages before sending them to models.
"middle-out": Rearranges messages for better context utilization (especially useful for long conversations)

Type: array; default null
Specify fallback models in order of preference.
Type: object; default null
Fine-tune provider selection and behavior.
frequency_penalty and presence_penalty
top_k or repetition_penalty
response_format for structured output
top_k parameter
repetition_penalty over frequency penalties
top_k parameter
repetition_penalty
temperature: 3.0 becomes temperature: 2.0
top_p: 1.5 becomes top_p: 1.0
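The clamping shown above can be mimicked locally so you know what will actually be used. This is a minimal sketch that only caps the two upper bounds from the examples; the function name and the limits table are illustrative assumptions, not the service's actual code.

```python
# Upper bounds taken from the documented ranges for these two parameters.
UPPER_LIMITS = {"temperature": 2.0, "top_p": 1.0}

def clamp_params(params):
    """Return a copy of params with out-of-range values capped."""
    clamped = dict(params)
    for name, upper in UPPER_LIMITS.items():
        if name in clamped:
            clamped[name] = min(clamped[name], upper)
    return clamped

fixed = clamp_params({"temperature": 3.0, "top_p": 1.5, "stream": False})
```

Values already inside the range, and parameters not listed in the table, pass through unchanged.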
max_tokens: "100" ❌ (should be a number)
stream: "true" ❌ (should be boolean true)
Start with temperature and max_tokens, add complexity later
Higher max_tokens = higher costs
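The type mistakes flagged earlier (a quoted max_tokens, a quoted stream) can be caught before a request is ever sent. The checker below is a hypothetical helper covering only those two parameters, written as a sketch rather than a complete validator.

```python
def check_types(params):
    """Return a list of type problems for max_tokens and stream."""
    problems = []
    max_tokens = params.get("max_tokens")
    # bool is a subclass of int in Python, so rule it out explicitly.
    if max_tokens is not None and (
        isinstance(max_tokens, bool) or not isinstance(max_tokens, int)
    ):
        problems.append("max_tokens should be an integer")
    stream = params.get("stream")
    if stream is not None and not isinstance(stream, bool):
        problems.append("stream should be a boolean")
    return problems

bad = check_types({"max_tokens": "100", "stream": "true"})
good = check_types({"max_tokens": 100, "stream": True})
```

Here `bad` lists both problems and `good` comes back empty.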