Entropic 2.3.8
Local-first agentic inference engine
entropic::GenerationParams Struct Reference

Generation parameters for a single inference call.

#include <entropic/types/config.h>

Public Attributes

float temperature = 0.7f
 Sampling temperature.
 
float top_p = 0.9f
 Nucleus sampling threshold.
 
int top_k = 40
 Top-K sampling.
 
float repeat_penalty = 1.1f
 Repetition penalty.
 
int max_tokens = 4096
 Maximum tokens to generate.
 
int seed = -1
 RNG seed for reproducible sampling.
 
int reasoning_budget = -1
 Per-call think budget override (-1 = unlimited)
 
bool enable_thinking = true
 Enable <think> blocks (false if reasoning_budget == 0)
 
std::string grammar
 GBNF grammar string (empty = unconstrained)
 
std::string grammar_key
 Grammar registry key.
 
std::vector< std::string > stop
 Stop sequences.
 
int logprobs = 0
 Top log-probs per token (0 = disabled)
 
int time_limit_ms = 0
 Wall-clock time cap in milliseconds.
 
std::string profile
 GPU resource profile name.
 
bool auto_adapt = true
 Enable throughput-based max_tokens auto-adaptation.
 
float adapt_headroom = 0.9f
 Target time usage fraction for auto-adaptation.
 

Detailed Description

Generation parameters for a single inference call.

Version
2.0.6-rc16 — added seed

Definition at line 227 of file config.h.
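
As a usage illustration, the sketch below fills in a parameter set for a short, reproducible, time-boxed call. To stay self-contained it declares a local stand-in struct that mirrors the fields and defaults listed on this page. Real code would include <entropic/types/config.h> and use entropic::GenerationParams directly. The namespace sketch and the helper make_short_reproducible_params are hypothetical names, not part of the library.

```cpp
#include <string>
#include <vector>

namespace sketch {

// Local stand-in mirroring the fields and defaults documented on this page.
// Not the library type: real code would use entropic::GenerationParams.
struct GenerationParams {
    float temperature = 0.7f;
    float top_p = 0.9f;
    int top_k = 40;
    float repeat_penalty = 1.1f;
    int max_tokens = 4096;
    int seed = -1;
    int reasoning_budget = -1;
    bool enable_thinking = true;
    std::string grammar;
    std::string grammar_key;
    std::vector<std::string> stop;
    int logprobs = 0;
    int time_limit_ms = 0;
    std::string profile;
    bool auto_adapt = true;
    float adapt_headroom = 0.9f;
};

// Hypothetical helper: start from the defaults and override a few fields.
inline GenerationParams make_short_reproducible_params() {
    GenerationParams p;       // documented defaults
    p.seed = 42;              // fixed seed for reproducible sampling
    p.max_tokens = 256;       // keep the answer short
    p.stop = {"\n\n"};        // stop at the first blank line
    p.time_limit_ms = 5000;   // cancel after five seconds of wall-clock time
    return p;
}

}  // namespace sketch
```

Fields that are not overridden keep their defaults, so auto_adapt remains enabled and becomes relevant here because time_limit_ms is non-zero.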

Member Data Documentation

◆ adapt_headroom

float entropic::GenerationParams::adapt_headroom = 0.9f

Target time usage fraction for auto-adaptation.

0.9 means "use at most 90% of time_limit_ms for generation".

Version
1.9.7

Definition at line 272 of file config.h.

◆ auto_adapt

bool entropic::GenerationParams::auto_adapt = true

Enable throughput-based max_tokens auto-adaptation.

When true, the orchestrator may reduce max_tokens to fit within time_limit_ms based on recent throughput measurements. Ignored if time_limit_ms == 0.

Version
1.9.7

Definition at line 267 of file config.h.
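
The interaction between auto_adapt, adapt_headroom and time_limit_ms can be pictured with the sketch below. The page only states that max_tokens may be reduced based on recent throughput, so the formula here is an assumption rather than the orchestrator's actual logic. AdaptInputs and adapted_max_tokens are hypothetical names, and the floor of one token is an arbitrary choice for the example.

```cpp
#include <algorithm>
#include <cmath>

namespace sketch {

// Stand-in for the four GenerationParams fields involved in adaptation.
struct AdaptInputs {
    int max_tokens = 4096;
    int time_limit_ms = 0;
    bool auto_adapt = true;
    float adapt_headroom = 0.9f;
};

// Hypothetical helper: returns a token cap that fits the time budget at the
// measured throughput. It only ever lowers max_tokens, never raises it.
inline int adapted_max_tokens(const AdaptInputs& p, double tokens_per_second) {
    // No adaptation when disabled, when there is no time limit, or when no
    // usable throughput measurement exists (the last case is an assumption).
    if (!p.auto_adapt || p.time_limit_ms <= 0 || tokens_per_second <= 0.0) {
        return p.max_tokens;
    }
    // Time budget: the adapt_headroom fraction of time_limit_ms, in whole ms.
    const long budget_ms =
        std::lround(static_cast<double>(p.time_limit_ms) * p.adapt_headroom);
    // Number of tokens expected to fit into that budget.
    const double fit =
        std::floor(static_cast<double>(budget_ms) * tokens_per_second / 1000.0);
    const int fit_tokens =
        fit >= p.max_tokens ? p.max_tokens : static_cast<int>(fit);
    return std::max(1, fit_tokens);  // assumption: never go below one token
}

}  // namespace sketch
```

With time_limit_ms = 10000 and adapt_headroom = 0.9f, the budget is 9000 ms. At a measured 20 tokens per second this sketch caps generation at 180 tokens, while a fast backend that fits more than max_tokens into the budget leaves the cap unchanged.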

◆ enable_thinking

bool entropic::GenerationParams::enable_thinking = true

Enable <think> blocks (false if reasoning_budget == 0)

Definition at line 239 of file config.h.
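
One possible reading of the note "false if reasoning_budget == 0" is that a zero budget disables thinking regardless of the flag. The sketch below encodes that reading. The function name effective_enable_thinking is hypothetical, and the library may apply the rule at a different point or in a different way.

```cpp
namespace sketch {

// Hypothetical helper: thinking is active only if the flag is set and the
// per-call budget is not zero. A budget of -1 means unlimited.
inline bool effective_enable_thinking(bool enable_thinking, int reasoning_budget) {
    return enable_thinking && reasoning_budget != 0;
}

}  // namespace sketch
```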

◆ grammar

std::string entropic::GenerationParams::grammar

GBNF grammar string (empty = unconstrained)

Definition at line 240 of file config.h.

◆ grammar_key

std::string entropic::GenerationParams::grammar_key

Grammar registry key.

Resolved to GBNF content by the orchestrator before being passed to the backend. If both grammar and grammar_key are set, grammar (the raw string) takes precedence.

Version
1.9.3

Definition at line 245 of file config.h.
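
The precedence rule between grammar and grammar_key can be sketched as follows. The registry type, the function name resolve_grammar, and the treatment of an unknown key (returning no value) are assumptions made for this example. The page does not describe the orchestrator's registry interface or its error handling.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

namespace sketch {

// Hypothetical registry: maps a grammar key to GBNF content.
using GrammarRegistry = std::unordered_map<std::string, std::string>;

// Returns the GBNF text to hand to the backend.
// An empty string means unconstrained generation.
// std::nullopt signals an unknown key (assumed behaviour).
inline std::optional<std::string> resolve_grammar(const std::string& grammar,
                                                  const std::string& grammar_key,
                                                  const GrammarRegistry& registry) {
    if (!grammar.empty()) {
        return grammar;  // the raw grammar string takes precedence
    }
    if (grammar_key.empty()) {
        return std::string{};  // neither field set: unconstrained
    }
    const auto it = registry.find(grammar_key);
    if (it == registry.end()) {
        return std::nullopt;  // key not registered
    }
    return it->second;
}

}  // namespace sketch
```

Returning std::optional keeps "unconstrained" (an empty string) distinguishable from "key not found", which a caller would likely want to report as an error.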

◆ logprobs

int entropic::GenerationParams::logprobs = 0

Top log-probs per token (0 = disabled)

Definition at line 247 of file config.h.

◆ max_tokens

int entropic::GenerationParams::max_tokens = 4096

Maximum tokens to generate.

Definition at line 232 of file config.h.

◆ profile

std::string entropic::GenerationParams::profile

GPU resource profile name.

Resolved to a GPUResourceProfile by the orchestrator before being passed to the backend. An empty string selects the "balanced" profile.

Version
1.9.7

Definition at line 260 of file config.h.

◆ reasoning_budget

int entropic::GenerationParams::reasoning_budget = -1

Per-call think budget override (-1 = unlimited)

Definition at line 238 of file config.h.

◆ repeat_penalty

float entropic::GenerationParams::repeat_penalty = 1.1f

Repetition penalty.

Definition at line 231 of file config.h.

◆ seed

int entropic::GenerationParams::seed = -1

RNG seed for reproducible sampling.

-1 = random (default). Any negative value maps to LLAMA_DEFAULT_SEED. (P2-14)

Version
2.0.6-rc16

Definition at line 237 of file config.h.
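
The mapping from the signed seed field to an unsigned backend seed might look like the sketch below. It assumes a llama.cpp-style backend, where LLAMA_DEFAULT_SEED is 0xFFFFFFFF and asks the sampler to pick a random seed. The constant is redefined locally so the example compiles without llama.h, and to_backend_seed is a hypothetical name.

```cpp
#include <cstdint>

namespace sketch {

// Local copy of the value llama.cpp uses for LLAMA_DEFAULT_SEED.
// Real code would include llama.h and use the macro itself.
constexpr std::uint32_t kBackendDefaultSeed = 0xFFFFFFFFu;

// Hypothetical helper: negative seeds request a random seed from the
// backend, non-negative seeds are passed through unchanged.
inline std::uint32_t to_backend_seed(int seed) {
    return seed < 0 ? kBackendDefaultSeed : static_cast<std::uint32_t>(seed);
}

}  // namespace sketch
```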

◆ stop

std::vector<std::string> entropic::GenerationParams::stop

Stop sequences.

Definition at line 246 of file config.h.

◆ temperature

float entropic::GenerationParams::temperature = 0.7f

Sampling temperature.

Definition at line 228 of file config.h.

◆ time_limit_ms

int entropic::GenerationParams::time_limit_ms = 0

Wall-clock time cap in milliseconds.

Generation is cancelled if this limit is reached. 0 = no time limit (default).

Version
1.9.7

Definition at line 254 of file config.h.
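
A wall-clock cap of this kind is commonly checked against a monotonic clock between decoding steps. The sketch below shows such a check under that assumption. time_limit_reached is a hypothetical helper, and the page does not say how or where the engine performs the check.

```cpp
#include <chrono>

namespace sketch {

using Clock = std::chrono::steady_clock;  // monotonic, unaffected by clock changes

// Hypothetical helper: true once the elapsed time since start has reached
// the limit. A limit of 0 (or less, by assumption) means no time limit.
inline bool time_limit_reached(Clock::time_point start,
                               Clock::time_point now,
                               int time_limit_ms) {
    if (time_limit_ms <= 0) {
        return false;
    }
    return now - start >= std::chrono::milliseconds(time_limit_ms);
}

}  // namespace sketch
```

A generation loop would record Clock::now() before the first token and call the helper after each decoded token, cancelling generation when it returns true.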

◆ top_k

int entropic::GenerationParams::top_k = 40

Top-K sampling.

Definition at line 230 of file config.h.

◆ top_p

float entropic::GenerationParams::top_p = 0.9f

Nucleus sampling threshold.

Definition at line 229 of file config.h.


The documentation for this struct was generated from the following file: