Entropic 2.3.8
Local-first agentic inference engine
entropic::ThroughputTracker Class Reference

EWMA-based throughput tracker for generation budgeting.

#include <entropic/inference/throughput_tracker.h>

Public Member Functions

void record (int tokens_generated, int64_t elapsed_ms)
 Record a completed generation sample.
 
double tok_per_sec () const
 Current smoothed throughput estimate.
 
int64_t predict_ms (int token_count) const
 Predict wall-clock time for generating N tokens.
 
int recommend_tokens (int64_t time_budget_ms, float headroom=0.9f, int floor=64) const
 Recommend max_tokens to fit within a time budget.
 
int sample_count () const
 Number of recorded samples.
 
void reset ()
 Reset all throughput data.
 

Detailed Description

EWMA-based throughput tracker for generation budgeting.

Single concrete class (no three-layer hierarchy). Records tok/s samples from completed generations and provides smoothed estimates for auto-adaptation of max_tokens.

Version
1.9.7

Definition at line 43 of file throughput_tracker.h.

Member Function Documentation

◆ predict_ms()

int64_t entropic::ThroughputTracker::predict_ms ( int  token_count) const

Predict wall-clock time for generating N tokens.

Parameters
token_count: Desired token count.
Returns
Predicted milliseconds. 0 if no throughput data.
Version
1.9.7

Definition at line 69 of file throughput_tracker.cpp.
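The prediction described above amounts to dividing the token count by the smoothed rate. A minimal sketch, assuming a free function that takes the rate as an argument (the real method reads the tracker's own state) and assuming round-to-nearest; `predict_ms_sketch` is a hypothetical name, and the zero-token guard is an added assumption:

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for ThroughputTracker::predict_ms().
// tok_per_sec is the smoothed estimate; 0.0 means "no throughput data".
std::int64_t predict_ms_sketch(double tok_per_sec, int token_count) {
    // Documented contract: return 0 when there is no throughput data.
    // Returning 0 for a non-positive token count is an assumption.
    if (tok_per_sec <= 0.0 || token_count <= 0) {
        return 0;
    }
    // tokens / (tokens per second) = seconds; scale to milliseconds.
    const double ms = 1000.0 * static_cast<double>(token_count) / tok_per_sec;
    // Rounding mode is an assumption; the real code may truncate instead.
    return static_cast<std::int64_t>(std::llround(ms));
}
```

At a smoothed rate of 50 tok/s, `predict_ms_sketch(50.0, 200)` yields 4000 ms.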

◆ recommend_tokens()

int entropic::ThroughputTracker::recommend_tokens ( int64_t  time_budget_ms,
float  headroom = 0.9f,
int  floor = 64 
) const

Recommend max_tokens to fit within a time budget.

Parameters
time_budget_ms: Available wall-clock time in milliseconds.
headroom: Fraction of the budget to target (e.g., 0.9 = 90%).
floor: Minimum token count to return (never recommend fewer).
Returns
Recommended max_tokens. Returns floor if no throughput data.
Version
1.9.7

Definition at line 87 of file throughput_tracker.cpp.
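One plausible reading of the recommendation: scale the budget by the headroom, convert it to tokens at the current rate, and clamp to the floor. The sketch below assumes exactly that; `recommend_tokens_sketch` is a hypothetical free function, and truncation toward zero is an assumption about rounding:

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for ThroughputTracker::recommend_tokens().
// tok_per_sec is passed in; the real method uses the tracker's EWMA.
int recommend_tokens_sketch(double tok_per_sec, std::int64_t time_budget_ms,
                            float headroom = 0.9f, int floor = 64) {
    // Documented contract: return floor when there is no throughput data.
    if (tok_per_sec <= 0.0) {
        return floor;
    }
    // Only a fraction of the budget is targeted, leaving a safety margin.
    const double usable_sec =
        static_cast<double>(time_budget_ms) * headroom / 1000.0;
    // Truncation is an assumption; it errs on the side of fewer tokens.
    const int fit = static_cast<int>(tok_per_sec * usable_sec);
    // Never recommend fewer than floor tokens.
    return std::max(floor, fit);
}
```

For example, at 50 tok/s with a 10000 ms budget and a headroom of 0.5, the sketch returns 250; with a 100 ms budget it falls back to the floor of 64.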

◆ record()

void entropic::ThroughputTracker::record ( int  tokens_generated,
int64_t  elapsed_ms 
)

Record a completed generation sample.

Parameters
tokens_generated: Number of tokens produced.
elapsed_ms: Wall-clock generation time in milliseconds.

Updates the EWMA. Ignores samples with fewer than kMinTokens tokens or elapsed_ms <= 0 (degenerate generations).

Version
1.9.7

Definition at line 27 of file throughput_tracker.cpp.
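The update rule can be illustrated with a standard exponentially weighted moving average. This is a sketch under stated assumptions, not the library's code: the values of `kMinTokens` and the smoothing factor `kAlpha` are invented for illustration (this reference does not state them), and seeding the average with the first sample is also an assumption.

```cpp
#include <cstdint>

// Hypothetical values; the real kMinTokens and smoothing factor are not
// given in this reference.
constexpr int kMinTokens = 8;
constexpr double kAlpha = 0.3;

// Hypothetical stand-in for the tracker's state and record() method.
struct EwmaSketch {
    double tok_per_sec = 0.0;  // smoothed estimate; 0.0 until first sample
    int samples = 0;           // number of accepted samples

    void record(int tokens_generated, std::int64_t elapsed_ms) {
        // Documented contract: ignore degenerate generations.
        if (tokens_generated < kMinTokens || elapsed_ms <= 0) {
            return;
        }
        const double sample =
            1000.0 * tokens_generated / static_cast<double>(elapsed_ms);
        if (samples == 0) {
            tok_per_sec = sample;  // assumption: seed with the first sample
        } else {
            // Blend the new sample into the running estimate.
            tok_per_sec = kAlpha * sample + (1.0 - kAlpha) * tok_per_sec;
        }
        ++samples;
    }
};
```

With these assumed constants, recording 100 tokens in 1000 ms and then 200 tokens in 1000 ms moves the estimate from 100 tok/s to roughly 130 tok/s; a 4-token sample in between would be ignored.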

◆ reset()

void entropic::ThroughputTracker::reset ( )

Reset all throughput data.

Version
1.9.7

Definition at line 115 of file throughput_tracker.cpp.

◆ sample_count()

int entropic::ThroughputTracker::sample_count ( ) const

Number of recorded samples.

Returns
Sample count (lock-free read).
Version
1.9.7

Definition at line 106 of file throughput_tracker.cpp.
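A lock-free read of a counter is commonly built on `std::atomic`. Whether ThroughputTracker does so is not stated here, so the following is only an illustration of the pattern; `SampleCounterSketch` and its member names are hypothetical:

```cpp
#include <atomic>

// Hypothetical counter showing how a sample count can be read without
// taking a mutex. Not the library's actual implementation.
class SampleCounterSketch {
public:
    // Called by the writer after a sample has been accepted.
    void on_sample() { count_.fetch_add(1, std::memory_order_relaxed); }

    // Readers load the value atomically; no lock is acquired.
    int sample_count() const {
        return count_.load(std::memory_order_relaxed);
    }

    // Clears the counter, mirroring what a reset() would need to do.
    void reset() { count_.store(0, std::memory_order_relaxed); }

private:
    std::atomic<int> count_{0};
};
```

Relaxed ordering is enough when the count is only used as a standalone statistic; a caller that needs the count to be consistent with other tracker state would need stronger synchronization.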

◆ tok_per_sec()

double entropic::ThroughputTracker::tok_per_sec ( ) const

Current smoothed throughput estimate.

Returns
Tokens per second (EWMA). 0.0 if no samples recorded.
Version
1.9.7

Definition at line 58 of file throughput_tracker.cpp.


The documentation for this class was generated from the following files: