Entropic 2.3.8
Local-first agentic inference engine
entropic::PromptCache Class Reference

Host-memory KV cache with LRU eviction.

#include </home/runner/work/entropic/entropic/src/inference/prompt_cache.h>

Public Member Functions

 PromptCache (size_t max_bytes)
 Construct with maximum RAM budget.
 
bool store (const CacheKey &key, std::vector< uint8_t > &&data, int token_count)
 Store a KV cache snapshot.
 
const CacheEntry * lookup (const CacheKey &key)
 Retrieve a cached KV snapshot.
 
void clear ()
 Evict all entries.
 
size_t bytes_used () const
 Current total bytes consumed by cached entries.
 
size_t entry_count () const
 Number of cached entries.
 
CacheStats stats () const
 Cache hit/miss statistics.
 

Static Public Member Functions

static CacheKey make_key (std::string_view prompt_text, std::string_view model_path)
 Compute a cache key from prompt text and model path.
 

Detailed Description

Host-memory KV cache with LRU eviction.

Stores KV cache snapshots keyed by a content hash of the system prompt text concatenated with the model path. Least-recently-used entries are evicted when a new entry would push usage past the configured RAM limit.

Thread safety
All public methods acquire mutex_. The expensive llama.cpp calls (llama_decode, llama_state_seq_get/set_data) happen OUTSIDE the cache mutex in the caller (LlamaCppBackend).
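The locking discipline described here can be sketched as follows. This is a minimal illustration, not the Entropic implementation: `LockedStoreSketch`, `expensive_decode_stub`, and `get_or_decode` are hypothetical names, the stub stands in for the llama.cpp calls, and the sketch returns copies rather than the pointer that the real lookup() returns.

```cpp
#include <cstdint>
#include <mutex>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a cache whose public methods each take the mutex.
class LockedStoreSketch {
public:
    // Returns a copy so the caller never holds a reference into the map.
    std::optional<std::vector<std::uint8_t>> lookup_copy(std::uint64_t key) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = entries_.find(key);
        if (it == entries_.end()) return std::nullopt;
        return it->second;
    }

    void store(std::uint64_t key, std::vector<std::uint8_t>&& data) {
        std::lock_guard<std::mutex> lock(mutex_);
        entries_[key] = std::move(data);
    }

private:
    std::mutex mutex_;
    std::unordered_map<std::uint64_t, std::vector<std::uint8_t>> entries_;
};

// Placeholder for the slow work (decode plus state export in the real backend).
std::vector<std::uint8_t> expensive_decode_stub(std::uint64_t key) {
    return std::vector<std::uint8_t>(8, static_cast<std::uint8_t>(key & 0xFF));
}

// Caller-side pattern: the mutex is held only inside lookup_copy() and store(),
// never while the expensive work runs.
std::vector<std::uint8_t> get_or_decode(LockedStoreSketch& cache, std::uint64_t key) {
    if (auto hit = cache.lookup_copy(key)) return *hit;
    std::vector<std::uint8_t> snapshot = expensive_decode_stub(key);  // no lock held
    std::vector<std::uint8_t> copy = snapshot;
    cache.store(key, std::move(copy));
    return snapshot;
}
```

Two threads that miss on the same key may both run the expensive step in this sketch; that is the usual trade-off for keeping slow work outside the lock.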
Lifecycle
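PromptCache cache(max_bytes);
CacheKey key = PromptCache::make_key(prompt_text, model_path);
cache.store(key, std::move(data), token_count); // after system prompt decode; data is moved
const CacheEntry* entry = cache.lookup(key); // before next decode; nullptr on miss
cache.clear(); // on model unload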
Version
1.8.3

Definition at line 102 of file prompt_cache.h.

Constructor & Destructor Documentation

◆ PromptCache()

entropic::PromptCache::PromptCache ( size_t  max_bytes)
explicit

Construct with maximum RAM budget.

Parameters
max_bytes: Maximum total bytes across all cached entries. 0 = caching disabled.
Version
1.8.3

Definition at line 49 of file prompt_cache.cpp.

Member Function Documentation

◆ bytes_used()

size_t entropic::PromptCache::bytes_used ( ) const

Current total bytes consumed by cached entries.

Returns
Byte count.
Version
1.8.3

Definition at line 198 of file prompt_cache.cpp.

◆ clear()

void entropic::PromptCache::clear ( )

Evict all entries.

Called on model reload.

Version
1.8.3

Definition at line 183 of file prompt_cache.cpp.

◆ entry_count()

size_t entropic::PromptCache::entry_count ( ) const

Number of cached entries.

Returns
Entry count.
Version
1.8.3

Definition at line 209 of file prompt_cache.cpp.

◆ lookup()

const CacheEntry * entropic::PromptCache::lookup ( const CacheKey & key)

Retrieve a cached KV snapshot.

Parameters
key: Hash to look up.
Returns
Pointer to the cached entry if found, nullptr on miss. The pointer is valid until the next store() or clear() call. Updates LRU ordering on hit.
Version
1.8.3

On hit, moves the entry to the front of the LRU list and increments the hit counter. On miss, increments the miss counter.

Definition at line 154 of file prompt_cache.cpp.
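
The hit path described for lookup() is commonly built from a std::list plus an index of list iterators. The sketch below illustrates that technique only; `SketchEntry`, `LruIndexSketch`, and their members are hypothetical, and the actual containers in prompt_cache.cpp are not shown on this page.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical entry type; the real CacheEntry layout is an assumption here.
struct SketchEntry {
    std::uint64_t key;
    std::vector<std::uint8_t> data;
    int token_count;
};

class LruIndexSketch {
public:
    // Inserts at the front (most recently used), replacing any entry with the same key.
    void insert(std::uint64_t key, std::vector<std::uint8_t> data, int token_count) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            order_.erase(it->second);
            index_.erase(it);
        }
        order_.push_front(SketchEntry{key, std::move(data), token_count});
        index_[key] = order_.begin();
    }

    // On hit: count it and splice the node to the front. splice() relinks the
    // node without copying, so stored iterators and pointers stay valid.
    // On miss: count it and return nullptr.
    const SketchEntry* lookup(std::uint64_t key) {
        auto it = index_.find(key);
        if (it == index_.end()) {
            ++misses_;
            return nullptr;
        }
        ++hits_;
        order_.splice(order_.begin(), order_, it->second);
        return &*it->second;
    }

    std::uint64_t front_key() const { return order_.front().key; }
    std::size_t hits() const { return hits_; }
    std::size_t misses() const { return misses_; }

private:
    std::list<SketchEntry> order_;  // front = most recently used
    std::unordered_map<std::uint64_t, std::list<SketchEntry>::iterator> index_;
    std::size_t hits_ = 0;
    std::size_t misses_ = 0;
};
```

In this sketch a returned pointer becomes dangling once its entry is erased, which is consistent with the documented rule that the pointer is only valid until the next store() or clear().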

◆ make_key()

CacheKey entropic::PromptCache::make_key ( std::string_view  prompt_text,
std::string_view  model_path 
)
static

Compute a cache key from prompt text and model path.

Parameters
prompt_text: Full system prompt string.
model_path: Model file path string.
Returns
CacheKey with combined hash.
Version
1.8.3

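Concatenates prompt_text + '\0' + model_path and hashes the result with FNV-1a. The null separator marks the boundary between the two fields, so that, for example, ("ab", "c") and ("a", "bc") do not hash the same byte sequence.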

Definition at line 67 of file prompt_cache.cpp.
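
A minimal sketch of the hashing scheme described for make_key(), assuming the 64-bit FNV-1a variant (the page does not state the hash width). `SketchKey`, `fnv1a_append`, and `make_key_sketch` are hypothetical names, not part of the Entropic API.

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical stand-in for entropic::CacheKey; the real layout is not shown here.
struct SketchKey {
    std::uint64_t hash;
};

// Standard 64-bit FNV-1a parameters.
constexpr std::uint64_t kFnvOffsetBasis = 14695981039346656037ULL;
constexpr std::uint64_t kFnvPrime = 1099511628211ULL;

// FNV-1a step: XOR each byte into the state, then multiply by the prime.
// Taking the state as a parameter allows hashing several fields in sequence.
constexpr std::uint64_t fnv1a_append(std::uint64_t state, std::string_view bytes) {
    for (char c : bytes) {
        state ^= static_cast<unsigned char>(c);
        state *= kFnvPrime;
    }
    return state;
}

// Hashes prompt_text, a single '\0' byte, then model_path.
constexpr SketchKey make_key_sketch(std::string_view prompt_text,
                                    std::string_view model_path) {
    std::uint64_t h = kFnvOffsetBasis;
    h = fnv1a_append(h, prompt_text);
    h = fnv1a_append(h, std::string_view("\0", 1));  // field separator
    h = fnv1a_append(h, model_path);
    return SketchKey{h};
}
```

Without the separator, hashing "ab" then "c" feeds the same three bytes as hashing "a" then "bc", so the two pairs would be indistinguishable; with it, the byte streams differ.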

◆ stats()

CacheStats entropic::PromptCache::stats ( ) const

Cache hit/miss statistics.

Returns
Copy of current stats.
Version
1.8.3

Definition at line 220 of file prompt_cache.cpp.

◆ store()

bool entropic::PromptCache::store ( const CacheKey & key,
std::vector< uint8_t > &&  data,
int  token_count 
)

Store a KV cache snapshot.

Parameters
key: Hash of prompt content + model path.
data: Raw KV cache bytes from llama_state_seq_get_data (moved into the cache).
token_count: Number of prompt tokens this entry covers.
Returns
true if stored (older entries may be evicted to make room), false if the entry alone exceeds max_bytes and cannot be stored.
Version
1.8.3

If the entry size exceeds max_bytes, returns false without storing. Otherwise evicts LRU entries as needed and stores the new entry.

Definition at line 91 of file prompt_cache.cpp.
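
The store() policy (reject oversized entries, otherwise evict from the LRU end until the new entry fits) can be sketched as below. `ByteBudgetLruSketch` and its members are hypothetical. Treating max_bytes == 0 as "reject everything" and replacing an existing entry with the same key are assumptions of this sketch, not documented behavior.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>
#include <utility>
#include <vector>

class ByteBudgetLruSketch {
public:
    explicit ByteBudgetLruSketch(std::size_t max_bytes) : max_bytes_(max_bytes) {}

    // Returns false when caching is disabled or the entry alone exceeds the budget.
    bool store(std::uint64_t key, std::vector<std::uint8_t>&& data) {
        const std::size_t size = data.size();
        if (max_bytes_ == 0 || size > max_bytes_) return false;
        erase(key);  // assumption: a repeated key replaces the old entry
        // Evict least-recently-used entries (list back) until the new one fits.
        // The list cannot run empty here because size <= max_bytes_.
        while (bytes_used_ + size > max_bytes_) {
            erase(order_.back().key);
        }
        order_.push_front(Item{key, std::move(data)});
        index_[key] = order_.begin();
        bytes_used_ += size;
        return true;
    }

    bool contains(std::uint64_t key) const { return index_.count(key) != 0; }
    std::size_t bytes_used() const { return bytes_used_; }
    std::size_t entry_count() const { return index_.size(); }

private:
    struct Item {
        std::uint64_t key;
        std::vector<std::uint8_t> data;
    };

    // Removes one entry and returns its bytes to the budget; no-op if absent.
    void erase(std::uint64_t key) {
        auto it = index_.find(key);
        if (it == index_.end()) return;
        bytes_used_ -= it->second->data.size();
        order_.erase(it->second);
        index_.erase(it);
    }

    std::size_t max_bytes_;
    std::size_t bytes_used_ = 0;
    std::list<Item> order_;  // front = most recently used
    std::unordered_map<std::uint64_t, std::list<Item>::iterator> index_;
};
```

With a 10-byte budget, storing two 4-byte entries and then a 6-byte entry evicts only the oldest one, leaving exactly 10 bytes in use.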


The documentation for this class was generated from the following files: