Entropic 2.3.8
Local-first agentic inference engine
entropic::SecondaryModelLoader Class Reference

Role-keyed lifecycle manager for non-primary models.

#include <entropic/inference/secondary_model_loader.h>

Public Member Functions

bool ensure_loaded (const std::string &role, const ModelConfig &config)
 Lazily load and activate a model for a role.
 
InferenceBackend * get (const std::string &role) const
 Get the backend for a role.
 
std::shared_ptr< InferenceBackend > get_shared (const std::string &role) const
 Get the backend for a role as a shared_ptr.
 
bool release_role (const std::string &role)
 Unload and drop a role.
 
bool is_loaded (const std::string &role) const
 Check whether a role is currently loaded and active.
 
std::vector< std::string > loaded_roles () const
 Names of all roles with a currently-loaded backend.
 
void clear_all_prompt_caches ()
 Fanout: clear prompt/KV cache on every loaded backend.
 
void shutdown ()
 Unload every role.
 

Detailed Description

Role-keyed lifecycle manager for non-primary models.

Replaces the per-role std::shared_ptr<InferenceBackend> members (router_) that previously lived directly on ModelOrchestrator. The router refactor is intentionally invisible to callers: existing router behavior is preserved via loader_.get("router").

Role names (conventional, not enforced):
  • "router" — digit-classifier model used by ModelOrchestrator::route()
  • "draft" — speculative-decoding draft (v2.1.11+)
  • "thinking" — future thinking-model slot (gh#25)
Version
2.1.11

Definition at line 55 of file secondary_model_loader.h.

Member Function Documentation

◆ clear_all_prompt_caches()

void entropic::SecondaryModelLoader::clear_all_prompt_caches ( )

Fanout: clear prompt/KV cache on every loaded backend.

Used by ModelOrchestrator::clear_all_prompt_caches() so the router and draft caches invalidate alongside the main pool when identity content changes (P1-7, v2.0.6-rc16 contract).

Version
2.1.11

Definition at line 147 of file secondary_model_loader.cpp.
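
The fanout described above can be pictured as a loop over the role map that clears each loaded backend's cache. The following is a minimal sketch under stated assumptions, not the library's implementation: SketchBackend, its cached_tokens field, and the free function sketch_clear_all_prompt_caches are hypothetical stand-ins, and the return count exists only so the behavior can be checked (the documented method returns void).

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for InferenceBackend; only what this sketch needs.
struct SketchBackend {
    bool loaded = false;     // false models a cold (not loaded) slot
    int cached_tokens = 0;   // pretend prompt/KV cache size
    void clear_prompt_cache() { cached_tokens = 0; }
};

// Clears the cache on every loaded backend and returns how many were cleared.
// Assumption: cold slots are skipped, matching "every loaded backend".
int sketch_clear_all_prompt_caches(std::map<std::string, SketchBackend>& roles) {
    int cleared = 0;
    for (auto& entry : roles) {
        if (!entry.second.loaded) {
            continue;  // nothing to invalidate on a cold slot
        }
        entry.second.clear_prompt_cache();
        ++cleared;
    }
    return cleared;
}
```

In this sketch a role that is registered but not loaded keeps whatever state it had; whether the real loader can hold such a slot is not stated on this page.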

◆ ensure_loaded()

bool entropic::SecondaryModelLoader::ensure_loaded ( const std::string &  role,
const ModelConfig &  config 
)

Lazily load and activate a model for a role.

If the role is already loaded against the same config path, this is a no-op (idempotent). If the role is loaded against a different path, the existing backend is unloaded first.

Parameters
role    Role name (e.g. "router", "draft").
config  ModelConfig for the secondary model.
Returns
true on successful activation, false on failure.
Version
2.1.11

Definition at line 34 of file secondary_model_loader.cpp.
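
The idempotency rule above (same path is a no-op, different path unloads first) can be sketched as follows. This is an illustration under assumptions, not the code in secondary_model_loader.cpp: SketchConfig with a single path field, SketchBackend, SketchLoader, and load_attempts() are all hypothetical, and a "load" here succeeds whenever the path is non-empty.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins; the real ModelConfig and InferenceBackend are not shown here.
struct SketchConfig {
    std::string path;
};

struct SketchBackend {
    std::string path;
    bool loaded = false;
    // Pretend load: succeeds for any non-empty path.
    bool load(const std::string& p) { path = p; loaded = !p.empty(); return loaded; }
    void unload() { loaded = false; }
};

class SketchLoader {
public:
    // Returns true once the role is active against config.path.
    bool ensure_loaded(const std::string& role, const SketchConfig& config) {
        auto it = roles_.find(role);
        if (it != roles_.end()) {
            if (it->second->loaded && it->second->path == config.path) {
                return true;        // same path: idempotent no-op
            }
            it->second->unload();   // different path: unload the old backend first
            roles_.erase(it);
        }
        auto backend = std::make_shared<SketchBackend>();
        ++load_attempts_;
        if (!backend->load(config.path)) {
            return false;           // a failed load is not registered
        }
        roles_[role] = std::move(backend);
        return true;
    }

    // Number of load attempts made so far, successful or not.
    int load_attempts() const { return load_attempts_; }

private:
    std::map<std::string, std::shared_ptr<SketchBackend>> roles_;
    int load_attempts_ = 0;
};
```

Calling ensure_loaded twice with the same role and path leaves load_attempts() at 1 in this sketch; switching the path triggers a second attempt.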

◆ get()

InferenceBackend * entropic::SecondaryModelLoader::get ( const std::string &  role) const

Get the backend for a role.

Parameters
role    Role name.
Returns
Backend pointer if the role is loaded, nullptr otherwise (including when the role is unknown).
Version
2.1.11

Definition at line 67 of file secondary_model_loader.cpp.

◆ get_shared()

std::shared_ptr< InferenceBackend > entropic::SecondaryModelLoader::get_shared ( const std::string &  role) const

Get the backend for a role as a shared_ptr.

Used when callers need to extend backend lifetime beyond the loader (e.g. holding a reference for the duration of a long generation while the loader could otherwise be released).

Parameters
role    Role name.
Returns
Backend shared_ptr; empty if the role is unknown or not loaded.
Version
2.1.11

Definition at line 80 of file secondary_model_loader.cpp.
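
The lifetime point above can be illustrated with plain std::shared_ptr semantics: a caller that holds the shared_ptr keeps the object alive after the loader drops its own reference, while a raw pointer from get() offers no such guarantee. The sketch below uses hypothetical stand-ins (SketchBackend, SketchLoader, put()) and assumes release_role only drops the loader's reference; whether the real release_role also unloads model state while a shared_ptr is still outstanding is not stated on this page.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for InferenceBackend.
struct SketchBackend {
    std::string name;
};

class SketchLoader {
public:
    // Test helper (hypothetical): register a backend under a role.
    void put(const std::string& role) {
        roles_[role] = std::make_shared<SketchBackend>(SketchBackend{role});
    }

    // Non-owning access: valid only while the loader still holds the role.
    SketchBackend* get(const std::string& role) const {
        auto it = roles_.find(role);
        return it == roles_.end() ? nullptr : it->second.get();
    }

    // Owning access: the caller shares ownership with the loader.
    std::shared_ptr<SketchBackend> get_shared(const std::string& role) const {
        auto it = roles_.find(role);
        if (it == roles_.end()) {
            return {};  // empty shared_ptr for an unknown role
        }
        return it->second;
    }

    // Drops the loader's reference; returns false if the role was absent.
    bool release_role(const std::string& role) {
        return roles_.erase(role) > 0;
    }

private:
    std::map<std::string, std::shared_ptr<SketchBackend>> roles_;
};
```

A long-running generation would take the shared_ptr once at the start and use it throughout, rather than calling get() repeatedly and risking a dangling pointer if the role is released in between.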

◆ is_loaded()

bool entropic::SecondaryModelLoader::is_loaded ( const std::string &  role) const

Check whether a role is currently loaded and active.

Parameters
role    Role name.
Returns
true if get(role) != nullptr and the backend reports is_loaded() (state != COLD).
Version
2.1.11

Definition at line 117 of file secondary_model_loader.cpp.

◆ loaded_roles()

std::vector< std::string > entropic::SecondaryModelLoader::loaded_roles ( ) const

Names of all roles with a currently-loaded backend.

Names of all loaded roles (sorted for deterministic output).

Returns
Sorted list of role names whose backend reports is_loaded() (state is non-COLD).
Version
2.1.11

Definition at line 129 of file secondary_model_loader.cpp.
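
A minimal sketch of the filter-then-sort behavior described above, assuming the roles live in an unordered container so that an explicit sort is what makes the output deterministic. SketchBackend and sketch_loaded_roles are hypothetical names; the container the real loader uses is not shown on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for InferenceBackend; loaded == false models a COLD backend.
struct SketchBackend {
    bool loaded = false;
};

// Collects the names of loaded roles and sorts them lexicographically.
std::vector<std::string> sketch_loaded_roles(
        const std::unordered_map<std::string, SketchBackend>& roles) {
    std::vector<std::string> names;
    for (const auto& entry : roles) {
        if (entry.second.loaded) {
            names.push_back(entry.first);
        }
    }
    std::sort(names.begin(), names.end());  // iteration order above is unspecified
    return names;
}
```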

◆ release_role()

bool entropic::SecondaryModelLoader::release_role ( const std::string &  role)

Unload and drop a role.

Parameters
role    Role name.
Returns
true if a role was unloaded, false if it was not loaded.
Version
2.1.11

Definition at line 95 of file secondary_model_loader.cpp.

◆ shutdown()

void entropic::SecondaryModelLoader::shutdown ( )

Unload every role.

Mirrors ModelOrchestrator::shutdown(); called during engine teardown. Safe to call repeatedly.

Version
2.1.11

Definition at line 159 of file secondary_model_loader.cpp.


The documentation for this class was generated from the following files: