Role-keyed lifecycle manager for non-primary models.

#include <entropic/inference/secondary_model_loader.h>

Public Member Functions
bool	ensure_loaded (const std::string &role, const ModelConfig &config)
	Lazily load and activate a model for a role.

InferenceBackend *	get (const std::string &role) const
	Get the backend for a role.

std::shared_ptr< InferenceBackend >	get_shared (const std::string &role) const
	Get the backend for a role as a shared_ptr.

bool	release_role (const std::string &role)
	Unload and drop a role.

bool	is_loaded (const std::string &role) const
	Check whether a role is currently loaded and active.

std::vector< std::string >	loaded_roles () const
	Names of all roles with a currently-loaded backend.

void	clear_all_prompt_caches ()
	Fanout: clear prompt/KV cache on every loaded backend.

void	shutdown ()
	Unload every role.

Detailed Description

Role-keyed lifecycle manager for non-primary models.

Replaces the per-role std::shared_ptr<InferenceBackend> members (router_) that previously lived directly on ModelOrchestrator. The router refactor is intentionally invisible to callers: existing router behavior is preserved via loader_.get("router").

Role names (conventional, not enforced):

"router" — digit-classifier model used by ModelOrchestrator::route()
"draft" — speculative-decoding draft (v2.1.11+)
"thinking" — future thinking-model slot (gh#25)

Version: 2.1.11

Definition at line 55 of file secondary_model_loader.h.

Member Function Documentation

◆ clear_all_prompt_caches()

void entropic::SecondaryModelLoader::clear_all_prompt_caches ( )

Fanout: clear prompt/KV cache on every loaded backend.

Used by ModelOrchestrator::clear_all_prompt_caches() so the router and draft caches invalidate alongside the main pool when identity content changes (P1-7, v2.0.6-rc16 contract).

Version: 2.1.11

Definition at line 147 of file secondary_model_loader.cpp.
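
The fanout described above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: ToyBackend, ToyLoader, the `loaded` and `cached_tokens` fields, and `clear_prompt_cache()` are all hypothetical stand-ins, and the toy method returns a count (the documented method returns void) only so the effect is easy to observe.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for InferenceBackend; field and method names are assumptions.
struct ToyBackend {
    bool loaded = true;
    int cached_tokens = 0;
    void clear_prompt_cache() { cached_tokens = 0; }
};

// Hypothetical stand-in for the loader: one backend per role name.
class ToyLoader {
public:
    std::map<std::string, std::shared_ptr<ToyBackend>> backends;

    // Visit every role and clear the cache of each backend that is loaded.
    // Returns how many backends were cleared (illustration only).
    int clear_all_prompt_caches() {
        int cleared = 0;
        for (auto& entry : backends) {
            const std::shared_ptr<ToyBackend>& backend = entry.second;
            if (backend && backend->loaded) {
                backend->clear_prompt_cache();
                ++cleared;
            }
        }
        return cleared;
    }
};
```

In this sketch a backend that is present but not loaded is skipped, which matches the "every loaded backend" wording above.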

◆ ensure_loaded()

bool entropic::SecondaryModelLoader::ensure_loaded	(	const std::string &	role,
		const ModelConfig &	config
	)

Lazily load and activate a model for a role.

If the role is already loaded against the same config path, this is a no-op (idempotent). If the role is loaded against a different path, the existing backend is unloaded first.

Parameters

role	Role name (e.g. `"router"`, `"draft"`).
config	ModelConfig for the secondary model.

Returns: true on successful activation, false on failure.

Version: 2.1.11

Definition at line 34 of file secondary_model_loader.cpp.
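
The idempotent-load behavior of ensure_loaded() could look something like the sketch below. Everything here is an assumption made for illustration: ToyConfig (with a `path` field), ToyBackend, ToyLoader, and the `load_attempts()` counter are hypothetical and do not reflect the real ModelConfig or InferenceBackend types.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for ModelConfig; the `path` field name is an assumption.
struct ToyConfig {
    std::string path;
};

// Hypothetical stand-in for InferenceBackend.
struct ToyBackend {
    std::string path;
    bool loaded = false;
    // Pretend that loading fails for an empty path.
    bool load(const std::string& p) { path = p; loaded = !p.empty(); return loaded; }
    void unload() { loaded = false; }
};

class ToyLoader {
public:
    // Returns true when the role ends up active, false when loading fails.
    bool ensure_loaded(const std::string& role, const ToyConfig& config) {
        auto it = backends_.find(role);
        if (it != backends_.end()) {
            if (it->second->loaded && it->second->path == config.path) {
                return true;  // same path already active: no-op
            }
            it->second->unload();  // different path: unload the old backend first
            backends_.erase(it);
        }
        auto backend = std::make_shared<ToyBackend>();
        ++load_attempts_;
        if (!backend->load(config.path)) {
            return false;
        }
        backends_[role] = std::move(backend);
        return true;
    }

    // Number of times a real load was attempted (illustration only).
    int load_attempts() const { return load_attempts_; }

private:
    std::map<std::string, std::shared_ptr<ToyBackend>> backends_;
    int load_attempts_ = 0;
};
```

Calling the toy `ensure_loaded("router", ...)` twice with the same path performs one load; calling it again with a different path unloads the first backend and performs a second load.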

◆ get()

InferenceBackend * entropic::SecondaryModelLoader::get ( const std::string & role ) const

Get the backend for a role.

Parameters

role	Role name.

Returns: Backend pointer if the role is loaded; nullptr otherwise, including when the role is unknown.

Version: 2.1.11

Definition at line 67 of file secondary_model_loader.cpp.

◆ get_shared()

std::shared_ptr< InferenceBackend > entropic::SecondaryModelLoader::get_shared ( const std::string & role ) const

Get the backend for a role as a shared_ptr.

Use this when a caller needs to extend the backend's lifetime beyond the loader's, e.g. by holding a reference for the duration of a long generation during which the loader could otherwise be released.

Parameters

role	Role name.

Returns: Backend shared_ptr; empty if the role is unknown or not loaded.

Version: 2.1.11

Definition at line 80 of file secondary_model_loader.cpp.
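
To illustrate why a shared_ptr accessor is useful next to the raw-pointer get(), here is a small sketch of lifetime extension. ToyBackend, ToyLoader, `add()`, and `read_after_release()` are hypothetical names invented for this example; only the shapes of get(), get_shared(), and release_role() are taken from the signatures documented on this page.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for InferenceBackend.
struct ToyBackend {
    std::string name;
};

class ToyLoader {
public:
    // Illustration-only helper that registers a backend for a role.
    void add(const std::string& role) {
        backends_[role] = std::make_shared<ToyBackend>(ToyBackend{role});
    }

    // Non-owning access: the pointer is only valid while the loader keeps the role.
    ToyBackend* get(const std::string& role) const {
        auto it = backends_.find(role);
        if (it == backends_.end()) return nullptr;
        return it->second.get();
    }

    // Owning access: the caller shares ownership with the loader.
    std::shared_ptr<ToyBackend> get_shared(const std::string& role) const {
        auto it = backends_.find(role);
        if (it == backends_.end()) return nullptr;
        return it->second;
    }

    // Drops the loader's reference; returns false if the role was not present.
    bool release_role(const std::string& role) {
        return backends_.erase(role) > 0;
    }

private:
    std::map<std::string, std::shared_ptr<ToyBackend>> backends_;
};

// Reads the backend through a shared_ptr after the loader has dropped the role.
std::string read_after_release() {
    ToyLoader loader;
    loader.add("draft");
    std::shared_ptr<ToyBackend> held = loader.get_shared("draft");
    loader.release_role("draft");
    // loader.get("draft") is now nullptr, but `held` still owns the backend.
    return held->name;
}
```

A raw pointer obtained from the toy `get()` before the release would dangle at the same point, which is the case the shared_ptr accessor is meant to cover.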

◆ is_loaded()

bool entropic::SecondaryModelLoader::is_loaded ( const std::string & role ) const

Check whether a role is currently loaded and active (non-COLD).

Parameters

role	Role name.

Returns: true if get(role) != nullptr and the backend's is_loaded() reports that it is loaded (state != COLD).

Version: 2.1.11

Definition at line 117 of file secondary_model_loader.cpp.

◆ loaded_roles()

std::vector< std::string > entropic::SecondaryModelLoader::loaded_roles ( ) const

Names of all roles with a currently-loaded backend, sorted for deterministic output.

Returns: Sorted list of role names whose backend reports is_loaded() (state is non-COLD).

Version: 2.1.11

Definition at line 129 of file secondary_model_loader.cpp.
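
The filter-then-sort contract of loaded_roles() can be illustrated with a free function over a toy role table. This is a sketch under stated assumptions: ToyState and its ACTIVE value are hypothetical (only COLD is named on this page), and the real method reads the loader's internal state rather than taking a map argument.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical state enum; only COLD appears in the documentation.
enum class ToyState { COLD, ACTIVE };

// Collect the names of all non-COLD roles, then sort them so the result does
// not depend on the container's iteration order.
std::vector<std::string> loaded_roles(
    const std::unordered_map<std::string, ToyState>& roles) {
    std::vector<std::string> names;
    for (const auto& entry : roles) {
        if (entry.second != ToyState::COLD) {
            names.push_back(entry.first);
        }
    }
    std::sort(names.begin(), names.end());
    return names;
}
```

Sorting matters here because an unordered container gives no ordering guarantee, so callers that log or compare the result would otherwise see it vary between runs.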

◆ release_role()

bool entropic::SecondaryModelLoader::release_role ( const std::string & role )

Unload and drop a role.

Parameters

role	Role name.

Returns: true if a role was unloaded, false if it was not loaded.

Version: 2.1.11

Definition at line 95 of file secondary_model_loader.cpp.

◆ shutdown()

void entropic::SecondaryModelLoader::shutdown ( )

Unload every role.

Mirrors ModelOrchestrator::shutdown() — called during engine teardown. Safe to call repeatedly.

Version: 2.1.11

Definition at line 159 of file secondary_model_loader.cpp.

The documentation for this class was generated from the following files:

entropic/inference/secondary_model_loader.h
inference/secondary_model_loader.cpp
