Unified lifecycle for non-primary inference backends.

#include <entropic/inference/backend.h>
#include <entropic/types/config.h>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

Classes
class	entropic::SecondaryModelLoader
	Role-keyed lifecycle manager for non-primary models.

Namespaces
namespace	entropic
	Activate model on GPU (WARM → ACTIVE).

Detailed Description

Unified lifecycle for non-primary inference backends.

Owns role-keyed std::shared_ptr<InferenceBackend> slots for models that are not part of the main-tier pool: the router (always-ACTIVE classifier), the speculative draft (CPU- or GPU-resident, v2.1.11+), and the future thinking model (gh#25). Each role is loaded lazily on first use, is unloaded via release_role(), and survives independently of the main-tier swap path.
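The role-keyed, load-on-first-use behaviour described above can be illustrated with a minimal sketch. It is not the actual SecondaryModelLoader: DemoBackend, DemoRoleLoader, the Factory callback, the string role keys, and every signature shown are assumptions made for illustration. For simplicity the sketch's get() takes the mutex, which differs from the lock-free get() described under Thread safety.

```cpp
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for entropic::InferenceBackend.
struct DemoBackend {
    std::string model_path;
};

// Simplified sketch of a role-keyed loader; not the real SecondaryModelLoader.
class DemoRoleLoader {
public:
    // Assumed callback that constructs the backend for a role.
    using Factory = std::function<std::shared_ptr<DemoBackend>()>;

    // Load the backend for `role` on first use; later calls reuse the slot.
    std::shared_ptr<DemoBackend> ensure_loaded(const std::string& role,
                                               const Factory& make) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = slots_.find(role);
        if (it != slots_.end()) {
            return it->second;  // already loaded: the factory is not called
        }
        std::shared_ptr<DemoBackend> backend = make();
        slots_.emplace(role, backend);
        return backend;
    }

    // Drop the loader's reference; returns false if the role was not loaded.
    bool release_role(const std::string& role) {
        std::lock_guard<std::mutex> lock(mutex_);
        return slots_.erase(role) > 0;
    }

    // Returns nullptr when the role has not been loaded.
    // Takes the mutex for simplicity in this sketch.
    std::shared_ptr<DemoBackend> get(const std::string& role) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = slots_.find(role);
        if (it == slots_.end()) {
            return nullptr;
        }
        return it->second;
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<std::string, std::shared_ptr<DemoBackend>> slots_;
};
```

Because each slot holds a std::shared_ptr, a caller that obtained the backend before release_role() keeps a valid object until it drops its own copy; only the loader's reference is released.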

Follows the single-class pattern of AdapterManager (architecture decision #29) and GrammarRegistry (decision #31): one implementation, no interface layer. Owned by ModelOrchestrator via composition. Not exposed across the inference .so boundary — all callers are inference-internal.

Thread safety: Lifecycle transitions (ensure_loaded / release_role / shutdown) acquire an internal mutex. get() is lock-free once a role has been loaded — same atomic-state contract as InferenceBackend itself.
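This page does not show how the lock-free read path is implemented. One common way to obtain such a contract is to publish the payload through an atomic state flag with release/acquire ordering, sketched below. DemoSlot, DemoState, publish(), and try_get() are hypothetical names, and the COLD state is an assumption (the page itself names only WARM and ACTIVE). The sketch covers loading only; safe unloading would additionally need something like shared ownership to keep the payload alive for in-flight readers.

```cpp
#include <atomic>
#include <string>

// Hypothetical lifecycle states; COLD is assumed for illustration.
enum class DemoState { COLD, WARM, ACTIVE };

// One slot whose payload is published through an atomic state flag.
class DemoSlot {
public:
    // Writer side: in a real loader this would run with the lifecycle
    // mutex held so that only one thread publishes at a time.
    void publish(const std::string& model_path) {
        model_path_ = model_path;  // 1. write the payload
        state_.store(DemoState::ACTIVE,
                     std::memory_order_release);  // 2. publish it
    }

    // Reader side: takes no lock; returns nullptr until the slot is ACTIVE.
    const std::string* try_get() const {
        if (state_.load(std::memory_order_acquire) != DemoState::ACTIVE) {
            return nullptr;
        }
        // The acquire load pairs with the release store above, so the
        // payload written before the store is visible here.
        return &model_path_;
    }

private:
    std::atomic<DemoState> state_{DemoState::COLD};
    std::string model_path_;
};
```

Readers never block on the writer: they either observe ACTIVE and a fully written payload, or they observe an earlier state and get nullptr.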

Version: 2.1.11

Definition in file secondary_model_loader.h.