Entropic 2.3.8
Local-first agentic inference engine
Role-keyed lifecycle manager for non-primary models.
#include <entropic/inference/secondary_model_loader.h>
Public Member Functions
| bool | ensure_loaded (const std::string &role, const ModelConfig &config) |
| Lazily load and activate a model for a role. | |
| InferenceBackend * | get (const std::string &role) const |
| Get the backend for a role. | |
| std::shared_ptr< InferenceBackend > | get_shared (const std::string &role) const |
| Get the backend for a role as a shared_ptr. | |
| bool | release_role (const std::string &role) |
| Unload and drop a role. | |
| bool | is_loaded (const std::string &role) const |
| Check whether a role is currently loaded and active. | |
| std::vector< std::string > | loaded_roles () const |
| Names of all roles with a currently-loaded backend. | |
| void | clear_all_prompt_caches () |
| Fanout: clear prompt/KV cache on every loaded backend. | |
| void | shutdown () |
| Unload every role. | |
Role-keyed lifecycle manager for non-primary models.
Replaces the per-role std::shared_ptr<InferenceBackend> members (router_) that previously lived directly on ModelOrchestrator. The router refactor is intentionally invisible to callers: existing router behavior is preserved via loader_.get("router").
"router": digit-classifier model used by ModelOrchestrator::route()
"draft": speculative-decoding draft (v2.1.11+)
"thinking": future thinking-model slot (gh#25)
Definition at line 55 of file secondary_model_loader.h.
void entropic::SecondaryModelLoader::clear_all_prompt_caches ()
Fanout: clear prompt/KV cache on every loaded backend.
Used by ModelOrchestrator::clear_all_prompt_caches() so the router and draft caches invalidate alongside the main pool when identity content changes (P1-7, v2.0.6-rc16 contract).
Definition at line 147 of file secondary_model_loader.cpp.
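The fanout described above can be sketched with stand-in types. This is a minimal illustration, not the Entropic implementation: `ToyBackend`, `ToyLoader`, `slot()`, and the `cached_tokens` counter are all hypothetical names introduced here, and skipping backends that are not loaded is an assumption drawn from the phrase "every loaded backend".

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a backend that owns a prompt/KV cache.
struct ToyBackend {
    bool loaded = false;
    int cached_tokens = 0;  // placeholder for real cache state
    void clear_prompt_cache() { cached_tokens = 0; }
};

// Hypothetical stand-in for a role-keyed loader.
class ToyLoader {
public:
    // Returns the backend for a role, creating an unloaded one on first use.
    ToyBackend& slot(const std::string& role) {
        auto& ptr = backends_[role];
        if (!ptr) {
            ptr = std::make_shared<ToyBackend>();
        }
        return *ptr;
    }

    // Fanout: visit every role and clear the cache of each loaded backend.
    void clear_all_prompt_caches() {
        for (auto& entry : backends_) {
            if (entry.second->loaded) {
                entry.second->clear_prompt_cache();
            }
        }
    }

private:
    std::map<std::string, std::shared_ptr<ToyBackend>> backends_;
};

// Usage: one call invalidates the caches of all loaded roles.
inline void usage_example() {
    ToyLoader loader;
    loader.slot("router").loaded = true;
    loader.slot("router").cached_tokens = 12;
    loader.clear_all_prompt_caches();
    assert(loader.slot("router").cached_tokens == 0);
}
```

An orchestrator-level clear would call this once after clearing its own pool, so callers never need to know which secondary roles exist.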
bool entropic::SecondaryModelLoader::ensure_loaded (const std::string &role, const ModelConfig &config)
Lazily load and activate a model for a role.
If the role is already loaded against the same config path, this is a no-op (idempotent). If the role is loaded against a different path, the existing backend is unloaded first.
| role | Role name (e.g. "router", "draft"). |
| config | ModelConfig for the secondary model. |
Definition at line 34 of file secondary_model_loader.cpp.
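The idempotent-load and path-swap behavior described for ensure_loaded can be illustrated as follows. This is a self-contained sketch with stand-in types, assuming the config exposes a model path; `ToyConfig`, `ToyBackend`, `ToyLoader`, `load_count()`, `path_for()`, and the `.bin` file names are hypothetical and do not come from the Entropic headers.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in: assumes the config carries a model file path.
struct ToyConfig {
    std::string path;
};

// Hypothetical stand-in backend that only records what was loaded.
struct ToyBackend {
    std::string path;
    bool loaded = false;
    bool load(const std::string& p) { path = p; loaded = true; return true; }
    void unload() { loaded = false; }
};

class ToyLoader {
public:
    // Returns true when the role ends up loaded against config.path.
    bool ensure_loaded(const std::string& role, const ToyConfig& config) {
        auto it = backends_.find(role);
        if (it != backends_.end() && it->second->loaded) {
            if (it->second->path == config.path) {
                return true;  // same path: no-op
            }
            it->second->unload();  // different path: unload the old model first
            backends_.erase(it);
        }
        auto backend = std::make_shared<ToyBackend>();
        if (!backend->load(config.path)) {
            return false;
        }
        ++load_count_;
        backends_[role] = std::move(backend);
        return true;
    }

    int load_count() const { return load_count_; }

    std::string path_for(const std::string& role) const {
        auto it = backends_.find(role);
        if (it == backends_.end()) {
            return std::string();
        }
        return it->second->path;
    }

private:
    std::unordered_map<std::string, std::shared_ptr<ToyBackend>> backends_;
    int load_count_ = 0;
};

// Usage: a repeated call with the same path does not reload.
inline void usage_example() {
    ToyLoader loader;
    assert(loader.ensure_loaded("router", ToyConfig{"a.bin"}));
    assert(loader.ensure_loaded("router", ToyConfig{"a.bin"}));
    assert(loader.load_count() == 1);
}
```

Because the call is idempotent, a caller can invoke it before every use of a role instead of tracking load state separately.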
InferenceBackend * entropic::SecondaryModelLoader::get (const std::string &role) const
Get the backend for a role.
| role | Role name. |
Definition at line 67 of file secondary_model_loader.cpp.
std::shared_ptr< InferenceBackend > entropic::SecondaryModelLoader::get_shared (const std::string &role) const
Get the backend for a role as a shared_ptr.
Used when callers need to extend backend lifetime beyond the loader (e.g. holding a reference for the duration of a long generation while the loader could otherwise be released).
| role | Role name. |
Definition at line 80 of file secondary_model_loader.cpp.
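The difference between the raw pointer from get and the shared_ptr from get_shared comes down to ownership. The sketch below uses stand-in types (`ToyBackend`, `ToyLoader`, and `add()` are hypothetical) and models only object lifetime: here, releasing a role just drops the loader's reference. Whether the real release_role also unloads model state on a backend that a caller still holds is not covered by this example.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in backend.
struct ToyBackend {
    std::string name;
};

class ToyLoader {
public:
    void add(const std::string& role) {
        backends_[role] = std::make_shared<ToyBackend>(ToyBackend{role});
    }

    // Non-owning view: valid only while the loader still holds the role.
    ToyBackend* get(const std::string& role) const {
        auto it = backends_.find(role);
        if (it == backends_.end()) {
            return nullptr;
        }
        return it->second.get();
    }

    // Shared ownership: the object outlives the loader's own reference.
    std::shared_ptr<ToyBackend> get_shared(const std::string& role) const {
        auto it = backends_.find(role);
        if (it == backends_.end()) {
            return nullptr;
        }
        return it->second;
    }

    // Drops the loader's reference; returns false for an unknown role.
    bool release_role(const std::string& role) {
        return backends_.erase(role) > 0;
    }

private:
    std::unordered_map<std::string, std::shared_ptr<ToyBackend>> backends_;
};

// Usage: hold a shared_ptr across a long operation.
inline void usage_example() {
    ToyLoader loader;
    loader.add("draft");
    std::shared_ptr<ToyBackend> held = loader.get_shared("draft");
    loader.release_role("draft");   // loader forgets the role
    assert(held->name == "draft");  // object is still alive for this caller
}
```

A raw pointer obtained from `get` before the release would dangle once the last owner goes away, which is why long-running callers take the shared_ptr instead.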
bool entropic::SecondaryModelLoader::is_loaded (const std::string &role) const
Check whether a role is currently loaded and active.
| role | Role name. |
Returns true when get(role) != nullptr and the backend reports it is loaded (state != COLD).
Definition at line 117 of file secondary_model_loader.cpp.
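The two-part condition behind is_loaded (a backend exists for the role, and it is not COLD) can be sketched as below. Only the COLD state is named in this documentation; the `ToyState` enum, its `ACTIVE` value, `ToyBackend`, `ToyLoader`, and `put()` are hypothetical stand-ins.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical state enum: ACTIVE is a placeholder for any non-COLD state.
enum class ToyState { COLD, ACTIVE };

struct ToyBackend {
    ToyState state = ToyState::COLD;
};

class ToyLoader {
public:
    void put(const std::string& role, ToyState state) {
        auto backend = std::make_shared<ToyBackend>();
        backend->state = state;
        backends_[role] = std::move(backend);
    }

    ToyBackend* get(const std::string& role) const {
        auto it = backends_.find(role);
        if (it == backends_.end()) {
            return nullptr;
        }
        return it->second.get();
    }

    // True only when the role is known AND its backend is not COLD.
    bool is_loaded(const std::string& role) const {
        const ToyBackend* backend = get(role);
        return backend != nullptr && backend->state != ToyState::COLD;
    }

private:
    std::unordered_map<std::string, std::shared_ptr<ToyBackend>> backends_;
};

// Usage: a registered but COLD role does not count as loaded.
inline void usage_example() {
    ToyLoader loader;
    loader.put("draft", ToyState::COLD);
    assert(loader.get("draft") != nullptr);
    assert(!loader.is_loaded("draft"));
}
```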
std::vector< std::string > entropic::SecondaryModelLoader::loaded_roles () const
Names of all roles with a currently-loaded backend.
Names of all loaded roles (sorted for deterministic output).
Definition at line 129 of file secondary_model_loader.cpp.
bool entropic::SecondaryModelLoader::release_role (const std::string &role)
Unload and drop a role.
| role | Role name. |
Definition at line 95 of file secondary_model_loader.cpp.
void entropic::SecondaryModelLoader::shutdown ()
Unload every role.
Mirrors ModelOrchestrator::shutdown() — called during engine teardown. Safe to call repeatedly.
Definition at line 159 of file secondary_model_loader.cpp.
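One way a shutdown can be safe to call repeatedly is to unload and then clear the role map, so a second call iterates over nothing. The sketch below illustrates that pattern with stand-in types; `ToyBackend`, `ToyLoader`, `add()`, `size()`, and the `unload_calls` counter are hypothetical and not taken from secondary_model_loader.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in backend that counts unload calls through a shared counter.
struct ToyBackend {
    int* unload_calls;
    explicit ToyBackend(int* counter) : unload_calls(counter) {}
    void unload() { ++(*unload_calls); }
};

class ToyLoader {
public:
    void add(const std::string& role) {
        backends_[role] = std::make_shared<ToyBackend>(&unload_calls_);
    }

    // Unload every role, then forget them all.
    void shutdown() {
        for (auto& entry : backends_) {
            entry.second->unload();
        }
        backends_.clear();  // a repeated call finds an empty map and does nothing
    }

    std::size_t size() const { return backends_.size(); }
    int unload_calls() const { return unload_calls_; }

private:
    std::map<std::string, std::shared_ptr<ToyBackend>> backends_;
    int unload_calls_ = 0;
};

// Usage: calling shutdown twice unloads each backend exactly once.
inline void usage_example() {
    ToyLoader loader;
    loader.add("router");
    loader.shutdown();
    loader.shutdown();
    assert(loader.unload_calls() == 1);
}
```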