Entropic 2.3.8
Local-first agentic inference engine
LoRA adapter lifecycle manager.
#include <entropic/inference/adapter_manager.h>
Public Member Functions
| bool | load (const std::string &name, const std::filesystem::path &adapter_path, llama_model *model, float scale=1.0f) | Load a LoRA adapter into RAM (COLD -> WARM). |
| void | unload (const std::string &name, llama_context *ctx) | Unload adapter (any state -> COLD). |
| bool | activate (const std::string &name, llama_context *ctx) | Activate adapter on context (WARM -> HOT). |
| void | deactivate (llama_context *ctx) | Deactivate current HOT adapter (HOT -> WARM). |
| bool | swap (const std::string &name, llama_context *ctx) | Swap to a different adapter atomically. |
| void | unload_all_for_model (llama_model *model, llama_context *ctx) | Unload all adapters for a given base model. |
| void | unload_all () | Free every loaded adapter handle (gh#58 close-out, v2.3.0). |
| AdapterState | state (const std::string &name) const | Get adapter state. |
| AdapterInfo | info (const std::string &name) const | Get metadata for an adapter. |
| std::vector< AdapterInfo > | list_adapters () const | List all known adapters. |
| std::string | active_adapter () const | Get the currently HOT adapter name. |
| void | set_hook_interface (const HookInterface &hooks) | Set hook interface for ON_ADAPTER_SWAP dispatch. |
Manages adapter COLD/WARM/HOT states. One active (HOT) adapter per context at a time. Multiple adapters can be WARM simultaneously.
Definition at line 58 of file adapter_manager.h.
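The lifecycle described above can be sketched as a small state machine. This is a toy model under stated assumptions, not the AdapterManager implementation: `ToyAdapterStates` and `ToyState` are hypothetical names, no llama.cpp handles are involved, and details such as `load` rejecting a duplicate name or `state` reporting COLD for an unknown name are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the documented COLD/WARM/HOT states.
enum class ToyState { COLD, WARM, HOT };

// Toy model of the lifecycle: tracks states only, owns no llama.cpp handles.
class ToyAdapterStates {
public:
    // COLD -> WARM. Assumption: loading an already-known name fails.
    bool load(const std::string& name) {
        return states_.emplace(name, ToyState::WARM).second;
    }

    // WARM -> HOT. Whatever is currently HOT drops back to WARM first,
    // so at most one adapter is HOT at a time.
    bool activate(const std::string& name) {
        auto it = states_.find(name);
        if (it == states_.end()) return false;  // unknown (COLD) adapter
        deactivate();
        it->second = ToyState::HOT;
        hot_ = name;
        return true;
    }

    // HOT -> WARM. No-op if nothing is HOT.
    void deactivate() {
        if (hot_.empty()) return;
        assert(states_.count(hot_) == 1);  // the HOT adapter is always tracked
        states_[hot_] = ToyState::WARM;
        hot_.clear();
    }

    // Any state -> COLD. No-op if the name is unknown.
    void unload(const std::string& name) {
        if (name == hot_) deactivate();
        states_.erase(name);
    }

    // Assumption: names that were never loaded report COLD.
    ToyState state(const std::string& name) const {
        auto it = states_.find(name);
        return it == states_.end() ? ToyState::COLD : it->second;
    }

    const std::string& active() const { return hot_; }

private:
    std::map<std::string, ToyState> states_;
    std::string hot_;  // empty when no adapter is HOT
};
```

In this sketch several adapters can sit in WARM at once, while activating a second adapter implicitly returns the first one to WARM.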
bool entropic::AdapterManager::activate (const std::string &name, llama_context *ctx)
Activate adapter on context (WARM -> HOT).
| name | Adapter identifier. |
| ctx | llama_context to activate on. |
If another adapter is HOT, deactivates it first. Uses llama_set_adapters_lora() to apply the single adapter.
Definition at line 164 of file adapter_manager.cpp.
std::string entropic::AdapterManager::active_adapter () const
Get the currently HOT adapter name.
Definition at line 417 of file adapter_manager.cpp.
void entropic::AdapterManager::deactivate (llama_context *ctx)
Deactivate current HOT adapter (HOT -> WARM).
| ctx | llama_context to clear adapter from. |
Clears all adapters from the context. No-op if none active.
Definition at line 208 of file adapter_manager.cpp.
AdapterInfo entropic::AdapterManager::info (const std::string &name) const
Get metadata for an adapter.
| name | Adapter identifier. |
Definition at line 386 of file adapter_manager.cpp.
std::vector< AdapterInfo > entropic::AdapterManager::list_adapters () const
List all known adapters.
Definition at line 401 of file adapter_manager.cpp.
bool entropic::AdapterManager::load (const std::string &name, const std::filesystem::path &adapter_path, llama_model *model, float scale = 1.0f)
Load a LoRA adapter into RAM (COLD -> WARM).
| name | Unique identifier for this adapter. |
| adapter_path | Path to the LoRA .gguf file. |
| model | llama_model the adapter targets. |
| scale | LoRA scaling factor (default 1.0). |
Calls llama_adapter_lora_init() to load the adapter against the base model. The adapter stays in RAM until explicitly unloaded or the base model is unloaded.
Definition at line 73 of file adapter_manager.cpp.
void entropic::AdapterManager::set_hook_interface (const HookInterface &hooks)
Set hook interface for ON_ADAPTER_SWAP dispatch.
| hooks | Hook dispatch interface, provided by the facade. |
Definition at line 428 of file adapter_manager.cpp.
AdapterState entropic::AdapterManager::state (const std::string &name) const
Get adapter state.
| name | Adapter identifier. |
Definition at line 370 of file adapter_manager.cpp.
bool entropic::AdapterManager::swap (const std::string &name, llama_context *ctx)
Swap to a different adapter atomically.
| name | Target adapter (must be WARM). |
| ctx | llama_context to swap on. |
Deactivates the current HOT adapter and activates the target. Fires the ON_ADAPTER_SWAP hook, which can cancel the operation.
Definition at line 241 of file adapter_manager.cpp.
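The cancellable swap described above might look roughly like the following toy model. Everything here is an assumption for illustration: `ToySwapper` and `ToySwapHook` are hypothetical, the real HookInterface is not shown on this page, and the choice to run the hook before any state changes (so a veto leaves the previous adapter HOT) is a guess at the intended behaviour rather than something this reference confirms.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <utility>

// Hypothetical hook shape: return false to cancel the swap.
using ToySwapHook =
    std::function<bool(const std::string& from, const std::string& to)>;

// Toy model of a cancellable swap; it tracks names only.
class ToySwapper {
public:
    void add_warm(const std::string& name) { warm_.insert(name); }
    void set_hook(ToySwapHook hook) { hook_ = std::move(hook); }

    bool swap(const std::string& target) {
        if (warm_.count(target) == 0) return false;       // target must be WARM
        // Assumption: the hook runs before any state changes, so a veto
        // leaves the previous adapter HOT and the target WARM.
        if (hook_ && !hook_(hot_, target)) return false;
        if (!hot_.empty()) warm_.insert(hot_);            // old HOT -> WARM
        warm_.erase(target);                              // target WARM -> HOT
        hot_ = target;
        assert(warm_.count(hot_) == 0);  // a HOT adapter is never also WARM
        return true;
    }

    bool is_warm(const std::string& name) const { return warm_.count(name) != 0; }
    const std::string& active() const { return hot_; }

private:
    std::set<std::string> warm_;
    std::string hot_;  // empty when no adapter is HOT
    ToySwapHook hook_;
};
```

Checking the hook before touching any state is what makes the operation look atomic to callers in this sketch: the swap either completes fully or changes nothing.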
void entropic::AdapterManager::unload (const std::string &name, llama_context *ctx)
Unload adapter (any state -> COLD).
| name | Adapter identifier. |
| ctx | Context to deactivate from (if HOT). May be nullptr. |
If HOT, clears from context first. Frees via llama_adapter_lora_free(). No-op if not found.
Definition at line 125 of file adapter_manager.cpp.
void entropic::AdapterManager::unload_all ()
Free every loaded adapter handle (gh#58 close-out, v2.3.0).
Called by ~ModelOrchestrator after the backends are torn down. By the time this runs, the llama_contexts that referenced these adapters are already destroyed, so there is no HOT attachment to clear. The HOT-deactivation path is skipped on purpose: calling clear_adapters(ctx) against a destroyed context would be a use-after-free. Safe to call repeatedly.
Without this, the defaulted AdapterManager destructor dropped the raw llama_adapter_lora* handles on engine destroy, leaking each loaded LoRA's VRAM until process exit. This is the same pattern that caused gh#58's two-handle GPU regression for LlamaCppBackend in v2.2.8.
Definition at line 343 of file adapter_manager.cpp.
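Why repeated calls are safe can be illustrated with a toy handle table. This is a sketch under assumptions, not the real code: `ToyHandle`, `toy_free`, and `ToyHandleTable` are hypothetical stand-ins for `llama_adapter_lora`, `llama_adapter_lora_free()`, and the manager's internal storage, whose actual layout is not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

struct ToyHandle { int id; };  // hypothetical stand-in for llama_adapter_lora

static int g_toy_frees = 0;    // counts frees so the effect is observable

// Hypothetical stand-in for llama_adapter_lora_free().
void toy_free(ToyHandle* handle) {
    ++g_toy_frees;
    delete handle;
}

// Toy table of raw handles. Like the documented class, it has a defaulted
// destructor, so the owner must call unload_all() before destroying it.
class ToyHandleTable {
public:
    void add(const std::string& name, int id) {
        ToyHandle*& slot = handles_[name];
        if (slot != nullptr) toy_free(slot);  // replace an existing entry
        slot = new ToyHandle{id};
    }

    // Frees every handle and never touches a context. Clearing the table
    // afterwards is what makes a second call a no-op.
    void unload_all() {
        for (auto& entry : handles_) {
            assert(entry.second != nullptr);  // the table never stores null
            toy_free(entry.second);
        }
        handles_.clear();
    }

    std::size_t size() const { return handles_.size(); }

private:
    std::map<std::string, ToyHandle*> handles_;
};
```

Calling `unload_all()` twice on a table with two entries frees exactly two handles in this model, because the second call iterates over an empty map.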
void entropic::AdapterManager::unload_all_for_model (llama_model *model, llama_context *ctx)
Unload all adapters for a given base model.
| model | The base model being unloaded. |
| ctx | Context to deactivate from. May be nullptr. |
Called when the base model transitions out of ACTIVE/WARM, so that no adapter handle is left dangling once the model is gone.
Definition at line 292 of file adapter_manager.cpp.
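The per-model cleanup can be pictured as filtering a registry by base-model pointer. The sketch below is a toy under assumptions: `ToyModel` and `ToyRegistry` are hypothetical, the real bookkeeping is not shown here, and returning a count is a convenience of this example (the documented method returns void).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

struct ToyModel {};  // hypothetical stand-in for llama_model

// Toy registry mapping adapter names to the base model they were loaded against.
class ToyRegistry {
public:
    void add(const std::string& name, const ToyModel* model) {
        assert(model != nullptr);  // every adapter targets some base model
        owners_[name] = model;
    }

    // Removes every adapter whose base model matches and returns how many
    // were removed. Adapters for other models are left untouched.
    std::size_t unload_all_for_model(const ToyModel* model) {
        std::size_t removed = 0;
        for (auto it = owners_.begin(); it != owners_.end();) {
            if (it->second == model) {
                it = owners_.erase(it);  // erase returns the next valid iterator
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }

    bool contains(const std::string& name) const {
        return owners_.count(name) != 0;
    }

private:
    std::map<std::string, const ToyModel*> owners_;
};
```

Matching on the model pointer means adapters for a second, still-loaded base model survive the cleanup of the first.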