Entropic 2.3.8
Local-first agentic inference engine
entropic::AdapterManager Class Reference

LoRA adapter lifecycle manager.

#include <entropic/inference/adapter_manager.h>

Public Member Functions

bool load (const std::string &name, const std::filesystem::path &adapter_path, llama_model *model, float scale=1.0f)
 Load a LoRA adapter into RAM (COLD -> WARM).
 
void unload (const std::string &name, llama_context *ctx)
 Unload adapter (any state -> COLD).
 
bool activate (const std::string &name, llama_context *ctx)
 Activate adapter on context (WARM -> HOT).
 
void deactivate (llama_context *ctx)
 Deactivate current HOT adapter (HOT -> WARM).
 
bool swap (const std::string &name, llama_context *ctx)
 Swap to a different adapter atomically.
 
void unload_all_for_model (llama_model *model, llama_context *ctx)
 Unload all adapters for a given base model.
 
void unload_all ()
 Free every loaded adapter handle (gh#58 close-out, v2.3.0).
 
AdapterState state (const std::string &name) const
 Get adapter state.
 
AdapterInfo info (const std::string &name) const
 Get metadata for an adapter.
 
std::vector< AdapterInfo > list_adapters () const
 List all known adapters.
 
std::string active_adapter () const
 Get the currently HOT adapter name.
 
void set_hook_interface (const HookInterface &hooks)
 Set hook interface for ON_ADAPTER_SWAP dispatch.
 

Detailed Description

LoRA adapter lifecycle manager.

Manages adapter COLD/WARM/HOT states. One active (HOT) adapter per context at a time. Multiple adapters can be WARM simultaneously.

Lifecycle
COLD ──load()──> WARM ──activate()──> HOT
 ^                ^                    |
 └──unload()──────┘<──deactivate()─────┘

COLD
On disk only, no RAM consumed.
WARM
mmap'd + mlock'd in RAM.
HOT
Active on context via llama_set_adapter_lora(). Influencing generation.
Version
1.9.2

Definition at line 58 of file adapter_manager.h.
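
The lifecycle described above can be modelled as a small state machine. What follows is a minimal sketch, not Entropic's implementation: ToyAdapterManager and everything in namespace sketch are hypothetical stand-ins that track only names and states, make no llama.cpp calls, and take no context argument. It illustrates the two stated invariants: at most one adapter is HOT, and several can be WARM at once.

```cpp
#include <map>
#include <string>

namespace sketch {

// Mirrors the three documented states; the real enum is not shown on this page.
enum class AdapterState { COLD, WARM, HOT };

// Hypothetical toy model: tracks states only and loads nothing.
class ToyAdapterManager {
public:
    // COLD -> WARM. Fails on an empty or duplicate name.
    bool load(const std::string& name) {
        if (name.empty()) return false;
        return states_.emplace(name, AdapterState::WARM).second;
    }

    // WARM -> HOT. Whatever is currently HOT drops back to WARM first.
    bool activate(const std::string& name) {
        auto it = states_.find(name);
        if (it == states_.end()) return false;
        deactivate();
        it->second = AdapterState::HOT;
        hot_ = name;
        return true;
    }

    // HOT -> WARM. No-op if nothing is HOT.
    void deactivate() {
        if (hot_.empty()) return;
        states_[hot_] = AdapterState::WARM;
        hot_.clear();
    }

    // Any state -> COLD. No-op if the name is unknown.
    void unload(const std::string& name) {
        if (name == hot_) deactivate();
        states_.erase(name);
    }

    // Unknown names report COLD.
    AdapterState state(const std::string& name) const {
        auto it = states_.find(name);
        return it == states_.end() ? AdapterState::COLD : it->second;
    }

    const std::string& active_adapter() const { return hot_; }

private:
    std::map<std::string, AdapterState> states_;
    std::string hot_;  // empty when no adapter is HOT
};

}  // namespace sketch
```

In this model, activating a second adapter implicitly returns the first to WARM, which is the behaviour the activate() documentation describes for a context that already has a HOT adapter.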

Member Function Documentation

◆ activate()

bool entropic::AdapterManager::activate ( const std::string &  name,
llama_context *  ctx 
)

Activate adapter on context (WARM -> HOT).

Parameters
name    Adapter identifier.
ctx     llama_context to activate on.
Returns
true on success.
Version
1.9.2

If another adapter is HOT, deactivates it first. Uses llama_set_adapters_lora() to apply the single adapter.

Definition at line 164 of file adapter_manager.cpp.

◆ active_adapter()

std::string entropic::AdapterManager::active_adapter ( ) const

Get the currently HOT adapter name.

Returns
Adapter name, empty if none active.
Version
1.9.2

Definition at line 417 of file adapter_manager.cpp.

◆ deactivate()

void entropic::AdapterManager::deactivate ( llama_context *  ctx)

Deactivate current HOT adapter (HOT -> WARM).

Parameters
ctx     llama_context to clear adapter from.
Version
1.9.2

Clears all adapters from the context. No-op if none active.

Definition at line 208 of file adapter_manager.cpp.

◆ info()

AdapterInfo entropic::AdapterManager::info ( const std::string &  name) const

Get metadata for an adapter.

Parameters
name    Adapter identifier.
Returns
AdapterInfo. If the adapter is not found, the result has state COLD and an empty name.
Version
1.9.2

Definition at line 386 of file adapter_manager.cpp.
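
The not-found convention above (COLD state, empty name) can be illustrated with a value-initialised result. This is a sketch under assumptions: ToyAdapterInfo shows only the two fields this page mentions, and the is_known() helper is hypothetical, not part of the documented interface.

```cpp
#include <map>
#include <string>

namespace sketch {

enum class AdapterState { COLD, WARM, HOT };

// Hypothetical subset of fields; the real AdapterInfo layout is not documented here.
struct ToyAdapterInfo {
    std::string name;                         // empty when the lookup missed
    AdapterState state = AdapterState::COLD;  // COLD is also the "not found" answer
};

// Returns a copy of the entry, or a default-constructed value on a miss.
inline ToyAdapterInfo info(const std::map<std::string, ToyAdapterInfo>& table,
                           const std::string& name) {
    auto it = table.find(name);
    return it == table.end() ? ToyAdapterInfo{} : it->second;
}

// Hypothetical helper: an empty name marks a lookup that missed.
inline bool is_known(const ToyAdapterInfo& i) { return !i.name.empty(); }

}  // namespace sketch
```

Returning a sentinel value rather than throwing keeps the accessor usable in const, no-fail paths, at the cost of callers needing to check the name when they must distinguish an unknown adapter from one that is merely not loaded.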

◆ list_adapters()

std::vector< AdapterInfo > entropic::AdapterManager::list_adapters ( ) const

List all known adapters.

Returns
Vector of AdapterInfo snapshots.
Version
1.9.2

Definition at line 401 of file adapter_manager.cpp.

◆ load()

bool entropic::AdapterManager::load ( const std::string &  name,
const std::filesystem::path &  adapter_path,
llama_model *  model,
float  scale = 1.0f 
)

Load a LoRA adapter into RAM (COLD -> WARM).

Parameters
name            Unique identifier for this adapter.
adapter_path    Path to the LoRA .gguf file.
model           Base llama_model the adapter targets.
scale           LoRA scaling factor (default 1.0).
Returns
true on success, false on duplicate name or load failure.
Version
1.9.2

Calls llama_adapter_lora_init() to load the adapter against the base model. The adapter stays in RAM until explicitly unloaded or the base model is unloaded.

Definition at line 73 of file adapter_manager.cpp.

◆ set_hook_interface()

void entropic::AdapterManager::set_hook_interface ( const HookInterface &  hooks)

Set hook interface for ON_ADAPTER_SWAP dispatch.

Parameters
hooks   Hook dispatch interface, supplied by the facade.
Version
1.9.2

Definition at line 428 of file adapter_manager.cpp.

◆ state()

AdapterState entropic::AdapterManager::state ( const std::string &  name) const

Get adapter state.

Parameters
name    Adapter identifier.
Returns
AdapterState. COLD if not found.
Version
1.9.2

Definition at line 370 of file adapter_manager.cpp.

◆ swap()

bool entropic::AdapterManager::swap ( const std::string &  name,
llama_context *  ctx 
)

Swap to a different adapter atomically.

Parameters
name    Target adapter (must be WARM).
ctx     llama_context to swap on.
Returns
true on success.
Version
1.9.2

Deactivates the current HOT adapter and activates the target. Fires the ON_ADAPTER_SWAP hook, which can cancel the operation.

Definition at line 241 of file adapter_manager.cpp.
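
A cancellable swap of the kind described above could be sketched as follows. This is an illustrative model under assumptions: the SwapHook type, its (from, to) signature, and the choice to return false and leave the current adapter in place when the hook cancels are all hypothetical. This reference states only that ON_ADAPTER_SWAP can cancel the operation.

```cpp
#include <functional>
#include <set>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical hook type: return false to cancel the swap.
using SwapHook = std::function<bool(const std::string& from, const std::string& to)>;

class ToySwapper {
public:
    void add_warm(const std::string& name) { warm_.insert(name); }
    void set_hook(SwapHook hook) { hook_ = std::move(hook); }

    // The hook is consulted before any state changes, so a cancelled swap
    // leaves the previously HOT adapter active.
    bool swap(const std::string& target) {
        if (warm_.count(target) == 0) return false;       // target must be WARM
        if (hook_ && !hook_(hot_, target)) return false;  // hook cancelled the swap
        if (!hot_.empty()) warm_.insert(hot_);            // HOT -> WARM
        warm_.erase(target);                              // WARM -> HOT
        hot_ = target;
        return true;
    }

    const std::string& active() const { return hot_; }

private:
    std::set<std::string> warm_;
    std::string hot_;  // empty when no adapter is HOT
    SwapHook hook_;    // empty std::function means "no hook installed"
};

}  // namespace sketch
```

Checking the hook first is one way to make the operation appear atomic to callers: either both the deactivation and the activation happen, or neither does.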

◆ unload()

void entropic::AdapterManager::unload ( const std::string &  name,
llama_context *  ctx 
)

Unload adapter (any state -> COLD).

Parameters
name    Adapter identifier.
ctx     Context to deactivate from (if HOT). May be nullptr.
Version
1.9.2

If HOT, clears from context first. Frees via llama_adapter_lora_free(). No-op if not found.

Definition at line 125 of file adapter_manager.cpp.

◆ unload_all()

void entropic::AdapterManager::unload_all ( )

Free every loaded adapter handle (gh#58 close-out, v2.3.0).

Called by ~ModelOrchestrator after backends are torn down. By the time this runs, the llama_contexts that referenced these adapters are already destroyed, so there is no HOT attachment to clear. The HOT-deactivation path is therefore skipped: calling clear_adapters(ctx) against a destroyed context would be a use-after-free. Safe to call repeatedly.

Without this, the defaulted AdapterManager destructor dropped the raw llama_adapter_lora* handles on engine destroy, leaking each loaded LoRA's VRAM until process exit. This is the same pattern that caused gh#58's two-handle GPU regression for LlamaCppBackend in v2.2.8.

Version
2.3.0

Definition at line 343 of file adapter_manager.cpp.
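
The leak described above arises when raw handles are stored in a class whose destructor is defaulted. A minimal sketch of an idempotent free-everything pass follows. FakeAdapter, fake_init(), fake_free(), and ToyRegistry are hypothetical stand-ins for the llama.cpp handle type, its init and free calls, and the manager; the live-handle counter exists only to make the effect observable.

```cpp
#include <cstddef>
#include <map>
#include <string>

namespace sketch {

struct FakeAdapter {};  // stand-in for an opaque adapter handle

// Counts handles that have been created but not yet freed.
inline int& live_handles() {
    static int n = 0;
    return n;
}

inline FakeAdapter* fake_init() { ++live_handles(); return new FakeAdapter; }
inline void fake_free(FakeAdapter* a) { --live_handles(); delete a; }

class ToyRegistry {
public:
    // Rejects duplicates so an existing handle is never overwritten and lost.
    bool add(const std::string& name) {
        if (handles_.count(name) != 0) return false;
        handles_[name] = fake_init();
        return true;
    }

    // Frees every handle, then empties the map so a repeat call finds nothing to free.
    // Touches no context: the caller guarantees the contexts are already gone.
    void unload_all() {
        for (auto& entry : handles_) fake_free(entry.second);
        handles_.clear();
    }

    std::size_t size() const { return handles_.size(); }

private:
    // Raw pointers: a defaulted destructor would drop these without freeing them.
    std::map<std::string, FakeAdapter*> handles_;
};

}  // namespace sketch
```

Clearing the container after freeing is what makes the call safe to repeat; an owning wrapper such as std::unique_ptr with a custom deleter would be an alternative design, though this page does not say whether Entropic uses one.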

◆ unload_all_for_model()

void entropic::AdapterManager::unload_all_for_model ( llama_model *  model,
llama_context *  ctx 
)

Unload all adapters for a given base model.

Parameters
model   The base model being unloaded.
ctx     Context to deactivate from. May be nullptr.
Version
1.9.2

Called when base model transitions out of ACTIVE/WARM. Prevents dangling adapter handles.

Definition at line 292 of file adapter_manager.cpp.
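
One way to picture the per-model cleanup is an erase-while-iterating pass over an adapter table keyed on the base model pointer. The sketch below is an assumption about shape only: FakeModel, Entry, and the free function are hypothetical, and unlike the documented member (which returns void) this version returns a count so the result can be checked.

```cpp
#include <cstddef>
#include <map>
#include <string>

namespace sketch {

struct FakeModel {};  // stand-in for llama_model

struct Entry {
    const FakeModel* model;  // base model the adapter was loaded against
};

// Removes every adapter whose base model matches `model`; returns how many were removed.
inline std::size_t unload_all_for_model(std::map<std::string, Entry>& table,
                                        const FakeModel* model) {
    std::size_t removed = 0;
    for (auto it = table.begin(); it != table.end();) {
        if (it->second.model == model) {
            it = table.erase(it);  // erase() returns the next valid iterator
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}

}  // namespace sketch
```

Adapters loaded against other base models are left untouched, which is what lets one model be torn down without disturbing adapters that belong to another.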


The documentation for this class was generated from the following files: