Entropic 2.3.8
Local-first agentic inference engine
secondary_model_loader.h
// SPDX-License-Identifier: Apache-2.0
#pragma once

#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

namespace entropic {

public:
    bool ensure_loaded(const std::string& role, const ModelConfig& config);

    InferenceBackend* get(const std::string& role) const;

    std::shared_ptr<InferenceBackend> get_shared(
        const std::string& role) const;

    bool release_role(const std::string& role);

    bool is_loaded(const std::string& role) const;

    std::vector<std::string> loaded_roles() const;

    void clear_all_prompt_caches();

    void shutdown();

private:
    mutable std::mutex slots_mutex_;

    std::unordered_map<std::string, std::shared_ptr<InferenceBackend>>
        slots_;

    std::unordered_map<std::string, std::string> slot_paths_;
};

} // namespace entropic
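
The declarations above suggest a mutex-guarded map from role name to a shared backend. The following is a minimal sketch of how such a loader might behave, not the project's implementation. It assumes stand-in `InferenceBackend` and `ModelConfig` types (the real ones are defined in backend.h and config.h and are not shown in this listing). The class name `SketchLoader`, the `path` field, and the `load`/`unload` methods are hypothetical, and treating a repeated call with the same path as a no-op is only a guess at the purpose of `slot_paths_`.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-in types: hypothetical fields and methods, not the real API.
struct ModelConfig {
    std::string path;
};

struct InferenceBackend {
    bool loaded = false;
    bool load(const ModelConfig& cfg) { loaded = !cfg.path.empty(); return loaded; }
    void unload() { loaded = false; }
};

// Hypothetical loader mirroring the public surface declared above.
class SketchLoader {
public:
    // Load on first use; a repeat call with the same path is a no-op
    // (assumed behavior, not confirmed by the header).
    bool ensure_loaded(const std::string& role, const ModelConfig& config) {
        std::lock_guard<std::mutex> lock(slots_mutex_);
        auto it = slot_paths_.find(role);
        if (it != slot_paths_.end() && it->second == config.path) {
            return true;
        }
        auto backend = std::make_shared<InferenceBackend>();
        if (!backend->load(config)) {
            return false;  // a failed load leaves any existing slot untouched
        }
        slots_[role] = std::move(backend);
        slot_paths_[role] = config.path;
        assert(slots_.size() == slot_paths_.size());  // maps stay in step
        return true;
    }

    // Non-owning pointer; nullptr when the role is not loaded.
    InferenceBackend* get(const std::string& role) const {
        return get_shared(role).get();
    }

    // Owning handle; empty when the role is not loaded.
    std::shared_ptr<InferenceBackend> get_shared(const std::string& role) const {
        std::lock_guard<std::mutex> lock(slots_mutex_);
        auto it = slots_.find(role);
        if (it == slots_.end()) {
            return nullptr;
        }
        return it->second;
    }

    // Returns false when the role was not loaded.
    bool release_role(const std::string& role) {
        std::lock_guard<std::mutex> lock(slots_mutex_);
        auto it = slots_.find(role);
        if (it == slots_.end()) {
            return false;
        }
        it->second->unload();
        slots_.erase(it);
        slot_paths_.erase(role);
        return true;
    }

    bool is_loaded(const std::string& role) const {
        std::lock_guard<std::mutex> lock(slots_mutex_);
        return slots_.count(role) != 0;
    }

    std::vector<std::string> loaded_roles() const {
        std::lock_guard<std::mutex> lock(slots_mutex_);
        std::vector<std::string> roles;
        roles.reserve(slots_.size());
        for (const auto& entry : slots_) {
            roles.push_back(entry.first);
        }
        return roles;
    }

    // Unload every backend and drop all slots.
    void shutdown() {
        std::lock_guard<std::mutex> lock(slots_mutex_);
        for (auto& entry : slots_) {
            entry.second->unload();
        }
        slots_.clear();
        slot_paths_.clear();
    }

private:
    mutable std::mutex slots_mutex_;
    std::unordered_map<std::string, std::shared_ptr<InferenceBackend>> slots_;
    std::unordered_map<std::string, std::string> slot_paths_;
};
```

In this sketch every public method takes `slots_mutex_` for its whole body, which is why the mutex is declared `mutable`: the `const` accessors still need to lock it.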
InferenceBackend
    Concrete base class for inference backends (80% logic).
    Definition: backend.h:69
Role-keyed lifecycle manager for non-primary models.
std::shared_ptr<InferenceBackend> get_shared(const std::string& role) const
    Get the backend for a role as a shared_ptr.
void clear_all_prompt_caches()
    Fanout: clear prompt/KV cache on every loaded backend.
bool is_loaded(const std::string& role) const
    Check whether a role is currently loaded and active.
std::vector<std::string> loaded_roles() const
    Names of all roles with a currently-loaded backend.
bool release_role(const std::string& role)
    Unload and drop a role.
InferenceBackend* get(const std::string& role) const
    Get the backend for a role.
bool ensure_loaded(const std::string& role, const ModelConfig& config)
    Lazily load and activate a model for a role.
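
The difference between `get` and `get_shared` matters when a role can be released while a caller is still using its backend. The sketch below uses only `std::shared_ptr` and a hypothetical stand-in `Backend` type that counts live instances. It illustrates why a held `shared_ptr` keeps the object alive after its map entry is erased, whereas a raw pointer taken from the same slot would dangle. Whether the real `release_role` defers destruction in this way is an assumption.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for InferenceBackend; counts live instances.
struct Backend {
    static int live;
    Backend() { ++live; }
    ~Backend() { --live; }
    Backend(const Backend&) = delete;
    Backend& operator=(const Backend&) = delete;
};
int Backend::live = 0;

using Slots = std::unordered_map<std::string, std::shared_ptr<Backend>>;

// Returns true if the backend outlives its slot while the caller holds
// a shared_ptr to it. Throws std::out_of_range if the role is absent.
bool survives_release(Slots& slots, const std::string& role) {
    std::shared_ptr<Backend> held = slots.at(role);  // analogous to get_shared()
    const int before = Backend::live;
    slots.erase(role);                               // analogous to release_role()
    assert(held != nullptr);
    return Backend::live == before;  // nothing was destroyed by the erase
}  // `held` goes out of scope here; if it was the last owner, the backend dies
```

A caller that took a raw `Backend*` before the `erase` would have no such guarantee, so in this model a raw pointer is only safe when the caller knows the role cannot be released concurrently.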
Configuration structs with defaults.
InferenceBackend concrete base class.
Activate model on GPU (WARM → ACTIVE).
ModelConfig
    Model configuration for a single tier.
    Definition: config.h:148