ModelOrchestrator implementation.

#include <entropic/inference/orchestrator.h>
#include <entropic/inference/speculative_compat.h>
#include <entropic/interfaces/i_inference_backend.h>
#include <entropic/types/logging.h>
#include "llama_cpp_backend.h"
#include "adapters/adapter_registry.h"
#include <llama.h>
#include <nlohmann/json.hpp>
#include <cstdlib>
#include <filesystem>

Namespaces
namespace	entropic
	Activate model on GPU (WARM → ACTIVE).

Functions
static void	entropic::apply_adapter_parse (ChatAdapter *adapter, GenerationResult &result)
	Run the tier's adapter over a result to split tool calls.

static void	entropic::log_orchestration (const GenerationResult &result, const std::string &selected, const std::string &adapter_name, const GenerationParams &params, double routing_ms, double swap_ms)
	Log the per-orchestration tier/adapter/timing summary.

static llama_model *	entropic::resolve_target_model (const std::shared_ptr< InferenceBackend > &tier_backend)
	Resolve the active main-tier llama_model* for compat lookup.

static std::string	entropic::normalize_grammar_key (const std::string &grammar_value)
	Normalize a frontmatter grammar value to a registry key.

static nlohmann::json	entropic::make_residency_entry (const std::string &name, const std::filesystem::path &path, int context_length, size_t footprint, int vram_reserve_mb, long long last_ms)
	Build the JSON entry for one model in the current residency set.
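
This page lists normalize_grammar_key without its body. As a rough illustration of what "normalize a frontmatter grammar value to a registry key" could mean, the sketch below assumes that a value may be either a bare key or a file reference, and that the registry is keyed by the bare name. The function name normalize_grammar_key_sketch, the ".gbnf" example path, and the stem-based rule are all assumptions made for illustration, not the behavior of the real function.

```cpp
#include <filesystem>
#include <string>

// Hypothetical sketch, NOT the body of entropic::normalize_grammar_key.
// Assumption: a frontmatter value is either a bare key ("json") or a file
// reference ("grammars/json.gbnf"), and the registry is keyed by bare name.
std::string normalize_grammar_key_sketch(const std::string& grammar_value) {
    // stem() drops any leading directories and the final extension,
    // so a bare key passes through unchanged.
    return std::filesystem::path(grammar_value).stem().string();
}
```

Under these assumptions a bare key and a file reference to the same grammar map to one registry entry, so callers would not need to care which form the frontmatter used.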

Detailed Description

ModelOrchestrator implementation.

Model pool deduplication, per-tier adapters, VRAM lifecycle, tier routing via router complete(), swap logic, and grammar registry integration.
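
The model pool deduplication mentioned above can be pictured as a cache keyed by model path, so that tiers naming the same file share one loaded instance. The following is a minimal sketch under that assumption; LoadedModel and ModelPool are hypothetical stand-ins and do not correspond to types documented on this page.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a loaded model; the real backend type is not
// shown on this page.
struct LoadedModel {
    std::string path;
};

// Hypothetical pool keyed by model path: tiers that name the same file
// receive the same LoadedModel instead of loading it twice.
class ModelPool {
public:
    std::shared_ptr<LoadedModel> acquire(const std::string& path) {
        auto it = models_.find(path);
        if (it != models_.end()) {
            return it->second;  // already loaded: reuse the shared instance
        }
        auto model = std::make_shared<LoadedModel>(LoadedModel{path});
        models_.emplace(path, model);
        return model;
    }

    // Number of distinct models held by the pool.
    std::size_t size() const { return models_.size(); }

private:
    std::map<std::string, std::shared_ptr<LoadedModel>> models_;
};
```

Two calls to acquire with the same path return the same pointer, while a different path adds a second entry. A real pool would likely also normalize paths and track VRAM state, which this sketch leaves out.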

Version: 1.9.3

Definition in file orchestrator.cpp.