Entropic 2.3.8
Local-first agentic inference engine
orchestrator.h File Reference

ModelOrchestrator: multi-model lifecycle and routing.

#include <entropic/inference/backend.h>
#include <entropic/inference/adapter_manager.h>
#include <entropic/inference/grammar_registry.h>
#include <entropic/inference/profile_registry.h>
#include <entropic/inference/secondary_model_loader.h>
#include <entropic/inference/throughput_tracker.h>
#include <entropic/inference/adapters/adapter_base.h>
#include <entropic/types/config.h>
#include <entropic/types/error.h>
#include <chrono>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>


Classes

struct  entropic::RoutingResult
 Result metadata from a routing decision.
 
class  entropic::ModelOrchestrator
 Multi-model lifecycle and routing orchestrator.
 
struct  entropic::ModelOrchestrator::SpeculativeCompatInfo
 Result of a speculative-decoding compatibility check.
 

Namespaces

namespace  entropic
 Activate model on GPU (WARM → ACTIVE).
 

Detailed Description

ModelOrchestrator — multi-model lifecycle and routing.

Responsibilities
  • Model pool deduplication (one backend per unique .gguf path)
  • Per-tier adapters (identity-specific, independent of shared backend)
  • VRAM lifecycle: one ACTIVE main tier, router always ACTIVE
  • Tier routing via router model (raw completion, digit classification)
  • Handoff rule enforcement
  • Swap logic: keep_warm → WARM, otherwise → COLD
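
The deduplication and swap rules above can be sketched as follows. This is a minimal illustration, not the actual ModelOrchestrator implementation: every name in it (SketchBackend, SketchTier, SketchPool, swap_tier, ModelState) is hypothetical, paths are compared as plain strings, and the choice to leave a backend untouched when both tiers share it is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// All names here are hypothetical; they do not come from orchestrator.h.
enum class ModelState { COLD, WARM, ACTIVE };

// Stand-in for a loaded model backend.
struct SketchBackend {
    std::string path;
    ModelState state = ModelState::COLD;
};

// A tier references a (possibly shared) backend and carries its own policy.
struct SketchTier {
    std::shared_ptr<SketchBackend> backend;
    bool keep_warm = false;
};

// Pool that hands out one backend per unique path string.
class SketchPool {
public:
    // Returns the existing backend for this path, or creates one on first use.
    std::shared_ptr<SketchBackend> acquire(const std::string& gguf_path) {
        auto it = backends_.find(gguf_path);
        if (it != backends_.end()) {
            return it->second;
        }
        auto backend = std::make_shared<SketchBackend>();
        backend->path = gguf_path;
        backends_.emplace(gguf_path, backend);
        return backend;
    }

    std::size_t size() const { return backends_.size(); }

private:
    std::unordered_map<std::string, std::shared_ptr<SketchBackend>> backends_;
};

// Swap rule: the outgoing tier drops to WARM if keep_warm is set, otherwise
// to COLD; the incoming tier becomes ACTIVE. If both tiers share a backend,
// the backend is not demoted (an assumption of this sketch).
inline void swap_tier(SketchTier& outgoing, SketchTier& incoming) {
    assert(outgoing.backend && incoming.backend);
    if (outgoing.backend != incoming.backend) {
        outgoing.backend->state =
            outgoing.keep_warm ? ModelState::WARM : ModelState::COLD;
    }
    incoming.backend->state = ModelState::ACTIVE;
}
```

In this sketch, two tiers configured with the same path receive the same shared_ptr, so the pool holds a single backend for them while each tier keeps its own keep_warm setting.
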

Thread safety
Uses std::mutex for tier swap operations. State queries on individual backends are lock-free (atomic). Generation calls are not serialized.
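
One way to picture this locking split is the sketch below. It is an illustration under assumptions, not the header's real declarations: LifecycleState, AtomicBackend, and SwapGuardedOrchestrator are hypothetical names, and the memory orderings are a choice made for the example.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical names; none of these are declared in orchestrator.h.
enum class LifecycleState : int { COLD, WARM, ACTIVE };

// Backend whose state can be read without taking any lock.
class AtomicBackend {
public:
    LifecycleState state() const {
        return state_.load(std::memory_order_acquire);
    }
    void set_state(LifecycleState s) {
        state_.store(s, std::memory_order_release);
    }

private:
    std::atomic<LifecycleState> state_{LifecycleState::COLD};
};

class SwapGuardedOrchestrator {
public:
    // Tier swaps are serialized: only one swap runs at a time.
    void swap(AtomicBackend& outgoing, AtomicBackend& incoming, bool keep_warm) {
        std::lock_guard<std::mutex> lock(swap_mutex_);
        outgoing.set_state(keep_warm ? LifecycleState::WARM
                                     : LifecycleState::COLD);
        incoming.set_state(LifecycleState::ACTIVE);
    }

    // State query path: reads the atomic directly and never touches the mutex,
    // so callers are not blocked behind an in-progress swap.
    bool can_generate(const AtomicBackend& backend) const {
        return backend.state() == LifecycleState::ACTIVE;
    }

private:
    std::mutex swap_mutex_;
};
```

Because the query path only reads an atomic, a caller may observe a state that changes immediately afterwards; in a design like this, callers that need a stable view across a swap have to coordinate at a higher level.
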

Internal to the inference shared library (.so).

Version
1.8.2

Definition in file orchestrator.h.