Named GPU resource profile for controlling inference hardware knobs.
#include <entropic/types/config.h>
| Type | Member | Description |
| --- | --- | --- |
| std::string | name | Profile name ("maximum", "balanced", "background", "minimal") |
| int | n_batch = 512 | Batch size for prompt processing (1-2048) |
| int | n_threads = 0 | CPU threads for generation (0 = auto-detect) |
| int | n_threads_batch = 0 | CPU threads for batch processing (0 = use n_threads) |
| std::string | description | Human-readable description. |
Profiles are applied at the LlamaCppBackend level before each generation call. They control batch size and thread counts without requiring a model reload. Profile switching is sub-millisecond.
- Bundled profiles: the engine ships with four profiles (maximum, balanced, background, minimal). Consumers can register custom profiles at runtime via ProfileRegistry.
- Version: 1.9.7
Definition at line 215 of file config.h.
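The custom-profile registration mentioned above can be sketched as follows. The struct is a local mirror of the documented fields and defaults, not the real header. `SketchProfileRegistry`, its `add` and `find` methods, and the "overnight" profile with its values are hypothetical stand-ins, since the actual ProfileRegistry interface is not shown on this page.

```cpp
#include <map>
#include <string>

// Local mirror of the documented fields and defaults (not the real header).
struct GPUResourceProfile {
    std::string name;
    int n_batch = 512;        // batch size for prompt processing (1-2048)
    int n_threads = 0;        // 0 = auto-detect
    int n_threads_batch = 0;  // 0 = use n_threads
    std::string description;
};

// Hypothetical stand-in for ProfileRegistry: a simple name-keyed map.
class SketchProfileRegistry {
public:
    // Adds or replaces a profile; rejects profiles without a name.
    bool add(const GPUResourceProfile& profile) {
        if (profile.name.empty()) return false;
        profiles_[profile.name] = profile;
        return true;
    }

    // Returns the stored profile, or nullptr if the name is unknown.
    const GPUResourceProfile* find(const std::string& name) const {
        std::map<std::string, GPUResourceProfile>::const_iterator it =
            profiles_.find(name);
        return it == profiles_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, GPUResourceProfile> profiles_;
};

// Builds an illustrative custom profile; all values here are made up.
inline GPUResourceProfile make_overnight_profile() {
    GPUResourceProfile p;
    p.name = "overnight";
    p.n_batch = 128;
    p.n_threads = 2;
    p.description = "Illustrative low-impact profile";
    return p;
}
```

Keying the map by name matches how the page identifies profiles, but whether the real registry replaces or rejects a duplicate name is not stated here.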
◆ description
| std::string entropic::GPUResourceProfile::description |
Human-readable description.
Definition at line 220 of file config.h.
◆ n_batch
| int entropic::GPUResourceProfile::n_batch = 512 |
Batch size for prompt processing (1-2048)
Definition at line 217 of file config.h.
◆ n_threads
| int entropic::GPUResourceProfile::n_threads = 0 |
CPU threads for generation (0 = auto-detect)
Definition at line 218 of file config.h.
◆ n_threads_batch
| int entropic::GPUResourceProfile::n_threads_batch = 0 |
CPU threads for batch processing (0 = use n_threads)
Definition at line 219 of file config.h.
◆ name
| std::string entropic::GPUResourceProfile::name |
Profile name ("maximum", "balanced", "background", "minimal")
Definition at line 216 of file config.h.
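The 0 sentinels on n_threads and n_threads_batch, together with the documented n_batch range, imply a resolution step before the values reach the backend. The following is a minimal sketch under stated assumptions: auto-detection uses std::thread::hardware_concurrency(), out-of-range batch sizes are clamped rather than rejected, non-positive thread counts are treated like 0, and n_threads_batch falls back to the already-resolved n_threads. `EffectiveSettings` and `resolve` are hypothetical names; the engine's actual logic is not documented on this page.

```cpp
#include <algorithm>
#include <string>
#include <thread>

// Local mirror of the documented fields and defaults (not the real header).
struct GPUResourceProfile {
    std::string name;
    int n_batch = 512;        // batch size for prompt processing (1-2048)
    int n_threads = 0;        // 0 = auto-detect
    int n_threads_batch = 0;  // 0 = use n_threads
    std::string description;
};

// Hypothetical container for the concrete values after sentinels are resolved.
struct EffectiveSettings {
    int n_batch;
    int n_threads;
    int n_threads_batch;
};

// Resolves sentinels and clamps the batch size (assumed behavior).
// `detected` is injectable so the logic can be exercised deterministically.
inline EffectiveSettings resolve(
    const GPUResourceProfile& p,
    unsigned detected = std::thread::hardware_concurrency()) {
    // hardware_concurrency() may return 0 when the count is unknown.
    const int auto_threads = detected > 0 ? static_cast<int>(detected) : 1;

    EffectiveSettings out;
    out.n_batch = std::min(std::max(p.n_batch, 1), 2048);
    out.n_threads = p.n_threads > 0 ? p.n_threads : auto_threads;
    out.n_threads_batch =
        p.n_threads_batch > 0 ? p.n_threads_batch : out.n_threads;
    return out;
}
```

With a default-constructed profile and eight detected threads, this sketch yields a batch size of 512 and eight threads for both generation and batch processing.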
The documentation for this struct was generated from the following file: