Entropic 2.3.8
Local-first agentic inference engine
Gemma 4 chat adapter (v2.1.9, gh#46).
#include <entropic/inference/adapters/adapter_base.h>

Classes | |
| class | entropic::Gemma4Adapter |
| Gemma 4 chat adapter (covers A4B / E4B / E2B variants). | |
Namespaces | |
| namespace | entropic |
| Activate model on GPU (WARM → ACTIVE). | |
Gemma 4 chat adapter (v2.1.9, gh#46).
Covers the Gemma 4 instruct family: 26B-A4B, E4B, E2B variants. All three share the same chat template structure and channel-based thinking convention, so a single adapter class handles them.
The chat template uses <|channel>thought\n... <channel|> markers for the reasoning/answer split. We do not replicate the template here: chat_format() returns the empty string so llama.cpp applies the GGUF-stored template directly.
Tool calls open with <|im_start|>tool_call, carry a JSON body, and close with a plain </tool_call> (an asymmetric pair). Both sizes scored 0/6 completion before the fix because none of the prior open variants matched that header. The adapter parses with this layered strategy:
1. parse_tagged_tool_calls, which now accepts <tool_call>, <|tool_call>, <|tool_call|>, and the <|im_start|>tool_call channel header (gh#69), all closed by </tool_call>.
2. "name": base class parse_bare_json_tool_calls.
Surface scrubbing (parse_tool_calls in the .cpp) removes the channel block and any stray <|im_start|>tool_call / <|im_end|> turn markers so they never reach the assistant-visible body.
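The layered strategy above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the adapter's actual code: trim, extract_tagged_bodies, and extract_tool_call_bodies are hypothetical names, the functions return raw JSON bodies rather than parsed calls, and the bare-JSON check (a trimmed {...} object that contains a "name" key) is only a guess at what the second layer looks for. It assumes C++17.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: strip leading/trailing ASCII whitespace.
inline std::string_view trim(std::string_view s) {
    constexpr std::string_view ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string_view::npos) return {};
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Layer 1 (sketch): accept any of the four opening markers listed in the
// text, each closed by a plain "</tool_call>". Returns the raw bodies.
inline std::vector<std::string> extract_tagged_bodies(std::string_view text) {
    static constexpr std::array<std::string_view, 4> kOpeners = {
        "<|im_start|>tool_call", "<|tool_call|>", "<|tool_call>", "<tool_call>"};
    static constexpr std::string_view kClose = "</tool_call>";

    std::vector<std::string> bodies;
    std::size_t pos = 0;
    while (pos < text.size()) {
        // Pick whichever opener occurs earliest at or after pos.
        std::size_t best = std::string_view::npos;
        std::size_t best_len = 0;
        for (std::string_view opener : kOpeners) {
            const std::size_t at = text.find(opener, pos);
            if (at < best) {
                best = at;
                best_len = opener.size();
            }
        }
        if (best == std::string_view::npos) break;

        const std::size_t body_start = best + best_len;
        const std::size_t close_at = text.find(kClose, body_start);
        if (close_at == std::string_view::npos) break;  // unterminated call

        bodies.emplace_back(trim(text.substr(body_start, close_at - body_start)));
        pos = close_at + kClose.size();
    }
    return bodies;
}

// Layers combined (sketch): try tagged calls first, then fall back to a
// bare JSON object that mentions "name" (assumed shape of the fallback).
inline std::vector<std::string> extract_tool_call_bodies(std::string_view text) {
    std::vector<std::string> bodies = extract_tagged_bodies(text);
    if (!bodies.empty()) return bodies;

    const std::string_view bare = trim(text);
    if (!bare.empty() && bare.front() == '{' && bare.back() == '}' &&
        bare.find("\"name\"") != std::string_view::npos) {
        bodies.emplace_back(bare);
    }
    return bodies;
}
```

Trying the tagged layer first means the bare-JSON fallback only runs when no marker matched, so a JSON snippet quoted in ordinary prose next to a tagged call is not picked up twice.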
If <think>...</think> blocks appear (a common llama.cpp convention for thinking models), the base-class strip_think_blocks cleans them. If a different marker shape leaks through, that is addressed in the same model-test loop.
Internal to inference .so.
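The <think>...</think> cleanup mentioned above can be illustrated with a small sketch. It is not the base-class strip_think_blocks implementation: the name strip_think_blocks_sketch is hypothetical, and dropping everything after an unterminated <think> is an assumption made here for simplicity. It assumes C++17.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Sketch only: remove every <think>...</think> span from the text.
// Assumption: an unterminated <think> swallows the rest of the text.
inline std::string strip_think_blocks_sketch(std::string_view text) {
    constexpr std::string_view kOpen = "<think>";
    constexpr std::string_view kClose = "</think>";

    std::string out;
    out.reserve(text.size());
    std::size_t pos = 0;
    while (true) {
        const std::size_t open_at = text.find(kOpen, pos);
        if (open_at == std::string_view::npos) {
            out.append(text.substr(pos));  // no more blocks: keep the tail
            break;
        }
        out.append(text.substr(pos, open_at - pos));  // keep text before block
        const std::size_t close_at = text.find(kClose, open_at + kOpen.size());
        if (close_at == std::string_view::npos) break;  // unterminated block
        pos = close_at + kClose.size();
    }
    return out;
}
```

For example, passing "<think>plan</think>Answer" yields "Answer", and text without markers is returned unchanged.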
Definition in file gemma4_adapter.h.