Gemma 4 chat adapter (v2.1.9, gh#46). More...

#include <entropic/inference/adapters/adapter_base.h>

Include dependency graph for gemma4_adapter.h:

This graph shows which files directly or indirectly include this file:

Classes
class	entropic::Gemma4Adapter
	Gemma 4 chat adapter (covers A4B / E4B / E2B variants). More...

Namespaces
namespace	entropic
	Activate model on GPU (WARM → ACTIVE).

Detailed Description

Gemma 4 chat adapter (v2.1.9, gh#46).

Covers the Gemma 4 instruct family: 26B-A4B, E4B, E2B variants. All three share the same chat template structure and channel-based thinking convention, so a single adapter class handles them.

Chat template: The GGUF-embedded template uses Gemma's <|channel>thought\n... <channel|> markers for the reasoning/answer split. We do not replicate the template here — chat_format() returns the empty string so llama.cpp applies the GGUF-stored template directly.

Tool-call format (resolved at v2.3.8, gh#69)

Empirical capture from gemma-4-E2B-it / E4B-it confirmed Gemma 4 emits tool calls inside a ChatML-style channel: opening header <|im_start|>tool_call, JSON body, plain </tool_call> close (an asymmetric pair). Both sizes scored 0/6 completion before the fix because none of the prior open variants matched that header. The adapter parses with this layered strategy:

Primary: tagged JSON via base parse_tagged_tool_calls, which now accepts <tool_call>, <|tool_call>, <|tool_call|>, and the <|im_start|>tool_call channel header (gh#69) — all closed by </tool_call>.
Fallback: bare-JSON lines containing "name" — base class parse_bare_json_tool_calls.
Final fallback: malformed-JSON recovery via the base class.

Surface scrubbing (parse_tool_calls in the .cpp) removes the channel block and any stray <|im_start|>tool_call / <|im_end|> turn markers so they never reach the assistant-visible body.

Reasoning: Gemma 4 emits a channel-tagged thought block. If the GGUF template surfaces those tokens as <think>...</think> (common llama.cpp convention for thinking models), the base-class strip_think_blocks cleans them. If a different marker shape leaks through, that is addressed in the same model-test loop.

Internal to inference .so.

Version: 2.1.9

Definition in file gemma4_adapter.h.

Classes

Namespaces

Detailed Description