void on_token(const char* chunk, size_t len);
void* raw_ud_ = nullptr;
bool in_think_ = false;
std::string utf8_buf_;
void emit_utf8_safe(const char* data, size_t len);
void process_byte(char c);
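The listing declares `emit_utf8_safe` and `utf8_buf_` but does not show their bodies. The names suggest that output is held back until it ends on a complete UTF-8 sequence, so that a multi-byte character split across chunks is never delivered in halves. The sketch below illustrates that idea under this assumption only. `utf8_complete_prefix` and `Utf8SafeEmitter` are hypothetical names, not part of the project.

```cpp
#include <cstddef>
#include <string>

// Callback shape as documented for TokenCallback.
typedef void (*TokenCallback)(const char*, size_t, void*);

// Hypothetical helper: number of leading bytes of `s` that do not end in
// a truncated multi-byte UTF-8 sequence.
inline size_t utf8_complete_prefix(const std::string& s) {
    const size_t n = s.size();
    size_t i = n;
    size_t back = 0;  // bytes inspected from the end
    while (i > 0 && back < 4) {
        const unsigned char b = static_cast<unsigned char>(s[i - 1]);
        --i;
        ++back;
        if ((b & 0xC0) != 0x80) {  // not a continuation byte (10xxxxxx)
            size_t need = 1;       // ASCII or an invalid lead byte
            if ((b >> 5) == 0x06) need = 2;       // 110xxxxx
            else if ((b >> 4) == 0x0E) need = 3;  // 1110xxxx
            else if ((b >> 3) == 0x1E) need = 4;  // 11110xxx
            // Sequence is complete if enough bytes follow the lead byte.
            return back >= need ? n : i;
        }
    }
    return n;  // empty input or malformed tail: pass through unchanged
}

// Hypothetical emitter: buffers bytes and forwards only complete sequences.
struct Utf8SafeEmitter {
    TokenCallback out;
    void* ud;
    std::string buf;  // plays the role of utf8_buf_ in the listing

    void emit(const char* data, size_t len) {
        buf.append(data, len);
        const size_t ok = utf8_complete_prefix(buf);
        if (ok > 0) out(buf.data(), ok, ud);
        buf.erase(0, ok);  // keep the incomplete tail for the next call
    }
};
```

With this approach a euro sign (`E2 82 AC`) arriving as two chunks is forwarded only after its third byte arrives. Malformed input is passed through rather than validated, which keeps the sketch short.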
Streaming filter that removes <think> blocks from output.
void set_raw_callback(TokenCallback cb, void *ud)
Set optional raw callback (receives ALL tokens unfiltered).
void on_token(const char *chunk, size_t len)
Process a chunk of tokens.
void flush()
Flush any buffered partial tag content.
Activate model on GPU (WARM → ACTIVE).
void(*)(const char *, size_t, void *) TokenCallback
Token callback type matching the C API signature.