Entropic 2.3.8
Local-first agentic inference engine
Pure C interface contract for inference backends.
Typedefs
| typedef struct entropic_inference_backend * | entropic_inference_backend_t | Opaque handle to an inference backend instance. |
Functions
| entropic_error_t | entropic_inference_load (entropic_inference_backend_t backend, const char *config_json) | Load a model from config (COLD → WARM). |
| entropic_error_t | entropic_inference_activate (entropic_inference_backend_t backend) | Activate model on GPU (WARM → ACTIVE). |
| entropic_error_t | entropic_inference_deactivate (entropic_inference_backend_t backend) | Deactivate model (ACTIVE → WARM). |
| entropic_error_t | entropic_inference_unload (entropic_inference_backend_t backend) | Unload model completely (→ COLD). |
| int | entropic_inference_state (entropic_inference_backend_t backend) | Query model state (lock-free). |
| entropic_error_t | entropic_inference_generate (entropic_inference_backend_t backend, const char *messages_json, const char *params_json, char **result_json) | Generate a response from messages (batch mode). |
| entropic_error_t | entropic_inference_generate_streaming (entropic_inference_backend_t backend, const char *messages_json, const char *params_json, void(*on_token)(const char *token, size_t len, void *user_data), void *user_data, int *cancel_flag) | Generate with streaming token callback. |
| entropic_error_t | entropic_inference_complete (entropic_inference_backend_t backend, const char *prompt, const char *params_json, char **result_json) | Raw text completion without chat template. |
| int | entropic_inference_count_tokens (entropic_inference_backend_t backend, const char *text, size_t text_len) | Count tokens in text using model's tokenizer. |
| void | entropic_inference_destroy (entropic_inference_backend_t backend) | Destroy backend instance and free all resources. |
| void | entropic_inference_free (void *ptr) | Free a string allocated by the inference backend. |
| int | entropic_inference_supports (entropic_inference_backend_t backend, int capability) | Query backend capability. |
| uint32_t | entropic_inference_capabilities (entropic_inference_backend_t backend) | Get all supported capabilities as bitmask. |
| char * | entropic_inference_info (entropic_inference_backend_t backend) | Get backend metadata as JSON. |
| entropic_error_t | entropic_inference_save_state (entropic_inference_backend_t backend, int seq_id, void **buffer, size_t *buffer_size) | Save model state for a sequence. |
| entropic_error_t | entropic_inference_restore_state (entropic_inference_backend_t backend, int seq_id, const void *buffer, size_t buffer_size) | Restore model state for a sequence. |
| entropic_error_t | entropic_inference_clear_state (entropic_inference_backend_t backend, int seq_id) | Clear/reset model state. |
| entropic_error_t | entropic_inference_generate_seq (entropic_inference_backend_t backend, int seq_id, const char *messages_json, const char *params_json, char **result_json) | Generate with explicit sequence ID. |
| entropic_error_t | entropic_inference_generate_streaming_seq (entropic_inference_backend_t backend, int seq_id, const char *messages_json, const char *params_json, void(*on_token)(const char *token, size_t len, void *user_data), void *user_data, int *cancel_flag) | Streaming generation with explicit sequence ID. |
| void | entropic_inference_log_to_file (const char *path) | Redirect llama/ggml logs to a file. |
| void | entropic_inference_log_silence (void) | Silence all llama/ggml log output. |
This is the .so boundary for inference. All types are C-safe: opaque handles, error codes, function pointers, const char* JSON strings. No C++ types cross this boundary.
Definition in file i_inference_backend.h.
typedef struct entropic_inference_backend *entropic_inference_backend_t
Opaque handle to an inference backend instance.
Definition at line 42 of file i_inference_backend.h.
entropic_error_t entropic_inference_activate(entropic_inference_backend_t backend)
Activate model on GPU (WARM → ACTIVE).
If the model is COLD, it is loaded first (convenience path).
| backend | Opaque backend handle. |
Definition at line 185 of file inference_c_api.cpp.
uint32_t entropic_inference_capabilities(entropic_inference_backend_t backend)
Get all supported capabilities as bitmask.
| backend | Backend handle. |
entropic_error_t entropic_inference_clear_state(entropic_inference_backend_t backend, int seq_id)
Clear/reset model state.
| backend | Backend handle. |
| seq_id | Sequence identifier, or -1 for all sequences. |
entropic_error_t entropic_inference_complete(entropic_inference_backend_t backend, const char *prompt, const char *params_json, char **result_json)
Raw text completion without chat template.
Used by the router for digit-based classification. The prompt is passed directly to the model without any chat template formatting.
| backend | Opaque backend handle. |
| prompt | Raw, null-terminated prompt string. |
| params_json | JSON-serialized GenerationParams. |
| [out] result_json | Newly allocated JSON result string. Caller frees with entropic_inference_free(). |
Definition at line 330 of file inference_c_api.cpp.
int entropic_inference_count_tokens(entropic_inference_backend_t backend, const char *text, size_t text_len)
Count tokens in text using model's tokenizer.
Returns the exact count when the model is loaded (WARM or ACTIVE). Returns a text_len/4 estimate when the model is COLD.
| backend | Opaque backend handle. |
| text | Pointer to the UTF-8 text to tokenize. |
| text_len | Length of the text in bytes. |
Definition at line 356 of file inference_c_api.cpp.
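The COLD-state fallback can be approximated on the caller side when no handle is available yet. The sketch below is only an illustration: it assumes the estimate is an integer division of the byte length by 4 (the text above says only "len/4"), and estimate_tokens_cold and fits_in_budget are hypothetical helper names that are not part of this API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: mirrors the documented COLD-state estimate.
 * Assumption: integer division, so the result rounds down. */
static int estimate_tokens_cold(size_t text_len) {
    return (int)(text_len / 4);
}

/* Hypothetical helper: does a prompt, plus a reserved reply budget,
 * fit inside a context window? All quantities are in tokens. */
static int fits_in_budget(int prompt_tokens, int reply_tokens,
                          int context_tokens) {
    assert(prompt_tokens >= 0 && reply_tokens >= 0 && context_tokens >= 0);
    /* Subtract instead of adding to avoid overflow on large inputs. */
    return prompt_tokens <= context_tokens - reply_tokens;
}
```

Because the estimate is coarse, a caller that needs an exact count would load the model first so that entropic_inference_count_tokens() uses the real tokenizer.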
entropic_error_t entropic_inference_deactivate(entropic_inference_backend_t backend)
Deactivate model (ACTIVE → WARM).
No-op if the model is not ACTIVE.
| backend | Opaque backend handle. |
Definition at line 203 of file inference_c_api.cpp.
void entropic_inference_destroy(entropic_inference_backend_t backend)
Destroy backend instance and free all resources.
| backend | Opaque backend handle. NULL is a safe no-op. The handle must not be used after this call. |
Requirement: REQ-INFER-017
Definition at line 374 of file inference_c_api.cpp.
void entropic_inference_free(void *ptr)
Free a string allocated by the inference backend.
| ptr | Pointer returned by a previous generate, complete, or similar call. NULL is a safe no-op. |
Definition at line 386 of file inference_c_api.cpp.
entropic_error_t entropic_inference_generate(entropic_inference_backend_t backend, const char *messages_json, const char *params_json, char **result_json)
Generate a response from messages (batch mode).
Requires ACTIVE state. Returns ENTROPIC_ERROR_INVALID_STATE otherwise.
| backend | Opaque backend handle. |
| messages_json | JSON array of message objects. |
| params_json | JSON object of GenerationParams fields. |
| [out] result_json | Newly allocated JSON result string. Caller frees with entropic_inference_free(). |
Definition at line 257 of file inference_c_api.cpp.
entropic_error_t entropic_inference_generate_seq(entropic_inference_backend_t backend, int seq_id, const char *messages_json, const char *params_json, char **result_json)
Generate with explicit sequence ID.
| backend | Backend handle. |
| seq_id | Sequence identifier (0 for single-sequence backends). |
| messages_json | JSON string of messages. |
| params_json | JSON string of generation params. |
| [out] result_json | JSON result string. Caller frees. |
entropic_error_t entropic_inference_generate_streaming(entropic_inference_backend_t backend, const char *messages_json, const char *params_json, void(*on_token)(const char *token, size_t len, void *user_data), void *user_data, int *cancel_flag)
Generate with streaming token callback.
Requires ACTIVE state. The on_token callback is invoked for each generated token. The token pointer is valid only for the duration of the callback; the caller must copy the bytes if retention is needed.
| backend | Opaque backend handle. |
| messages_json | JSON array of message objects. |
| params_json | JSON object of GenerationParams fields. |
| on_token | Callback fired for each token with the token bytes, their length, and user_data. Must not call back into the API. |
| user_data | Opaque pointer forwarded to on_token. |
| cancel_flag | Optional pointer to an int flag. Setting *cancel_flag to a non-zero value (for example 1) stops generation. The flag is checked between tokens, so cancellation latency is one token. May be NULL if cancellation is not needed. |
Definition at line 290 of file inference_c_api.cpp.
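Since the token pointer is only valid inside the callback, a typical on_token implementation copies the bytes into storage it owns. The following is a minimal sketch of such a callback, not code from the library: token_sink and sink_on_token are hypothetical names, and setting the cancel flag from inside the callback is an assumption that relies only on the documented "checked between tokens" behavior. The callback touches only its own memory, so it does not call back into the API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulator passed to the API as user_data. */
typedef struct {
    char  *data;        /* NUL-terminated copy of all tokens so far */
    size_t len;         /* bytes stored, excluding the terminator */
    size_t cap;         /* bytes allocated */
    size_t max_bytes;   /* request cancellation at this size; 0 = no limit */
    int   *cancel_flag; /* same pointer that is handed to the API, or NULL */
} token_sink;

/* Matches the documented callback signature. */
static void sink_on_token(const char *token, size_t len, void *user_data) {
    token_sink *s = (token_sink *)user_data;
    assert(s != NULL && token != NULL);

    if (s->len + len + 1 > s->cap) {            /* grow geometrically */
        size_t new_cap = s->cap ? s->cap : 64;
        while (new_cap < s->len + len + 1) new_cap *= 2;
        char *p = realloc(s->data, new_cap);
        if (p == NULL) {                        /* out of memory: ask to stop */
            if (s->cancel_flag) *s->cancel_flag = 1;
            return;
        }
        s->data = p;
        s->cap = new_cap;
    }
    memcpy(s->data + s->len, token, len);       /* copy; pointer dies on return */
    s->len += len;
    s->data[s->len] = '\0';

    if (s->cancel_flag && s->max_bytes && s->len >= s->max_bytes)
        *s->cancel_flag = 1;                    /* takes effect before next token */
}
```

A caller would zero-initialize a token_sink, pass sink_on_token as on_token and the sink's address as user_data, and release sink.data with free() (not entropic_inference_free(), since the sink allocated it) once generation returns.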
entropic_error_t entropic_inference_generate_streaming_seq(entropic_inference_backend_t backend, int seq_id, const char *messages_json, const char *params_json, void(*on_token)(const char *token, size_t len, void *user_data), void *user_data, int *cancel_flag)
Streaming generation with explicit sequence ID.
| backend | Backend handle. |
| seq_id | Sequence identifier. |
| messages_json | JSON string of messages. |
| params_json | JSON string of generation params. |
| on_token | Callback for each token. Must not call back into the API. |
| user_data | Opaque pointer forwarded to on_token. |
| cancel_flag | Pointer to int flag. Set to 1 to cancel. May be NULL. |
char *entropic_inference_info(entropic_inference_backend_t backend)
Get backend metadata as JSON.
| backend | Backend handle. |
entropic_error_t entropic_inference_load(entropic_inference_backend_t backend, const char *config_json)
Load a model from config (COLD → WARM).
| backend | Opaque backend handle from entropic_create_inference_backend(). |
| config_json | JSON-serialized string of ModelConfig fields. |
Definition at line 160 of file inference_c_api.cpp.
void entropic_inference_log_silence(void)
Silence all llama/ggml log output.
Definition at line 521 of file inference_c_api.cpp.
void entropic_inference_log_to_file(const char *path)
Redirect llama/ggml logs to a file.
Opens the file (truncating it) and redirects all llama.cpp and ggml log output to it. Call with NULL to silence logs entirely.
With multiple handles, the first call wins (gh#58): a later call from a second handle whose canonical path differs is rejected with a warning rather than clobbering the live redirect. Calling again with the same path truncates and reopens the file, which preserves the pre-v2.2.5 reset-on-recall behavior.
| path | Log file path (null-terminated), or NULL to silence. |
Definition at line 486 of file inference_c_api.cpp.
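The first-call-wins rule can be pictured as a small decision function. This is a simplified model written for illustration, not the library's implementation: decide_redirect, log_decision and live_path are hypothetical names, paths are compared as plain strings (the real check uses canonical paths), and the model assumes a NULL call leaves the remembered path untouched.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical outcome labels for one call with a given path. */
typedef enum { LOG_OPENED, LOG_REOPENED, LOG_REJECTED, LOG_SILENCED } log_decision;

/* Path of the live redirect; an empty string means none has been set. */
static char live_path[1024];

static log_decision decide_redirect(const char *path) {
    if (path == NULL)
        return LOG_SILENCED;                 /* NULL silences logs */
    assert(strlen(path) < sizeof live_path); /* model limit, not an API limit */
    if (live_path[0] == '\0') {              /* first call wins */
        snprintf(live_path, sizeof live_path, "%s", path);
        return LOG_OPENED;
    }
    if (strcmp(live_path, path) == 0)
        return LOG_REOPENED;                 /* same path: truncate and reopen */
    return LOG_REJECTED;                     /* different path: keep live redirect */
}
```

In this model a process that calls with "/tmp/a.log" twice gets LOG_OPENED and then LOG_REOPENED, while a later call with "/tmp/b.log" gets LOG_REJECTED.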
entropic_error_t entropic_inference_restore_state(entropic_inference_backend_t backend, int seq_id, const void *buffer, size_t buffer_size)
Restore model state for a sequence.
| backend | Backend handle. |
| seq_id | Sequence identifier. |
| buffer | State data from a previous save_state call. |
| buffer_size | Size of the state data in bytes. |
entropic_error_t entropic_inference_save_state(entropic_inference_backend_t backend, int seq_id, void **buffer, size_t *buffer_size)
Save model state for a sequence.
| backend | Backend handle. |
| seq_id | Sequence identifier (0 for single-sequence). |
| [out] buffer | Pointer to state data. Caller frees with entropic_inference_free(). |
| [out] buffer_size | Size of state data in bytes. |
int entropic_inference_state(entropic_inference_backend_t backend)
Query model state (lock-free).
| backend | Opaque backend handle. |
Definition at line 241 of file inference_c_api.cpp.
int entropic_inference_supports(entropic_inference_backend_t backend, int capability)
Query backend capability.
| backend | Backend handle. |
| capability | Capability enum value (see BackendCapability). |
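When several capabilities have to be checked at once, testing the bitmask from entropic_inference_capabilities() locally avoids repeated calls. The sketch below shows ordinary bitmask handling under one loud assumption: that capability value n corresponds to bit n of the mask. This page does not state that mapping, so it should be verified against BackendCapability before use. All helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: capability enum value n maps to bit n of the mask. */
static uint32_t capability_bit(int capability) {
    assert(capability >= 0 && capability < 32);
    return (uint32_t)1 << capability;
}

/* Non-zero if the single capability is present in the mask. */
static int mask_supports(uint32_t mask, int capability) {
    return (mask & capability_bit(capability)) != 0;
}

/* Non-zero only if every bit in `required` is present in the mask. */
static int mask_supports_all(uint32_t mask, uint32_t required) {
    return (mask & required) == required;
}
```

If the mapping assumption does not hold for a given backend, entropic_inference_supports() remains the safe way to query a single capability.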
entropic_error_t entropic_inference_unload(entropic_inference_backend_t backend)
Unload model completely (→ COLD).
Releases all RAM and VRAM.
| backend | Opaque backend handle. |
Definition at line 222 of file inference_c_api.cpp.
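Taken together, load, activate, deactivate and unload describe a three-state lifecycle. The sketch below models those documented transitions as a pure function so they can be read in one place. It is an illustration only: the enum names and values are hypothetical (the integer returned by entropic_inference_state() is not specified on this page), and treating load on an already loaded model as a no-op is an assumption.

```c
#include <assert.h>

/* Hypothetical labels; not the values returned by entropic_inference_state(). */
typedef enum { ST_COLD, ST_WARM, ST_ACTIVE } model_state;
typedef enum { OP_LOAD, OP_ACTIVATE, OP_DEACTIVATE, OP_UNLOAD } model_op;

static model_state next_state(model_state s, model_op op) {
    switch (op) {
    case OP_LOAD:       return s == ST_COLD ? ST_WARM : s;   /* non-COLD: assumed no-op */
    case OP_ACTIVATE:   return ST_ACTIVE;                    /* COLD loads first */
    case OP_DEACTIVATE: return s == ST_ACTIVE ? ST_WARM : s; /* no-op if not ACTIVE */
    case OP_UNLOAD:     return ST_COLD;                      /* from any state */
    }
    return s;
}

/* generate and generate_streaming are documented to require ACTIVE. */
static int can_generate(model_state s) {
    return s == ST_ACTIVE;
}

/* Walks the documented path COLD -> WARM -> ACTIVE -> WARM -> COLD. */
static void check_lifecycle_walkthrough(void) {
    model_state s = ST_COLD;
    s = next_state(s, OP_LOAD);       assert(s == ST_WARM && !can_generate(s));
    s = next_state(s, OP_ACTIVATE);   assert(s == ST_ACTIVE && can_generate(s));
    s = next_state(s, OP_DEACTIVATE); assert(s == ST_WARM);
    s = next_state(s, OP_UNLOAD);     assert(s == ST_COLD);
}
```

The model ignores failures: in the real API each transition returns an entropic_error_t, and a failed call would presumably leave the state unchanged.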