Entropic 2.3.8
Local-first agentic inference engine
generation_result.h
// SPDX-License-Identifier: Apache-2.0
#pragma once

#include <string>
#include <vector>

namespace entropic {

    std::string content;
    std::string raw_content;
    std::vector<ToolCall> tool_calls;
    std::string finish_reason = "stop";
    int token_count = 0;
    double generation_time_ms = 0.0;

    /* ── Orchestrator timing (populated by ModelOrchestrator) ── */
    double routing_ms = 0.0;
    double swap_ms = 0.0;
    double total_ms = 0.0;

    /* ── v1.9.7: Throughput + time cap metadata ── */

    double throughput_tok_s = 0.0;

    bool time_limited = false;

    /* ── v1.9.13: Multi-sequence tracking ── */

    int seq_id = 0;

    /* ── Error state (for partial results on failure) ── */
    std::string error_message;

    bool ok() const { return error_code == ENTROPIC_OK; }
};

} // namespace entropic
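As an illustration of how a caller might interpret these fields, here is a minimal sketch. It does not include the real header: `ResultSketch`, `describe_outcome`, and `ENTROPIC_ERROR_EXAMPLE` are hypothetical names invented for this example, and the numeric value of `ENTROPIC_OK` and the default of `error_code` are assumptions, since the listing does not show them. Only the field names, the `ok()` body, and the documented `finish_reason` values come from the page.

```cpp
#include <string>

// Stand-in for the project's error enum. Only the name ENTROPIC_OK is
// confirmed by the listing; its value and ENTROPIC_ERROR_EXAMPLE are
// assumptions made for this sketch.
enum entropic_error_t { ENTROPIC_OK = 0, ENTROPIC_ERROR_EXAMPLE = 1 };

// Reduced, hypothetical mirror of the result struct: just the fields
// this sketch reads.
struct ResultSketch {
    std::string content;
    std::string finish_reason = "stop";
    bool time_limited = false;
    entropic_error_t error_code = ENTROPIC_OK;  // default is an assumption
    std::string error_message;

    bool ok() const { return error_code == ENTROPIC_OK; }
};

// Hypothetical helper: summarize how a generation ended.
// Errors are checked first, then the time cap, then token-limit truncation.
std::string describe_outcome(const ResultSketch& r) {
    if (!r.ok()) {
        return "error: " + r.error_message;
    }
    if (r.time_limited) {
        return "time limit reached";
    }
    if (r.finish_reason == "length") {
        return "truncated at max_tokens";
    }
    return "complete";
}
```

Checking `ok()` before anything else reflects the "partial results on failure" comment in the listing: `content` may be non-empty even when an error is set, so a caller probably should not treat non-empty text as success.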
Error types for cross-.so error reporting.
entropic_error_t
Error codes returned by all C API functions.
Definition error.h:35
@ ENTROPIC_OK
Success.
Definition error.h:36
Result of a single generation call.
bool time_limited
True if generation was terminated by the time limit rather than by EOS, a stop sequence, or max_tokens.
entropic_error_t error_code
Error code (ENTROPIC_OK if no error).
double swap_ms
Model swap time.
double routing_ms
Router classification time.
bool ok() const
True if generation completed without error.
double generation_time_ms
Wall-clock generation time.
int seq_id
Sequence identifier for multi-sequence backends.
double throughput_tok_s
Measured throughput for this generation (tok/s).
std::string raw_content
Raw model output before adapter processing.
std::string finish_reason
Finish reason: "stop", "length", "error".
int original_max_tokens
Original max_tokens before auto-adaptation reduced it.
std::string content
Generated text (cleaned by adapter).
std::vector<ToolCall> tool_calls
Tool calls parsed from content.
std::string error_message
Error description (empty if no error).
int token_count
Generated token count.
double total_ms
Total end-to-end time.
Tool call and tool result types.
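The member descriptions above relate token_count, generation_time_ms, and throughput_tok_s without stating a formula. One plausible reading, sketched here under that assumption, is tokens divided by wall-clock seconds. `tokens_per_second` is a hypothetical helper for illustration; the engine may measure throughput differently (for example, excluding prompt processing).

```cpp
// Hypothetical helper: derive a tok/s figure from a token count and a
// wall-clock duration in milliseconds. Returns 0.0 when either input is
// non-positive, matching the zero defaults of the fields in the listing.
double tokens_per_second(int token_count, double generation_time_ms) {
    if (token_count <= 0 || generation_time_ms <= 0.0) {
        return 0.0;
    }
    return token_count * 1000.0 / generation_time_ms;
}
```

For example, 100 tokens generated in 2000 ms would come out at 50 tok/s under this definition.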