title: AI Agents Module
The AI Agents module is agnostic to the library used. The SDK will instrument existing AI agents in certain frameworks or libraries (at the time of writing those are openai-agents in Python and Vercel AI in JavaScript). You may need to manually annotate spans for other libraries.
Spans Conventions
For your AI agents data to show up in the Sentry AI Agents Insights, at least one of the AI spans described below needs to be created with a well-defined name and data attributes. If the required data (marked with MUST or required) is missing, the data will not show up in the Agents dashboard.
We try to follow v1.36.0 of the OpenTelemetry Semantic Conventions for Generative AI as closely as possible. Being 100% compatible is not yet possible, because OpenTelemetry has "Span Events", which Sentry does not support. OpenTelemetry stores the input to and output from an AI model in span events. Since this is not possible in Sentry, we add this data to span attributes as a list.
<Alert level="success" title="Hint"> The [Sentry Conventions](https://github.com/getsentry/sentry-conventions/) have all the detailed specifications for `"gen_ai.*"` span attributes. Sentry Conventions is the single source of truth.
</Alert>
Create Agent Span
Describes GenAI agent creation and is usually applicable when working with remote agent services.
- Span `op` SHOULD be `"gen_ai.create_agent"`.
- Span `name` SHOULD be `"create_agent {gen_ai.agent.name}"`. (e.g. `"create_agent Weather Agent"`)
- Attribute `gen_ai.operation.name` MUST be `"create_agent"`.
- Attribute `gen_ai.agent.name` SHOULD be set to the agent's name. (e.g. `"Weather Agent"`)
- If provided, the attribute `gen_ai.request.model` MUST be the agent's default request model. (e.g. `"gpt-4o"`)
- If relevant, the `gen_ai.pipeline.name` attribute SHOULD be set to the name of the AI workflow, pipeline or chain within which the agent operates. (e.g. `"weather-pipeline"`)
- All Common Span Attributes SHOULD be set (all `required` common attributes MUST be set).
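As a rough illustration of the rules above, the following sketch assembles the `op`, `name`, and attributes of a create-agent span as plain Python data. The helper `create_agent_span` and its return shape are hypothetical and not part of any SDK; how the values are attached to a real span depends on the SDK you use.

```python
def create_agent_span(agent_name, model=None, pipeline=None):
    """Build op, name and attributes for a create-agent span (hypothetical helper)."""
    data = {
        "gen_ai.operation.name": "create_agent",  # MUST be "create_agent"
        "gen_ai.agent.name": agent_name,  # SHOULD be the agent's name
    }
    # Optional attributes are only set when a value is available.
    if model is not None:
        data["gen_ai.request.model"] = model
    if pipeline is not None:
        data["gen_ai.pipeline.name"] = pipeline
    return {
        "op": "gen_ai.create_agent",
        "name": f"create_agent {agent_name}",
        "data": data,
    }


span = create_agent_span("Weather Agent", model="gpt-4o", pipeline="weather-pipeline")
print(span["name"])
```

Keeping the span description as plain data makes it easy to check the naming rules in a unit test before wiring the values into real instrumentation.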
Invoke Agent Span
Describes AI agent invocation. Agent invocations represent operations that can include multiple model calls, or some auxiliary work that goes beyond transforming the model input and output.
- Span `op` SHOULD be `"gen_ai.invoke_agent"`.
- Span `name` SHOULD be `"invoke_agent {gen_ai.agent.name}"`. (e.g. `"invoke_agent Weather Agent"`) [8]
- Attribute `gen_ai.operation.name` MUST be `"invoke_agent"`.
- Attribute `gen_ai.agent.name` SHOULD be set to the agent's name. (e.g. `"Weather Agent"`)
- If provided, the attribute `gen_ai.request.model` MUST be the agent's default request model. (e.g. `"gpt-4o"`)
- If relevant, the `gen_ai.pipeline.name` attribute SHOULD be set to the name of the AI workflow, pipeline or chain within which the agent operates. (e.g. `"weather-pipeline"`)
- All Common Span Attributes SHOULD be set (all `required` common attributes MUST be set).
Additional attributes on the span:
Request Data
| Attribute | Type | Requirement Level | Description | Example |
|---|---|---|---|---|
| `gen_ai.input.messages` | string | optional | List of dictionaries describing the messages (prompts) given to the agent. [0], [1], [6], [7], [9] | '[{"role": "user", "parts": [{"type": "text", "content": "..."}]}]' |
| `gen_ai.tool.definitions` | string | optional | List of dictionaries describing the available tools. [0] | '[{"name": "random_number", "description": "..."}, ...]' |
| `gen_ai.system_instructions` | string | optional | The system instructions passed to the model. | "You are a helpful assistant." |
| `gen_ai.request.max_tokens` | int | optional | Model configuration parameter. | 500 |
| `gen_ai.request.seed` | string | optional | Seed for reproducible outputs. | "12345" |
| `gen_ai.request.frequency_penalty` | float | optional | Model configuration parameter. | 0.5 |
| `gen_ai.request.presence_penalty` | float | optional | Model configuration parameter. | 0.5 |
| `gen_ai.request.temperature` | float | optional | Model configuration parameter. | 0.1 |
| `gen_ai.request.top_p` | float | optional | Model configuration parameter. | 0.7 |
| `gen_ai.request.top_k` | int | optional | Limits model to K most likely next tokens. | 40 |
| `gen_ai.request.messages` | string | optional | Deprecated. Use gen_ai.input.messages instead. List of dictionaries describing the messages (prompts) given to the agent. [0] | '[{"role": "system", "content": "..."}, ...]' |
| `gen_ai.request.available_tools` | string | optional | Deprecated. Use gen_ai.tool.definitions instead. List of dictionaries describing the available tools. [0] | '[{"name": "random_number", "description": "..."}, ...]' |
Response Data
| Attribute | Type | Requirement Level | Description | Example |
|---|---|---|---|---|
| `gen_ai.output.messages` | string | optional | Stringified array of message objects representing the model's output. [0], [1] | '[{"role": "assistant", "parts": [{"type": "text", "content": "..."}]}]' |
| `gen_ai.response.streaming` | boolean | optional | Whether response was streamed asynchronously. | true |
| `gen_ai.response.text` | string | optional | Deprecated. Use gen_ai.output.messages instead. The text representation of the agent's response. | "The weather in Paris is rainy" |
| `gen_ai.response.tool_calls` | string | optional | Deprecated. Use gen_ai.output.messages instead. The tool calls in the model's response. [0] | '[{"name": "random_number", "type": "function_call", "arguments": "..."}]' |
Token Usage Data
| Attribute | Type | Requirement Level | Description | Example |
|---|---|---|---|---|
| `gen_ai.usage.input_tokens` | int | optional | The number of tokens used in the AI input (prompt), including cached tokens. [2] | 60 |
| `gen_ai.usage.input_tokens.cached` | int | optional | The number of cached tokens used in the AI input (prompt). | 50 |
| `gen_ai.usage.input_tokens.cache_write` | int | optional | Tokens written to cache when processing input. | 20 |
| `gen_ai.usage.output_tokens` | int | optional | The number of tokens used in the AI output, including reasoning tokens. [3] | 130 |
| `gen_ai.usage.output_tokens.reasoning` | int | optional | The number of tokens used for reasoning. | 30 |
| `gen_ai.usage.total_tokens` | int | optional | The sum of gen_ai.usage.input_tokens and gen_ai.usage.output_tokens. | 190 |
Cost Data
| Attribute | Type | Requirement Level | Description | Example |
|---|---|---|---|---|
| `gen_ai.cost.input_tokens` | double | optional | Cost of input tokens in USD (without cached). | 0.005 |
| `gen_ai.cost.output_tokens` | double | optional | Cost of output tokens in USD (without reasoning). | 0.015 |
| `gen_ai.cost.total_tokens` | double | optional | Total cost for tokens used. | 0.020 |
- [0]: Span attributes only allow primitive data types (like `int`, `float`, `boolean`, `string`). This means you need to use a stringified version of a list of dictionaries. Do NOT set the object/array `[{"foo": "bar"}]` but rather the string `'[{"foo": "bar"}]'` (must be parsable JSON).
- [1]: Messages use the format `{role, parts}` where `parts` is an array of typed objects: `[{"role": "user", "parts": [{"type": "text", "content": "..."}]}]`. The `role` must be `"user"`, `"assistant"`, `"tool"`, or `"system"`. For backwards compatibility, the legacy format `{role, content}` (e.g. `[{"role": "user", "content": "..."}]`) is also accepted.
- [2]: Cached tokens are a subset of input tokens; `gen_ai.usage.input_tokens` includes `gen_ai.usage.input_tokens.cached`.
- [3]: Reasoning tokens are a subset of output tokens; `gen_ai.usage.output_tokens` includes `gen_ai.usage.output_tokens.reasoning`.
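Notes [0] to [3] can be sketched in a few lines of Python. The helpers `stringify_messages` and `usage_attributes` are hypothetical names introduced for this example; they only show how a message list becomes a JSON string and how the token counts relate to one another.

```python
import json

ALLOWED_ROLES = {"user", "assistant", "tool", "system"}


def stringify_messages(messages):
    """Validate roles and return the JSON string form required for span attributes."""
    for message in messages:
        if message["role"] not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {message['role']!r}")
    return json.dumps(messages)


def usage_attributes(input_tokens, output_tokens, cached=0, reasoning=0):
    """Build token usage attributes; cached and reasoning counts are subsets, not additions."""
    if cached > input_tokens or reasoning > output_tokens:
        raise ValueError("cached and reasoning tokens must not exceed their totals")
    return {
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.input_tokens.cached": cached,
        "gen_ai.usage.output_tokens": output_tokens,
        "gen_ai.usage.output_tokens.reasoning": reasoning,
        # The total is input + output; the subsets are already counted in those.
        "gen_ai.usage.total_tokens": input_tokens + output_tokens,
    }


messages = [{"role": "user", "parts": [{"type": "text", "content": "Weather in Paris?"}]}]
attributes = {"gen_ai.input.messages": stringify_messages(messages)}
attributes.update(usage_attributes(60, 130, cached=50, reasoning=30))
```

With the example values from the tables (60 input tokens, 130 output tokens), the total comes to 190 because the 50 cached and 30 reasoning tokens are not added a second time.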
AI Client Span
This span represents a request to an AI model or service that generates a response or requests a tool call based on the input prompt.
- Span `op` SHOULD be `"gen_ai.{gen_ai.operation.name}"`. (e.g. `"gen_ai.chat"`)
- Span `name` SHOULD be `"{gen_ai.operation.name} {gen_ai.request.model}"`. (e.g. `"chat o3-mini"`)
- Attribute `gen_ai.operation.name` MUST be `"chat"`, `"embeddings"`, `"generate_content"` or `"text_completion"`. [4]
- Attribute `gen_ai.request.model` MUST be the requested model. (e.g. `"gpt-4o"`)
- Attribute `gen_ai.response.model` MUST be the concrete response model. (e.g. `"gpt-4o-2024-08-06"`)
- If the request originates from an agent, the `gen_ai.agent.name` attribute SHOULD be set to the name of the agent. (e.g. `"Weather Agent"`)
- If relevant, the `gen_ai.pipeline.name` attribute SHOULD be set to the name of the AI workflow, pipeline or chain within which the agent operates. (e.g. `"weather-pipeline"`)
- All Common Span Attributes SHOULD be set (all `required` common attributes MUST be set).
Additional attributes on the span:
Request Data
| Attribute | Type | Requirement Level | Description | Example |
|---|---|---|---|---|
| `gen_ai.input.messages` | string | optional | List of dictionaries describing the messages (prompts) sent to the LLM. [0], [1], [6], [7], [9] | '[{"role": "user", "parts": [{"type": "text", "content": "..."}]}]' |
| `gen_ai.tool.definitions` | string | optional | List of dictionaries describing the available tools. [0] | '[{"name": "random_number", "description": "..."}, ...]' |
| `gen_ai.system_instructions` | string | optional | The system instructions passed to the model. | "You are a helpful assistant." |
| `gen_ai.request.max_tokens` | int | optional | Model configuration parameter. | 500 |
| `gen_ai.request.seed` | string | optional | Seed for reproducible outputs. | "12345" |
| `gen_ai.request.frequency_penalty` | float | optional | Model configuration parameter. | 0.5 |
| `gen_ai.request.presence_penalty` | float | optional | Model configuration parameter. | 0.5 |
| `gen_ai.request.temperature` | float | optional | Model configuration parameter. | 0.1 |
| `gen_ai.request.top_p` | float | optional | Model configuration parameter. | 0.7 |
| `gen_ai.request.top_k` | int | optional | Limits model to K most likely next tokens. | 40 |
| `gen_ai.request.messages` | string | optional | Deprecated. Use gen_ai.input.messages instead. List of dictionaries describing the messages (prompts) sent to the LLM. [0] | '[{"role": "system", "content": "..."}, ...]' |
| `gen_ai.request.available_tools` | string | optional | Deprecated. Use gen_ai.tool.definitions instead. List of dictionaries describing the available tools. [0] | '[{"name": "random_number", "description": "..."}, ...]' |
Response Data
| Attribute | Type | Requirement Level | Description | Example |
|---|---|---|---|---|
| `gen_ai.output.messages` | string | optional | Stringified array of message objects representing the model's output. [0], [1] | '[{"role": "assistant", "parts": [{"type": "text", "content": "..."}]}]' |
| `gen_ai.response.finish_reasons` | string | optional | The reason why the model stopped generating. | "stop" |
| `gen_ai.response.id` | string | optional | Unique identifier for the completion. | "chatcmpl-abc123" |
| `gen_ai.response.streaming` | boolean | optional | Whether response was streamed asynchronously. | true |
| `gen_ai.response.time_to_first_token` | double | optional | Seconds until first response chunk in streaming. | 0.5 |
| `gen_ai.response.tokens_per_second` | double | optional | Output tokens per second throughput. | 50.0 |
| `gen_ai.response.text` | string | optional | Deprecated. Use gen_ai.output.messages instead. The text representation of the model's response. [0] | "The weather in Paris is rainy" |
| `gen_ai.response.tool_calls` | string | optional | Deprecated. Use gen_ai.output.messages instead. The tool calls in the model's response. [0] | '[{"name": "random_number", "type": "function_call", "arguments": "..."}]' |
Token Usage Data
| Attribute | Type | Requirement Level | Description | Example |
|---|---|---|---|---|
| `gen_ai.usage.input_tokens` | int | optional | The number of tokens used in the AI input (prompt), including cached tokens. [2] | 60 |
| `gen_ai.usage.input_tokens.cached` | int | optional | The number of cached tokens used in the AI input (prompt). | 50 |
| `gen_ai.usage.input_tokens.cache_write` | int | optional | Tokens written to cache when processing input. | 20 |
| `gen_ai.usage.output_tokens` | int | optional | The number of tokens used in the AI output, including reasoning tokens. [3] | 130 |
| `gen_ai.usage.output_tokens.reasoning` | int | optional | The number of tokens used for reasoning. | 30 |
| `gen_ai.usage.total_tokens` | int | optional | The sum of gen_ai.usage.input_tokens and gen_ai.usage.output_tokens. | 190 |
Cost Data
| Attribute | Type | Requirement Level | Description | Example |
|---|---|---|---|---|
| `gen_ai.cost.input_tokens` | double | optional | Cost of input tokens in USD (without cached). | 0.005 |
| `gen_ai.cost.output_tokens` | double | optional | Cost of output tokens in USD (without reasoning). | 0.015 |
| `gen_ai.cost.total_tokens` | double | optional | Total cost for tokens used. | 0.020 |
- [0]: Span attributes only allow primitive data types (like `int`, `float`, `boolean`, `string`). This means you need to use a stringified version of a list of dictionaries. Do NOT set the object/array `[{"foo": "bar"}]` but rather the string `'[{"foo": "bar"}]'` (must be parsable JSON).
- [1]: Messages use the format `{role, parts}` where `parts` is an array of typed objects: `[{"role": "user", "parts": [{"type": "text", "content": "..."}]}]`. The `role` must be `"user"`, `"assistant"`, `"tool"`, or `"system"`. For backwards compatibility, the legacy format `{role, content}` (e.g. `[{"role": "user", "content": "..."}]`) is also accepted.
- [2]: Cached tokens are a subset of input tokens; `gen_ai.usage.input_tokens` includes `gen_ai.usage.input_tokens.cached`.
- [3]: Reasoning tokens are a subset of output tokens; `gen_ai.usage.output_tokens` includes `gen_ai.usage.output_tokens.reasoning`.
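For an AI client span, `op` and `name` are derived from the operation name and the requested model. The sketch below shows one way to derive them while rejecting operation names outside the four listed for this span type. The helper `ai_client_span` is hypothetical, and raising on an unknown operation is a choice made for this example rather than documented SDK behavior.

```python
CLIENT_OPERATIONS = {"chat", "embeddings", "generate_content", "text_completion"}


def ai_client_span(operation, request_model, response_model, agent_name=None):
    """Build op, name and attributes for an AI client span (hypothetical helper)."""
    if operation not in CLIENT_OPERATIONS:
        raise ValueError(f"unsupported gen_ai.operation.name: {operation!r}")
    data = {
        "gen_ai.operation.name": operation,
        "gen_ai.request.model": request_model,  # the model that was asked for
        "gen_ai.response.model": response_model,  # the concrete model that answered
    }
    if agent_name is not None:
        data["gen_ai.agent.name"] = agent_name
    return {
        "op": f"gen_ai.{operation}",
        "name": f"{operation} {request_model}",
        "data": data,
    }


span = ai_client_span("chat", "gpt-4o", "gpt-4o-2024-08-06", agent_name="Weather Agent")
print(span["op"], "|", span["name"])
```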
Execute Tool Span
Describes a tool execution.
- Span `op` SHOULD be `"gen_ai.execute_tool"`.
- Span `name` SHOULD be `"execute_tool {gen_ai.tool.name}"`. (e.g. `"execute_tool query_database"`)
- Attribute `gen_ai.operation.name` MUST be `"execute_tool"`.
- Attribute `gen_ai.tool.name` SHOULD be set to the name of the tool. (e.g. `"query_database"`)
- Attribute `gen_ai.agent.name` SHOULD be set to the name of the agent that invoked the tool. (e.g. `"Weather Agent"`)
- If relevant, the `gen_ai.pipeline.name` attribute SHOULD be set to the name of the AI workflow, pipeline or chain within which the agent operates. (e.g. `"weather-pipeline"`)
- All Common Span Attributes SHOULD be set (all `required` common attributes MUST be set).
Additional attributes on the span:
Tool Data
| Attribute | Type | Requirement Level | Description | Example |
|---|---|---|---|---|
| `gen_ai.tool.name` | string | optional | Name of the tool executed. | "random_number" |
| `gen_ai.tool.description` | string | optional | Description of the tool executed. | "Tool returning a random number" |
| `gen_ai.tool.type` | string | optional | The type of the tool. | "function"; "extension"; "datastore" |
| `gen_ai.tool.call.arguments` | string | optional | Arguments of the tool call (stringified). | '{"max":10}' |
| `gen_ai.tool.call.result` | string | optional | Result of the tool call (stringified). | "7" |
| `gen_ai.tool.message` | string | optional | Response from a tool/function call passed to model. | "The random number is 7" |
| `gen_ai.tool.input` | string | optional | Deprecated. Use gen_ai.tool.call.arguments instead. Input that was given to the executed tool as string. | '{"max":10}' |
| `gen_ai.tool.output` | string | optional | Deprecated. Use gen_ai.tool.call.result instead. The output from the tool. | "7" |
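Tool arguments and results are stored as strings, so structured values need to be serialized first. The following sketch assumes a hypothetical helper `execute_tool_span` and uses compact JSON for the arguments; leaving string results untouched is a choice made for this example.

```python
import json


def execute_tool_span(tool_name, arguments, result, agent_name=None):
    """Build op, name and attributes for an execute-tool span (hypothetical helper)."""
    data = {
        "gen_ai.operation.name": "execute_tool",
        "gen_ai.tool.name": tool_name,
        # Compact separators produce '{"max":10}' rather than '{"max": 10}'.
        "gen_ai.tool.call.arguments": json.dumps(arguments, separators=(",", ":")),
        # Strings are kept as they are; everything else is serialized to JSON.
        "gen_ai.tool.call.result": result if isinstance(result, str) else json.dumps(result),
    }
    if agent_name is not None:
        data["gen_ai.agent.name"] = agent_name
    return {
        "op": "gen_ai.execute_tool",
        "name": f"execute_tool {tool_name}",
        "data": data,
    }


span = execute_tool_span("random_number", {"max": 10}, 7, agent_name="Weather Agent")
print(span["data"]["gen_ai.tool.call.arguments"])
```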
Handoff Span
A span that describes the handoff from one agent to another agent.
- Span `op` SHOULD be `"gen_ai.handoff"`.
- Span `name` SHOULD be `"handoff from {from_agent} to {to_agent}"`.
- Attribute `gen_ai.operation.name` MUST be `"handoff"`.
- All Common Span Attributes SHOULD be set (all `required` common attributes MUST be set).
Common Span Attributes
Some attributes are common to all AI Agents spans:
| Attribute | Type | Requirement Level | Description | Example |
|---|---|---|---|---|
| `gen_ai.operation.name` | string | required | The name of the operation being performed. [4] | "chat" |
| `gen_ai.system` | string | optional | The Generative AI product as identified by the client or server instrumentation. [5] | "openai" |
[4] Well-defined values for data attribute `gen_ai.operation.name`:
| Value | Description |
|---|---|
| "chat" | Chat completion operation (e.g. OpenAI Chat API) |
| "create_agent" | Create GenAI agent |
| "embeddings" | Embeddings operation (e.g. OpenAI Create Embeddings API) |
| "execute_tool" | Execute a tool |
| "generate_content" | Multimodal content generation (e.g. Gemini Generate Content) |
| "invoke_agent" | Invoke GenAI agent |
| "text_completion" | Text completion operation |
[5] Well-defined values for data attribute `gen_ai.system`:
| Value | Description |
|---|---|
| "anthropic" | Anthropic |
| "aws.bedrock" | AWS Bedrock |
| "az.ai.inference" | Azure AI Inference |
| "az.ai.openai" | Azure OpenAI |
| "cohere" | Cohere |
| "deepseek" | DeepSeek |
| "gcp.gemini" | Gemini |
| "gcp.gen_ai" | Any Google generative AI endpoint |
| "gcp.vertex_ai" | Vertex AI |
| "groq" | Groq |
| "ibm.watsonx.ai" | IBM Watsonx AI |
| "mistral_ai" | Mistral AI |
| "openai" | OpenAI |
| "perplexity" | Perplexity |
| "xai" | xAI |
[6] The input list should include the most recent messages up to and including the most recent previous model response. The previous model response is identified with an `"assistant"` or `"model"` role in common frameworks. If there is no previous model response in the input list, then all input items which are not system instructions should be included. System instructions must be added in `gen_ai.system_instructions`, and are not included in the `gen_ai.input.messages` list.
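One possible reading of note [6] is sketched below: system messages are moved into a separate instructions string, and the remaining list is cut so that it starts at the most recent model response. The function `split_input` and the helper `text` are hypothetical, and joining multiple system messages with newlines is an assumption made for this example.

```python
import json


def text(role, content):
    """Build a {role, parts} message with a single text part."""
    return {"role": role, "parts": [{"type": "text", "content": content}]}


def split_input(messages):
    """Return (system_instructions, input_messages) following one reading of note [6]."""
    system_texts = []
    conversation = []
    for message in messages:
        if message["role"] == "system":
            system_texts.extend(
                part["content"] for part in message["parts"] if part["type"] == "text"
            )
        else:
            conversation.append(message)
    # Find the most recent previous model response, if there is one.
    last_response = None
    for index, message in enumerate(conversation):
        if message["role"] in ("assistant", "model"):
            last_response = index
    selected = conversation if last_response is None else conversation[last_response:]
    return "\n".join(system_texts), selected


history = [
    text("system", "You are a helpful assistant."),
    text("user", "Hi"),
    text("assistant", "Hello!"),
    text("user", "Weather in Paris?"),
]
instructions, selected = split_input(history)
attributes = {
    "gen_ai.system_instructions": instructions,
    "gen_ai.input.messages": json.dumps(selected),
}
```

In this example the first user message is dropped, because it precedes the most recent model response.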
[7] Binary blobs in the input list should be replaced with the string `"[Blob substitute]"` in positions where binary data is expected in a given schema. Only binary blobs in positions where binary data is explicitly expected must be redacted. For example, in OpenAI Completions schema, only binary blobs in content blocks with type `image_url`, `input_audio` or `file` should be redacted.
[8] In some agent libraries, the agent name is optional, and some do not provide the user an option to name their agents. In these cases, the span name SHOULD be `"invoke_agent {call_id}"`, where `call_id` is some user-provided identifier for the agent invocation. For example, `functionId` in Vercel AI.
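The fallback in note [8] could be sketched as follows. The function `invoke_agent_span_name` is hypothetical, and the bare `"invoke_agent"` name used when neither identifier is available is an assumption, since the text above does not cover that case.

```python
def invoke_agent_span_name(agent_name=None, call_id=None):
    """Prefer the agent name, then a user-provided call identifier."""
    identifier = agent_name or call_id
    if not identifier:
        # Assumption: no rule is given for this case, so use the bare operation name.
        return "invoke_agent"
    return f"invoke_agent {identifier}"


print(invoke_agent_span_name(agent_name="Weather Agent"))
print(invoke_agent_span_name(call_id="weather-lookup"))
```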
[9] Image URLs in the data URL format in the input list should be replaced with the string `"[Blob substitute]"` in positions where binary data is expected. For example, data URLs like `data:image/png;base64` will be redacted, but HTTP URLs like `example.com/data?<a-base64-string>` will not be.
See here for the regex used.
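The distinction in note [9] can be illustrated with a simple pattern. The regular expression below is written for this example and is not the one used by the SDKs; it treats any URL that starts with `data:` and declares `;base64,` before its payload as a blob.

```python
import re

# Illustrative pattern only, not the regex referenced above.
DATA_URL_RE = re.compile(r"^data:[^,]*;base64,", re.IGNORECASE)
BLOB_SUBSTITUTE = "[Blob substitute]"


def redact_image_url(url):
    """Replace base64 data URLs with the substitute string; leave other URLs untouched."""
    return BLOB_SUBSTITUTE if DATA_URL_RE.match(url) else url


inline_image = redact_image_url("data:image/png;base64,iVBORw0KGgo=")
remote_image = redact_image_url("https://example.com/data?aGVsbG8=")
```

Here `inline_image` becomes the substitute string, while `remote_image` keeps its original URL even though the query string happens to contain base64 text.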