From 0b3babde7ff9f74b03a1a49adcdb319354d47d85 Mon Sep 17 00:00:00 2001
From: Roger Coll
Date: Fri, 31 Jan 2025 00:08:32 +0100
Subject: [PATCH 1/2] add User-agent OS attributes (#1434)

Co-authored-by: Trask Stalnaker
Co-authored-by: Joao Grassi <5938087+joaopgrassi@users.noreply.github.com>
---
 .chloggen/user_agent_os.yaml           | 22 ++++++++++++++++++++++
 docs/attributes-registry/user-agent.md | 24 ++++++++++++++++++++----
 model/user-agent/registry.yaml         | 23 +++++++++++++++++++++++
 3 files changed, 65 insertions(+), 4 deletions(-)
 create mode 100755 .chloggen/user_agent_os.yaml

diff --git a/.chloggen/user_agent_os.yaml b/.chloggen/user_agent_os.yaml
new file mode 100755
index 0000000000..7e70c687f0
--- /dev/null
+++ b/.chloggen/user_agent_os.yaml
@@ -0,0 +1,22 @@
+# Use this changelog template to create an entry for release notes.
+#
+# If your change doesn't affect end users you should instead start
+# your pull request title with [chore] or use the "Skip Changelog" label.
+
+# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
+change_type: 'enhancement'
+
+# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db)
+component: user-agent
+
+# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: Add `user_agent.os.name` and `user_agent.os.version` attributes
+
+# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
+# The values here must be integers.
+issues: [1433]
+
+# (Optional) One or more lines of additional information to render under the primary note.
+# These lines will be padded with 2 spaces and then inserted directly into the document.
+# Use pipe (|) for multiline entries.
+subtext:
diff --git a/docs/attributes-registry/user-agent.md b/docs/attributes-registry/user-agent.md
index eefb97d911..e5fa50e946 100644
--- a/docs/attributes-registry/user-agent.md
+++ b/docs/attributes-registry/user-agent.md
@@ -3,6 +3,9 @@
 
 # User agent
 
+- [User-agent Attributes](#user-agent-attributes)
+- [User-agent OS Attributes](#user-agent-os-attributes)
+
 ## User-agent Attributes
 
 Describes user-agent attributes.
@@ -11,14 +14,27 @@ Describes user-agent attributes.
 |---|---|---|---|---|
 | `user_agent.name` | string | Name of the user-agent extracted from original. Usually refers to the browser's name. [1] | `Safari`; `YourApp` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
 | `user_agent.original` | string | Value of the [HTTP User-Agent](https://www.rfc-editor.org/rfc/rfc9110.html#field.user-agent) header sent by the client. | `CERN-LineMode/2.15 libwww/2.17b3`; `Mozilla/5.0 (iPhone; CPU iPhone OS 14_7_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Mobile/15E148 Safari/604.1`; `YourApp/1.0.0 grpc-java-okhttp/1.27.2` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
-| `user_agent.synthetic.type` | string | Specifies the category of synthetic traffic, such as tests or bots. [2] | `bot`; `test` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
-| `user_agent.version` | string | Version of the user-agent extracted from original. Usually refers to the browser's version [3] | `14.1.2`; `1.0.0` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `user_agent.version` | string | Version of the user-agent extracted from original. Usually refers to the browser's version [2] | `14.1.2`; `1.0.0` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
 
 **[1] `user_agent.name`:** [Example](https://www.whatsmyua.info) of extracting browser's name from original string. In the case of using a user-agent for non-browser products, such as microservices with multiple names/versions inside the `user_agent.original`, the most significant name SHOULD be selected. In such a scenario it should align with `user_agent.version`
 
-**[2] `user_agent.synthetic.type`:** This attribute MAY be derived from the contents of the `user_agent.original` attribute. Components that populate the attribute are responsible for determining what they consider to be synthetic bot or test traffic. This attribute can either be set for self-identification purposes, or on telemetry detected to be generated as a result of a synthetic request. This attribute is useful for distinguishing between genuine client traffic and synthetic traffic generated by bots or tests.
+**[2] `user_agent.version`:** [Example](https://www.whatsmyua.info) of extracting browser's version from original string. In the case of using a user-agent for non-browser products, such as microservices with multiple names/versions inside the `user_agent.original`, the most significant version SHOULD be selected. In such a scenario it should align with `user_agent.name`
+
+## User-agent OS Attributes
+
+Describes the OS user-agent attributes.
+
+| Attribute | Type | Description | Examples | Stability |
+|---|---|---|---|---|
+| `user_agent.os.name` | string | Human readable operating system name. [3] | `iOS`; `Android`; `Ubuntu` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `user_agent.os.version` | string | The version string of the operating system as defined in [Version Attributes](/docs/resource/README.md#version-attributes). [4] | `14.2.1`; `18.04.1` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `user_agent.synthetic.type` | string | Specifies the category of synthetic traffic, such as tests or bots. [5] | `bot`; `test` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+
+**[3] `user_agent.os.name`:** For mapping user agent strings to OS names, libraries such as [ua-parser](https://github.com/ua-parser) can be utilized.
+
+**[4] `user_agent.os.version`:** For mapping user agent strings to OS versions, libraries such as [ua-parser](https://github.com/ua-parser) can be utilized.
 
-**[3] `user_agent.version`:** [Example](https://www.whatsmyua.info) of extracting browser's version from original string. In the case of using a user-agent for non-browser products, such as microservices with multiple names/versions inside the `user_agent.original`, the most significant version SHOULD be selected. In such a scenario it should align with `user_agent.name`
+**[5] `user_agent.synthetic.type`:** This attribute MAY be derived from the contents of the `user_agent.original` attribute. Components that populate the attribute are responsible for determining what they consider to be synthetic bot or test traffic. This attribute can either be set for self-identification purposes, or on telemetry detected to be generated as a result of a synthetic request. This attribute is useful for distinguishing between genuine client traffic and synthetic traffic generated by bots or tests.
 
 ---
 
diff --git a/model/user-agent/registry.yaml b/model/user-agent/registry.yaml
index 369aad35cd..d9d4ad5d8a 100644
--- a/model/user-agent/registry.yaml
+++ b/model/user-agent/registry.yaml
@@ -35,6 +35,29 @@ groups:
           using a user-agent for non-browser products, such as microservices with multiple names/versions inside the
           `user_agent.original`, the most significant version SHOULD be selected. In such a scenario it should align
           with `user_agent.name`
+
+  - id: registry.user_agent.os
+    type: attribute_group
+    display_name: User-agent OS Attributes
+    brief: "Describes the OS user-agent attributes."
+    attributes:
+      - id: user_agent.os.name
+        type: string
+        stability: experimental
+        brief: 'Human readable operating system name.'
+        examples: ['iOS', 'Android', 'Ubuntu']
+        note: >
+          For mapping user agent strings to OS names, libraries such as [ua-parser](https://github.com/ua-parser) can be utilized.
+      - id: user_agent.os.version
+        type: string
+        stability: experimental
+        brief: >
+          The version string of the operating system as defined in
+          [Version Attributes](/docs/resource/README.md#version-attributes).
+        examples: ['14.2.1', '18.04.1']
+        note: >
+          For mapping user agent strings to OS versions, libraries such as [ua-parser](https://github.com/ua-parser) can be utilized.
+
       - id: user_agent.synthetic.type
         stability: experimental
         brief: >

From fa19325ac9942c0e5a82334469f17516fc4209dd Mon Sep 17 00:00:00 2001
From: Liudmila Molkova
Date: Thu, 30 Jan 2025 16:49:17 -0800
Subject: [PATCH 2/2] First stab at documenting the art of defining semantic conventions (#1707)

Co-authored-by: Trask Stalnaker
Co-authored-by: Alexandra Konrad
---
 CONTRIBUTING.md                               |  13 ++
 .../how-to-define-semantic-conventions.md     | 154 ++++++++++++++++++
 2 files changed, 167 insertions(+)
 create mode 100644 docs/general/how-to-define-semantic-conventions.md

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 0d37634e9c..5d3f99b839 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -14,6 +14,7 @@ requirements and recommendations.
 - [Sign the CLA](#sign-the-cla)
 - [How to Contribute](#how-to-contribute)
   - [Which semantic conventions belong in this repo](#which-semantic-conventions-belong-in-this-repo)
+  - [Suggesting conventions for a new area](#suggesting-conventions-for-a-new-area)
   - [Prerequisites](#prerequisites)
   - [1. Modify the YAML model](#1-modify-the-yaml-model)
     - [Code structure](#code-structure)
@@ -81,6 +82,18 @@ and helps to keep conventions consistent and backward compatible.
 Want to define your own conventions outside this repo while building on OTel’s?
 Come help us [decentralize semantic conventions](https://github.com/open-telemetry/weaver/issues/215).
 
+### Suggesting conventions for a new area
+
+Defining semantic conventions requires a group of people who are familiar with the domain,
+are involved with instrumentation efforts, and are committed to be the point of contact for
+pull requests, issues, and questions in this area.
+
+Check out [project management](https://github.com/open-telemetry/community/blob/main/project-management.md)
+for the details on how to start.
+
+Refer to the [How to define new conventions](/docs/general/how-to-define-semantic-conventions.md)
+document for guidance.
+
 ### Prerequisites
 
 The Specification uses several tools to check things like style,
diff --git a/docs/general/how-to-define-semantic-conventions.md b/docs/general/how-to-define-semantic-conventions.md
new file mode 100644
index 0000000000..89d5ac9adb
--- /dev/null
+++ b/docs/general/how-to-define-semantic-conventions.md
@@ -0,0 +1,154 @@
+
+
+# How to define new semantic conventions
+
+**Status**: [Development][DocumentStatus]
+
+
+
+- [Defining new conventions](#defining-new-conventions)
+  - [Best practices](#best-practices)
+    - [Defining attributes](#defining-attributes)
+    - [Defining spans](#defining-spans)
+    - [Defining metrics](#defining-metrics)
+    - [Defining resources](#defining-resources)
+    - [Defining events](#defining-events)
+- [Stabilizing existing conventions](#stabilizing-existing-conventions)
+  - [Migration plan](#migration-plan)
+
+
+
+This document describes requirements, recommendations, and best practices on how to define conventions
+for new areas or make substantial changes to the existing ones.
+
+## Defining new conventions
+
+- New conventions MUST have a group of codeowners. See [project management](https://github.com/open-telemetry/community/blob/main/project-management.md) for more details.
+
+- New conventions SHOULD be defined in YAML files. See [YAML Model for Semantic Conventions](/model/README.md) for the details.
+- New conventions SHOULD be defined with `development` stability level.
+- New conventions SHOULD include telemetry signal definitions (spans, metrics, events, resources, profiles) and MAY include new attribute definitions.
+
+### Best practices
+
+> [!NOTE]
+>
+> This section contains non-normative guidance.
+
+#### Defining attributes
+
+Reuse existing attributes when possible. Look through [existing conventions](/docs/attributes-registry/) for similar areas,
+check out [common attributes](/docs/general/attributes.md).
+Semantic conventions authors are encouraged to use attributes from different namespaces.
+
+Consider adding a new attribute if all of the following apply:
+
+- It provides a clear benefit to end users by enhancing telemetry.
+- There is a clear plan to use the attributes when defining spans, metrics, events, resources, or other telemetry signals in semantic conventions.
+- There is a clear plan on how these attributes will be used by instrumentations
+
+Semantic convention maintainers may reject the addition of a new attribute if its benefits
+and use-cases are not yet clear.
+
+When defining a new attribute:
+
+- Follow the [naming guidance](/docs/general/naming.md)
+- Provide descriptive `brief` and `note` sections to clearly explain what the attribute represents.
+  - If the attribute represents a common concept documented externally, include relevant links.
+    For example, always link to concepts defined in RFCs or other standards.
+  - If the attribute's value might contain PII or other sensitive information, explicitly call this out in
+    the `note`.
+
+    Include a warning similar to the following:
+
+    ```yaml
+    - id: user.full_name
+      ...
+      note: |
+        ...
+
+        > [!WARNING]
+        >
+        > This attribute contains sensitive (PII) information.
+    ```
+
+- Use the appropriate [attribute type](https://github.com/open-telemetry/weaver/blob/main/schemas/semconv-syntax.md#type)
+  - If the value has a reasonably short (open or closed) set of possible values, define it as an enum.
+  - If the value is a timestamp, record it as a string in ISO 8601 format.
+  - For arrays of primitives, use the array type. Avoid recording arrays as a single string.
+  - Use the template type to define attributes with dynamic names (only the last segment of the name should be dynamic).
+    This is useful for capturing user-defined key-value pairs, such as HTTP headers.
+  - Represent complex values as a set of flat attributes.
+- Define new attributes with `development` stability.
+- Provide realistic examples
+- Avoid defining attributes with potentially unbounded values, such as strings longer than
+  1 KB or arrays with more than 1,000 elements. Such values should be recorded in the log or event body instead.
+
+Consider the scope of the attribute and how it may evolve in the future:
+
+- When defining an attribute for a narrow use case, think about potential broader use cases.
+  For example, if creating a system-specific attribute, evaluate whether other systems
+  in the same domain might need a similar attribute in the future.
+
+  Similarly, instead of defining a simple boolean flag indicating a success or failure, consider a
+  more extensible approach, such as using a `foo.status_code` attribute to include additional details.
+
+- When defining a broad attribute applicable across multiple domains or systems,
+  check for existing standards or widely accepted best practices in the industry.
+  Avoid creating generic attributes that are not based on established standards.
+
+> [!NOTE]
+>
+> When defining conventions for areas with multiple implementations or systems — such as databases,
+> or cloud providers — it can take time to strike the right balance between being
+> overly generic and not generic enough.
+>
+> Start with experimental conventions, document how they apply to a diverse range
+> of providers, systems, or libraries, and prototype instrumentations.
+>
+> The end-user experience should serve as the primary guiding principle:
+>
+> - If the attribute is expected to be used in general-purpose metrics for the area,
+>   consider introducing a common attribute.
+>
+>   For example, most messaging systems have a concept like a queue or topic.
+>   Queue or topic names are critical for latency and throughput metrics and
+>   equally important for spans to debug and visualize message flow.
+>   This indicates the need for a generic attribute representing any type of messaging destination.
+>
+> - If the attribute represents something useful in a narrow set of scenarios or
+>   is specific to certain system metrics, spans, or events, it likely does not need to be generic.
+
+#### Defining spans
+
+TBD
+
+#### Defining metrics
+
+TBD
+
+#### Defining resources
+
+TBD
+
+#### Defining events
+
+TBD
+
+## Stabilizing existing conventions
+
+- All conventions MUST be defined in YAML before they can be declared stable
+- Conventions that are not used by instrumentations MUST NOT be declared stable
+
+TODO:
+
+- prototyping/implementation requirements
+- migration plan
+
+### Migration plan
+
+TODO
+
+[DocumentStatus]: https://opentelemetry.io/docs/specs/otel/document-status
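
Reviewer's sketch (not part of either patch): the attribute-type guidance in the "Defining attributes" section of the new how-to document says to record timestamps as ISO 8601 strings, to keep arrays of primitives as arrays, and to represent complex values as a set of flat attributes. The Python below is one possible illustration of those three rules on the instrumentation side. The helper `flatten_attributes` and the `foo.*` attribute names are hypothetical and exist only for this example; they are not part of any OpenTelemetry API or of the conventions in this patch.

```python
from collections.abc import Mapping
from datetime import datetime, timezone
from typing import Any

# Primitive value types assumed for this sketch.
PRIMITIVES = (str, bool, int, float)


def flatten_attributes(prefix: str, value: Any) -> dict[str, Any]:
    """Hypothetical helper: turn a nested value into flat, dotted attributes."""
    if isinstance(value, Mapping):
        # Complex values become a set of flat attributes, one per leaf.
        flat: dict[str, Any] = {}
        for key, nested in value.items():
            flat.update(flatten_attributes(f"{prefix}.{key}", nested))
        return flat
    if isinstance(value, datetime):
        # Timestamps are recorded as ISO 8601 strings (normalized to UTC here).
        return {prefix: value.astimezone(timezone.utc).isoformat()}
    if isinstance(value, (list, tuple)):
        # Arrays of primitives stay arrays; they are not joined into one string.
        if not all(isinstance(item, PRIMITIVES) for item in value):
            raise TypeError(f"{prefix}: only arrays of primitives are supported")
        return {prefix: list(value)}
    if isinstance(value, PRIMITIVES):
        return {prefix: value}
    raise TypeError(f"{prefix}: unsupported value type {type(value).__name__}")


# Hypothetical input; `foo` mirrors the placeholder namespace used in the how-to document.
attrs = flatten_attributes(
    "foo",
    {
        "status_code": "timeout",
        "started_at": datetime(2025, 1, 31, 0, 8, 32, tzinfo=timezone.utc),
        "retry": {"count": 2, "delays_ms": [100, 200]},
    },
)
```

With this input, `attrs` holds four flat keys (`foo.status_code`, `foo.started_at`, `foo.retry.count`, `foo.retry.delays_ms`) and no nested `foo.retry` entry. Rejecting arrays of non-primitives rather than stringifying them is a design choice of this sketch, made to stay close to the "avoid recording arrays as a single string" advice.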