Integration Documentation Reference
This is a supplement for the Integration Creation Guide. It is intended to provide more technical details and references, rather than detailed steps for implementing integrations.
For a functioning integration, there are four stages that telemetry data goes through. For the current release, we are mostly focused on stages 2-4, but in the future we can look more carefully at stage 1.
- Stage 1. The data is generated by an application and stored in some intermediary bucket storage, such as S3, Spark, or Prometheus. There may be multiple levels of mappings here, and for some architectures we may skip directly to stage 2.
- Stage 2. The data is retrieved from either intermediary storage or the application directly, and indexed in OpenSearch as OTEL-compliant records.
- Stage 3. OTEL-compliant records at rest in OpenSearch.
- Stage 4. OTEL-compliant records are queried from OpenSearch and displayed in OpenSearch Dashboards as integration assets.
For the majority of the integration work we are doing today, we are focused on stages 2 and 3. The purpose of this document is to gather resources that help with each stage.
The main focus of stage 1 is to set up infrastructure that generates telemetry data. We have a few repositories that demonstrate this data collection:
- The OpenTelemetry Demo repository has an extensive docker setup with many services being ingested into OpenSearch. This can be good to look at for making integrations for more complicated architectures.
- There is an old Nginx Demo that involves parsing and mapping data using a basic FluentBit filter. This is a good approach if the integration is being made for a standalone application. (TODO: @Swiddis has made an updated FluentBit config that is much easier to work with, and needs to update the demo. Ping him if you need it now.)
- For AWS applications, there is currently no working example on hand, but one resource that we’ve found is the SIEM on OpenSearch demo repository. There is also a page that has a lot of examples of AWS log data.
As OTEL is a widespread protocol, there are many tools that can ingest records and convert them to the OTEL format. Three that I’m aware of are currently being used with the project:
- FluentBit is relatively straightforward and can convert to arbitrary data formats, but requires understanding OTEL somewhat well to use.
- Jaeger Tracing has an OTEL collection mode and can export to OpenSearch, but the compatibility with OpenSearch seems slightly off regarding timestamps.
- The OpenTelemetry Collector is the most formally correct software, but it doesn’t currently have a native OpenSearch export mode. The author hasn’t yet been able to make it run.
When setting up the collector to output to OpenSearch, it is recommended to make a reproducible configuration that runs in Docker. Sample data retrieved from a successful setup is useful for testing integrations, so we recommend storing it in the integration.
Within integrations, a sample of the results of this encoding stage should be stored in the data directory.
- OpenTelemetry Receivers available for many applications, under the receivers directory.
- Some examples of current ElasticSearch Integrations, under the packages directory.
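As a rough illustration of the stage 2 conversion described above, the sketch below parses one Nginx-style access log line and reshapes it into a flat, OTEL-style record. It is not taken from any of the demos: the function name is hypothetical, and the field names are assumptions loosely modelled on the OTEL semantic conventions, so check them against the catalog's mapping files before relying on them.

```python
import json
import re
from datetime import datetime

# Matches the leading part of an Nginx "combined" access log line.
LOG_PATTERN = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3})'
)


def to_otel_like_record(line: str) -> dict:
    """Convert one access log line into a flat, OTEL-style record.

    The output field names are illustrative assumptions, not a verified schema.
    """
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    timestamp = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
    return {
        "@timestamp": timestamp.isoformat(),
        "http.request.method": match["method"],
        "url.path": match["path"],
        "http.response.status_code": int(match["status"]),
        "client.address": match["client"],
    }


sample = '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 612'
record = to_otel_like_record(sample)
print(json.dumps(record, indent=2))
```

In practice this reshaping would normally live in the ingestion tool's own configuration (for example a FluentBit filter) rather than in a standalone script. The script form is mainly handy for producing or inspecting sample records.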
The current primary reference for what defines an OTEL data record is the OTEL Semantic Conventions repository. This can be used to check whether data provided is in an OTEL format. We are working on creating OpenSearch Mappings for these conventions, available in the OpenSearch Catalog. However, note that the mapping files present today might not be perfectly aligned with OTEL, due to typos or otherwise, so some manual double-checking is advised until automated check functionality is added.
We have an Integrations CLI tool that can automatically check whether provided data records are compliant with the catalog’s mapping files. This can help a lot with debugging conversion, and finding schemas to reference in an integration.
Within integrations, this stage is encoded in the schemas directory.
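To make the idea of checking records against mapping files more concrete, here is a minimal sketch of such a check. It is not the Integrations CLI and does not reproduce its behavior. It only assumes the usual OpenSearch mapping shape, where nested objects are described under `properties` and leaf fields carry a `type`. The mapping fragment, the type table, and the function name are hypothetical simplifications.

```python
# Rough Python equivalents for a few OpenSearch field types (not exhaustive).
TYPE_CHECKS = {
    "keyword": lambda v: isinstance(v, str),
    "text": lambda v: isinstance(v, str),
    "long": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "date": lambda v: isinstance(v, str),  # format is not validated here
}


def find_violations(record: dict, properties: dict, prefix: str = "") -> list:
    """Return readable problems found when comparing a record to a mapping."""
    problems = []
    for name, value in record.items():
        path = prefix + name
        spec = properties.get(name)
        if spec is None:
            problems.append(f"{path}: not in mapping")
        elif "properties" in spec:
            # Object field: recurse into the nested mapping.
            if isinstance(value, dict):
                problems.extend(find_violations(value, spec["properties"], path + "."))
            else:
                problems.append(f"{path}: expected object")
        else:
            # Leaf field: apply a type check when one is known for this type.
            check = TYPE_CHECKS.get(spec.get("type"))
            if check is not None and not check(value):
                problems.append(f"{path}: expected {spec['type']}")
    return problems


# Hypothetical, heavily simplified mapping fragment for illustration only.
mapping_properties = {
    "@timestamp": {"type": "date"},
    "http": {
        "properties": {
            "response": {"properties": {"status_code": {"type": "long"}}},
        }
    },
}

good = {"@timestamp": "2023-10-10T13:55:36+00:00",
        "http": {"response": {"status_code": 200}}}
bad = {"@timestamp": "2023-10-10T13:55:36+00:00",
       "http": {"response": {"status_code": "200"}},
       "foo": 1}

good_problems = find_violations(good, mapping_properties)
bad_problems = find_violations(bad, mapping_properties)
print(good_problems)
print(bad_problems)
```

A check like this only catches unknown fields and obvious type mismatches. It says nothing about whether the mapping file itself matches the OTEL conventions.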
Most dashboards in the wild today are not automatically compliant with our OTEL format. They will have to be either converted or remade from scratch. Where feasible, remaking from scratch is preferred. To help with this, we’ve started preparing the Visualization Catalog that will contain examples of visualizations that query OTEL records. We also have some tooling via the Integrations CLI that can verify whether a visualization is already compliant with OTEL.
The target workflow is for an integration developer to provide sample OTEL data records and receive suggested visualizations for a dashboard, based on which fields are present in the records. For the moment, we are focusing more on just gathering visualizations, so if you do make a new visualization, please consider making a PR to add it to the catalog.
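The target workflow described above could look roughly like the following sketch: collect the fields present in every sample record, then propose the catalog entries whose required fields are all covered. None of this exists as shown. The catalog contents, visualization names, field names, and both function names are made up for illustration.

```python
def flatten_fields(record: dict, prefix: str = "") -> set:
    """Collect dotted field paths from a (possibly nested) record."""
    fields = set()
    for name, value in record.items():
        path = prefix + name
        if isinstance(value, dict):
            fields |= flatten_fields(value, path + ".")
        else:
            fields.add(path)
    return fields


def suggest_visualizations(records: list, catalog: dict) -> list:
    """Return catalog entries whose required fields appear in every record."""
    if not records:
        return []
    common = set.intersection(*(flatten_fields(r) for r in records))
    return sorted(name for name, required in catalog.items() if required <= common)


# Hypothetical catalog: visualization name -> fields it queries.
catalog = {
    "status_codes_over_time": {"@timestamp", "http.response.status_code"},
    "requests_by_path": {"url.path"},
    "span_duration_histogram": {"span.duration"},
}

records = [
    {"@timestamp": "2023-10-10T13:55:36+00:00",
     "http": {"response": {"status_code": 200}},
     "url": {"path": "/"}},
    {"@timestamp": "2023-10-10T13:55:40+00:00",
     "http": {"response": {"status_code": 404}},
     "url": {"path": "/missing"}},
]

suggestions = suggest_visualizations(records, catalog)
print(suggestions)
```

Requiring a field in every record is a deliberately strict choice. A real tool might instead accept fields that appear in most records, or rank suggestions by coverage.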
Within integrations, this stage is encoded in the assets directory.
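Since stages 2 to 4 map onto the data, schemas, and assets directories respectively, a quick sanity check on an integration's layout might look like the sketch below. It only looks for those three directory names. Any other files an integration may need are outside what this document describes, and the function name is hypothetical.

```python
import tempfile
from pathlib import Path

# Directory names mentioned in this document, one per stage (2, 3, 4).
EXPECTED_DIRS = ("data", "schemas", "assets")


def missing_stage_dirs(integration_root: Path) -> list:
    """Return the expected stage directories that are absent under the root."""
    return [name for name in EXPECTED_DIRS if not (integration_root / name).is_dir()]


# Demonstrate on a throwaway directory that lacks an assets directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "data").mkdir()
    (root / "schemas").mkdir()
    missing = missing_stage_dirs(root)

print(missing)
```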