Requirement - what kind of business use case are you trying to solve?
Create a production-grade implementation of the System Architecture feature that is able to build dependency items from a continuous stream of spans consumed from Kafka.
Problem - what in Jaeger blocks you from solving the requirement?
It is worth recognizing that there is an existing means of computing dependency items from spans: spark-dependencies. That solution runs a single query over the spans in a backing store, builds the dependency items, and bulk-loads them back into the backing store.
However, to maintain an accurate count of edges (call counts) between services and an up-to-date topology, a streaming solution would be more suitable while also removing the need to manage cron jobs and date boundaries.
Proposal - what do you suggest to solve the problem or improve the existing situation?
We currently have a streaming solution for the System Architecture feature running in our (Logz.io) production environment. It is based on the Kafka Streams library, which could be integrated into the existing Jaeger architecture with Kafka as an intermediate buffer.
The following illustrates how we currently implement the System Architecture feature, courtesy of the engineer behind this solution, @PropAnt:
Dependency items are written back to Kafka so that the Kafka Streams application can write processed data efficiently, without being limited by back-pressure from the dependency items backing store.
High-level description of the Kafka Streams topology:
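As a rough, hypothetical sketch (not our production code; the topic names, the pipe-delimited span encoding, and the window sizes are all illustrative assumptions), a topology along these lines consumes spans keyed by trace ID, sessionizes each trace, derives parent-to-child service edges once a trace is complete, and writes the edges back to Kafka:

```java
// Hypothetical sketch only -- not the Logz.io production code. Topic names,
// the pipe-delimited span encoding, and the window sizes are illustrative.
import java.time.Duration;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.SessionWindows;

public class DependencyTopologySketch {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "span-dependencies");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();

        // 1. Consume spans keyed by trace ID. Each value is one span encoded
        //    as "traceId|spanId|parentSpanId|service" purely to keep this
        //    sketch self-contained; a real implementation would use the
        //    Jaeger span model with proper serdes.
        builder.stream("jaeger-spans", Consumed.with(Serdes.String(), Serdes.String()))
            // 2. Collect the spans of each trace in a session window, so a
            //    trace counts as complete after 30s without new spans.
            .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
            .windowedBy(SessionWindows.ofInactivityGapWithNoGrace(Duration.ofSeconds(30)))
            .reduce((spansSoFar, span) -> spansSoFar + "\n" + span,
                    Materialized.with(Serdes.String(), Serdes.String()))
            .toStream()
            // Session merges forward tombstones (null values); drop them.
            .filter((windowedTraceId, spans) -> spans != null)
            // 3. Resolve each span's parent within the trace and emit one
            //    parent->child service edge per call.
            .flatMap((windowedTraceId, spans) -> toEdges(spans))
            // 4. Publish edges to a separate topic; the downstream sink can
            //    load them into storage at its own pace without
            //    back-pressuring this application.
            .to("jaeger-dependency-edges", Produced.with(Serdes.String(), Serdes.String()));

        new KafkaStreams(builder.build(), props).start();
    }

    // Derive service-to-service edges from the concatenated spans of one trace.
    static List<KeyValue<String, String>> toEdges(String spans) {
        Map<String, String> serviceBySpanId = new HashMap<>();
        Map<String, String> parentBySpanId = new HashMap<>();
        for (String line : spans.split("\n")) {
            String[] f = line.split("\\|"); // traceId|spanId|parentSpanId|service
            serviceBySpanId.put(f[1], f[3]);
            parentBySpanId.put(f[1], f[2]);
        }
        List<KeyValue<String, String>> edges = new ArrayList<>();
        for (Map.Entry<String, String> e : parentBySpanId.entrySet()) {
            String parentService = serviceBySpanId.get(e.getValue());
            String childService = serviceBySpanId.get(e.getKey());
            if (parentService != null && !parentService.equals(childService)) {
                edges.add(KeyValue.pair(parentService + "->" + childService, "1"));
            }
        }
        return edges;
    }
}
```

In practice one would also suppress intermediate session-window updates so that each trace is processed only once on window close; the sketch omits that for brevity.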
We propose to open-source our implementation, which already works in production, for adoption into Jaeger.
Any open questions to address
There are a few ways to design the solution:
A single module with a Kafka Streams application can be added that calculates dependency items and streams them to a separate Kafka topic for further ingestion. As noted above, this approach is tried and tested in our production environment, and the code is available to open-source (more or less) as-is.
Another approach is to structure the code so that the business logic and data models are encapsulated in separate modules. The advantage of this approach is that Jaeger gains a streaming-framework-agnostic implementation of the System Architecture feature; a rough sketch of such a split follows. The trade-off is the additional effort required to re-architect the existing code to be agnostic to the streaming framework.
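For the second option, a minimal sketch of what the module split might look like, with all names hypothetical (these are not existing Jaeger APIs): the derivation logic and data model depend only on the span model, and a Kafka Streams (or other) module merely adapts them.

```java
// Hypothetical sketch of a streaming-framework-agnostic split; all names
// are illustrative, not existing Jaeger APIs.
import java.util.List;

// Pure business logic: derive service-to-service links from one complete trace.
interface DependencyLinkDeriver {
    List<DependencyLink> derive(List<Span> trace);
}

// Plain data model shared by every streaming adapter (Kafka Streams, Flink, ...).
final class DependencyLink {
    final String parent;
    final String child;
    final long callCount;

    DependencyLink(String parent, String child, long callCount) {
        this.parent = parent;
        this.child = child;
        this.callCount = callCount;
    }
}

// Stand-in for the Jaeger span model (trace ID, span ID, parent reference, service name).
final class Span { }
```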
I do have one question about how the results (dependency links) are stored and queried from storage. The current Spark batch implementation creates one record per time interval (e.g. 24h). I assume the streaming implementation creates multiple records within the same interval. Did you have to change the query, or perhaps the store implementation, to deal with this?
@pavolloffay No, we didn't. Jaeger queries for a certain period and gets a set of dependency links (items) back. Depending on the time interval for the dependencies, the query returns a bigger or smaller set. We didn't change the data models or queries. The time interval for dependency links can be set through properties.
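For illustration, a hypothetical sketch of that counting stage (topic names and the 15-minute interval are assumptions; the interval stands in for the property-configured one): a tumbling window counts calls per edge, keyed by edge plus window start, so a sink that upserts by key keeps one record per edge per interval, and the existing query simply returns however many such records fall within the requested period.

```java
// Hypothetical counting stage; topic names and the interval are illustrative.
import java.time.Duration;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class DependencyLinkCounterSketch {

    static void addCountingStage(StreamsBuilder builder) {
        // Consume raw "parent->child" edges produced by the topology above.
        builder.stream("jaeger-dependency-edges", Consumed.with(Serdes.String(), Serdes.String()))
            .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
            // One tumbling window per dependency-link interval.
            .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(15)))
            .count(Materialized.with(Serdes.String(), Serdes.Long()))
            .toStream()
            // Key each count by edge + window start; a sink that upserts by
            // key ends up with one record per edge per interval, and queries
            // over a period return all intervals in range unchanged.
            .map((windowedEdge, count) -> KeyValue.pair(
                    windowedEdge.key() + "@" + windowedEdge.window().start(),
                    String.valueOf(count)))
            .to("jaeger-dependency-links", Produced.with(Serdes.String(), Serdes.String()));
    }
}
```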