Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

see if improvements can/have-to be made to rama's http open telemetry layer support #383

Open
GlenDC opened this issue Jan 8, 2025 · 16 comments
Assignees
Labels
easy An easy issue to pick up for anyone. enhancement New feature or request good first issue Good for newcomers mentor available A mentor is available to help you through the issue.
Milestone

Comments

@GlenDC
Copy link
Member

GlenDC commented Jan 8, 2025

We have already HTTP Layer OpenTelemetry support in https://github.com/plabayo/rama/blob/main/rama-http/src/layer/opentelemetry.rs. Recently I bumped into https://github.com/francoposa/tower-otel-http-metrics/blob/main/src/lib.rs.

Could be good to compare that code with ours and see if there are improvements that can be made to our version.

@GlenDC GlenDC added enhancement New feature or request good first issue Good for newcomers easy An easy issue to pick up for anyone. mentor available A mentor is available to help you through the issue. labels Jan 8, 2025
@GlenDC GlenDC added this to the v0.2 milestone Jan 8, 2025
@Anon258
Copy link

Anon258 commented Jan 13, 2025

Hello, I am interested in contributing to this issue. I am not that familiar with writing rust code and I feel this can be a good opportunity for me to learn.

@GlenDC
Copy link
Member Author

GlenDC commented Jan 13, 2025

Hi @Anon258, perhaps prior to start working on it. Do you understand the linked code files? Can you read these? Do you also know how to run the tests and manually confirm if all works out? As those would be the minimal requirements to get started. I can obviously guide where needed and explain stuff, talk things through. But not sure where the baseline in your knowledge is here.

@Anon258
Copy link

Anon258 commented Jan 13, 2025

Yes I did read through the linked code files. I can read them quite comfortably, that is no problem. However I would like to discuss why certain things are implemented the way they are and on what tests are supposed to be done to compare their code with rama.

@GlenDC
Copy link
Member Author

GlenDC commented Jan 13, 2025

Ok perfect. I'll asign it to you already in that case. Feel free to ask any questions you might have here, via email or discord whatever you wish. All three channels are fine for me. If all the same to you probably best to keep it for visibility here, but if that doesn't work for w/e reasons the others are an option as well.

@Anon258
Copy link

Anon258 commented Jan 14, 2025

One question I had about the design was
RequestMetricsLayer seems to be just an intermediary structure between Metrics and RequestMetricsService which allows adding attributes_factory and giving custom Service Name/Version. Why do we have this separation as I think we could just as easily add the same options to RequestMetricsService which just adds an inner layer? (which I assume is for the Service based on the code for Serve)

And, for the tests it seems the tests in the linked file are just API testing and checking the attributes/context are being set correctly, what do you mean by comparing our code with theirs? Just the API structure or whether some runtime performance things?

@GlenDC
Copy link
Member Author

GlenDC commented Jan 14, 2025

RequestMetricsLayer seems to be just an intermediary structure between Metrics and RequestMetricsService which allows adding attributes_factory and giving custom Service Name/Version

This is just the paradigm of the system. Nothing would stop us from having a struct be both a Layer and a Service. Sometimes the difference is smaller between the two, sometimes bigger. However, the layer will not have an inner service as it didn't receive one yet at that point, but the service will have it, usually received via its layer parent. This is pretty much the same design as what Tower does https://docs.rs/tower/latest/tower/, should you be familiar with that. More info in the rama book: https://ramaproxy.org/book/intro/services_all_the_way_down.html

what do you mean by comparing our code with theirs? Just the API structure or whether some runtime performance things?

Yes, this is a small issue, don't expect it to a lot of work, if any. Might be just reporting work and no code changes required.

this is not about performance related matters, neither so much about API structure. What I am mostly wondering is in what attributes and metrics they are reporting, and which of those we are not. As we might want to do add these to ours as well, or perhaps we have good reasons not to. Same for other such matters.

@Anon258
Copy link

Anon258 commented Jan 14, 2025

More info in the rama book: https://ramaproxy.org/book/intro/services_all_the_way_down.html

So, is my understanding correct in that the paradigm asks for a Metrics Layer, but in Rama since everything is a Service, we wrap the MetricsLayer into a MetricsService with Service ()?

What I am mostly wondering is in what attributes and metrics they are reporting, and which of those we are not. As we might want to do add these to ours as well, or perhaps we have good reasons not to. Same for other such matters.

I see, I will look into that then

@GlenDC
Copy link
Member Author

GlenDC commented Jan 14, 2025

So, is my understanding correct in that the paradigm asks for a Metrics Layer, but in Rama since everything is a Service, we wrap the MetricsLayer into a MetricsService with Service ()?

No, not exactly. And it is rama defining this paradigm. So if ever it makes no sense we can go a whole different route.

  • A Service is something which takes a request and generates a response
  • The power in services is that you can stack them in all kinds of ways. A "Layer" is the thing that wraps a service with another Service.

As such the inner service is whatever is the stack that will provide the actual http response in the happy path scenario or error otherwise. The MetricsLayer is an example of a passthrough service which reads the request and response as it comes through so it can produce metrics for it. But it doesn't modify the content or branch off.

@Anon258
Copy link

Anon258 commented Jan 17, 2025

I read through the code, I have a few points on the differences between the two, and also a few questions I was hoping some clarification on

  1. I was following the Otel Http server Metrics listed here.
    - It seems HTTP_SERVER_TOTAL_REQUESTS, HTTP_SERVER_TOTAL_RESPONSES and HTTP_SERVER_TOTAL_FAILURES are not listed as metrics in this
    - These metrics are present in rama however are not there in tower-otel-http-metrics

  2. The tower-otel-http-metrics has a few additional metrics that are not present in rama
    - HTTP_SERVER_ACTIVE_REQUESTS
    - HTTP_SERVER_REQUEST_BODY_SIZE

  3. In terms of the attributes with the metrics
    - Rama has some additional attributes compared to Tower-otel
    * SERVER_PORT
    * USER_AGENT
    - Tower-Otel has one attribute which I think is not in Rama
    * NETWORK_PROTOCOL_NAME
    - If im correct, HTTP_REQUEST_HOST is same as HTTP_ROUTE, if not that is another difference between the two.

  4. The semantic-conventions for common attributes docs mention NETWORK_TRANSPORT as a conditionally required attribute, however it seems neither Rama nor Tower-Otel supports this attribute. Why is this so?

@GlenDC
Copy link
Member Author

GlenDC commented Jan 17, 2025

@soundofspace could you provide feedback on @Anon258's feedback here. As you are more well versed in the OTEL ecosystem than me. Have some toughts on this matter but curious what your findings are?

@GlenDC
Copy link
Member Author

GlenDC commented Jan 17, 2025

NETWORK_TRANSPORT is btw logged in the transport layer metrics: https://docs.rs/rama-net/0.2.0-alpha.6/src/rama_net/stream/layer/opentelemetry.rs.html#223, is it also used for the HTTP logs than? I would think the HTTP version implies the transport layer?

@Anon258
Copy link

Anon258 commented Jan 20, 2025

It seems it needs to be mentioned in HTTP logs if it is not the default one. Is it the case that rama does not support the non-default transport layer based on HTTP version?

@soundofspace
Copy link
Collaborator

Hi @Anon258 thanks for the detailed research!

It seems HTTP_SERVER_TOTAL_REQUESTS, HTTP_SERVER_TOTAL_RESPONSES and HTTP_SERVER_TOTAL_FAILURES are not listed as metrics in this

This is totally expected as currently opentelemetry is mainly focussed on standardizing metrics that are used in almost all scenarios, or metrics need that standardizing to play nice with other tools (eg duration metric combined with span). Standardizing stuff in general also takes some time, so might be a while before most metrics are standardizing, but once they are we can always come back and modify the ones we already have if needed. They mostly promote adding your own metrics that make sense for your use case, while still be consistent in naming convection, and added attributes.

The tower-otel-http-metrics has a few additional metrics that are not present in rama

Those metrics do seem valuable especially the active requests, so adding them would be nice. Can be done by you or someone else here.

Rama has some additional attributes compared to Tower-otel

Same here as metrics pretty much: you can always add more depending on your use case. Eg user-agent is pretty crucial in rama depending on how rama is used, so it makes sense to add that as an attribute. (We just have to be careful to not add too many attributes as they could make cardinality expode)

NETWORK_PROTOCOL_NAME

NETWORK_PROTOCOL_NAME means protocol used on this layer of the OSI layers , so here http, which combined with NETWORK_PROTOCOL_VERSION would then give you http 1.1

HTTP_REQUEST_HOST is same as HTTP_ROUTE

This is not the case. Lets say someone visits server with url https://example.com/user/5. Here host: example.com and path: user/{id} (template syntax can be chosen by us, but is in most cases aligned with used framework)

@GlenDC
Copy link
Member Author

GlenDC commented Jan 20, 2025

HTTP_ROUTE Is out of scope. I do think it's good idea to have it available but it would have to mean that the services where we do http routing would actually inject these templates in the Context, as that is currently not the case.

Thanks for clarifying @soundofspace. @Anon258 feel free to add those additional metrics in a PR, and that might be sufficient to resolve this issue.

@Anon258
Copy link

Anon258 commented Jan 22, 2025

Okay. I will add the 2 metrics active calls and request body size as a PR and update.

@GlenDC
Copy link
Member Author

GlenDC commented Jan 22, 2025

Thank you a lot @Anon258 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
easy An easy issue to pick up for anyone. enhancement New feature or request good first issue Good for newcomers mentor available A mentor is available to help you through the issue.
Projects
None yet
Development

No branches or pull requests

3 participants