Memory increase due to too many traces #590

Closed
ChrisTerBeke opened this issue Apr 1, 2019 · 4 comments

ChrisTerBeke commented Apr 1, 2019

Hi,

We're using this lib in several APIs running on GKE, exporting to StackDriver. We noticed that when sending all traces to StackDriver, traces are collected faster than they can be exported. As a result, the list of trace spans still waiting to be sent grows over time, to the point where Kubernetes kills our pod for using too much memory.

While this is expected behavior given the current code, I wonder if there is a cleaner way to handle it. For example, by dropping spans once the list reaches a maximum size and logging a warning or error (see the sketch at the end of this comment)? This would prevent apps from 'leaking' memory.

What are your thoughts about this?

P.S. This is probably the same issue as reported in #334, but it's not really resolved there.
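
To make the proposal concrete, here is a minimal sketch of a bounded span buffer that drops new spans and logs a warning once it is full. The class and parameter names are hypothetical, not part of the library:

```python
import logging
from collections import deque

logger = logging.getLogger(__name__)


class BoundedSpanBuffer:
    """Hypothetical buffer that stops growing once max_spans is reached."""

    def __init__(self, max_spans=10000):
        self._max_spans = max_spans
        self._spans = deque()
        self._dropped = 0

    def add(self, span):
        if len(self._spans) >= self._max_spans:
            self._dropped += 1
            logger.warning(
                "Span buffer full (%d spans), dropping span (%d dropped so far)",
                self._max_spans, self._dropped)
            return
        self._spans.append(span)

    def drain(self, batch_size=100):
        """Return up to batch_size spans for the exporter to send."""
        return [self._spans.popleft()
                for _ in range(min(batch_size, len(self._spans)))]
```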

c24t (Member) commented Apr 1, 2019

Configuring samplers to have a global rate limit may help (see #458), but I agree that memory shouldn't grow unbounded by default. Dropping traces seems like a good solution to me.

As for your application: are you sampling every trace? You might want to use the ProbabilitySampler instead.
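
For reference, wiring up the ProbabilitySampler looks roughly like this; import paths vary between opencensus-python releases, and the project id is a placeholder:

```python
from opencensus.trace.tracer import Tracer
from opencensus.trace.samplers import probability
from opencensus.trace.exporters import stackdriver_exporter

# Placeholder project id; use your own GCP project.
exporter = stackdriver_exporter.StackdriverExporter(project_id='my-gcp-project')

# Sample roughly 1 in 10 traces instead of every trace, so far fewer
# spans queue up waiting to be exported.
sampler = probability.ProbabilitySampler(rate=0.1)

tracer = Tracer(exporter=exporter, sampler=sampler)

with tracer.span(name='handle_request'):
    pass  # application work goes here
```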

#334 is a different issue; there, the library wasn't cleaning up monitoring clients.

c24t (Member) commented Apr 1, 2019

See census-instrumentation/opencensus-java#1813 for a similar issue in the Java client.

ChrisTerBeke (Author)

We were sampling every trace, but have since switched to the ProbabilitySampler, which solves the problem.

Glad to hear you also think this is something to improve :)

c24t (Member) commented May 2, 2019

@reyang's work on #642 should fix the memory issue by dropping spans once the queue is full.
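
In outline, that approach looks something like the sketch below (not the actual #642 implementation; the queue size and function name are placeholders):

```python
import queue

span_queue = queue.Queue(maxsize=2048)  # arbitrary bound for the example

def enqueue_span(span):
    try:
        span_queue.put_nowait(span)
    except queue.Full:
        # The exporter is falling behind; drop the span rather than
        # letting memory grow without bound.
        pass
```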
