[pydrake] Add example of memory leaks #21951

Merged

Conversation

jwnimmer-tri
Collaborator

@jwnimmer-tri jwnimmer-tri commented Sep 25, 2024

Towards #14387.

The objective of this PR is to bootstrap the virtuous cycle of making reproducers of the problem, finding good diagnostic tools to help explain the problem, fixing bugs in prototypes, re-instrumenting, adding more examples, etc.


Sample output of bazel run //bindings/pydrake:memory_leak_test as of today:

RUNNING: dut_simple_source
RepetitionDetail(i=0, blocks=3)
RepetitionDetail(i=1, blocks=4)
RepetitionDetail(i=2, blocks=5)
RepetitionDetail(i=3, blocks=5)
RepetitionDetail(i=4, blocks=6)
RepetitionDetail(i=5, blocks=7)
RepetitionDetail(i=6, blocks=8)
RepetitionDetail(i=7, blocks=8)
RepetitionDetail(i=8, blocks=9)
RepetitionDetail(i=9, blocks=9)
RepetitionDetail(i=10, blocks=9)
RepetitionDetail(i=11, blocks=10)
RepetitionDetail(i=12, blocks=10)
RepetitionDetail(i=13, blocks=10)
RepetitionDetail(i=14, blocks=10)
RepetitionDetail(i=15, blocks=11)
RepetitionDetail(i=16, blocks=12)
RepetitionDetail(i=17, blocks=12)
RepetitionDetail(i=18, blocks=12)
RepetitionDetail(i=19, blocks=12)
RepetitionDetail(i=20, blocks=12)
RepetitionDetail(i=21, blocks=12)
RepetitionDetail(i=22, blocks=12)
RepetitionDetail(i=23, blocks=12)
RepetitionDetail(i=24, blocks=12)
RUNNING: dut_trivial_simulator
RepetitionDetail(i=0, blocks=4)
RepetitionDetail(i=1, blocks=7)
RepetitionDetail(i=2, blocks=14)
RepetitionDetail(i=3, blocks=17)
RepetitionDetail(i=4, blocks=23)
RepetitionDetail(i=5, blocks=25)
RepetitionDetail(i=6, blocks=29)
RepetitionDetail(i=7, blocks=36)
RepetitionDetail(i=8, blocks=42)
RepetitionDetail(i=9, blocks=45)
RepetitionDetail(i=10, blocks=48)
RepetitionDetail(i=11, blocks=52)
RepetitionDetail(i=12, blocks=57)
RepetitionDetail(i=13, blocks=65)
RepetitionDetail(i=14, blocks=72)
RepetitionDetail(i=15, blocks=78)
RepetitionDetail(i=16, blocks=84)
RepetitionDetail(i=17, blocks=91)
RepetitionDetail(i=18, blocks=96)
RepetitionDetail(i=19, blocks=102)
RepetitionDetail(i=20, blocks=107)
RepetitionDetail(i=21, blocks=112)
RepetitionDetail(i=22, blocks=122)
RepetitionDetail(i=23, blocks=131)
RepetitionDetail(i=24, blocks=137)
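
For readers without the test source at hand, here is a minimal sketch of how a per-repetition block count like the one above could be collected. It assumes that `blocks` is derived from CPython's `sys.getallocatedblocks()`, which may not be what `memory_leak_test` actually does. The `RepetitionDetail` fields mirror the printed output; `count_blocks_per_repetition` and `leaky_dut` are hypothetical names used only for illustration.

```python
import dataclasses
import gc
import sys


@dataclasses.dataclass(frozen=True)
class RepetitionDetail:
    """One row of output: repetition index and blocks above the baseline."""
    i: int
    blocks: int


def count_blocks_per_repetition(dut, count):
    """Calls dut() `count` times, recording allocated blocks after each call.

    Assumption: sys.getallocatedblocks() is used as the measure; the real
    test may use a different instrument.
    """
    gc.collect()  # Settle collectable garbage before taking the baseline.
    baseline = sys.getallocatedblocks()
    details = []
    for i in range(count):
        dut()
        gc.collect()  # Only count memory that survives a full collection.
        blocks = sys.getallocatedblocks() - baseline
        details.append(RepetitionDetail(i=i, blocks=blocks))
    return details


# Hypothetical device under test that leaks on purpose.
_leaked = []


def leaky_dut():
    _leaked.extend(object() for _ in range(1000))


if __name__ == "__main__":
    for detail in count_blocks_per_repetition(leaky_dut, 5):
        print(detail)
```

A count that keeps climbing across repetitions, as in `dut_trivial_simulator`, suggests memory retained per call; a count that levels off, as in `dut_simple_source`, looks more like one-time warm-up allocation.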


@jwnimmer-tri jwnimmer-tri added the "priority: medium" and "release notes: none" labels Sep 25, 2024
@jwnimmer-tri
Collaborator Author

+@rpoyner-tri for feature review, please.

Contributor

@rpoyner-tri rpoyner-tri left a comment

Reviewable status: 1 unresolved discussion, LGTM missing from assignee rpoyner-tri(platform), needs at least two assigned reviewers

a discussion (no related file):
Got a bit of data. As far as I can tell, "simple_source" either doesn't leak or leaks too slowly to worry about. On the other hand, "trivial_simulator" leaks about 1 MiB/sec, as shown in the graph:
Figure_1.png

The initial working set for both is about 300 MiB, so it takes maybe 5.5 minutes for "trivial_simulator" to double the initial working set.

The above graph and the monitoring were done with mprof (apt install python3-mprof).
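
As a rough cross-check of the same conclusion, the block counts from the sample output in the PR description can be fitted with a straight line. This is only a sketch: the slopes are in blocks per repetition, not MiB/sec, so they are not directly comparable to the mprof graph, and `least_squares_slope` is a hypothetical helper written for this illustration.

```python
def least_squares_slope(ys):
    """Returns the ordinary least-squares slope of ys against 0, 1, 2, ..."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    return sxy / sxx


# Block counts copied from the sample output in the PR description.
SIMPLE_SOURCE = [
    3, 4, 5, 5, 6, 7, 8, 8, 9, 9, 9, 10, 10,
    10, 10, 11, 12, 12, 12, 12, 12, 12, 12, 12, 12,
]
TRIVIAL_SIMULATOR = [
    4, 7, 14, 17, 23, 25, 29, 36, 42, 45, 48, 52, 57,
    65, 72, 78, 84, 91, 96, 102, 107, 112, 122, 131, 137,
]

if __name__ == "__main__":
    print("simple_source:", least_squares_slope(SIMPLE_SOURCE))
    print("trivial_simulator:", least_squares_slope(TRIVIAL_SIMULATOR))
```

On that data the fitted growth is roughly 0.4 blocks per repetition for "simple_source" and roughly 5.5 for "trivial_simulator", more than ten times faster, which is consistent with the graph.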


Collaborator Author

@jwnimmer-tri jwnimmer-tri left a comment

Reviewable status: 1 unresolved discussion, LGTM missing from assignee rpoyner-tri(platform), needs at least two assigned reviewers

a discussion (no related file):

Previously, rpoyner-tri (Rick Poyner (rico)) wrote…

Got a bit of data. As far as I can tell, "simple_source" either doesn't leak or leaks too slowly to worry about. On the other hand, "trivial_simulator" leaks about 1 MiB/sec, as shown in the graph:
Figure_1.png

The initial working set for both is about 300 MiB, so it takes maybe 5.5 minutes for "trivial_simulator" to double the initial working set.

The above graph and the monitoring were done with mprof (apt install python3-mprof).

Excellent. Does that mean this reproducer is a sufficient MVP and we can merge it and keep going?


Contributor

@rpoyner-tri rpoyner-tri left a comment

:lgtm:

Reviewed 1 of 2 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: needs at least two assigned reviewers

a discussion (no related file):

Previously, jwnimmer-tri (Jeremy Nimmer) wrote…

Excellent. Does that mean this reproducer is a sufficient MVP and we can merge it and keep going?

yeah, I think we've got enough to roll.


@jwnimmer-tri
Collaborator Author

+@ggould-tri for platform review tomorrow per schedule, please.

Contributor

@ggould-tri ggould-tri left a comment

:lgtm:; I hope this bears fruit!

BTW, an idea for a future PR: if I were working with this a lot, I would definitely toss a https://pypi.org/project/sparklines/ graph into the text output. The difference between logistic (e.g., finite cache), logarithmic (e.g., unbounded dict), and linear (traditional leak) growth would then be visually obvious.

Reviewed 1 of 2 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! all discussions resolved, LGTM from assignees rpoyner-tri(platform),ggould-tri(platform)
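
A minimal sketch of the sparkline idea, for illustration only. It hand-rolls the rendering with Unicode block characters instead of calling the sparklines package, so nothing here reflects that package's API, and `sparkline` is a hypothetical helper. The input is the "trivial_simulator" block counts from the sample output in the PR description.

```python
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Renders a sequence of numbers as a one-line bar chart."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series renders as all-lowest bars.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    # Scale each value into an index in [0, top], truncating downward.
    return "".join(BARS[int((v - lo) / span * top)] for v in values)


# Block counts copied from the sample output in the PR description.
TRIVIAL_SIMULATOR = [
    4, 7, 14, 17, 23, 25, 29, 36, 42, 45, 48, 52, 57,
    65, 72, 78, 84, 91, 96, 102, 107, 112, 122, 131, 137,
]

if __name__ == "__main__":
    print("dut_trivial_simulator", sparkline(TRIVIAL_SIMULATOR))
```

Printed next to each `RUNNING:` line, a steady ramp would point to a linear leak, while a curve that flattens out would point to a cache or other bounded growth.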

@rpoyner-tri rpoyner-tri merged commit 6074263 into RobotLocomotion:master Oct 1, 2024
9 checks passed
@jwnimmer-tri jwnimmer-tri deleted the memory_leak_manual_test branch October 1, 2024 14:10