DaCe VRAM pooling #295
Prevent patch class to be badly renamed
Make DaCeOrchestration.Run resilient to no communicator, defaulting to `.gt_cache`
Change conftest to adjust for changes in DaceConfig reqs
Remove custom lazy compile
Restore proper restart config save
Fix OOB with passing origin to the __sdfg__
Orchestrate dyncore: delnflux
Remove unused parameter in Remap
Orchestrate: dyn_core
Fix translate parallel test comm passing to dace_config
Bad fix for multi-process yaml load
launch jenkins
@@ -34,7 +34,7 @@ attrs==21.2.0
     # pytest
 babel==2.9.1
     # via sphinx
-backports.entry-points-selectable==1.1.1
+backports-entry-points-selectable==1.1.1
Do we know why this changed?
-_o_-
cmake==3.22.4
    # via dace
I don't see this dependency added back anywhere. Was that removed?
-_o_- I am guessing DaCe changed their dependency tree
        f"\t {detail.name}\n"
    )

    return report
Just a suggestion: it would be more flexible to separate the memory counting from the report generation, i.e. split this into two functions. The first would create something like [(name, size)] for an SDFG, and the second would produce the English representation.
I am prepping another PR that renames that tool and adds a kernel timing analysis. I'll break it up there.
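One way to read the suggestion in this thread, as a minimal sketch rather than the code that will land in the follow-up PR: one function only counts, returning a [(name, size)] list, and a second one only formats. `ArrayInfo`, `count_memory` and `format_report` are hypothetical names, and plain tuples stand in for the arrays of an SDFG so the example runs without DaCe.

```python
from typing import Iterable, List, Tuple

# Hypothetical stand-in for an SDFG's arrays:
# (name, number of elements, bytes per element).
ArrayInfo = Tuple[str, int, int]


def count_memory(arrays: Iterable[ArrayInfo]) -> List[Tuple[str, int]]:
    """Return [(name, size_in_bytes)] with no formatting applied."""
    return [(name, total_size * itemsize) for name, total_size, itemsize in arrays]


def format_report(sizes: List[Tuple[str, int]]) -> str:
    """Turn the raw [(name, size)] list into a human-readable report."""
    total_mb = sum(size for _, size in sizes) / (1024 * 1024)
    report = f"Total: {total_mb:.2f} MB\n"
    for name, size in sizes:
        report += f"\t {name}: {size} bytes\n"
    return report


# Made-up array names and sizes, for illustration only.
sizes = count_memory([("q_tmp", 1024 * 1024, 8), ("flux", 512 * 1024, 4)])
report = format_report(sizes)
```

With this split, the raw list can feed other consumers (tests, a timing tool) without parsing the report string.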
memory_pooled = 0.0
for _sd, _aname, arr in sdfg.arrays_recursive():
    if arr.lifetime == dace.AllocationLifetime.Persistent:
        arr.pool = True
        memory_pooled += arr.total_size * arr.dtype.bytes
        arr.lifetime = dace.AllocationLifetime.Scope
memory_pooled = float(memory_pooled) / (1024 * 1024)
Is my understanding correct that this is the moment when the arrays are switched to using pooled memory?
Yes. At code generation, DaCe automatically pools all Scope-lifetime arrays that are flagged for pooling. So we switch Persistent arrays (e.g. arrays in a sub-SDFG that are not passed as parameters to the top SDFG) to Scope lifetime and flag them. This code only sets the flags; the actual pooling is done at code generation time.
I am clarifying in the comment
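To make the flag-then-generate split discussed in this thread concrete, here is a small self-contained sketch of the same loop over stand-in objects. `Lifetime`, `FakeArray` and `flag_for_pooling` are hypothetical and only mimic the attributes used in the diff (`lifetime`, `pool`, `total_size`, `dtype.bytes`); nothing here calls DaCe, and no pooling actually happens, since in the PR that is left to DaCe's code generation.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List


class Lifetime(Enum):
    # Stand-in for dace.AllocationLifetime; only the two values used here.
    Persistent = auto()
    Scope = auto()


@dataclass
class FakeArray:
    # Stand-in for a DaCe array descriptor.
    total_size: int   # number of elements
    itemsize: int     # bytes per element (arr.dtype.bytes in the diff)
    lifetime: Lifetime
    pool: bool = False


def flag_for_pooling(arrays: List[FakeArray]) -> float:
    """Flag Persistent arrays for pooling and return the flagged size in MB."""
    pooled_bytes = 0
    for arr in arrays:
        if arr.lifetime == Lifetime.Persistent:
            arr.pool = True                 # mark as pool candidate
            arr.lifetime = Lifetime.Scope   # pooling applies to Scope arrays
            pooled_bytes += arr.total_size * arr.itemsize
    return pooled_bytes / (1024 * 1024)


arrays = [
    FakeArray(total_size=1024 * 1024, itemsize=8, lifetime=Lifetime.Persistent),
    FakeArray(total_size=1024 * 1024, itemsize=8, lifetime=Lifetime.Scope),
]
pooled_mb = flag_for_pooling(arrays)
```

Only the first array is touched: it was Persistent, so it is flagged and moved to Scope, and its 8 MB is counted. The array that was already Scope is left unflagged, matching the condition in the diff.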
Purpose
Add VRAM pooling by moving all Persistent arrays to Transient within the DaCe pipeline, then using the new DaCe auto-pooling of transients.
Code changes:
Requirements changes:
python -m driver.tools
tooling>= 0.14
Infrastructure changes:
mallocAsync
fails when turned on
Checklist