DaCe VRAM pooling #295
Merged
Commits
134 commits
df4c245
Allow for env var to control orchestration
FlorianDeconinck afd41a0
(some) Translate tests
FlorianDeconinck 03a3ba4
Failing flllz orchestration
FlorianDeconinck b9f9146
(Re)Orchestrate remapping
FlorianDeconinck 1b79998
Fix orchestrate for new DaCe
FlorianDeconinck 9250283
Removing extra guard irrelevant since load_once is gone
FlorianDeconinck a4939d7
Correct type hint & return
FlorianDeconinck c0e326c
Use lazy_stencil when orchestrating
FlorianDeconinck 65f3164
Making sure lazy_stencil doesn't trigger before __call__
FlorianDeconinck 97eac2d
Remove the need to cache Communicator in dace_config
FlorianDeconinck 2680451
Fixing communicator removed from DaceConfig
FlorianDeconinck 257cb06
Merge branch 'DaceConfig_RemoveComm' into reorchestrate_all_modules
FlorianDeconinck ac1aec5
Integrate LazyStencil.field_info fix
FlorianDeconinck 4394d99
Remove unused _frozen_stencil() in stencil
FlorianDeconinck 01b40e3
Add domain to stencil __sdfg__
FlorianDeconinck 43eca42
Orchestrate tracers
FlorianDeconinck c119b93
Orchestrate: microphysics (minus driver)
FlorianDeconinck 9d06b5f
Merge branch 'orchestrate_on_AOT_stencils' into reorchestrate_all_mod…
FlorianDeconinck 31934be
Change orchestration build pipe to more efficient trf passes
FlorianDeconinck faf18c0
Minor
FlorianDeconinck 7d1f3d2
Orchestrate: fv_dynamics
FlorianDeconinck 4f721d1
Minor
FlorianDeconinck 7882901
Updating dace.conf
FlorianDeconinck 4e582f0
gitignore; DaCe & test
FlorianDeconinck 53e8f03
Verbose
FlorianDeconinck 15f6d8e
Boolean logic be hard
FlorianDeconinck 25e5bac
Removing dace.conf and replacing it with direct call to conf API
FlorianDeconinck da6e9d2
Restrict dace.config setup to orchestration
FlorianDeconinck 2798f8f
Orchestration: FV_Dynamics
FlorianDeconinck 72e9ae2
Move parsing in orchestration to common fn, time.
FlorianDeconinck 7fd16aa
Use_cache on SDFG gen & verbose print
FlorianDeconinck d3d4f32
Orchestration: driver
FlorianDeconinck eabd66b
Driver example: c12 baroclinic orchestration on CPU
FlorianDeconinck 7bc7978
Merge remote-tracking branch 'origin/main' into reorchestrate_all_mod…
FlorianDeconinck 76d5771
Linting
FlorianDeconinck d0199a7
More linting
FlorianDeconinck cc28498
Revert cache us on sdfg parse
FlorianDeconinck 429f495
Fix timestep computation. Move in Config
FlorianDeconinck 929ce00
Modify Physics call structure to allow for DaCe parsing limit
FlorianDeconinck 6364776
Fix rank read in build.py logging
FlorianDeconinck 2ed46da
Fix log_on_rank_0 on MPI
FlorianDeconinck b8c0f8c
Bypass parsing when not the rank that should be compiling
FlorianDeconinck 8f3d4e8
Linting
FlorianDeconinck 252f371
Fix test_fv3core
FlorianDeconinck fc03f2b
Rename module orchestrate to orchestration
FlorianDeconinck 1cd473d
Swap .layout/decomposition for a post-build write up & runtime check
FlorianDeconinck 3758d44
Fix dace_config save in restart
FlorianDeconinck 8bf16d9
Lint
FlorianDeconinck 80fb2a8
Merge branch 'main' into reorchestrate_all_modules
FlorianDeconinck 707935e
Typo
FlorianDeconinck 838a18d
lint
FlorianDeconinck cda9cda
Fix dace_config serialization for Restart
FlorianDeconinck fc6b30a
dace_config is optional
FlorianDeconinck ef13fda
lint
FlorianDeconinck 2e9b2ef
Update `dace` version
FlorianDeconinck b0d3ecd
Update constraints.txt
FlorianDeconinck 52b1165
Guard against degenerate behavior for FV3_DACEMODE
FlorianDeconinck 1f551c6
Merge remote-tracking branch 'origin/reorchestrate_all_modules' into …
FlorianDeconinck 8a06268
PR notes - verbosing behavior
FlorianDeconinck f0c7591
Driver performance critical function renamed & verbosed
FlorianDeconinck ff10723
Merge branch 'main' into reorchestrate_all_modules
FlorianDeconinck 49a55e1
Extend write/verification of the build_info
FlorianDeconinck 797f11d
Missing serialization field
FlorianDeconinck 0b6aa3f
Fix build_info file
FlorianDeconinck 2b8d503
Small fix
FlorianDeconinck 737003a
Small fix
FlorianDeconinck 35e5e1c
SDFG count RAM/VRAM
FlorianDeconinck 3a662f1
Fix cmd line
FlorianDeconinck 854d819
Update Pace code to DaCe v0.14 RC (TBR)
FlorianDeconinck 2e96bf9
Update DaCe to 0.14rc1
FlorianDeconinck 5dc3eae
Fix SDFG load on distributed cache
FlorianDeconinck 0ac1090
Microphysics: move setup computation on proper device
FlorianDeconinck 24b36ff
update gt4py to branch
b3a366d
Merge branch 'reorchestrate_all_modules' into dace_auto_RAM_read
FlorianDeconinck b11cb1a
Add per file and in-memory options
FlorianDeconinck 8ee1ed7
Re-insert performance collection after each timestep
FlorianDeconinck 3f1684a
Fix microphysics setup computation on proper Host/Device
FlorianDeconinck 608fa90
lint
FlorianDeconinck 4c57da9
Fix to the ContextLib orchestration
FlorianDeconinck 3ef571a
Merge branch 'main' into reorchestrate_all_modules
FlorianDeconinck 234368e
Merge branch 'reorchestrate_all_modules' into dace_auto_RAM_read
FlorianDeconinck 2e0f449
Detail reporting
FlorianDeconinck 2ebe9e8
Fix command line
FlorianDeconinck e4281b5
Do not instantiate Physics if you are not going to run it
FlorianDeconinck 06ab595
Add debug tools
FlorianDeconinck 29fe391
Driver: Fix timestep, fix end_of_step_actions for orchestration
FlorianDeconinck daeb8d4
DaCe orchestrated: proper blocking size, compile for newer target SM
FlorianDeconinck 7078866
Merge branch 'main' into stable_orchestration
FlorianDeconinck d7dd1b8
Lint
FlorianDeconinck 1e12120
Tweaking report to display orchestrated
FlorianDeconinck b1914f7
Merge branch 'stable_orchestration' into dace_auto_RAM_read
FlorianDeconinck fee465f
LINT
FlorianDeconinck 915e19e
Added static analysis to end of build
FlorianDeconinck 04f9458
Verbose
FlorianDeconinck ea3e311
NaN Check: removing unused code, auto schedule type, verbose
FlorianDeconinck 29bd9d3
Pass down transient flag
FlorianDeconinck 4b988e7
Merge branch 'main' into stable_orchestration
FlorianDeconinck 57c1037
Update dace to v0.14rc2
FlorianDeconinck 0cbff4c
Make constraints.txt happy
FlorianDeconinck 5e2a35f
Remove dace constraints -> per PIP requirements workaround
FlorianDeconinck 310c45e
Merge branch 'stable_orchestration' into dace_transient_pooled
FlorianDeconinck 7d75ace
Orchestration: pool persistent mem
FlorianDeconinck b98c64b
Merge branch 'dace_auto_RAM_read' into dace_transient_pooled
FlorianDeconinck 4a087a3
Remove -e from constraints.txt per PIP
FlorianDeconinck 1467a41
Dace config: query dace syncdebug
FlorianDeconinck a7b3393
Merge branch 'main' into stable_orchestration
FlorianDeconinck 401906c
Move gt4py & own reference to DaCe to rc2 capable
FlorianDeconinck 4bde297
GT4Py dace versioning relaxed constraints
FlorianDeconinck 70f1032
Add Dace requirements to the driver
FlorianDeconinck 4283e5a
Make `daint` pre-install dace to go around new PIP behavior of refusi…
FlorianDeconinck 46b1d85
Move dace install up (?) on daint
FlorianDeconinck fc10613
Copy changes to the other (sic) install_virtualenv, for more env fun
FlorianDeconinck 299f327
typo
FlorianDeconinck ffeacb0
Merge branch 'stable_orchestration' into dace_transient_pooled
FlorianDeconinck 5ecf71a
Merge remote-tracking branch 'origin/main' into stable_orchestration
FlorianDeconinck 64b9dec
DaCe config: remove check_args (not needed & unsupported in new DaCe)
FlorianDeconinck c05127a
Merge branch 'stable_orchestration' into dace_transient_pooled
FlorianDeconinck 772bf71
Lint
FlorianDeconinck cc19793
Merge branch 'main' into dace_transient_pooled
FlorianDeconinck ce0dfed
Fix merge errors
FlorianDeconinck b0e62c3
Remove orch .yaml example
FlorianDeconinck 3b3713d
Use logger instead print
FlorianDeconinck ba15084
Lint
FlorianDeconinck bb6cadb
Merge remote-tracking branch 'origin/HEAD' into dace_transient_pooled
FlorianDeconinck 836a064
Fix logging
FlorianDeconinck 41f8a92
Merge remote-tracking branch 'origin/main' into dace_transient_pooled
FlorianDeconinck 4642f85
Merge remote-tracking branch 'origin/main' into dace_transient_pooled
FlorianDeconinck b8398e9
Deactivate distributed compile
FlorianDeconinck f8c0d9d
Move flag to dace_config
FlorianDeconinck 8287dbc
dace >= 0.14 for VRAM pooling
FlorianDeconinck f27e449
Remove -e from constraint.txt + lint
FlorianDeconinck fe29275
Make CUDA timer safe for non-CUDA context
FlorianDeconinck 761ade2
Reuse GPU availability code & optional import
FlorianDeconinck 5d18530
Cleanup & PR notes
FlorianDeconinck
@@ -1,5 +1,5 @@
 #
-# This file is autogenerated by pip-compile
+# This file is autogenerated by pip-compile with python 3.8
 # To update, run:
 #
 #    pip-compile --output-file=constraints.txt driver/requirements.txt dsl/requirements.txt external/gt4py/setup.cfg fv3core/requirements.txt fv3gfs-physics/requirements.txt pace-util/requirements.txt requirements_dev.txt requirements_docs.txt requirements_lint.txt
@@ -34,7 +34,7 @@ attrs==21.2.0
 # pytest
 babel==2.9.1
 # via sphinx
-backports.entry-points-selectable==1.1.1
+backports-entry-points-selectable==1.1.1
 # via virtualenv
 black==22.3.0
 # via
@@ -76,8 +76,6 @@ click==8.0.1
 # pip-tools
 cloudpickle==2.0.0
 # via dask
-cmake==3.22.4
-# via dace
Comment on lines -79 to -80:
I don't see this dependency added back anywhere. Was that removed?
 commonmark==0.9.1
 # via recommonmark
 coverage==5.5
@@ -88,6 +86,13 @@ cytoolz==0.11.2
 # via
 # gt4py
 # gt4py (external/gt4py/setup.cfg)
+dace==0.14
+# via
+# -r driver/requirements.txt
+# -r dsl/requirements.txt
+# -r fv3core/requirements/requirements_dace.txt
+# -r requirements_dev.txt
+# pace-dsl
 dacite==1.6.0
 # via
 # -r driver/requirements.txt
@@ -105,8 +110,6 @@ dill==0.3.5.1
 # via dace
 distlib==0.3.2
 # via virtualenv
-distro==1.7.0
-# via scikit-build
 docutils==0.16
 # via
 # recommonmark
@@ -163,15 +166,15 @@ google-api-core==2.0.0
 # via
 # google-cloud-core
 # google-cloud-storage
-google-auth-oauthlib==0.4.5
-# via gcsfs
 google-auth==2.0.1
 # via
 # gcsfs
 # google-api-core
 # google-auth-oauthlib
 # google-cloud-core
 # google-cloud-storage
+google-auth-oauthlib==0.4.5
+# via gcsfs
 google-cloud-core==2.0.0
 # via google-cloud-storage
 google-cloud-storage==1.42.0
@@ -238,15 +241,15 @@ multidict==5.1.0
 # via
 # aiohttp
 # yarl
-mypy==0.790
-# via
-# -r fv3gfs-physics/requirements.txt
-# -r pace-util/requirements.txt
 mypy-extensions==0.4.3
 # via
 # black
 # mypy
 # typing-inspect
+mypy==0.790
+# via
+# -r fv3gfs-physics/requirements.txt
+# -r pace-util/requirements.txt
 netcdf4==1.5.7
 # via
 # -r driver/requirements.txt
@@ -292,7 +295,6 @@ packaging==21.0
 # gt4py
 # gt4py (external/gt4py/setup.cfg)
 # pytest
-# scikit-build
 # sphinx
 # tox
 pandas==1.3.2
@@ -326,12 +328,12 @@ py==1.10.0
 # pytest
 # pytest-forked
 # tox
-pyasn1-modules==0.2.8
-# via google-auth
 pyasn1==0.4.8
 # via
 # pyasn1-modules
 # rsa
+pyasn1-modules==0.2.8
+# via google-auth
 pybind11==2.8.1
 # via
 # gt4py
@@ -350,6 +352,21 @@ pygments==2.10.0
 # via sphinx
 pyparsing==2.4.7
 # via packaging
+pytest==6.2.4
+# via
+# -r driver/requirements.txt
+# -r fv3core/requirements/requirements_base.txt
+# -r requirements_dev.txt
+# pytest-cache
+# pytest-cov
+# pytest-datadir
+# pytest-dependency
+# pytest-factoryboy
+# pytest-forked
+# pytest-profiling
+# pytest-regressions
+# pytest-subtests
+# pytest-xdist
 pytest-cache==1.0
 # via -r fv3core/requirements/requirements_base.txt
 pytest-cov==2.12.1
@@ -378,21 +395,6 @@ pytest-subtests==0.5.0
 # -r requirements_dev.txt
 pytest-xdist==2.3.0
 # via -r fv3core/requirements/requirements_base.txt
-pytest==6.2.4
-# via
-# -r driver/requirements.txt
-# -r fv3core/requirements/requirements_base.txt
-# -r requirements_dev.txt
-# pytest-cache
-# pytest-cov
-# pytest-datadir
-# pytest-dependency
-# pytest-factoryboy
-# pytest-forked
-# pytest-profiling
-# pytest-regressions
-# pytest-subtests
-# pytest-xdist
 python-dateutil==2.8.2
 # via
 # faker
@@ -411,8 +413,6 @@ pyyaml==5.4.1
 # pytest-regressions
 recommonmark==0.7.1
 # via -r requirements_docs.txt
-requests-oauthlib==1.3.0
-# via google-auth-oauthlib
 requests==2.26.0
 # via
 # dace
@@ -421,10 +421,10 @@ requests==2.26.0
 # google-cloud-storage
 # requests-oauthlib
 # sphinx
+requests-oauthlib==1.3.0
+# via google-auth-oauthlib
 rsa==4.7.2
 # via google-auth
-scikit-build==0.15.0
-# via dace
 scipy==1.7.1
 # via
 # -r fv3core/requirements/requirements_base.txt
@@ -447,6 +447,13 @@ snowballstemmer==2.1.0
 # via sphinx
 sortedcontainers==2.4.0
 # via hypothesis
+sphinx==4.1.2
+# via
+# -r requirements_docs.txt
+# recommonmark
+# sphinx-argparse
+# sphinx-gallery
+# sphinx-rtd-theme
 sphinx-argparse==0.3.1
 # via -r requirements_docs.txt
 sphinx-gallery==0.10.1
@@ -455,13 +462,6 @@ sphinx-rtd-theme==0.5.2
 # via
 # -r pace-util/requirements.txt
 # -r requirements_docs.txt
-sphinx==4.1.2
-# via
-# -r requirements_docs.txt
-# recommonmark
-# sphinx-argparse
-# sphinx-gallery
-# sphinx-rtd-theme
 sphinxcontrib-applehelp==1.0.2
 # via sphinx
 sphinxcontrib-devhelp==1.0.2
@@ -533,7 +533,6 @@ wheel==0.37.0
 # -r pace-util/requirements.txt
 # astunparse
 # pip-tools
-# scikit-build
 xarray==0.19.0
 # via
 # -r driver/requirements.txt
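Several hunks in this constraints diff only move a pin (for example `pytest==6.2.4` and `sphinx==4.1.2`) or respell a name (`backports.entry-points-selectable` vs `backports-entry-points-selectable`), which makes the real dependency changes hard to spot. The following is a minimal sketch, not part of this PR, of how the two versions of a constraints file could be compared as sets of pins rather than line by line. The helper names `normalize`, `parse_pins` and `compare` are hypothetical, the sample strings are short excerpts of the pins shown in the diff, and the parser assumes plain `name==version` lines (no extras or environment markers).

```python
import re
from typing import Dict, List, Tuple


def normalize(name: str) -> str:
    """Normalize a project name as PEP 503 does: lowercase, runs of -_. become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_pins(text: str) -> Dict[str, str]:
    """Map normalized package name -> pinned version; comments and blanks are skipped."""
    pins: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop "# via ..." annotations
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[normalize(name)] = version.strip()
    return pins


def compare(old: str, new: str) -> Tuple[List[str], List[str], List[str]]:
    """Return (removed, added, changed) package names, ignoring line order."""
    before, after = parse_pins(old), parse_pins(new)
    removed = sorted(set(before) - set(after))
    added = sorted(set(after) - set(before))
    changed = sorted(n for n in set(before) & set(after) if before[n] != after[n])
    return removed, added, changed


# Excerpts of the old and new pins from the diff (illustrative subset only).
OLD = """\
backports.entry-points-selectable==1.1.1
    # via virtualenv
cmake==3.22.4
    # via dace
distro==1.7.0
pytest-cache==1.0
pytest==6.2.4
scikit-build==0.15.0
"""

NEW = """\
backports-entry-points-selectable==1.1.1
dace==0.14
pytest==6.2.4
pytest-cache==1.0
"""

if __name__ == "__main__":
    print(compare(OLD, NEW))
    # -> (['cmake', 'distro', 'scikit-build'], ['dace'], [])
```

On this excerpt the reordered and respelled pins drop out, leaving only the removals of `cmake`, `distro` and `scikit-build` and the new `dace` pin.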
@@ -0,0 +1,35 @@
+import os
+from typing import Optional
+
+import click
+
+from pace.dsl.dace.utils import count_memory_from_path
+
+
+# Count the memory from a given SDFG
+ACTION_SDFG_MEMORY_COUNT = "sdfg_memory_count"
+
+
+@click.command()
+@click.argument(
+    "action",
+    required=True,
+    type=click.Choice([ACTION_SDFG_MEMORY_COUNT]),
+)
+@click.option(
+    "--sdfg_path",
+    type=click.STRING,
+)
+@click.option("--report_detail", is_flag=True, type=click.BOOL, default=False)
+def command_line(action: str, sdfg_path: Optional[str], report_detail: Optional[bool]):
+    """
+    Run tooling.
+    """
+    if action == ACTION_SDFG_MEMORY_COUNT:
+        if sdfg_path is None or not os.path.exists(sdfg_path):
+            raise RuntimeError(f"Can't load SDFG {sdfg_path}")
+        print(count_memory_from_path(sdfg_path, detail_report=report_detail))
+
+
+if __name__ == "__main__":
+    command_line()
@@ -7,4 +7,4 @@ numpy
 netCDF4
 xarray
 zarr
-git+https://github.com/spcl/dace[email protected]
+dace>=0.14
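This hunk replaces the git reference with the requirement `dace>=0.14`, which the commit list ties to VRAM pooling ("dace >= 0.14 for VRAM pooling"). As an illustration only, and not code from this PR, a runtime check of that minimum could look like the sketch below. The helper names `release_tuple`, `meets_minimum` and `installed_version` are hypothetical, and the comparison is deliberately simplified: it looks only at the leading numeric release segment, so a pre-release such as `0.14rc2` is treated as `0.14` even though PEP 440 orders it before the final release.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def release_tuple(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric release segment, e.g. '0.14rc2' -> (0, 14)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version: str, minimum: Tuple[int, ...] = (0, 14)) -> bool:
    """True if the release segment of `version` is at least `minimum`."""
    release = release_tuple(version)
    # Pad with zeros so that (0, 14) and (0, 14, 0) compare as equal.
    width = max(len(release), len(minimum))
    padded_release = release + (0,) * (width - len(release))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_release >= padded_minimum


def installed_version(distribution: str = "dace") -> Optional[str]:
    """Version of an installed distribution, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("dace")
    if found is None:
        print("dace is not installed")
    else:
        print(f"dace {found}: meets >=0.14? {meets_minimum(found)}")
```

A project that needs strict PEP 440 semantics would more likely rely on the `packaging` library or let pip enforce the requirement at install time; the tuple comparison here only keeps the sketch dependency-free.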
Do we know why this changed?
-_o_-