Added profiling docs for Polaris and Aurora #576

Open

khossain4337 wants to merge 3 commits into main from kh_dlprof

Collaborator

khossain4337 commented Dec 11, 2024

For Polaris:

Added nsys and ncu profiling methods

For Aurora:

Added unitrace profiling methods

TODO:

Add PyTorch Profiler for Polaris
Add THAPI for Aurora

Khalid Hossain added 3 commits, December 11, 2024 14:15

3bd1991 Added unitrace profiling for Aurora, updated mkdocs nav.
798b048 Merge branch 'main' into kh_dlprof. Pulling in the changes from main.
c26f6b4 Added profiling with nsys on Polaris, updated mkdocs.yml

khossain4337 requested review from BethanyL and FilippoSimini

December 11, 2024 23:02

felker requested changes

docs/aurora/data-science/profiling_dl.md

+              multiple nodes. A simple example, where we use a wrapper script to trace the
+              rank 0 on each node of a 4 node job running a PyTorch application is below:
+              ### A `unitrace` wrapper

Member

felker Jan 9, 2025

Can you add a title for the script using: https://squidfunk.github.io/mkdocs-material/reference/code-blocks/#adding-a-title

And remove the ### subsection header
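
For context, the wrapper pattern under review (trace only rank 0 on each node) can be sketched roughly as below. This is a minimal illustration, not the script from the PR: `PROFILER_CMD` is a hypothetical variable that would hold the full profiler invocation (for example `unitrace` plus its options), and `PALS_LOCAL_RANKID` is assumed to be the per-node rank index exported by the launcher.

```shell
#!/bin/sh
# Sketch of a per-node profiling wrapper (not the script from this PR).
# Assumptions: the launcher exports PALS_LOCAL_RANKID (rank index within a node),
# and PROFILER_CMD (hypothetical) holds the profiler and its options.

launch() {
    # Default to 0 so the function also works outside an MPI launch.
    local_rank="${PALS_LOCAL_RANKID:-0}"
    if [ "${local_rank}" -eq 0 ] && [ -n "${PROFILER_CMD:-}" ]; then
        # Unquoted on purpose: PROFILER_CMD is split into the profiler and its options.
        ${PROFILER_CMD} "$@"
    else
        # Every other rank runs the application unprofiled.
        "$@"
    fi
}

# Run whatever command line was passed to the wrapper.
launch "$@"
```

With this shape, only one rank per node pays the tracing overhead, and the application command line passes through the wrapper unchanged.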

docs/aurora/data-science/profiling_dl.md


		### Deployment

		The wrapper above can be deployed using a PBS job script the following way

Member

felker Jan 9, 2025

Suggested change

- The wrapper above can be deployed using a PBS job script the following way
+ The wrapper above can be deployed using the following PBS job script:

docs/polaris/data-science/profiling_dl.md

		@@ -0,0 +1,250 @@
		# Profiling Deep Learning Applications

		We can use both framework (for example, PyTorch) native profiler and vendor specific

Member

felker Jan 9, 2025

Suggested change

- We can use both framework (for example, PyTorch) native profiler and vendor specific
+ We can use both a framework-specific (for example, PyTorch-specific) native profiler and the vendor-specific NVIDIA

docs/polaris/data-science/profiling_dl.md

+              [Nsight compute profiler](https://developer.nvidia.com/tools-overview/nsight-compute/get-started).
+              Refer to the respective documentation for more details:
+              [Nsight System User Guide](https://docs.nvidia.com/nsight-systems/UserGuide/index.html)

Member

felker Jan 9, 2025

Use an unordered list (`-`).

docs/polaris/data-science/profiling_dl.md

+              multiple nodes. A simple example, where we use a wrapper script to trace the
+              rank 0 on each node of a 2 node job running a PyTorch application is below:
+              ### An `nsys` wrapper

Member

felker Jan 9, 2025

same comment as before re: code block title

docs/polaris/data-science/profiling_dl.md

+              This wrapper can be deployed as the `nsys` example above. In the `ncu` wrapper
+              we explicitly set the name of the kernel that we want to analyze
+              (a gemm kernel in this case).

Member

felker Jan 9, 2025

Suggested change

- (a gemm kernel in this case).
+ (a GEMM kernel in this case).

or

Suggested change

- (a gemm kernel in this case).
+ (a `gemm` kernel in this case).
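
As a point of reference for the kernel filter discussed in this hunk, the sketch below shows one way a wrapper might restrict Nsight Compute to a single kernel. It is an illustration under stated assumptions, not the wrapper from the PR: `KERNEL_NAME` and `OUT_PREFIX` are hypothetical variables, and `PMI_RANK` is assumed to be the global rank exported by the launcher.

```shell
#!/bin/sh
# Sketch of restricting Nsight Compute to one kernel (not the wrapper from this PR).
# Assumptions: KERNEL_NAME and OUT_PREFIX are hypothetical variables, and the
# launcher exports PMI_RANK (global rank), used here only to name the report.

ncu_command() {
    kernel="${KERNEL_NAME:-gemm}"
    rank="${PMI_RANK:-0}"
    # The regex: prefix makes ncu treat the value as a regular expression rather
    # than an exact kernel name; --launch-count bounds how many launches are profiled.
    echo "ncu --kernel-name regex:${kernel} --launch-count 1 -o ${OUT_PREFIX:-ncu_report}_rank${rank}"
}

# Print the command that a wrapper would place in front of the application.
ncu_command
```

Keeping the filter in a variable makes it clearer to readers which part of the wrapper they are expected to change for their own kernels.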

docs/polaris/data-science/profiling_dl.md

+              The next step is to load the `nsys-rep` files in the Nsight Systems GUI, and
+              the `ncu-rep` files to the Nsight Compute GUI.
+              ### For a single rank run

Member

felker Jan 9, 2025

Suggested change

- ### For a single rank run
+ ### Single rank run

docs/polaris/data-science/profiling_dl.md

+              of the documentation. Here we only show standard options, either of the three
+              could be chosen. Note that, invoking each option will lead to varying amounts
+              of time the profiler need to run. This will be important in setting the
+              requested wall-time for your batch job.

Member

felker Jan 9, 2025

Suggested change

- requested wall-time for your batch job.
+ requested walltime for your batch job.

docs/polaris/data-science/profiling_dl.md

+              generate the profiles. The exhaustive list could be found in the respective
+              documentation pages:
+              [Nsight System User Guide](https://docs.nvidia.com/nsight-systems/UserGuide/index.html)

Member

felker Jan 9, 2025

Use unordered list to clean up the formatting

docs/aurora/data-science/profiling_dl.md

+              fi
+              ```
+              There are a few important things to notice in the wrapper.

Member

felker Jan 9, 2025

Suggested change

- There are a few important things to notice in the wrapper.
+ There are several important shell variables in the wrapper, which may require modification:

Also make the same phrasing change in the Polaris profiling doc.
