Merge pull request #2533 from devitocodes/advisor_refresh_II

misc: Update advisor with oneAPI 2025
devitocodes · Mar 7, 2025 · b22dd66
2 parents e1dcb86 + fd5df3f
commit b22dd66
Showing 9 changed files with 234 additions and 267 deletions.
diff --git a/benchmarks/user/advisor/README.md b/benchmarks/user/advisor/README.md
@@ -1,48 +1,88 @@
-Example runs:
+# Intel Advisor roofline profiling on Devito
 
-* `python3 run_advisor.py --name isotropic --path <path-to-devito>/examples/seismic/acoustic/acoustic_example.py`
-* `python3 run_advisor.py --name tti_so8 --path <path-to-devito>/examples/seismic/tti/tti_example.py --exec-args "-so 8"`
-* `python3 run_advisor.py --name iso_ac_so6 --path <path-to-devito>/benchmarks/user/benchmark.py --exec-args "bench -P acoustic -so 6 --tn 200 -d 100 100 100 --autotune off -x 1"`
+This README helps users derive rooflines for Devito with [Intel Advisor](https://www.intel.com/content/www/us/en/developer/tools/oneapi/advisor.html).
+We recommend going through the tutorial [02_advisor_roofline.ipynb](https://github.com/devitocodes/devito/blob/master/examples/performance/02_advisor_roofline.ipynb) for more detailed, step-by-step guidance.
 
-After the run has finished you should be able to plot a roofline with the results and export roofline data to JSON using:
-* `python3 roofline.py --name Roofline --project <advisor-project-name>`
+### Prerequisites:
+* Support is guaranteed only for Intel oneAPI 2025; earlier versions may not work.
+You may download Intel oneAPI [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?packages=oneapi-toolkit&oneapi-toolkit-os=linux&oneapi-lin=apt).
 
-To create a read-only snapshot for use with Intel Advisor GUI, use:
-* `advixe-cl --snapshot --project-dir=<advisor-project-name> pack -- /<new-snapshot-name>`
+* Add Advisor (`advixe-cl`) and the compilers (`icx`) to your path. Source the environment variables along these lines (the exact paths depend on your installation folder):
+```sh
+source /opt/intel/oneapi/advisor/latest/env/vars.sh
+source /opt/intel/oneapi/compiler/latest/env/vars.sh
+```
 
-Prerequisites:
-* Support guaranteed only for Intel Advisor as installed with Intel Parallel Studio v 2020 Update 2
-  and Intel oneAPI 2021; earlier years may not work; other 2020/2021 versions, as well as later years,
-  may or may not work.
 * In Linux systems you may need to enable system-wide profiling by setting:
-  - `/proc/sys/kernel/yama/ptrace_scope` to `0`
-  - `/proc/sys/kernel/perf_event_paranoid` to `1`
 
-* `numactl` must be available on the system. If not available, install with:
-	`sudo apt-get install numactl`
+```sh
+echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
+echo 1 | sudo tee /proc/sys/kernel/perf_event_paranoid
+```
+
+* `numactl` must be available on the system. If not available, install using:
+```sh
+sudo apt-get install numactl
+```
 * Install `pandas` and `matplotlib`. They are not included in the core Devito installation.
+```sh
+pip install pandas matplotlib
+```
+
+
+### Example runs:
 
-Limitations:
+```bash
+# The isotropic acoustic example
+python3 run_advisor.py --name isotropic --path <path-to-devito>/examples/seismic/acoustic/acoustic_example.py
+# The isotropic elastic example
+python3 run_advisor.py --name iso_elastic --path <path-to-devito>/examples/seismic/elastic/elastic_example.py --exec-args "-so 4"
+# The anisotropic acoustic (TTI) example
+python3 run_advisor.py --name tti_so8 --path <path-to-devito>/examples/seismic/tti/tti_example.py --exec-args "-so 8"
+```
 
-* Untested with more complicated examples.
-* Untested on Intel KNL (we might need to ask `numactl` to bind to MCDRAM).
-* Running the `tripcounts` analysis takes a lot, despite starting in paused
+After the run has finished, you can plot the roofline and export its data to a
+`.json` file:
+```bash
+python3 roofline.py --name Roofline --project <advisor-project-name>
+```
+
+To create a read-only snapshot for use with Intel Advisor GUI, use:
+```bash
+advixe-cl --snapshot --project-dir=<advisor-project-name> pack -- /<new-snapshot-name>
+```
+### Limitations:
+
+* Not tested with all possible examples that Devito can support.
+* Running the `tripcounts` analysis is time-consuming, despite starting in paused
   mode. This analysis, together with the `survey` analysis, is necessary to
   generate a roofline. Both are run by `run_advisor.py`.
-* Requires python3, untested in earlier versions of python and conda environments
-* Currently requires download of repository and running `pip3 install .`, the scripts
+* Requires Python 3.9 or later; untested in conda environments.
+* Currently requires downloading the repository and running `pip install .`, since the scripts
   are currently not included as a package with the user installation of Devito
 
-TODO:
+### TODO:
 
 * Give a name to the points in the roofline, otherwise it's challenging to
   relate loops (code sections) to data.
 * Emit a report summarizing the configuration used to run the analysis
   (threading, socket binding, ...).
 
-Useful links:
+### Useful links:
+
+* [ Intel® Advisor Performance Optimization Cookbook ](https://www.intel.com/content/www/us/en/docs/advisor/cookbook/2024-2/overview.html " Intel® Advisor Performance Optimization Cookbook ")
+
+* [ Intel® Advisor User Guide ](https://www.intel.com/content/www/us/en/docs/advisor/cookbook/2024-2/overview.html " Intel® Advisor User Guide ")
+
+* [ Roofline Resources for Intel® Advisor Users ](https://software.intel.com/content/www/us/en/develop/articles/advisor-roofline-resources.html " Roofline Resources for Intel® Advisor Users ")
+
 * [ Memory-Level Roofline Analysis in Intel® Advisor ](https://software.intel.com/content/www/us/en/develop/articles/memory-level-roofline-model-with-advisor.html " Memory-Level Roofline Analysis in Intel® Advisor ")
-* [CPU / Memory Roofline Insights
-Perspective](https://software.intel.com/content/www/us/en/develop/documentation/advisor-user-guide/top/optimize-cpu-usage/cpu-roofline-perspective.html "CPU / Memory Roofline Insights
-Perspective")
-* [ Roofline Resources for Intel® Advisor Users ](https://software.intel.com/content/www/us/en/develop/articles/advisor-roofline-resources.html " Roofline Resources for Intel® Advisor Users ")
+
+* [ Identify Bottlenecks Iteratively: Cache-Aware Roofline ](https://www.intel.com/content/www/us/en/docs/advisor/cookbook/2024-2/identify-bottlenecks-cache-aware-roofline.html " Identify Bottlenecks Iteratively: Cache-Aware Roofline ")
+
+* [ Samuel Williams, Andrew Waterman, and David Patterson [2009]. Roofline: an insightful visual performance model for multicore architectures ](https://dl.acm.org/doi/10.1145/1498765.1498785 " Roofline: an insightful visual performance model for multicore architectures ")
+
+* [ A. Ilic, F. Pratas and L. Sousa [2014]. Cache-aware Roofline model: Upgrading the loft ](https://ieeexplore.ieee.org/document/6506838 " Cache-aware Roofline model: Upgrading the loft ")
+
+* [ Understanding the Roofline Model by Durganshu Mishra ](https://hackernoon.com/understanding-the-roofline-model " Understanding the Roofline Model ")
+
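The prerequisites listed in the README above can be checked before launching a run. The following is a minimal sketch, not part of this PR: the helper names (`missing_tools`, `read_kernel_setting`, `check_prerequisites`) are hypothetical, and it assumes only the tool names and `/proc` values that the README mentions.

```python
import shutil
from pathlib import Path

# Tool names and kernel settings are taken from the README prerequisites
REQUIRED_TOOLS = ("advixe-cl", "icx", "numactl")
KERNEL_SETTINGS = {
    "/proc/sys/kernel/yama/ptrace_scope": 0,
    "/proc/sys/kernel/perf_event_paranoid": 1,
}


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def read_kernel_setting(path):
    """Return the integer stored in `path`, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def check_prerequisites():
    """Collect human-readable descriptions of unmet prerequisites."""
    problems = [f"`{tool}` not found on PATH" for tool in missing_tools()]
    for path, suggested in KERNEL_SETTINGS.items():
        value = read_kernel_setting(path)
        # Unreadable files (e.g. on non-Linux systems) are silently skipped
        if value is not None and value != suggested:
            problems.append(f"{path} is {value}, the README suggests {suggested}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

An empty result does not guarantee that profiling will work; it only means that none of the documented prerequisites is visibly missing.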
diff --git a/benchmarks/user/advisor/roofline.py b/benchmarks/user/advisor/roofline.py
@@ -18,23 +18,22 @@
 import sys
 import os
 
-from benchmarks.user.advisor.advisor_logging import check, err, log
+from advisor_logging import check, err, log
 
 
 try:
     import advisor
 except ImportError:
     check(False, 'Error: Intel Advisor could not be found on the system,'
           ' make sure to source environment variables properly. Information can be'
-          ' found at https://software.intel.com/content/www/us/en/develop/'
-          'documentation/advisor-user-guide/top/launch-the-intel-advisor/'
-          'intel-advisor-cli/setting-and-using-intel-advisor-environment-variables.html')
+          ' found at https://www.intel.com/content/www/us/en/docs/advisor/'
+          'user-guide/2024-2/set-up-environment-variables.html')
     sys.exit(1)
 
 
 matplotlib.use('Agg')
 # Use fancy plot colors
-plt.style.use('seaborn-darkgrid')
+plt.style.use('ggplot')
 
 
 @click.command()
@@ -65,21 +64,25 @@
 def roofline(name, project, scale, precision, mode, th):
     pd.options.display.max_rows = 20
 
-    log('Opening project...')
+    log(f'Opening project {project}...')
     project = advisor.open_project(str(project))
 
     if not project:
-        err('Could not open project %s.' % project)
+        err(f'Could not open project {project}.')
     log('Loading data...')
 
     data = project.load(advisor.SURVEY)
     rows = [{col: row[col] for col in row} for row in data.bottomup]
     roofs = data.get_roofs()
 
+    # Avoid the pandas downcasting deprecation warning in `replace`; see:
+    # https://github.com/pandas-dev/pandas/issues/57734
+    pd.set_option('future.no_silent_downcasting', True)
     full_df = pd.DataFrame(rows).replace('', np.nan)
 
     # Narrow down the columns to those of interest
     try:
+        analysis_columns = ['loop_name', 'self_ai', 'self_gflops', 'self_time']
         df = full_df[analysis_columns].copy()
     except KeyError:
         err('Could not read data columns from profiling. Not enough data has been '
@@ -168,8 +171,8 @@ def roofline(name, project, scale, precision, mode, th):
             label_x = row.self_ai + (row.self_ai + ai_max - 2 * ai_min) * (2**0.005 - 1)
             label_y = row.self_gflops
             ax.text(label_x, label_y,
-                    'Time: %.2fs\n'
-                    'Incidence: %.0f%%' % (row.self_time, row.percent_weight),
+                    f'Time: {row.self_time:.2f}s\n'
+                    f'Incidence: {row.percent_weight:.0f}%',
                     bbox={'boxstyle': 'round', 'facecolor': 'white'}, fontsize=8)
         top_loops_data = [{'ai': row.self_ai,
                            'gflops': row.self_gflops,
@@ -198,19 +201,18 @@ def roofline(name, project, scale, precision, mode, th):
     legend = plt.legend(loc='center left', bbox_to_anchor=(1, 0.5),
                         prop={'size': 7}, title='Rooflines')
 
-    # saving the chart in PNG format
-    plt.savefig('%s.png' % name, bbox_extra_artists=(legend,), bbox_inches='tight')
+    # saving the chart in PDF format
+    plt.savefig(f'{name}.pdf', bbox_extra_artists=(legend,), bbox_inches='tight')
     figpath = os.path.realpath(__file__).split(os.path.basename(__file__))[0]
-    log('Figure saved in %s%s.png.' % (figpath, name))
+    log(f'\nFigure saved in {figpath}{name}.pdf.')
 
     # Save the JSON file
     with open('%s.json' % name, 'w') as f:
         f.write(json.dumps(roofline_data))
 
-    log('JSON file saved as %s.json.' % name)
+    log(f'\nJSON file saved as {name}.json.')
+    log('Done!')
 
 
-analysis_columns = ['loop_name', 'self_ai', 'self_gflops', 'self_time']
-
 if __name__ == '__main__':
     roofline()
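For readers new to the model, the roofs that `roofline.py` plots follow the classic roofline bound described by Williams et al.: attainable performance is the smaller of the compute peak and the arithmetic intensity multiplied by the memory bandwidth. The sketch below illustrates that bound with made-up machine numbers; the function names and the figures are illustrative assumptions, not values or APIs from Advisor.

```python
def attainable_gflops(ai, peak_gflops, bandwidth_gbs):
    """Roofline bound: min(compute peak, arithmetic intensity * bandwidth)."""
    return min(peak_gflops, ai * bandwidth_gbs)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which the memory roof meets the compute roof."""
    return peak_gflops / bandwidth_gbs


# Hypothetical machine: 1000 GFLOP/s peak, 100 GB/s memory bandwidth
PEAK = 1000.0
BANDWIDTH = 100.0

for ai in (0.5, 10.0, 40.0):
    bound = attainable_gflops(ai, PEAK, BANDWIDTH)
    regime = 'memory-bound' if ai < ridge_point(PEAK, BANDWIDTH) else 'compute-bound'
    print(f'AI={ai:5.1f} FLOP/byte -> at most {bound:7.1f} GFLOP/s ({regime})')
```

A loop plotted well below the roof at its arithmetic intensity has headroom; one sitting on the sloped part of the roof is limited by memory traffic rather than by compute.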
diff --git a/benchmarks/user/advisor/run_advisor.py b/benchmarks/user/advisor/run_advisor.py
@@ -1,16 +1,14 @@
+import click
 import datetime
 import logging
 import os
+import sys
+
 from pathlib import Path
 from subprocess import check_output, PIPE, Popen
-import sys
 from tempfile import gettempdir, mkdtemp
 
-import click
-
-
-from benchmarks.user.advisor.advisor_logging import (check, log, progress,
-                                                     log_process)
+from advisor_logging import check, log, progress, log_process
 
 
 @click.command()
@@ -30,40 +28,43 @@
                                    'in --exec-args (if any).')
 def run_with_advisor(path, output, name, exec_args):
     path = Path(path)
-    check(path.is_file(), '%s not found' % path)
-    check(path.suffix == '.py', '%s not a Python file' % path)
+    check(path.is_file(), f'{path} not found')
+    check(path.suffix == '.py', f'{path} not a Python file')
 
     # Create a directory to store the profiling report
     if name is None:
         name = path.stem
         if exec_args:
-            name = "%s_%s" % (name, ''.join(exec_args.split()))
+            name = f"{name}_{''.join(exec_args.split())}"
     if output is None:
         output = Path(gettempdir()).joinpath('devito-profilings')
         output.mkdir(parents=True, exist_ok=True)
     else:
         output = Path(output)
     if name is None:
-        output = Path(mkdtemp(dir=str(output), prefix="%s-" % name))
+        output = Path(mkdtemp(dir=str(output), prefix=f"{name}-"))
     else:
         output = Path(output).joinpath(name)
         output.mkdir(parents=True, exist_ok=True)
 
-    # Intel Advisor and Intel compilers must be available through either Intel Parallel
-    # Studio or Intel oneAPI (currently tested versions include IPS 2020 Update 2 and
-    # oneAPI 2021 beta08)
+    # advixe-cl and icx should be available through Intel oneAPI
+    # (tested with Intel oneAPI 2025.1)
     try:
         ret = check_output(['advixe-cl', '--version']).decode("utf-8")
+        log(f"Found advixe-cl version: {ret.strip()}\n")
     except FileNotFoundError:
-        check(False, "Error: Couldn't detect `advixe-cl` to run Intel Advisor.")
+        check(False, "Error: Couldn't detect `advixe-cl` to run Intel Advisor."
+              " Please source the Advisor environment.")
 
     try:
-        ret = check_output(['icc', '--version']).decode("utf-8")
+        ret = check_output(['icx', '--version']).decode("utf-8")
+        log(f"Found icx version: {ret.strip()}\n")
     except FileNotFoundError:
-        check(False, "Error: Couldn't detect Intel Compiler (icc).")
+        check(False, "Error: Couldn't detect Intel Compiler (icx)."
+              " Please source the Intel oneAPI compilers.")
 
     # All good, Intel compiler and advisor are available
-    os.environ['DEVITO_ARCH'] = 'intel'
+    os.environ['DEVITO_ARCH'] = 'icx'
 
     # Tell Devito to instrument the generated code for Advisor
     os.environ['DEVITO_PROFILING'] = 'advisor'
@@ -73,7 +74,7 @@ def run_with_advisor(path, output, name, exec_args):
     if devito_logging is None:
         os.environ['DEVITO_LOGGING'] = 'WARNING'
 
-    with progress('Setting up multi-threading environment'):
+    with progress('Setting up multi-threading environment with OpenMP'):
         # Roofline analyses are recommended with threading enabled
         os.environ['DEVITO_LANGUAGE'] = 'openmp'
 
@@ -84,20 +85,19 @@ def run_with_advisor(path, output, name, exec_args):
             ret = check_output(['numactl', '--show']).decode("utf-8")
             ret = dict(i.split(':') for i in ret.split('\n') if i)
             n_sockets = len(ret['cpubind'].split())
-            n_cores = len(ret['physcpubind'].split())  # noqa
         except FileNotFoundError:
             check(False, "Couldn't detect `numactl`, necessary for thread pinning.")
 
         # Prevent NumPy from using threads, which otherwise leads to a deadlock when
         # used in combination with Advisor. This issue has been described at:
-        #     `software.intel.com/en-us/forums/intel-advisor-xe/topic/780506`
+        #   `software.intel.com/en-us/forums/intel-advisor-xe/topic/780506`
         # Note: we should rather sniff the BLAS library used by NumPy, and set the
         # appropriate env var only
         os.environ['OPENBLAS_NUM_THREADS'] = '1'
         os.environ['MKL_NUM_THREADS'] = '1'
         # Note: `Numaexpr`, used by NumPy, also employs threading, so we shall disable
         # it too via the corresponding env var. See:
-        #     `stackoverflow.com/questions/17053671/python-how-do-you-stop-numpy-from-multithreading`  # noqa
+        #   `stackoverflow.com/questions/17053671/python-how-do-you-stop-numpy-from-multithreading`  # noqa
         os.environ['NUMEXPR_NUM_THREADS'] = '1'
 
     # To build a roofline with Advisor, we need to run two analyses back to
@@ -130,8 +130,8 @@ def run_with_advisor(path, output, name, exec_args):
     ]
     py_cmd = [sys.executable, str(path)] + exec_args.split()
 
-    # Before collecting the `survey` and `tripcounts` a "pure" python run to warmup the
-    # jit cache is preceded
+    # Before collecting the `survey` and `tripcounts` data, a "pure" Python run
+    # warms up the JIT cache
 
     log('Starting Intel Advisor\'s `roofline` analysis for `%s`' % name)
     dt = datetime.datetime.now()
@@ -147,6 +147,9 @@ def run_with_advisor(path, output, name, exec_args):
     advixe_handler.setFormatter(advixe_formatter)
     advixe_logger.addHandler(advixe_handler)
 
+    log(f"Project folder: {output}")
+    log(f"Logging progress in: `{advixe_handler.baseFilename}`")
+
     with progress('Performing `cache warm-up` run'):
         try:
             p_warm_up = Popen(py_cmd, stdout=PIPE, stderr=PIPE)
@@ -170,10 +173,12 @@ def run_with_advisor(path, output, name, exec_args):
         except OSError:
             check(False, 'Failed!')
 
-    log('Storing `survey` and `tripcounts` data in `%s`' % str(output))
+    log(f'Storing `survey` and `tripcounts` data in `{output}`')
     log('To plot a roofline type: ')
-    log('python3 roofline.py --name %s --project %s --scale %f'
-        % (name, str(output), n_sockets))
+    log(f'python3 roofline.py --name {name} --project {output} --scale {n_sockets}')
+
+    log('\nTo open the roofline using advixe-gui: ')
+    log(f'advixe-gui {output}')
 
 
 if __name__ == '__main__':
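`run_advisor.py` derives `n_sockets` from the output of `numactl --show` and passes it on as the `--scale` argument suggested for `roofline.py`. The sketch below isolates that parsing step so it can be tried without `numactl` installed. The sample output is illustrative rather than captured from a real machine, `count_sockets` is a hypothetical helper, and splitting on the first colon only is a small defensive variation on the script's `i.split(':')`.

```python
# Illustrative `numactl --show` output for a two-socket machine (assumed, not real)
SAMPLE_NUMACTL_SHOW = """\
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7
cpubind: 0 1
nodebind: 0 1
membind: 0 1
"""


def count_sockets(numactl_output):
    """Count the entries of the `cpubind` field, as run_advisor.py does."""
    fields = dict(line.split(':', 1)
                  for line in numactl_output.split('\n') if line)
    return len(fields['cpubind'].split())


if __name__ == '__main__':
    print(count_sockets(SAMPLE_NUMACTL_SHOW))
```

If the `cpubind` field is missing from the output, the lookup raises `KeyError`, which matches the behaviour of the original expression.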

diff --git a/conftest.py b/conftest.py
@@ -8,7 +8,7 @@
 from devito import Eq, configuration, Revolver  # noqa
 from devito.checkpointing import NoopRevolver
 from devito.finite_differences.differentiable import EvalDerivative
-from devito.arch import Cpu64, Device, sniff_mpi_distro, Arm
+from devito.arch import Cpu64, Device, sniff_mpi_distro, Arm, get_advisor_path
 from devito.arch.compiler import (compiler_registry, IntelCompiler, OneapiCompiler,
                                   NvidiaCompiler)
 from devito.ir.iet import (FindNodes, FindSymbols, Iteration, ParallelBlock,
@@ -32,8 +32,8 @@ def skipif(items, whole_module=False):
     # Sanity check
     accepted = set()
     accepted.update({'device', 'device-C', 'device-openmp', 'device-openacc',
-                     'device-aomp', 'cpu64-icc', 'cpu64-icx', 'cpu64-nvc', 'cpu64-arm',
-                     'cpu64-icpx', 'chkpnt'})
+                     'device-aomp', 'cpu64-icc', 'cpu64-icx', 'cpu64-nvc',
+                     'noadvisor', 'cpu64-arm', 'cpu64-icpx', 'chkpnt'})
     accepted.update({'nodevice'})
     unknown = sorted(set(items) - accepted)
     if unknown:
@@ -79,6 +79,12 @@ def skipif(items, whole_module=False):
            isinstance(configuration['platform'], Cpu64):
             skipit = "`icx+cpu64` won't work with this test"
             break
+        # Skip if icx or advisor are not available
+        if i == 'noadvisor' and \
+            (not isinstance(configuration['compiler'], IntelCompiler) or
+             not get_advisor_path()):
+            skipit = "Only `icx+advisor` should be tested here"
+            break
         # Skip if it won't run on Arm
         if i == 'cpu64-arm' and isinstance(configuration['platform'], Arm):
             skipit = "Arm doesn't support x86-specific instructions"
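The condition in the new `noadvisor` branch can be read as a small truth table. The sketch below restates it with plain values standing in for `isinstance(configuration['compiler'], IntelCompiler)` and `get_advisor_path()`. The function `should_skip_noadvisor` is hypothetical, and it assumes that `get_advisor_path()` returns something falsy when Advisor is not found, which is inferred from the `not get_advisor_path()` test rather than from its implementation.

```python
def should_skip_noadvisor(compiler_is_intel, advisor_path):
    """Mirror the `noadvisor` skip condition using plain values."""
    return (not compiler_is_intel) or (not advisor_path)


if __name__ == '__main__':
    # '/opt/intel/oneapi/advisor/latest' is only an example installation path
    cases = [(True, '/opt/intel/oneapi/advisor/latest'), (True, None),
             (False, '/opt/intel/oneapi/advisor/latest'), (False, None)]
    for compiler_is_intel, advisor_path in cases:
        skip = should_skip_noadvisor(compiler_is_intel, advisor_path)
        print(f'intel={compiler_is_intel!s:5} advisor={advisor_path!s:35} skip={skip}')
```

Only the combination of an Intel compiler and a discoverable Advisor installation lets a test marked with `noadvisor` run.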