Merge pull request #2533 from devitocodes/advisor_refresh_II
misc: Update advisor with oneAPI 2025
FabioLuporini authored Mar 7, 2025
2 parents e1dcb86 + fd5df3f commit b22dd66
Showing 9 changed files with 234 additions and 267 deletions.
96 changes: 68 additions & 28 deletions benchmarks/user/advisor/README.md
@@ -1,48 +1,88 @@
Example runs:
# Intel Advisor roofline profiling on Devito

* `python3 run_advisor.py --name isotropic --path <path-to-devito>/examples/seismic/acoustic/acoustic_example.py`
* `python3 run_advisor.py --name tti_so8 --path <path-to-devito>/examples/seismic/tti/tti_example.py --exec-args "-so 8"`
* `python3 run_advisor.py --name iso_ac_so6 --path <path-to-devito>/benchmarks/user/benchmark.py --exec-args "bench -P acoustic -so 6 --tn 200 -d 100 100 100 --autotune off -x 1"`
This README aims to help users derive rooflines for Devito using [Intel Advisor](https://www.intel.com/content/www/us/en/developer/tools/oneapi/advisor.html).
We recommend going through the tutorial [02_advisor_roofline.ipynb](https://github.com/devitocodes/devito/blob/master/examples/performance/02_advisor_roofline.ipynb) for more detailed step-by-step guidance.

After the run has finished you should be able to plot a roofline with the results and export roofline data to JSON using:
* `python3 roofline.py --name Roofline --project <advisor-project-name>`
### Prerequisites:
* Support is guaranteed only for Intel oneAPI 2025; earlier versions may not work.
You may download Intel oneAPI [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?packages=oneapi-toolkit&oneapi-toolkit-os=linux&oneapi-lin=apt).

To create a read-only snapshot for use with Intel Advisor GUI, use:
* `advixe-cl --snapshot --project-dir=<advisor-project-name> pack -- /<new-snapshot-name>`
* Add Advisor (advixe-cl) and the compilers (icx) to the path. Source the environment variables along these lines (depending on your installation folder):
```sh
source /opt/intel/oneapi/advisor/latest/env/vars.sh
source /opt/intel/oneapi/compiler/latest/env/vars.sh
```

Prerequisites:
* Support guaranteed only for Intel Advisor as installed with Intel Parallel Studio v 2020 Update 2
and Intel oneAPI 2021; earlier years may not work; other 2020/2021 versions, as well as later years,
may or may not work.
* In Linux systems you may need to enable system-wide profiling by setting:
- `/proc/sys/kernel/yama/ptrace_scope` to `0`
- `/proc/sys/kernel/perf_event_paranoid` to `1`

* `numactl` must be available on the system. If not available, install with:
`sudo apt-get install numactl`
```sh
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
echo 1 | sudo tee /proc/sys/kernel/perf_event_paranoid
```

* `numactl` must be available on the system. If not available, install using:
```sh
sudo apt-get install numactl
```
* Install `pandas` and `matplotlib`. They are not included in the core Devito installation.
```sh
pip install pandas matplotlib
```


### Example runs:

Limitations:
```bash
# The isotropic acoustic example
python3 run_advisor.py --name isotropic --path <path-to-devito>/examples/seismic/acoustic/acoustic_example.py
# The isotropic elastic example
python3 run_advisor.py --name iso_elastic --path <path-to-devito>/examples/seismic/elastic/elastic_example.py --exec-args "-so 4"
# The anisotropic acoustic (TTI) example
python3 run_advisor.py --name tti_so8 --path <path-to-devito>/examples/seismic/tti/tti_example.py --exec-args "-so 8"
```

* Untested with more complicated examples.
* Untested on Intel KNL (we might need to ask `numactl` to bind to MCDRAM).
* Running the `tripcounts` analysis takes a lot, despite starting in paused
After the run has finished you should be able to save a `.json` and plot the
roofline with the results:
```bash
python3 roofline.py --name Roofline --project <advisor-project-name>
```

To create a read-only snapshot for use with Intel Advisor GUI, use:
```bash
advixe-cl --snapshot --project-dir=<advisor-project-name> pack -- /<new-snapshot-name>
```
### Limitations:

* Not tested with all possible examples that Devito can support.
* Running the `tripcounts` analysis is time-consuming, despite starting in paused
mode. This analysis, together with the `survey` analysis, is necessary to
generate a roofline. Both are run by `run_advisor.py`.
* Requires python3, untested in earlier versions of python and conda environments
* Currently requires download of repository and running `pip3 install .`, the scripts
* Requires Python 3.9 or later, untested in conda environments
* Currently requires download of repository and running `pip install .`, the scripts
are currently not included as a package with the user installation of Devito

TODO:
### TODO:

* Give a name to the points in the roofline, otherwise it's challenging to
relate loops (code sections) to data.
* Emit a report summarizing the configuration used to run the analysis
(threading, socket binding, ...).

Useful links:
### Useful links:

* [ Intel® Advisor Performance Optimization Cookbook ](https://www.intel.com/content/www/us/en/docs/advisor/cookbook/2024-2/overview.html " Intel® Advisor Performance Optimization Cookbook ")

* [ Intel® Advisor User Guide ](https://www.intel.com/content/www/us/en/docs/advisor/cookbook/2024-2/overview.html " Intel® Advisor User Guide ")

* [ Roofline Resources for Intel® Advisor Users ](https://software.intel.com/content/www/us/en/develop/articles/advisor-roofline-resources.html " Roofline Resources for Intel® Advisor Users ")

* [ Memory-Level Roofline Analysis in Intel® Advisor ](https://software.intel.com/content/www/us/en/develop/articles/memory-level-roofline-model-with-advisor.html " Memory-Level Roofline Analysis in Intel® Advisor ")
* [CPU / Memory Roofline Insights
Perspective](https://software.intel.com/content/www/us/en/develop/documentation/advisor-user-guide/top/optimize-cpu-usage/cpu-roofline-perspective.html "CPU / Memory Roofline Insights
Perspective")
* [ Roofline Resources for Intel® Advisor Users ](https://software.intel.com/content/www/us/en/develop/articles/advisor-roofline-resources.html " Roofline Resources for Intel® Advisor Users ")

* [ Identify Bottlenecks Iteratively: Cache-Aware Roofline ](https://www.intel.com/content/www/us/en/docs/advisor/cookbook/2024-2/identify-bottlenecks-cache-aware-roofline.html " Identify Bottlenecks Iteratively: Cache-Aware Roofline ")

* [ Samuel Williams, Andrew Waterman, and David Patterson [2009]. Roofline: an insightful visual performance model for multicore architectures ](https://dl.acm.org/doi/10.1145/1498765.1498785 " Roofline: an insightful visual performance model for multicore architectures ")

* [ A. Ilic, F. Pratas and L. Sousa [2014]. Cache-aware Roofline model: Upgrading the loft ](https://ieeexplore.ieee.org/document/6506838 " Cache-aware Roofline model: Upgrading the loft ")

* [ Understanding the Roofline Model by Durganshu Mishra ](https://hackernoon.com/understanding-the-roofline-model " Understanding the Roofline Model ")
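
The prerequisites listed in the README (the two kernel settings, plus `advixe-cl`, `icx` and `numactl` on the path) can be checked before launching a profiling run. The following is a minimal sketch, not part of Devito: the helper names `read_kernel_setting` and `check_prerequisites` are hypothetical, and the comparison is deliberately strict against the exact values the README recommends.

```python
import shutil
from pathlib import Path

# Values the README recommends for system-wide profiling on Linux
RECOMMENDED = {
    '/proc/sys/kernel/yama/ptrace_scope': 0,
    '/proc/sys/kernel/perf_event_paranoid': 1,
}
# Executables the README and run_advisor.py expect to find on the path
REQUIRED_TOOLS = ('advixe-cl', 'icx', 'numactl')


def read_kernel_setting(path):
    """Return the integer stored in `path`, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def check_prerequisites(settings=RECOMMENDED, tools=REQUIRED_TOOLS):
    """Hypothetical helper: return a list of human-readable problems."""
    problems = []
    for path, expected in settings.items():
        found = read_kernel_setting(path)
        if found != expected:
            problems.append(f'{path} is {found}, README recommends {expected}')
    for tool in tools:
        # shutil.which returns None when the executable is not on the path
        if shutil.which(tool) is None:
            problems.append(f'`{tool}` not found in PATH')
    return problems


if __name__ == '__main__':
    for problem in check_prerequisites():
        print(problem)
```

An empty list means every check passed. On systems without the Yama security module the `ptrace_scope` file may not exist, in which case this sketch reports it as `None`.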

32 changes: 17 additions & 15 deletions benchmarks/user/advisor/roofline.py
@@ -18,23 +18,22 @@
import sys
import os

from benchmarks.user.advisor.advisor_logging import check, err, log
from advisor_logging import check, err, log


try:
import advisor
except ImportError:
check(False, 'Error: Intel Advisor could not be found on the system,'
' make sure to source environment variables properly. Information can be'
' found at https://software.intel.com/content/www/us/en/develop/'
'documentation/advisor-user-guide/top/launch-the-intel-advisor/'
'intel-advisor-cli/setting-and-using-intel-advisor-environment-variables.html')
' found at https://www.intel.com/content/www/us/en/docs/advisor/'
'user-guide/2024-2/set-up-environment-variables.html')
sys.exit(1)


matplotlib.use('Agg')
# Use fancy plot colors
plt.style.use('seaborn-darkgrid')
plt.style.use('ggplot')


@click.command()
@@ -65,21 +64,25 @@
def roofline(name, project, scale, precision, mode, th):
pd.options.display.max_rows = 20

log('Opening project...')
log(f'Opening project {project}...')
project = advisor.open_project(str(project))

if not project:
err('Could not open project %s.' % project)
err(f'Could not open project {project}.')
log('Loading data...')

data = project.load(advisor.SURVEY)
rows = [{col: row[col] for col in row} for row in data.bottomup]
roofs = data.get_roofs()

# Following deprecation solution from here:
# https://github.com/pandas-dev/pandas/issues/57734
pd.set_option('future.no_silent_downcasting', True)
full_df = pd.DataFrame(rows).replace('', np.nan)

# Narrow down the columns to those of interest
try:
analysis_columns = ['loop_name', 'self_ai', 'self_gflops', 'self_time']
df = full_df[analysis_columns].copy()
except KeyError:
err('Could not read data columns from profiling. Not enough data has been '
@@ -168,8 +171,8 @@ def roofline(name, project, scale, precision, mode, th):
label_x = row.self_ai + (row.self_ai + ai_max - 2 * ai_min) * (2**0.005 - 1)
label_y = row.self_gflops
ax.text(label_x, label_y,
'Time: %.2fs\n'
'Incidence: %.0f%%' % (row.self_time, row.percent_weight),
f'Time: {row.self_time:.2f}s\n'
f'Incidence: {row.percent_weight:.0f}%',
bbox={'boxstyle': 'round', 'facecolor': 'white'}, fontsize=8)
top_loops_data = [{'ai': row.self_ai,
'gflops': row.self_gflops,
@@ -198,19 +201,18 @@ def roofline(name, project, scale, precision, mode, th):
legend = plt.legend(loc='center left', bbox_to_anchor=(1, 0.5),
prop={'size': 7}, title='Rooflines')

# saving the chart in PNG format
plt.savefig('%s.png' % name, bbox_extra_artists=(legend,), bbox_inches='tight')
# saving the chart in PDF format
plt.savefig(f'{name}.pdf', bbox_extra_artists=(legend,), bbox_inches='tight')
figpath = os.path.realpath(__file__).split(os.path.basename(__file__))[0]
log('Figure saved in %s%s.png.' % (figpath, name))
log(f'\nFigure saved in {figpath}{name}.pdf.')

# Save the JSON file
with open('%s.json' % name, 'w') as f:
f.write(json.dumps(roofline_data))

log('JSON file saved as %s.json.' % name)
log(f'\nJSON file saved as {name}.json.')
log('Done!')


analysis_columns = ['loop_name', 'self_ai', 'self_gflops', 'self_time']

if __name__ == '__main__':
roofline()
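
Each point that `roofline.py` plots is a `(self_ai, self_gflops)` pair for one loop, drawn against the roofs reported by Advisor. In the standard roofline model (see the Williams et al. reference in the README), the attainable performance at a given arithmetic intensity is the smaller of the compute roof and the memory roof. A minimal sketch of that bound follows; the function names and the machine numbers in the usage lines are illustrative assumptions, not values taken from Advisor.

```python
def attainable_gflops(ai, peak_gflops, bandwidth_gbs):
    """Roofline bound at arithmetic intensity `ai` (FLOP/byte).

    The memory roof grows linearly with `ai`; the compute roof is flat.
    """
    return min(peak_gflops, ai * bandwidth_gbs)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which the two roofs intersect."""
    return peak_gflops / bandwidth_gbs


# Hypothetical machine: 1000 GFLOP/s peak, 100 GB/s memory bandwidth
low_ai_bound = attainable_gflops(0.5, 1000.0, 100.0)    # memory-bound region
high_ai_bound = attainable_gflops(20.0, 1000.0, 100.0)  # compute-bound region
ridge = ridge_point(1000.0, 100.0)
```

A loop whose point lies left of the ridge is limited by memory bandwidth, while one to the right is limited by compute.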
57 changes: 31 additions & 26 deletions benchmarks/user/advisor/run_advisor.py
@@ -1,16 +1,14 @@
import click
import datetime
import logging
import os
import sys

from pathlib import Path
from subprocess import check_output, PIPE, Popen
import sys
from tempfile import gettempdir, mkdtemp

import click


from benchmarks.user.advisor.advisor_logging import (check, log, progress,
log_process)
from advisor_logging import check, log, progress, log_process


@click.command()
@@ -30,40 +28,43 @@
'in --exec-args (if any).')
def run_with_advisor(path, output, name, exec_args):
path = Path(path)
check(path.is_file(), '%s not found' % path)
check(path.suffix == '.py', '%s not a Python file' % path)
check(path.is_file(), f'{path} not found')
check(path.suffix == '.py', f'{path} not a Python file')

# Create a directory to store the profiling report
if name is None:
name = path.stem
if exec_args:
name = "%s_%s" % (name, ''.join(exec_args.split()))
name = f"{name}_{''.join(exec_args.split())}"
if output is None:
output = Path(gettempdir()).joinpath('devito-profilings')
output.mkdir(parents=True, exist_ok=True)
else:
output = Path(output)
if name is None:
output = Path(mkdtemp(dir=str(output), prefix="%s-" % name))
output = Path(mkdtemp(dir=str(output), prefix=f"{name}-"))
else:
output = Path(output).joinpath(name)
output.mkdir(parents=True, exist_ok=True)

# Intel Advisor and Intel compilers must be available through either Intel Parallel
# Studio or Intel oneAPI (currently tested versions include IPS 2020 Update 2 and
# oneAPI 2021 beta08)
# advixe-cl and icx should be available through Intel oneAPI
# (tested with Intel oneAPI 2025.1)
try:
ret = check_output(['advixe-cl', '--version']).decode("utf-8")
log(f"Found advixe-cl version: {ret.strip()}\n")
except FileNotFoundError:
check(False, "Error: Couldn't detect `advixe-cl` to run Intel Advisor.")
check(False, "Error: Couldn't detect `advixe-cl` to run Intel Advisor."
" Please source the Advisor environment.")

try:
ret = check_output(['icc', '--version']).decode("utf-8")
ret = check_output(['icx', '--version']).decode("utf-8")
log(f"Found icx version: {ret.strip()}\n")
except FileNotFoundError:
check(False, "Error: Couldn't detect Intel Compiler (icc).")
check(False, "Error: Couldn't detect Intel Compiler (icx)."
" Please source the Intel oneAPI compilers.")

# All good, Intel compiler and advisor are available
os.environ['DEVITO_ARCH'] = 'intel'
os.environ['DEVITO_ARCH'] = 'icx'

# Tell Devito to instrument the generated code for Advisor
os.environ['DEVITO_PROFILING'] = 'advisor'
@@ -73,7 +74,7 @@ def run_with_advisor(path, output, name, exec_args):
if devito_logging is None:
os.environ['DEVITO_LOGGING'] = 'WARNING'

with progress('Setting up multi-threading environment'):
with progress('Setting up multi-threading environment with OpenMP'):
# Roofline analyses are recommended with threading enabled
os.environ['DEVITO_LANGUAGE'] = 'openmp'

@@ -84,20 +85,19 @@ def run_with_advisor(path, output, name, exec_args):
ret = check_output(['numactl', '--show']).decode("utf-8")
ret = dict(i.split(':') for i in ret.split('\n') if i)
n_sockets = len(ret['cpubind'].split())
n_cores = len(ret['physcpubind'].split()) # noqa
except FileNotFoundError:
check(False, "Couldn't detect `numactl`, necessary for thread pinning.")

# Prevent NumPy from using threads, which otherwise leads to a deadlock when
# used in combination with Advisor. This issue has been described at:
# `software.intel.com/en-us/forums/intel-advisor-xe/topic/780506`
# `software.intel.com/en-us/forums/intel-advisor-xe/topic/780506`
# Note: we should rather sniff the BLAS library used by NumPy, and set the
# appropriate env var only
os.environ['OPENBLAS_NUM_THREADS'] = '1'
os.environ['MKL_NUM_THREADS'] = '1'
# Note: `NumExpr`, used by NumPy, also employs threading, so we shall disable
# it too via the corresponding env var. See:
# `stackoverflow.com/questions/17053671/python-how-do-you-stop-numpy-from-multithreading` # noqa
# `stackoverflow.com/questions/17053671/python-how-do-you-stop-numpy-from-multithreading` # noqa
os.environ['NUMEXPR_NUM_THREADS'] = '1'

# To build a roofline with Advisor, we need to run two analyses back to
@@ -130,8 +130,8 @@ def run_with_advisor(path, output, name, exec_args):
]
py_cmd = [sys.executable, str(path)] + exec_args.split()

# Before collecting the `survey` and `tripcounts` a "pure" python run to warmup the
# jit cache is preceded
# Before collecting the `survey` and `tripcounts` data, a "pure" Python run
# warms up the JIT cache

log('Starting Intel Advisor\'s `roofline` analysis for `%s`' % name)
dt = datetime.datetime.now()
@@ -147,6 +147,9 @@ def run_with_advisor(path, output, name, exec_args):
advixe_handler.setFormatter(advixe_formatter)
advixe_logger.addHandler(advixe_handler)

log(f"Project folder: {output}")
log(f"Logging progress in: `{advixe_handler.baseFilename}`")

with progress('Performing `cache warm-up` run'):
try:
p_warm_up = Popen(py_cmd, stdout=PIPE, stderr=PIPE)
@@ -170,10 +173,12 @@ def run_with_advisor(path, output, name, exec_args):
except OSError:
check(False, 'Failed!')

log('Storing `survey` and `tripcounts` data in `%s`' % str(output))
log(f'Storing `survey` and `tripcounts` data in `{output}`')
log('To plot a roofline type: ')
log('python3 roofline.py --name %s --project %s --scale %f'
% (name, str(output), n_sockets))
log(f'python3 roofline.py --name {name} --project {output} --scale {n_sockets}')

log('\nTo open the roofline using advixe-gui: ')
log(f'advixe-gui {output}')


if __name__ == '__main__':
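
`run_advisor.py` derives the socket count, later passed to `roofline.py` as `--scale`, from the `cpubind` field of `numactl --show`. The sketch below parses a captured sample of that output instead of calling `numactl`, so it runs anywhere. The sample text and the helper name `parse_numactl_show` are assumptions for illustration, and it uses `str.partition` rather than the script's `split(':')` so that a line with more than one colon would not raise.

```python
# Illustrative sample of `numactl --show` output on a two-socket machine
SAMPLE = """\
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7
cpubind: 0 1
nodebind: 0 1
membind: 0 1
"""


def parse_numactl_show(text):
    """Hypothetical helper: map each field name to its list of values."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(':')
        if sep:  # skip blank lines and lines without a colon
            fields[key.strip()] = value.split()
    return fields


fields = parse_numactl_show(SAMPLE)
n_sockets = len(fields['cpubind'])   # nodes the process may run on
n_cpus = len(fields['physcpubind'])  # CPUs the process may run on
```

With the sample above, `n_sockets` is 2 and `n_cpus` is 8.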
12 changes: 9 additions & 3 deletions conftest.py
@@ -8,7 +8,7 @@
from devito import Eq, configuration, Revolver # noqa
from devito.checkpointing import NoopRevolver
from devito.finite_differences.differentiable import EvalDerivative
from devito.arch import Cpu64, Device, sniff_mpi_distro, Arm
from devito.arch import Cpu64, Device, sniff_mpi_distro, Arm, get_advisor_path
from devito.arch.compiler import (compiler_registry, IntelCompiler, OneapiCompiler,
NvidiaCompiler)
from devito.ir.iet import (FindNodes, FindSymbols, Iteration, ParallelBlock,
@@ -32,8 +32,8 @@ def skipif(items, whole_module=False):
# Sanity check
accepted = set()
accepted.update({'device', 'device-C', 'device-openmp', 'device-openacc',
'device-aomp', 'cpu64-icc', 'cpu64-icx', 'cpu64-nvc', 'cpu64-arm',
'cpu64-icpx', 'chkpnt'})
'device-aomp', 'cpu64-icc', 'cpu64-icx', 'cpu64-nvc',
'noadvisor', 'cpu64-arm', 'cpu64-icpx', 'chkpnt'})
accepted.update({'nodevice'})
unknown = sorted(set(items) - accepted)
if unknown:
@@ -79,6 +79,12 @@ def skipif(items, whole_module=False):
isinstance(configuration['platform'], Cpu64):
skipit = "`icx+cpu64` won't work with this test"
break
# Skip if icx or advisor are not available
if i == 'noadvisor' and \
(not isinstance(configuration['compiler'], IntelCompiler) or
not get_advisor_path()):
skipit = "Only `icx+advisor` should be tested here"
break
# Skip if it won't run on Arm
if i == 'cpu64-arm' and isinstance(configuration['platform'], Arm):
skipit = "Arm doesn't support x86-specific instructions"