This repository is designed to test CARMOT on benchmark suites. This repository includes the artifact evaluation materials for the CARMOT CGO 2023 paper: "Program State Element Characterization".
This artifact generates the main results of the paper shown in Figures 6, 7, 10, and 11 in text format. The artifact is a podman image that runs Ubuntu 20.04 and already contains the NAS and PARSEC3 benchmark suites. We cannot share SPEC CPU 2017 directly; a reviewer who has access to SPEC CPU 2017 can include it by following the SPEC CPU 2017 instructions later in this document.
We open-sourced CARMOT, the infrastructure we built to evaluate CARMOT on several benchmark suites (e.g., NAS, PARSEC3, SPEC CPU 2017), and CARMOT's dependencies NOELLE and VIRGIL. This artifact downloads all necessary software dependencies that are not already included in the podman image by cloning the open-source repositories from GitHub. Please make sure the machine has a network connection when running the artifact.
This artifact requires a Linux-based machine with podman installed.
To ensure Pin-related results are collected correctly, the host machine must have /proc/sys/kernel/yama/ptrace_scope set to 0, because this value is shared with the podman container where the artifact runs.
If /proc/sys/kernel/yama/ptrace_scope is not set to 0, run the following (as root) before running the podman container:
# echo 0 > /proc/sys/kernel/yama/ptrace_scope
Note that this value is reset to its default every time the machine is rebooted.
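Because the setting is reset at every reboot, it can be worth re-checking it right before starting the container. The following is a minimal sketch of such a check; the function name check_ptrace_scope and its optional file argument are hypothetical and not part of the artifact.

```shell
# Hypothetical helper (not shipped with the artifact): report whether ptrace_scope is 0.
# The optional first argument overrides the file to read, which is mainly useful for testing.
check_ptrace_scope() {
    scope_file="${1:-/proc/sys/kernel/yama/ptrace_scope}"
    if [ ! -r "$scope_file" ]; then
        echo "cannot read $scope_file"
        return 2
    fi
    value="$(cat "$scope_file")"
    if [ "$value" = "0" ]; then
        echo "ok: ptrace_scope is 0"
        return 0
    fi
    echo "ptrace_scope is $value; set it to 0 as root before starting the container"
    return 1
}
```

Calling check_ptrace_scope with no arguments on the host reads the real setting; a non-zero exit status means the value should be changed before running podman.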
In order to evaluate this artifact correctly, an Intel multicore processor with shared memory is necessary. The required amount of main memory is 125 GiB to ensure all runs do not go to swap, which can increase the measured execution time of the experiments. To ensure the accuracy of execution time measurements, all frequency scaling mechanisms (e.g., TurboBoost) have to be disabled and the machine must be idle (no other compute or memory intensive process can run on the machine during the execution of the experiments). The required amount of disk space for the whole (fully unpacked) podman container is approximately 200 GB.
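The memory requirement can be checked on the host before committing to a multi-day run. The sketch below reads MemTotal from /proc/meminfo and compares it with a threshold in GiB; the function name has_enough_memory and its arguments are hypothetical. MemTotal excludes memory reserved by the kernel, so a value slightly below the installed amount is normal and a near miss should be treated as a warning rather than a hard failure.

```shell
# Hypothetical helper: succeed if MemTotal in a meminfo-style file is at least the given GiB.
# Arguments: required GiB, then (optionally) the file to read instead of /proc/meminfo.
has_enough_memory() {
    required_gib="$1"
    meminfo="${2:-/proc/meminfo}"
    # MemTotal is reported in kB (units of 1024 bytes); 1 GiB = 1048576 of those units.
    total_kib="$(awk '/^MemTotal:/ {print $2}' "$meminfo")"
    [ -n "$total_kib" ] || return 2
    [ "$total_kib" -ge $((required_gib * 1048576)) ]
}
```

For example, has_enough_memory 125 succeeds only if the host reports at least 125 GiB of total memory.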
Two sets of results can be generated with this artifact: Minimal and Full. Results for NAS and PARSEC3 will be generated by default. If reviewers are allowed to use SPEC CPU 2017 for artifact evaluation purposes and they would like to generate the results for SPEC CPU 2017, they will have to include it manually in the podman container (see the SPEC CPU 2017 instructions later in this document).
These are the Minimal results that should be evaluated in this artifact. They consist of:
- The black and red speedup bars of Figure 6.
- The CARMOT overhead (red bars) in the OpenMP use case of Figure 7.
- The CARMOT overhead (red bars) in the C++ Smart Pointer use case of Figure 10.
- The CARMOT overhead (red bars) in the STATS use case of Figure 11.
Adding SPEC CPU 2017 is optional.
NOTE: Computing the Minimal results without SPEC CPU 2017 takes approximately 2 days (on the machine used in the paper). Adding SPEC CPU 2017 increases the time to approximately 4 days.
The Full set of results of the paper consists of the Minimal Results plus the Naive approach black bars of Figures 7, 10, 11 of the paper. Adding SPEC CPU 2017 is optional.
NOTE: Computing the Full results without SPEC CPU 2017 takes approximately 4 days. Adding SPEC CPU 2017 increases the time to approximately 6 days.
To run the experiments do as follows.
Download the artifact (i.e., podman image carmot.tar) following the DOI in the paper appendix.
Load and run the podman image carmot.tar interactively:
$ podman load < carmot.tar
$ podman run --rm -it carmot /bin/bash
This will open a shell inside the podman container.
From inside the podman container, the entry point to generate the Minimal set of results is the script bin/carmot_experiments.
It must be invoked as follows with no arguments, and can be run in the background (progress can be checked inside the carmot_experiments_output.txt file):
$ ./bin/carmot_experiments &
Alternatively, the Full set of results can be generated by setting the environment variable CARMOT_FULL to 1:
$ export CARMOT_FULL=1 ; ./bin/carmot_experiments &
Additionally, the reviewer can control how many times each data point is executed by setting the environment variable CARMOT_NUM_RUNS to an integer strictly greater than 0 (the default is CARMOT_NUM_RUNS=3).
For example, to generate each data point 5 times for the Full results, the reviewer will invoke bin/carmot_experiments as follows:
$ export CARMOT_NUM_RUNS=5 ; export CARMOT_FULL=1 ; ./bin/carmot_experiments &
Note that the higher the value of CARMOT_NUM_RUNS, the more time it takes to run the experiments.
The customization of the experiments can be disabled by unsetting the corresponding environment variables:
$ unset CARMOT_NUM_RUNS ; unset CARMOT_FULL
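Since a run takes days, it may help to confirm that CARMOT_NUM_RUNS holds an integer strictly greater than 0 before launching the script. The following sketch shows one way to do that in plain shell; the function name is_valid_num_runs is hypothetical, and how bin/carmot_experiments itself reacts to an invalid value is not documented here.

```shell
# Hypothetical helper: accept only integers strictly greater than 0.
is_valid_num_runs() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a character that is not a digit
    esac
    [ "$1" -gt 0 ]
}
```

A possible use is to call is_valid_num_runs "${CARMOT_NUM_RUNS:-3}" first, where 3 is the documented default, and start the experiments only if it succeeds.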
The progress of the experiments can be monitored by looking at the carmot_experiments_output.txt file:
$ tail -f carmot_experiments_output.txt
Once bin/carmot_experiments finishes, the results of the experiments will be placed under results/current_machine in the running podman container.
This directory has the following structure:
results/current_machine/
├── fig10
│   └── carmot
│       ├── NAS
│       │   ├── overhead_blackbars.txt
│       │   └── overhead_redbars.txt
│       ├── PARSEC3
│       │   ├── overhead_blackbars.txt
│       │   └── overhead_redbars.txt
│       └── SPEC2017
│           ├── overhead_blackbars.txt
│           └── overhead_redbars.txt
├── fig11
│   └── carmot
│       └── PARSEC3
│           ├── overhead_blackbars.txt
│           └── overhead_redbars.txt
├── fig6
│   ├── carmot
│   │   ├── NAS
│   │   │   └── speedup.txt
│   │   ├── PARSEC3
│   │   │   └── speedup.txt
│   │   └── SPEC2017
│   │       └── speedup.txt
│   └── original_parallelism
│       ├── NAS
│       │   └── speedup.txt
│       ├── PARSEC3
│       │   ├── speedup_pthread.txt
│       │   └── speedup.txt
│       └── SPEC2017
│           └── speedup.txt
└── fig7
    └── carmot
        ├── NAS
        │   ├── overhead_blackbars.txt
        │   └── overhead_redbars.txt
        ├── PARSEC3
        │   ├── overhead_blackbars.txt
        │   └── overhead_redbars.txt
        └── SPEC2017
            ├── overhead_blackbars.txt
            └── overhead_redbars.txt
NOTE: if CARMOT_FULL is not set, the overhead_blackbars.txt files will not be generated.
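After a run, a quick way to confirm that the Minimal overhead results were produced is to look for the overhead_redbars.txt files listed in the structure above. The sketch below prints the ones that are missing; the function name missing_redbar_results is hypothetical, the paths are taken from the directory listing, and the contents of the files are not inspected.

```shell
# Hypothetical helper: print expected overhead_redbars.txt files that are missing.
# Arguments: results directory, then the suites to check for Figures 7 and 10 (e.g., NAS PARSEC3).
missing_redbar_results() {
    results_dir="$1"
    shift
    for suite in "$@"; do
        for fig in fig7 fig10; do
            f="$results_dir/$fig/carmot/$suite/overhead_redbars.txt"
            [ -f "$f" ] || echo "$f"
        done
    done
    # The directory structure lists only PARSEC3 under fig11.
    f="$results_dir/fig11/carmot/PARSEC3/overhead_redbars.txt"
    [ -f "$f" ] || echo "$f"
    return 0
}
```

For example, missing_redbar_results results/current_machine NAS PARSEC3 prints nothing when every expected file is present; add SPEC2017 to the suite list only if it was included in the container.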
The directory results/authors_machine contains the results computed by the authors, and follows the same structure.
Because CARMOT is an actively developed project, the authors_machine results have been updated and show some differences compared to the results reported in the Figures of the paper.
However, the claims made in the paper still hold.
The directory results/additional_authors_machines contains more results that the authors have computed on additional machines.
The execution time of the baselines can vary considerably depending on the architecture of the machine the artifact is running on. For this reason, the absolute values of speedup (Figure 6) and overhead (Figures 7, 10, 11) might differ significantly compared to the authors' ones.
Instead of comparing the absolute values, we recommend that reviewers check the validity of the following two main claims of the paper:
- For the Figure 6 speedup results, the Original parallelism and CARMOT-induced parallelism speedup values should be close to each other;
- For Figures 7, 10, 11 overhead results, the CARMOT overhead (red bars) should be considerably lower than the Naive approach overhead (black bars), often by one order of magnitude or more.
These two claims hold for all of the authors' results (the results computed on the machine described in the paper: results/authors_machine, and the additional included results: results/additional_authors_machines), even though the absolute values of speedup and overhead are different.
For this reason, we encourage the reviewers to compute the Full set of results by setting the environment variable CARMOT_FULL to 1 prior to starting the experiments.
If the reviewer is under time constraints, we suggest reducing CARMOT_NUM_RUNS.
If the reviewers are allowed to use SPEC CPU 2017 to evaluate this artifact, they can do so by opening a new shell on the host machine where the podman container is running and getting its CONTAINER_ID using:
$ podman ps
Then, copy your SPEC CPU 2017 tar.gz archive from the host to the running podman container using:
$ podman cp /path/to/your/SPECCPU2017/archive.tar.gz CONTAINER_ID:/home/cgo23ae/benchmarkSuites/SPEC2017.tar.gz
Note that your SPEC CPU 2017 archive must be a tar.gz archive and the name of the archive copied into the podman container must be SPEC2017.tar.gz.
Your SPEC CPU 2017 tar.gz archive must contain a single directory called SPEC2017 and its structure must be as follows:
SPEC2017
├── bin
├── cshrc
├── Docs
├── Docs.txt
├── install_archives
├── install.bat
├── install.sh
├── LICENSE
├── LICENSE.txt
├── MANIFEST
├── PTDaemon
├── README
├── README.txt
├── redistributable_sources
├── Revisions
├── shrc
├── shrc.bat
├── tools
├── uninstall.sh
└── version.txt
In order to correctly run the SPEC CPU 2017 experiments your SPEC CPU 2017 archive must be complete with both the source code of all benchmarks (speed and rate) and all inputs (test, train, reference), otherwise unexpected errors might happen.
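Before copying the archive into the container, its top-level layout can be checked on the host. The following sketch lists the archive with tar and verifies that the only top-level entry is SPEC2017; the function name has_single_spec_dir is hypothetical, and the check does not verify that benchmark sources and inputs are complete.

```shell
# Hypothetical helper: succeed if a tar.gz archive has exactly one top-level entry, named SPEC2017.
has_single_spec_dir() {
    archive="$1"
    # List entries, strip a leading "./", drop empty lines, keep the first path
    # component of each entry, and deduplicate.
    top_level="$(tar -tzf "$archive" | sed -e 's|^\./||' -e '/^$/d' | cut -d/ -f1 | sort -u)"
    [ "$top_level" = "SPEC2017" ]
}
```

For example, has_single_spec_dir /path/to/your/SPECCPU2017/archive.tar.gz returns a non-zero status if the archive contains other top-level entries or cannot be read as a gzip-compressed tar file.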
Once SPEC2017.tar.gz is added to the podman container, the bin/carmot_experiments script will automatically generate results for SPEC CPU 2017 (on top of the already included NAS and PARSEC3).
We have NOT tested the execution of this artifact under job scheduling systems like condor or slurm. We recommend running the podman container and its experiments directly on the host machine.
Given the amount of time required to run the experiments, if a remote machine is used, we strongly suggest using a terminal multiplexer (e.g., tmux, screen) to avoid losing the progress made in case the network connection is lost.
Furthermore, the scripts provided to evaluate this artifact assume that experiments will be run one after the other, sequentially. Please do not run experiments in parallel, or unexpected behavior might happen.
Finally, although podman and docker are mostly compatible, we could only test this artifact using podman 4.2.0 on RedHat machines. If possible, use podman rather than docker to avoid unknown issues.