HCQC is a tool for investigating the quality of the code that a compiler generates for the kernel part of an HPC application.
Many HPC applications have a few hot spots, each a very narrow range of code consisting of one function or several consecutive loops. These hot spots account for most of the program execution time. Therefore, the quality of the hot spot code is important for performance.
HCQC collects metric data on the hot spot code that compilers generate for registered test programs. Data from a single compiler means little on its own; the value lies in comparing the results of multiple compilers. A typical comparison method is as follows.
- On Architecture A, Compiler X vs. Compiler Y
  This can evaluate the advantages and disadvantages of compilers X and Y.
- Compiler X version V vs. Compiler X version W
  This can check the effect of changes in compiler version.
- Compiler X on Architecture A vs. Architecture B
  This can confirm the lack of compiler X's features when the architecture changes.
- Compiler X on Architecture A vs. Compiler Y on Architecture B
  If the architecture A is new and the architecture B is mature, this comparison will provide important information on compiler X's enhancement.
HCQC is a tool to help improve the performance of hot spots.
HCQC currently targets mainly GCC and Clang/LLVM on Linux for the 64-bit ARM architecture (AArch64). For other architectures or compilers, see How to Add New Architectures or Compilers.
In the following, ${INSTALL_DIRECTORY} denotes the directory where hcqc exists.
To execute HCQC, it is necessary to define a compiler and command line options for the compiler to be investigated.
The investigation target is defined in a JSON configuration file placed in the directory ${INSTALL_DIRECTORY}/hcqc/config.
For example, to investigate optimization level -O2 of GCC version 7.1.1, whose absolute path is /usr/bin/gcc, write the configuration file as follows:
{
"DISTRIBUTION" : "OpenSUSE Tumbleweed",
"ARCH" : "aarch64",
"CPU" : "AMD Opteron A1100 Cortex A57",
"LANGUAGE" : "C",
"COMPILER" : "GCC",
"COMMAND" : "/usr/bin/gcc",
"VERSION" : "7.1.1",
"OPT_FLAGS" : ["-O2"],
"ASM_FLAGS" : ["-S", "-fverbose-asm"],
"FLAG_DB" : [["?DEBUG_FLAG", "-g"],
["?C99_STANDARD", "-std=c99"]]
}
Each field of the configuration file is explained in How to Create New Configuration Files.
In the following, the configuration file containing this definition is named gcc-config.json.
To investigate the quality of the defined configuration, it is necessary to compile and execute test programs and collect data using the configuration file.
All test programs exist under the directory ${INSTALL_DIRECTORY}/hcqc/test-program.
In the following description, the test program sample is used as an example.
To investigate the quality of the compiler, it is necessary to specify a metric indicating the type of data to collect.
Here, the metric kind, which takes statistics on the kinds of mnemonics in the function containing the hot spots, is used as an example.
To collect data for the metric kind from the test program sample using the configuration file gcc-config.json, execute the following command:
% cd ${INSTALL_DIRECTORY}/hcqc
% ./command/hcqc gcc-config sample kind
This command performs the following processing.
(1) Using the compiler and compile options specified in the configuration file, HCQC compiles and executes the test program sample and confirms that the execution result is correct.
All the files generated by this work are placed under the following directory:
${INSTALL_DIRECTORY}/hcqc/work/sample/gcc-config/kind
(2) From the assembly code generated by compiling the kernel part of the test program sample, a control flow graph of the kernel part is created.
Based on the control flow graph, a result file gcc-config--sample--kind.json, which holds statistical data on the mnemonics for the metric kind, is created.
The content of the result file gcc-config--sample--kind.json is, for example, as follows:
[
[ "TITLE", ["CFG", "SIZE", "DEPTH", "memory", "branch", "other"]],
[ "kernel cond .L1", [ "2", "0", "0", "1", "1"]],
[ " ", [ "7", "0", "0", "0", "7"]],
[ ".L5 ", [ "1", "1", "0", "0", "1"]],
[ ".L4 ", [ "4", "2", "1", "0", "3"]],
[ ".L3 cond .L3", [ "8", "3", "3", "1", "4"]],
[ " cond .L4", [ "3", "2", "0", "1", "2"]],
[ " cond .L5", [ "6", "1", "0", "1", "5"]],
[ ".L1 end", [ "1", "0", "0", "1", "0"]],
[ "*SUMMARY*", [ "32", "-", "4", "5", "23"]]]
All the result files generated using the test program sample are placed under the following directory:
${INSTALL_DIRECTORY}/hcqc/result/sample
The output of the command ./command/hcqc is data in JSON format.
To make it easier to read, you can use the command ./command/hcqc-report.
For example, the following command:
% ./command/hcqc-report gcc-config sample R0 kind
creates a report file gcc-config--sample--R0.csv in CSV format in the following directory:
${INSTALL_DIRECTORY}/hcqc/report/sample
The contents of the report file are as follows, for example:
CFG,SIZE,DEPTH,memory,branch,other
kernel cond .L1,2,0,0,1,1
,7,0,0,0,7
.L5 ,1,1,0,0,1
.L4 ,4,2,1,0,3
.L3 cond .L3,8,3,3,1,4
cond .L4,3,2,0,1,2
cond .L5,6,1,0,1,5
.L1 end,1,0,0,1,0
*SUMMARY*,32,-,4,5,23
The following command deletes all generated files under the work and report directories, leaving the result data files under the result directory in place.
% cd ${INSTALL_DIRECTORY}/hcqc
% ./clean-all.sh
To delete all the generated files, including the result data files under the result directory, execute the following command:
% cd ${INSTALL_DIRECTORY}/hcqc
% ./realclean-all.sh
HCQC acquires the test program data for the specified configuration file by the following procedure. In the following, when using the configuration file
${INSTALL_DIRECTORY}/hcqc/config/CONFIG.json
the test program
${INSTALL_DIRECTORY}/hcqc/test-program/sample
and the metric program name kind, HCQC is executed as follows:
% cd ${INSTALL_DIRECTORY}/hcqc
% ./command/hcqc CONFIG sample kind
The workflow of this command is as follows:
First, HCQC opens the configuration file
${INSTALL_DIRECTORY}/hcqc/config/CONFIG.json
and reads each field of the JSON format file. The configuration file defines the compiler to be investigated and optimization options, and includes, for example, the following contents:
{
"DISTRIBUTION" : "OpenSUSE Tumbleweed",
"ARCH" : "aarch64",
"CPU" : "AMD Opteron A1100 Cortex A57",
"LANGUAGE" : "C",
"COMPILER" : "GCC",
"COMMAND" : "/usr/bin/gcc",
"VERSION" : "7.1.1",
"OPT_FLAGS" : ["-O2"],
"ASM_FLAGS" : ["-S", "-fverbose-asm"],
"FLAG_DB" : [["?DEBUG_FLAG", "-g"],
["?C99_STANDARD", "-std=c99"]]
}
The meaning of each field of the configuration file is as follows:
- DISTRIBUTION: distribution of OS (currently unused)
- ARCH: the machine hardware name (must match uname -m)
- CPU: the CPU name (currently unused)
- LANGUAGE: target programming language (must match LANGUAGE of test programs)
- COMPILER: the name of the compiler
- COMMAND: full path name of the compiler command
- VERSION: the compiler version number (must match the --version result)
- OPT_FLAGS: compiler options to be investigated
- ASM_FLAGS: compiler options for generating assembly code
- FLAG_DB: definition of flag variables used in the program information file
From the field information of FLAG_DB, HCQC creates a flag replacement map.
For example, the following definition of the FLAG_DB field:
"FLAG_DB" : [["?DEBUG_FLAG", "-g"],
["?C99_STANDARD", "-std=c99"]]
generates the map:
{ "?DEBUG_FLAG" : "-g",
"?C99_STANDARD" : "-std=c99" }
Next, the program information file
${INSTALL_DIRECTORY}/hcqc/test-program/sample/program-info.json
for the specified test program sample is opened, and each field of the JSON file is read.
The program information file includes information for compiling and executing the test program.
The contents are, for example, as follows:
{
"LANGUAGE" : "C",
"MAIN_FLAGS" : ["?DEBUG_FLAG", "?C99_STANDARD"],
"KERNEL_FLAGS" : ["-DFAST", "?C99_STANDARD"],
"LINK_FLAGS" : ["?C99_STANDARD"],
"LIB_LIST" : ["-lm"],
"MAIN_FILENAME" : "main.c",
"KERNEL_FILENAME" : "kernel.c",
"KERNEL_FUNCTION_NAME" : "kernel",
"INPUT" : [ "STDIN", "in.data" ],
"OUTPUT" : [ "STDOUT", "out.data" ]
}
The meaning of each field is as follows:
- LANGUAGE: programming language in which the test program is written
- MAIN_FLAGS: compile options for compiling files in the main part
- KERNEL_FLAGS: compile options for compiling files in the kernel part
- LINK_FLAGS: link options for generating the executable file
- LIB_LIST: library specification options used to generate the executable file
- MAIN_FILENAME: the file name of the main part
- KERNEL_FILENAME: the file name of the kernel part
- KERNEL_FUNCTION_NAME: the name of the kernel function
- INPUT: specification of input data for the executable file
- OUTPUT: specification of output data for the executable file
Using this information, HCQC compiles and executes the test program, and verifies the result. First, it compiles the main file and generates the object file, using the compiler specified in the COMMAND field of the configuration file. That is, the command to compile the file of the main part has the following form:
% COMMAND MAIN_FILENAME MAIN_FLAGS -c -o RESULT_FILENAME
Here, RESULT_FILENAME is the file name obtained by changing the suffix of MAIN_FILENAME to .o.
At this time, the flag variables in MAIN_FLAGS are replaced using the FLAG_DB information in the configuration file.
The command actually executed is, for example, as follows:
% /usr/bin/gcc ${INSTALL_DIRECTORY}/hcqc/test-program/sample/main.c \
-g -std=c99 -c -o ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/main.o
Similarly, HCQC compiles the file of the kernel part with the following command:
% COMMAND KERNEL_FILENAME KERNEL_FLAGS OPT_FLAGS -c -o RESULT_FILENAME
At this time, the command actually executed is, for example, as follows:
% /usr/bin/gcc ${INSTALL_DIRECTORY}/hcqc/test-program/sample/kernel.c \
-DFAST -std=c99 -O2 -c -o ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/kernel.o
HCQC executes the following command to create an executable file from generated object files:
% COMMAND LINK_FLAGS RESULT_FILENAME_LIST -o EXEC_FILENAME LIB_LIST
At this time, the command actually executed is, for example, as follows:
% /usr/bin/gcc -std=c99 ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/main.o \
${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/kernel.o \
-o ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/a.out -lm
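The flag replacement and the three command forms above can be sketched in Python as follows. This is a minimal illustration, not HCQC's actual implementation: the function names expand, object_path, and build_commands are hypothetical, relative directories are used for brevity, and dropping flag variables that map to an empty string is an assumption.

```python
import os


def expand(flags, flag_map):
    """Replace flag variables with the options defined in FLAG_DB.

    Assumption: a variable mapped to an empty string is simply dropped.
    """
    expanded = [flag_map.get(flag, flag) for flag in flags]
    return [flag for flag in expanded if flag]


def object_path(work_dir, source_name):
    """Work-directory path of the object file for a source file."""
    return os.path.join(work_dir, os.path.splitext(source_name)[0] + ".o")


def build_commands(config, info, src_dir, work_dir):
    """Return the main-compile, kernel-compile, and link command lines."""
    flag_map = dict(config["FLAG_DB"])  # [[variable, option], ...] -> dict
    cc = config["COMMAND"]
    main_obj = object_path(work_dir, info["MAIN_FILENAME"])
    kernel_obj = object_path(work_dir, info["KERNEL_FILENAME"])
    compile_main = ([cc, os.path.join(src_dir, info["MAIN_FILENAME"])]
                    + expand(info["MAIN_FLAGS"], flag_map)
                    + ["-c", "-o", main_obj])
    compile_kernel = ([cc, os.path.join(src_dir, info["KERNEL_FILENAME"])]
                      + expand(info["KERNEL_FLAGS"], flag_map)
                      + config["OPT_FLAGS"]  # only the kernel gets OPT_FLAGS
                      + ["-c", "-o", kernel_obj])
    link = ([cc] + expand(info["LINK_FLAGS"], flag_map)
            + [main_obj, kernel_obj]
            + ["-o", os.path.join(work_dir, "a.out")]
            + info["LIB_LIST"])
    return compile_main, compile_kernel, link


config = {
    "COMMAND": "/usr/bin/gcc",
    "OPT_FLAGS": ["-O2"],
    "FLAG_DB": [["?DEBUG_FLAG", "-g"], ["?C99_STANDARD", "-std=c99"]],
}
info = {
    "MAIN_FLAGS": ["?DEBUG_FLAG", "?C99_STANDARD"],
    "KERNEL_FLAGS": ["-DFAST", "?C99_STANDARD"],
    "LINK_FLAGS": ["?C99_STANDARD"],
    "LIB_LIST": ["-lm"],
    "MAIN_FILENAME": "main.c",
    "KERNEL_FILENAME": "kernel.c",
}
compile_main, compile_kernel, link = build_commands(
    config, info, "test-program/sample", "work/sample/CONFIG/kind")
for command in (compile_main, compile_kernel, link):
    print(" ".join(command))
```

Keeping each command as a list of arguments rather than a single string avoids shell quoting issues if the commands are later passed to subprocess.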
Next, HCQC executes the created executable file with the specified input data and verifies the result. In the sample program information file, the specification of the input data is as follows:
"INPUT" : [ "STDIN", "in.data" ]
This description means that HCQC executes the program with the contents of the file in.data supplied on the standard input.
In the sample program information file, the specification of the output data is as follows:
"OUTPUT" : [ "STDOUT", "out.data" ]
This description means that HCQC verifies the execution result by comparing the standard output with the content of the file out.data.
At this time, the command actually executed is, for example, as follows:
% ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/a.out \
< in.data \
> ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/out.data
% /usr/bin/diff \
${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/out.data \
${INSTALL_DIRECTORY}/hcqc/test-program/sample/out.data
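The run-and-compare step can be sketched in Python as below. This is an illustration under stated assumptions rather than HCQC's code: run_and_verify is a hypothetical name, the output is compared in memory instead of being written to the work directory and passed to diff, and the "executable" in the demonstration is a Python one-liner standing in for a.out.

```python
import os
import subprocess
import sys
import tempfile


def run_and_verify(command, input_path, expected_path):
    """Run command with input_path on stdin; compare stdout with expected_path."""
    with open(input_path, "rb") as stdin_file:
        completed = subprocess.run(command, stdin=stdin_file,
                                   stdout=subprocess.PIPE, check=False)
    with open(expected_path, "rb") as expected_file:
        expected = expected_file.read()
    # The run counts as correct only if it exits cleanly and the output matches.
    return completed.returncode == 0 and completed.stdout == expected


# Stand-in for the test executable: upper-cases its standard input.
fake_program = [sys.executable, "-c",
                "import sys; sys.stdout.write(sys.stdin.read().upper())"]
with tempfile.TemporaryDirectory() as directory:
    in_data = os.path.join(directory, "in.data")
    out_data = os.path.join(directory, "out.data")
    with open(in_data, "w") as f:
        f.write("abc")
    with open(out_data, "w") as f:
        f.write("ABC")
    verified = run_and_verify(fake_program, in_data, out_data)
print(verified)
```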
After confirming that the execution result is correct, HCQC executes the metric program and obtains data related to the test program. For this purpose, HCQC generates an assembly code file of the kernel part of the test program. To do this, HCQC executes the following command, a variant of the command used to compile the kernel part:
% COMMAND KERNEL_FILENAME KERNEL_FLAGS OPT_FLAGS ASM_FLAGS -o ASM_FILENAME
At this time, the command actually executed is, for example, as follows:
% /usr/bin/gcc ${INSTALL_DIRECTORY}/hcqc/test-program/sample/kernel.c \
-DFAST -std=c99 -O2 -S -fverbose-asm \
-o ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/kernel.s
When execution of the test program is confirmed, HCQC executes the metric program. When the metric program name is M, HCQC executes the set of scripts
${INSTALL_DIRECTORY}/hcqc/command/metric/M/M000.py
${INSTALL_DIRECTORY}/hcqc/command/metric/M/M001.py
....
${INSTALL_DIRECTORY}/hcqc/command/metric/M/M999.py
under the directory
${INSTALL_DIRECTORY}/hcqc/command/metric/M
in order.
HCQC uses the field information read from the configuration file to evaluate the match_p method in each script, and executes the first script for which the result is True.
If the match_p method returns False, HCQC skips that script and tries the next one.
The method of acquiring metric information from the target test program differs depending on the target architecture and compiler. Therefore, it is necessary to select a script suitable for the specified configuration file and test program.
For example, with the following command:
% ./command/hcqc CONFIG sample kind
the metric program name is kind.
Therefore, HCQC executes the script that exists under the directory
${INSTALL_DIRECTORY}/hcqc/command/metric/kind
This directory has only the following script file:
${INSTALL_DIRECTORY}/hcqc/command/metric/kind/kind999.py
so HCQC executes this script.
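The selection rule can be illustrated with a short Python sketch. It is not HCQC's loader: the two worker classes, the script names M000 and M999, and the use of a plain dictionary in place of the configuration object are all assumptions made for the example.

```python
class Aarch64GccWorker:
    """Hypothetical script M000: only handles GCC on AArch64."""
    name = "M000"

    def match_p(self, target_config, test_name):
        return (target_config["ARCH"] == "aarch64"
                and target_config["COMPILER"] == "GCC")


class FallbackWorker:
    """Hypothetical script M999: accepts every configuration."""
    name = "M999"

    def match_p(self, target_config, test_name):
        return True


def select_worker(workers, target_config, test_name):
    """Return the first worker, in name order, whose match_p is True."""
    for worker in sorted(workers, key=lambda w: w.name):
        if worker.match_p(target_config, test_name):
            return worker
    return None  # no script is suitable for this configuration


workers = [FallbackWorker(), Aarch64GccWorker()]
on_arm = select_worker(workers, {"ARCH": "aarch64", "COMPILER": "GCC"}, "sample")
on_x86 = select_worker(workers, {"ARCH": "x86_64", "COMPILER": "GCC"}, "sample")
print(on_arm.name, on_x86.name)
```

Giving the most general script the highest number, as kind999.py does, lets more specific scripts with lower numbers take precedence when they are added later.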
Generally, each metric program performs the following operations.
(1) Create a control flow graph from the generated assembly code file.
In this example, a control flow graph is created from the file
${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/kernel.s
and the file
${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/kernel.s.dot
which represents the control flow graph is created.
This file contains input for the Graphviz dot command.
In this example, the control flow graph is as follows:
The information of this control flow graph corresponds to the following part of the result information file.
[ "TITLE", ["CFG", "SIZE", "DEPTH",
[ "kernel cond .L1", [ "2", "0",
[ " ", [ "7", "0",
[ ".L5 ", [ "1", "1",
[ ".L4 ", [ "4", "2",
[ ".L3 cond .L3", [ "8", "3",
[ " cond .L4", [ "3", "2",
[ " cond .L5", [ "6", "1",
[ ".L1 end", [ "1", "0",
[ "*SUMMARY*", [ "32", "-",
It corresponds to the first three columns of the CSV result report file.
CFG,SIZE,DEPTH,
kernel cond .L1,2,0,
,7,0,
.L5 ,1,1,
.L4 ,4,2,
.L3 cond .L3,8,3,
cond .L4,3,2,
cond .L5,6,1,
.L1 end,1,0,
*SUMMARY*,32,-,
Here, the CFG column represents the control flow graph, and each row of the CFG column corresponds to one basic block of the control flow graph.
The SIZE column gives the number of instructions in each basic block, and the DEPTH column gives the loop nesting depth.
A depth of 0 means that the basic block is outside all loops, and a depth of 999 means that the basic block enters an infinite loop and never reaches the exit of the function.
In the CFG column, cond means that the basic block ends with a conditional branch instruction having a fallthrough, and goto means that the basic block ends with an unconditional branch instruction.
Also, end means the end of the function.
(2) Information unique to the metric program is collected and output as additional columns in the result file.
For example, the metric program kind classifies the mnemonics contained in each basic block of the control flow graph into memory access instructions (memory), branch instructions (branch), and others (other).
In this example, kind generates the following part of the result information file.
"memory", "branch", "other"]],
"0", "1", "1"]],
"0", "0", "7"]],
"0", "0", "1"]],
"1", "0", "3"]],
"3", "1", "4"]],
"0", "1", "2"]],
"0", "1", "5"]],
"0", "1", "0"]],
"4", "5", "23"]]]
This information corresponds to the following part of the result report file.
memory,branch,other
0,1,1
0,0,7
0,0,1
1,0,3
3,1,4
0,1,2
0,1,5
0,1,0
4,5,23
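The classification performed by kind can be sketched as follows. The mnemonic tables here are a small illustrative subset of AArch64 chosen for the example, not the tables HCQC actually uses, and the names classify and kind_row are hypothetical.

```python
from collections import Counter

# Assumption: illustrative AArch64 mnemonics only, not HCQC's real tables.
MEMORY_PREFIXES = ("ldr", "ldp", "ldur", "str", "stp", "stur")
BRANCH_MNEMONICS = {"b", "bl", "br", "blr", "ret", "cbz", "cbnz", "tbz", "tbnz"}


def classify(mnemonic):
    """Map one mnemonic to 'memory', 'branch', or 'other'."""
    name = mnemonic.lower()
    if name.startswith(MEMORY_PREFIXES):
        return "memory"
    if name in BRANCH_MNEMONICS or name.startswith("b."):
        return "branch"
    return "other"


def kind_row(mnemonics):
    """Count the kinds in one basic block; values are strings, as in the result file."""
    counts = Counter(classify(m) for m in mnemonics)
    return [str(counts[kind]) for kind in ("memory", "branch", "other")]


block = ["ldr", "add", "str", "cmp", "b.ne"]
row = kind_row(block)
print(row)  # ['2', '1', '2']
```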
Information gathered by the metric program is saved in the result file under the directory
${INSTALL_DIRECTORY}/hcqc/result/sample
if the test program name is sample.
The result file name has the following form:
"configuration file name"--"test program name"--"metric program name".json
In this example, the file name is as follows:
${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json
The command clean-all.sh does not delete the files under the directory result.
Therefore, when the same configuration file, test program, and metric program are specified for the second time or later:
% ./command/hcqc CONFIG sample kind
HCQC first writes the new result to:
${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json.new
Then HCQC compares the contents of the following two files:
${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json
${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json.new
If the contents of these two files are the same, HCQC deletes the new file with the suffix .new.
If they differ, HCQC renames the old file by appending the current time to its name, for example
${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json--2017-10-12-13-26-13
and removes the suffix .new from the name of the new file.
As a result, the following two files remain.
${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json
${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json--2017-10-12-13-26-13
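The compare-and-rename procedure can be sketched in Python as below. This is a minimal illustration of the described behavior, not HCQC's code; store_result and its return values are hypothetical names.

```python
import filecmp
import os
import tempfile
import time


def store_result(result_path, new_content):
    """Write new_content to result_path, keeping an old result that differs."""
    if not os.path.exists(result_path):
        with open(result_path, "w") as f:
            f.write(new_content)
        return "created"
    new_path = result_path + ".new"
    with open(new_path, "w") as f:
        f.write(new_content)
    if filecmp.cmp(result_path, new_path, shallow=False):
        os.remove(new_path)  # identical: discard the .new file
        return "unchanged"
    stamp = time.strftime("%Y-%m-%d-%H-%M-%S")
    os.rename(result_path, result_path + "--" + stamp)  # archive old result
    os.rename(new_path, result_path)                    # drop the .new suffix
    return "rotated"


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "CONFIG--sample--kind.json")
    states = [store_result(path, "[1]"),
              store_result(path, "[1]"),
              store_result(path, "[2]")]
    remaining = sorted(os.listdir(directory))
print(states)
print(remaining)
```

Passing shallow=False makes filecmp.cmp compare file contents rather than only the os.stat signatures, which matters here because the two files are always written at different times.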
All result files of the hcqc command are data in JSON format.
The command ./command/hcqc-report converts these results into a table in CSV format.
For example, the command
% ./command/hcqc-report CONFIG TEST_PROGRAM_NAME REPORT_NAME METRIC_PROGRAM_NAME
finds the result data file
${INSTALL_DIRECTORY}/hcqc/result/TEST_PROGRAM_NAME/CONFIG--TEST_PROGRAM_NAME--METRIC_PROGRAM_NAME.json
and generates the report file
${INSTALL_DIRECTORY}/hcqc/report/TEST_PROGRAM_NAME/CONFIG--TEST_PROGRAM_NAME--REPORT_NAME.csv
The command ./command/hcqc-report can also combine multiple tables.
For example, the command
% ./command/hcqc-report CONFIG TEST_PROGRAM_NAME REPORT_NAME M1 M2 ... Mn
can combine tables for the following data files in the order specified:
${INSTALL_DIRECTORY}/hcqc/result/TEST_PROGRAM_NAME/CONFIG--TEST_PROGRAM_NAME--M1.json
${INSTALL_DIRECTORY}/hcqc/result/TEST_PROGRAM_NAME/CONFIG--TEST_PROGRAM_NAME--M2.json
...
${INSTALL_DIRECTORY}/hcqc/result/TEST_PROGRAM_NAME/CONFIG--TEST_PROGRAM_NAME--Mn.json
This is possible because the CFG, SIZE, and DEPTH columns in the result tables are identical when the results are generated from the same configuration file and test program.
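Combining tables on the shared columns can be sketched as follows. The code assumes the result layout shown earlier (a TITLE row whose list starts with CFG, then one row per basic block whose label is the CFG value); the function name combine and the second table with its column a are made up for the example.

```python
import csv
import io

HEADER_SHARED = 3  # "CFG", "SIZE", "DEPTH" in the TITLE row
DATA_SHARED = 2    # SIZE and DEPTH values; the row label holds the CFG value


def combine(tables):
    """Merge result tables row by row, keeping the shared columns once."""
    rows = []
    for index, (label, values) in enumerate(tables[0]):
        if label == "TITLE":
            row = list(values)  # the header list already starts with "CFG"
            skip = HEADER_SHARED
        else:
            row = [label] + list(values)
            skip = DATA_SHARED
        for table in tables[1:]:
            row += table[index][1][skip:]  # metric-specific columns only
        rows.append(row)
    return rows


kind = [["TITLE", ["CFG", "SIZE", "DEPTH", "memory", "branch", "other"]],
        ["kernel cond .L1", ["2", "0", "0", "1", "1"]],
        ["*SUMMARY*", ["2", "-", "0", "1", "1"]]]
extra = [["TITLE", ["CFG", "SIZE", "DEPTH", "a"]],  # hypothetical metric
         ["kernel cond .L1", ["2", "0", "5"]],
         ["*SUMMARY*", ["2", "-", "5"]]]

combined = combine([kind, extra])
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(combined)
print(buffer.getvalue())
```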
The metric program in HCQC is a set of programs to investigate the quality of code generated by the compiler according to various metrics.
arch | compiler | op | kind | regalloc | height | ilp | swpl | vectorize |
---|---|---|---|---|---|---|---|---|
AArch64 | GCC | Y | Y | Y | Y | N | N | N |
AArch64 | Clang/LLVM | Y | Y | Y | Y | N | N | N |
x86_64 | GCC | Y | N | N | N | N | N | N |
x86_64 | Clang/LLVM | Y | N | Y | N | N | N | N |
x86_64 | ICC | Y | N | N | N | N | N | N |
The metric program op counts the mnemonics of the instructions used in the assembly code of the kernel function.
The counts are summarized for each basic block of the control flow graph.
The metric program kind investigates the kinds of mnemonics of the instructions used in the assembly code of the kernel function.
The count of each kind is summarized for each basic block of the control flow graph.
The kinds of mnemonics handled by the metric program kind are as follows:
- memory: memory access instructions
- branch: branch instructions and any control transfer instructions
- other: remaining instructions not included in the above two kinds
The metric program regalloc examines the quality of register allocation by the compiler.
It counts the spill code instructions that exist in the assembly code of the kernel function.
The counts are summarized for each basic block of the control flow graph.
The metric program regalloc counts spill code in the following two categories:
- spill out: a set of store instructions that save register values to the spill area prepared by the compiler
- spill in: a set of load instructions that restore register values from the spill area prepared by the compiler
The quality of register allocation is judged by the amount of spill code generated by the compiler.
However, the exact definition of spill code may differ depending on the compiler.
For example, some compilers may count the saving and restoring of register values around function calls as spill code.
Therefore, the quality of register allocation of different compilers cannot be decided simply by comparing their spill code counts.
The location of the spill code is also important.
Spill code in the innermost loop degrades performance more than spill code outside the loop.
The result of regalloc should be judged with these considerations taken into account.
The metric program regalloc detects spill out and spill in instructions in the generated assembly code.
The details of this process vary depending on the compiler.
- LLVM
  If the compiler is Clang/LLVM, an instruction with one of the following comments in the assembly code file is spill code.
  // ???-byte Spill
  // ???-byte Folded Spill
  // ???-byte Reload
  // ???-byte Folded Reload
  The following script
  ${INSTALL_DIRECTORY}/hcqc/command/metric/regalloc/regalloc000.py
  implements the process of detecting and counting them.
- GCC
  If the compiler is GCC, an instruction with the following comment in the assembly code file handles the address of the spill area prepared by the compiler.
  /// ... %sfp ...
  These comments are generated when the assembly code file is created with the GCC option -fverbose-asm. Therefore, if the compiler is GCC, memory access instructions with these comments can be regarded as spill code. The following script
  ${INSTALL_DIRECTORY}/hcqc/command/metric/regalloc/regalloc001.py
  implements the process of detecting and counting them.
(Note): LLVM considers load instructions or store instructions generated at the entry and the exit of a function as spill codes. However, GCC does not regard these instructions as spill codes. Therefore, when comparing the number of spill codes at the entry and the exit of a function, it is necessary to consider these differences.
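Detecting the Clang/LLVM spill comments can be sketched with a regular expression. This is only an illustration and not the contents of regalloc000.py: it assumes that ??? in the comments above is a decimal byte count, that Spill comments mark spill out and Reload comments mark spill in, and the assembly lines are hand-written samples.

```python
import re

# Assumption: "???" in the comment is a decimal number of bytes.
SPILL_COMMENT = re.compile(r"//\s*\d+-byte (?:Folded )?(Spill|Reload)\b")


def count_spills(asm_lines):
    """Count spill out (stores) and spill in (loads) marked by LLVM comments."""
    counts = {"spill out": 0, "spill in": 0}
    for line in asm_lines:
        match = SPILL_COMMENT.search(line)
        if match:
            key = "spill out" if match.group(1) == "Spill" else "spill in"
            counts[key] += 1
    return counts


asm = [
    "\tstr\tx8, [sp, #8]            // 8-byte Folded Spill",
    "\tadd\tx0, x0, #1",
    "\tldr\tx8, [sp, #8]            // 8-byte Folded Reload",
    "\tstp\tx29, x30, [sp, #16]     // 16-byte Folded Spill",
]
spills = count_spills(asm)
print(spills)  # {'spill out': 2, 'spill in': 1}
```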
The metric program height examines the height of the data dependence graph of the instructions in each basic block.
These values help detect cases where the register allocation pass reuses the same register within a narrow range and thereby lowers instruction-level parallelism.
The metric program ilp is for investigating the quality of instruction scheduling by the compiler. This program has not been implemented yet.
The metric program swpl is for investigating the quality of software pipelining by the compiler. This program has not been implemented yet.
The metric program vectorize is for investigating the quality of vectorization or SIMDization by the compiler. This program has not been implemented yet.
In the following, it is assumed that the name of the newly created configuration file is NEWCONFIG.
In this case, it is necessary to add a new file
${INSTALL_DIRECTORY}/hcqc/config/NEWCONFIG.json
under the directory
${INSTALL_DIRECTORY}/hcqc/config
The file NEWCONFIG.json needs to define the following fields:
- DISTRIBUTION
  This field defines the name of the OS distribution. This field is not currently used.
- ARCH
  This field defines the machine hardware name. HCQC checks whether this definition matches the result of uname -m.
- CPU
  This field defines the name of the CPU. This field is not currently used.
- LANGUAGE
  This field defines the programming language targeted by the compiler COMPILER. HCQC checks whether this definition matches the LANGUAGE field in the program-info.json of each test program.
- COMPILER
  This field defines the name of the compiler to be investigated. The string is used for selecting the Python class in hcqc/command/config.py. Currently, only GCC and ClangLLVM are supported. To introduce a new compiler name, you need to add definitions for it in the script file hcqc/command/config.py (see How to Add New Architectures or Compilers).
- COMMAND
  This field defines the full path name of the compiler COMPILER to be investigated.
- VERSION
  This field defines the version number of the compiler COMPILER to be investigated. HCQC checks whether this definition matches the output of COMMAND executed with the --version option.
- OPT_FLAGS
  This field defines the optimization options used with the compiler COMPILER to be investigated. HCQC regards the pair of a compiler and the optimization options used with it as the identifier of the investigation target.
- ASM_FLAGS
  This field defines the options for generating assembly code with the compiler COMPILER to be investigated. Because HCQC uses assembly code with detailed information added, the option -S alone may not be enough.
- FLAG_DB
  This field defines the flag variables used in program information files. Compilers often have different options for the same feature. Program information files use flag variables in their definitions, and HCQC replaces those flag variables using the definition of the field FLAG_DB before executing the compiler. If the target compiler has no option for a specific feature, you can specify an empty string for the flag variable, as follows:
  "FLAG_DB" : [["?COMPILER_RARE_FLAG", ""], ...]
By executing HCQC with the option --v, you can check which commands are actually executed to process the test program with a specific configuration file.
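Before running HCQC with a new configuration file, a quick sanity check along the following lines may help. This sketch is not part of HCQC: check_config is a hypothetical helper, platform.machine() is used as a stand-in for uname -m (on Linux it typically reports the same string), and the rule that flag variables start with ? is inferred from the examples above.

```python
import platform

REQUIRED_FIELDS = ["DISTRIBUTION", "ARCH", "CPU", "LANGUAGE", "COMPILER",
                   "COMMAND", "VERSION", "OPT_FLAGS", "ASM_FLAGS", "FLAG_DB"]


def check_config(config, machine=None):
    """Return a list of problems found in a configuration dictionary."""
    machine = machine or platform.machine()
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if name not in config]
    if "ARCH" in config and config["ARCH"] != machine:
        problems.append(f"ARCH is {config['ARCH']} but this machine is {machine}")
    for pair in config.get("FLAG_DB", []):
        # Assumption: every entry is a [variable, option] pair with a "?" name.
        if len(pair) != 2 or not pair[0].startswith("?"):
            problems.append(f"malformed FLAG_DB entry: {pair}")
    return problems


new_config = {
    "DISTRIBUTION": "OpenSUSE Tumbleweed",
    "ARCH": "aarch64",
    "CPU": "AMD Opteron A1100 Cortex A57",
    "LANGUAGE": "C",
    "COMPILER": "GCC",
    "COMMAND": "/usr/bin/gcc",
    "VERSION": "7.1.1",
    "OPT_FLAGS": ["-O2"],
    "ASM_FLAGS": ["-S", "-fverbose-asm"],
    "FLAG_DB": [["?DEBUG_FLAG", "-g"], ["?C99_STANDARD", "-std=c99"]],
}
problems_on_arm = check_config(new_config, machine="aarch64")
problems_on_x86 = check_config(new_config, machine="x86_64")
print(problems_on_arm)
print(problems_on_x86)
```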
In the following, it is assumed that the name of the newly added test program is newtest.
In this case, it is necessary to create a new directory
${INSTALL_DIRECTORY}/hcqc/test-program/newtest
under the directory
${INSTALL_DIRECTORY}/hcqc/test-program
This new directory should have at least the following three files:
- program-info.json
  This file defines how to compile, execute, and check the result of the test program. See below for details of this definition.
- a source file of the kernel part of the test program
  To prevent the kernel function from disappearing due to interprocedural optimizations performed within a file, the kernel part and the main part of the test program should be placed in different files.
- a source file of the main part of the test program
  The main part of the test program should execute the kernel function in the kernel file and verify the result. If the program produces no output, the main part should report the result status with an exit code.
Both the compiler command and the generated executable file are executed in the directory
${INSTALL_DIRECTORY}/hcqc/test-program/newtest
so header files and other files in this directory can be referenced from the test program by relative path.
The file program-info.json needs to define the following fields:
- LANGUAGE
  This field defines the programming language in which the test program is written. HCQC checks whether this definition matches the LANGUAGE field in the configuration file.
- MAIN_FLAGS
  This field defines the options for compiling the main file with the compiler to be investigated.
- KERNEL_FLAGS
  This field defines the options for compiling the kernel file with the compiler to be investigated. This field should not contain optimization options; those belong in the configuration file.
- LINK_FLAGS
  This field defines the options for building the executable file with the compiler to be investigated.
- LIB_LIST
  This field defines the library options for building the executable file with the compiler to be investigated.
- MAIN_FILENAME
  This field defines the file name of the main part.
- KERNEL_FILENAME
  This field defines the file name of the kernel part.
- KERNEL_FUNCTION_NAME
  This field defines the kernel function name in the kernel file. This kernel function should be called from the main file. There is no problem if other functions in the kernel file are removed by some optimizations.
- INPUT
  This field defines how to handle the input data for the generated executable file.
  - [ "STDIN", INPUT_FILENAME ]
    This description means the generated executable file is executed with the data of the file INPUT_FILENAME supplied on the standard input.
  - [ "FILE", INPUT_FILENAME ]
    This description means the generated executable file is executed with the input file name INPUT_FILENAME specified on the command line.
  - [ "NONE", "NONE" ]
    This description means the generated executable file doesn't use input data for its execution.
- OUTPUT
  This field defines how to handle the output data for the generated executable file.
  - [ "STDOUT", OUTPUT_FILENAME ]
    This description means the generated executable file writes its result data to the standard output. HCQC verifies the output by comparing it with the content of the file OUTPUT_FILENAME.
  - [ "FILE", OUTPUT_FILENAME ]
    This description means the generated executable file writes its result data to the output file OUTPUT_FILENAME. HCQC verifies the output by comparing it with the content of the answer file OUTPUT_FILENAME.
  - [ "NONE", "NONE" ]
    This description means the generated executable file doesn't generate output. The execution result is verified only with the exit code.
If the test program uses the INPUT_FILENAME file or the OUTPUT_FILENAME file, they should be placed under the test program directory, for example:
${INSTALL_DIRECTORY}/hcqc/test-program/newtest
When defining compiler options in the above fields, the following rules should be followed:
- Those fields should not contain optimization options which should be included in the configuration file.
- Those fields should contain only compiler independent options. If compiler-dependent options are required, flag variables should be used.
By executing HCQC with the option --v, you can check which commands are actually executed to process the test program.
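How the INPUT and OUTPUT specifications translate into an execution plan can be sketched as below. This is an illustration of the three modes described above, not HCQC's implementation; plan_execution and the dictionary it returns are hypothetical.

```python
def plan_execution(executable, input_spec, output_spec):
    """Describe how to run the executable for given INPUT/OUTPUT specifications."""
    argv = [executable]
    stdin_file = None
    in_kind, in_name = input_spec
    if in_kind == "STDIN":
        stdin_file = in_name          # feed the file to standard input
    elif in_kind == "FILE":
        argv.append(in_name)          # pass the file name on the command line
    elif in_kind != "NONE":
        raise ValueError(f"unknown INPUT kind: {in_kind}")

    out_kind, out_name = output_spec
    if out_kind == "STDOUT":
        check = ("stdout", out_name)  # compare standard output with the file
    elif out_kind == "FILE":
        check = ("file", out_name)    # compare the written file with the answer
    elif out_kind == "NONE":
        check = ("exit code", None)   # only the exit code is verified
    else:
        raise ValueError(f"unknown OUTPUT kind: {out_kind}")
    return {"argv": argv, "stdin": stdin_file, "check": check}


plan = plan_execution("./a.out", ["STDIN", "in.data"], ["STDOUT", "out.data"])
print(plan)
```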
In the following, it is assumed that the name of the newly created metric program is NEWMETRIC.
In this case, it is necessary to create a new directory
${INSTALL_DIRECTORY}/hcqc/command/metric/NEWMETRIC
under the directory
${INSTALL_DIRECTORY}/hcqc/command/metric
A new Python script file
${INSTALL_DIRECTORY}/hcqc/command/metric/NEWMETRIC/NEWMETRIC000.py
needs to be added under
${INSTALL_DIRECTORY}/hcqc/command/metric/NEWMETRIC
The file NEWMETRIC000.py needs to define a new class which is a subclass of the class driver.MetricWorker.
The definition of the class driver.MetricWorker exists in the following file:
${INSTALL_DIRECTORY}/hcqc/command/driver.py
In the following, let XxxMetricWorker be the new class name.
This new class needs to define the following methods.
In these methods, the argument target_config represents an instance of the class config.Config, which holds the information from the configuration file specified on the command line of ./command/hcqc.
- match_p(self, target_config, test_name)
  This method decides whether to use this script NEWMETRIC000.py for the specified configuration target_config and the test program name test_name. If the result of this method is True, HCQC uses this script file. If the result is False, HCQC tries to use another script file.
- set_up_before_getting_data(self, target_config, bb_list)
  This method defines the operation to be performed before the metric program returns the result data. bb_list is a list of the basic blocks of the control flow graph. Each element of bb_list is an instance of the class cfg.BasicBlock, which is defined in the file ${INSTALL_DIRECTORY}/hcqc/command/cfg.py. Since each basic block object holds information about the kernel part of the test program, it can be used for data analysis and collection.
- get_column_name_list(self)
  This method returns a list of column names in the table of the metric program's result. The implementation of this method depends on the metric program. For example, in the metric program kind, this method returns ['memory', 'branch', 'other']. In the metric program op, this method returns a list of the mnemonics of the instructions contained in the kernel function of the test program. The metric program op therefore has to build this list, and its method set_up_before_getting_data does that work.
- get_data_list(self, target_config, bb)
  Each row of the result table of the metric program corresponds to one basic block of the control flow graph. This method returns a list of data corresponding to the basic block bb. The length of the resulting data list must be the same as the length of the list of column names returned by the method get_column_name_list.
- get_summary_list(self, target_config)
  The last row of the result table of the metric program represents summary data for each column. This method returns a list of summary data for that last row. The length of the resulting data list must be the same as the length of the list of column names returned by the method get_column_name_list.
Every data element in the lists returned by the methods that return result data needs to be a string.
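The five methods above can be sketched as a minimal worker that counts instructions per basic block. This is only an illustration under stated assumptions: the MetricWorker and BasicBlock classes below are stand-ins so that the example runs on its own, and the ops attribute of the basic block is hypothetical, not the real interface of cfg.BasicBlock.

```python
class MetricWorker:
    """Stand-in for driver.MetricWorker so the sketch runs on its own."""


class BasicBlock:
    """Stand-in for cfg.BasicBlock; the `ops` attribute is hypothetical."""

    def __init__(self, ops):
        self.ops = ops  # mnemonics of the instructions in this block


class XxxMetricWorker(MetricWorker):
    """Counts the instructions in each basic block (illustrative metric)."""

    def match_p(self, target_config, test_name):
        # Accept every configuration and every test program.
        return True

    def set_up_before_getting_data(self, target_config, bb_list):
        # Precompute the total that get_summary_list reports later.
        self.total = sum(len(bb.ops) for bb in bb_list)

    def get_column_name_list(self):
        return ['instructions']

    def get_data_list(self, target_config, bb):
        # One string per column, matching get_column_name_list.
        return [str(len(bb.ops))]

    def get_summary_list(self, target_config):
        return [str(self.total)]


blocks = [BasicBlock(['ldr', 'add', 'str']), BasicBlock(['b'])]
worker = XxxMetricWorker()
worker.set_up_before_getting_data(None, blocks)
rows = [worker.get_data_list(None, bb) for bb in blocks]
summary = worker.get_summary_list(None)
```

Each row and the summary have exactly as many elements as get_column_name_list returns, and every element is a string.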
First, HCQC creates a control flow graph of the test program.
Then HCQC tries executing each script in order of name from the directory of the specified metric program name.
If the method match_p of one script returns True, HCQC does not execute subsequent scripts.
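The selection rule can be sketched as follows. The two script classes and the select_worker helper are hypothetical stand-ins written for this illustration; they are not part of HCQC, and the real driver may organize the search differently.

```python
class Script000:
    """Hypothetical script that only handles one test program."""

    def match_p(self, target_config, test_name):
        return test_name == 'special'


class Script001:
    """Hypothetical fallback script that accepts everything."""

    def match_p(self, target_config, test_name):
        return True


def select_worker(workers_by_name, target_config, test_name):
    """Return the name of the first script, in name order, whose match_p is True."""
    for name in sorted(workers_by_name):
        if workers_by_name[name].match_p(target_config, test_name):
            return name  # later scripts are not tried
    return None


workers = {'NEWMETRIC001': Script001(), 'NEWMETRIC000': Script000()}
chosen_for_special = select_worker(workers, None, 'special')
chosen_for_other = select_worker(workers, None, 'other')
```

Because scripts are tried in name order, a specialized script should get a name that sorts before the general fallback.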
The Python script
${INSTALL_DIRECTORY}/hcqc/command/test-metric.py
is a program for testing the execution of any metric program.
For example, you can test NEWMETRIC000.py as follows:
% cd ${INSTALL_DIRECTORY}/hcqc/command
% python3 test-metric.py metric.NEWMETRIC.NEWMETRIC000 \
aarch64 ClangLLVM 4.0.1 /tmp/AsmByClangLLVM.s kernel_f /tmp/RESULT.json
At the moment, a metric program gets the necessary information from the resulting assembly code. If the assembly code alone is not sufficient for a new metric program, the compiler needs to be modified to output the necessary information.
To introduce a new architecture on which to investigate the quality of compilers, it is necessary to create a class representing the architecture in the Python script file
${INSTALL_DIRECTORY}/hcqc/command/config.py
In the following, it is assumed that the name of the newly added architecture is myarch.
This name is used in the ARCH field of the configuration file and must match the name returned by the command uname -m.
This rule prevents a configuration file from being used in the wrong environment.
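A check of this kind could look like the sketch below. It is an assumption about how such a guard might be written, not HCQC's actual code; the helper name arch_matches_host is hypothetical. It relies on platform.machine() from the Python standard library, which on Linux reports the same machine name as uname -m.

```python
import platform


def arch_matches_host(arch_field):
    """Return True if the ARCH field names the machine we are running on."""
    return arch_field == platform.machine()


# A configuration written for another machine type would be rejected.
usable_here = arch_matches_host(platform.machine())
usable_elsewhere = arch_matches_host('no-such-arch')
```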
A class representing a new architecture needs to be created as a subclass of the Config class.
Also, the class name must be formed by adding C_ at the beginning of the architecture name and __ at the end.
For example, if the name of the newly added architecture is myarch, the following class definition is required.
class C_myarch__(Config):
With this rule, HCQC automatically detects the class definition from the information in the ARCH field of the configuration file.
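A name-based lookup of this kind can be sketched as follows. The find_arch_class helper and the REGISTRY dictionary are hypothetical and exist only for this illustration; how HCQC actually resolves the class inside config.py is not shown here.

```python
class Config:
    """Stand-in for config.Config."""


class C_myarch__(Config):
    """Architecture class named by the C_<arch>__ rule."""


# Hypothetical table of known classes, keyed by class name.
REGISTRY = {cls.__name__: cls for cls in (C_myarch__,)}


def find_arch_class(arch, namespace):
    """Map the ARCH field to its class by building the class name."""
    class_name = 'C_' + arch + '__'
    cls = namespace.get(class_name)
    if not (isinstance(cls, type) and issubclass(cls, Config)):
        raise LookupError('no architecture class named ' + class_name)
    return cls


arch_class = find_arch_class('myarch', REGISTRY)
```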
The class Config defines the following methods.
- function_entry_p(self, name, line)
  This method determines whether the assembly code line passed as line is the entry of the kernel function. name is the name of the kernel function.
- function_exit_p(self, name, line)
  This method determines whether the assembly code line passed as line is the exit of the kernel function. name is the name of the kernel function.
- bb_label(self, line)
  This method determines whether the assembly code line passed as line represents the entry label of a basic block.
For these methods, if another definition is needed for the newly added architecture myarch, the class C_myarch__ can override those definitions.
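Such an override might look like the sketch below. Both method bodies are assumptions made for illustration: the default in the stand-in Config class is not HCQC's real implementation, and the underscore prefix is a hypothetical symbol convention of the imaginary architecture myarch.

```python
class Config:
    """Stand-in for config.Config; this default body is an assumption."""

    def function_entry_p(self, name, line):
        # Assumed default: the entry is the label line "<name>:".
        return line.strip() == name + ':'


class C_myarch__(Config):
    def function_entry_p(self, name, line):
        # Hypothetical: myarch assemblers prefix symbols with "_".
        return line.strip() == '_' + name + ':'


default_hit = Config().function_entry_p('kernel_f', 'kernel_f:\n')
myarch_hit = C_myarch__().function_entry_p('kernel_f', '_kernel_f:\n')
myarch_miss = C_myarch__().function_entry_p('kernel_f', 'kernel_f:\n')
```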
- get_asm_comment(self, line)
  If the assembly code line passed as line contains a comment, this method returns the comment string. Otherwise, it returns None.
- bb_branch(self, line)
  If the assembly code line passed as line contains a control transfer instruction, this method returns a pair of the mnemonic of the instruction and the label of the control transfer destination (if any). Otherwise, it returns None.
- call_p(self, branch_op, branch_target)
  This method decides whether the mnemonic branch_op and the control transfer destination label branch_target (if any) represent a function call instruction.
- tail_call_p(self, branch_op, branch_target)
  This method decides whether the mnemonic branch_op and the control transfer destination label branch_target (if any) represent a tail function call instruction.
- branch_by_register_p(self, branch_op, branch_target)
  This method decides whether the mnemonic branch_op and the control transfer destination label branch_target (if any) represent a branch by the value of a register. An instruction which branches by the value of a register is either a [tail] call through a function pointer or a table branch.
- table_branch_p(self, branch_op, branch_target, table_branch_label, line_list)
  This method decides whether the mnemonic branch_op and the control transfer destination branch_target (if any) represent a table branch instruction. table_branch_label is the label for the table branch (if any) included in the basic block currently being processed, and line_list is the list of lines in the basic block.
- get_table_branch_prologue_number(self)
  This method returns the number of lines (or states) necessary for detecting tables for table branches in the assembly code.
- trace_table_branch_prologue(self, region_status, line)
  This method implements the process of detecting tables for table branches. Given the current state number region_status, it determines whether reaching the line passed as line causes a transition to the next state of the process. When transitioning to the next state, this method returns (True, label), where label is the head label of the table for table branches contained in the line. If the line does not contain the label, label is None. If this method does not transition to the next state, it returns (False, None).
- get_table_branch_content(self, line)
  If the line passed as line is an element of the table of a table branch, this method returns the destination label which the line includes. Otherwise, this method returns None.
- fall_through_p(self, branch_op)
  This method decides whether the control transfer instruction with the mnemonic branch_op can fall through to the next basic block.
- op(self, line)
  If the assembly code line passed as line is an instruction, this method returns its mnemonic. Otherwise, it returns None.
- load_op_p(self, op)
  This method determines whether the mnemonic op is a memory read instruction.
- store_op_p(self, op)
  This method determines whether the mnemonic op is a memory write instruction.
- control_transfer_op_p(self, op)
  This method determines whether the mnemonic op is a control transfer instruction.
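To make the instruction-level methods more concrete, here is a minimal sketch for the imaginary architecture myarch. Everything specific in it is hypothetical: the mnemonics ld, st, b, beq and ret, the # comment character, and the parsing rules are invented for illustration, and the class omits the Config base class so that the example runs on its own.

```python
class C_myarch__:  # in HCQC this class would subclass Config
    LOAD_OPS = {'ld'}                  # hypothetical mnemonics
    STORE_OPS = {'st'}
    BRANCH_OPS = {'b', 'beq', 'ret'}

    def _code(self, line):
        # Drop a trailing "#" comment (hypothetical comment syntax).
        return line.split('#', 1)[0].strip()

    def op(self, line):
        text = self._code(line)
        # Empty lines, labels and directives are not instructions.
        if not text or text.endswith(':') or text.startswith('.'):
            return None
        return text.split()[0]

    def load_op_p(self, op):
        return op in self.LOAD_OPS

    def store_op_p(self, op):
        return op in self.STORE_OPS

    def control_transfer_op_p(self, op):
        return op in self.BRANCH_OPS

    def bb_branch(self, line):
        op = self.op(line)
        if op is None or not self.control_transfer_op_p(op):
            return None
        operands = self._code(line).split()[1:]
        # The last operand is taken as the destination label, if present.
        return (op, operands[-1] if operands else None)


arch = C_myarch__()
load_mnemonic = arch.op('  ld r1, [r2]  # load x[i]')
branch_pair = arch.bb_branch('  beq .L3')
return_pair = arch.bb_branch('  ret')
```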
Defining a new architecture alone is not enough. To use that definition, you also need to define a compiler whose quality is to be investigated on the newly added architecture.
To introduce a new compiler whose quality is to be investigated, it is necessary to create a class representing the compiler in the Python script file
${INSTALL_DIRECTORY}/hcqc/command/config.py
HCQC treats the compiler and the architecture that runs the compiler as a pair.
Therefore, if there is no definition of the architecture to run the new compiler, it is necessary to define the architecture first.
In the following, it is assumed that the name of the newly added compiler is Foo and the name of the architecture that runs the compiler Foo is myarch.
A class representing a new compiler needs to be created as a subclass of the class representing the architecture to run the compiler.
Also, the class name must be created by appending the name of the compiler after the class name of the architecture.
For example, if the name of a newly added compiler is Foo, the following class definition is required.
class C_myarch__Foo(C_myarch__):
This rule allows HCQC to automatically detect the class definition from the information in the ARCH field and the COMPILER field of the configuration file.
If the compiler Foo needs to change the behavior of a method defined in the class C_myarch__ or the class Config, the class C_myarch__Foo can override those definitions.
Since the name of the compiler is used as a Python class name, the name needs to be created only from characters that can be used for Python identifiers.
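The naming rule and the identifier restriction can be sketched together. The compiler_class_name helper and the ld and ldp mnemonics are hypothetical and serve only to illustrate the rule; str.isidentifier() is the standard Python check for a valid identifier.

```python
class Config:
    """Stand-in for config.Config."""


class C_myarch__(Config):
    def load_op_p(self, op):
        return op == 'ld'  # hypothetical myarch load mnemonic


class C_myarch__Foo(C_myarch__):
    def load_op_p(self, op):
        # Hypothetical: code generated by Foo also uses a paired load.
        return op in ('ld', 'ldp')


def compiler_class_name(arch, compiler):
    """Build the class name from the ARCH and COMPILER fields."""
    name = 'C_' + arch + '__' + compiler
    if not name.isidentifier():
        raise ValueError('not a valid Python class name: ' + name)
    return name


foo_name = compiler_class_name('myarch', 'Foo')
```

A compiler name such as Foo-1.0 would be rejected by this check, because the hyphen and the dot cannot appear in a Python identifier.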
You can test the new definition of the compiler by creating a control flow graph using the assembly code which the compiler generated. The Python script
${INSTALL_DIRECTORY}/hcqc/command/test-cfg.py
is a program for testing the generation of control flow graphs. For example, you can generate a control flow graph for the assembly code as follows:
% cd ${INSTALL_DIRECTORY}/hcqc/command
% python3 test-cfg.py aarch64 ClangLLVM 4.0.1 /tmp/AsmByClangLLVM.s kernel_f
- Add support for the Scalable Vector Extension (SVE) of AArch64 if it becomes available for GCC or Clang/LLVM.