HCQC - HPC compiler quality checker

Introduction

HCQC is a tool for investigating the quality of the code that a compiler generates for the kernel part of an HPC application.

Many HPC applications have a few hot spots, each a very narrow range of code consisting of one function or several consecutive loops. These hot spots account for most of the program execution time, so the quality of the hot spot code is important for performance.

HCQC collects metric data on the quality of the hot spot code that compilers generate for registered test programs. Data from a single compiler has little meaning on its own; the value lies in comparing the results of multiple compilers. Typical comparisons are as follows.

  • On Architecture A, Compiler X vs. Compiler Y

    This can evaluate the advantages and disadvantages of compilers X and Y.

  • Compiler X version V vs. Compiler X version W

    This can check the effect of changes in compiler version.

  • Compiler X on Architecture A vs. Architecture B

    This can reveal features that compiler X lacks when the architecture changes.

  • Compiler X on Architecture A vs. Compiler Y on Architecture B

    If architecture A is new and architecture B is mature, this comparison provides important information for enhancing compiler X.

HCQC is a tool to help improve the performance of hot spots.

Quickstart Guide

HCQC currently deals mainly with GCC and Clang/LLVM on Linux for the 64-bit ARM architecture (AArch64). For other architectures or compilers, see How to Add New Architectures or Compilers.

In the following, ${INSTALL_DIRECTORY} denotes the directory where hcqc exists.

Installation

To execute HCQC, it is necessary to define the compiler to be investigated and its command line options. This definition is written in a configuration file in JSON format placed in the directory ${INSTALL_DIRECTORY}/hcqc/config. For example, to investigate the optimization level -O2 of GCC version 7.1.1, whose absolute path is /usr/bin/gcc, the configuration file should be written as follows:

{
    "DISTRIBUTION" : "OpenSUSE Tumbleweed",
    "ARCH" : "aarch64",
    "CPU" : "AMD Opteron A1100 Cortex A57",
    "LANGUAGE" : "C",
    "COMPILER" : "GCC",
    "COMMAND" : "/usr/bin/gcc",
    "VERSION" : "7.1.1",
    "OPT_FLAGS" : ["-O2"],
    "ASM_FLAGS" : ["-S", "-fverbose-asm"],
    "FLAG_DB" : [["?DEBUG_FLAG", "-g"],
                 ["?C99_STANDARD", "-std=c99"]]
}

Each field of the configuration file is explained in How to Create New Configuration Files.

In the following, this configuration file is named gcc-config.json.

Running samples

To investigate the quality of the defined configuration, it is necessary to compile and execute test programs and collect data using the configuration file. All test programs exist under the directory ${INSTALL_DIRECTORY}/hcqc/test-program. The following description uses the test program sample. Investigating the quality of the compiler also requires a metric, which specifies the type of data to collect.

This example uses the metric kind, which takes statistics on the kinds of mnemonics in the function that contains the hot spots. To collect data for the metric kind from the test program sample with the configuration file gcc-config.json, execute the following command:

% cd ${INSTALL_DIRECTORY}/hcqc
% ./command/hcqc gcc-config sample kind

This command performs the following processing.

(1) Using the compiler and compile options specified in the configuration file, compile and execute the test program sample and confirm that the execution result is correct.

All the files generated by this work are placed under the following directory:

${INSTALL_DIRECTORY}/hcqc/work/sample/gcc-config/kind

(2) From the assembly code generated by compiling the kernel part of the test program sample, a control flow graph of the kernel part is created. Based on the control flow graph, HCQC creates a result file gcc-config--sample--kind.json, which holds the mnemonic statistics for the metric kind.

The content of the result file gcc-config--sample--kind.json is, for example, as follows:

[
    [ "TITLE", ["CFG", "SIZE", "DEPTH", "memory", "branch", "other"]],
    [ "kernel cond .L1", [ "2", "0", "0", "1", "1"]],
    [ "               ", [ "7", "0", "0", "0", "7"]],
    [ ".L5            ", [ "1", "1", "0", "0", "1"]],
    [ ".L4            ", [ "4", "2", "1", "0", "3"]],
    [ ".L3    cond .L3", [ "8", "3", "3", "1", "4"]],
    [ "       cond .L4", [ "3", "2", "0", "1", "2"]],
    [ "       cond .L5", [ "6", "1", "0", "1", "5"]],
    [ ".L1         end", [ "1", "0", "0", "1", "0"]],
    [ "*SUMMARY*", [ "32", "-", "4", "5", "23"]]]

All the result files generated using the test program sample are placed under the following directory:

${INSTALL_DIRECTORY}/hcqc/result/sample

Viewing results

The command ./command/hcqc produces its results as data in JSON format. To make them easier to read, you can use the command ./command/hcqc-report. For example, the following command:

% ./command/hcqc-report gcc-config sample R0 kind

creates a report file gcc-config--sample--R0.csv in CSV format in the following directory:

${INSTALL_DIRECTORY}/hcqc/report/sample

The contents of the report file are as follows, for example:

CFG,SIZE,DEPTH,memory,branch,other
kernel cond .L1,2,0,0,1,1
               ,7,0,0,0,7
.L5            ,1,1,0,0,1
.L4            ,4,2,1,0,3
.L3    cond .L3,8,3,3,1,4
       cond .L4,3,2,0,1,2
       cond .L5,6,1,0,1,5
.L1         end,1,0,0,1,0
*SUMMARY*,32,-,4,5,23

Cleaning up

The following command deletes all generated files under the work and report directories, but keeps the result data files under the result directory.

% cd ${INSTALL_DIRECTORY}/hcqc
% ./clean-all.sh

To delete all generated files, including the result data files under the result directory, execute the following command:

% cd ${INSTALL_DIRECTORY}/hcqc
% ./realclean-all.sh

Workflow of HCQC

Confirmation of execution of test programs

HCQC collects data for a test program under the specified configuration file by the following procedure. When using the configuration file

${INSTALL_DIRECTORY}/hcqc/config/CONFIG.json

the test program

${INSTALL_DIRECTORY}/hcqc/test-program/sample

and the metric program kind, HCQC is executed as follows:

% cd ${INSTALL_DIRECTORY}/hcqc
% ./command/hcqc CONFIG sample kind

The workflow of this command is as follows:

First, HCQC opens the configuration file

${INSTALL_DIRECTORY}/hcqc/config/CONFIG.json

and reads each field of the JSON format file. The configuration file defines the compiler to be investigated and optimization options, and includes, for example, the following contents:

{
    "DISTRIBUTION" : "OpenSUSE Tumbleweed",
    "ARCH" : "aarch64",
    "CPU" : "AMD Opteron A1100 Cortex A57",
    "LANGUAGE" : "C",
    "COMPILER" : "GCC",
    "COMMAND" : "/usr/bin/gcc",
    "VERSION" : "7.1.1",
    "OPT_FLAGS" : ["-O2"],
    "ASM_FLAGS" : ["-S", "-fverbose-asm"],
    "FLAG_DB" : [["?DEBUG_FLAG", "-g"],
                 ["?C99_STANDARD", "-std=c99"]]
}

The meaning of each field of the configuration file is as follows:

  • DISTRIBUTION : distribution of OS (currently unused)
  • ARCH : the machine hardware name (must match uname -m)
  • CPU : the CPU name (currently unused)
  • LANGUAGE : target programming language (must match LANGUAGE of test programs)
  • COMPILER : the name of compiler
  • COMMAND : full path name of the compiler command
  • VERSION : the compiler version number (must match --version result)
  • OPT_FLAGS : compiler options to be investigated
  • ASM_FLAGS : compiler options for generating assembly code
  • FLAG_DB : definition of flag variables used in program information file
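The two consistency checks noted in this list (ARCH against uname -m, VERSION against the --version output) could be sketched as follows. The function names and the substring-based version match are assumptions made for illustration; they are not taken from HCQC's source.

```python
import platform
import subprocess


def arch_matches(config):
    """Return True if the ARCH field equals the machine hardware name.

    On Linux, platform.machine() reports the same value as `uname -m`.
    """
    return config["ARCH"] == platform.machine()


def version_in_output(version, version_output):
    """Assumed rule: the VERSION string must appear in the --version text."""
    return version in version_output


def compiler_version_matches(config):
    """Run `COMMAND --version` and look for the configured VERSION."""
    completed = subprocess.run(
        [config["COMMAND"], "--version"],
        capture_output=True, text=True, check=True,
    )
    return version_in_output(config["VERSION"], completed.stdout)
```

Keeping the string comparison in its own function makes the rule easy to change if a compiler prints its version in an unusual form.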

From the field information of FLAG_DB, HCQC creates a flag replacement map. For example, the definition of the field of FLAG_DB:

"FLAG_DB" : [["?DEBUG_FLAG", "-g"],
             ["?C99_STANDARD", "-std=c99"]]

generates the map:

{ "?DEBUG_FLAG" : "-g",
  "?C99_STANDARD" : "-std=c99" }
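A minimal sketch of building this map and applying it to a flag list is shown below. The names build_flag_map and expand_flags are hypothetical, and two behaviors are assumptions of the sketch rather than documented HCQC behavior: flag variables are recognized by a leading "?", and variables mapped to an empty string are dropped from the command line.

```python
def build_flag_map(flag_db):
    """Turn a FLAG_DB list of [variable, option] pairs into a dict."""
    return {variable: option for variable, option in flag_db}


def expand_flags(flags, flag_map):
    """Replace flag variables by their options; pass plain options through.

    An undefined flag variable raises KeyError, so a typo in a program
    information file is reported instead of being passed to the compiler.
    """
    expanded = []
    for flag in flags:
        value = flag_map[flag] if flag.startswith("?") else flag
        if value:  # assumption: an empty replacement means "no option"
            expanded.append(value)
    return expanded
```

With the map above, expand_flags(["-DFAST", "?C99_STANDARD"], flag_map) yields ["-DFAST", "-std=c99"].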

Next, the program information file

${INSTALL_DIRECTORY}/hcqc/test-program/sample/program-info.json

for the specified test program sample is opened and each field of the JSON format file is read. The program information file includes information for compiling and executing the test program. The contents are, for example, as follows:

{
    "LANGUAGE" : "C",
    "MAIN_FLAGS" : ["?DEBUG_FLAG", "?C99_STANDARD"],
    "KERNEL_FLAGS" : ["-DFAST", "?C99_STANDARD"],
    "LINK_FLAGS" : ["?C99_STANDARD"],
    "LIB_LIST" : ["-lm"],
    "MAIN_FILENAME" : "main.c",
    "KERNEL_FILENAME" : "kernel.c",
    "KERNEL_FUNCTION_NAME" : "kernel",
    "INPUT" : [ "STDIN", "in.data" ],
    "OUTPUT" : [ "STDOUT", "out.data" ]
}

The meaning of each field is as follows:

  • LANGUAGE : programming language describing the test program
  • MAIN_FLAGS : compile options for compiling files in the main part
  • KERNEL_FLAGS : compile options for compiling files in the kernel part
  • LINK_FLAGS : link options for generating executable files
  • LIB_LIST : library specification options used to generate executable file
  • MAIN_FILENAME : the file name of the main part
  • KERNEL_FILENAME : the file name of the kernel part
  • KERNEL_FUNCTION_NAME : the name of kernel function
  • INPUT : specification of input data for executable file
  • OUTPUT : specification of output data for executable file

Using these pieces of information, HCQC compiles and executes the test program, and verifies the result. First, it compiles the main file and generates the object file. At this time, HCQC uses the compiler which is specified in the COMMAND field of the configuration file. That is, the command to compile the file of the main part has the following form:

% COMMAND MAIN_FILENAME MAIN_FLAGS -c -o RESULT_FILENAME

Here, RESULT_FILENAME is the file name obtained by converting the suffix of MAIN_FILENAME into .o. At this time, the flag variables in MAIN_FLAGS are replaced by using the FLAG_DB information in the configuration file. The command actually executed is, for example, as follows:

% /usr/bin/gcc ${INSTALL_DIRECTORY}/hcqc/test-program/sample/main.c \
  -g -std=c99 -c -o ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/main.o

Similarly, HCQC compiles the file of the kernel part with the following command:

% COMMAND KERNEL_FILENAME KERNEL_FLAGS OPT_FLAGS -c -o RESULT_FILENAME

At this time, the command actually executed is, for example, as follows:

% /usr/bin/gcc ${INSTALL_DIRECTORY}/hcqc/test-program/sample/kernel.c \
  -DFAST -std=c99 -O2 -c -o ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/kernel.o

HCQC executes the following command to create an executable file from generated object files:

% COMMAND LINK_FLAGS RESULT_FILENAME_LIST -o EXEC_FILENAME LIB_LIST

At this time, the command actually executed is, for example, as follows:

% /usr/bin/gcc -std=c99 ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/main.o \
  ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/kernel.o \
   -o ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/a.out -lm

Next, HCQC executes the created executable file with the specified input data and verifies the result. In the sample program information file, the specification of the input data is as follows:

"INPUT" : [ "STDIN", "in.data" ]

This description means that HCQC executes the program by inputting the data of the file in.data to the standard input. In the sample program information file, the specification of the output data is as follows:

"OUTPUT" : [ "STDOUT", "out.data" ]

This description means that HCQC verifies the execution result by comparing the result of the standard output with the content of the file out.data. At this time, the command actually executed is, for example, as follows:

 % ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/a.out \
   < in.data \
   > ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/out.data
 % /usr/bin/diff  \
   ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/out.data \
   ${INSTALL_DIRECTORY}/hcqc/test-program/sample/out.data
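The run-and-compare step for the STDIN/STDOUT case could be sketched as follows. The function name run_and_verify is hypothetical, and treating a non-zero exit status as a failure is an assumption of this sketch.

```python
import subprocess


def run_and_verify(command, input_path, actual_path, expected_path):
    """Run command with stdin from input_path and check its stdout.

    The output is written to actual_path and then compared byte for byte
    with expected_path, as `diff` would do in the commands above.
    """
    with open(input_path, "rb") as stdin, open(actual_path, "wb") as stdout:
        completed = subprocess.run(command, stdin=stdin, stdout=stdout)
    if completed.returncode != 0:
        return False  # assumption: a failing exit status is never accepted
    with open(actual_path, "rb") as actual, open(expected_path, "rb") as expected:
        return actual.read() == expected.read()
```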

After confirming that the execution result is correct, HCQC executes the metric program to obtain data about the test program. For this purpose, HCQC generates an assembly code file for the kernel part of the test program. To do this, HCQC executes the following command, which is a modified form of the command that compiles the kernel part.

% COMMAND KERNEL_FILENAME KERNEL_FLAGS OPT_FLAGS ASM_FLAGS -o ASM_FILENAME

At this time, the command actually executed is, for example, as follows:

% /usr/bin/gcc ${INSTALL_DIRECTORY}/hcqc/test-program/sample/kernel.c \
  -DFAST -std=c99 -O2 -S -fverbose-asm \
  -o ${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/kernel.s
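The relation between the two kernel commands can be made explicit with a small sketch that builds both from the same pieces. The function name kernel_commands is hypothetical, and kernel_flags is assumed to have its flag variables already replaced.

```python
from pathlib import Path


def kernel_commands(config, kernel_source, kernel_flags, work_dir):
    """Build the object-file and assembly command lines for the kernel part."""
    source = Path(kernel_source)
    # Both commands share the compiler, the source file and the same flags.
    common = [config["COMMAND"], str(source), *kernel_flags, *config["OPT_FLAGS"]]
    object_file = Path(work_dir) / source.with_suffix(".o").name
    asm_file = Path(work_dir) / source.with_suffix(".s").name
    compile_cmd = common + ["-c", "-o", str(object_file)]
    # The assembly variant swaps "-c" for ASM_FLAGS and changes the output name.
    asm_cmd = common + [*config["ASM_FLAGS"], "-o", str(asm_file)]
    return compile_cmd, asm_cmd
```

Because the assembly is produced with the same OPT_FLAGS as the object file, the metric data describes the code that was actually executed and verified.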

Executing metric programs

When execution of the test program is confirmed, HCQC executes the metric program. When the metric program name is M, HCQC executes the set of scripts

${INSTALL_DIRECTORY}/hcqc/command/metric/M/M000.py
${INSTALL_DIRECTORY}/hcqc/command/metric/M/M001.py
....
${INSTALL_DIRECTORY}/hcqc/command/metric/M/M999.py

under the directory

${INSTALL_DIRECTORY}/hcqc/command/metric/M

in order. HCQC evaluates the match_p method of each script using the field information read from the configuration file, and executes the script if the result is True. If the result is False, HCQC skips that script and tries the next one.
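The selection logic can be sketched as follows, assuming that the first script whose match_p returns True is the one used. The two worker classes are hypothetical, and target_config is modeled as a plain dict here, whereas HCQC passes an instance of its own configuration class.

```python
def select_worker(workers, target_config, test_name):
    """Return the first worker whose match_p accepts the configuration.

    workers is assumed to be ordered like the script files
    (M000.py, M001.py, ..., M999.py).
    """
    for worker in workers:
        if worker.match_p(target_config, test_name):
            return worker
    return None  # no script is suitable for this configuration


class AArch64GccWorker:
    """Hypothetical worker that only handles GCC on AArch64."""

    def match_p(self, target_config, test_name):
        return (target_config["ARCH"] == "aarch64"
                and target_config["COMPILER"] == "GCC")


class FallbackWorker:
    """Hypothetical catch-all, in the spirit of a script numbered 999."""

    def match_p(self, target_config, test_name):
        return True
```

Numbering a general-purpose script 999 places it last, so more specific scripts with lower numbers get the first chance to match.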

The method of acquiring metric information from the target test program differs depending on the target architecture and compiler. Therefore, it is necessary to select a script suitable for the specified configuration file and test program.

For example, when the command is:

% ./command/hcqc CONFIG sample kind

the metric program name is kind. Therefore, HCQC executes the script that exists under the directory

${INSTALL_DIRECTORY}/hcqc/command/metric/kind

This directory contains only the following script file:

${INSTALL_DIRECTORY}/hcqc/command/metric/kind/kind999.py

so HCQC executes this script.

Generally, each metric program performs the following operations.

(1) Create a control flow graph from the generated assembly code file.

In this example, a control flow graph is created from the file

${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/kernel.s

and the file

${INSTALL_DIRECTORY}/hcqc/work/sample/CONFIG/kind/kernel.s.dot

which represents the control flow graph is created. This file is input for the Graphviz dot command. In this example, the control flow graph is as follows:

CFG image

The information of this control flow graph corresponds to the following part of the result information file.

[ "TITLE", ["CFG", "SIZE", "DEPTH",
[ "kernel cond .L1", [ "2", "0",
[ "               ", [ "7", "0",
[ ".L5            ", [ "1", "1",
[ ".L4            ", [ "4", "2",
[ ".L3    cond .L3", [ "8", "3",
[ "       cond .L4", [ "3", "2",
[ "       cond .L5", [ "6", "1",
[ ".L1         end", [ "1", "0",
[ "*SUMMARY*", [ "32", "-",

It corresponds to the first three columns (CFG, SIZE, and DEPTH) of the CSV report file.

CFG,SIZE,DEPTH,
kernel cond .L1,2,0,
               ,7,0,
.L5            ,1,1,
.L4            ,4,2,
.L3    cond .L3,8,3,
       cond .L4,3,2,
       cond .L5,6,1,
.L1         end,1,0,
*SUMMARY*,32,-,

Here, the CFG column represents the control flow graph, and each row of the CFG column corresponds to one basic block of the control flow graph. The SIZE column gives the number of instructions in each basic block, and the DEPTH column gives the nesting depth of the loop. A depth of 0 means that the basic block is outside all loops, and a depth of 999 means that the basic block enters an infinite loop and never reaches the exit of the function. In the CFG column, cond means that the basic block ends with a conditional branch instruction having a fallthrough, and goto means that the basic block ends with an unconditional branch instruction. Also, end means the end of the function.

(2) Information unique to the metric program is collected and output as information of additional columns to the result file.

For example, the metric program kind classifies the mnemonics contained in each basic block of the control flow graph into memory access instructions (memory), branch instructions (branch), and others (other). In this example, kind generates the following part of the result information file.

"memory", "branch", "other"]],
"0", "1", "1"]],
"0", "0", "7"]],
"0", "0", "1"]],
"1", "0", "3"]],
"3", "1", "4"]],
"0", "1", "2"]],
"0", "1", "5"]],
"0", "1", "0"]],
"4", "5", "23"]]]

This information corresponds to the following part of the result report file.

memory,branch,other
0,1,1
0,0,7
0,0,1
1,0,3
3,1,4
0,1,2
0,1,5
0,1,0
4,5,23

Information gathered by the metric program is saved in the result file under the directory

${INSTALL_DIRECTORY}/hcqc/result/sample

if the test program name is sample. In that case, the file name becomes:

"configuration file name"--"test program name"--"metric program name".json

In this example, the file name is as follows:

${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json

The command clean-all.sh does not delete the files under the result directory. Therefore, a result file may already exist when HCQC is executed again with the same configuration file, test program, and metric program:

% ./command/hcqc CONFIG sample kind

In this case, HCQC first writes the new result to:

${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json.new

HCQC then compares the contents of the following two files:

${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json
${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json.new

If the contents of these two files are the same, HCQC deletes the new file with the suffix .new. If they differ, HCQC renames the old file by appending the current time to its name, for example

${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json--2017-10-12-13-26-13

and removes the suffix .new from the name of the new file. As a result, the following two files remain.

${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json
${INSTALL_DIRECTORY}/hcqc/result/sample/CONFIG--sample--kind.json--2017-10-12-13-26-13
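The update procedure above could be sketched as follows. The function name store_result and its now parameter are hypothetical; the timestamp format is inferred from the example file name and is therefore an assumption.

```python
import os
from datetime import datetime


def store_result(result_path, new_text, now=None):
    """Store new_text as the current result, archiving a differing old one.

    Returns the path of the archived old file, or None if nothing was archived.
    """
    if not os.path.exists(result_path):
        with open(result_path, "w") as f:
            f.write(new_text)
        return None

    new_path = result_path + ".new"
    with open(new_path, "w") as f:
        f.write(new_text)

    with open(result_path) as old_file:
        unchanged = old_file.read() == new_text
    if unchanged:
        os.remove(new_path)  # identical contents: discard the new copy
        return None

    stamp = (now or datetime.now()).strftime("%Y-%m-%d-%H-%M-%S")
    archived = f"{result_path}--{stamp}"
    os.rename(result_path, archived)  # keep the old result under a dated name
    os.rename(new_path, result_path)  # the new result takes the plain name
    return archived
```

Archiving only on a difference keeps the result directory small while still preserving every distinct result, for example after a compiler update that does not change the VERSION field.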

Creating report files

All result files of the hcqc command are data in JSON format. The command ./command/hcqc-report converts these results into a table in CSV format. For example, the command

% ./command/hcqc-report CONFIG TEST_PROGRAM_NAME REPORT_NAME METRIC_PROGRAM_NAME

finds the result data file which is

${INSTALL_DIRECTORY}/hcqc/result/TEST_PROGRAM_NAME/CONFIG--TEST_PROGRAM_NAME--METRIC_PROGRAM_NAME.json

and generates the report file which is

${INSTALL_DIRECTORY}/hcqc/report/TEST_PROGRAM_NAME/CONFIG--TEST_PROGRAM_NAME--REPORT_NAME.csv

The command ./command/hcqc-report can combine multiple tables. For example, the command

% ./command/hcqc-report CONFIG TEST_PROGRAM_NAME REPORT_NAME M1 M2 ... Mn

can combine tables for the following data files in the order specified:

${INSTALL_DIRECTORY}/hcqc/result/TEST_PROGRAM_NAME/CONFIG--TEST_PROGRAM_NAME--M1.json
${INSTALL_DIRECTORY}/hcqc/result/TEST_PROGRAM_NAME/CONFIG--TEST_PROGRAM_NAME--M2.json
...
${INSTALL_DIRECTORY}/hcqc/result/TEST_PROGRAM_NAME/CONFIG--TEST_PROGRAM_NAME--Mn.json    

This is possible because the CFG, SIZE, and DEPTH columns in the result tables are identical when the result information is generated from the same configuration file and test program.
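A minimal sketch of such a merge is shown below. It assumes the result layout shown earlier, a list of [label, values] pairs whose first pair is the TITLE row; the function names are hypothetical and this is not the hcqc-report implementation.

```python
import csv
import io


def to_rows(result):
    """Flatten [label, values] pairs into plain table rows."""
    rows = []
    for label, values in result:
        # The TITLE row already names the CFG column; data rows carry the
        # basic-block label separately from their values.
        rows.append(list(values) if label == "TITLE" else [label, *values])
    return rows


def combine_results(results):
    """Merge tables that share the CFG, SIZE and DEPTH columns."""
    tables = [to_rows(result) for result in results]
    combined = [list(row) for row in tables[0]]
    for table in tables[1:]:
        if len(table) != len(combined):
            raise ValueError("tables have different numbers of rows")
        for row, extra in zip(combined, table):
            if row[:3] != extra[:3]:
                raise ValueError("CFG, SIZE and DEPTH columns differ")
            row.extend(extra[3:])  # append only the metric-specific columns
    return combined


def to_csv(rows):
    """Render rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()
```

Checking the three shared columns before merging catches the case where one result file is stale and was produced from different assembly code.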

The Metric Program

The metric program in HCQC is a set of programs to investigate the quality of code generated by the compiler according to various metrics.

Status of each metric program

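| arch | compiler | op | kind | regalloc | height | ilp | swpl | vectorize |
|---|---|---|---|---|---|---|---|---|
| AArch64 | GCC | Y | Y | Y | Y | N | N | N |
| AArch64 | Clang/LLVM | Y | Y | Y | Y | N | N | N |
| x86_64 | GCC | Y | N | N | N | N | N | N |
| x86_64 | Clang/LLVM | Y | N | Y | N | N | N | N |
| x86_64 | ICC | Y | N | N | N | N | N | N |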

op

The metric program op investigates the number of mnemonics of the instructions used in the assembly code of the kernel function. The number of mnemonics is summarized for each basic block of the control flow graph.

kind

The metric program kind investigates the kind of mnemonic of the instruction used in the assembly code of the kernel function. The number of mnemonic kinds is summarized for each basic block of the control flow graph. The kinds of mnemonics handled by the metric program kind are as follows:

  • memory : memory access instructions
  • branch : branch instructions and any control transfer instructions
  • other : remaining instructions not included in the above two kinds
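The classification into these three kinds can be sketched as follows for AArch64. The mnemonic sets are deliberately small and are assumptions made for illustration; they are not the tables used by the kind scripts.

```python
from collections import Counter

# Illustrative AArch64 mnemonic sets (assumptions, far from complete).
MEMORY_MNEMONICS = {"ldr", "ldp", "ldrb", "ldrh", "ldur",
                    "str", "stp", "strb", "strh", "stur"}
BRANCH_MNEMONICS = {"b", "bl", "br", "blr", "ret", "cbz", "cbnz", "tbz", "tbnz"}
CONDITION_CODES = {"eq", "ne", "cs", "hs", "cc", "lo", "mi", "pl", "vs",
                   "vc", "hi", "ls", "ge", "lt", "gt", "le", "al", "nv"}


def classify(mnemonic):
    """Return "memory", "branch" or "other" for one mnemonic."""
    name = mnemonic.lower()
    if name in MEMORY_MNEMONICS:
        return "memory"
    if name in BRANCH_MNEMONICS:
        return "branch"
    # Conditional branches appear either as "b.ne" or as the alias "bne".
    if name.startswith("b") and name[1:].lstrip(".") in CONDITION_CODES:
        return "branch"
    return "other"


def count_kinds(mnemonics):
    """Return the [memory, branch, other] counts for one basic block."""
    counts = Counter(classify(m) for m in mnemonics)
    return [counts["memory"], counts["branch"], counts["other"]]
```

Applied to the mnemonics of one basic block, count_kinds produces the three values that follow SIZE and DEPTH in a row of the result table.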

regalloc

The metric program regalloc examines the quality of register allocation by the compiler. It counts the number of spill codes which exist in the assembly code of the kernel function. The number of spill codes is summarized for each basic block of the control flow graph. The metric program regalloc counts the spill code by dividing it into the following two types.

  • spill out: a set of store instructions to save the value of the register to the spill area prepared by the compiler
  • spill in: a set of load instructions to restore the value of the register from the spill area prepared by the compiler

The quality of register allocation is judged by the number of spill codes generated by the compiler. However, the exact definition of spill code may differ between compilers. For example, some compilers may count the saving and restoring of register values before and after function calls as spill codes. Therefore, the quality of register allocation of different compilers cannot be decided simply by comparing the number of spill codes. The location of the spill code is also important: spill code in the innermost loop degrades performance more than spill code outside the loop. The result of regalloc should be judged with these considerations taken into account.

The metric program regalloc detects spill out and spill in instructions from the resulting assembly code. The details of this process vary depending on the compiler.

  • LLVM

    If the compiler is Clang/LLVM, the instruction with the following comment in the assembly code file is the spill code.

    // ???-byte Spill
    // ???-byte Folded Spill
    // ???-byte Reload
    // ???-byte Folded Reload
    

    The following script

    ${INSTALL_DIRECTORY}/hcqc/command/metric/regalloc/regalloc000.py
    

    implements the process of detecting and counting them.

  • GCC

    If the compiler is GCC, the instruction with the following comment in the assembly code file handles the address of the spill area prepared by the compiler.

      /// ... %sfp ...
    

    These comments are generated when the assembly code file is created by passing the option -fverbose-asm to GCC. Therefore, if the compiler is GCC, you can regard memory access instructions with these comments as spill codes.

    The following script

    ${INSTALL_DIRECTORY}/hcqc/command/metric/regalloc/regalloc001.py
    

    implements the process of detecting and counting them.

(Note): LLVM considers load instructions or store instructions generated at the entry and the exit of a function as spill codes. However, GCC does not regard these instructions as spill codes. Therefore, when comparing the number of spill codes at the entry and the exit of a function, it is necessary to consider these differences.
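For the Clang/LLVM case, detection of the comments listed above could be sketched as follows. The regular expression and the function name count_spills are assumptions for illustration and are not taken from regalloc000.py; the sample comment style ("// 8-byte Folded Spill") follows the pattern shown above for AArch64.

```python
import re

# Matches the comments Clang/LLVM attaches to spill code,
# for example "// 8-byte Folded Spill" or "// 16-byte Reload".
SPILL_COMMENT = re.compile(r"//\s*\d+-byte (?:Folded )?(Spill|Reload)\b")


def count_spills(asm_lines):
    """Return (spill_out, spill_in) instruction counts for assembly lines."""
    spill_out = spill_in = 0
    for line in asm_lines:
        match = SPILL_COMMENT.search(line)
        if match is None:
            continue
        if match.group(1) == "Spill":
            spill_out += 1  # store to the spill area
        else:
            spill_in += 1   # load back from the spill area
    return spill_out, spill_in
```

Running this per basic block rather than per file gives the per-row counts, which makes it possible to see whether the spill code sits inside a deeply nested loop.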

height

The metric program height examines the height of the data dependence graph of the instructions in each basic block. These values help detect cases where a register allocation pass reuses the same register within a narrow range and thereby lowers instruction-level parallelism.

ilp

This metric program is for investigating the quality of instruction scheduling by the compiler. This program has not been implemented yet.

swpl

This metric program is for investigating the quality of software pipelining by the compiler. This program has not been implemented yet.

vectorize

This metric program is for investigating the quality of vectorization or SIMDization by the compiler. This program has not been implemented yet.

How to Create New Configuration Files

In the following, it is assumed that the name of the newly created configuration file is NEWCONFIG. In this case, it is necessary to add a new file

${INSTALL_DIRECTORY}/hcqc/config/NEWCONFIG.json

under the directory

${INSTALL_DIRECTORY}/hcqc/config

The file NEWCONFIG.json needs to define the following fields:

  • DISTRIBUTION

    This field defines the name of OS distribution. This field is not currently used.

  • ARCH

    This field defines the machine hardware name. HCQC checks whether this definition matches the result of uname -m.

  • CPU

    This field defines the name of CPU. This field is not currently used.

  • LANGUAGE

    This field defines the programming language targeted by the compiler COMPILER. HCQC checks whether this definition matches the LANGUAGE field in the program-info.json of the test programs.

  • COMPILER

    This field defines the name of the compiler to be investigated. The string is used for selecting the Python class in hcqc/command/config.py. Currently, only GCC and ClangLLVM are supported. To introduce a new compiler name, you need to add definitions for it in the script file hcqc/command/config.py (see How to Add New Architectures or Compilers).

  • COMMAND

    This field defines the full path name of the compiler COMPILER to be investigated.

  • VERSION

    This field defines the version number of the compiler COMPILER to be investigated. HCQC checks whether this definition matches the output of COMMAND executed with the --version option.

  • OPT_FLAGS

    This field defines the optimization options passed to the compiler COMPILER to be investigated. HCQC treats the pair of a compiler and its optimization options as the unit to be investigated.

  • ASM_FLAGS

    This field defines the options for generating assembly code with the compiler COMPILER to be investigated. Because HCQC uses assembly code with detailed information added, the option -S alone may not be enough.

  • FLAG_DB

    This field defines the flag variables used in program information files. Compilers often have different options for the same feature. Program information files therefore use flag variables in their definitions, and HCQC replaces those flag variables using the definition of the field FLAG_DB before executing the compiler. If the target compiler has no option for a particular feature, you can specify an empty string for the flag variable as follows:

    "FLAG_DB" : [["?COMPILER_RARE_FLAG", ""], ...]
    

By executing HCQC with option --v, you can check what kind of commands are actually executed for the processing of the test program using the specific configuration file.

How to Add New Test Programs

In the following, it is assumed that the name of the newly added test program is newtest. In this case, it is necessary to create a new directory

${INSTALL_DIRECTORY}/hcqc/test-program/newtest

under the directory

${INSTALL_DIRECTORY}/hcqc/test-program

This new directory should have at least the following three files:

  • program-info.json

    This file defines how to compile, execute, and check the result of the test program. See below for details of this definition.

  • a source file of kernel part of the test program

    In order to prevent the function from disappearing due to interprocedural optimizations performed in the file, the kernel part and the main part of the test program should be divided into different files.

  • a source file of main part of the test program

    The main part of the test program should execute the kernel function in the kernel file and verify the result. If there is no output result in the program, the main part should report the result status with an exit code.

Both the compiler command and the generated executable file are executed in directory

${INSTALL_DIRECTORY}/hcqc/test-program/newtest

Therefore, header files and other files in this directory can be referenced from the test program by relative path.

The file program-info.json needs to define the following fields:

  • LANGUAGE

    This field defines the programming language in which the test program is written. HCQC checks whether this definition matches the LANGUAGE field in the configuration file.

  • MAIN_FLAGS

    This field defines the options for compiling the main file by the compiler to be investigated.

  • KERNEL_FLAGS

    This field defines the options for compiling the kernel file by the compiler to be investigated. This field should not contain optimization options; those belong in the configuration file.

  • LINK_FLAGS

    This field defines the options for building the executable file by the compiler to be investigated.

  • LIB_LIST

    This field defines the library options for building the executable file by the compiler to be investigated.

  • MAIN_FILENAME

    This field defines the file name of the main part.

  • KERNEL_FILENAME

    This field defines the file name of the kernel part.

  • KERNEL_FUNCTION_NAME

    This field defines the kernel function name in the kernel file. This kernel function should be called from the main file. There is no problem if other functions in the kernel file are removed by some optimizations.

  • INPUT

    This field defines how to handle the input data for the generated executable file.

    • [ "STDIN", INPUT_FILENAME ]

      This description means the generated executable file is executed by inputting the data of the file INPUT_FILENAME to the standard input.

    • [ "FILE", INPUT_FILENAME ]

      This description means the generated executable file is executed by specifying the input file name INPUT_FILENAME on the command line.

    • [ "NONE", "NONE" ]

      This description means the generated executable file doesn't use input data for its execution.

  • OUTPUT

    This field defines how to handle the output data for the generated executable file.

    • [ "STDOUT", OUTPUT_FILENAME ]

      This description means the generated executable file writes the result data to the standard output. HCQC verifies the output by comparing it with the content of the file OUTPUT_FILENAME.

    • [ "FILE", OUTPUT_FILENAME ]

      This description means the generated executable file writes the result data to the output file OUTPUT_FILENAME. HCQC verifies the output by comparing it with the content of the answer file OUTPUT_FILENAME.

    • [ "NONE", "NONE" ]

      This description means the generated executable file doesn't generate output during its execution. The execution result is verified only with the exit code.

If the test program uses the INPUT_FILENAME file or the OUTPUT_FILENAME file, they should be placed under the test program directory, for example:

${INSTALL_DIRECTORY}/hcqc/test-program/newtest

When defining compiler options in the above fields, the following rules should be followed:

  • Those fields should not contain optimization options; those belong in the configuration file.
  • Those fields should contain only compiler independent options. If compiler-dependent options are required, flag variables should be used.

By executing HCQC with option --v, you can check what kind of commands are actually executed for the processing of the test program.
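Before running HCQC on a new test program, the rules above can be checked with a small stand-alone script. The following is a hypothetical pre-flight check, not part of HCQC; in particular, treating every option that starts with "-O" as an optimization option is an assumption that fits GCC and Clang.

```python
REQUIRED_FIELDS = [
    "LANGUAGE", "MAIN_FLAGS", "KERNEL_FLAGS", "LINK_FLAGS", "LIB_LIST",
    "MAIN_FILENAME", "KERNEL_FILENAME", "KERNEL_FUNCTION_NAME",
    "INPUT", "OUTPUT",
]
ALLOWED_KINDS = {
    "INPUT": {"STDIN", "FILE", "NONE"},
    "OUTPUT": {"STDOUT", "FILE", "NONE"},
}
FLAG_FIELDS = ["MAIN_FLAGS", "KERNEL_FLAGS", "LINK_FLAGS"]


def check_program_info(info):
    """Return a list of problems; an empty list means none were found."""
    problems = [f"missing field {name}"
                for name in REQUIRED_FIELDS if name not in info]
    for field, kinds in ALLOWED_KINDS.items():
        spec = info.get(field)
        if spec is not None and (len(spec) != 2 or spec[0] not in kinds):
            problems.append(f"{field} must be a pair starting with one of {sorted(kinds)}")
    for field in FLAG_FIELDS:
        for flag in info.get(field, []):
            if flag.startswith("-O"):  # assumed spelling of optimization options
                problems.append(f"{field} contains optimization option {flag}")
    return problems
```

The dictionary passed to check_program_info would typically come from json.load on the program-info.json file of the new test program.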

How to Add New Metric Programs

In the following, it is assumed that the name of the newly created metric program is NEWMETRIC. In this case, it is necessary to create a new directory

${INSTALL_DIRECTORY}/hcqc/command/metric/NEWMETRIC

under the directory

${INSTALL_DIRECTORY}/hcqc/command/metric

A new Python script file

${INSTALL_DIRECTORY}/hcqc/command/metric/NEWMETRIC/NEWMETRIC000.py

needs to be added under

${INSTALL_DIRECTORY}/hcqc/command/metric/NEWMETRIC

The file NEWMETRIC000.py needs to define a new class which is a subclass of the class driver.MetricWorker. The definition of the class driver.MetricWorker exists in the following file:

${INSTALL_DIRECTORY}/hcqc/command/driver.py

In the following, let XxxMetricWorker be the new class name. This new class needs to define the following methods. In these methods, the argument target_config is an instance of the class config.Config, which holds the information from the configuration file specified on the command line of the command ./command/hcqc.

  • match_p(self, target_config, test_name)

    This method decides whether or not to use this script NEWMETRIC000.py for the specified configuration target_config and the test program name test_name. If the result of this method is True, HCQC uses this script file. If the result is False, HCQC tries to use another script file.

  • set_up_before_getting_data(self, target_config, bb_list)

    This method defines the operation to be performed before the metric program returns the result data. bb_list represents a list of basic blocks of the control flow graph. Each element of bb_list is an instance of the class cfg.BasicBlock, defined in the file

    ${INSTALL_DIRECTORY}/hcqc/command/cfg.py
    

    Since the object of each basic block holds the information of the kernel part of the test program, it can be used for data analysis and collection.

  • get_column_name_list(self)

    This method returns a list of column names in the table of the metric program's result. The implementation of this method depends on the metric program. For example, this method of the metric program kind returns the following result.

    [ 'memory', 'branch', 'other']
    

    This method in the metric program op returns a list of the mnemonics of the instructions contained in the kernel function in the test program. Therefore, the metric program op needs to create this list. The method set_up_before_getting_data in the metric program op implements the work.

  • get_data_list(self, target_config, bb)

    The row of the result table of the metric program corresponds to each basic block of the control flow graph. This method returns a list of data corresponding to the basic block bb. The length of the resulting data list must be the same as the length of the list of column names returned by the method get_column_name_list.

  • get_summary_list(self, target_config)

    The last row of the result table of the metric program represents summary data for each column. This method returns a list of summary data in the last row. The length of the resulting data list must be the same as the length of the list of column names returned by the method get_column_name_list.

Every data element in the lists returned by get_data_list and get_summary_list needs to be a string.
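The five methods above can be put together as in the following minimal sketch. To keep it runnable on its own, `MetricWorker` and `BasicBlock` are stand-ins for driver.MetricWorker and cfg.BasicBlock, and the `ops` attribute of a basic block is hypothetical. A real script would import driver and subclass driver.MetricWorker instead. The metric itself (instruction count per basic block) is only an illustration.

```python
class MetricWorker:
    """Stand-in for driver.MetricWorker so that the sketch runs alone."""


class BasicBlock:
    """Stand-in for cfg.BasicBlock; the 'ops' attribute is hypothetical."""

    def __init__(self, ops):
        self.ops = ops


class XxxMetricWorker(MetricWorker):
    """Illustrative metric: number of instructions per basic block."""

    def match_p(self, target_config, test_name):
        # Accept every configuration and every test program.
        return True

    def set_up_before_getting_data(self, target_config, bb_list):
        # Precompute the value needed for the summary row.
        self.total = sum(len(bb.ops) for bb in bb_list)

    def get_column_name_list(self):
        return ['instructions']

    def get_data_list(self, target_config, bb):
        # One string per column, in the order of get_column_name_list.
        return [str(len(bb.ops))]

    def get_summary_list(self, target_config):
        return [str(self.total)]


blocks = [BasicBlock(['ldr', 'add']), BasicBlock(['str', 'b', 'ret'])]
worker = XxxMetricWorker()
worker.set_up_before_getting_data(None, blocks)
columns = worker.get_column_name_list()
rows = [worker.get_data_list(None, bb) for bb in blocks]
summary = worker.get_summary_list(None)
```

Each row and the summary have the same length as the list of column names, and every element is a string.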

First, HCQC creates a control flow graph of the test program. Then HCQC tries executing each script in order of name from the directory of the specified metric program name. If the method match_p of one script returns True, HCQC does not execute subsequent scripts.
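The selection rule can be sketched as follows. This is an illustration under assumptions, not HCQC's code: `select_worker`, the two worker classes, and the script name NEWMETRIC999 are hypothetical, and "order of name" is taken to mean plain lexicographic sorting.

```python
class OnlyForStream:
    """Hypothetical worker that handles one test program only."""

    def match_p(self, target_config, test_name):
        return test_name == 'stream'


class Fallback:
    """Hypothetical worker that accepts anything."""

    def match_p(self, target_config, test_name):
        return True


def select_worker(workers_by_script, target_config, test_name):
    """Return the first (script name, worker) whose match_p is True."""
    for script_name in sorted(workers_by_script):
        worker = workers_by_script[script_name]
        if worker.match_p(target_config, test_name):
            return script_name, worker  # later scripts are not tried
    return None, None


workers = {'NEWMETRIC999': Fallback(), 'NEWMETRIC000': OnlyForStream()}
chosen_for_stream = select_worker(workers, None, 'stream')[0]
chosen_for_other = select_worker(workers, None, 'other')[0]
```

Under this reading, a specialized script should get a name that sorts before the general one, for example by using a smaller numeric suffix.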

The Python script

${INSTALL_DIRECTORY}/hcqc/command/test-metric.py

is a program for testing the execution of any metric program. For example, you can test NEWMETRIC000.py as follows:

% cd ${INSTALL_DIRECTORY}/hcqc/command
% python3 test-metric.py metric.NEWMETRIC.NEWMETRIC000 \
          aarch64 ClangLLVM 4.0.1 /tmp/AsmByClangLLVM.s kernel_f /tmp/RESULT.json

At the moment, metric programs obtain the necessary information from the generated assembly code. If the assembly code alone is insufficient for a new metric program, the compiler needs to be modified to output the necessary information.

How to Add New Architectures

For introducing a new architecture for investigating the quality of compilers, it is necessary to create a class representing the architecture in the Python script file

${INSTALL_DIRECTORY}/hcqc/command/config.py

In the following, it is assumed that the name of the newly added architecture is myarch. This name is used in the ARCH field of the configuration file and must match the name returned by the command uname -m. This rule prevents a configuration file from being used in the wrong environment.

A class representing a new architecture needs to be created as a subclass of the Config class. The class name must be formed by adding C_ at the beginning of the architecture name and __ at the end. For example, if the name of the newly added architecture is myarch, the following class definition is required.

class C_myarch__(Config):

With this rule, HCQC automatically detects its class definition from the information in the ARCH field in the configuration file. The class Config defines the following methods.

  • function_entry_p(self, name, line)

    This method determines whether any line line of assembly code is the entry to the kernel function. name is the name of the kernel function.

  • function_exit_p(self, name, line)

    This method determines whether any line line of assembly code is the exit of the kernel function. name is the name of the kernel function.

  • bb_label(self, line)

    This method determines whether any line line of assembly code represents the entry label of the basic block.

For these methods, if another definition is needed for the newly added architecture myarch, the class C_myarch__ can override those definitions.
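As an illustration of such overrides, the following sketch redefines the three methods for myarch. Everything about the assembly syntax is an assumption made for the example: the function entry is a label line `name:`, the exit is a `.size name, ...` directive, and basic block labels start with `.L`. The sketch also assumes that each method returns a boolean. `Config` is a stand-in for the class in config.py.

```python
class Config:
    """Stand-in for the Config class in config.py."""


class C_myarch__(Config):
    def function_entry_p(self, name, line):
        # Assumed syntax: the kernel function starts at the label 'name:'.
        return line.strip() == name + ':'

    def function_exit_p(self, name, line):
        # Assumed syntax: the function ends at '.size name, .-name'.
        fields = line.replace(',', ' ').split()
        return len(fields) >= 2 and fields[0] == '.size' and fields[1] == name

    def bb_label(self, line):
        # Assumed syntax: local labels look like '.L3:'.
        text = line.strip()
        return text.startswith('.L') and text.endswith(':')


asm = ['kernel_f:', '.L3:', '\tadd\tx0, x0, 1', '\t.size\tkernel_f, .-kernel_f']
arch = C_myarch__()
entries = [arch.function_entry_p('kernel_f', line) for line in asm]
exits = [arch.function_exit_p('kernel_f', line) for line in asm]
labels = [arch.bb_label(line) for line in asm]
```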

  • get_asm_comment(self, line)

    This method returns the string of the comment if any line line of assembly code contains the comment of the assembly code. Otherwise, it returns None.

  • bb_branch(self, line)

    If any line line of assembly contains a control transfer instruction, this method returns the mnemonic of the instruction and the label of the control transfer destination(if any) as a pair. Otherwise, it returns None.

  • call_p(self, branch_op, branch_target)

    This method decides whether the mnemonic branch_op and the control transfer destination label branch_target(if any) represent a function call instruction.

  • tail_call_p(self, branch_op, branch_target)

    This method decides whether the mnemonic branch_op and the control transfer destination label branch_target(if any) represent a tail function call instruction.

  • branch_by_register_p(self, branch_op, branch_target)

    This method decides whether the mnemonic branch_op and the control transfer destination label branch_target(if any) represent a branch by the value of a register. An instruction which branches by the value of a register is either a [tail] call by a function pointer or a table branch.

  • table_branch_p(self, branch_op, branch_target, table_branch_label, line_list)

    This method decides whether the mnemonic branch_op and the control transfer destination branch_target(if any) represent a table branch instruction. The table_branch_label represents a label for table branch(if any) included in the basic block currently being processed, and line_list represents a list of the lines in the basic block.

  • get_table_branch_prologue_number(self)

    This method returns the number of lines(or states) necessary for detecting tables for table branches from the assembly code.

  • trace_table_branch_prologue(self, region_status, line)

    This method represents a process for detecting tables for table branches. It determines whether to transition to the next state of the process when it reaches the line line when the current state number is region_status. When transitioning to the next state, this method returns (True, label) as a result. Here, the label represents the head label of the table for table branches that the line line contains. If the line line does not contain the label, the label is None. If this method does not transition to the next state, it returns (False, None) as a result.

  • get_table_branch_content(self, line)

    If any line line is an element of the table of the table branch, this method returns the label of the destination which the line line includes. Otherwise, this method returns None.

  • fall_through_p(self, branch_op)

    This method decides whether the mnemonic branch_op of control transfer instructions falls through the next basic block.

  • op(self, line)

    If any line line of assembly code is an instruction, then this method returns its mnemonic. Otherwise, it returns None.

  • load_op_p(self, op)

    This method determines whether the mnemonic op is a memory read instruction.

  • store_op_p(self, op)

    This method determines whether the mnemonic op is a memory write instruction.

  • control_transfer_op_p(self, op)

    This method determines whether the mnemonic op is a control transfer instruction.
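A few of the instruction-level methods above could look like the following sketch. It assumes an AArch64-like syntax for myarch (comments start with `//`, directives start with `.`, labels end with `:`), and the mnemonic sets are deliberately small and illustrative. `Config` is again a stand-in for the class in config.py, and the remaining methods are omitted.

```python
class Config:
    """Stand-in for the Config class in config.py."""


class C_myarch__(Config):
    # Illustrative mnemonic sets, not a complete instruction list.
    LOAD_OPS = {'ldr', 'ldp'}
    STORE_OPS = {'str', 'stp'}
    BRANCH_OPS = {'b', 'bl', 'cbz', 'ret'}

    def get_asm_comment(self, line):
        index = line.find('//')
        return line[index + 2:].strip() if index >= 0 else None

    def op(self, line):
        code = line.split('//', 1)[0].strip()
        if not code or code.startswith('.') or code.endswith(':'):
            return None  # blank line, directive, or label
        return code.split()[0]

    def load_op_p(self, op):
        return op in self.LOAD_OPS

    def store_op_p(self, op):
        return op in self.STORE_OPS

    def control_transfer_op_p(self, op):
        # 'b.ne', 'b.eq', ... are treated as conditional branches.
        return op in self.BRANCH_OPS or op.startswith('b.')

    def bb_branch(self, line):
        op = self.op(line)
        if op is None or not self.control_transfer_op_p(op):
            return None
        operands = line.split('//', 1)[0].split()[1:]
        # The destination label, if any, is assumed to be the last operand.
        return (op, operands[-1] if operands else None)


arch = C_myarch__()
load_line = '\tldr\tx1, [x0]  // load a'
```

For example, `arch.op(load_line)` yields the mnemonic of the load instruction, while `arch.bb_branch(load_line)` yields None because the line holds no control transfer instruction.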

Defining a new architecture alone is meaningless. To use that definition, you need to define a compiler whose quality will be investigated on the newly added architecture.

How to Add New Compilers

To introduce a new compiler whose quality is to be investigated, it is necessary to create a class representing the compiler in the Python script file

${INSTALL_DIRECTORY}/hcqc/command/config.py

HCQC treats the compiler and the architecture that runs the compiler as a pair. Therefore, if there is no definition of the architecture to run the new compiler, it is necessary to define the architecture first. In the following, it is assumed that the name of the newly added compiler is Foo and the name of the architecture that runs the compiler Foo is myarch.

A class representing a new compiler needs to be created as a subclass of the class representing the architecture to run the compiler. Also, the class name must be created by appending the name of the compiler after the class name of the architecture. For example, if the name of a newly added compiler is Foo, the following class definition is required.

class C_myarch__Foo(C_myarch__):

This rule is for automatically detecting the class definition from the information in the ARCH field and COMPILER field in the configuration file. If the compiler Foo needs to change the behavior of the method defined in the class C_myarch__ or the class Config, the class C_myarch__Foo can override those definitions. Since the name of the compiler is used as a Python class name, the name needs to be created only from characters that can be used for Python identifiers.
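The naming rules for architecture and compiler classes suggest a lookup along the following lines. This is a sketch of the idea only: the function `find_config_class` and the explicit `CLASSES` dictionary are hypothetical, and how HCQC actually resolves the class is not shown here. `Config` is a stand-in for the class in config.py.

```python
class Config:
    """Stand-in for the Config class in config.py."""


class C_myarch__(Config):
    """Architecture class: C_ + architecture name + __."""


class C_myarch__Foo(C_myarch__):
    """Compiler class: architecture class name + compiler name."""


# Hypothetical registry mapping class names to class objects.
CLASSES = {cls.__name__: cls for cls in (C_myarch__, C_myarch__Foo)}


def find_config_class(arch, compiler, classes):
    """Build the class name from the ARCH and COMPILER fields and look it up."""
    class_name = 'C_' + arch + '__' + compiler
    if not class_name.isidentifier():
        # The compiler name must consist of valid identifier characters.
        raise ValueError('not a valid class name: ' + class_name)
    if class_name not in classes:
        raise LookupError('no class defined for: ' + class_name)
    return classes[class_name]


found = find_config_class('myarch', 'Foo', CLASSES)
```

A compiler name such as `my-cc` would be rejected in this sketch, because `C_myarch__my-cc` is not a valid Python identifier.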

You can test the new definition of the compiler by creating a control flow graph using the assembly code which the compiler generated. The Python script

${INSTALL_DIRECTORY}/hcqc/command/test-cfg.py

is a program for testing generating control flow graphs. For example, you can generate a control flow graph for the assembly code as follows:

% cd ${INSTALL_DIRECTORY}/hcqc/command
% python3 test-cfg.py aarch64 ClangLLVM 4.0.1 /tmp/AsmByClangLLVM.s kernel_f

Future Work

  • Add support for the Scalable Vector Extension(SVE) of AArch64 if it becomes available for GCC or Clang/LLVM.