Commit

Merge pull request #94 from intel/develop
Develop
chuckyount authored Apr 1, 2018
2 parents cd15da7 + 88ef640 commit 2d6d6c3
Showing 21 changed files with 574 additions and 338 deletions.
14 changes: 8 additions & 6 deletions README.md
@@ -1,22 +1,24 @@
YASK--Yet Another Stencil Kernel: A framework to facilitate exploration of the HPC stencil-performance design space, including optimizations such as
YASK--Yet Another Stencil Kernel: A framework to facilitate exploration of the HPC stencil-performance design space, including optimizations and features such as
* Vector folding,
* Cache blocking,
* Memory layout,
* Loop construction,
* Multi-level OpenMP parallelism,
* Encapsulated memory layout,
* Advanced loop construction,
* Temporal wave-front blocking, and
* MPI halo exchange.

YASK contains a specialized source-to-source translator to convert scalar C++ stencil code to SIMD-optimized code for Intel(R) Xeon Phi(TM) and Intel(R) Xeon(R) processors.
YASK contains a domain-specific compiler to convert scalar C++ stencil code to SIMD-optimized code for Intel(R) Xeon Phi(TM) and Intel(R) Xeon(R) processors.

Supported Platforms
* 64-bit Linux
* Intel(R) Xeon Phi(TM) processor supporting the MIC_AVX512 instruction set.
* Intel(R) Xeon(R) processor supporting the AVX, AVX2, or CORE_AVX512 instruction sets
* Intel(R) Xeon Phi(TM) coprocessor supporting the Knights-Corner instruction set.
* Intel(R) Xeon(R) processor supporting the AVX, AVX2, or CORE_AVX512 instruction sets.
* Intel(R) Xeon Phi(TM) coprocessor supporting the Knights-Corner instruction set (no longer tested).

Pre-requisites:
* Intel(R) C++ compiler (17.0.2 or later recommended),
https://software.intel.com/en-us/intel-parallel-studio-xe.
* GNU C++ compiler, g++ (4.9.0 or later; 6.1.0 or later recommended).
* Intel(R) Software Development Emulator,
https://software.intel.com/en-us/articles/intel-software-development-emulator
(optional: for functional testing if you don't have native ISA support).
Binary file modified docs/YASK-intro.pdf
100644 → 100755
Binary file not shown.
2 changes: 1 addition & 1 deletion src/common/common_utils.cpp
@@ -41,7 +41,7 @@ namespace yask {
// for numbers above 9 (at least up to 99).

// Format: "major.minor.patch".
const string version = "2.03.00";
const string version = "2.04.00";

string yask_get_version_string() {
return version;
63 changes: 53 additions & 10 deletions src/compiler/lib/Eqs.cpp
@@ -334,7 +334,12 @@ namespace yask {
bool same_eq = eq1 == eq2;
bool same_cond = areExprsSame(cond1, cond2);
bool same_og = og1 == og2;


#ifdef DEBUG_DEP
if (!same_eq)
cout << " ...from equation " << eq2->makeQuotedStr() << "...\n";
#endif

// If two different eqs have the same condition, they
// cannot update the same grid.
if (!same_eq && same_cond && same_og) {
@@ -367,8 +372,12 @@ namespace yask {
}

// Save dependency.
if (eq_deps)
if (eq_deps) {
#ifdef DEBUG_DEP
cout << " Exact match found to " << op1->makeQuotedStr() << ".\n";
#endif
(*eq_deps)[cur_step_dep].set_imm_dep_on(eq2, eq1);
}

// Move along to next eq2.
continue;
@@ -422,14 +431,22 @@ namespace yask {
}

// Save dependency.
if (eq_deps)
if (eq_deps) {
#ifdef DEBUG_DEP
cout << " Likely match found to " << op1->makeQuotedStr() << ".\n";
#endif
(*eq_deps)[cur_step_dep].set_imm_dep_on(eq2, eq1);
}

// Move along to next equation.
break;
}
}
}
#ifdef DEBUG_DEP
cout << " No deps found.\n";
#endif

} // for all eqs (eq2).
} // for all eqs (eq1).

@@ -968,9 +985,11 @@ namespace yask {
}

// Divide all equations into eqGroups.
// Only process updates to grids in 'gridRegex'.
// 'targets': string provided by user to specify grouping.
// 'eq_deps': pre-computed dependencies between equations.
void EqGroups::makeEqGroups(Eqs& allEqs,
const string& gridRegex,
const string& targets,
EqDepMap& eq_deps,
ostream& os)
@@ -980,6 +999,7 @@ namespace yask {

// Add each scratch equation to a separate group.
// TODO: Allow multiple scratch eqs in a group with same conds & halos.
// TODO: Only add scratch eqs that are needed by grids in 'gridRegex'.
for (auto eq : allEqs.getEqs()) {

// Get updated grid.
@@ -993,10 +1013,17 @@ namespace yask {
}
}

// Make a regex for the allowed grids.
regex gridx(gridRegex);

// Handle each key-value pair in 'targets' string.
// Key is eq-group name (with possible format strings); value is regex pattern.
ArgParser ap;
ap.parseKeyValuePairs
(targets, [&](const string& key, const string& value) {
(targets, [&](const string& egfmt, const string& pattern) {

// Make a regex for the pattern.
regex patx(pattern);

// Search allEqs for matches to current value.
for (auto eq : allEqs.getEqs()) {
@@ -1006,19 +1033,35 @@ namespace yask {
assert(gp);
string gname = gp->getName();

// Does value appear in the grid name?
size_t np = gname.find(value);
if (np != string::npos) {
// Match to gridx?
if (!regex_search(gname, gridx))
continue;

// Add equation.
addExprToGroup(eq, allEqs.getCond(eq), key, false, eq_deps);
}
// Match to patx?
smatch mr;
if (!regex_search(gname, mr, patx))
continue;

// Substitute special tokens with match.
string egname = mr.format(egfmt);

// Add equation.
addExprToGroup(eq, allEqs.getCond(eq), egname, false, eq_deps);
}
});

// Add all remaining equations.
for (auto eq : allEqs.getEqs()) {

// Get name of updated grid.
auto gp = eq->getGrid();
assert(gp);
string gname = gp->getName();

// Match to gridx?
if (!regex_search(gname, gridx))
continue;

// Add equation.
addExprToGroup(eq, allEqs.getCond(eq), _basename_default, false, eq_deps);
}
1 change: 1 addition & 0 deletions src/compiler/lib/Eqs.hpp
@@ -360,6 +360,7 @@ namespace yask {
// all eqs updating grid names containing 'bar' go in eqGroup2, and
// each remaining eq goes into a separate eqGroup.
void makeEqGroups(Eqs& eqs,
const string& gridRegex,
const string& targets,
EqDepMap& eq_deps,
std::ostream& os);
1 change: 1 addition & 0 deletions src/compiler/lib/Expr.hpp
@@ -39,6 +39,7 @@ IN THE SOFTWARE.
#include <cstdarg>
#include <assert.h>
#include <fstream>
#include <regex>

// Common utilities.
#include "common_utils.hpp"
1 change: 1 addition & 0 deletions src/compiler/lib/Grid.hpp
@@ -303,6 +303,7 @@ namespace yask {
bool _doComb = true; // combine commutative operations.
bool _doOptCluster = true; // apply optimizations also to cluster.
string _eqGroupTargets; // how to group equations.
string _gridRegex; // grids to update.
};

// Stencil dimensions.
3 changes: 2 additions & 1 deletion src/compiler/lib/Soln.cpp
@@ -94,7 +94,8 @@ namespace yask {
// Create equation groups based on dependencies and/or target strings.
_eqGroups.set_basename_default(_settings._eq_group_basename_default);
_eqGroups.set_dims(_dims);
_eqGroups.makeEqGroups(_eqs, _settings._eqGroupTargets, eq_deps, *_dos);
_eqGroups.makeEqGroups(_eqs, _settings._gridRegex,
_settings._eqGroupTargets, eq_deps, *_dos);
_eqGroups.optimizeEqGroups(_settings, "scalar & vector", false, *_dos);

// Make a copy of each equation at each cluster offset.
31 changes: 21 additions & 10 deletions src/compiler/main.cpp
@@ -78,16 +78,25 @@ void usage(const string& cmd) {
" formats with explicit lengths, lengths will be adjusted as needed.\n"
" -cluster <dim>=<size>,...\n"
" Set number of vectors to evaluate in each dimension.\n"
" -eq <name>=<substr>,...\n"
" Put updates to grids containing <substr> in equation-groups with base-name <name>.\n"
" By default, eq-groups are defined as needed based on dependencies with\n"
" -grids <regex>\n"
" Only process updates to grids whose names match <regex>.\n"
" This can be used to generate code for a subset of the stencil equations.\n"
" -eq-groups <name>=<regex>,...\n"
" Put updates to grids matching <regex> in equation-group with base-name <name>.\n"
" By default, eq-groups are created as needed based on dependencies between equations:\n"
" equations that do not depend on each other are grouped together into groups with the\n"
" base-name '" << settings._eq_group_basename_default << "'.\n"
" Eq-groups are created in the order in which they are specified.\n"
" By default, they are created based on the order in which the grids are initialized.\n"
" Each eq-group base-name is appended with a unique index number.\n"
" Example: '-eq a=foo,b=bar' creates one or more eq-groups with base-name 'a'\n"
" containing updates to grids whose name contains 'foo' and one or more eq-groups\n"
" with base-name 'b' containing updates to grids whose name contains 'bar'.\n"
" Each eq-group base-name is appended with a unique index number, so the default group\n"
" names are '" << settings._eq_group_basename_default << "_0', '" <<
settings._eq_group_basename_default << "_1', etc.\n"
" This option allows more control over this grouping.\n"
" Example: \"-eq-groups a=foo,b=b[aeiou]r\" creates one or more eq-groups named 'a_0', 'a_1', etc.\n"
" containing updates to each grid whose name contains 'foo' and one or more eq-groups\n"
" named 'b_0', 'b_1', etc. containing updates to each grid whose name matches 'b[aeiou]r'.\n"
" Standard regex-format tokens in <name> will be replaced based on matches to <regex>.\n"
" Example: \"-eq-groups 'g_$&=b[aeiou]r'\" with grids 'bar_x', 'bar_y', 'ber_x', and 'ber_y'\n"
" would create eq-group 'g_bar_0' for grids 'bar_x' and 'bar_y' and eq-group 'g_ber_0' for\n"
" grids 'ber_x' and 'ber_y' because '$&' is substituted by the string that matches the regex.\n"
" -step-alloc <size>\n"
" Specify the size of the step-dimension memory allocation.\n"
" By default, allocations are calculated automatically for each grid.\n"
@@ -191,7 +200,9 @@ void parseOpts(int argc, const char* argv[])
// options w/a string value.
if (opt == "-stencil")
solutionName = argop;
else if (opt == "-eq")
else if (opt == "-grids")
settings._gridRegex = argop;
else if (opt == "-eq-groups")
settings._eqGroupTargets = argop;
else if (opt == "-fold" || opt == "-cluster") {

16 changes: 8 additions & 8 deletions src/kernel/Makefile
@@ -34,6 +34,7 @@ mpi = 1
real_bytes = 4
radius = 2
ranks = 1
v_args = -b 16 -r 32 -rt 2 -dt 2 -d 63

# Defaults based on stencil type (and arch for some stencils).
ifeq ($(stencil),)
@@ -91,7 +92,6 @@ else ifneq ($(findstring iso3dfd,$(stencil)),)

else ifneq ($(findstring awp,$(stencil)),)
time_alloc = 1
eqs = velocity=vel,stress=str
def_block_args = -b 32
YC_FLAGS += -min-es 1
def_rank_args = -dx 512 -dy 1024 -dz 128 # assume 2 ranks/node in 'x'.
@@ -122,11 +122,10 @@ else ifneq ($(findstring awp,$(stencil)),)

else ifneq ($(findstring ssg,$(stencil)),)
time_alloc = 1
eqs = v_bl=v_bl,v_tr=v_tr,v_tl=v_tl,s_br=s_br,s_bl=s_bl,s_tr=s_tr,s_tl=s_tl

else ifneq ($(findstring fsg,$(stencil)),)
time_alloc = 1
eqs = v_br=v_br,v_bl=v_bl,v_tr=v_tr,v_tl=v_tl,s_br=s_br,s_bl=s_bl,s_tr=s_tr,s_tl=s_tl
eqs = '$$&=[a-z]_[a-z]+' # match 1st and 2nd 'parts' of grid names.
ifeq ($(arch),knl)
omp_region_schedule = guided
def_block_args = -b 16
@@ -328,7 +327,7 @@ MISC_LOOP_CODE ?= $(MISC_LOOP_OUTER_MODS) loop($(MISC_LOOP_ORDER)) \
# Flags passed to stencil compiler.
YC_FLAGS += -stencil $(stencil) -elem-bytes $(real_bytes) -cluster $(cluster) -fold $(fold)
ifneq ($(eqs),)
YC_FLAGS += -eq $(eqs)
YC_FLAGS += -eq-groups $(eqs)
endif
ifneq ($(radius),)
YC_FLAGS += -radius $(radius)
@@ -708,7 +707,7 @@ cxx-yc-api-test-with-exception:

# Run the default YASK compiler and kernel.
yc-and-yk-test: $(YK_EXEC)
$(BIN_DIR)/yask.sh -stencil $(stencil) -arch $(arch) -ranks $(ranks) -v
$(BIN_DIR)/yask.sh -stencil $(stencil) -arch $(arch) -ranks $(ranks) -v $(v_args)

# Generate the code file using the built-in compiler.
code-file: $(YK_CODE_FILE)
@@ -722,7 +721,7 @@ kernel-only:

# Run the YASK kernel test without implicitly using the YASK compiler.
yk-test-no-yc: kernel-only
$(BIN_DIR)/yask.sh -stencil $(stencil) -arch $(arch) -v
$(BIN_DIR)/yask.sh -stencil $(stencil) -arch $(arch) -ranks $(ranks) -v $(v_args)

# NB: set arch var if applicable.
# NB: save some time by using YK_CXXOPT=-O2.
@@ -741,6 +740,7 @@ all-tests:
$(MAKE) clean; $(MAKE) stencil=test_scratch2 fold=x=2,y=2,z=2 yc-and-yk-test
$(MAKE) clean; $(MAKE) stencil=iso3dfd fold=x=4,y=2 yc-and-yk-test
$(MAKE) clean; $(MAKE) stencil=awp_elastic real_bytes=8 yc-and-yk-test
$(MAKE) clean; $(MAKE) stencil=ssg real_bytes=8 yc-and-yk-test
$(MAKE) clean; $(MAKE) stencil=fsg_abc real_bytes=8 yc-and-yk-test
$(MAKE) clean; $(MAKE) stencil=iso3dfd cxx-yk-api-test
$(MAKE) clean; $(MAKE) stencil=iso3dfd py-yk-api-test
@@ -855,11 +855,11 @@ help:
@echo " "
@echo "Example debug builds of kernel cmd-line tool:"
@echo " $(MAKE) clean; $(MAKE) -j stencil=iso3dfd mpi=0 OMPFLAGS='-qopenmp-stubs' YK_CXXOPT='-O0' EXTRA_MACROS='DEBUG'"
@echo " $(MAKE) clean; $(MAKE) -j arch=intel64 stencil=3axis mpi=0 OMPFLAGS='-qopenmp-stubs' YK_CXXOPT='-O0' EXTRA_MACROS='DEBUG'"
@echo " $(MAKE) clean; $(MAKE) -j arch=intel64 stencil=3axis mpi=0 OMPFLAGS='-qopenmp-stubs' YK_CXXOPT='-O0' EXTRA_MACROS='DEBUG TRACE' # TRACE is a useful debug setting!"
@echo " $(MAKE) clean; $(MAKE) -j arch=intel64 stencil=3axis radius=0 fold='x=1,y=1,z=1' mpi=0 YK_CXX=g++ OMPFLAGS='' YK_CXXOPT='-O0' EXTRA_MACROS='DEBUG TRACE TRACE_MEM TRACE_INTRINSICS'"
@echo " "
@echo "Example builds with test runs:"
@echo " $(MAKE) -j all"
@echo " $(MAKE) -j all ranks=2"
@echo " $(MAKE) -j all YK_CXX=g++ YK_CXXOPT=-O2 mpi=0"
@echo " $(MAKE) -j all YK_CXX=mpigxx YK_CXXOPT=-O2 ranks=2"
@echo " $(MAKE) -j all YK_CXX=mpigxx YK_CXXOPT=-O2 ranks=3"
