Commit b13b857

AOCL-LibMem: AMD-optimized memory/string functions
Details:
* Initial Public Version of AOCL LibMem Library
* This corresponds to AOCL 4.1 Release
* Supports memcpy, mempcpy, memmove, memset, memcmp and strcpy
0 parents  commit b13b857

75 files changed: +10584, -0 lines changed

BUILD_RUN.md

+134
@@ -0,0 +1,134 @@
# Build & Run Guide for **_AOCL-LibMem_**

## Requirements
* **cmake 3.11**
* **python 3.10**
* **gcc 12.2.0**
* **aocc 4.0**

## Build Procedure:
### Shared Library:
```sh
$ mkdir build
$ cd build
# Configure for GCC build (run one of the following configure commands)
# Default Native Build
$ cmake -D CMAKE_C_COMPILER=gcc ../aocl-libmem
# Cross-compiling an AVX2 binary on an AVX512 machine
$ cmake -D CMAKE_C_COMPILER=gcc -D ALMEM_ARCH=avx2 ../aocl-libmem
# Cross-compiling an AVX512 binary on an AVX2 machine
$ cmake -D CMAKE_C_COMPILER=gcc -D ALMEM_ARCH=avx512 ../aocl-libmem
# Enabling Tunable Parameters
$ cmake -D CMAKE_C_COMPILER=gcc -D ENABLE_TUNABLES=Y ../aocl-libmem
# Configure for AOCC (Clang) build (run one of the following configure commands)
# Default Native Build
$ cmake -D CMAKE_C_COMPILER=clang ../aocl-libmem
# Cross-compiling an AVX2 binary on an AVX512 machine
$ cmake -D CMAKE_C_COMPILER=clang -D ALMEM_ARCH=avx2 ../aocl-libmem
# Cross-compiling an AVX512 binary on an AVX2 machine
$ cmake -D CMAKE_C_COMPILER=clang -D ALMEM_ARCH=avx512 ../aocl-libmem
# Enabling Tunable Parameters
$ cmake -D CMAKE_C_COMPILER=clang -D ENABLE_TUNABLES=Y ../aocl-libmem
# Build
$ cmake --build .
# Install
$ make install
```

The shared library file 'libaocl-libmem.so' is generated under the 'build/lib/shared/' path.


### Static Library:
```sh
$ mkdir build
$ cd build
# Configure for GCC build (run one of the following configure commands)
# Default Native Build
$ cmake -D CMAKE_C_COMPILER=gcc -D BUILD_SHARED_LIBS=N ../aocl-libmem
# Cross-compiling an AVX2 binary on an AVX512 machine
$ cmake -D CMAKE_C_COMPILER=gcc -D ALMEM_ARCH=avx2 -D BUILD_SHARED_LIBS=N ../aocl-libmem
# Cross-compiling an AVX512 binary on an AVX2 machine
$ cmake -D CMAKE_C_COMPILER=gcc -D ALMEM_ARCH=avx512 -D BUILD_SHARED_LIBS=N ../aocl-libmem
# Enabling Tunable Parameters
$ cmake -D CMAKE_C_COMPILER=gcc -D ENABLE_TUNABLES=Y -D BUILD_SHARED_LIBS=N ../aocl-libmem
# Configure for AOCC (Clang) build (run one of the following configure commands)
# Default Native Build
$ cmake -D CMAKE_C_COMPILER=clang -D BUILD_SHARED_LIBS=N ../aocl-libmem
# Cross-compiling an AVX2 binary on an AVX512 machine
$ cmake -D CMAKE_C_COMPILER=clang -D ALMEM_ARCH=avx2 -D BUILD_SHARED_LIBS=N ../aocl-libmem
# Cross-compiling an AVX512 binary on an AVX2 machine
$ cmake -D CMAKE_C_COMPILER=clang -D ALMEM_ARCH=avx512 -D BUILD_SHARED_LIBS=N ../aocl-libmem
# Enabling Tunable Parameters
$ cmake -D CMAKE_C_COMPILER=clang -D ENABLE_TUNABLES=Y -D BUILD_SHARED_LIBS=N ../aocl-libmem
# Build
$ cmake --build .
# Install
$ make install
```

The static library file 'libaocl-libmem.a' is generated under the 'build/lib/static' path.

## Debug Build:
To enable logging, configure the build as follows:
```sh
$ cmake -D ENABLE_LOGGING=Y ../aocl-libmem
```
Logs will be stored in the `"/tmp/libmem.log"` file.

To enable debug-level logs, uncomment the line below in the "CMakeLists.txt" file in the root directory.
_debugging logs_: `add_definitions(-DLOG_LEVEL=4)`

## Running application:
``Run the application by preloading the shared library 'libaocl-libmem.so' generated by the above build procedure.``
```sh
$ LD_PRELOAD=<path to build/lib/shared/libaocl-libmem.so> <executable> <params>
```
* **`WARNING: Do not load or run the AVX512 library on a non-AVX512 machine. Running AVX512 code on a non-AVX512 machine will crash the application (invalid instructions).`**

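
The warning above can be checked programmatically before preloading. The following is a minimal sketch, not part of AOCL-LibMem: it relies on the GCC/Clang builtin `__builtin_cpu_supports`, and it assumes that the `avx512f` feature bit is an adequate proxy for "AVX512 machine". The library may depend on further AVX-512 subsets that this sketch does not test. The function names `cpu_has_avx512f` and `pick_libmem_arch` are hypothetical.

```c
#include <assert.h>

/* Returns 1 if the running CPU reports AVX-512 Foundation support,
 * 0 otherwise. Assumption: "avx512f" stands in for whatever AVX-512
 * subsets the library actually uses. */
int cpu_has_avx512f(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();                      /* GCC/Clang builtin */
    return __builtin_cpu_supports("avx512f") ? 1 : 0;
#else
    return 0;  /* non-x86 build or unknown compiler: treat as unsupported */
#endif
}

/* Hypothetical helper: names the ALMEM_ARCH build that is safe to
 * preload on the current machine. */
const char *pick_libmem_arch(void)
{
    return cpu_has_avx512f() ? "avx512" : "avx2";
}
```

A launcher could call `pick_libmem_arch()` and refuse to preload an `avx512` build when it returns `"avx2"`.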
## User Config:
### 1. Default State Run:
``The library chooses the best-fit implementation for the underlying Zen microarchitecture.``


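
In the default state no environment variables beyond `LD_PRELOAD` are needed. As an illustrative sketch (not shipped with the library), a routine like the one below can serve as a quick sanity check that the preloaded functions behave as the C standard specifies. The name `check_mem_functions` is hypothetical, and `mempcpy` is left out because it is a GNU extension.

```c
#include <assert.h>
#include <string.h>

/* Returns 0 when memset, memcpy, memcmp, strcpy and memmove behave as
 * the C standard specifies, or the number of the first failing step. */
int check_mem_functions(void)
{
    char src[64], dst[64];

    memset(src, 'A', sizeof src);        /* fill the source with 'A' */
    src[sizeof src - 1] = '\0';

    memcpy(dst, src, sizeof src);        /* non-overlapping copy */
    if (memcmp(dst, src, sizeof src) != 0) return 1;

    strcpy(dst, "libmem");               /* copies the terminator too */
    if (memcmp(dst, "libmem", 7) != 0) return 2;

    memmove(dst + 2, dst, 7);            /* overlapping copy */
    if (memcmp(dst, "lilibmem", 9) != 0) return 3;

    return 0;
}
```

To try it, call `check_mem_functions()` from a `main` function and run the resulting executable with `LD_PRELOAD` as shown above. Compilers may expand small fixed-size calls inline, so building the test with `-fno-builtin` (GCC/Clang) makes it more likely that the calls actually reach the preloaded library.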
### 2. Tunable State Run:

_There are two tunables that will be parsed by libmem._
* **`LIBMEM_OPERATION`** :- instruction based on alignment and cacheability
* **`LIBMEM_THRESHOLD`** :- the threshold for ERMS and Non-Temporal instructions

The library will choose the implementation based on the tuned parameter at run time.

#### 2.1. _LIBMEM_OPERATION_ :
**Setting this tunable lets you choose an implementation, which is a combination of move instructions and the alignment of the source and destination addresses.**

**LIBMEM_OPERATION** format: **`<operations>,<source_alignment>,<destination_alignment>`**

##### Valid options:
* `<operations> = [avx2|avx512|erms]`
* `<source_alignment> = [b|w|d|q|x|y|n]`
* `<destination_alignment> = [b|w|d|q|x|y|n]`

e.g.: To use only avx2-based move operations with both source and destination addresses unaligned:
```sh
LD_PRELOAD=<build/lib/shared/libaocl-libmem.so> LIBMEM_OPERATION=avx2,b,b <executable>
```

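
To make the `LIBMEM_OPERATION` format concrete, here is a minimal validation sketch. It is not the library's parser: the type `libmem_operation` and the function `parse_libmem_operation` are hypothetical names, and the only rules it encodes are the three comma-separated fields and the valid options listed above.

```c
#include <assert.h>
#include <string.h>

/* Parsed form of "<operations>,<source_alignment>,<destination_alignment>".
 * Type and field names are hypothetical, chosen for this sketch only. */
typedef struct {
    char op[8];      /* "avx2", "avx512" or "erms" */
    char src_align;  /* one of b, w, d, q, x, y, n */
    char dst_align;  /* one of b, w, d, q, x, y, n */
} libmem_operation;

static int is_valid_alignment(char c)
{
    return c != '\0' && strchr("bwdqxyn", c) != NULL;
}

/* Returns 0 and fills *out on success, -1 if the string is malformed. */
int parse_libmem_operation(const char *s, libmem_operation *out)
{
    static const char *const ops[] = { "avx2", "avx512", "erms" };
    const size_t n_ops = sizeof ops / sizeof ops[0];
    const char *comma;
    size_t len, i;

    if (s == NULL || out == NULL) return -1;
    comma = strchr(s, ',');
    if (comma == NULL) return -1;
    len = (size_t)(comma - s);

    /* The first field must match one of the listed operations exactly. */
    for (i = 0; i < n_ops; i++) {
        if (strlen(ops[i]) == len && strncmp(s, ops[i], len) == 0) break;
    }
    if (i == n_ops) return -1;

    /* Expect exactly "<c>,<c>" after the first comma. Short-circuit
     * evaluation stops at the terminator, so no byte past it is read. */
    if (!is_valid_alignment(comma[1]) || comma[2] != ',' ||
        !is_valid_alignment(comma[3]) || comma[4] != '\0') return -1;

    strcpy(out->op, ops[i]);
    out->src_align = comma[1];
    out->dst_align = comma[3];
    return 0;
}
```

With this sketch, `"avx2,b,b"` is accepted, while strings such as `"sse2,b,b"` or `"avx2,b"` are rejected.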
#### 2.2. _LIBMEM_THRESHOLD_ :
**Setting this tunable lets you configure the threshold values for the supported instruction sets.**

**LIBMEM_THRESHOLD** format: **`<repmov_start_threshold>,<repmov_stop_threshold>,<nt_start_threshold>,<nt_stop_threshold>`**

##### Valid options:
* `<repmov_start_threshold> = [0, +ve integers]`
* `<repmov_stop_threshold> = [0, +ve integers, -1]`
* `<nt_start_threshold> = [0, +ve integers]`
* `<nt_stop_threshold> = [0, +ve integers, -1]`

Make sure that the start and stop values of each range are valid.
To set a stop threshold to the maximum length, pass "-1".

e.g.: To use **REP MOVE** instructions for sizes from 1KB to 2KB and non-temporal instructions for sizes of 512KB and above:
```sh
LD_PRELOAD=<build/lib/shared/libaocl-libmem.so> LIBMEM_THRESHOLD=1024,2048,524288,-1 <executable>
```
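
The threshold format can be illustrated with a small sketch. None of it is taken from the library source: the names `libmem_threshold`, `parse_libmem_threshold`, `select_path` and the `copy_path` enum are hypothetical. Two further assumptions are made: both ends of each range are inclusive, and the non-temporal range wins if the two ranges overlap. A stop value of `-1` is mapped to `SIZE_MAX`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the four LIBMEM_THRESHOLD fields. */
typedef struct {
    size_t repmov_start, repmov_stop;
    size_t nt_start, nt_stop;
} libmem_threshold;

/* Converts one field; -1 means "maximum length" only where allowed. */
static int to_size(long long v, int allow_max, size_t *out)
{
    if (v == -1 && allow_max) { *out = SIZE_MAX; return 0; }
    if (v < 0 || (unsigned long long)v > SIZE_MAX) return -1;
    *out = (size_t)v;
    return 0;
}

/* Returns 0 on success, -1 on a malformed string or an inverted range. */
int parse_libmem_threshold(const char *s, libmem_threshold *t)
{
    long long v[4];
    char extra;

    if (s == NULL || t == NULL) return -1;
    /* The trailing %c catches garbage: exactly four fields must convert. */
    if (sscanf(s, "%lld,%lld,%lld,%lld%c",
               &v[0], &v[1], &v[2], &v[3], &extra) != 4)
        return -1;
    if (to_size(v[0], 0, &t->repmov_start) || to_size(v[1], 1, &t->repmov_stop) ||
        to_size(v[2], 0, &t->nt_start)     || to_size(v[3], 1, &t->nt_stop))
        return -1;
    if (t->repmov_start > t->repmov_stop || t->nt_start > t->nt_stop)
        return -1;
    return 0;
}

typedef enum { PATH_VECTOR, PATH_REP_MOV, PATH_NON_TEMPORAL } copy_path;

/* Assumed selection rule: inclusive ranges, non-temporal checked first. */
copy_path select_path(const libmem_threshold *t, size_t n)
{
    if (n >= t->nt_start && n <= t->nt_stop) return PATH_NON_TEMPORAL;
    if (n >= t->repmov_start && n <= t->repmov_stop) return PATH_REP_MOV;
    return PATH_VECTOR;
}
```

For the example value `1024,2048,524288,-1`, this sketch selects the REP MOVE path for a 1500-byte copy and the non-temporal path for a 1 MB copy. All other sizes fall through to the vector path.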
**`Refer to the User Guide (docs/User_Guide.md) for detailed tuning of the parameters.`**

CMakeLists.txt

+104
@@ -0,0 +1,104 @@
# Copyright (C) 2022-23 Advanced Micro Devices, Inc. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without modification,
# are permitted provided that the following conditions are met:
# 1. Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
# 3. Neither the name of the copyright holder nor the names of its contributors
# may be used to endorse or promote products derived from this software without
# specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
# INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
# BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
# OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
# WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.

cmake_minimum_required(VERSION 3.10)

# avoid build in root directory
if(${CMAKE_SOURCE_DIR} STREQUAL ${CMAKE_BINARY_DIR})
    message(FATAL_ERROR "In-source build detected!")
endif()


# set the project name and version
set(LIBMEM_VERSION_STRING 4.1.0)

project(aocl-libmem VERSION ${LIBMEM_VERSION_STRING} LANGUAGES C DESCRIPTION
                    "Library of AMD optimized string/memory functions")

string(TIMESTAMP BUILD_DATE "%Y%m%d")

set(LIBMEM_BUILD_VERSION_STR
    "AOCL-LibMem ${LIBMEM_VERSION_STRING} Build ${BUILD_DATE}")

add_definitions(-DLIBMEM_BUILD_VERSION="${LIBMEM_BUILD_VERSION_STR}")

set(DEFAULT_BUILD_TYPE "Release")

option(BUILD_SHARED_LIBS "Build shared libraries" ON)

option(ENABLE_LOGGING "Enable Logger" OFF)

option(ENABLE_TUNABLES "Enable user input" OFF)

if (ENABLE_LOGGING)
    add_definitions(-DENABLE_LOGGER)
    # uncomment the below for debug logs, LOG_LEVEL=DEBUG.
    #add_definitions(-DLOG_LEVEL=4)
endif ()

# NOTE: option() declares a boolean cache entry; a value passed on the
# command line (-D ALMEM_ARCH=avx2|avx512) takes precedence over this default.
option(ALMEM_ARCH "ISA_ARCH_TYPE" ON)

execute_process(COMMAND bash "-c" "lscpu | grep erms"
                RESULT_VARIABLE ERMS_FEATURE OUTPUT_QUIET)

if (NOT ${ERMS_FEATURE})
    #uncomment after adding ERMS support for all funcs
    #add_definitions(-DERMS_FEATURE_ENABLED)
    message("ERMS Feature Enabled on Build machine.")
endif ()

execute_process(COMMAND bash "-c" "lscpu | grep avx512"
                RESULT_VARIABLE AVX512_FEATURE OUTPUT_QUIET)

if (NOT ${AVX512_FEATURE})
    message("AVX512 Feature Enabled on Build machine.")
    if (NOT ${ALMEM_ARCH} STREQUAL "avx2")
        message("Setting Arch to AVX512")
        set(ALMEM_ARCH "avx512")
    endif ()
endif ()

if (ALMEM_ARCH MATCHES "avx512")
    set(AVX512_FEATURE_AVAILABLE true)
    add_definitions(-DAVX512_FEATURE_ENABLED)
    if (${AVX512_FEATURE})
        message("Cross-Compiling for AVX512 Arch...")
    else ()
        message("Native-Compiling for AVX512 Arch...")
    endif ()
else ()
    if (NOT ${AVX512_FEATURE})
        message("Cross-Compiling for AVX2 Arch...")
    else ()
        message("Native-Compiling for AVX2 Arch...")
    endif ()
endif ()

# option for building shared lib
option(BUILD_SHARED_LIBS "Build using shared libraries" ON)

# let the build system know the source directory
add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/src)

file(WRITE ${CMAKE_BINARY_DIR}/version.h ${LIBMEM_BUILD_VERSION_STR})

COPYRIGHT.txt

+153
@@ -0,0 +1,153 @@
(C) 2022-23 Advanced Micro Devices, Inc. All Rights Reserved.

Advanced Micro Devices, Inc.
Software License Agreement

IMPORTANT-READ CAREFULLY: Do not load or use the Software until you have
carefully read and agreed to the following terms and conditions. This is a
legal agreement ("Agreement") between you (either an individual or an entity)
("Licensee") and Advanced Micro Devices, Inc. ("AMD"). If Licensee does not
agree to the terms of this Agreement, do not install or use this software or
any portion thereof. By loading or using the object code version only of the
software obtained herewith, which may include associated install scripts and
online or electronic documentation or any portion thereof, that is made
available by AMD to download from any media ("Software"), Licensee agrees to
all of the terms of this Agreement.

1. LICENSE:

a. Subject to the terms and conditions of this Agreement, AMD grants
Licensee the following non-exclusive, non-transferable, royalty-free,
limited copyright license to download, copy, use, distribute and sublicense
the foregoing rights through multiple tiers of sublicenses the object code
version of the Software and materials associated with this Agreement,
including without limitation printed documentation, (collectively,
"Materials"), provided that Licensee agrees to include all copyright
legends and other legal notices that may appear in the Materials. The
foregoing license is conditioned upon Licensee distributing the object code
version of the Software only and under this software license agreement.
Except for the limited license granted herein, Licensee shall have no other
rights in the Materials, whether express, implied, arising by estoppel or
otherwise.

b. Except as expressly set forth in Section 1(a), Licensee does not have
the right to (i) distribute, rent, lease, sell, sublicense, assign, or
otherwise transfer the Materials, in whole or in part, to third parties for
commercial or for non-commercial use; or (ii) modify, disassemble, reverse
engineer, or decompile the Software, or otherwise reduce any part of the
Software to any human readable form. All rights in and to the Materials
not expressly granted to Licensee in this Agreement are reserved to AMD.

2. FEEDBACK: Licensee may provide AMD feedback, suggestions or opinions as to
the Software, its features, and desired enhancements or changes. If Licensee
provides feedback, suggestions or opinions to AMD regarding any new features,
use, functionality, or change to the Software or any materials related to the
Software, Licensee hereby agrees to grant, and does grant, AMD all rights
needed for AMD to incorporate, modify, distribute, use and commercialize any
new feature, use, functionality, or change at no charge or encumbrance to AMD.
Licensee agrees that AMD may disclose such feedback, suggestions or opinions to
any third party in any manner, and Licensee agrees that AMD has the ability to
sublicense any of the foregoing rights in any feedback, suggestions or opinions
or AMD products or services in any form to any third party without restriction.

3. OWNERSHIP AND COPYRIGHT OF MATERIALS: Licensee agrees that the Materials
are owned by AMD and are protected by United States and foreign intellectual
property laws (e.g. patent and copyright laws) and international treaty
provisions. Licensee will not remove the copyright notice from the Materials.
Licensee agrees to prevent any unauthorized copying of the Materials. All
title and copyrights in and to the Materials, all copies thereof (in whole or
in part, and in any form), and all rights therein shall remain vested in AMD.
Except as expressly provided herein, AMD does not grant any express or implied
right to Licensee under AMD patents, copyrights, trademarks, or trade secret
information.

4. WARRANTY DISCLAIMER: THE MATERIALS ARE PROVIDED "AS IS" WITHOUT ANY EXPRESS
OR IMPLIED WARRANTY OF ANY KIND INCLUDING WARRANTIES OF MERCHANTABILITY,
NONINFRINGEMENT OF THIRD-PARTY INTELLECTUAL PROPERTY, TITLE, OR FITNESS FOR ANY
PARTICULAR PURPOSE, OR THOSE ARISING FROM CUSTOM OF TRADE OR COURSE OF USAGE.
THE ENTIRE RISK ARISING OUT OF USE OR PERFORMANCE OF THE MATERIALS REMAINS WITH
LICENSEE. AMD DOES NOT WARRANT, GUARANTEE, OR MAKE ANY REPRESENTATIONS AS TO
THE CORRECTNESS, ACCURACY, COMPLETENESS, QUALITY, OR RELIABILITY OF THE
MATERIALS.

AMD DOES NOT WARRANT THAT OPERATION OF THE MATERIALS WILL BE UNINTERRUPTED OR
ERROR-FREE. YOU ARE RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING
THE SOFTWARE AND ASSUME ALL RISKS ASSOCIATED WITH THE USE OF THE MATERIALS,
INCLUDING BUT NOT LIMITED TO THE RISKS OF PROGRAM ERRORS, DAMAGE TO OR LOSS OF
DATA, PROGRAMS OR EQUIPMENT, AND UNAVAILABILITY OR INTERRUPTION OF OPERATIONS.
SOME JURISDICTIONS DO NOT ALLOW FOR THE EXCLUSION OR LIMITATION OF IMPLIED
WARRANTIES, SO THE ABOVE LIMITATIONS OR EXCLUSIONS MAY NOT APPLY TO LICENSEE.

5. LIMITATION OF LIABILITY: IN NO EVENT SHALL AMD OR ITS DIRECTORS, OFFICERS,
EMPLOYEES AND AGENTS, ITS SUPPLIERS OR ITS LICENSORS BE LIABLE TO LICENSEE OR
ANY THIRD PARTIES IN RECEIPT OF THE MATERIALS FOR CONSEQUENTIAL, INCIDENTAL,
PUNITIVE OR SPECIAL DAMAGES, INCLUDING, BUT NOT LIMITED TO LOSS OF PROFITS,
BUSINESS INTERRUPTION, OR LOSS OF INFORMATION ARISING OUT OF THE USE OF OR
INABILITY TO USE THE MATERIALS, EVEN IF AMD HAS BEEN ADVISED OF THE POSSIBILITY
OF SUCH DAMAGES. AMD DOES NOT ASSUME ANY RESPONSIBILITY TO SUPPORT OR UPDATE
THE MATERIALS. BY USING THE MATERIALS WITHOUT CHARGE, YOU ACCEPT THIS
ALLOCATION OF RISK. BECAUSE SOME JURSIDICTIONS PROHIBIT THE EXCLUSION OR
LIMITATION OF LIABILITY FOR CONSEQUENTIAL OR INCIDENTAL DAMAGES, THE ABOVE
LIMITATION MAY NOT APPLY TO LICENSEE.

6. U.S. GOVERNMENT RESTRICTED RIGHTS: The Materials are provided with
"RESTRICTED RIGHTS." Use, duplication or disclosure by the Government is
subject to restrictions as set forth in FAR52.227-14 and DFAR252.227-7013, et
seq., or its successor. Use of the Materials by the Government constitutes
acknowledgment of AMD's proprietary rights in them.

7. TERMINATION OF LICENSE: This Agreement will terminate immediately without
notice from AMD or judicial resolution if Licensee fails to comply with any
provisions of this Agreement. Upon termination of this Agreement, Licensee
must delete or destroy all copies of the Materials.

8. SUPPORT. Under this Agreement, AMD is under no obligation to assist in the
use of the Materials, to provide support to licensees of the Materials, or to
provide maintenance, correction, modification, enhancement, or upgrades to the
Materials. If AMD determines, in its sole discretion, to support, maintain,
correct, modify, enhance, or upgrade the Software, such support, maintenance,
correction, modification, enhancement or upgrade shall be considered part of
the Materials, and shall be subject to this Agreement.

9. SURVIVAL: Sections 1(b), 2, 3, 4, 5, 6, and 8 through 14 shall survive any
expiration or termination of this Agreement.

10. APPLICABLE LAWS: Any claim arising under or relating to this Agreement
shall be governed by and construed in accordance with the substantive laws of
the State of California, without regard to principles of conflict of laws.
Each party hereto submits to the jurisdiction of the state and federal courts
of Santa Clara County and the Northern District of California for the purposes
of all legal proceedings arising out of or relating to this Agreement or the
subject matter hereof. Each party waives any objection which it may have to
contest such forum.

11. IMPORT/EXPORT/RE-EXPORT/USE/RELEASE/TRANSFER RESTRICTIONS AND COMPLIANCE
WITH LAWS: Licensee is hereby provided notice, and agrees and acknowledges,
that the Software, its source code, any accompanying media, material or
information, and any product of the foregoing, may be subject to restrictions
on use, release, transfer, importation, exportation and/or re- exportation
under the laws and regulations of the United States or other countries
("Applicable Laws"), which include but are not limited to U.S. export control
laws such as the Export Administration Regulations and national security
controls as defined thereunder, as well as State Department controls under the
U.S. Munitions List. Licensee further agrees that the Software, its source
code, any accompanying media, material or information, and any product of the
foregoing, will not be used, released, transferred, imported, exported and/or
re-exported in any manner prohibited under Applicable Laws, including U.S.
export control laws regarding specifically designated persons, countries and
nationals of countries subject to national security controls as provided in
License Exception TSR of the Export Administration Regulations and any
successor regulations.

12. SEVERABILITY: Should any term of this Agreement be declared void or
unenforceable by any court of competent jurisdiction, such declaration shall
have no effect on the remaining terms hereof.

13. NO WAIVER: The failure of either party to enforce any rights granted
hereunder or to take action against the other party in the event of any breach
hereunder shall not be deemed a waiver by that party as to subsequent
enforcement of rights or subsequent actions in the event of future breaches.

14. ENTIRE AGREEMENT: This Agreement constitutes the entire agreement between
the parties and supersedes any prior or contemporaneous oral or written
agreements with respect to the subject matter of this Agreement.
