WIP herbie #1860

Draft
wants to merge 267 commits into
base: main
267 commits
20c5464
disable lit parallelism
sbrantq May 30, 2024
4b800fc
add tests
sbrantq May 30, 2024
17de042
handle constants
sbrantq May 31, 2024
8e98bb3
more test & fast math flags
sbrantq May 31, 2024
4f4106e
cleanup
sbrantq May 31, 2024
c7ad4b3
add working generalized code
sbrantq Jun 6, 2024
6c25238
cleanup
sbrantq Jun 6, 2024
16df413
improve
sbrantq Jun 6, 2024
5de75d8
improve type casting
sbrantq Jun 6, 2024
0b431e4
add call instruction support
sbrantq Jun 8, 2024
c4a3b53
version check for tan
sbrantq Jun 8, 2024
add02ff
fix herbie seed
sbrantq Jun 8, 2024
beef25a
memory management
sbrantq Jun 9, 2024
6d8e9c1
std::vector --> SmallVector
sbrantq Jun 9, 2024
215c766
cleanup with O3 pass?
sbrantq Jun 12, 2024
48e0dcf
sign
sbrantq Jun 12, 2024
c072ac5
add simple phi node test
sbrantq Jun 12, 2024
3763219
unique file names for parallel tests
sbrantq Jun 12, 2024
7efa80d
adjust O3 flag
sbrantq Jun 13, 2024
69eb6d3
saving progress
sbrantq Jun 14, 2024
da12b38
better cleanup
sbrantq Jun 17, 2024
f6dfe4f
merge stuff from main
sbrantq Jun 18, 2024
3272642
demangle log func
sbrantq Jun 18, 2024
dce1297
pass operands to the logger
sbrantq Jun 19, 2024
9f5ed15
improve logging
sbrantq Jun 20, 2024
3aac534
add more complex logger
sbrantq Jun 20, 2024
0fa0343
improve
sbrantq Jun 20, 2024
f3f966d
set debug loc for logger call
sbrantq Jun 20, 2024
5bced48
move logger to separate file
sbrantq Jun 23, 2024
848fd56
improve
sbrantq Jun 23, 2024
d39c34c
fix constants
sbrantq Jun 24, 2024
421c6e4
if expr from Herbie
sbrantq Jun 24, 2024
254175b
herbie properties
sbrantq Jun 24, 2024
f2b156e
improve
sbrantq Jun 24, 2024
dc65ec7
macro
sbrantq Jun 24, 2024
03dc9b8
bool flag for err msg
sbrantq Jun 25, 2024
6bc269a
improve
sbrantq Jun 25, 2024
a21348b
manual arg eval & inf handling
sbrantq Jun 25, 2024
0a2ccbc
cbrt
sbrantq Jun 26, 2024
60d946d
large Herbie constant handling
sbrantq Jun 26, 2024
ca87346
simplify
sbrantq Jun 26, 2024
4259b00
WIP preconditions
sbrantq Jun 26, 2024
cd27b90
fwddiffe -> fwderr
sbrantq Jun 27, 2024
7ec2346
add mode selection
sbrantq Jul 1, 2024
ea59ad1
preprocess orig metadata
sbrantq Jul 1, 2024
8c520e7
neg
sbrantq Jul 1, 2024
11bac95
register fpopt
sbrantq Jul 2, 2024
2af3c11
only preprocess origin metadata to fp inst
sbrantq Jul 2, 2024
6f457fd
ifdef ENZYME_ENABLE_HERBIE
sbrantq Jul 2, 2024
4336ed0
denormals
sbrantq Jul 3, 2024
ecf1aa1
logged bounds for fpconst
sbrantq Jul 3, 2024
32dbbd5
fix bounds parsing
sbrantq Jul 4, 2024
0d005db
preconditions
sbrantq Jul 4, 2024
77f8887
skip declarations
sbrantq Jul 4, 2024
66a7ae3
simplify printed info
sbrantq Jul 4, 2024
7156848
post opt erasure check
sbrantq Jul 4, 2024
1c29d2c
update fpopt tests to include debug flags
sbrantq Jul 4, 2024
294c2a2
print
sbrantq Jul 5, 2024
4b8e17a
add labels to herbie-generated stuff
sbrantq Jul 5, 2024
a7e134c
more tests
sbrantq Jul 6, 2024
9802904
fmuladd
sbrantq Jul 6, 2024
cc53c46
fix test
sbrantq Jul 6, 2024
a5a450c
hypot
sbrantq Jul 6, 2024
0aab938
fix bounds parsing
sbrantq Jul 6, 2024
19443d8
more tests
sbrantq Jul 6, 2024
0cb80da
expm1 + log1p + precond fix
sbrantq Jul 6, 2024
3c16766
fix test
sbrantq Jul 7, 2024
f507934
make cbrt herbiable
sbrantq Jul 7, 2024
1c7cf06
WIP llvm instruction cost model
sbrantq Jul 7, 2024
01d4f8d
subexpr cost estimate
sbrantq Jul 8, 2024
2b14f05
improve
sbrantq Jul 8, 2024
7a4be5e
WIP solver
sbrantq Jul 8, 2024
39973f0
fix
sbrantq Jul 8, 2024
d272fd0
saving progress
sbrantq Jul 11, 2024
1e06089
more herbie knobs
sbrantq Jul 12, 2024
29438c3
reverse mode logging
sbrantq Jul 15, 2024
8e0a0bb
cleanup
sbrantq Jul 16, 2024
23f2a7c
only log grad
sbrantq Jul 16, 2024
0b51aa0
fix test
sbrantq Jul 16, 2024
7544cb2
multiple expr handling
sbrantq Jul 18, 2024
3d015f1
grad log parsing + improvements
sbrantq Jul 19, 2024
69e7bbf
improve
sbrantq Jul 19, 2024
cf89257
Merge branch 'main' of https://github.com/EnzymeAD/Enzyme into herbie
sbrantq Jul 21, 2024
b54886f
improve herbie's build cmd
sbrantq Jul 21, 2024
e9bdadf
TTI costs
sbrantq Jul 23, 2024
9df9349
saving solvers
sbrantq Jul 23, 2024
a2afd2d
fix alternatives
sbrantq Jul 23, 2024
138d8fa
Merge branch 'main' of https://github.com/EnzymeAD/Enzyme into herbie
sbrantq Jul 25, 2024
42ed544
fix dp solver
sbrantq Jul 28, 2024
b16bbcf
update herbie hash
sbrantq Jul 28, 2024
32949f0
update tests
sbrantq Jul 28, 2024
7fc69b9
constantexpr
sbrantq Jul 28, 2024
00259d4
get log func
sbrantq Jul 29, 2024
7fe0519
improve
sbrantq Jul 29, 2024
304a162
separating out value logging
sbrantq Jul 30, 2024
bfa4c61
add unified logger
sbrantq Jul 30, 2024
09bd80f
differential usage for reverse mode value logging
sbrantq Jul 30, 2024
ee85f63
log value before constant inst check
sbrantq Jul 30, 2024
4c1cf63
unified logger
sbrantq Jul 31, 2024
ae4ef9a
solver selection
sbrantq Aug 1, 2024
a18b006
log parsing bug fix
sbrantq Aug 1, 2024
ba4bf0a
log parsing bug fix
sbrantq Aug 1, 2024
9056ed1
WIP typing
sbrantq Aug 2, 2024
9b3a08d
fix f32
sbrantq Aug 3, 2024
afff45f
minor fixes
sbrantq Aug 3, 2024
5e6cff4
use libm func instead
sbrantq Aug 4, 2024
3592826
minor improvements and make fabs non-herbiable
sbrantq Aug 6, 2024
0e7bd4f
fpcc splitting algorithm
sbrantq Aug 6, 2024
748d08a
saving
sbrantq Aug 6, 2024
a577bcb
Merge branch 'main' of https://github.com/EnzymeAD/Enzyme into herbie
sbrantq Aug 16, 2024
9f81169
Merge branch 'main' of https://github.com/EnzymeAD/Enzyme into herbie
sbrantq Aug 25, 2024
cd79402
fix include
sbrantq Aug 25, 2024
7006efb
fix
sbrantq Aug 28, 2024
ddb9307
regex target func spec
sbrantq Aug 30, 2024
3413835
Merge branch 'main' of https://github.com/EnzymeAD/Enzyme into herbie
sbrantq Aug 31, 2024
ffb255d
fix
sbrantq Aug 31, 2024
5c123ed
improve
sbrantq Aug 31, 2024
1d0c686
cleanup
sbrantq Sep 3, 2024
2f876a4
fix tests
sbrantq Sep 3, 2024
8ebd39e
improve
sbrantq Sep 3, 2024
f42e7e8
FP subgraph precision changing
sbrantq Sep 7, 2024
0213868
comments
sbrantq Sep 7, 2024
7dc2be9
improve
sbrantq Sep 8, 2024
1b0a985
rename FPNode::getValue & WIP unified accuracy
sbrantq Sep 8, 2024
64afdbb
RTTI
sbrantq Sep 9, 2024
931f723
improve
sbrantq Sep 15, 2024
0e51d51
shared ptr & WIP golden values
sbrantq Sep 16, 2024
8fe7098
improve
sbrantq Sep 16, 2024
9480b7c
accuracy cost estimator for herbie rewrites
sbrantq Sep 17, 2024
4a80236
random engine
sbrantq Sep 20, 2024
4fa39d1
fix dp solver
sbrantq Sep 20, 2024
bb5b917
guaranteed erasable cost in expressions
sbrantq Sep 20, 2024
4a2f996
fix
sbrantq Sep 20, 2024
59da2ab
WIP PT candidate generation
sbrantq Sep 21, 2024
845f74a
generalize
sbrantq Sep 22, 2024
026e334
use AST nodes in precision tuning
sbrantq Sep 22, 2024
b9b3194
generalized mpfr evaluator
sbrantq Sep 23, 2024
bacca3b
unified accuracy cost done
sbrantq Sep 23, 2024
4a5a621
improve
sbrantq Sep 23, 2024
e7a65fe
fixed accuracy cost for fpcc
sbrantq Sep 23, 2024
ef23ce8
save
sbrantq Sep 29, 2024
df3087a
fix up
sbrantq Sep 29, 2024
498ce49
renaming
sbrantq Sep 29, 2024
d97de6c
custom cost model parsing & disable FP16 for now
sbrantq Sep 29, 2024
0621807
save
sbrantq Sep 30, 2024
834d7c3
generalized dp solver
sbrantq Sep 30, 2024
066ec46
fix up
sbrantq Sep 30, 2024
98d425a
bug fix & ruling out NaNs in acc cost estimation
sbrantq Oct 1, 2024
5371a97
more precisions & fmuladd --> fma
sbrantq Oct 3, 2024
f7ef6d2
cleanup after PT
sbrantq Oct 3, 2024
ec7a55b
custom cost model opcode suffix for fpcasts
sbrantq Oct 3, 2024
4e2fbb4
fix erasable inst check
sbrantq Oct 3, 2024
6953b7a
save
sbrantq Oct 4, 2024
84ab293
Explicit topo sort
sbrantq Oct 4, 2024
829185a
adjusted cost estimation & accuracy estimation bug fix
sbrantq Oct 5, 2024
33cd1b0
Only enable float double conversions
sbrantq Oct 5, 2024
d6dd10d
early pruning flag
sbrantq Oct 6, 2024
aa52ecd
ADAPT-style sensitivity estimation
sbrantq Oct 7, 2024
74dbaff
always fpcast operands first in MPFR evaluator
sbrantq Oct 9, 2024
0ebc1d8
AO/ACC sampled points consistency fix
sbrantq Oct 9, 2024
2c859a4
ponder fast math flags
sbrantq Oct 10, 2024
8309529
bug fix
sbrantq Oct 15, 2024
af67a08
complete PT & improve
sbrantq Oct 16, 2024
7488586
caching for adjusted ACC costs
sbrantq Oct 16, 2024
77992eb
improve
sbrantq Oct 17, 2024
859a954
fix
sbrantq Oct 17, 2024
a9b615e
more PT candidates
sbrantq Oct 29, 2024
653cf32
better temp expr materialization & cost estimation
sbrantq Oct 29, 2024
a558385
solution dominance thresholds
sbrantq Oct 30, 2024
274366c
save
sbrantq Oct 31, 2024
f4965a4
native fp emulator
sbrantq Oct 31, 2024
e381dd2
more options
sbrantq Oct 31, 2024
90cb151
save
sbrantq Nov 2, 2024
f38c611
save
sbrantq Nov 3, 2024
27ed106
print ranges only
sbrantq Nov 3, 2024
e433e5d
fix up
sbrantq Nov 4, 2024
08c4db8
bug fix
sbrantq Nov 4, 2024
46f7d11
improve
sbrantq Nov 4, 2024
67b5944
bug fix
sbrantq Nov 5, 2024
033bbcf
fix
sbrantq Nov 7, 2024
845cbf5
bug fix
sbrantq Nov 7, 2024
60a6ddd
remove duplicated expr
sbrantq Nov 8, 2024
fe0b354
accuracy cost evaluation: arithmetic avg --> geometric avg
sbrantq Nov 8, 2024
da75044
add some progress indication
sbrantq Nov 8, 2024
bb020b4
fix up
sbrantq Nov 8, 2024
6631026
fix up
sbrantq Nov 8, 2024
c67f13d
parallel herbie
sbrantq Nov 8, 2024
91456e2
disable herbie parallelism by default
sbrantq Nov 8, 2024
bd013be
fix up
sbrantq Nov 8, 2024
f4f7335
fix up
sbrantq Nov 8, 2024
50abba3
fix up
sbrantq Nov 8, 2024
003de29
dp solver fix up
sbrantq Nov 8, 2024
1b65a8c
improve
sbrantq Nov 9, 2024
b49de51
save
sbrantq Nov 9, 2024
0e7a6d7
experimental herbie output caching
sbrantq Nov 10, 2024
52950d5
bug fix
sbrantq Nov 10, 2024
5f39b98
FPEvaluator: add hypot
sbrantq Nov 11, 2024
53a57a7
Merge branch 'main' of https://github.com/EnzymeAD/Enzyme into herbie
sbrantq Nov 11, 2024
7eb617e
just skip unexecuted code
sbrantq Nov 11, 2024
7145c49
adapted to poseidon
sbrantq Nov 12, 2024
e39bee8
enzyme_active
sbrantq Nov 12, 2024
5b94be0
bug fix
sbrantq Nov 12, 2024
a8dede8
prune on boundaries
sbrantq Nov 13, 2024
dffbd51
dp table caching
sbrantq Nov 26, 2024
1d88952
range widening
sbrantq Nov 26, 2024
882cd35
fix up
sbrantq Dec 2, 2024
4c051bb
Merge branch 'main' of https://github.com/EnzymeAD/Enzyme into herbie
sbrantq Jan 14, 2025
0e5f84a
mark `getrusage` as nofree and inactive
sbrantq Jan 17, 2025
851c1bb
better fpopt metadata setting
sbrantq Jan 17, 2025
7fbd4f6
global fpopt print flag
sbrantq Jan 17, 2025
35a9165
fix up
sbrantq Jan 17, 2025
b85c191
cleanup
sbrantq Jan 17, 2025
4080e03
format
sbrantq Jan 17, 2025
cd0e35e
fix up
sbrantq Jan 17, 2025
953ac4f
format
sbrantq Jan 17, 2025
d62bda8
renaming legacy macro
sbrantq Jan 18, 2025
48fc5c2
fix up
sbrantq Jan 23, 2025
18428b8
better fpopt util func
sbrantq Jan 23, 2025
e79450d
disable recursive inlining when any FPOpt logger is found
sbrantq Jan 23, 2025
4cdf8f8
fix up
sbrantq Jan 23, 2025
922c88c
fix up
sbrantq Jan 24, 2025
8e7d46d
Merge branch 'main' of https://github.com/EnzymeAD/Enzyme into herbie
sbrantq Jan 24, 2025
d969cb2
miniparser support for plain herbie constants
sbrantq Jan 25, 2025
d50fd82
add assertion
sbrantq Jan 25, 2025
b8b3963
renaming & bug fix
sbrantq Jan 30, 2025
29ab9b4
updated herbie commit
sbrantq Feb 5, 2025
0aeba4b
more libm function support & miniparser update for new herbie syntax
sbrantq Feb 5, 2025
c586eb0
fix up
sbrantq Feb 5, 2025
2bbe409
use herbie 2.1
sbrantq Feb 5, 2025
7d97fe7
fix up
sbrantq Feb 5, 2025
2eee5d4
add flag to loose coverage check
sbrantq Feb 7, 2025
e934e3f
better poseidonable check
sbrantq Feb 7, 2025
71d25ee
to get stats in paper
sbrantq Feb 10, 2025
33eb87c
option for showing pt details
sbrantq Feb 10, 2025
babcc9f
more stat & bug fix
sbrantq Feb 12, 2025
9db684c
geomean fix
sbrantq Feb 13, 2025
b6c5b05
strict mode & pt improve
sbrantq Feb 15, 2025
5f7d8f2
geomean fix up & automatically cache budgets
sbrantq Feb 15, 2025
e58e76e
Adding llvm 19 support
vimarsh6739 Feb 10, 2025
d133a15
Fixed const fpcast
vimarsh6739 Feb 11, 2025
f155e08
update opaque ptr
vimarsh6739 Feb 11, 2025
04e0aa9
Merge branch 'main' of https://github.com/EnzymeAD/Enzyme into herbie
sbrantq Feb 16, 2025
1d690c6
bug fix
sbrantq Feb 16, 2025
26c4441
add powi support
sbrantq Feb 17, 2025
71fc823
bug fix
sbrantq Feb 18, 2025
0ca3db5
more reduction strategy
sbrantq Feb 18, 2025
6964246
more reduction options
sbrantq Feb 18, 2025
ea1c0a7
save
sbrantq Feb 22, 2025
7161762
startsWith fix up
sbrantq Feb 22, 2025
c99b1a6
save
sbrantq Feb 28, 2025
1 change: 1 addition & 0 deletions enzyme/CMakeLists.txt
@@ -265,6 +265,7 @@ find_library(MPFR_LIB_PATH mpfr)
 CHECK_INCLUDE_FILE("mpfr.h" HAS_MPFR_H)
 message("MPFR lib: " ${MPFR_LIB_PATH})
 message("MPFR header: " ${HAS_MPFR_H})
+link_libraries(mpfr)

 file(WRITE "${CMAKE_CURRENT_BINARY_DIR}/include/SCEV/ScalarEvolutionExpander.h" "${INPUT_TEXT}")

4 changes: 3 additions & 1 deletion enzyme/Enzyme/ActivityAnalysis.cpp
@@ -204,6 +204,7 @@ const StringSet<> KnownInactiveFunctions = {
     "memcmp",
     "memchr",
     "gettimeofday",
+    "getrusage",
     "stat",
     "mkdir",
     "compress2",
@@ -1024,7 +1025,8 @@ bool isValuePotentiallyUsedAsPointer(llvm::Value *val) {
     for (auto u : cur->users()) {
       if (isa<ReturnInst>(u))
         return true;
-      if (!cast<Instruction>(u)->mayReadOrWriteMemory()) {
+      if (isa<ConstantExpr>(u) ||
+          !cast<Instruction>(u)->mayReadOrWriteMemory()) {
         todo.push_back(u);
         continue;
       }
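The `isa<ConstantExpr>` guard in `isValuePotentiallyUsedAsPointer` matters because LLVM's `cast<T>` asserts that its argument really is a `T`, and a user of a value can be a `ConstantExpr`, which is not an `Instruction`. The sketch below illustrates the patched condition with hypothetical stand-in classes (`User`, `Instruction`, `ConstantExpr` here are toy types, not LLVM's) and an illustrative helper name, `shouldFollowUser`; it is an approximation of the control flow, not the code in the PR.

```cpp
#include <cassert>

// Hypothetical stand-ins for llvm::User, llvm::Instruction and
// llvm::ConstantExpr. The real classes are not used in this sketch.
struct User {
  virtual ~User() = default;
};
struct Instruction : User {
  bool touchesMemory = false;
  bool mayReadOrWriteMemory() const { return touchesMemory; }
};
struct ConstantExpr : User {};

// Mirrors the patched condition: a ConstantExpr user is always followed.
// Any other user is assumed to be an Instruction (as cast<> would assert)
// and is followed only when it does not access memory.
bool shouldFollowUser(const User *u) {
  if (dynamic_cast<const ConstantExpr *>(u))
    return true;
  const auto *inst = dynamic_cast<const Instruction *>(u);
  assert(inst && "non-constant user is expected to be an Instruction");
  return !inst->mayReadOrWriteMemory();
}
```

Without the first check, a `ConstantExpr` user would reach the asserting cast, which is presumably what the `constantexpr` commit addresses.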
32 changes: 27 additions & 5 deletions enzyme/Enzyme/CMakeLists.txt
@@ -44,12 +44,33 @@ set(LLVM_LINK_COMPONENTS Demangle)
 file(GLOB ENZYME_SRC CONFIGURE_DEPENDS RELATIVE ${CMAKE_CURRENT_SOURCE_DIR}
     "*.cpp"
 )
-list(REMOVE_ITEM ENZYME_SRC "eopt.cpp")
+list(REMOVE_ITEM ENZYME_SRC "eopt.cpp" "Herbie.cpp")
 set(CMAKE_CXX_STANDARD 17)
 set(CMAKE_CXX_STANDARD_REQUIRED ON)

 list(APPEND ENZYME_SRC TypeAnalysis/TypeTree.cpp TypeAnalysis/TypeAnalysis.cpp TypeAnalysis/TypeAnalysisPrinter.cpp TypeAnalysis/RustDebugInfo.cpp)

+# set(ENZYME_LINK_TARGETS)
+set(ENZYME_ENABLE_FPOPT 0 CACHE BOOL "Enable FPOpt")
+
+if(ENZYME_ENABLE_FPOPT)
+    include(ExternalProject)
+    ExternalProject_Add(herbie
+        GIT_REPOSITORY https://github.com/sbrantq/herbie
+        GIT_TAG 9a81e9ecd01515ed88fb4efb77e4c0ae7be9cba5
+        UPDATE_COMMAND ""
+        CONFIGURE_COMMAND ""
+        BUILD_COMMAND make egg-herbie && make update && raco exe -o herbie --orig-exe --embed-dlls --vv src/main.rkt
+        BUILD_IN_SOURCE true
+        INSTALL_COMMAND COMMAND ${CMAKE_COMMAND} -E make_directory ${CMAKE_CURRENT_BINARY_DIR}/herbie/install
+        COMMAND ${CMAKE_COMMAND} -E copy_if_different ${CMAKE_CURRENT_BINARY_DIR}/herbie-prefix/src/herbie/herbie ${CMAKE_CURRENT_BINARY_DIR}/herbie/install/herbie
+    )
+    list(APPEND ENZYME_SRC Herbie.cpp)
+    add_compile_definitions(ENZYME_ENABLE_FPOPT=1)
+    set_source_files_properties(Herbie.cpp PROPERTIES COMPILE_DEFINITIONS HERBIE_BINARY="${CMAKE_CURRENT_BINARY_DIR}/herbie/install/herbie")
+endif()
+
+
 # on windows `PLUGIN_TOOL` doesn't link against LLVM.dll
 if ((WIN32 OR CYGWIN) AND LLVM_LINK_LLVM_DYLIB)
     add_llvm_library( LLVMEnzyme-${LLVM_VERSION_MAJOR}
@@ -71,6 +92,7 @@ if (${Clang_FOUND})
         intrinsics_gen
         LINK_COMPONENTS
         LLVM
+        ${ENZYME_LINK_TARGETS}
     )
     target_compile_definitions(ClangEnzyme-${LLVM_VERSION_MAJOR} PUBLIC ENZYME_RUNPASS)
 endif()
@@ -185,17 +207,17 @@ target_link_options(LLDEnzymePrintFlags INTERFACE "SHELL: -Wl,-mllvm -Wl,-enzyme
 add_library(LLDEnzymeNoStrictAliasingFlags INTERFACE)
 target_link_options(LLDEnzymeNoStrictAliasingFlags INTERFACE "SHELL: -Wl,-mllvm -Wl,-enzyme-strict-aliasing=0")

-# this custom target exists to prevent CMake from incorrectly assuming that
-# targets that link depend on LLDEnzyme-XX can be built at the same time or
+# this custom target exists to prevent CMake from incorrectly assuming that
+# targets that link depend on LLDEnzyme-XX can be built at the same time or
 # before LLDEnzyme-XX has finished.
 add_custom_target(LLDEnzymeDummy "" DEPENDS LLDEnzyme-${LLVM_VERSION_MAJOR})
 add_dependencies(LLDEnzymeFlags LLDEnzymeDummy)

 add_library(ClangEnzymeFlags INTERFACE)
 target_compile_options(ClangEnzymeFlags INTERFACE "-fplugin=$<TARGET_FILE:ClangEnzyme-${LLVM_VERSION_MAJOR}>")

-# this custom target exists to prevent CMake from incorrectly assuming that
-# targets that link depend on ClangEnzyme-XX can be built at the same time or
+# this custom target exists to prevent CMake from incorrectly assuming that
+# targets that link depend on ClangEnzyme-XX can be built at the same time or
 # before ClangEnzyme-XX has finished.
 add_custom_target(ClangEnzymeDummy "" DEPENDS ClangEnzyme-${LLVM_VERSION_MAJOR})
 add_dependencies(ClangEnzymeFlags ClangEnzymeDummy)
4 changes: 3 additions & 1 deletion enzyme/Enzyme/DiffeGradientUtils.cpp
@@ -107,11 +107,13 @@ DiffeGradientUtils *DiffeGradientUtils::CreateFromClone(
   std::string prefix;

   switch (mode) {
-  case DerivativeMode::ForwardModeError:
   case DerivativeMode::ForwardMode:
   case DerivativeMode::ForwardModeSplit:
     prefix = "fwddiffe";
     break;
+  case DerivativeMode::ForwardModeError:
+    prefix = "fwderr";
+    break;
   case DerivativeMode::ReverseModeCombined:
   case DerivativeMode::ReverseModeGradient:
     prefix = "diffe";
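Read as a mapping, this hunk gives forward error mode its own name prefix, `fwderr`, instead of sharing `fwddiffe` with the other forward modes (matching the `fwddiffe -> fwderr` commit). A minimal sketch of that mapping follows. The enum is a hypothetical mirror restricted to the cases visible in the hunk, and `clonePrefix` is an illustrative name, not a function in Enzyme.

```cpp
#include <string>

// Hypothetical mirror of the DerivativeMode values visible in the hunk.
// The real enum may contain further modes that are not shown there.
enum class DerivativeMode {
  ForwardMode,
  ForwardModeSplit,
  ForwardModeError,
  ReverseModeCombined,
  ReverseModeGradient
};

// Returns the name prefix chosen for each mode after the change.
std::string clonePrefix(DerivativeMode mode) {
  switch (mode) {
  case DerivativeMode::ForwardMode:
  case DerivativeMode::ForwardModeSplit:
    return "fwddiffe";
  case DerivativeMode::ForwardModeError:
    return "fwderr"; // previously shared "fwddiffe"
  case DerivativeMode::ReverseModeCombined:
  case DerivativeMode::ReverseModeGradient:
    return "diffe";
  }
  return std::string(); // modes outside this sketch
}
```

Any test or tool that matches generated function names by prefix would need to account for the new `fwderr` value.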
5 changes: 4 additions & 1 deletion enzyme/Enzyme/DifferentialUseAnalysis.h
@@ -123,14 +123,17 @@ inline bool is_value_needed_in_reverse(
         }
       }
     }
-    if (gutils->mode == DerivativeMode::ForwardModeError &&
+#ifdef ENZYME_ENABLE_FPOPT
+    if ((hasFPOptLogger(gutils->oldFunc->getParent()) ||
+         gutils->mode == DerivativeMode::ForwardModeError) &&
         !gutils->isConstantValue(const_cast<Value *>(inst))) {
       if (EnzymePrintDiffUse)
         llvm::errs()
             << " Need: " << to_string(VT) << " of " << *inst
             << " in reverse as forward mode error always needs result\n";
       return seen[idx] = true;
     }
+#endif
   }

   if (auto CI = dyn_cast<CallInst>(inst)) {
13 changes: 13 additions & 0 deletions enzyme/Enzyme/Enzyme.cpp
@@ -3272,6 +3272,9 @@ AnalysisKey EnzymeNewPM::Key;
 #include "ActivityAnalysisPrinter.h"
 #include "JLInstSimplify.h"
 #include "PreserveNVVM.h"
+#ifdef ENZYME_ENABLE_FPOPT
+#include "Herbie.h"
+#endif
 #include "TypeAnalysis/TypeAnalysisPrinter.h"
 #include "llvm/Passes/PassBuilder.h"
 #include "llvm/Transforms/AggressiveInstCombine/AggressiveInstCombine.h"
@@ -3385,6 +3388,10 @@ void augmentPassBuilder(llvm::PassBuilder &PB) {
     OptimizerPM.addPass(llvm::SROAPass());
 #endif
     MPM.addPass(createModuleToFunctionPassAdaptor(std::move(OptimizerPM)));
+#ifdef ENZYME_ENABLE_FPOPT
+    if (EnzymeEnableFPOpt)
+      MPM.addPass(FPOptNewPM());
+#endif
     MPM.addPass(EnzymeNewPM(/*PostOpt=*/true));
     MPM.addPass(PreserveNVVMNewPM(/*Begin*/ false));
 #if LLVM_VERSION_MAJOR >= 16
@@ -3669,6 +3676,12 @@ extern "C" void registerEnzyme(llvm::PassBuilder &PB) {
           MPM.addPass(EnzymeNewPM());
           return true;
         }
+#ifdef ENZYME_ENABLE_FPOPT
+        if (Name == "fp-opt") {
+          MPM.addPass(FPOptNewPM());
+          return true;
+        }
+#endif
         if (Name == "preserve-nvvm") {
           MPM.addPass(PreserveNVVMNewPM(/*Begin*/ true));
           return true;
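The `registerEnzyme` hunk adds `fp-opt` as a pipeline element name that resolves to `FPOptNewPM`, but only in builds compiled with `ENZYME_ENABLE_FPOPT`. The sketch below models that name-to-pass dispatch with toy types. `ModulePassManager` here is a hypothetical recorder of pass names, `parsePipelineName` is an illustrative helper, and the `kFPOptCompiledIn` constant stands in for the preprocessor guard; none of these are LLVM or Enzyme APIs.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::ModulePassManager: it only records the
// names of the passes that were added.
struct ModulePassManager {
  std::vector<std::string> passes;
  void addPass(const std::string &name) { passes.push_back(name); }
};

// Models the ENZYME_ENABLE_FPOPT guard. The PR uses #ifdef; a constant is
// used here so both outcomes are easy to read in one function.
constexpr bool kFPOptCompiledIn = true;

// Returns true when the name was recognised and a pass was added, which is
// the contract the callback in the hunk follows.
bool parsePipelineName(const std::string &name, ModulePassManager &mpm) {
  if (kFPOptCompiledIn && name == "fp-opt") {
    mpm.addPass("FPOptNewPM");
    return true;
  }
  if (name == "preserve-nvvm") {
    mpm.addPass("PreserveNVVMNewPM");
    return true;
  }
  return false; // unknown names are left for other callbacks
}
```

In a build without the option, `fp-opt` would fall through as an unknown name, so scripts that request it should probably check how Enzyme was configured.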
1 change: 1 addition & 0 deletions enzyme/Enzyme/EnzymeLogic.cpp
@@ -6395,6 +6395,7 @@ llvm::Function *EnzymeLogic::CreateNoFree(RequestContext context, Function *F) {
       "vprintf",
       "fprintf",
       "fputc",
+      "getrusage",
       "memchr",
       "time",
       "strlen",
37 changes: 35 additions & 2 deletions enzyme/Enzyme/FunctionUtils.cpp
@@ -114,6 +114,10 @@

 #include "CacheUtility.h"

+#ifdef ENZYME_ENABLE_FPOPT
+#include "Herbie.h"
+#endif
+
 #define addAttribute addAttributeAtIndex
 #define removeAttribute removeAttributeAtIndex
 #define getAttribute getAttributeAtIndex
@@ -1418,6 +1422,27 @@ Function *PreProcessCache::preprocessForClone(Function *F,
                       /*ModuleLevelChanges*/ CloneFunctionChangeType::LocalChangesOnly,
                       Returns, "", nullptr);
   }
+#ifdef ENZYME_ENABLE_FPOPT
+  if (hasFPOptLogger(F->getParent())) {
+    for (const auto &pair : VMap) {
+      if (auto *before = dyn_cast<Instruction>(pair.first)) {
+        if (!before->getType()->isFloatingPointTy()) {
+          continue;
+        }
+        auto *after = cast<Instruction>(pair.second);
+        attachFPOptMetadata(after, before);
+      } else if (auto *beforeBB = dyn_cast<BasicBlock>(pair.first)) {
+        auto *afterBB = cast<BasicBlock>(pair.second);
+        for (const auto &[before, after] : zip(*beforeBB, *afterBB)) {
+          if (!before.getType()->isFloatingPointTy()) {
+            continue;
+          }
+          attachFPOptMetadata(&after, &before);
+        }
+      }
+    }
+  }
+#endif
   CloneOrigin[NewF] = F;
   NewF->setAttributes(F->getAttributes());
   if (EnzymeNoAlias)
@@ -1431,7 +1456,13 @@ Function *PreProcessCache::preprocessForClone(Function *F,
   setFullWillReturn(NewF);

   if (EnzymePreopt) {
+#ifdef ENZYME_ENABLE_FPOPT
+    // Disable recursive inlining since no FPOpt metadata is attached
+    // to inlined instructions
+    if (!hasFPOptLogger(F->getParent()) && EnzymeInline) {
+#else
     if (EnzymeInline) {
+#endif
       ForceRecursiveInlining(NewF, /*Limit*/ EnzymeInlineCount);
       setFullWillReturn(NewF);
       PreservedAnalyses PA;
@@ -1800,7 +1831,8 @@ Function *PreProcessCache::preprocessForClone(Function *F,
     FAM.invalidate(*NewF, PA);
   }

-  if (mode != DerivativeMode::ForwardMode)
+  if (mode != DerivativeMode::ForwardMode &&
+      mode != DerivativeMode::ForwardModeError)
     ReplaceReallocs(NewF);

   {
@@ -1836,7 +1868,8 @@ Function *PreProcessCache::preprocessForClone(Function *F,
     PA.preserve<PhiValuesAnalysis>();
   }

-  if (mode != DerivativeMode::ForwardMode)
+  if (mode != DerivativeMode::ForwardMode &&
+      mode != DerivativeMode::ForwardModeError)
     ReplaceReallocs(NewF);

   if (mode == DerivativeMode::ReverseModePrimal ||