[AutoBump] Merge with fixes of 82383d5f3fa8 (May 1) (42) #299

mgehre-amd · 2024-08-22T11:43:43Z

No description provided.

…SCVISAInfo::parseArchString. NFC (llvm#91538) We can use a SmallVector<StringRef>. Adjust the code so we check for empty strings in the loop instead of making a copy of the vector returned from StringRef::split. This overlaps with llvm#91532 which also removed the std::vector, but that PR may be more controversial.

It is inaccurate and needs to be corrected.

…t (NFC) (llvm#91413) Fixed test for llvm#88193

…lass contexts (llvm#91498) Fixes a crash caused by llvm#90152.

A compiler can generate a redundant indirection for a jump via a fixed jump table target. Add a test case that covers this pattern in the PIC case; the non-PIC case is already detected. Currently XFAIL.

…pression. (llvm#90918) Fixes llvm#90498. Same as 5337efc for atomic builtins, but for `std::atomic` this time. This is useful because even though the actual builtin atomic is still there, it may be buried beyond the inlining depth limit. Also add one popular custom smart pointer class name to the name-based heuristics, which isn't necessary to fix the bug but arguably a good idea regardless.

…4132) Under some circumstances (library loaded with the main program), the TLS initial-exec model can be applied to local-dynamic accesses. We could use a simple heuristic to decide the update at function level:

* If the function contains no more than a threshold number of TLS local-dynamic accesses, use the TLS initial-exec model. (The threshold defaults to 1 and is controlled by a hidden option.)

`Count` and `Skip` should use `uint64_t` as they are encoded/decoded using 64-bit ULEB128. In `*_OPCODE_DO_*_ULEB_TIMES_SKIPPING_ULEB`, `Skip` could be encoded as a two's complement for moving `SegmentOffset` backwards. Having a 32-bit `Skip` truncates the encoded value and leads to a malformed `AdvanceAmount` and invalid `SegmentOffset` that extends past valid sections.
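The truncation described above can be illustrated with a small standalone sketch. The helper names `decodeUleb128`, `advance64`, and `advanceTruncated` are hypothetical and are not the LLVM functions touched by the commit; the example only assumes that a backwards skip of 8 is stored as the two's complement value `0xFFFFFFFFFFFFFFF8`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not LLVM's decoder): decodes one ULEB128 value,
// 7 payload bits per byte, least-significant group first.
std::uint64_t decodeUleb128(const std::vector<std::uint8_t> &bytes) {
  std::uint64_t value = 0;
  unsigned shift = 0;
  for (std::uint8_t byte : bytes) {
    assert(shift < 64 && "encoding longer than 64 bits");
    value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
    shift += 7;
    if ((byte & 0x80) == 0)
      break; // a clear high bit marks the last byte
  }
  return value;
}

// With a 64-bit skip, unsigned wraparound makes a two's complement
// "negative" skip move the offset backwards.
std::uint64_t advance64(std::uint64_t segmentOffset, std::uint64_t skip) {
  return segmentOffset + skip;
}

// The problematic shape: narrowing the skip to 32 bits drops the upper
// bits, so a small backwards step becomes a large forwards one.
std::uint64_t advanceTruncated(std::uint64_t segmentOffset,
                               std::uint64_t skip) {
  std::uint32_t narrowed = static_cast<std::uint32_t>(skip);
  return segmentOffset + narrowed;
}
```

Starting from offset `0x100`, the 64-bit version lands on `0xF8`, while the truncated version lands on `0x1000000F8`, which is the kind of offset past valid sections that the message describes.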

These tests did not test what they were supposed to. The transform fails to actually handle the commuted cases.

These used both lower and upper case variants of the same name, resulting in malformed check lines when regenerated.

…88846) String pool merging currently, for a reason that's not entirely clear to me, tries to create GEP instructions instead of GEP constant expressions when replacing constant references. It only uses constant expressions in cases where this is required. However, it does not catch all cases where such a requirement exists. For example, the landingpad catch clause has to be a constant. Fix this by always using the constant expression variant, which also makes the implementation simpler. Additionally, there are some edge cases where even replacement with a constant GEP is not legal. The one I am aware of is the llvm.eh.typeid.for intrinsic, so add a special case to forbid replacements for it. Fixes llvm#88844.

…#91546) We failed to use BSETI when bit 31 was set and a few bits above bit 31 were set. We also failed to use multiple BSETI when the low 32 bits were zero. I've removed the special cases for constants 0x80000000-0xffffffff and wrote a more generic algorithm for BSETI. I've rewritten the BCLRI handling to be similar to the new BSETI algorithm. This picks up cases where bit 31 is 0 and only a few high bits are 0.

…o pass constructor. It is already initialized in RISCVTargetMachine.cpp

…andled Arguments in __kmpc_fork_call_if (llvm#82221)

Root cause: Segmentation fault is caused by null pointer dereference inside the __kmpc_fork_call_if function at https://github.com/llvm/llvm-project/blob/main/openmp/runtime/src/z_Linux_asm.S#L1186. __kmpc_fork_call_if is missing a case to handle argc=0.

Fix: Added a check inside the __kmp_invoke_microtask function to handle the case when argc is 0.

---------

Co-authored-by: Singh <[email protected]>


llvm#91517) Fixes bug llvm#90769. Many thanks to @Meinersbur for providing the initial thought and solution to this.

Test all commuted variants of the pattern, most of which currently fail to fold.

This pattern only handled commutation in the "or", while all involved operations are commutative. Make sure we handle all sixteen patterns.

…lvm#91189) The implementation is straight-forward, but comes with a big disclaimer. See llvm#91186 for details.

…lvm#91321) A leaf function may not store the link register to stack, but it can still end up being a non-zero frame if it gets interrupted by a signal. Currently, we were unable to unwind past this function because we could not read the link register value. To make this work, this patch:

- changes the function-entry unwind plan to include the `fp|lr = <same>` rules. This in turn necessitated an adjustment in the generic instruction emulation logic to ensure that `lr=[sp-X]` can override the `<same>` rule.
- allows the `<same>` rule for pc and lr in all `m_all_registers_available` frames (and not just frame zero).

The test verifies that we can unwind in a situation like this, and that the backtrace matches the one we computed before getting a signal.

…lvm#91560) I'm planning to remove StringRef::equals in favor of StringRef::operator==.

- StringRef::operator==/!= outnumber StringRef::equals by a factor of 10 under mlir/ in terms of their usage.
- The elimination of StringRef::equals brings StringRef closer to std::string_view, which has operator== but not equals.
- S == "foo" is more readable than S.equals("foo"), especially for !Long.Expression.equals("str") vs Long.Expression != "str".
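The readability argument can be seen in a minimal sketch that uses `std::string_view` as a stand-in, since it compiles without LLVM headers and, as noted above, has `operator==` but no `equals`. The function names `isFoo` and `isNotStr` are hypothetical.

```cpp
#include <string_view>

// Stand-in for the pattern in the commit message. With llvm::StringRef,
// the old spelling would have been name.equals("foo").
bool isFoo(std::string_view name) { return name == "foo"; }

// The negated form reads naturally with operator!=, compared with
// negating an equals() call.
bool isNotStr(std::string_view name) { return name != "str"; }
```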

…#89026) Generally, IR and assembly test files benefit from being cleaned to remove unnecessary details. However, for tests requiring elaborate IR or assembly files where cleanup is less practical (e.g., large amount of debug information output from Clang), the current practice is to include the C/C++ source file and the generation instructions as comments. This is inconvenient when regeneration is needed. This patch adds `llvm/utils/update_test_body.py` to allow easier regeneration. `ld.lld --debug-names` tests (llvm#86508) utilize this script for Clang-generated assembly tests. Note: `-o pipefail` is standard (since https://www.austingroupbugs.net/view.php?id=789) but not supported by dash. Link: https://discourse.llvm.org/t/utility-to-generate-elaborated-assembly-ir-tests/78408

This PR addresses issue llvm#89002.

#### Changes in this PR

* Added a simple implementation of `cpp::lock_guard` (an equivalent of `std::lock_guard`) in libc/src/__support/CPP inspired by the libstdc++ implementation
* Added tests for `cpp::lock_guard` in /libc/test/src/__support/CPP/mutex_test.cpp
* Replaced all references to `MutexLock` with `cpp::lock_guard`

---------

Co-authored-by: Guillaume Chatelet <[email protected]>
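For readers unfamiliar with the idiom, a guard of this kind can be sketched in a few lines. This is not the libc implementation; the `sketch` namespace, `CountingMutex`, and `heldInsideScope` are hypothetical names used only to make the lock and unlock calls observable.

```cpp
namespace sketch {
// Minimal RAII guard in the spirit of std::lock_guard: lock in the
// constructor, unlock in the destructor.
template <typename MutexType> class lock_guard {
  MutexType &mutex;

public:
  explicit lock_guard(MutexType &m) : mutex(m) { mutex.lock(); }
  ~lock_guard() { mutex.unlock(); }

  // Non-copyable, so the mutex is released exactly once.
  lock_guard(const lock_guard &) = delete;
  lock_guard &operator=(const lock_guard &) = delete;
};
} // namespace sketch

// Hypothetical mutex that counts calls instead of blocking.
struct CountingMutex {
  int locks = 0;
  int unlocks = 0;
  void lock() { ++locks; }
  void unlock() { ++unlocks; }
};

// Returns true if the mutex is held while the guard is in scope.
bool heldInsideScope(CountingMutex &m) {
  sketch::lock_guard<CountingMutex> guard(m);
  return m.locks == m.unlocks + 1;
}
```

Because the unlock happens in the destructor, every exit path from the scope releases the mutex, including early returns.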


The following program produces a diagnostic in Clang and EDG, but compiles correctly in GCC and MSVC:

```cpp
#include <vector>

consteval std::vector<int> fn() { return {1,2,3}; }

constexpr int a = fn()[1];
```

Clang's diagnostic is as follows:

```cpp
<source>:6:19: error: call to consteval function 'fn' is not a constant expression
    6 | constexpr int a = fn()[1];
      |                   ^
<source>:6:19: note: pointer to subobject of heap-allocated object is not a constant expression
/opt/compiler-explorer/gcc-snapshot/lib/gcc/x86_64-linux-gnu/14.0.1/../../../../include/c++/14.0.1/bits/allocator.h:193:31: note: heap allocation performed here
  193 |             return static_cast<_Tp*>(::operator new(__n));
      |                               ^
1 error generated.
Compiler returned: 1
```

Based on my understanding of [`[dcl.constexpr]/6`](https://eel.is/c++draft/dcl.constexpr#6):

> In any constexpr variable declaration, the full-expression of the initialization shall be a constant expression

It seems to me that GCC and MSVC are correct: the initializer `fn()[1]` does not evaluate to an lvalue referencing a heap-allocated value within the `vector` returned by `fn()`; it evaluates to an lvalue-to-rvalue conversion _from_ that heap-allocated value.

This PR turns out to be a bug fix on the implementation of [P2564R3](https://wg21.link/p2564r3); as such, it only applies to C++23 and later. The core problem is that the definition of a constant-initialized variable ([`[expr.const/2]`](https://eel.is/c++draft/expr.const#2)) is contingent on whether the initializer can be evaluated as a constant expression:

> A variable or temporary object o is _constant-initialized_ if [...] the full-expression of its initialization is a constant expression when interpreted as a _constant-expression_, [...]

That can't be known until we've finished parsing the initializer, by which time we've already added immediate invocations and consteval references to the current expression evaluation context. This will have the effect of evaluating said invocations as full expressions when the context is popped, even if they're subexpressions of a larger constant expression initializer. If, however, the variable _is_ constant-initialized, then its initializer is [manifestly constant-evaluated](https://eel.is/c++draft/expr.const#20):

> An expression or conversion is _manifestly constant-evaluated_ if it is [...] **the initializer of a variable that is usable in constant expressions or has constant initialization** [...]

which in turn means that any subexpressions naming an immediate function are in an [immediate function context](https://eel.is/c++draft/expr.const#16):

> An expression or conversion is in an immediate function context if it is potentially evaluated and either [...] it is a **subexpression of a manifestly constant-evaluated expression** or conversion

and therefore _are not to be considered [immediate invocations](https://eel.is/c++draft/expr.const#16) or [immediate-escalating expressions](https://eel.is/c++draft/expr.const#17) in the first place_:

> An invocation is an _immediate invocation_ if it is a potentially-evaluated explicit or implicit invocation of an immediate function and **is not in an immediate function context**.

> An expression or conversion is _immediate-escalating_ if **it is not initially in an immediate function context** and [...]

The approach that I'm therefore proposing is:

1. Create a new expression evaluation context for _every_ variable initializer (rather than only nonlocal ones).
2. Attach initializers to `VarDecl`s _prior_ to popping the expression evaluation context / scope / etc. This sequences the determination of whether the initializer is in an immediate function context _before_ any contained immediate invocations are evaluated.
3. When popping an expression evaluation context, elide all evaluations of constant invocations, and all checks for consteval references, if the context is an immediate function context. Note that if it could be ascertained that this was an immediate function context at parse-time, we [would never have registered](https://github.com/llvm/llvm-project/blob/760910ddb918d77e7632be1678f69909384d69ae/clang/lib/Sema/SemaExpr.cpp#L17799) these immediate invocations or consteval references in the first place.

Most of the test changes previously made for this PR are now reverted and passing as-is. The only test updates needed are now as follows:

- A few diagnostics in `consteval-cxx2a.cpp` are updated to reflect that it is the `consteval tester::tester` constructor, not the more narrow `make_name` function call, which fails to be evaluated as a constant expression.
- The reclassification of `warn_impcast_integer_precision_constant` as a compile-time diagnostic adds a (somewhat duplicative) warning when attempting to define an enum constant using a narrowing conversion. It also, however, retains the existing diagnostics which @erichkeane (rightly) objected to being lost from an earlier revision of this PR.

---------

Co-authored-by: cor3ntin <[email protected]>

…in `readability-string-compare` (llvm#88636) This PR aims to expand the list of classes that are considered to be "strings" by the `readability-string-compare` check.

1. Currently only `std::string::compare` is checked, but `std::string_view` has a similar `compare` method. This PR enables checking of `std::string_view::compare` by default.
2. Some codebases use custom string-like classes that have public interfaces similar to `std::string` or `std::string_view`. Example: [TStringBase](https://github.com/yandex/yatool/blob/main/util/generic/strbase.h#L38). A new option, `readability-string-compare.StringClassNames`, is added to allow specifying a custom list of string-like classes.

Related to, but does not solve llvm#28396 (only adds support for custom string-like classes, not custom functions)

…90720) In PR llvm#88385 I've added support for auto-vectorisation of some early exit loops, which requires using the experimental.cttz.elts to calculate final indices in the early exit block. We need a more accurate cost model for this intrinsic to better reflect the cost of work required in the early exit block. I've tried to accurately represent the expansion code for the intrinsic when the target does not have efficient lowering for it. It's quite tricky to model because you need to first figure out what types will actually be used in the expansion. The type used can have a significant effect on the cost if you end up using illegal vector types. Tests added here: Analysis/CostModel/AArch64/cttz_elts.ll Analysis/CostModel/RISCV/cttz_elts.ll

This patch makes the alignment and width of BitInt target-ABI specific and, for AArch64 targets, consistent with the [Procedure Call Standard for the Arm® 64-bit Architecture (AArch64)](https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst).

…lvm#91782) `vp.reduce.fmaximum/fminimum` are the VP version of `vector.reduce.fmaximum/fminimum`.

Fixes: https://lab.llvm.org/buildbot/#/builders/192/builds/9313

…bal subprogram" (llvm#91827) Reverts llvm#91668

…t instead of `APInt *` The `APInt *` version is pretty useless: in any case where one needs an `APInt *` out, one could just replace the `m_Checked...` lambda with direct checks on the `APInt`. Other helpers such as `m_Negative`, `m_Power2`, etc. are left unchanged, as their `APInt` out version is used mostly for convenience and they would rarely change functionality if converted to output a `Constant *`. Closes llvm#91377

The uses have been removed by commit 2b5cd8b

Remove FormatToken::isSimpleTypeSpecifier() and call Token::isSimpleTypeSpecifier(LangOpts) instead.

There are 100+ references. Use a wrapper similar to a623a4c

…sion expand from macro when ``IgnoreMacro`` option is enabled. (llvm#91757) Fixes: llvm#91487

Align BAT YAML to fdata profile. Test Plan: updated register-fragments-bolt-symbols.s Reviewers: dcci, rafaelauler, ayermolo, maksfb Reviewed By: dcci Pull Request: llvm#91773

Align with DataReader::readProfile that sets entry block counts from FuncBranchData->EntryData. Test Plan: updated bolt-address-translation-yaml.test

This should fix buildbot failures in: https://lab.llvm.org/buildbot/#/builders/5/builds/43303 https://lab.llvm.org/buildbot/#/builders/168/builds/20347

…#91727) Follow-up to llvm#88257

This is a follow-up patch for llvm#74199

…WebKit checkers. (llvm#91830)


GCC 12 and 13 generate incorrect code for a pattern in the tosa-to-tensor pass responsible for lowering tosa.reshape. This results in the tosa.reshape lowering producing IR which fails to verify. I've narrowed down the set of cmake flags needed to reproduce the issue to this:

    cmake -G Ninja ../llvm \
      -DLLVM_ENABLE_PROJECTS="mlir" \
      -DLLVM_TARGETS_TO_BUILD=host \
      -DLLVM_ENABLE_PROJECTS=mlir \
      -DCMAKE_BUILD_TYPE="Release" \
      -DCMAKE_CXX_FLAGS_RELEASE="-O2" \
      -DCMAKE_CXX_FLAGS="-O2" \
      -DCMAKE_CXX_COMPILER=g++ \
      -DCMAKE_C_COMPILER=gcc

This is the failing test case:

    func.func @fails_in_gcc_12(%arg0: tensor<?xf32>) -> tensor<1x1x1x?xf32> {
      %0 = tosa.reshape %arg0 {new_shape = array<i64: 1, 1, 1, -1>} : (tensor<?xf32>) -> tensor<1x1x1x?xf32>
      return %0 : tensor<1x1x1x?xf32>
    }

This should lower to a tensor.expand_shape operation like so:

    func.func @foo(%arg0: tensor<?xf32>) -> tensor<1x1x1x?xf32> {
      %c0 = arith.constant 0 : index
      %dim = tensor.dim %arg0, %c0 : tensor<?xf32>
      %c1 = arith.constant 1 : index
      %expanded = tensor.expand_shape %arg0 [[0, 1, 2, 3]] output_shape [1, 1, 1, %dim] : tensor<?xf32> into tensor<1x1x1x?xf32>
      return %expanded : tensor<1x1x1x?xf32>
    }

Under GCC 12/13 with the above cmake configuration, the tensor.expand_shape looks like this:

    %2 = "tensor.expand_shape"(%arg0) <{reassociation = [[0, 1, 2, 3]], static_output_shape = array<i64>}> : (tensor<?xf32>) -> tensor<?x1x1x?xf32>

The key difference is the computed output type of `tensor<?x1x1x?xf32>` rather than the expected `tensor<1x1x1x?xf32>`. This expand_shape fails to verify with this error message:

    error: 'tensor.expand_shape' op expected number of static shape dims to be equal to the output rank (4) but found 0 inputs instead

The problematic code is calculating the intermediate shape of the generated tensor.expand_shape operation in the expand_shape/collapse_shape sequence that implements tosa.reshape.

    // Compute result shape
    bool resultIsStatic = true;
    auto resultShape = llvm::map_to_vector(newShape, [&](int64_t size) {
      // Omitted

      // If we do not know the total size of the tensor, keep this dimension
      // dynamic in the result shape.
      if (!inputIsStatic) {
        resultIsStatic = false;
        return ShapedType::kDynamic;
      }
    });

    if (resultIsStatic) {
      // do something
      return;
    }

    // do something else
    return;

The failure point seems to be the update of the resultIsStatic variable in the lambda body. The assignment of false is not propagated to the use in the if-statement, resulting in the branch being taken when it should not. I've found several modifications to the code that get around the bug. The version I settled on is one which makes the logic a little more obvious.
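The commit message does not show the version that was settled on, so the following is only a sketch of the general idea in standard C++, not the actual fix. It contrasts a lambda that flips a by-reference flag as a side effect with a variant that computes the flag up front. All names here (`kDynamic`, `ShapeResult`, `computeWithSideEffect`, `computeExplicit`, `inferredDim`) are hypothetical stand-ins, and the handling of non-placeholder sizes is an assumption about the omitted part of the original lambda.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for ShapedType::kDynamic.
constexpr std::int64_t kDynamic = std::numeric_limits<std::int64_t>::min();

struct ShapeResult {
  std::vector<std::int64_t> shape;
  bool isStatic;
};

// Same structure as the quoted code: the mapping lambda also updates a
// flag captured by reference.
ShapeResult computeWithSideEffect(const std::vector<std::int64_t> &newShape,
                                  bool inputIsStatic,
                                  std::int64_t inferredDim) {
  bool resultIsStatic = true;
  std::vector<std::int64_t> shape(newShape.size());
  std::transform(newShape.begin(), newShape.end(), shape.begin(),
                 [&](std::int64_t size) -> std::int64_t {
                   if (size != -1)
                     return size; // assumed: known sizes pass through
                   if (!inputIsStatic) {
                     resultIsStatic = false;
                     return kDynamic;
                   }
                   return inferredDim;
                 });
  return {shape, resultIsStatic};
}

// Alternative: decide staticness before mapping, so the mapping step has
// no side effects for the later branch to depend on.
ShapeResult computeExplicit(const std::vector<std::int64_t> &newShape,
                            bool inputIsStatic, std::int64_t inferredDim) {
  const bool hasPlaceholder =
      std::find(newShape.begin(), newShape.end(), -1) != newShape.end();
  const bool resultIsStatic = inputIsStatic || !hasPlaceholder;
  std::vector<std::int64_t> shape;
  shape.reserve(newShape.size());
  for (std::int64_t size : newShape) {
    if (size != -1)
      shape.push_back(size);
    else
      shape.push_back(inputIsStatic ? inferredDim : kDynamic);
  }
  return {shape, resultIsStatic};
}
```

On a conforming compiler both functions return the same result; the second form simply leaves less room for a flag update inside a lambda to be lost.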

cferry-AMD · 2024-08-22T11:50:36Z

What a bump!

…82383d5f3fa8

Bump to 9d66dca (May 11, TOSA reshape fix) (43)

topperc and others added 30 commits May 8, 2024 17:22

[InstCombine] Fix comment from llvm#88193 (NFC) (llvm#91427)

409ff97

It is inaccurate and needs to be corrected.

[InstCombine] Thwart complexity-based canonicalization in shl-add tes…

ba5170f

…t (NFC) (llvm#91413) Fixed test for llvm#88193

[Clang][Sema] Fix lookup of dependent operator= outside of complete-c…

62b5b61

…lass contexts (llvm#91498) Fixes a crash caused by llvm#90152.

[BOLT] Add test case for PIC fixed indirect jump (llvm#91547)

73a0144

A compiler can generate a redundant indirection for a jump via a fixed jump table target. Add a test case that covers this pattern in the PIC case; the non-PIC case is already detected. Currently XFAIL.

[InstCombine] Thwart complexity-based canonicalization (NFC)

a39a382

These tests did not test what they were supposed to. The transform fails to actually handle the commuted cases.

[InstCombine] Handle more commuted cases in matchesSquareSum()

0d335f7

[InstCombine] Fix name clashes in check lines (NFC)

f958a73

These used both lower and upper case variants of the same name, resulting in malformed check lines when regenerated.

[RISCV] Add test for vmerge.vvm that could have splat sunk. NFC

8f4f34f

[InstCombine] Regenerate test checks (NFC)

73d4233

[RISCV] Remove unnecessary initialization from RISCVPostRAExpandPseud…

666970c

…o pass constructor. It is already initialized in RISCVTargetMachine.cpp

[flang][CMake] Add missing dependency to generate Fortran module files (

5adcfd4

llvm#91517) Fixes bug llvm#90769. Many thanks to @Meinersbur for providing the initial thought and solution to this.

[InstCombine] Add proper test coverage for or of xors pattern (NFC)

90ffaa6

Test all commuted variants of the pattern, most of which currently fail to fold.

[InstCombine] Handle commuted variants in or of xor pattern

534701d

This pattern only handled commutation in the "or", while all involved operations are commutative. Make sure we handle all sixteen patterns.

[Reassociate] Generate test checks (NFC)

97be79c

[lldb] Make SBType::GetDirectNestedType (mostly) work with typedefs (l…

042a0b0

…lvm#91189) The implementation is straight-forward, but comes with a big disclaimer. See llvm#91186 for details.

mshockwave and others added 16 commits May 10, 2024 16:01

[VP][RISCV] Add vp.reduce.fmaximum/fminimum and its RISC-V codegen (l…

f8063ff

…lvm#91782) `vp.reduce.fmaximum/fminimum` are the VP version of `vector.reduce.fmaximum/fminimum`.

[InstallAPI] Pass explicit module cache to avoid permissions issues.

504cf55

Fixes: https://lab.llvm.org/buildbot/#/builders/192/builds/9313

Revert "[flang][cuda] Implicitly load cudadevice module in device/glo…

38b2755

…bal subprogram" (llvm#91827) Reverts llvm#91668

[Driver] Remove unused getInstalledDir

df88d61

The uses have been removed by commit 2b5cd8b

[clang-format] Fix FormatToken::isSimpleTypeSpecifier() (llvm#91712)

e62ce1f

Remove FormatToken::isSimpleTypeSpecifier() and call Token::isSimpleTypeSpecifier(LangOpts) instead.

[Driver] Remove elf::script indirection. NFC

65c9b84

There are 100+ references. Use a wrapper similar to a623a4c

[clang-tidy] readability-simplify-boolean-expr avoid to warn expres…

3676b09

…sion expand from macro when ``IgnoreMacro`` option is enabled. (llvm#91757) Fixes: llvm#91487

[BOLT] Use disambiguated local names in BAT YAML

bbcdd4f

Align BAT YAML to fdata profile. Test Plan: updated register-fragments-bolt-symbols.s Reviewers: dcci, rafaelauler, ayermolo, maksfb Reviewed By: dcci Pull Request: llvm#91773

[BOLT] Set entry counts in BAT YAML profile (llvm#91775)

4f12766

Align with DataReader::readProfile that sets entry block counts from FuncBranchData->EntryData. Test Plan: updated bolt-address-translation-yaml.test

[clang-format] Move QualifierFixerTest::LangOpts to IsQualifierType test

5cd2804

This should fix buildbot failures in: https://lab.llvm.org/buildbot/#/builders/5/builds/43303 https://lab.llvm.org/buildbot/#/builders/168/builds/20347

[AMDGPU] Fix unittest linking error with LLVM_LINK_LLVM_DYLIB (llvm…

52c5a81

…#91727) Follow-up to llvm#88257

[X86] Add sub-feature nf (no flags update) for APX

575177f

This is a follow-up patch for llvm#74199

[analyzer] Treat bitwise_cast, std::addressof, and new as trivial in …

31774b6

…WebKit checkers. (llvm#91830)

mgehre-amd requested a review from cferry-AMD August 22, 2024 11:43

cferry-AMD approved these changes Aug 22, 2024


mgehre-amd force-pushed the bump_to_82383d5f3fa8 branch from 29571e5 to 09faf71 Compare August 22, 2024 11:56

Merge commit '82383d5f3fa8' into bump_to_82383d5f3fa8

643434e

mgehre-amd force-pushed the bump_to_82383d5f3fa8 branch from 09faf71 to 643434e Compare August 22, 2024 12:28

mgehre-amd changed the title ~~[AutoBump] Merge with fixes of 82383d5f3fa8 (May 1) (6)~~ [AutoBump] Merge with fixes of 82383d5f3fa8 (May 1) (42) Aug 22, 2024

Merge commit '9d66dcaf172c' into HEAD

64ba2b4

Base automatically changed from bump_to_c515c780 to feature/fused-ops August 23, 2024 19:43

An error occurred while trying to automatically change base from bump_to_c515c780 to feature/fused-ops August 23, 2024 19:43

mgehre-amd added 2 commits August 27, 2024 09:18

Merge remote-tracking branch 'origin/feature/fused-ops' into bump_to_…

7d90abb

…82383d5f3fa8

Merge pull request #302 from Xilinx/bump_to_9d66dcaf172c

d92bf11

Bump to 9d66dca (May 11, TOSA reshape fix) (43)

mgehre-amd merged commit fc56b0f into feature/fused-ops Sep 2, 2024
11 of 12 checks passed

mgehre-amd deleted the bump_to_82383d5f3fa8 branch September 2, 2024 08:49
