merge main into amd-staging #499

z1-cciauto · 2025-11-05T12:10:09Z

No description provided.

This is a bugfix in rematerialization, where the liveness of the subreg mask was incorrectly updated, causing a crash in the scheduler.

Folding a mask to a variable shift pair results in better code size as long as they are scalars that are <= XLen. Similar to llvm#158069

(llvm#131759) This isn't really the right check: we want to know that the intrinsic does not perform a true function call to any code (in the module or not). nocallback appears to be the closest thing to this property we have now, though. Fixes theoretical miscompiles with intrinsics like statepoint, which hide a call to a real function. Also do the same for inferring no-agpr usage.

…)" (llvm#166500) This reverts commit 93e860e. hasOneUse still has the null check, and it seems bad to be logically inconsistent across multiple of these predicate functions.

Avoid copying machine operands. Signed-off-by: John Lu <John.Lu@amd.com>

Forked from llvm/test/CodeGen/X86

Fixes llvm#114402. This patch accepts an empty enum in C as a Microsoft extension and introduces a new warning, `-Wmicrosoft-empty-enum`.
---------
Signed-off-by: yicuixi <qin_17914@126.com>
Co-authored-by: Erich Keane <ekeane@nvidia.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>

This allows more accurate alias analysis to apply at the bundle level. This has a bunch of minor effects in post-RA scheduling that look mostly beneficial to me, all of them in AMDGPU (the Thumb2 change is cosmetic).

The pre-existing (and unchanged) test in CodeGen/MIR/AMDGPU/custom-pseudo-source-values.ll tests that MIR with a bundle with MMOs can be parsed successfully.

v2:
- use cloneMergedMemRefs
- add another test to explicitly check the MMO bundling behavior

v3:
- use poison instead of undef to initialize the global variable in the test

Correct a typo in the triple that is used for the test. Because the OS was not recognised, it would fall to the non-Windows code generation.

This otherwise happens in ParseCaseExpression. If we don't call this, we don't perform the usual arithmetic conversions, etc.

…ication` (llvm#165659) Closes [llvm#157290](llvm#157290)

There doesn't seem much of a reason why this should be a struct. Make it a namespace instead.

… other parameter is non-const (llvm#166102) This patch enables `FoldOpIntoSelect` and `foldOpIntoPhi` for the cases when Op's second parameter is non-constant. It doesn't seem to bring significant improvements, but the compile-time impact is negligible.

…ethod bodies (llvm#166335) Since commit 842622b added support for overloading interface methods, a `using` directive is emitted for any interface method that does not require emission of a trait method, including for methods that define a method body. However, methods directly specifying a body (e.g., via the `methodBody` parameter of `InterfaceMethod`) are implemented directly in the interface class and are therefore not present in the associated trait. The generated `using` directive then refers to a non-existent method of the trait, resulting in an error upon compilation of the generated code. This patch changes `DefGen::emitTraitMethods()` so that `genTraitMethodUsingDecl()` is no longer invoked for interface methods with a body.

Split off from PR llvm#163525, this standalone patch replaces simple cases where undef is used as a value for arithmetic or getelementptr instructions. This will reduce the likelihood of contributors hitting the `undef deprecator` warning in github.

…ttribute lists. NFC. (llvm#166523) Makes it easier to compare constexpr/non-constexpr attribute defines. Allows clang-format to pack the attributes more efficiently.

…65085) This patch enables compile-time evaluation of AVX512 permutex2var intrinsics in constexpr contexts. Extend shuffle generic to handle both integer immediate and vector mask operands. Resolves llvm#161335

This is a follow up of PR llvm#165558. (1/n)

This patch updates the below mbarrier Ops to use AnyTypeOf[] construct:

```
* mbarrier.arrive
* mbarrier.arrive.noComplete
* mbarrier.test.wait
* cp.async.mbarrier.arrive
```

* Updated existing tests accordingly.
* Verified locally that there are no new regressions in the `integration` tests.
* TODO: Two more Ops remain and will be migrated in a subsequent PR.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>

Reverts llvm#166210 Buildbot failures in the libc on GPU bot: https://lab.llvm.org/buildbot/#/builders/10/builds/16711

…lvm#166376) If there are additional uses of the bit twiddled value as well as the rmw store, we can replace them with a (re)loaded copy of the full width integer value after the store. This requires some memory op chain handling: the additional (re)load is chained after the new store, and then any dependencies of the original store are chained after the (re)load.

Now that the `SourceManager::getExpansionLoc` and `SourceManager::getSpellingLoc` functions are efficient, delete the unnecessary code duplication in the `SourceManager::getDecomposedExpansionLoc` and `SourceManager::getDecomposedSpellingLoc` methods.

…166182) The search should proceed from CallInst to the beginning of BB since X2 can be rewritten and we need to catch the most recent write before the call. Patch by Yafet Beyene alulayafet@gmail.com

…lvm#166393) And recurse into records properly.

…upport (llvm#166370) The `FEnvSafeTest.cpp` test fails on AArch64 soft nofp configurations because LLVM libc does not provide a floating-point environment in these configurations. This patch adds another preprocessor guard on `__ARM_FP` to disable the test on those.

…llvm#166174) Dummy variables have an entry in `Program::Globals`, but they are not added to `GlobalIndices`. When registering redeclarations, we used to only patch up the global indices, but that left the dummy variables alone. Update the dummy variables of all redeclarations as well. Fixes llvm#165952

…m#166266) Explain more about use-after-free in llvm-twine-local. Add a note about manually adjusting code after applying the fix-it. Fixes llvm#154810

…der (llvm#166292) By default, the dialect conversion driver processes operations in pre-order: the initial worklist is populated pre-order. (New/modified operations are immediately legalized recursively.) This commit adds a new API for selective post-order legalization. Patterns can request an operation / region legalization via `ConversionPatternRewriter::legalize`. They can call these helper functions on nested regions before rewriting the operation itself. Note: In rollback mode, a failed recursive legalization typically leads to a conversion failure. Since recursive legalization is performed by separate pattern applications, there is no way for the original pattern to recover from such a failure.
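The difference between the two processing orders can be sketched on a toy tree. This is only an illustration of pre-order versus post-order traversal: `Op`, `visitPreOrder`, and `visitPostOrder` are hypothetical names for this example and are not part of the MLIR API or of the `ConversionPatternRewriter::legalize` interface mentioned above.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an operation that contains nested operations.
struct Op {
  std::string name;
  std::vector<Op> nested;
};

// Pre-order: record the operation first, then everything nested inside it.
void visitPreOrder(const Op &op, std::vector<std::string> &order) {
  order.push_back(op.name);
  for (const Op &child : op.nested) visitPreOrder(child, order);
}

// Post-order: record nested operations first, then the operation itself.
void visitPostOrder(const Op &op, std::vector<std::string> &order) {
  for (const Op &child : op.nested) visitPostOrder(child, order);
  order.push_back(op.name);
}
```

For a tree `func { loop { add }, ret }`, the pre-order visit records `func, loop, add, ret`, while the post-order visit records `add, loop, ret, func`, so a nested operation is handled before the operation that encloses it.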

z1-cciauto · 2025-11-05T12:12:06Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/2661

VigneshwarJ and others added 30 commits November 4, 2025 22:40

[CodeGen] Register-coalescer remat fix subreg liveness (llvm#165662)

b5f2001

This is a bugfix in rematerialization, where the liveness of the subreg mask was incorrectly updated, causing a crash in the scheduler.

[RISCV] Implement shouldFoldMaskToVariableShiftPair (llvm#166159)

6111ff1

Folding a mask to a variable shift pair results in better code size as long as they are scalars that are <= XLen. Similar to llvm#158069
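As a rough illustration of the two forms involved, the sketch below clears the low `y` bits of a value once with a variable mask and once with a pair of variable shifts. It assumes the fold in question is the usual `x & (-1 << y)` versus `(x >> y) << y` rewrite and uses a 64-bit scalar as a stand-in for XLen; the function names are made up for this example and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Mask form: build the mask with a variable shift, then AND it with x.
std::uint64_t clearLowBitsMask(std::uint64_t x, unsigned y) {
  assert(y < 64 && "shift amount must be below the bit width");
  return x & (~std::uint64_t{0} << y);
}

// Shift-pair form: shift the low bits out, then shift zeros back in.
std::uint64_t clearLowBitsShiftPair(std::uint64_t x, unsigned y) {
  assert(y < 64 && "shift amount must be below the bit width");
  return (x >> y) << y;
}

// Checks that both forms agree for every valid shift amount.
bool formsAgree(std::uint64_t x) {
  for (unsigned y = 0; y < 64; ++y) {
    if (clearLowBitsMask(x, y) != clearLowBitsShiftPair(x, y)) return false;
  }
  return true;
}
```

The shift-pair form needs no all-ones constant in a register, which is one plausible reason it can come out smaller.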

Revert "IR: Remove null UseList checks in hasNUses methods (llvm#165929…

044e0f0

…)" (llvm#166500) This reverts commit 93e860e. hasOneUse still has the null check, and it seems bad to be logically inconsistent across multiple of these predicate functions.

[AMDGPU][NFC] Avoid copying MachineOperands (llvm#166293)

87b1d35

Avoid copying machine operands. Signed-off-by: John Lu <John.Lu@amd.com>

[msan][test] Add some avx512bf16 tests (llvm#166219)

4c2a9c4

Forked from llvm/test/CodeGen/X86

test: correct typo in RUN line (llvm#166511)

66f52ca

Correct a typo in the triple that is used for the test. Because the OS was not recognised, it would fall to the non-Windows code generation.

[clang] Call ActOnCaseExpr even if the 'case' is missing (llvm#166326)

9016c60

This otherwise happens in ParseCaseExpression. If we don't call this, we don't perform the usual arithmetic conversions, etc.

Fix bazel build issue caused by llvm#166259 (llvm#166519)

98f0139

[libc++] Remove <cstdlib> include from <exception> (llvm#166340)

988c1b1

[clang-tidy] Rename cert-dcl58-cpp to `bugprone-std-namespace-modif…

51d0f6d

…ication` (llvm#165659) Closes [llvm#157290](llvm#157290)

[libc++][NFC] Make __type_info_implementations a namespace (llvm#166339)

5b5d0a8

There doesn't seem much of a reason why this should be a struct. Make it a namespace instead.

[Headers][X86] avx ifma - move constexpr to the end of the function a…

0314b93

…ttribute lists. NFC. (llvm#166523) Makes it easier to compare constexpr/non-constexpr attribute defines. Allows clang-format to pack the attributes more efficiently.

[Clang] Add constexpr support for AVX512 permutex2 intrinsics (llvm#1…

cc9ad9a

…65085) This patch enables compile-time evaluation of AVX512 permutex2var intrinsics in constexpr contexts. Extend shuffle generic to handle both integer immediate and vector mask operands. Resolves llvm#161335
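To give a feel for what compile-time evaluation of a two-source permute involves, here is a scalar model written as a C++17 `constexpr` function. It is a sketch, not the Clang implementation: `permute2Model` is a hypothetical name, the lane count is shrunk to 4 for readability, and the index decoding (low bits pick the element, the next bit picks the source) follows the documented behaviour of the 32-bit `permutex2var` form as I understand it.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of a two-source permute on N lanes (N must be a power of two).
// For each lane, the low log2(N) bits of the index pick an element and the
// next bit picks the source: 0 selects `a`, 1 selects `b`.
template <std::size_t N>
constexpr std::array<std::int32_t, N>
permute2Model(const std::array<std::int32_t, N> &a,
              const std::array<std::uint32_t, N> &idx,
              const std::array<std::int32_t, N> &b) {
  static_assert(N != 0 && (N & (N - 1)) == 0,
                "lane count must be a power of two");
  std::array<std::int32_t, N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    const std::size_t lane = idx[i] & (N - 1);
    const bool fromB = (idx[i] & N) != 0;
    out[i] = fromB ? b[lane] : a[lane];
  }
  return out;
}

constexpr std::array<std::int32_t, 4> kA{10, 11, 12, 13};
constexpr std::array<std::int32_t, 4> kB{20, 21, 22, 23};
constexpr std::array<std::uint32_t, 4> kIdx{0, 5, 2, 7};
constexpr auto kOut = permute2Model(kA, kIdx, kB);

// The whole permute is evaluated by the compiler.
static_assert(kOut[0] == 10 && kOut[1] == 21 && kOut[2] == 12 && kOut[3] == 23,
              "lanes 1 and 3 come from kB, lanes 0 and 2 from kA");
```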

Revert "CodeGen: Record MMOs in finalizeBundle" (llvm#166520)

8339839

Reverts llvm#166210 Buildbot failures in the libc on GPU bot: https://lab.llvm.org/buildbot/#/builders/10/builds/16711

[AMDGPU] Another test for missing S_WAIT_XCNT (llvm#166154)

fb49adb

[BOLT][AArch64] Fix search to proceed upwards from memcpy call (llvm#…

a65867a

…166182) The search should proceed from CallInst to the beginning of BB since X2 can be rewritten and we need to catch the most recent write before the call. Patch by Yafet Beyene alulayafet@gmail.com
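The direction of the search can be sketched as follows. This is a simplified model under stated assumptions, not BOLT code: `Inst` and `lastWriteBefore` are hypothetical, and an instruction is reduced to the single register it writes.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical, simplified instruction: only records which register it writes.
struct Inst {
  int writtenReg;  // -1 when the instruction writes no register
};

// Walk from the instruction just before `callIdx` toward the start of the
// block and return the index of the first write to `reg` that is found,
// which is the most recent write before the call.
std::optional<std::size_t> lastWriteBefore(const std::vector<Inst> &block,
                                           std::size_t callIdx, int reg) {
  assert(callIdx < block.size() && "call must be inside the block");
  for (std::size_t i = callIdx; i-- > 0;) {
    if (block[i].writtenReg == reg) return i;
  }
  return std::nullopt;
}
```

If register 2 is written at positions 0 and 2 and the call sits at position 4, the backward walk returns 2, whereas a forward scan that stops at the first match would return the stale write at position 0.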

[clang][bytecode] Print primitive arrays in Descriptor::dumpFull() (l…

5821b09

…lvm#166393) And recurse into records properly.

[clang-tidy][doc] add more information in twine-local's document (llv…

e856483

…m#166266) Explain more about use-after-free in llvm-twine-local. Add a note about manually adjusting code after applying the fix-it. Fixes llvm#154810

[CIR] Add support for storing into _Atomic variables (llvm#165872)

c1dc064

matthias-springer and others added 2 commits November 5, 2025 21:04

merge main into amd-staging

bddada5

z1-cciauto requested a review from a team November 5, 2025 12:10

ronlieb closed this Nov 5, 2025


29 participants
