merge main into amd-staging #500

z1-cciauto · 2025-11-05T14:40:43Z

No description provided.

…upport (llvm#166370) The `FEnvSafeTest.cpp` test fails on AArch64 soft nofp configurations because LLVM libc does not provide a floating-point environment in these configurations. This patch adds another preprocessor guard on `__ARM_FP` to disable the test on those.

…llvm#166174) Dummy variables have an entry in `Program::Globals`, but they are not added to `GlobalIndices`. When registering redeclarations, we used to only patch up the global indices, but that left the dummy variables alone. Update the dummy variables of all redeclarations as well. Fixes llvm#165952

…m#166266) explain more about use-after-free in llvm-twine-local add note about manually adjusting code after applying fix-it. fixed: llvm#154810

…der (llvm#166292) By default, the dialect conversion driver processes operations in pre-order: the initial worklist is populated pre-order. (New/modified operations are immediately legalized recursively.) This commit adds a new API for selective post-order legalization. Patterns can request an operation / region legalization via `ConversionPatternRewriter::legalize`. They can call these helper functions on nested regions before rewriting the operation itself. Note: In rollback mode, a failed recursive legalization typically leads to a conversion failure. Since recursive legalization is performed by separate pattern applications, there is no way for the original pattern to recover from such a failure.

…rn` (llvm#166513) Fixes a bug in `VectorConvertToLLVMPattern`, which converted operations with unsupported FP types. E.g., `arith.addf ... : f4E2M1FN` was lowered to `llvm.fadd ... : i4`, which does not verify. There are a few more patterns that have the same bug. Those will be fixed in follow-up PRs. This commit is in preparation of adding an `APFloat`-based lowering for `arith` operations with unsupported floating-point types.

…s.py. (llvm#164965) This change enables update_llc_test_checks.py to automatically generate MIR checks for RUN lines that use `-stop-before` or `-stop-after` flags allowing tests to verify intermediate compilation stages (e.g., after instruction selection but before peephole optimizations) alongside the final assembly output. If `-debug-only` flag is present in the run line it's considered as the main point of interest for testing and stop flags above are ignored (that is no MIR checks are generated). This resulted from the scenario, when I needed to test two instruction matching patterns where the later pattern in the peepholer reverts the earlier pattern in the instruction selector and distinguish it from the case when the earlier pattern didn't worked at all. Initially created by Claude Sonnet 4.5 it was improved later to handle conflicts in MIR <-> ASM prefixes and formatting.

…157818) This patch introduces the LASX and LSX conversion intrinsics: - <8 x float> @llvm.loongarch.lasx.cast.128.s(<4 x float>) - <4 x double> @llvm.loongarch.lasx.cast.128.d(<2 x double>) - <4 x i64> @llvm.loongarch.lasx.cast.128(<2 x i64>) - <8 x float> @llvm.loongarch.lasx.concat.128.s(<4 x float>, <4 x float>) - <4 x double> @llvm.loongarch.lasx.concat.128.d(<2 x double>, <2 x double>) - <4 x i64> @llvm.loongarch.lasx.concat.128(<2 x i64>, <2 x i64>) - <4 x float> @llvm.loongarch.lasx.extract.128.lo.s(<8 x float>) - <2 x double> @llvm.loongarch.lasx.extract.128.lo.d(<4 x double>) - <2 x i64> @llvm.loongarch.lasx.extract.128.lo(<4 x i64>) - <4 x float> @llvm.loongarch.lasx.extract.128.hi.s(<8 x float>) - <2 x double> @llvm.loongarch.lasx.extract.128.hi.d(<4 x double>) - <2 x i64> @llvm.loongarch.lasx.extract.128.hi(<4 x i64>) - <8 x float> @llvm.loongarch.lasx.insert.128.lo.s(<8 x float>, <4 x float>) - <4 x double> @llvm.loongarch.lasx.insert.128.lo.d(<4 x double>, <2 x double>) - <4 x i64> @llvm.loongarch.lasx.insert.128.lo(<4 x i64>, <2 x i64>) - <8 x float> @llvm.loongarch.lasx.insert.128.hi.s(<8 x float>, <4 x float>) - <4 x double> @llvm.loongarch.lasx.insert.128.hi.d(<4 x double>, <2 x double>) - <4 x i64> @llvm.loongarch.lasx.insert.128.hi(<4 x i64>, <2 x i64>)

…__support/math folder. (llvm#162019) Part of llvm#147386 in preparation for: https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450

…n-overaligned-type` (llvm#166546)

…f macros (llvm#165855)

…ster (llvm#164479) Technically, it is possible that the a callee-saved register is saved in different locations. CFIInstrInserter should handle this, but currently it does not.

Use cannotHoistOrSinkRecipe to forbid sinking allocas.

When ISL encounters an internal error, it sets the error flag, but it is not isl_error_quota that was already checked. Check for general errors and abort the schedule optimization if that happens, instead of continuing on the good path. The error occured when compiling llvm-test-suite's MultiSource/Applications/JM/lencod/leaky_bucket.c with Polly enabled. Not adding a test case because it depends on ISL internals. We do not want to a test case to depend on which version of ISL is used.

…st_checks.py." (llvm#166549) Reverts llvm#164965

Currently, the ARM backend incorrectly parses every `arm` prefixed arch to be non-thumb, but `armv6m` is THUMB and doesnt have ARM ops causing the test to fail when compiling to assembly and not LLVM IR: `error: Function 'foo' uses ARM instructions, but the target does not support ARM mode execution.` This only happens when invoking cc1 directly and not the Clang driver. As a quick triage, this patch changes the tests to use `thumb`. Uncovered by llvm#151404

…lvm#160536) As discussed in llvm#153402, we have inefficiences in handling constant pool access that are difficult to address. Using an IR pass to promote double constants to a global allows a higher degree of control of code generation for these accesses, resulting in improved performance on benchmarks that might otherwise have high register pressure due to accessing constant pool values separately rather than via a common base. Directly promoting double constants to separate global values and relying on the global merger to do a sensible thing would be one potential avenue to explore, but it is _not_ done in this version of the patch because: * The global merger pass needs fixes. For instance it claims to be a function pass, yet all of the work is done in initialisation. This means that attempts by backends to schedule it after a given module pass don't actually work as expected. * The heuristics used can impact codegen unexpectedly, so I worry that tweaking it to get the behaviour desired for promoted constants may lead to other issues. This may be completely tractable though. Now that llvm#159352 has landed, the impact on terms if dynamically executed instructions is slightly smaller (as we are starting from a better baseline), but still worthwhile in lbm and nab from SPEC. Results below are for rva22u64: ``` Benchmark Baseline This PR Diff (%) ============================================================ ============================================================ 500.perlbench_r 180668945687 180666122417 -0.00% 502.gcc_r 221274522161 221277565086 0.00% 505.mcf_r 134656204033 134656204066 0.00% 508.namd_r 217646645332 216699783858 -0.44% 510.parest_r 291731988950 291916190776 0.06% 511.povray_r 30983594866 31107718817 0.40% 519.lbm_r 91217999812 87405361395 -4.18% 520.omnetpp_r 137699867177 137674535853 -0.02% 523.xalancbmk_r 284730719514 284734023366 0.00% 525.x264_r 379107521547 379100250568 -0.00% 526.blender_r 659391437610 659447919505 0.01% 531.deepsjeng_r 350038121654 350038121656 0.00% 538.imagick_r 238568674979 238560772162 -0.00% 541.leela_r 405660852855 405654701346 -0.00% 544.nab_r 398215801848 391352111262 -1.72% 557.xz_r 129832192047 129832192055 0.00% ``` --- Notes for reviewers: * As discussed at the sync-up meeting, the suggestion is to try to land an incremental improvement to the status quo even if there is more work to be done around the general issue of constant pool handling. We can discuss here if that is actually the best next step or not, but I just wanted to clarify that's why this is being posted with a somewhat narrow scope. * I've disabled transformations both for RV32 and on systems without D as both cases saw some regressions.

z1-cciauto · 2025-11-05T14:41:34Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/2668

vhscampos and others added 20 commits November 5, 2025 11:14

[clang-tidy][doc] add more information in twine-local's document (llv…

e856483

…m#166266) explain more about use-after-free in llvm-twine-local add note about manually adjusting code after applying fix-it. fixed: llvm#154810

[CIR] Add support for storing into _Atomic variables (llvm#165872)

c1dc064

[gn] port 0c73009

04c01f0

[clang-tidy][NFC] Fix broken link in `bugprone-default-operator-new-o…

98ca2e8

…n-overaligned-type` (llvm#166546)

[Clang][NFC] Refactor SemaCXX/dllexport.cpp to use -verify= instead o…

a389472

…f macros (llvm#165855)

[RISCV] Add a test for multiple save locations of a callee-saved regi…

b675c0c

…ster (llvm#164479) Technically, it is possible that the a callee-saved register is saved in different locations. CFIInstrInserter should handle this, but currently it does not.

[VPlan] Avoid sinking allocas in sinkScalarOperands (llvm#166135)

1de55c9

Use cannotHoistOrSinkRecipe to forbid sinking allocas.

Revert "[utils][UpdateLLCTestChecks] Add MIR support to update_llc_te…

ba1dbdd

…st_checks.py." (llvm#166549) Reverts llvm#164965

[X86] Add test coverage for llvm#166534 (llvm#166552)

438a18c

merge main into amd-staging

8ccff62

z1-cciauto requested a review from a team November 5, 2025 14:40

kzhuravl closed this Nov 5, 2025

kzhuravl deleted the upstream_merge_202511050940 branch November 5, 2025 14:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

merge main into amd-staging #500

merge main into amd-staging #500

Uh oh!

z1-cciauto commented Nov 5, 2025

Uh oh!

z1-cciauto commented Nov 5, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

20 participants

merge main into amd-staging #500

merge main into amd-staging #500

Uh oh!

Conversation

z1-cciauto commented Nov 5, 2025

Uh oh!

z1-cciauto commented Nov 5, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

20 participants