forked from llvm/llvm-project
merge main into amd-staging #675
Merged
z1-cciauto
merged 53 commits into
amd-staging
from
amd/merge/upstream_merge_20251124203139
Nov 25, 2025
If the expected trip count is less than the VF, the vector loop will only execute a single iteration. When that's the case, the cost of the middle block has the same impact as the cost of the vector loop. Include it in isOutsideLoopWorkProfitable to avoid vectorizing when the extra work in the middle block makes it unprofitable. Note that isOutsideLoopWorkProfitable already scales the cost of blocks outside the vector region, but the patch restricts accounting for the middle block to cases where VF <= ExpectedTC, to initially catch some worst cases and avoid regressions. This initial version should specifically avoid unprofitable tail-folding for loops with low trip counts after re-applying llvm#149042. PR: llvm#168949
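The reasoning in the first sentences can be illustrated with a toy cost comparison. This is only a sketch under stated assumptions: `LoopCosts`, `vectorIterations`, and `isVectorizationProfitable` are hypothetical names, the costs are abstract units, and none of this is the actual logic of `isOutsideLoopWorkProfitable`. It assumes a tail-folded loop, so the vector loop runs ceil(ExpectedTC / VF) times.

```cpp
#include <cstdint>

// Hypothetical toy cost model, for illustration only (not LLVM's code).
struct LoopCosts {
  uint64_t ScalarIterCost;  // cost of one scalar loop iteration
  uint64_t VectorIterCost;  // cost of one vector loop iteration
  uint64_t MiddleBlockCost; // cost of the work done after the vector loop
};

// Iterations of a tail-folded vector loop: ceil(ExpectedTC / VF).
inline uint64_t vectorIterations(uint64_t ExpectedTC, uint64_t VF) {
  return (ExpectedTC + VF - 1) / VF;
}

// Vectorizing pays off only if the vector loop plus the middle block
// is cheaper than running every iteration in scalar form.
inline bool isVectorizationProfitable(const LoopCosts &C, uint64_t ExpectedTC,
                                      uint64_t VF) {
  uint64_t ScalarCost = ExpectedTC * C.ScalarIterCost;
  uint64_t VectorCost =
      vectorIterations(ExpectedTC, VF) * C.VectorIterCost + C.MiddleBlockCost;
  return VectorCost < ScalarCost;
}
```

With a trip count of 3 and a VF of 8 the vector loop runs once, so a middle block of cost 5 carries half the weight of a vector iteration of cost 10 and can flip the decision on its own.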
Currently, in the following snippet, the second designated initializer
is incorrectly detected as an Objective-C method expression. Fix that and
add a test to make sure we don't regress.
```
Foo foo[] = {[0] = 1, [1] = 2};
```
…Frontend and flangFrontend (llvm#165277)" This reverts commit 3773bbe.
Adds support for the `__builtin_ia32_kshiftli` and `__builtin_ia32_kshiftri` X86 builtins. Part of llvm#167765 --------- Signed-off-by: vishruth-thimmaiah <vishruththimmaiah@gmail.com>
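For readers unfamiliar with mask shifts, the following is a portable emulation of what an 8-bit mask-register shift computes, as far as I understand the underlying KSHIFTLB/KSHIFTRB instructions: the mask is shifted by an immediate count, and a count above 7 produces zero. The names `kshiftl8` and `kshiftr8` are hypothetical helpers for illustration, not the builtins added by this commit, and wider mask widths are not covered.

```cpp
#include <cstdint>

// Hypothetical emulation of an 8-bit mask shift left.
// Counts larger than 7 clear the whole mask.
constexpr uint8_t kshiftl8(uint8_t Mask, unsigned Count) {
  return Count > 7 ? uint8_t{0} : static_cast<uint8_t>(Mask << Count);
}

// Hypothetical emulation of an 8-bit mask shift right (logical).
// Counts larger than 7 clear the whole mask.
constexpr uint8_t kshiftr8(uint8_t Mask, unsigned Count) {
  return Count > 7 ? uint8_t{0} : static_cast<uint8_t>(Mask >> Count);
}
```

The explicit range check matters because shifting a promoted `int` by 8 in C++ would not discard the bits the way the hardware mask shift does until the result is truncated, and larger counts would eventually be undefined behavior.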
…gFrontend and flangFrontend (llvm#165277)" This reverts commit 40334b8. Unfortunately the revert breaks the build.
…Frontend and flangFrontend (llvm#165277)" (llvm#169397) This reverts commit 3773bbe and relands the last revert attempt 40334b8. 3773bbe broke the build for the build configuration described in here: llvm#165277 (comment)
) This PR introduces a new type of map lowering for record types that Clang currently supports, in which a user can map a top-level record type and then individual members with different mappings, effectively creating a sort of "overlapping" mapping that we attempt to cut around. This is currently used predominantly in Fortran when mapping descriptors and their data: we map the descriptor and its data with separate map modifiers and "cut around" the pointer data, so that we do not overwrite it unless the runtime deems it a necessary action based on its reference counting mechanism. However, it is a mechanism that will also come in handy/trigger when a user explicitly maps a record type (derived type or structure) and then explicitly maps a member with a different map type. These additions were predominantly in the OpenMPToLLVMIRTranslation.cpp file and phase; however, one Flang test that checks end-to-end IR compilation (as far as we care for now at least) was altered. 2/3 required PRs to enable declare target to mapping; look at PR 3/3 to check for full green passes (this one will fail a number of tests due to some dependencies). Co-authored-by: Raghu Maddhipatla raghu.maddhipatla@amd.com
…ntation (llvm#119589) While the infrastructure for declare target to/enter and link for variables exists in the MLIR dialect and at the Flang level, the lowering from MLIR -> LLVM IR is currently only in place for variables that have the link clause applied. This PR aims to extend that lowering to an initial implementation that incorporates declare target to as well, which primarily requires changes in the OpenMPToLLVMIRTranslation phase. However, a minor addition to the OpenMP dialect was required to extend the declare target enumerator to include a default None field as well. This also requires a minor change to the Flang lowering's MapInfoFinalization.cpp pass to alter the map type for descriptors to deal with cases where a variable is marked declare to. Currently, when a descriptor variable is mapped declare target to, the descriptor component can become attached and cannot be updated. This results in issues when an unusual allocation range is specified (effectively an off-by-X error). The current solution is to always map the descriptor, as we always require an up-to-date version of this data. However, this also requires an interlinked PR that adds a more intricate type of mapping of structures/record types that Clang currently implements, to circumvent the overwriting of the pointer in the descriptor. 3/3 required PRs to enable declare target to mapping; this PR should pass all tests and provide an all-green CI. Co-authored-by: Raghu Maddhipatla raghu.maddhipatla@amd.com
…lvm#168973) (llvm#169091) This reverts commit 418204d.
Adds an explicit include of `<cassert>` in StringTable.h rather than relying on the one in StringRef.h. Fixes potential compile errors if assert() was undef'ed between StringRef.h and StringTable.h inclusion.
This verifier check will complain if there aren't enough implicit operands -- so it doesn't *allow* those operands, it *requires* them.
…8417) This patch adds support for lowering custom reductions to MLIR. It also enhances the capability of the pass to automatically mark functions as "declare target" by traversing custom reduction initializers and combiners.
…lvm#169235) This fixes test errors like this, at least for a mingw target, if building with Clang 21 instead of Clang 20, as in the CI environment: # .---command stderr------------ # | error: 'expected-error' diagnostics seen but not expected: # | File C:\a\llvm-mingw\llvm-mingw\llvm-project\libcxx\test\std\input.output\file.streams\c.files\gets-removed.verify.cpp Line 16: cannot initialize a parameter of type 'char *' with an lvalue of type 'const char *' # | 1 error generated. # `----------------------------- # error: command failed with exit status: 1 This extra, unexpected diagnostic appears in Clang 21, since commit 9eef4d1 ("Remove delayed typo expressions"). Before this, we got the expected diagnostic `error: no member named 'gets' in namespace 'std'`, with the typo correction hint `did you mean 'puts'?`. After this change, we get the typo correction hint `did you mean simply 'gets'?` instead. And with the typo correction finding `::gets`, it goes on to produce a second diagnostic about mismatched parameter for that function. Avoid these unexpected diagnostics by passing the right type of parameter to the gets function.
…8468) This commit adds a `MockDwarfDelegate` class that can be used to control what dwarf version is used when evaluating an expression. We also add a simple test that shows how dwarf version can change the result of the expression.
This header is only ever used inside `src/`, so we might as well move it there. As a drive-by this also removes some dead code.
…ualifiers of type locs. (llvm#167619) Previously, e.g. for TypeLoc "MyNamespace::MyClass", `node()` selects only "MyClass" without the qualifier. With this change, it now selects "MyNamespace::MyClass". --------- Co-authored-by: Florian Mayer <fmayer@google.com>
…lvm#168857) This avoids dozens of instances of benign error messages being printed when running the tests on e.g. Windows: Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: module 'os' has no attribute 'sysconf' Co-authored-by: Florian Mayer <fmayer@google.com>
Only some Fortran source files in flang/test/Lower have been modified. The other files in the directory will be cleaned up in subsequent commits.
…vm#169327) OffsetSizeAndStrideOpInterface does not specify whether it's operating on the input or output shape and in fact different ops implement this in different ways, which is also why SubviewOp is special cased here. This "marked as dynamic but not really dynamic" folding is better handled by shape inference, so just remove the bad fold.
SVE depends on a combination of host support and operating system support. Sometimes those don't line up with detected host CPU name; make sure SVE is disabled when it isn't available. Implement this for both Windows and Linux. (We don't have a codepath for other operating systems. If someone wants to implement this, it should be possible to adapt fmv code from compiler-rt.) While I'm here, also add support for detecting other Windows CPU features. For Windows, declare constants ourselves so the code builds on older SDKs; we also do this in compiler-rt.
This is currently a no-op. It will be supported for the minimal runtime in a follow-up. This allows improving codegen for fsanitize-recover by compiling the handlers with [[clang::preserve_all]], which makes sure that the caller does not need to spill any registers. We do not expect this function to be called frequently, so this is beneficial for code size.
Addresses feedback from llvm#147508 (review) : - Update access modifiers for SYCLWrapper members. - Update comments. - Update types.
…#169346) PR168884 flagged compiler directives (!dir$ ...) inside OpenMP loop constructs as errors. This caused some customer applications to fail to compile (issue 169229). Downgrade the error to a warning, and gracefully ignore compiler directives when lowering loop constructs to MLIR. Fixes llvm#169229
In the LoweringPrepare pass, the handling for global array destructor lowering was mishandling the insertion point, so that if this code needed to create a declaration for the __cxa_atexit function, that declaration was being created in the dtor region, rather than at module scope. This change fixes that.
AMDGPU requires more complex CFI rules. Normally these would be expressed with .cfi_escape; however, this would make the CFI unreadable and make it difficult to update registers in CFI instructions (also something AMDGPU requires). Authored-by: Emma Pilkington <Emma.Pilkington@amd.com>
This is mostly the output of a vibe coded script running on VecFuncs.def, with a lot of manual cleanups and fixing where the vibes were off. This is not yet wired up to anything (except for the handful of calls which are already manually enabled). In the future the SystemLibrary mechanism needs to be generalized to allow plugging these sets in based on the flag. One annoying piece is there are some name conflicts across the libraries. Some of the libmvec functions have name collisions with some sleef functions. I solved this by just adding a prefix to the libmvec functions. It would probably be a good idea to add a prefix to every group. It gets ugly, particularly since some of the sleef functions started to use a Sleef_ prefix, but mostly do not.
… supported (llvm#169252) Fix kernel build when cl_khr_fp64 is not enabled: opencl-c.h:13785:50: error: unknown type name 'atomic_double' 13785 | double __ovld atomic_fetch_min(volatile __global atomic_double *, double); opencl-c.h:13785:67: error: use of type 'double' requires cl_khr_fp64 and __opencl_c_fp64 support 13785 | double __ovld atomic_fetch_min(volatile __global atomic_double *, double); This is a regression introduced by 423bdb2. Before that commit, __opencl_c_ext_fp64_global_atomic_add was guarded by cl_khr_fp64 in opencl-c-base.h.
Previously, i16 `bswap` was lowered using multiple shift and OR operations. This patch adds a pattern to directly lower i16 `bswap` using the `PRMT` (permute) instruction, which is more efficient. Additionally, the lowering of `bswap` is moved into operation legalization, which allows for DAGCombiner to optimize the lowered code.
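The two strategies can be compared with a small host-side model. This is a sketch, not the NVPTX lowering itself: `bswap16Shift` shows the shift-and-OR form, and `permuteBytes` is a hypothetical, simplified model of a byte permute in which each 4-bit selector nibble picks one byte out of the eight bytes of `{B, A}`. The sign-replication bit of each nibble is ignored here, and the selector value `0x3201` is my own choice for illustration; the source does not state which selector the new pattern uses.

```cpp
#include <cstdint>

// Shift-and-OR byte swap of a 16-bit value (the older lowering strategy).
constexpr uint16_t bswap16Shift(uint16_t X) {
  return static_cast<uint16_t>((X << 8) | (X >> 8));
}

// Simplified, hypothetical model of a byte permute: result byte I is the
// byte of the 64-bit pool {B, A} chosen by the low 3 bits of selector
// nibble I. The nibble's top bit (sign replication) is not modeled.
inline uint32_t permuteBytes(uint32_t A, uint32_t B, uint32_t Selector) {
  uint64_t Pool = (static_cast<uint64_t>(B) << 32) | A;
  uint32_t Result = 0;
  for (unsigned I = 0; I < 4; ++I) {
    unsigned Index = (Selector >> (4 * I)) & 0x7;
    uint32_t Byte = static_cast<uint32_t>((Pool >> (8 * Index)) & 0xFF);
    Result |= Byte << (8 * I);
  }
  return Result;
}
```

In this model a single permute with selector `0x3201` swaps the two low bytes and leaves the upper two in place, which is the same result the shift-and-OR version needs three operations to produce.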
This way if the downstream consuming project uses zstd we make sure they are dedup'd. This uses a new rule to make sure layering_check still works while allowing us to augment the upstream library rules with LLVM specific `defines`.
Clarify how Clang-generated HIP fat binaries are registered and unregistered with the HIP runtime, and how this interacts with global constructors, destructors, and atexit handlers. Document that there is no strong guarantee on ordering relative to user-defined global ctors/dtors, recommend that HIP application developers avoid using kernels or device variables from global ctors/dtors, and describe the implications for HIP runtime developers (synchronization and guards in __hipRegisterFatBinary/__hipUnregisterFatBinary). This is motivated by questions from HIP application and runtime developers about fat binary registration/unregistration order and its potential interference with their own initialization and teardown code.
llvm#169428) As per OpenMP 5.1, the pointers are expected to retain their original values when a lookup fails and there is no device pointer to translate to.
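The described behavior amounts to a lookup with an identity fallback. A minimal sketch, assuming a plain host-to-device table: `DeviceMap` and `translateOrKeep` are hypothetical names for illustration and do not correspond to the OpenMP runtime's data structures or entry points.

```cpp
#include <unordered_map>

// Hypothetical host-to-device pointer table, for illustration only.
using DeviceMap = std::unordered_map<const void *, void *>;

// Returns the device pointer for HostPtr if one is recorded. When the
// lookup fails, the pointer keeps its original value instead of being
// replaced by null or some other sentinel.
inline void *translateOrKeep(const DeviceMap &Map, void *HostPtr) {
  auto It = Map.find(HostPtr);
  return It == Map.end() ? HostPtr : It->second;
}
```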
…lvm#169275) Add support for `arith.extf` and `arith.truncf`. No support for custom rounding modes yet.
This adds a pyproject.toml file for packaging the clang Python bindings as an sdist tarball and a pure Python wheel. It is required to move updates of the clang and libclang PyPI packages to the LLVM monorepo. Versioning information is derived from LLVM git tags (using hatch-vcs, which is based on setuptools_scm), so no manual updates are needed to bump version numbers. The minimum Python version required is set to 3.10 because cindex.py uses PEP 604 union type syntax (str | bytes | None). The .git_archival.txt file is populated with the information needed to get an accurate version if the bindings are installed from an LLVM/clang source code archive. The .gitignore file is populated with files that may get created as part of building/testing the sdist and wheel and that should not be committed to source control. This is the first step toward addressing llvm#125220 and moving publishing of the clang and libclang PyPI packages into the LLVM monorepo. Signed-off-by: Ryan Mast <mast.ryan@gmail.com>
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [actions/checkout](https://redirect.github.com/actions/checkout) | action | major | `v5.0.0` -> `v6.0.0` |
…lvm#169277) Add support for `arith.fptosi` and `arith.fptoui`.
…m#169330) `[[nodiscard]]` should be applied to functions where discarding the return value is most likely a correctness issue. - https://libcxx.llvm.org/CodingGuidelines.html#apply-nodiscard-where-relevant
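As a small illustration of the guideline, consider a query function whose result is the only reason to call it. `IntStack` is a hypothetical class written for this example, not libc++ code; the point is that calling `empty()` and ignoring the result is almost certainly a mistake (for instance, a typo for a clearing operation), so the attribute lets the compiler warn about it.

```cpp
#include <vector>

// Hypothetical container, for illustration only.
class IntStack {
public:
  // Discarding this result is most likely a bug, so mark it [[nodiscard]].
  [[nodiscard]] bool empty() const noexcept { return Data.empty(); }

  // No return value, so there is nothing to mark here.
  void push(int V) { Data.push_back(V); }

private:
  std::vector<int> Data;
};
```

A statement such as `S.empty();` on its own line would then produce a compiler warning, while `if (S.empty())` is unaffected.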
This allows the compiler to verify we've covered all enum values.
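The code this commit touches is not shown here, so the following is only a generic sketch of the technique, assuming the change amounts to a `switch` over an enum without a `default:` label. `Color` and `toString` are hypothetical. With every enumerator listed explicitly, compilers that implement a switch-coverage warning (such as `-Wswitch` in Clang and GCC) can report any enumerator that is added later but not handled.

```cpp
#include <string>

// Hypothetical enum, for illustration only.
enum class Color { Red, Green, Blue };

inline std::string toString(Color C) {
  // No default label: a newly added enumerator that is not handled
  // here can be flagged by the compiler's switch-coverage warning.
  switch (C) {
  case Color::Red:
    return "red";
  case Color::Green:
    return "green";
  case Color::Blue:
    return "blue";
  }
  // Reached only if C holds a value outside the declared enumerators.
  return "unknown";
}
```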
…lvm#169284) Add support for `arith.sitofp` and `arith.uitofp`.
Collaborator
dpalermo
approved these changes
Nov 25, 2025
No description provided.