forked from llvm/llvm-project
merge main into amd-staging #598
Merged
ronlieb
merged 38 commits into
amd-staging
from
amd/merge/upstream_merge_20251115172500
Nov 16, 2025
… __builtin_elementwise_sqrt (llvm#168057) Followup to llvm#165682
…168084) In llvm#165748 constant expressions were allowed in `collectPossibleValues` because we are still using insertelement + shufflevector idioms to represent a scalable vector splat. However, it also accepts some unresolved constants like ptrtoint of globals or the pointer difference between two globals. We could ask the user to check this case with the constant folding API. However, since we have not observed any real-world benefit from handling constant expressions, I decided to be more conservative and only handle immediate constants in the helper function. With this patch, we don't need to touch the SimplifyCFG part, as the values can only be either ConstantInt or undef/poison values (NB: switch on an undef condition is UB). Fixes the miscompilation reported by llvm#165748 (comment)
These tests were only checking the specialized prefix, leaving common code unchecked (and incorrect). Checked code was also not using patterns for SSA values.
Construct SCEVs for VPWidenIntOrFpInductionRecipe analogous to VPCanonicalInductionPHIRecipe: create an AddRec with start + step from the recipe. Currently the only impact should be computing more costs of replicating stores directly in VPlan.
…lvm#167918) Use clang linker wrapper to device-link and embed the HIP fat binary directly. Match the CUDA non-RDC flow in the new driver by producing .hipfb like .fatbin. Previously, the llvm offload binary was used to package the device IRs and embed them in the host object file; clang linker wrapper was then run on each host object file to extract the device IRs, perform device linking, bundle code objects into a fat binary, and wrap it in a host object file, which the host linker merged with the original host object using the '-r' option. However, the host linker in the MSVC toolchain does not support the '-r' option. The new approach still packages the device IRs with the llvm offload binary, but instead of embedding the package in a host object, it passes the package to clang linker wrapper directly, where the device IRs are extracted and linked and the fat binary is generated and then embedded in the host object directly. Compared with the old offload driver, this approach can parallelize device linking for different GPUs by using the parallelization feature of clang linker wrapper. Fixes: SWDEV-565994
Only check up to CtxI (CtxIter) when checking for calls that may free in CtxI's block. Missed update in llvm#167965. This should be NFC, as all current callers pass as CtxI a terminator that is guaranteed not to free.
…7736) As in title. AVX10.x doesn't distinguish between available vector lengths. -mattr=avx10.x-512 and the definition of macros with _512 are kept for compatibility. Bit positions of the avx10.1/2 features in compiler-rt and X86TargetParser are synced to match those in GCC.
…lvm#168128) When shrinking and/or to bitset*, remove the leftover implicit scc def. bitset* instructions do not set scc. Signed-off-by: John Lu <John.Lu@amd.com>
The section headers present in the DBI stream got lost when using `pdb2yaml` and `yaml2pdb`. They are a list of COFF section headers. The `llvm::object::coff_section` didn't have a YAML mapping, so I added one in llvm-pdbutil. The mapping for COFF sections in ObjectYAML includes the section data itself, so we can't use it here. Creation of the section map and headers in yaml2pdb is done like in LLD: https://github.com/llvm/llvm-project/blob/438a18c1e105ca04e624239644195e48b28b5099/lld/COFF/PDB.cpp#L1695-L1703
This adds additional test coverage for folding FCMP uno (llvm#166823)
Identified with bugprone-unused-local-non-trivial-variable.
Identified with llvm-use-ranges.
Identified with readability-delete-null-pointer.
NumElts is already of type int. Identified with readability-redundant-casting.
This patch is limited to single-word replacements to fix spelling and/or grammar to ease the review process. Punctuation and markdown fixes are specifically excluded.
Simplifies some tests that do not need to pass TC; future changes will require a trip count to always be available.
Found this issue llvm#167958 when adding these tests, thanks for the quick fix @clementval.
Serialize throw Doxygen comments for exceptions. Accepts both \throw and \throws.
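As an illustration of the two spellings mentioned above, here is a minimal sketch of a function documented with both `\throw` and `\throws`. The function `parseCount` and its behavior are hypothetical and are not taken from the commit; they only show the kind of comment a documentation tool would need to pick up.

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>
#include <string>

/// Parses a non-negative decimal integer (hypothetical example function).
///
/// \param Text The digits to parse.
/// \returns The parsed value.
/// \throw std::invalid_argument if Text is empty or contains a non-digit.
/// \throws std::out_of_range if the value does not fit in an int.
int parseCount(const std::string &Text) {
  if (Text.empty())
    throw std::invalid_argument("empty input");
  const int Max = std::numeric_limits<int>::max();
  int Value = 0;
  for (char C : Text) {
    if (C < '0' || C > '9')
      throw std::invalid_argument("non-digit character");
    const int Digit = C - '0';
    // Value * 10 + Digit would overflow exactly when this comparison holds.
    if (Value > (Max - Digit) / 10)
      throw std::out_of_range("value does not fit in int");
    Value = Value * 10 + Digit;
  }
  assert(Value >= 0 && "accumulated value never goes negative");
  return Value;
}
```

Both commands are standard Doxygen synonyms, so the two exception entries above should be treated the same way.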
…llvm#168223) Makes it so that a NamedSequenceOp can be directly applied to a Module, via a method `apply(...)`.
Without this patch, DenseMapBase::moveFrom() moves buckets and leaves the moved-from object in a zombie state. This patch teaches moveFrom() to call kill() so that the moved-from object is in a known good state. This brings moveFrom()'s behavior in line with standard C++ move semantics. kill() is implemented so that it takes the fast path in the destructor -- both destroyAll() and deallocateBuckets().
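The idea can be sketched with a toy container. This is not the real `llvm::DenseMapBase` code: `ToyBuckets` and its members are hypothetical, and only the names `moveFrom()` and `kill()` are borrowed from the commit message. The sketch assumes that `kill()` simply resets the object to an empty state so that the destructor has nothing left to do.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical toy container, for illustration only.
class ToyBuckets {
  int *Buckets = nullptr;
  std::size_t NumBuckets = 0;

  // Reset to the empty state without freeing anything. Afterwards the
  // destructor only sees a null pointer, so it has nothing to release.
  void kill() {
    Buckets = nullptr;
    NumBuckets = 0;
  }

  // Steal Other's storage, then leave Other in a known good (empty) state.
  void moveFrom(ToyBuckets &Other) {
    assert(Buckets == nullptr && "destination must not own storage");
    Buckets = Other.Buckets;
    NumBuckets = Other.NumBuckets;
    Other.kill();
  }

public:
  ToyBuckets() = default;
  explicit ToyBuckets(std::size_t N) : Buckets(new int[N]()), NumBuckets(N) {}
  ToyBuckets(const ToyBuckets &) = delete;
  ToyBuckets &operator=(const ToyBuckets &) = delete;

  ToyBuckets(ToyBuckets &&Other) noexcept { moveFrom(Other); }
  ToyBuckets &operator=(ToyBuckets &&Other) noexcept {
    if (this != &Other) {
      delete[] Buckets;
      Buckets = nullptr;
      moveFrom(Other);
    }
    return *this;
  }
  ~ToyBuckets() { delete[] Buckets; }

  std::size_t size() const { return NumBuckets; }
  bool empty() const { return NumBuckets == 0; }
};
```

After `ToyBuckets B(std::move(A));`, `A` is empty and can still be safely destroyed or reassigned, which is the property the commit message describes for the moved-from map.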
Identified with llvm-use-ranges.
callOperatorDecl is already of type FunctionDecl *. Identified with readability-redundant-casting.
I found that in some performance scenarios, such as under O2, this PR can be helpful for a series of loads of global variables.
sorry, this was my mistake
…filter (llvm#168226) The CIBestPractices.rst document uses `releases/*` as the branch name filter for push events. The actual release branch names match the pattern `release/*`.
These were added in llvm#165803.
Forward declare a couple of classes for simplicity, remove some unused headers, clean up a comment. Tested with check-all.
Add a new builtin function __builtin_bswapg. It works on any integral type whose width is a multiple of 16 bits, as well as on a single byte. Closes llvm#160266
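To make the intended result concrete, here is a portable sketch of what a generic byte swap computes. It does not call `__builtin_bswapg`, which needs a Clang that contains this commit. The helper `byteSwapSketch` is a hypothetical stand-in, and it assumes only the standard unsigned integer types, which is narrower than the set of types the commit message describes.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical portable stand-in: reverses the byte order of an unsigned
// integer. A single byte is returned unchanged.
template <typename T>
constexpr T byteSwapSketch(T Value) {
  static_assert(std::is_integral<T>::value && std::is_unsigned<T>::value,
                "sketch only handles unsigned integral types");
  T Result = 0;
  for (unsigned I = 0; I < sizeof(T); ++I) {
    // Move the lowest remaining byte of Value into the low end of Result.
    Result = static_cast<T>((Result << 8) | (Value & 0xFF));
    Value = static_cast<T>(Value >> 8);
  }
  return Result;
}

// Usage: the same template covers every supported width.
inline void demoByteSwap() {
  assert(byteSwapSketch<std::uint8_t>(0xAB) == 0xAB);
  assert(byteSwapSketch<std::uint16_t>(0x1234) == 0x3412);
  assert(byteSwapSketch<std::uint32_t>(0x12345678u) == 0x78563412u);
}
```

A single generic entry point like this avoids having to pick among width-specific functions such as `__builtin_bswap16`, `__builtin_bswap32`, and `__builtin_bswap64` at each call site.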
…Mem2Reg (llvm#168066) This patch adds `ub` as a dependent dialect to `memref`, and uses `ub.poison` as the default value in `AllocaOp::getDefaultValue` for the mem2reg pass. This aligns the behavior of `mem2reg` with LLVM, where loading a value before having a value should be poison. --------- Signed-off-by: Fabian Mora <fabian.mora-cordero@amd.com>
…EPENDENCE_MASK (llvm#168221) TargetConstant nodes don't match TableGen ImmLeaf patterns during instruction selection. When this zero constant flows into the AArch64 CCMP formation code, the machine verifier hits an assertion in expensive checks. Fixes: llvm#168227
Replace the assert checking if CurrentLinkI is a CmpInst with a pattern matching check in the if condition. This uses VPlan-level pattern matching instead of inspecting the underlying instruction type.
This patch groups public functions, including the constructors, the destructor, and the copy/move assignment operators.
llvm::all_of already returns bool. Identified with readability-redundant-casting.
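The kind of cast that readability-redundant-casting flags can be sketched as follows. The example uses `std::all_of` as a stand-in for `llvm::all_of`, and both helper functions are hypothetical, not the code touched by the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Before: the static_cast is redundant because std::all_of already
// returns bool.
inline bool allPositiveWithCast(const std::vector<int> &Values) {
  return static_cast<bool>(std::all_of(Values.begin(), Values.end(),
                                       [](int V) { return V > 0; }));
}

// After: same behavior with the cast removed.
inline bool allPositive(const std::vector<int> &Values) {
  return std::all_of(Values.begin(), Values.end(),
                     [](int V) { return V > 0; });
}

// Usage: both versions agree on every input.
inline void demoAllPositive() {
  assert(allPositive({1, 2, 3}) == allPositiveWithCast({1, 2, 3}));
  assert(allPositive({1, -2, 3}) == allPositiveWithCast({1, -2, 3}));
}
```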
Collaborator
dpalermo
approved these changes
Nov 16, 2025
No description provided.