@ronlieb
Collaborator

@ronlieb ronlieb commented Nov 15, 2025

No description provided.

RKSimon and others added 30 commits November 15, 2025 12:42
…168084)

In llvm#165748 constant expressions
were allowed in `collectPossibleValues` because we are still using
insertelement + shufflevector idioms to represent a scalable vector
splat. However, it also accepts some unresolved constants, such as ptrtoint
of globals or the pointer difference between two globals. We could
ask the user to check this case with the constant folding API. However,
since we have not observed any real-world benefit from handling constant
expressions, I decided to be more conservative and only handle immediate
constants in the helper function. With this patch, we don't need to
touch the SimplifyCFG part, as the values can only be either ConstantInt
or undef/poison values (NB: switch on undef condition is UB).

Fix the miscompilation reported by
llvm#165748 (comment)
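
The conservative policy described above can be illustrated with a toy model. This is a minimal sketch, not the actual `collectPossibleValues` code: `ToyValue`, its `Kind` enum, and `collectImmediates` are hypothetical names, and treating undef as contributing no concrete value is an assumption made for the example.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Toy value model (hypothetical; not LLVM's Value/Constant hierarchy).
struct ToyValue {
  enum Kind { ConstInt, Undef, ConstExpr, Select, Other } kind;
  std::int64_t imm;                        // payload, meaningful for ConstInt only
  std::vector<const ToyValue *> operands;  // the two arms of a Select
};

// Collects the immediate integers `v` may evaluate to into `out`.
// Returns false as soon as a leaf is anything other than an immediate
// constant or undef, for example an unresolved constant expression.
// On failure `out` may hold a partial result and should be discarded.
bool collectImmediates(const ToyValue &v, std::set<std::int64_t> &out) {
  std::vector<const ToyValue *> worklist{&v};
  while (!worklist.empty()) {
    const ToyValue *cur = worklist.back();
    worklist.pop_back();
    switch (cur->kind) {
    case ToyValue::ConstInt:
      out.insert(cur->imm);
      break;
    case ToyValue::Undef:
      break;  // assumption: undef adds no concrete value to the set
    case ToyValue::Select:
      for (const ToyValue *op : cur->operands)
        worklist.push_back(op);
      break;
    case ToyValue::ConstExpr:
    case ToyValue::Other:
      return false;  // be conservative: give up instead of guessing
    }
  }
  return true;
}
```

A select between two immediates yields both values, while a select with a constant-expression arm makes the whole query fail, which is the conservative behavior the commit message argues for.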
These tests were only checking the specialized prefix, leaving common
code unchecked (and incorrect). Checked code was also not using patterns
for SSA values.
Construct SCEVs for VPWidenIntOrFpInductionRecipe analogous to
VPCanonicalInductionPHIRecipe: create an AddRec with start + step from
the recipe.

Currently the only impact should be computing more costs of replicating
stores directly in VPlan.
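
For readers unfamiliar with add-recurrences: an affine AddRec built from a start and a step describes the value `start + iter * step` at loop iteration `iter`. The sketch below only illustrates that arithmetic; `ToyAddRec` and `simulateInduction` are hypothetical names and do not correspond to LLVM's SCEV classes or to the VPlan recipe code.

```cpp
#include <cstdint>

// Toy affine add-recurrence {start,+,step} (hypothetical type).
struct ToyAddRec {
  std::int64_t start;
  std::int64_t step;

  // Closed-form value at 0-based loop iteration `iter`.
  std::int64_t valueAt(std::int64_t iter) const { return start + iter * step; }
};

// Reference behaviour: advance an induction variable one step per iteration.
std::int64_t simulateInduction(std::int64_t start, std::int64_t step,
                               std::int64_t iters) {
  std::int64_t iv = start;
  for (std::int64_t i = 0; i < iters; ++i)
    iv += step;
  return iv;
}
```

The closed form and the step-by-step simulation agree for every iteration count, which is what lets an analysis reason about an induction without executing the loop.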
…lvm#167918)

Use clang linker wrapper to device-link and embed HIP fat binary
directly. Match CUDA non-RDC flow in new driver by producing .hipfb like
.fatbin.

Previously, llvm offload binary was used to package the device IRs and
embed them in the host object file. The clang linker wrapper was then run
on each host object file to extract the device IRs, perform device
linking, bundle the code objects into a fat binary, wrap it in a host object
file, and merge it with the original host object using the host linker's
'-r' option. However, the host linker in the MSVC toolchain does not
support the '-r' option.

The new approach still packages the device IRs with llvm offload binary,
but instead of embedding the result in a host object, it passes it to the
clang linker wrapper directly, where the device IRs are extracted and linked
and the fat binary is generated and then embedded in the host object directly.
Compared with the old offload driver, this approach can parallelize device
linking for different GPUs by using the parallelization feature of the clang
linker wrapper.

Fixes: SWDEV-565994
Only check up to CtxI (CtxIter) when checking for calls that may free
in CtxI's block.

Missed update in llvm#167965.

This should be NFC, as all current callers pass a terminator that is
guaranteed to not free as CtxI.
…7736)

As in title. AVX10.x doesn't distinguish between available vector
lengths.

-mattr=avx10.x-512 and the definition of macros with _512 are kept for compatibility.

The bit positions of avx10.1/2 features in compiler-rt and X86TargetParser
are synced to match those in GCC.
…lvm#168128)

When shrinking and/or to bitset* remove leftover implicit scc def.
bitset* instructions do not set scc.

Signed-off-by: John Lu <John.Lu@amd.com>
The section headers present in the DBI stream got lost when using
`pdb2yaml` and `yaml2pdb`.

They are a list of COFF section headers. The
`llvm::object::coff_section` didn't have a YAML mapping, so I added one
in llvm-pdbutil. The mapping for COFF sections in ObjectYAML includes
the section data itself, so we can't use it here.

Creation of the section map and headers in yaml2pdb is done like in LLD:
https://github.com/llvm/llvm-project/blob/438a18c1e105ca04e624239644195e48b28b5099/lld/COFF/PDB.cpp#L1695-L1703
This adds additional test coverage for folding FCMP uno
(llvm#166823)
Identified with bugprone-unused-local-non-trivial-variable.
Identified with llvm-use-ranges.
Identified with readability-delete-null-pointer.
NumElts is already of type int.

Identified with readability-redundant-casting.
This patch is limited to single-word replacements to fix spelling
and/or grammar to ease the review process.  Punctuation and markdown
fixes are specifically excluded.
Simplifies some tests which now do not need to pass TC; future
changes will require a trip count to always be available.
Found this issue llvm#167958 when adding these tests, thanks for the quick
fix @clementval.
Serialize throw Doxygen comments for exceptions. Accepts both \throw and
\throws.
…llvm#168223)

Makes it so that a NamedSequenceOp can be directly applied to a Module,
via a method `apply(...)`.
Without this patch, DenseMapBase::moveFrom() moves buckets and leaves
the moved-from object in a zombie state.  This patch teaches
moveFrom() to call kill() so that the moved-from object is in a known
good state.  This brings moveFrom()'s behavior in line with standard
C++ move semantics.

kill() is implemented so that it takes the fast path in the destructor
-- both destroyAll() and deallocateBuckets().
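
The idea can be sketched with a toy container. This is an illustration under stated assumptions, not DenseMapBase itself: `ToyBuckets` is a hypothetical class, and only the names `moveFrom()` and `kill()` are borrowed from the commit message.

```cpp
#include <cstddef>
#include <utility>

// Toy bucket owner (hypothetical; far simpler than llvm::DenseMap).
class ToyBuckets {
  int *buckets_ = nullptr;
  std::size_t numBuckets_ = 0;

  // Reset to the empty state so the destructor has nothing to free.
  void kill() {
    buckets_ = nullptr;
    numBuckets_ = 0;
  }

public:
  explicit ToyBuckets(std::size_t n)
      : buckets_(n ? new int[n]() : nullptr), numBuckets_(n) {}
  ~ToyBuckets() { delete[] buckets_; }  // deleting nullptr is a no-op

  ToyBuckets(const ToyBuckets &) = delete;
  ToyBuckets &operator=(const ToyBuckets &) = delete;

  ToyBuckets(ToyBuckets &&other) noexcept { moveFrom(other); }
  ToyBuckets &operator=(ToyBuckets &&other) noexcept {
    if (this != &other) {
      delete[] buckets_;
      moveFrom(other);
    }
    return *this;
  }

  // Steal the buckets, then leave `other` in a known good (empty) state
  // instead of a zombie state that still points at the stolen storage.
  void moveFrom(ToyBuckets &other) noexcept {
    buckets_ = other.buckets_;
    numBuckets_ = other.numBuckets_;
    other.kill();
  }

  std::size_t size() const { return numBuckets_; }
  bool empty() const { return numBuckets_ == 0; }
};
```

After `ToyBuckets b(std::move(a));`, `a` is empty and can be safely destroyed or reassigned, which mirrors what standard C++ move semantics lead callers to expect.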
Identified with llvm-use-ranges.
callOperatorDecl is already of type FunctionDecl *.

Identified with readability-redundant-casting.
I found that in some performance scenarios, such as under O2, this PR helps with sequences of global variable loads.
…filter (llvm#168226)

The CIBestPractices.rst document uses `releases/*` as the branch name
filter for push events. The actual release branch names match the
pattern `release/*`.
Forward declare a couple of classes for simplicity, remove some unused
headers, clean up a comment.

Tested with check-all.
clingfei and others added 8 commits November 15, 2025 12:27
Add a new builtin function __builtin_bswapg. It works on any integral
type whose width is a multiple of 16 bits, as well as on a single byte.

Closes llvm#160266
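
The byte-reversal semantics can be sketched in portable C++. This is only an illustration of what a generic byte swap computes for standard fixed-width integers; `byteSwapGeneric` is a hypothetical helper, not the builtin, and it makes no claim about how `__builtin_bswapg` is implemented or which types it accepts beyond what is stated above.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Reverses the byte order of an integer value (hypothetical helper).
// A single-byte value is returned unchanged.
template <typename T>
T byteSwapGeneric(T value) {
  static_assert(std::is_integral<T>::value && !std::is_same<T, bool>::value,
                "byteSwapGeneric expects a non-bool integral type");
  using U = typename std::make_unsigned<T>::type;
  U in = static_cast<U>(value);
  U out = 0;
  for (std::size_t i = 0; i < sizeof(T); ++i) {
    // Move the lowest remaining byte of `in` to the low end of `out`.
    out = static_cast<U>((out << 8) | (in & 0xFF));
    in = static_cast<U>(in >> 8);
  }
  return static_cast<T>(out);
}
```

For example, `byteSwapGeneric<std::uint16_t>(0x1234)` evaluates to `0x3412`, and a `std::uint8_t` input comes back unchanged.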
…Mem2Reg (llvm#168066)

This patch adds `ub` as a dependent dialect to `memref`, and uses
`ub.poison` as the default value in `AllocaOp::getDefaultValue` for the
mem2reg pass.

This aligns the behavior of `mem2reg` with LLVM, where loading from a
location before any value has been stored should yield poison.

---------

Signed-off-by: Fabian Mora <fabian.mora-cordero@amd.com>
…EPENDENCE_MASK (llvm#168221)

TargetConstant nodes don't match TableGen ImmLeaf patterns during
instruction selection. When this zero constant flows into the AArch64
CCMP formation code, the machine verifier hits an assertion in expensive
checks.

Fixes: llvm#168227
Replace the assert checking if CurrentLinkI is a CmpInst with a pattern
matching check in the if condition. This uses VPlan-level pattern matching
instead of inspecting the underlying instruction type.
This patch groups public functions, including the constructors, the
destructor, and the copy/move assignment operators.
llvm::all_of already returns bool.

Identified with readability-redundant-casting.
@z1-cciauto
Collaborator

@ronlieb ronlieb merged commit 878b266 into amd-staging Nov 16, 2025
6 of 7 checks passed
@ronlieb ronlieb deleted the amd/merge/upstream_merge_20251115172500 branch November 16, 2025 02:51