@ronlieb
Collaborator

@ronlieb ronlieb commented Nov 21, 2025

No description provided.

arsenm and others added 30 commits November 20, 2025 11:10
Avoids regression which caused the revert 6d5f87f.

This is a hack on a hack. We currently have isUniformMMO,
which improperly treats unknown source value as known uniform.
This is a hack from before we had divergence information in the
DAG, and should be removed. This is the minimum change to avoid
the regression; removing the aggressive handling of the unknown
case (or dropping isUniformMMO entirely) are more involved fixes.
Upstream ExtVectorElementExpr with rvalue base
…168447)

We can't do anything meaningful to such functions: they aren't optimizable, and even if inlined, they would bring no code open to optimization.
To make life easier for future contributors. Note that formatting
changes are due to git clang-format on the touched whitespace-error
lines.
test/Lower/select-case-statement.f90 was still using the old lowering.
Modified the test with FIR generated using the new lowering. Changed the
test to use flang_fc1 instead of bbc and added testing for -O0 and -O1,
since character comparison lowering is done differently at -O0 (uses
runtime function) and -O1 (inlines some cases). Use different FileCheck
prefixes for different optimization levels (CHECK-O0 for -O0, CHECK-O1
for -O1, CHECK for both).
…m#168292) (llvm#168786)

This reverts commit 6d5f87f.

Previously this failed due to treating the unknown MachineMemOperand
value as known uniform.
…lvm#165416)

This PR introduces new debug macros that allow finer-grained control over
which debug messages to output, and introduces a C++ stream style for debug
messages.

Changing existing messages (except a few that I changed for testing)
will come in subsequent PRs.

I also think that we should make debug enabling OpenMP-agnostic but, for
now, I prioritized maintaining the current libomptarget behavior,
and we might need more changes further down the line as we decouple
libomptarget.
)

If `HardwareBreakpointTestBase.supports_hw_breakpoints()` returns False,
`SimpleHWBreakpointTest.does_not_support_hw_breakpoints()` returns None,
so the test runs and fails. However, it should be skipped instead.

The test was added in llvm#146602, while `supports_hw_breakpoints()` was
changed in llvm#146609, which was landed earlier despite having a bigger
number.
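
The skip logic described above can be sketched with a toy model. This assumes the convention that a skip predicate returns a reason string to skip the test and `None` to run it. The class name, the `is None` check, and the reason text are hypothetical stand-ins, not the actual LLDB test code.

```python
from typing import Optional


class FakeHWBreakpointTest:
    """Toy stand-in for a hardware-breakpoint test base (hypothetical)."""

    def __init__(self, hw_supported: bool):
        self._hw_supported = hw_supported

    def supports_hw_breakpoints(self) -> bool:
        # Assumed to return a plain bool (False when unsupported).
        return self._hw_supported

    def buggy_skip_reason(self) -> Optional[str]:
        # Assumed shape of the bug: only a None result triggers the skip,
        # so a False result falls through and the test runs anyway.
        if self.supports_hw_breakpoints() is None:
            return "Hardware breakpoints are unsupported"
        return None

    def fixed_skip_reason(self) -> Optional[str]:
        # Treat any falsy result as "unsupported" and skip.
        if not self.supports_hw_breakpoints():
            return "Hardware breakpoints are unsupported"
        return None
```

With `FakeHWBreakpointTest(False)`, the buggy predicate yields `None` (the test runs), while the fixed one yields a skip reason.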
…econstructing DIE names (llvm#168734)

Depends on: 
* llvm#168725

When compiling with `-glldb`, we repoint the `DW_AT_type` of a DIE to be
a typedef that refers to the `preferred_name`. I.e.,:
```
template <typename T> struct t7;
using t7i = t7<int>;
template <typename T> struct __attribute__((__preferred_name__(t7i))) t7 {};
template <typename... Ts> void f1() {}

int main() { f1<t7i>(); }
```
would produce the following (minified) DWARF:
```
DW_TAG_subprogram
  DW_AT_name      ("_STN|f1|<t7<int> >")
  DW_TAG_template_type_parameter
    DW_AT_type  (0x0000299c "t7i")
...
DW_TAG_typedef
  DW_AT_type      (0x000029a7 "t7<int>")
  DW_AT_name      ("t7i")
```

Note how the `DW_AT_type` of the template parameter is a typedef itself
(instead of the canonical type). The `DWARFTypePrinter` would take the
`DW_AT_name` of this typedef when reconstructing the name of `f1`, so we
would end up with a verifier failure:
```
error: Simplified template DW_AT_name could not be reconstituted:
         original: f1<t7<int> >
    reconstituted: f1<t7i>
```

Fixing this allows us to un-XFAIL the `simplified-template-names.cpp`
test in `cross-project-tests`. Unfortunately this is only tested on
Darwin, where LLDB tuning is the default. AFAIK, there is no other case
where the template parameter type wouldn't be canonical.
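
The idea of unwrapping typedefs before reconstructing a name can be illustrated with a toy model. This is only a sketch: the dictionary layout, the helper names, and the `DW_TAG_structure_type` tag on the second entry are assumptions made for illustration, and none of it is the `DWARFTypePrinter` implementation.

```python
# Toy model of the DIEs from the DWARF dump above, keyed by offset.
DIES = {
    0x299C: {"tag": "DW_TAG_typedef", "name": "t7i", "type": 0x29A7},
    # The tag of this entry is assumed; the dump only shows its name.
    0x29A7: {"tag": "DW_TAG_structure_type", "name": "t7<int>"},
}


def canonical_type_name(offset):
    """Follow DW_TAG_typedef links until a non-typedef DIE is reached."""
    die = DIES[offset]
    while die["tag"] == "DW_TAG_typedef":
        die = DIES[die["type"]]
    return die["name"]


def reconstitute(func_name, param_type_offsets):
    """Rebuild `name<args>` from the template parameter types."""
    args = ", ".join(canonical_type_name(o) for o in param_type_offsets)
    # Keep a space between consecutive closing angle brackets.
    sep = " " if args.endswith(">") else ""
    return f"{func_name}<{args}{sep}>"


print(reconstitute("f1", [0x299C]))
```

Without the unwrapping loop, the same call would produce `f1<t7i>`, which is the mismatch the verifier reported.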
Currently the tests for LLVM targets `AArch64` and `ARM` are in the
same directory. But if you only configure LLVM for one target (e.g.,
just `AArch64`, which is how I ran into this), then all tests under the
ARM directory are marked `UNSUPPORTED`.

This patch moves all the tests that are capable of running on
`AArch64`-only targets into a dedicated `AArch64` directory. The tests
that expected a plain `ARM` target were kept in the `ARM` directory.

Drive-by:
* Rename the `dummy-debug-map-amr64.map` to `dummy-debug-map-arm64.map`
(note the typo in `amr64`)
llvm#168619)

I've been working on some scripts that evaluate the parent and child
frame. It's been very annoying that the parent frame has a property but
the child does not, so I've added one to the extensions. I would've
preferred to return None, but the existing impl returns an
invalid SBFrame, so I'm conforming to that API.

```
(lldb) script
Python Interactive Interpreter. To exit, type 'quit()', 'exit()' or Ctrl-D.
>>> lldb.frame
frame #0: 0x0000555555555200 fib.out`main
>>> lldb.frame.parent
frame #1: 0x00007ffff782a610 libc.so.6`__libc_start_call_main + 128
>>> lldb.frame.parent.child
frame #0: 0x0000555555555200 fib.out`main
```
When downloading bazelisk/buildifier, we use curl, which still returns
exit code zero on HTTP 4xx errors unless we pass --fail. This patch adds
--fail flags so that error messages are clearer.
…lvm#168918)

We already know we're looking at BITREVERSE, we can match on the source
operand.
There are several places where we use `llvm::OwningArrayRef`. The
interface to this requires us to first construct temporary storage, then
allocate space and set the allocated memory to 0, then copy the values
we actually want into that memory, then move the array into place.
Instead we can just do it all inline in a single pass by using
`std::vector`. In one case we actually allocate a completely separate
container and then allocate + copy the data over because
`llvm::OwningArrayRef` does not (and can't) support `push_back`.

Note that `llvm::SmallVector` is not a suitable replacement here because
we rely on reference stability on move construction: when the outer
container reallocates, we need the contents of the inner containers
to be fixed in memory, and `llvm::SmallVector` does not give us that
guarantee.
…vm#162952)

Pyright is an MIT-licensed static type checker and can be found at
    https://github.com/microsoft/pyright
There are also various integrations that let it run as an LSP server in
various editors, which is the main way I use it.

It's useful on our Python scripts for detecting issues such as
functions being called with unexpected types, or accesses to
obj.attr on an object that doesn't have that attribute. It can be used
without any configuration; this config setting causes it to also report
issues with type hints that do not meet our Python 3.8 minimum, such as
this one from dap_server.py:
```
        init_commands: list[str],
```
Subscripting the builtin type like that requires Python 3.9, while the
3.8 equivalent is:
```
from typing import List
...
        init_commands: List[str],
```
In practice these scripts still work on 3.8 because type hints aren't
normally evaluated during normal execution but since we have a minimum,
we should fully comply with it.

Note: The error pyright reports for this particular issue isn't great:
```
error: Subscript for class "list" will generate runtime exception; enclose type expression in quotes
```
This is technically correct as it is possible to evaluate type hints at
runtime but I believe anything that would do so would also evaluate the
string form as well and still hit the runtime exception. A better
suggestion in this case would have been the 3.8 compatible `List[str]`.
However, it is better than silently passing code that doesn't conform to
the minimum.
…lvm#168795)

This test explicitly sets the environment for a spawned process. Without
DYLD_LIBRARY_PATH, the spawned process may use an ASan runtime other than
the one that was used by the parent process. That other runtime library
may not work at all, or may not be in the default search path. Either
case can cause the spawned process to die before it makes it to main,
thus failing the test. The compiler-rt lit config sets the library path
variable
[here](https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/lit.common.cfg.py#L84)
(i.e. to ensure that just-built runtimes are used for tests, in the case
of a standalone compiler-rt build), and that is currently used by the
parent process but not the spawned ones.

My change only forwards the variable for Darwin (DYLD_LIBRARY_PATH), but
we **_ought_** to also forward the variable for other platforms.
However, it's not clear that there's any good way to plumb this into the
test, since some platforms actually have multiple library path variables
which would need to be forwarded (see: SunOS
[here](https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/lit.common.cfg.py#L102)).
I considered adding a substitution variable for the library path
variable, but that doesn't really work if there are multiple such
variables.
This reverts commit b725bdb.

This is still causing Darwin failures. There are six tests that are
still failing:
AddressSanitizer-x86_64-darwin.TestCases/Posix.deep_call_stack.cpp
AddressSanitizer-x86_64-darwin.TestCases.scariness_score_test.cpp
AddressSanitizer-x86_64h-darwin.TestCases/Posix.deep_call_stack.cpp
ORC-x86_64-darwin.TestCases/Darwin/x86-64.objc-imageinfo.S
UBSan-Minimal-x86_64-darwin.TestCases.test-darwin-interface.c
UBSan-Minimal-x86_64h-darwin.TestCases.test-darwin-interface.c

There are a couple failure modes:
1. deep_call_stack.cpp and scariness_score_test.cpp are failing due to
   ulimit issues that we have observed previously.
2. objc-imageinfo.S is failing in the x86 variant because I only updated
   the AArch64 variant.
3. test-darwin-interface.c is using subshells, so obviously fails with
   the internal shell. Also looks like this one did not run on my system
   due to it requiring x86_64 Darwin.
Add extra tests for over-eager tail-folding for tiny trip-count loops.

Reduced from llvm#167858.
This has been replaced by the MODULE.bazel file. Users can still use
their own WORKSPACE files, but they didn't inherit this file anyway.
Users should migrate to bzlmod, as it is required with bazel 9.x.
Need to check if the non-schedulable phi parent node has unique
operands, if the incoming node has copyables, and the node is
commutative. Otherwise, there might be issues with the correct
calculation of the dependencies.

Fixes llvm#168589
…on iOS/Android (llvm#168821)

The tests added by llvm#163468 appear to be broken due to lack of libcxx support (?).

Marking unsupported everywhere for now since it passes on some platforms and fails on others, and
I don't know the full list.

Android fail: https://lab.llvm.org/buildbot/#/builders/186/builds/14106
…llvm#168289)

Remove `VPWidenPointerInductionRecipe::IsScalarAfterVectorization` and
replace it with `onlyScalarValuesUsed`. This removes the need to carry
state from the legacy cost model through VPlan, and the VPlan-based
analysis gives more accurate results, avoiding a number of extracts.

PR: llvm#168289
Removes unused headers or replaces them with headers that directly
provide the symbol instead. For example, `Serialize.h` included `AST.h`,
but it was actually `Serialize.cpp` that needed concept expressions, so
now it includes just `ExprConcepts.h`.
petrhosek and others added 24 commits November 20, 2025 20:44
These flags are not needed for building libc.
Mark ELF_ppc64_relocations.s as unsupported on SystemZ because of cross
build issue related to using dlsym for host symbols.
Test fails to resolve __tls_get_addr on SystemZ host.

Co-authored-by: anoopkg6 <anoopkg6@github.com>
…lvm#168937)

This patch extracts the common logic for computing array element counts
from shape operands into a reusable helper function in CUFCommon.
These horizontal add/sub instructions are currently handled by
adding/subtracting tuples of the first operand, followed by tuples of
the second operand. This is not the correct semantics for the 256-bit
instructions: they process the first half of the first operand, then the
first half of the second operand, then the second half of the first
operand, and finally the second half of the second operand (trust me bro
[*]).

This patch fixes the issue by applying the "shards" functionality that
was added in llvm#167954, to handle
the top and bottom 128-bit "shards" in turn.

[*] clang/test/CodeGen/X86/avx2-builtins.c:
```
TEST_CONSTEXPR(match_v8si(_mm256_hadd_epi32(
    (__m256i)(__v8si){10, 20, 30, 40, 50, 60, 70, 80},
    (__m256i)(__v8si){5, 15, 25, 35, 45, 55, 65, 75}),
    30,70,20,60,110,150,100,140));
```
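
The lane ordering described above can be modeled in a few lines. This is a minimal sketch of the semantics only: the helper names are made up for illustration, elements are plain Python integers (32-bit wraparound is ignored), and it is not the sanitizer's instrumentation code.

```python
def hadd_128(a, b):
    """Horizontal add within one 128-bit lane of four 32-bit elements."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]


def mm256_hadd_epi32(a, b):
    """Model of the 256-bit form: handle the low and high 128-bit halves in turn."""
    assert len(a) == 8 and len(b) == 8
    return hadd_128(a[:4], b[:4]) + hadd_128(a[4:], b[4:])


# Same inputs as the TEST_CONSTEXPR case quoted above.
a = [10, 20, 30, 40, 50, 60, 70, 80]
b = [5, 15, 25, 35, 45, 55, 65, 75]
result = mm256_hadd_epi32(a, b)
print(result)  # [30, 70, 20, 60, 110, 150, 100, 140]
```

The result interleaves the operands per 128-bit half, rather than listing all pairs of the first operand followed by all pairs of the second.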
Add a low trip count test that is currently vectorized but unprofitable,
for llvm#167858.
…lvm#168609)

which reassigns scale operand in vgpr_32 register to agpr_32, not
permitted by instruction format. Reduced from ck.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
Co-authored-by: theRonShark <ron.lieberman@amd.com>
Don't specifically target windows-msvc - the same goes for any windows
target; mingw doesn't have dlfcn.h either.
…#168900)

When comparing additions with the same base where one has `nsw`, the
following simplification can be performed:

```llvm
icmp slt/sgt/sle/sge (x + C1), (x +nsw C2)
=>
icmp slt/sgt/sle/sge C1, C2
```

Previously this was only done for `slt`. This patch extends it to the
`sgt`, `sle`, and `sge` predicates when either of the following conditions holds:
- `C1 <= C2 && C1 >= 0`, or
- `C2 <= C1 && C1 <= 0`

This patch also handles the `C1 == C2` case, which was previously
excluded.

Proof: https://alive2.llvm.org/ce/z/LtmY4f
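
As a complement to the Alive2 proof, the fold can be checked exhaustively on a small bit width. This is a sketch under stated assumptions: a 5-bit signed model chosen so the loop stays fast, `x + C1` wrapping on overflow, and cases where `x + C2` overflows being skipped because `nsw` makes them poison. All names here are illustrative.

```python
import operator

BITS = 5  # small width so the exhaustive check stays fast
LO, HI = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1

PREDICATES = {"slt": operator.lt, "sgt": operator.gt,
              "sle": operator.le, "sge": operator.ge}


def wrap(v):
    """Wrap v to a signed BITS-bit integer (two's complement)."""
    span = 1 << BITS
    return (v - LO) % span + LO


def fold_applies(c1, c2):
    """The two conditions from the commit message."""
    return (c1 <= c2 and c1 >= 0) or (c2 <= c1 and c1 <= 0)


def count_counterexamples():
    bad = 0
    for c1 in range(LO, HI + 1):
        for c2 in range(LO, HI + 1):
            if not fold_applies(c1, c2):
                continue
            for x in range(LO, HI + 1):
                if not LO <= x + c2 <= HI:
                    continue  # nsw violated: poison, nothing to check
                lhs, rhs = wrap(x + c1), x + c2
                for pred in PREDICATES.values():
                    if pred(lhs, rhs) != pred(c1, c2):
                        bad += 1
    return bad
```

In this model `count_counterexamples()` returns 0: under either condition, `x + C1` lies between `x` and `x + C2`, so it cannot overflow when `x + C2` does not.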
Remove all constraint propagation functions in Dependence Analysis.
Add dependency on headers with `in_addr` and `in_addr_t` type
definitions to ensure that these headers will be properly installed by
"install-libc" CMake target.
… a given tiled loop nest. (llvm#167634)

The existing `scf::tileAndFuseConsumerOfSlices` takes a list of slices
(and loops they are part of), tries to find the consumer of these slices
(all slices are expected to have the same consumer), and then tiles the
consumer into the loop nest using the `TilingInterface`. A more natural
way of doing consumer fusion is to just start from the consumer, look
for operands that are produced by the loop nest passed in as `loops`
(presumably these loops are generated by tiling, but that is not a
requirement for consumer fusion). Using the consumer you can find the
slices of the operands that are accessed within the loop which you can
then use to tile and fuse the consumer (using `TilingInterface`). This
handles more naturally the case where multiple operands of the consumer
come from the loop nest.

The `scf::tileAndFuseConsumerOfSlices` was implemented as a mirror of
`scf::tileAndFuseProducerOfSlice`. For the latter, the slice has a
single producer for the source of the slice, which makes it a natural
way of specifying producer fusion. But for consumers, the result might
have multiple users, resulting in multiple candidates for fusion, as
well as a fusion candidate using multiple results from the tiled loop
nest. This means using slices
(`tensor.insert_slice`/`tensor.parallel_insert_slice`) as a hook for
consumer fusion turns out to be quite hard to navigate. The use of the
consumer directly avoids all those pain points. In time the
`scf::tileAndFuseConsumerOfSlices` should be deprecated in favor of
`scf::tileAndFuseConsumer`. There is a lot of tech-debt that has
accumulated in `scf::tileAndFuseConsumerOfSlices` that needs to be
cleaned up. So while that gets cleaned up, and required functionality is
moved to `scf::tileAndFuseConsumer`, the old path is still maintained.

The test for `scf::tileAndFuseConsumerUsingSlices` is copied from
`tile-and-fuse-consumer.mlir` to
`tile-and-fuse-consumer-using-slices.mlir`. All the tests that were
there in this file are now using the `tileAndFuseConsumer` method. The
test op `test.tile_and_fuse_consumer` is modified to call
`scf::tileAndFuseConsumer`, while a new op
`test.tile_and_fuse_consumer_of_slice` is used to keep the old path
tested while it is deprecated.

---------

Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
Add declaration of command line options to BugDriver.h and remove extern
declarations in individual .cpp files.
llvm#156577)

Add detailed comments explaining each function's memory access patterns
and why they should/shouldn't be unroll-and-jammed:

- fore_aft_*: Dependencies between fore block and aft block
- fore_sub_*: Dependencies between fore block and sub block
- sub_aft_*: Dependencies between sub block and aft block
- sub_sub_*: Dependencies within sub block

- *_less: Backward dependency (i-1) - safe for fore/aft, fore/sub,
sub/aft; unsafe for sub/sub due to jamming conflicts
- *_eq: Same iteration dependency (i+0) - safe due to preserved
execution order
- *_more: Forward dependency (i+1) - unsafe due to write-after-write
races between unrolled iterations, except sub/sub case creates conflicts
…168962)

The only thing the docs should depend on is the SWIG wrapper
(lldb.py) which only requires parsing the API headers. It should not
depend on building libLLDB.

The dependency was (I believe accidentally) introduced by 59f4267.

Fixes llvm#123316
…vm#168930)

Removes about 200 bytes of unneeded patterns from RISCVGenDAGISel.inc
…llvm#168932)

This removes an unnecessary isel pattern for the RV32 HwMode.
…llvm#168957)

On startup, bazel prints: `WARNING: Option
'experimental_guard_against_concurrent_changes' is deprecated: Use
--guard_against_concurrent_changes instead`
@ronlieb ronlieb requested review from a team and dpalermo November 21, 2025 00:24
@z1-cciauto
Collaborator

@ronlieb ronlieb closed this Nov 21, 2025