forked from llvm/llvm-project
merge main into amd-staging #639
Closed
Avoids regression which caused the revert 6d5f87f. This is a hack on a hack. We currently have isUniformMMO, which improperly treats an unknown source value as known uniform. This is a hack from before we had divergence information in the DAG, and should be removed. This is the minimum change to avoid the regression; removing the aggressive handling of the unknown case (or dropping isUniformMMO entirely) is a more involved fix.
Upstream ExtVectorElementExpr with rvalue base
…168447) We can't do anything meaningful to such functions: they aren't optimizable, and even if inlined, they would bring no code open to optimization.
To make life easier for future contributors. Note that formatting changes are due to git clang-format on the touched whitespace-error lines.
test/Lower/select-case-statement.f90 was still using the old lowering. Modified the test with FIR generated using the new lowering. Changed the test to use flang_fc1 instead of bbc and added testing for -O0 and -O1, since character comparison lowering is done differently at -O0 (uses runtime function) and -O1 (inlines some cases). Use different FileCheck prefixes for different optimization levels (CHECK-O0 for -O0, CHECK-O1 for -O1, CHECK for both).
…m#168292) (llvm#168786) This reverts commit 6d5f87f. Previously this failed due to treating the unknown MachineMemOperand value as known uniform.
…lvm#165416) This PR introduces new debug macros that allow finer control over which debug messages to output and introduces a C++ stream style for debug messages. Changing existing messages (except a few that I changed for testing) will come in subsequent PRs. I also think that we should make debug enabling OpenMP-agnostic but, for now, I prioritized maintaining the current libomptarget behavior, and we might need more changes further down the line as we decouple libomptarget.
) If `HardwareBreakpointTestBase.supports_hw_breakpoints()` returns False, `SimpleHWBreakpointTest.does_not_support_hw_breakpoints()` returns None, so the test runs and fails. However, it should be skipped instead. The test was added in llvm#146602, while `supports_hw_breakpoints()` was changed in llvm#146609, which was landed earlier despite having a bigger number.
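The skip convention described above (a predicate that returns None lets the test run, anything else skips it) can be sketched with plain `unittest`. This is a minimal illustration, not the lldbsuite implementation: `skip_if`, `skip_reason_if_unsupported`, and the stubbed `supports_hw_breakpoints` are hypothetical stand-ins.

```python
import functools
import unittest


def supports_hw_breakpoints():
    # Hypothetical stand-in: pretend the target lacks hardware breakpoints.
    return False


def skip_reason_if_unsupported():
    # Return a reason string to skip the test, or None to let it run.
    if not supports_hw_breakpoints():
        return "hardware breakpoints are unsupported"
    return None


def skip_if(reason_fn):
    # Hypothetical decorator: skip the test when reason_fn() is not None.
    def decorator(test_fn):
        @functools.wraps(test_fn)
        def wrapper(self, *args, **kwargs):
            reason = reason_fn()
            if reason is not None:
                self.skipTest(reason)
            return test_fn(self, *args, **kwargs)
        return wrapper
    return decorator


class SimpleTest(unittest.TestCase):
    @skip_if(skip_reason_if_unsupported)
    def test_hw_breakpoint(self):
        self.fail("would fail on a target without hardware breakpoints")


def run_simple_test():
    # Run the test case and return the unittest result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SimpleTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

If the predicate returned None while the feature was missing, the test body would run and fail, which is the symptom the commit message describes.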
…econstructing DIE names (llvm#168734) Depends on: * llvm#168725
When compiling with `-glldb`, we repoint the `DW_AT_type` of a DIE to be a typedef that refers to the `preferred_name`. I.e.:
```
template <typename T> struct t7;
using t7i = t7<int>;
template <typename T> struct __attribute__((__preferred_name__(t7i))) t7 {};
template <typename... Ts> void f1() {}
int main() { f1<t7i>(); }
```
would produce the following (minified) DWARF:
```
DW_TAG_subprogram
  DW_AT_name ("_STN|f1|<t7<int> >")
  DW_TAG_template_type_parameter
    DW_AT_type (0x0000299c "t7i")
...
DW_TAG_typedef
  DW_AT_type (0x000029a7 "t7<int>")
  DW_AT_name ("t7i")
```
Note how the `DW_AT_type` of the template parameter is a typedef itself (instead of the canonical type). The `DWARFTypePrinter` would take the `DW_AT_name` of this typedef when reconstructing the name of `f1`, so we would end up with a verifier failure:
```
error: Simplified template DW_AT_name could not be reconstituted:
  original: f1<t7<int> >
  reconstituted: f1<t7i>
```
Fixing this allows us to un-XFAIL the `simplified-template-names.cpp` test in `cross-project-tests`. Unfortunately this is only tested on Darwin, where LLDB tuning is the default. AFAIK, there is no other case where the template parameter type wouldn't be canonical.
Currently the tests for the LLVM targets `AArch64` and `ARM` are in the same directory. If you only configure LLVM for one target (e.g., just `AArch64`, which is how I ran into this), then all tests under the ARM directory are marked `UNSUPPORTED`. This patch moves all the tests that are capable of running on `AArch64`-only targets into a dedicated `AArch64` directory. The tests that expected a plain `ARM` target were kept in the `ARM` directory. Drive-by: * Rename the `dummy-debug-map-amr64.map` to `dummy-debug-map-arm64.map` (note the typo in `amr64`)
llvm#168619) I've been working on some scripts that evaluate the parent and child frame. It's been very annoying that the parent frame has a property but not the child, so I've added this to the extensions. I would've preferred to return None, but because the existing impl returns an invalid SBFrame, I'm conforming to that API.
```
(lldb) script
Python Interactive Interpreter. To exit, type 'quit()', 'exit()' or Ctrl-D.
>>> lldb.frame
frame #0: 0x0000555555555200 fib.out`main
>>> lldb.frame.parent
frame #1: 0x00007ffff782a610 libc.so.6`__libc_start_call_main + 128
>>> lldb.frame.parent.child
frame #0: 0x0000555555555200 fib.out`main
```
When downloading bazelisk/buildifier, we use curl, which still returns exit code zero on HTTP 4xx errors unless we pass --fail. This patch adds --fail flags so that error messages are clearer.
…lvm#168918) We already know we're looking at BITREVERSE, we can match on the source operand.
There are several places where we use `llvm::OwningArrayRef`. The interface to this requires us to first construct temporary storage, then allocate space and set the allocated memory to 0, then copy the values we actually want into that memory, then move the array into place. Instead we can just do it all inline in a single pass by using `std::vector`. In one case we actually allocate a completely separate container and then allocate + copy the data over because `llvm::OwningArrayRef` does not (and can't) support `push_back`. Note that `llvm::SmallVector` is not a suitable replacement here because we rely on reference stability on move construction: when the outer container reallocates, we need the contents of the inner containers to be fixed in memory, and `llvm::SmallVector` does not give us that guarantee.
…pInterface.cpp (NFC)
…vm#162952) Pyright is an MIT-licensed static type checker and can be found at https://github.com/microsoft/pyright. There are also various integrations to use it as an LSP server in various editors, which is the main way I use it. It's useful on our Python scripts to detect issues such as functions being called with unexpected types, or code accessing obj.attr on an object that doesn't have that attribute. It can be used without any configuration; this config setting causes it to also report issues with type hints that do not meet our Python 3.8 minimum, such as this one from dap_server.py:
```
init_commands: list[str],
```
Subscripting the builtin type like that requires Python 3.9, while the 3.8 equivalent is:
```
from typing import List
...
init_commands: List[str],
```
In practice these scripts still work on 3.8 because type hints aren't normally evaluated during normal execution, but since we have a minimum, we should fully comply with it. Note: The error pyright reports for this particular issue isn't great:
```
error: Subscript for class "list" will generate runtime exception; enclose type expression in quotes
```
This is technically correct, as it is possible to evaluate type hints at runtime, but I believe anything that would do so would also evaluate the string form and still hit the runtime exception. A better suggestion in this case would have been the 3.8-compatible `List[str]`. However, it is better than silently passing code that doesn't conform to the minimum.
…lvm#168795) This test explicitly sets the environment for a spawned process. Without DYLD_LIBRARY_PATH, the spawned process may use an ASan runtime other than the one that was used by the parent process. That other runtime library may not work at all, or may not be in the default search path. Either case can cause the spawned process to die before it makes it to main, thus failing the test. The compiler-rt lit config sets the library path variable [here](https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/lit.common.cfg.py#L84) (i.e. to ensure that just-built runtimes are used for tests, in the case of a standalone compiler-rt build), and that is currently used by the parent process but not the spawned ones. My change only forwards the variable for Darwin (DYLD_LIBRARY_PATH), but we **_ought_** to also forward the variable for other platforms. However, it's not clear that there's any good way to plumb this into the test, since some platforms actually have multiple library path variables which would need to be forwarded (see: SunOS [here](https://github.com/llvm/llvm-project/blob/main/compiler-rt/test/lit.common.cfg.py#L102)). I considered adding a substitution variable for the library path variable, but that doesn't really work if there are multiple such variables.
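The idea of forwarding a library path variable into an explicitly constructed child environment can be sketched as follows. This is only an illustration in Python, not the code in the test itself; the helper name `child_env` and its parameters are hypothetical. Accepting a tuple of names also shows why platforms with several library path variables are awkward to handle with a single substitution.

```python
def child_env(explicit_env, parent_env, forwarded=("DYLD_LIBRARY_PATH",)):
    """Build the environment for a spawned process.

    explicit_env: variables the test sets on purpose for the child.
    parent_env:   the parent's environment (e.g. os.environ).
    forwarded:    names copied from the parent when present and not
                  already overridden by explicit_env.
    """
    env = dict(explicit_env)
    for name in forwarded:
        if name in parent_env and name not in env:
            env[name] = parent_env[name]
    return env
```

The resulting dictionary could be passed as the `env` argument of `subprocess.run`, so the child resolves the same runtime library as the parent.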
This reverts commit b725bdb. This is still causing Darwin failures. There are six tests that are still failing:
AddressSanitizer-x86_64-darwin.TestCases/Posix.deep_call_stack.cpp
AddressSanitizer-x86_64-darwin.TestCases.scariness_score_test.cpp
AddressSanitizer-x86_64h-darwin.TestCases/Posix.deep_call_stack.cpp
ORC-x86_64-darwin.TestCases/Darwin/x86-64.objc-imageinfo.S
UBSan-Minimal-x86_64-darwin.TestCases.test-darwin-interface.c
UBSan-Minimal-x86_64h-darwin.TestCases.test-darwin-interface.c
There are a couple of failure modes:
1. deep_call_stack.cpp and scariness_score_test.cpp are failing due to ulimit issues that we have observed previously.
2. objc-imageinfo.S is failing in the x86 variant because I only updated the AArch64 variant.
3. test-darwin-interface.c is using subshells, so obviously fails with the internal shell. Also looks like this one did not run on my system due to it requiring x86_64 Darwin.
Add extra tests for over-eager tail-folding for tiny trip-count loops. Reduced from llvm#167858.
This has been replaced by the MODULE.bazel file. Users can still use their own WORKSPACE files, but they didn't inherit this file anyway. Users should migrate to bzlmod, since Bazel 9.x requires it.
Need to check if the non-schedulable phi parent node has unique operands, if the incoming node has copyables, and the node is commutative. Otherwise, there might be issues with the correct calculation of the dependencies. Fixes llvm#168589
…on iOS/Android (llvm#168821) The tests added by llvm#163468 appear to be broken due to lack of libcxx support (?). Marking unsupported everywhere for now since it passes on some platforms and fails on others, and I don't know the full list. Android fail: https://lab.llvm.org/buildbot/#/builders/186/builds/14106
…llvm#168289) Remove `VPWidenPointerInductionRecipe::IsScalarAfterVectorization` and replace it with `onlyScalarValuesUsed`. This removes the need to carry state from the legacy cost model through VPlan, and the VPlan-based analysis gives more accurate results, avoiding a number of extracts. PR: llvm#168289
Removes unused headers or replaces them with headers that directly provide the symbol instead. For example, `Serialize.h` included `AST.h`, but it was actually `Serialize.cpp` that needed concept expressions, so now it includes just `ExprConcepts.h`.
These flags are not needed for building libc.
Mark ELF_ppc64_relocations.s as unsupported on SystemZ because of a cross-build issue related to using dlsym for host symbols. The test fails to resolve __tls_get_addr on a SystemZ host. Co-authored-by: anoopkg6 <anoopkg6@github.com>
…lvm#168937) This patch extracts the common logic for computing array element counts from shape operands into a reusable helper function in CUFCommon.
These horizontal add/sub instructions are currently handled by adding/subtracting tuples of the first operand, followed by tuples of the second operand. This is not the correct semantics for the 256-bit instructions: they process the first half of the first operand, then the first half of the second operand, then the second half of the first operand, and finally the second half of the second operand (trust me bro [*]). This patch fixes the issue by applying the "shards" functionality that was added in llvm#167954, to handle the top and bottom 128-bit "shards" in turn.
[*] clang/test/CodeGen/X86/avx2-builtins.c:
```
TEST_CONSTEXPR(match_v8si(_mm256_hadd_epi32(
    (__m256i)(__v8si){10, 20, 30, 40, 50, 60, 70, 80},
    (__m256i)(__v8si){5, 15, 25, 35, 45, 55, 65, 75}),
    30,70,20,60,110,150,100,140));
```
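The per-lane ordering described above can be checked against the quoted test vector with a small model. This is a sketch in Python rather than the sanitizer's implementation; the function names are hypothetical, and 32-bit wraparound is deliberately left out because the example values do not overflow.

```python
def hadd_pairs(v):
    # Sum adjacent pairs: [v0+v1, v2+v3, ...].
    return [v[i] + v[i + 1] for i in range(0, len(v), 2)]


def hadd_epi32_256(a, b):
    # Model of a 256-bit horizontal add over eight 32-bit elements.
    # Each 128-bit half (four elements) is processed separately:
    # pairs from a's half first, then pairs from b's half.
    assert len(a) == len(b) == 8
    out = []
    for lane in range(2):
        lo, hi = lane * 4, lane * 4 + 4
        out += hadd_pairs(a[lo:hi]) + hadd_pairs(b[lo:hi])
    return out


def hadd_naive(a, b):
    # The ordering the commit message calls incorrect for 256-bit
    # operands: all pairs of a, followed by all pairs of b.
    return hadd_pairs(a) + hadd_pairs(b)


a = [10, 20, 30, 40, 50, 60, 70, 80]
b = [5, 15, 25, 35, 45, 55, 65, 75]
print(hadd_epi32_256(a, b))  # [30, 70, 20, 60, 110, 150, 100, 140]
print(hadd_naive(a, b))      # [30, 70, 110, 150, 20, 60, 100, 140]
```

The first result matches the expected vector in the quoted `TEST_CONSTEXPR`, while the naive ordering places the upper-half sums of the first operand in the wrong positions.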
Add a low trip count test that is currently vectorized but unprofitable, for llvm#167858.
…lvm#168609) which reassigns scale operand in vgpr_32 register to agpr_32, not permitted by instruction format. Reduced from ck. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com> Co-authored-by: theRonShark <ron.lieberman@amd.com>
Don't specifically target windows-msvc - the same goes for any windows target; mingw doesn't have dlfcn.h either.
…#168900) When comparing additions with the same base where one has `nsw`, the following simplification can be performed:
```llvm
icmp slt/sgt/sle/sge (x + C1), (x +nsw C2)
=>
icmp slt/sgt/sle/sge C1, C2
```
Previously this was only done for `slt`. This patch extends it to the `sgt`, `sle`, and `sge` predicates when either of the following conditions holds:
- `C1 <= C2 && C1 >= 0`, or
- `C2 <= C1 && C1 <= 0`
This patch also handles the `C1 == C2` case, which was previously excluded. Proof: https://alive2.llvm.org/ce/z/LtmY4f
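As a sanity check alongside the Alive2 proof, the fold can be brute-forced at a small bit width. The sketch below is an assumption-laden model, not LLVM code: it uses 4-bit signed integers, treats `x + C1` as a wrapping add, and skips inputs where `x +nsw C2` would overflow (those are poison, so the fold may do anything). All names are hypothetical.

```python
import operator

BITS = 4  # small width so the exhaustive search stays fast
LO, HI = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1

PREDICATES = {
    "slt": operator.lt,
    "sgt": operator.gt,
    "sle": operator.le,
    "sge": operator.ge,
}


def wrap(v):
    # Two's-complement wraparound into the signed BITS-wide range.
    return (v - LO) % (1 << BITS) + LO


def fold_applies(c1, c2):
    # The two conditions listed in the commit message.
    return (c1 <= c2 and c1 >= 0) or (c2 <= c1 and c1 <= 0)


def count_counterexamples():
    bad = 0
    for c1 in range(LO, HI + 1):
        for c2 in range(LO, HI + 1):
            if not fold_applies(c1, c2):
                continue
            for x in range(LO, HI + 1):
                if not LO <= x + c2 <= HI:
                    continue  # nsw violated: result is poison, skip
                lhs = wrap(x + c1)  # plain add may wrap
                rhs = x + c2        # nsw add, known not to wrap here
                for pred in PREDICATES.values():
                    if pred(lhs, rhs) != pred(c1, c2):
                        bad += 1
    return bad


print(count_counterexamples())  # 0
```

The search finds no counterexamples because, under either condition, `x + C1` lies between `x` and `x + C2`, so it cannot wrap whenever the `nsw` add does not overflow.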
Remove all constraint propagation functions in Dependence Analysis.
Add dependency on headers with `in_addr` and `in_addr_t` type definitions to ensure that these headers will be properly installed by "install-libc" CMake target.
… a given tiled loop nest. (llvm#167634) The existing `scf::tileAndFuseConsumerOfSlices` takes a list of slices (and loops they are part of), tries to find the consumer of these slices (all slices are expected to have the same consumer), and then tiles the consumer into the loop nest using the `TilingInterface`. A more natural way of doing consumer fusion is to just start from the consumer and look for operands that are produced by the loop nest passed in as `loops` (presumably these loops are generated by tiling, but that is not a requirement for consumer fusion). Using the consumer, you can find the slices of the operands that are accessed within the loop, which you can then use to tile and fuse the consumer (using `TilingInterface`). This handles more naturally the case where multiple operands of the consumer come from the loop nest. The `scf::tileAndFuseConsumerOfSlices` was implemented as a mirror of `scf::tileAndFuseProducerOfSlice`. For the latter, the slice has a single producer for the source of the slice, which makes it a natural way of specifying producer fusion. But for consumers, the result might have multiple users, resulting in multiple candidates for fusion, as well as a fusion candidate using multiple results from the tiled loop nest. This means using slices (`tensor.insert_slice`/`tensor.parallel_insert_slice`) as a hook for consumer fusion turns out to be quite hard to navigate. The use of the consumer directly avoids all those pain points. In time the `scf::tileAndFuseConsumerOfSlices` should be deprecated in favor of `scf::tileAndFuseConsumer`. There is a lot of tech-debt that has accumulated in `scf::tileAndFuseConsumerOfSlices` that needs to be cleaned up. So while that gets cleaned up, and required functionality is moved to `scf::tileAndFuseConsumer`, the old path is still maintained. The test for `scf::tileAndFuseConsumerUsingSlices` is copied from `tile-and-fuse-consumer.mlir` to `tile-and-fuse-consumer-using-slices.mlir`.
All the tests that were there in this file are now using the `tileAndFuseConsumer` method. The test op `test.tile_and_fuse_consumer` is modified to call `scf::tileAndFuseConsumer`, while a new op `test.tile_and_fuse_consumer_of_slice` is used to keep the old path tested while it is deprecated. --------- Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
Add declaration of command line options to BugDriver.h and remove extern declarations in individual .cpp files.
Reverts llvm#168921 Causes build failures.
llvm#156577) Add detailed comments explaining each function's memory access patterns and why they should/shouldn't be unroll-and-jammed:
- fore_aft_*: Dependencies between fore block and aft block
- fore_sub_*: Dependencies between fore block and sub block
- sub_aft_*: Dependencies between sub block and aft block
- sub_sub_*: Dependencies within sub block
- *_less: Backward dependency (i-1) - safe for fore/aft, fore/sub, sub/aft; unsafe for sub/sub due to jamming conflicts
- *_eq: Same iteration dependency (i+0) - safe due to preserved execution order
- *_more: Forward dependency (i+1) - unsafe due to write-after-write races between unrolled iterations, except sub/sub case creates conflicts
…168962) The only thing the docs should depend on is the SWIG wrapper (lldb.py), which only requires parsing the API headers. It should not depend on building libLLDB. The dependency was (I believe accidentally) introduced by 59f4267. Fixes llvm#123316
…vm#168930) Removes about 200 bytes of unneeded patterns from RISCVGenDAGISel.inc
…llvm#168932) This removes an unnecessary isel pattern for the RV32 HwMode.
…llvm#168957) On startup, bazel prints: `WARNING: Option 'experimental_guard_against_concurrent_changes' is deprecated: Use --guard_against_concurrent_changes instead`
Collaborator
dpalermo
approved these changes
Nov 21, 2025
No description provided.