28 changes: 28 additions & 0 deletions src/coreclr/jit/gentree.cpp
@@ -7233,6 +7233,34 @@ bool GenTree::OperMayThrow(Compiler* comp)
return OperExceptions(comp) != ExceptionSetFlags::None;
}
Contributor

What if this is user written ConditionalSelect and contains something that throws in op1 or op3? Should we check for those as well?

Member

Also, OperMayThrow should be kept in sync with OperExceptions. It does not make sense for the latter to return an empty set if the former returns true.

Member
@jakobbotsch, Jun 11, 2025

> What if this is user written ConditionalSelect and contains something that throws in op1 or op3? Should we check for those as well?

These methods like OperMayThrow, OperRequiresCallFlag etc. do not consider operand nodes. They are meant to recompute the flag for the node itself only.

I would rather see us introducing a new variant of OperMayThrow that takes containment into account and which may be used in the rare case that the backend needs it. It should not need hardcoding to a specific oper type like this.

Contributor Author

> I would rather see us introducing a new variant of OperMayThrow

Done. Not keen on the new name (OperOrEmbeddedChildrenMayThrow())

> that takes containment into account and which may be used in the rare case that the backend needs it. It should not need hardcoding to a specific oper type like this.

Specifically, it still checks IsEmbeddingMaskOp() and not containment (as there is a period where the nodes are not yet contained).

> What if this is user written ConditionalSelect and contains something that throws in op1 or op3? Should we check for those as well?

Also done.

Member

> Specifically, it still checks IsEmbeddingMaskOp() and not containment (as there is a period where the nodes are not yet contained).

The liveness that is using this runs after lowering. After lowering, all the containment checks should be done. If we are not properly marking these operands as contained during lowering then that seems like a bug. Have we actually introduced another concept of "containment" called "embedding" here?

Contributor Author
@a74nh, Jun 11, 2025

The way I interpreted this comment was that I shouldn't be relying on the changes that happen in lowering. Re-reading now I'm not sure if I still agree.

I'm happy to remove the new GTF_HW_EMBEDDING_OP flag and instead have OperOrEmbeddedChildrenMayThrow() check nodes based on containment.

??

Contributor Author

> I'm happy to remove the new GTF_HW_EMBEDDING_OP flag and instead have OperOrEmbeddedChildrenMayThrow() check nodes based on containment.

Fixed it up to work that way

Member
@jakobbotsch, Jun 11, 2025

I upvoted that because changing OperMayThrow to check child nodes is outside the contract of what OperMayThrow is supposed to do. I think @tannergooding's comment was about the same thing. HIR uses this function to determine if the evaluation of a node itself may throw, and its contract is supposed to exclude child nodes (function header could be clearer on this point). For checking if a subtree rooted at a node may throw we have GTF_EXCEPT. So changing OperMayThrow could indeed be a pessimization for HIR.

However, in LIR things are different because of containment and because of how evaluation works. The flags no longer make sense there because the operands of the nodes are not actually its children, and are their own thing. The exception of course is containment; evaluation of a node with a contained operand does include the effects of its operand. So we need the spiritual equivalent of GTF_EXCEPT for LIR. I think it makes sense to introduce a version of OperMayThrow that does that. I would suggest calling it something like NodeOrContainedOperandsMayThrow().

Member

> I think @tannergooding's comment was about the same thing

Yep. There's quite a bit of nuance between LIR and HIR in some cases and in the guarantees we have in place. We generally don't want to modify general IR for something that is LIR-only. Instead, we either want a Lowering::*-specific function or try to work with what LIR may expect, such as by introducing a different LIR-specific intrinsic ID (which is what we do for xarch in several cases).
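
For readers following the thread, here is a minimal sketch of the distinction being discussed: a node-only check versus a check that also follows contained operands. `ToyNode`, `ToyResult`, and `CheckContainedLoad` are hypothetical names invented for illustration; this is not the real `GenTree` API, only a simplified model of it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for a LIR node (not the real GenTree).
struct ToyNode
{
    bool                  selfMayThrow = false; // evaluating this node itself may throw
    bool                  contained    = false; // evaluated as part of its user
    std::vector<ToyNode*> operands;

    // Analogue of OperMayThrow: considers this node only, never its operands.
    bool OperMayThrow() const
    {
        return selfMayThrow;
    }

    // Analogue of NodeOrContainedOperandsMayThrow: also follows contained operands.
    bool NodeOrContainedOperandsMayThrow() const
    {
        if (OperMayThrow())
        {
            return true;
        }
        for (const ToyNode* operand : operands)
        {
            if (operand->contained && operand->NodeOrContainedOperandsMayThrow())
            {
                return true;
            }
        }
        return false;
    }
};

struct ToyResult
{
    bool nodeOnly;      // result of the node-only check on the user
    bool withContained; // result of the containment-aware check on the user
};

// Builds a non-throwing user node with one throwing operand and runs both checks.
inline ToyResult CheckContainedLoad(bool loadIsContained)
{
    ToyNode load;
    load.selfMayThrow = true;
    load.contained    = loadIsContained;

    ToyNode select;
    select.operands.push_back(&load);

    return {select.OperMayThrow(), select.NodeOrContainedOperandsMayThrow()};
}
```

In this model, a dead-node pass that consults only the node-only check would treat `select` as removable even when it carries a contained operand that may throw. The containment-aware check reports the possible exception in that case, and reports nothing extra when the operand is not contained and is therefore evaluated on its own.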


//------------------------------------------------------------------------------
// NodeOrContainedOperandsMayThrow : Check whether the operation or any contained
// children will throw
//
// Arguments:
// comp - Compiler instance
//
// Return Value:
// True if the given operator or contained children may cause an exception
//
bool GenTree::NodeOrContainedOperandsMayThrow(Compiler* comp)
Contributor

At least in the short term, this should only be for arm64. We are seeing diffs on x64 because this is also affecting x64.

{
if (OperMayThrow(comp))
{
return true;
}

// Check all contained children
for (GenTree* operand : Operands())
{
if (operand->isContained() && operand->NodeOrContainedOperandsMayThrow(comp))
{
return true;
}
}
return false;
}

//------------------------------------------------------------------------------
// OperRequiresGlobRefFlag : Check whether the operation requires GTF_GLOB_REF
// flag regardless of the children's flags.
1 change: 1 addition & 0 deletions src/coreclr/jit/gentree.h
@@ -1910,6 +1910,7 @@ struct GenTree

ExceptionSetFlags OperExceptions(Compiler* comp);
bool OperMayThrow(Compiler* comp);
bool NodeOrContainedOperandsMayThrow(Compiler* comp);

bool OperRequiresGlobRefFlag(Compiler* comp) const;

6 changes: 6 additions & 0 deletions src/coreclr/jit/jit.h
@@ -591,6 +591,11 @@ const bool dspGCtbls = true;
#define DISPTREERANGE(range, t) \
if (JitTls::GetCompiler()->verbose) \
JitTls::GetCompiler()->gtDispTreeRange(range, t);
#define LABELEDDISPTREERANGE(label, range, t) \
Contributor Author

Happy to remove this, but I do think it's useful. Any suggestions for a better name are welcome

JITDUMP(label ":\n"); \
if (JitTls::GetCompiler()->verbose) \
JitTls::GetCompiler()->gtDispTreeRange(range, t); \
JITDUMP("\n");
#define DISPBLOCK(b) \
if (JitTls::GetCompiler()->verbose) \
JitTls::GetCompiler()->fgTableDispBasicBlock(b);
@@ -609,6 +614,7 @@ const bool dspGCtbls = true;
#define DISPSTMT(t)
#define DISPRANGE(range)
#define DISPTREERANGE(range, t)
#define LABELEDDISPTREERANGE(title, range, t)
#define DISPBLOCK(b)
#define VERBOSE 0
#endif // !DEBUG
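
As a side note on the macro shape above, here is a minimal sketch of a labeled dump macro. `g_verbose`, `g_log`, `TOY_DUMP`, `TOY_LABELED_DUMP`, and `DumpOnce` are hypothetical stand-ins for illustration, not the JIT's `JITDUMP` machinery. It shows two properties: the label has to be a string literal, because `label ":\n"` relies on adjacent string literal concatenation, and wrapping the body in `do { } while (0)` is one common way to make a multi-statement macro behave as a single statement.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins: the real JIT prints through JITDUMP and checks Compiler::verbose.
static bool        g_verbose = true;
static std::string g_log;

// Appends text to the log only when verbose output is enabled.
#define TOY_DUMP(text)                                                                                                 \
    do                                                                                                                 \
    {                                                                                                                  \
        if (g_verbose)                                                                                                 \
            g_log += (text);                                                                                           \
    } while (0)

// `label` must be a string literal so that `label ":\n"` concatenates at compile time.
#define TOY_LABELED_DUMP(label, body)                                                                                  \
    do                                                                                                                 \
    {                                                                                                                  \
        TOY_DUMP(label ":\n");                                                                                         \
        TOY_DUMP(body);                                                                                                \
        TOY_DUMP("\n");                                                                                                \
    } while (0)

// Runs one labeled dump and returns what was logged.
static std::string DumpOnce(bool verbose)
{
    g_verbose = verbose;
    g_log.clear();
    TOY_LABELED_DUMP("before", "node");
    return g_log;
}
```

The macro in the diff instead expands to several statements, matching the neighbouring `DISP*` macros, so it is best used as a standalone statement rather than as the body of an unbraced `if`.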
2 changes: 1 addition & 1 deletion src/coreclr/jit/liveness.cpp
@@ -1624,7 +1624,7 @@ bool Compiler::fgTryRemoveNonLocal(GenTree* node, LIR::Range* blockRange)
// (as opposed to side effects of their children).
// This default case should never include calls or stores.
assert(!node->OperRequiresAsgFlag() && !node->OperIs(GT_CALL));
- if (!node->gtSetFlags() && !node->OperMayThrow(this))
+ if (!node->gtSetFlags() && !node->NodeOrContainedOperandsMayThrow(this))
{
JITDUMP("Removing dead node:\n");
DISPNODE(node);
39 changes: 12 additions & 27 deletions src/coreclr/jit/lowerarmarch.cpp
@@ -1948,9 +1948,7 @@ GenTree* Lowering::LowerHWIntrinsic(GenTreeHWIntrinsic* node)
if (HWIntrinsicInfo::IsEmbeddedMaskedOperation(intrinsicId))
{
LIR::Use use;
- JITDUMP("lowering EmbeddedMasked HWIntrinisic (before):\n");
- DISPTREERANGE(BlockRange(), node);
- JITDUMP("\n");
+ LABELEDDISPTREERANGE("lowering EmbeddedMasked HWIntrinisic (before)", BlockRange(), node);

// Use lastOp to verify if it's a ConditionlSelectNode.
size_t lastOpNum = node->GetOperandCount();
@@ -1959,12 +1957,12 @@ GenTree* Lowering::LowerHWIntrinsic(GenTreeHWIntrinsic* node)
node->Op(lastOpNum)->AsHWIntrinsic()->GetHWIntrinsicId() == NI_Sve_ConditionalSelect &&
TryContainingCselOp(node, node->Op(lastOpNum)->AsHWIntrinsic()))
{
- JITDUMP("lowering EmbeddedMasked HWIntrinisic (after):\n");
- DISPTREERANGE(BlockRange(), node);
- JITDUMP("\n");
+ LABELEDDISPTREERANGE("Contained conditional select", BlockRange(), node);
return node->gtNext;
}

// Wrap a conditional select around the embedded mask operation

CorInfoType simdBaseJitType = node->GetSimdBaseJitType();
unsigned simdSize = node->GetSimdSize();
var_types simdType = Compiler::getSIMDTypeForSize(simdSize);
@@ -1996,9 +1994,7 @@ GenTree* Lowering::LowerHWIntrinsic(GenTreeHWIntrinsic* node)
condSelNode->SetUnusedValue();
}

- JITDUMP("lowering EmbeddedMasked HWIntrinisic (after):\n");
- DISPTREERANGE(BlockRange(), condSelNode);
- JITDUMP("\n");
+ LABELEDDISPTREERANGE("Embedded HWIntrinisic inside conditional select", BlockRange(), condSelNode);
}

ContainCheckHWIntrinsic(node);
@@ -4116,8 +4112,7 @@ GenTree* Lowering::LowerHWIntrinsicCndSel(GenTreeHWIntrinsic* cndSelNode)

if (op2->OperIsHWIntrinsic(NI_Sve_ConditionalSelect))
{
- // Handle cases where there is a nested ConditionalSelect for
- // `trueValue`
+ // Handle cases where there is a nested ConditionalSelect for `trueValue`
GenTreeHWIntrinsic* nestedCndSel = op2->AsHWIntrinsic();
GenTree* nestedOp1 = nestedCndSel->Op(1);
GenTree* nestedOp2 = nestedCndSel->Op(2);
@@ -4137,25 +4132,20 @@ GenTree* Lowering::LowerHWIntrinsicCndSel(GenTreeHWIntrinsic* cndSelNode)
GenTree* nestedOp2 = nestedCndSel->Op(2);
GenTree* nestedOp3 = nestedCndSel->Op(3);

- JITDUMP("lowering nested ConditionalSelect HWIntrinisic (before):\n");
- DISPTREERANGE(BlockRange(), cndSelNode);
- JITDUMP("\n");
+ LABELEDDISPTREERANGE("Removed nested conditionalselect (before):", BlockRange(), cndSelNode);

// Transform:
//
- // CndSel(mask, CndSel(AllTrue, embeddedMask(trueValOp2), trueValOp3), op3) to
- // CndSel(mask, embedded(trueValOp2), op3)
+ // CndSel1(mask, CndSel2(AllTrue, embedded(), trueValOp3), op3) to
+ // CndSel1(mask, embedded(), op3)
//
cndSelNode->Op(2) = nestedCndSel->Op(2);
nestedOp3->SetUnusedValue();

BlockRange().Remove(nestedOp1);
BlockRange().Remove(nestedCndSel);

- JITDUMP("lowering nested ConditionalSelect HWIntrinisic (after):\n");
- DISPTREERANGE(BlockRange(), cndSelNode);
- JITDUMP("\n");
-
+ LABELEDDISPTREERANGE("Removed nested conditionalselect (after)", BlockRange(), cndSelNode);
return cndSelNode;
}
}
@@ -4166,9 +4156,7 @@ GenTree* Lowering::LowerHWIntrinsicCndSel(GenTreeHWIntrinsic* cndSelNode)
if (!op2->OperIsHWIntrinsic() ||
!HWIntrinsicInfo::IsEmbeddedMaskedOperation(op2->AsHWIntrinsic()->GetHWIntrinsicId()))
{
- JITDUMP("lowering ConditionalSelect HWIntrinisic (before):\n");
- DISPTREERANGE(BlockRange(), cndSelNode);
- JITDUMP("\n");
+ LABELEDDISPTREERANGE("Lowered ConditionalSelect(True, op2, op3) to op2 (before)", BlockRange(), cndSelNode);

// Transform
// CndSel(AllTrue, op2, op3) to
@@ -4190,10 +4178,7 @@ GenTree* Lowering::LowerHWIntrinsicCndSel(GenTreeHWIntrinsic* cndSelNode)
GenTree* next = cndSelNode->gtNext;
BlockRange().Remove(cndSelNode);

- JITDUMP("lowering ConditionalSelect HWIntrinisic (after):\n");
- DISPTREERANGE(BlockRange(), op2);
- JITDUMP("\n");
-
+ LABELEDDISPTREERANGE("Lowered ConditionalSelect(True, op2, op3) to op2 (after)", BlockRange(), op2);
return next;
}
}
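
The two rewrites in `LowerHWIntrinsicCndSel` rest on simple lane-wise identities, sketched below. This models only the selected values, with an assumed lane count of 4 and hypothetical names (`Vec`, `Mask`, `CndSel`, `AllTrueIdentitiesHold`). It does not model SVE registers, faulting behaviour, or the JIT's node bookkeeping.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 4; // assumed lane count, for illustration only
using Vec  = std::array<int, kLanes>;
using Mask = std::array<bool, kLanes>;

// Lane-wise select: result[i] = mask[i] ? trueValue[i] : falseValue[i].
static Vec CndSel(const Mask& mask, const Vec& trueValue, const Vec& falseValue)
{
    Vec result{};
    for (std::size_t i = 0; i < kLanes; i++)
    {
        result[i] = mask[i] ? trueValue[i] : falseValue[i];
    }
    return result;
}

// Checks both identities for a given outer mask.
static bool AllTrueIdentitiesHold(const Mask& mask)
{
    const Mask allTrue    = {{true, true, true, true}};
    const Vec  embedded   = {{1, 2, 3, 4}};
    const Vec  trueValOp3 = {{10, 20, 30, 40}};
    const Vec  op3        = {{100, 200, 300, 400}};

    // CndSel(AllTrue, op2, op3) selects op2 in every lane.
    const Vec  inner           = CndSel(allTrue, embedded, trueValOp3);
    const bool innerIsEmbedded = (inner == embedded);

    // CndSel(mask, CndSel(AllTrue, embedded, trueValOp3), op3) == CndSel(mask, embedded, op3).
    const bool nestedCollapses = (CndSel(mask, inner, op3) == CndSel(mask, embedded, op3));

    return innerIsEmbedded && nestedCollapses;
}
```

Because the inner select's false operand never contributes a lane when the mask is all true, its value is unused after the rewrite, which matches the diff marking `nestedOp3` with `SetUnusedValue()`.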
84 changes: 84 additions & 0 deletions src/tests/JIT/opt/SVE/EmbeddedLoads.cs
@@ -0,0 +1,84 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

// Unit tests for the masks conversion optimization
// Uses vectors as masks and vice versa.

using System;
using System.Numerics;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.Arm;
using System.Threading;
using Xunit;

public class EmbeddedLoads
{
[MethodImpl(MethodImplOptions.NoInlining)]
private static void Consume<T>(T value) { }

[Fact]
public static void TestEntryPoint()
{

if (Sve.IsSupported)
{
int[] array = new int[10];

Vector<int> op1 = Vector.Create<int>(11);
Vector<int> op2 = Vector.Create<int>(22);
Vector<int> op3 = Vector.Create<int>(33);
Vector<long> opl1 = Vector.Create<long>(44);
Vector<long> opl2 = Vector.Create<long>(55);

CndSelectEmbeddedOp3LoadTrueMask(array, op1);
CndSelectEmbeddedOp3LoadAllBits(array, op1);
CndSelectEmbeddedOp3LoadFalseMask(array, op1);
CndSelectEmbeddedOp3LoadZero(array, op1);
}
}

// SVE load operation with embedded mask inside a conditional select

[MethodImpl(MethodImplOptions.NoInlining)]
static unsafe void CndSelectEmbeddedOp3LoadTrueMask(int[] array, Vector<int> op1) {
//ARM6-FULL-LINE: ldnf1w { {{z[0-9]+}}.s }, {{p[0-9]+}}/m, [{{x[0-9]+}}]
fixed (int* arr_ptr = array)
{
var result = Sve.ConditionalSelect(Sve.CreateTrueMaskInt32(), op1, Sve.LoadVectorNonFaulting(arr_ptr));
Consume(result);
}
}

[MethodImpl(MethodImplOptions.NoInlining)]
static unsafe void CndSelectEmbeddedOp3LoadAllBits(int[] array, Vector<int> op1) {
//ARM6-FULL-LINE: ldnf1w { {{z[0-9]+}}.s }, {{p[0-9]+}}/m, [{{x[0-9]+}}]
fixed (int* arr_ptr = array)
{
var result = Sve.ConditionalSelect(Vector<int>.AllBitsSet, op1, Sve.LoadVectorNonFaulting(arr_ptr));
Consume(result);
}
}

[MethodImpl(MethodImplOptions.NoInlining)]
static unsafe void CndSelectEmbeddedOp3LoadFalseMask(int[] array, Vector<int> op1) {
//ARM6-FULL-LINE: ldnf1w { {{z[0-9]+}}.s }, {{p[0-9]+}}/m, [{{x[0-9]+}}]
fixed (int* arr_ptr = array)
{
var result = Sve.ConditionalSelect(Sve.CreateFalseMaskInt32(), op1, Sve.LoadVectorNonFaulting(arr_ptr));
Consume(result);
}
}

[MethodImpl(MethodImplOptions.NoInlining)]
static unsafe void CndSelectEmbeddedOp3LoadZero(int[] array, Vector<int> op1) {
//ARM6-FULL-LINE: ldnf1w { {{z[0-9]+}}.s }, {{p[0-9]+}}/m, [{{x[0-9]+}}]
fixed (int* arr_ptr = array)
{
var result = Sve.ConditionalSelect(Vector<int>.Zero, op1, Sve.LoadVectorNonFaulting(arr_ptr));
Consume(result);
}
}

}
20 changes: 20 additions & 0 deletions src/tests/JIT/opt/SVE/EmbeddedLoads.csproj
@@ -0,0 +1,20 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<!-- Needed for CLRTestEnvironmentVariable -->
<RequiresProcessIsolation>true</RequiresProcessIsolation>
</PropertyGroup>
<PropertyGroup>
<DebugType>None</DebugType>
<Optimize>True</Optimize>
<NoWarn>$(NoWarn),SYSLIB5003</NoWarn>
</PropertyGroup>
<ItemGroup>
<Compile Include="$(MSBuildProjectName).cs">
<HasDisasmCheck>true</HasDisasmCheck>
</Compile>

<CLRTestEnvironmentVariable Include="DOTNET_TieredCompilation" Value="0" />
<CLRTestEnvironmentVariable Include="DOTNET_JITMinOpts" Value="0" />
<CLRTestEnvironmentVariable Include="DOTNET_EnableHWIntrinsic" Value="1" />
</ItemGroup>
</Project>