Skip to content

[BoundsSafety] Allow evaluating the same CLE in a bounds-check #11175

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: next
Choose a base branch
from

Conversation

patrykstefanski
Copy link

BoundsSafety can reuse the same CompoundLiteralExpr when emitting a bounds-check condition. This change allows evaluating the same CLE in LValueExprEvaluator::VisitCompoundLiteralExpr, and when it was previously evaluated, we reset the state and evaluate it again.

rdar://157033241

@patrykstefanski patrykstefanski self-assigned this Aug 14, 2025
@patrykstefanski patrykstefanski added the clang:bounds-safety Issue relating to the experimental -fbounds-safety feature in Clang label Aug 14, 2025
@hnrklssn
Copy link

Are we sure this isn't something that should be wrapped in OpaqueValueExpr to allow sharing the AST node?

BoundsSafety can reuse the same CompoundLiteralExpr when emitting a
bounds-check condition. This change allows evaluating the same CLE in
LValueExprEvaluator::VisitCompoundLiteralExpr, and when it was
previously evaluated, we reset the state and evaluate it again.

rdar://157033241
@patrykstefanski
Copy link
Author

Are we sure this isn't something that should be wrapped in OpaqueValueExpr to allow sharing the AST node?

TBH, I'm not sure. This is caused by building a bounds-check in ActOnBoundsSafetyCall:

https://github.com/swiftlang/llvm-project/blob/5bfcbc9b62a9c48206f2c5892ff7845cb9952ee9/clang/lib/Sema/SemaExpr.cpp#L25737

We create the bounds-check condition first, which causes the bug, and then wrap it in BoundsCheckExpr/MaterializeSequenceExpr. If anyone has a better idea how to solve it, please comment.

@patrykstefanski patrykstefanski force-pushed the eng/pstefanski/PR-157033241 branch from 5a70759 to 410016e Compare August 15, 2025 23:35
@ziqingluo-90 ziqingluo-90 self-requested a review August 18, 2025 17:14
@ziqingluo-90
Copy link

I was imagining in general that if you are creating new AST nodes and want to reuse some nodes, you need to make clones of them. But I'm not sure if this is actually required in clang development.

@ziqingluo-90
Copy link

I was imagining in general that if you are creating new AST nodes and want to reuse some nodes, you need to make clones of them. But I'm not sure if this is actually required in clang development.

Ah, I didn't know that OpaqueValueExpr can be used for sharing.

@hnrklssn
Copy link

Are we sure this isn't something that should be wrapped in OpaqueValueExpr to allow sharing the AST node?

TBH, I'm not sure. This is caused by building a bounds-check in ActOnBoundsSafetyCall:

https://github.com/swiftlang/llvm-project/blob/5bfcbc9b62a9c48206f2c5892ff7845cb9952ee9/clang/lib/Sema/SemaExpr.cpp#L25737

We create the bounds-check condition first, which causes the bug, and then wrap it in BoundsCheckExpr/MaterializeSequenceExpr. If anyone has a better idea how to solve it, please comment.

Earlier in the function we scan all the arguments passed in the function call, and wrap them in OVEs. Is this not done for the compound literal? Or do we somehow peel the OVE off later? Perhaps the explicit cast is related.

@patrykstefanski
Copy link
Author

Are we sure this isn't something that should be wrapped in OpaqueValueExpr to allow sharing the AST node?

TBH, I'm not sure. This is caused by building a bounds-check in ActOnBoundsSafetyCall:
https://github.com/swiftlang/llvm-project/blob/5bfcbc9b62a9c48206f2c5892ff7845cb9952ee9/clang/lib/Sema/SemaExpr.cpp#L25737
We create the bounds-check condition first, which causes the bug, and then wrap it in BoundsCheckExpr/MaterializeSequenceExpr. If anyone has a better idea how to solve it, please comment.

Earlier in the function we scan all the arguments passed in the function call, and wrap them in OVEs. Is this not done for the compound literal? Or do we somehow peel the OVE off later? Perhaps the explicit cast is related.

The compound literal is wrapped in OVE at this point. For context:

    frame #29: 0x0000000108250c78 clang`clang::Expr::EvaluateAsInt(this=0x000000015093b018, Result=0x000000016fdebdd8, Ctx=0x0000000150830800, AllowSideEffects=SE_NoSideEffects, InConstantContext=false) const at ExprConstant.cpp:17556:10
  * frame #30: 0x00000001069b169c clang`clang::Sema::CheckLogicalOperands(this=0x000000015090c000, LHS=0x000000016fdec558, RHS=0x000000016fdec550, Loc=(ID = 43), Opc=BO_LAnd) at SemaExpr.cpp:16428:20
    frame #31: 0x00000001069af41c clang`clang::Sema::CreateBuiltinBinOp(this=0x000000015090c000, OpLoc=(ID = 43), Opc=BO_LAnd, LHSExpr=0x000000015093afc8, RHSExpr=0x000000015093b018, ForFoldExpression=false) at SemaExpr.cpp:18282:16
    frame #32: 0x00000001069d1bf4 clang`clang::BoundsCheckBuilder::BuildLEChecks(S=0x000000015090c000, Loc=(ID = 43), Bounds=ArrayRef<clang::Expr *> @ 0x000000016fdee840, OVEs=0x000000016fdeeec0) at SemaExpr.cpp:26301:30
    frame #33: 0x00000001069cffc4 clang`clang::BoundsCheckBuilder::Build(this=0x000000016fdeecc0, S=0x000000015090c000, GuardedValue=0x000000015093ad48) at SemaExpr.cpp:26250:37
    frame #34: 0x0000000106982c14 clang`clang::Sema::ActOnBoundsSafetyCall(this=0x000000015090c000, Call=(Value = 5646822728)) at SemaExpr.cpp:25737:30
(lldb) f 30
frame #30: 0x00000001069b169c clang`clang::Sema::CheckLogicalOperands(this=0x000000015090c000, LHS=0x000000016fdec558, RHS=0x000000016fdec550, Loc=(ID = 43), Opc=BO_LAnd) at SemaExpr.cpp:16428:20
   16425            // happened to fold to true/false) then warn.
   16426            // Parens on the RHS are ignored.
   16427            Expr::EvalResult EVResult;
-> 16428            if (RHS.get()->EvaluateAsInt(EVResult, Context)) {
   16429              llvm::APSInt Result = EVResult.Val.getInt();
   16430              if ((getLangOpts().CPlusPlus && !RHS.get()->getType()->isBooleanType() &&
   16431                   !RHS.get()->getExprLoc().isMacroID()) ||
(lldb) p RHS.get()->dumpColor()
BinaryOperator 0x15093b018 'int' '<='
|-ImplicitCastExpr 0x15093afe8 'char *' <BoundsSafetyPointerCast>
| `-GetBoundExpr 0x15093ae38 'char *__bidi_indexable' lower
|   `-OpaqueValueExpr 0x15093adf8 'char *__bidi_indexable'
|     `-ImplicitCastExpr 0x15093adb0 'char *__bidi_indexable' <ArrayToPointerDecay>
|       `-CompoundLiteralExpr 0x15093ac38 'char[3]' lvalue
|         `-InitListExpr 0x15093ab98 'char[3]'
|           |-ImplicitCastExpr 0x15093abf0 'char' <IntegralCast>
|           | `-CharacterLiteral 0x15093aae8 'int' 65
|           |-ImplicitCastExpr 0x15093ac08 'char' <IntegralCast>
|           | `-CharacterLiteral 0x15093ab00 'int' 66
|           `-ImplicitCastExpr 0x15093ac20 'char' <IntegralCast>
|             `-CharacterLiteral 0x15093ab18 'int' 67
`-ImplicitCastExpr 0x15093b000 'char *' <BoundsSafetyPointerCast>
  `-OpaqueValueExpr 0x15093adf8 'char *__bidi_indexable'
    `-ImplicitCastExpr 0x15093adb0 'char *__bidi_indexable' <ArrayToPointerDecay>
      `-CompoundLiteralExpr 0x15093ac38 'char[3]' lvalue
        `-InitListExpr 0x15093ab98 'char[3]'
          |-ImplicitCastExpr 0x15093abf0 'char' <IntegralCast>
          | `-CharacterLiteral 0x15093aae8 'int' 65
          |-ImplicitCastExpr 0x15093ac08 'char' <IntegralCast>
          | `-CharacterLiteral 0x15093ab00 'int' 66
          `-ImplicitCastExpr 0x15093ac20 'char' <IntegralCast>
            `-CharacterLiteral 0x15093ab18 'int' 67

The issue here is that those OpaqueValueExprs are not yet bound by MaterializeSequenceExpr/BoundsCheckExpr, so the const evaluator will just try to evaluate their subexpressions each time. That is, we:

  1. Wrap arguments in OVEs (they are not bound yet).
  2. Build bounds-check condition, which fails since the same CLE is being evaluated twice.
  3. Wrap the whole thing in MaterializeSequenceExpr.

@patrykstefanski
Copy link
Author

Uhh, not sure if my patch is even correct:

void foo(char tag[3]);

void bar(void) {
  char c = 'A';
  foo((char[3]){c++, 'B', 'C'});
}

@patrykstefanski
Copy link
Author

TBF, this constant evaluation is used to diagnose:

  return x && 0; // expected-warning {{use of logical '&&' with constant operand}} \

We could disable this if we are inside of bounds-check condition and call it a day...

@hnrklssn
Copy link

Are we sure this isn't something that should be wrapped in OpaqueValueExpr to allow sharing the AST node?

TBH, I'm not sure. This is caused by building a bounds-check in ActOnBoundsSafetyCall:
https://github.com/swiftlang/llvm-project/blob/5bfcbc9b62a9c48206f2c5892ff7845cb9952ee9/clang/lib/Sema/SemaExpr.cpp#L25737
We create the bounds-check condition first, which causes the bug, and then wrap it in BoundsCheckExpr/MaterializeSequenceExpr. If anyone has a better idea how to solve it, please comment.

Earlier in the function we scan all the arguments passed in the function call, and wrap them in OVEs. Is this not done for the compound literal? Or do we somehow peel the OVE off later? Perhaps the explicit cast is related.

The compound literal is wrapped in OVE at this point. For context:

    frame #29: 0x0000000108250c78 clang`clang::Expr::EvaluateAsInt(this=0x000000015093b018, Result=0x000000016fdebdd8, Ctx=0x0000000150830800, AllowSideEffects=SE_NoSideEffects, InConstantContext=false) const at ExprConstant.cpp:17556:10
  * frame #30: 0x00000001069b169c clang`clang::Sema::CheckLogicalOperands(this=0x000000015090c000, LHS=0x000000016fdec558, RHS=0x000000016fdec550, Loc=(ID = 43), Opc=BO_LAnd) at SemaExpr.cpp:16428:20
    frame #31: 0x00000001069af41c clang`clang::Sema::CreateBuiltinBinOp(this=0x000000015090c000, OpLoc=(ID = 43), Opc=BO_LAnd, LHSExpr=0x000000015093afc8, RHSExpr=0x000000015093b018, ForFoldExpression=false) at SemaExpr.cpp:18282:16
    frame #32: 0x00000001069d1bf4 clang`clang::BoundsCheckBuilder::BuildLEChecks(S=0x000000015090c000, Loc=(ID = 43), Bounds=ArrayRef<clang::Expr *> @ 0x000000016fdee840, OVEs=0x000000016fdeeec0) at SemaExpr.cpp:26301:30
    frame #33: 0x00000001069cffc4 clang`clang::BoundsCheckBuilder::Build(this=0x000000016fdeecc0, S=0x000000015090c000, GuardedValue=0x000000015093ad48) at SemaExpr.cpp:26250:37
    frame #34: 0x0000000106982c14 clang`clang::Sema::ActOnBoundsSafetyCall(this=0x000000015090c000, Call=(Value = 5646822728)) at SemaExpr.cpp:25737:30
(lldb) f 30
frame #30: 0x00000001069b169c clang`clang::Sema::CheckLogicalOperands(this=0x000000015090c000, LHS=0x000000016fdec558, RHS=0x000000016fdec550, Loc=(ID = 43), Opc=BO_LAnd) at SemaExpr.cpp:16428:20
   16425            // happened to fold to true/false) then warn.
   16426            // Parens on the RHS are ignored.
   16427            Expr::EvalResult EVResult;
-> 16428            if (RHS.get()->EvaluateAsInt(EVResult, Context)) {
   16429              llvm::APSInt Result = EVResult.Val.getInt();
   16430              if ((getLangOpts().CPlusPlus && !RHS.get()->getType()->isBooleanType() &&
   16431                   !RHS.get()->getExprLoc().isMacroID()) ||
(lldb) p RHS.get()->dumpColor()
BinaryOperator 0x15093b018 'int' '<='
|-ImplicitCastExpr 0x15093afe8 'char *' <BoundsSafetyPointerCast>
| `-GetBoundExpr 0x15093ae38 'char *__bidi_indexable' lower
|   `-OpaqueValueExpr 0x15093adf8 'char *__bidi_indexable'
|     `-ImplicitCastExpr 0x15093adb0 'char *__bidi_indexable' <ArrayToPointerDecay>
|       `-CompoundLiteralExpr 0x15093ac38 'char[3]' lvalue
|         `-InitListExpr 0x15093ab98 'char[3]'
|           |-ImplicitCastExpr 0x15093abf0 'char' <IntegralCast>
|           | `-CharacterLiteral 0x15093aae8 'int' 65
|           |-ImplicitCastExpr 0x15093ac08 'char' <IntegralCast>
|           | `-CharacterLiteral 0x15093ab00 'int' 66
|           `-ImplicitCastExpr 0x15093ac20 'char' <IntegralCast>
|             `-CharacterLiteral 0x15093ab18 'int' 67
`-ImplicitCastExpr 0x15093b000 'char *' <BoundsSafetyPointerCast>
  `-OpaqueValueExpr 0x15093adf8 'char *__bidi_indexable'
    `-ImplicitCastExpr 0x15093adb0 'char *__bidi_indexable' <ArrayToPointerDecay>
      `-CompoundLiteralExpr 0x15093ac38 'char[3]' lvalue
        `-InitListExpr 0x15093ab98 'char[3]'
          |-ImplicitCastExpr 0x15093abf0 'char' <IntegralCast>
          | `-CharacterLiteral 0x15093aae8 'int' 65
          |-ImplicitCastExpr 0x15093ac08 'char' <IntegralCast>
          | `-CharacterLiteral 0x15093ab00 'int' 66
          `-ImplicitCastExpr 0x15093ac20 'char' <IntegralCast>
            `-CharacterLiteral 0x15093ab18 'int' 67

The issue here is that those OpaqueValueExprs are not yet bound by MaterializeSequenceExpr/BoundsCheckExpr, so the const evaluator will just try to evaluate their subexpressions each time. That is, we:

1. Wrap arguments in OVEs (they are not bound yet).

2. Build bounds-check condition, which fails since the same CLE is being evaluated twice.

3. Wrap the whole thing in `MaterializeSequenceExpr`.

This feels like a bug imo: ExprEvaluatorBase::VisitOpaqueValueExpr also uses Info.CurrentCall->getCurrentTemporary to get the cached value. If there's a current temporary for the CLE, there also ought to be one for the OVE wrapping it. I don't see why we're crashing when reevaluating the CLE when it's already wrapped in an OVE, MSE or not.

@patrykstefanski
Copy link
Author

Are we sure this isn't something that should be wrapped in OpaqueValueExpr to allow sharing the AST node?

TBH, I'm not sure. This is caused by building a bounds-check in ActOnBoundsSafetyCall:
https://github.com/swiftlang/llvm-project/blob/5bfcbc9b62a9c48206f2c5892ff7845cb9952ee9/clang/lib/Sema/SemaExpr.cpp#L25737
We create the bounds-check condition first, which causes the bug, and then wrap it in BoundsCheckExpr/MaterializeSequenceExpr. If anyone has a better idea how to solve it, please comment.

Earlier in the function we scan all the arguments passed in the function call, and wrap them in OVEs. Is this not done for the compound literal? Or do we somehow peel the OVE off later? Perhaps the explicit cast is related.

The compound literal is wrapped in OVE at this point. For context:

    frame #29: 0x0000000108250c78 clang`clang::Expr::EvaluateAsInt(this=0x000000015093b018, Result=0x000000016fdebdd8, Ctx=0x0000000150830800, AllowSideEffects=SE_NoSideEffects, InConstantContext=false) const at ExprConstant.cpp:17556:10
  * frame #30: 0x00000001069b169c clang`clang::Sema::CheckLogicalOperands(this=0x000000015090c000, LHS=0x000000016fdec558, RHS=0x000000016fdec550, Loc=(ID = 43), Opc=BO_LAnd) at SemaExpr.cpp:16428:20
    frame #31: 0x00000001069af41c clang`clang::Sema::CreateBuiltinBinOp(this=0x000000015090c000, OpLoc=(ID = 43), Opc=BO_LAnd, LHSExpr=0x000000015093afc8, RHSExpr=0x000000015093b018, ForFoldExpression=false) at SemaExpr.cpp:18282:16
    frame #32: 0x00000001069d1bf4 clang`clang::BoundsCheckBuilder::BuildLEChecks(S=0x000000015090c000, Loc=(ID = 43), Bounds=ArrayRef<clang::Expr *> @ 0x000000016fdee840, OVEs=0x000000016fdeeec0) at SemaExpr.cpp:26301:30
    frame #33: 0x00000001069cffc4 clang`clang::BoundsCheckBuilder::Build(this=0x000000016fdeecc0, S=0x000000015090c000, GuardedValue=0x000000015093ad48) at SemaExpr.cpp:26250:37
    frame #34: 0x0000000106982c14 clang`clang::Sema::ActOnBoundsSafetyCall(this=0x000000015090c000, Call=(Value = 5646822728)) at SemaExpr.cpp:25737:30
(lldb) f 30
frame #30: 0x00000001069b169c clang`clang::Sema::CheckLogicalOperands(this=0x000000015090c000, LHS=0x000000016fdec558, RHS=0x000000016fdec550, Loc=(ID = 43), Opc=BO_LAnd) at SemaExpr.cpp:16428:20
   16425            // happened to fold to true/false) then warn.
   16426            // Parens on the RHS are ignored.
   16427            Expr::EvalResult EVResult;
-> 16428            if (RHS.get()->EvaluateAsInt(EVResult, Context)) {
   16429              llvm::APSInt Result = EVResult.Val.getInt();
   16430              if ((getLangOpts().CPlusPlus && !RHS.get()->getType()->isBooleanType() &&
   16431                   !RHS.get()->getExprLoc().isMacroID()) ||
(lldb) p RHS.get()->dumpColor()
BinaryOperator 0x15093b018 'int' '<='
|-ImplicitCastExpr 0x15093afe8 'char *' <BoundsSafetyPointerCast>
| `-GetBoundExpr 0x15093ae38 'char *__bidi_indexable' lower
|   `-OpaqueValueExpr 0x15093adf8 'char *__bidi_indexable'
|     `-ImplicitCastExpr 0x15093adb0 'char *__bidi_indexable' <ArrayToPointerDecay>
|       `-CompoundLiteralExpr 0x15093ac38 'char[3]' lvalue
|         `-InitListExpr 0x15093ab98 'char[3]'
|           |-ImplicitCastExpr 0x15093abf0 'char' <IntegralCast>
|           | `-CharacterLiteral 0x15093aae8 'int' 65
|           |-ImplicitCastExpr 0x15093ac08 'char' <IntegralCast>
|           | `-CharacterLiteral 0x15093ab00 'int' 66
|           `-ImplicitCastExpr 0x15093ac20 'char' <IntegralCast>
|             `-CharacterLiteral 0x15093ab18 'int' 67
`-ImplicitCastExpr 0x15093b000 'char *' <BoundsSafetyPointerCast>
  `-OpaqueValueExpr 0x15093adf8 'char *__bidi_indexable'
    `-ImplicitCastExpr 0x15093adb0 'char *__bidi_indexable' <ArrayToPointerDecay>
      `-CompoundLiteralExpr 0x15093ac38 'char[3]' lvalue
        `-InitListExpr 0x15093ab98 'char[3]'
          |-ImplicitCastExpr 0x15093abf0 'char' <IntegralCast>
          | `-CharacterLiteral 0x15093aae8 'int' 65
          |-ImplicitCastExpr 0x15093ac08 'char' <IntegralCast>
          | `-CharacterLiteral 0x15093ab00 'int' 66
          `-ImplicitCastExpr 0x15093ac20 'char' <IntegralCast>
            `-CharacterLiteral 0x15093ab18 'int' 67

The issue here is that those OpaqueValueExprs are not yet bound by MaterializeSequenceExpr/BoundsCheckExpr, so the const evaluator will just try to evaluate their subexpressions each time. That is, we:

1. Wrap arguments in OVEs (they are not bound yet).

2. Build bounds-check condition, which fails since the same CLE is being evaluated twice.

3. Wrap the whole thing in `MaterializeSequenceExpr`.

This feels like a bug imo: ExprEvaluatorBase::VisitOpaqueValueExpr also uses Info.CurrentCall->getCurrentTemporary to get the cached value. If there's a current temporary for the CLE, there also ought to be one for the OVE wrapping it. I don't see why we're crashing when reevaluating the CLE when it's already wrapped in an OVE, MSE or not.

The CLE and OVE use their addresses as keys, so they are different. In the example above, CLE uses 0x15093ac38, while OVE uses 0x15093adf8. ExprEvaluatorBase::VisitOpaqueValueExpr only tries to get the temporary and doesn't initialize it. Since there is no MSE at this point, nobody initializes the temporary for the OVE.

@rapidsna
Copy link

TBF, this constant evaluation is used to diagnose:

  return x && 0; // expected-warning {{use of logical '&&' with constant operand}} \

We could disable this if we are inside of bounds-check condition and call it a day...

This sounds reasonable to me.

@hnrklssn
Copy link

ExprEvaluatorBase::VisitOpaqueValueExpronly tries to get the temporary and doesn't initialize it. Since there is noMSEat this point, nobody initializes the temporary for theOVE`.

Ahh that explains it, I didn't even notice that VisitOpaqueValueExpr doesn't initialize the temp value itself.
Then I agree, we should disable that diagnostic inside the bounds-check. It would be pretty bad UX if we emitted warnings about the logic in auto-generated, hidden bounds checks anyhow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:bounds-safety Issue relating to the experimental -fbounds-safety feature in Clang
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants