Skip to content

Swift: mass enable diff-informed data flow #19662

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

d10c
Copy link
Contributor

@d10c d10c commented Jun 3, 2025

An auto-generated patch that enables diff-informed data flow in the obvious cases.

Builds on #18343 and https://github.com/github/codeql-patch/pull/88

@d10c d10c marked this pull request as ready for review June 4, 2025 12:18
@Copilot Copilot AI review requested due to automatic review settings June 4, 2025 12:18
@d10c d10c requested a review from a team as a code owner June 4, 2025 12:18
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Enables diff-informed incremental data flow analysis across numerous Swift security queries by adding the required predicate and updates the change notes to reflect this new capability.

  • Added observeDiffInformedIncrementalMode() predicate to all relevant ConfigSig and StateConfigSig modules to opt into diff-informed mode.
  • Updated 2025-06-04-diff-informed.md change notes to document the new feature.

Reviewed Changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
swift/ql/lib/codeql/swift/security/XXEQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/WeakSensitiveDataHashingQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/WeakPasswordHashingQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/UnsafeUnpackQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/UnsafeJsEvalQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/UncontrolledFormatStringQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/StringLengthConflationQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/StaticInitializationVectorQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/SqlInjectionQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/PredicateInjectionQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/PathInjectionQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/InsufficientHashIterationsQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/HardcodedEncryptionKeyQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/ECBEncryptionQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/ConstantSaltQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/ConstantPasswordQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/CommandInjectionQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/CleartextTransmissionQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/codeql/swift/security/CleartextLoggingQuery.qll Added diff-informed incremental mode predicate
swift/ql/lib/change-notes/2025-06-04-diff-informed.md Documented that built-in Swift queries can now run in diff-informed mode
Comments suppressed due to low confidence (2)

swift/ql/lib/change-notes/2025-06-04-diff-informed.md:4

  • [nitpick] This note is concise but could benefit from listing the specific modules or providing a link to detailed docs so users know exactly which queries are affected.
* A number of built-in Swift queries can now run in diff-informed mode.

swift/ql/lib/codeql/swift/security/WeakSensitiveDataHashingQuery.qll:42

  • There are no accompanying tests to verify that diff-informed incremental mode behaves as expected. Consider adding unit or integration tests to cover this new predicate in incremental runs.
predicate observeDiffInformedIncrementalMode() { any() }

@@ -25,6 +25,8 @@ module CleartextLoggingConfig implements DataFlow::ConfigSig {
predicate isAdditionalFlowStep(DataFlow::Node n1, DataFlow::Node n2) {
any(CleartextLoggingAdditionalFlowStep s).step(n1, n2)
}

predicate observeDiffInformedIncrementalMode() { any() }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean?

(I wrote many of these queries, I'm sure I can figure out if it's appropriate)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line enables diff-informed mode, i.e. only showing results that are in a diff range. There is currently some internal documentation here, not sure if it's documented publicly yet.

geoffw0
geoffw0 previously approved these changes Jun 5, 2025
Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy with all of these. 👍

The only Swift queries that don't select dataflow sources and sinks in the usual way are the regular expression queries, which you haven't tagged (apart from regex injection, which is indeed "simple" and you have correctly tagged it).

The other regex queries use three(?) data flow configurations to track the flow of regular expression strings, regular expression objects and regular expression configuration flag, putting things together at the site of the regex eval. We probably could make those work as well with a bit of effort (I don't recommend prioritizing it).

@@ -21,8 +21,6 @@ module InsecureTlsConfig implements DataFlow::ConfigSig {
predicate isAdditionalFlowStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
any(InsecureTlsExtensionsAdditionalFlowStep s).step(nodeFrom, nodeTo)
}

predicate observeDiffInformedIncrementalMode() { any() }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This surprises me. This library is only used by the swift/insecure-tls query, which is an entirely straightforward dataflow query, so I would expect this to work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the select clause in the corresponding .ql file, it seems only the sink node is used as a location source, which requires an override of getASelectedSourceLocation() { none() }. The patch script currently misses these cases, which would explain why this one slipped through. To properly check the correctness of these configs, we need to look at the corresponding select, which is an awkward code review flow. Another mystery is why only this query failed the diff-informed consistency check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. It would also be possible to add the source as a $@ parameter to make this a normal query.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P.S. this is the run with the failing InsecureTLS diff-informed consistency check (Wrongly omitted: | :0:0:0:0; strange location): https://github.com/github/semmle-code/actions/runs/15425113807/job/43409978713

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out the problem wasn't about selecting the source or not since we've modified the diff-informed tests to not care about omitted results. The problem was that the test selects an element without a location, leading to errors like this:

Wrongly omitted: | :0:0:0:0 | [post] self | This TLS configuration is insecure. |

That problem is orthogonal to diff-informed queries, but it's probably easiest to leave this query not diff-informed until the problem is fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. FYI fixing that particular issue will be very low priority for the foreseeable future, unless it's affecting customers. I think we should just press on and get everything else diff-informed.

@d10c
Copy link
Contributor Author

d10c commented Jun 5, 2025

It turns out that some of the generated changes in the PRs were not correct, e.g. because they should have also generated a getASelected{Source,Sink}Location() override but didn't (see Chuan-kai's comment here). So for now I'm putting them back in Draft until I make sure (via the patch script) that we are correctly handling all 3 documented query patterns, starting with the simplest one (both source and sink are used as location sources). Thanks for the review and stay tuned for an update as to what has changed in the meantime!

@d10c d10c marked this pull request as draft June 5, 2025 16:02
@geoffw0
Copy link
Contributor

geoffw0 commented Jun 9, 2025

Happy to re-review when this is ready.

@d10c d10c force-pushed the d10c/swift/diff-informed branch from 72da199 to 02341c0 Compare June 10, 2025 13:02
@d10c
Copy link
Contributor Author

d10c commented Jun 10, 2025

Update: no changes since last time I opened the PR. It turns out that it's sound (but not optimally performant) to leave getASelected{Source,Sink}Location() un-overridden, specifically in case of a select clause containing only one of source or sink but not both. The patch script currently does not differentiate between that case and the one in which both source and sink are present in the select clause. So I will re-open these PRs as they are, and generate an appropriate getASelected{Source,Sink}Location() override in a follow-up round of PRs.

@d10c d10c marked this pull request as ready for review June 10, 2025 15:13
geoffw0
geoffw0 previously approved these changes Jun 10, 2025
Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for doing this.

@d10c d10c added the no-change-note-required This PR does not need a change note label Jun 11, 2025
d10c added 2 commits June 11, 2025 18:34
An auto-generated patch that enables diff-informed data flow in the obvious cases.

Builds on github#18343 and github/codeql-patch#88
@d10c d10c force-pushed the d10c/swift/diff-informed branch from 27dce51 to 2078a34 Compare June 11, 2025 16:40
@d10c
Copy link
Contributor Author

d10c commented Jun 11, 2025

Sorry for the review invalidation; I removed the change note commit since diff-informed queries are not publicly documented yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no-change-note-required This PR does not need a change note Swift
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants