-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Swift: mass enable diff-informed data flow #19662
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Enables diff-informed incremental data flow analysis across numerous Swift security queries by adding the required predicate and updates the change notes to reflect this new capability.
- Added
observeDiffInformedIncrementalMode()
predicate to all relevant ConfigSig and StateConfigSig modules to opt into diff-informed mode. - Updated
2025-06-04-diff-informed.md
change notes to document the new feature.
Reviewed Changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description |
---|---|
swift/ql/lib/codeql/swift/security/XXEQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/WeakSensitiveDataHashingQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/WeakPasswordHashingQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/UnsafeUnpackQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/UnsafeJsEvalQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/UncontrolledFormatStringQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/StringLengthConflationQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/StaticInitializationVectorQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/SqlInjectionQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/PredicateInjectionQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/PathInjectionQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/InsufficientHashIterationsQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/HardcodedEncryptionKeyQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/ECBEncryptionQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/ConstantSaltQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/ConstantPasswordQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/CommandInjectionQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/CleartextTransmissionQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/codeql/swift/security/CleartextLoggingQuery.qll | Added diff-informed incremental mode predicate |
swift/ql/lib/change-notes/2025-06-04-diff-informed.md | Documented that built-in Swift queries can now run in diff-informed mode |
Comments suppressed due to low confidence (2)
swift/ql/lib/change-notes/2025-06-04-diff-informed.md:4
- [nitpick] This note is concise but could benefit from listing the specific modules or providing a link to detailed docs so users know exactly which queries are affected.
* A number of built-in Swift queries can now run in diff-informed mode.
swift/ql/lib/codeql/swift/security/WeakSensitiveDataHashingQuery.qll:42
- There are no accompanying tests to verify that diff-informed incremental mode behaves as expected. Consider adding unit or integration tests to cover this new predicate in incremental runs.
predicate observeDiffInformedIncrementalMode() { any() }
@@ -25,6 +25,8 @@ module CleartextLoggingConfig implements DataFlow::ConfigSig { | |||
predicate isAdditionalFlowStep(DataFlow::Node n1, DataFlow::Node n2) { | |||
any(CleartextLoggingAdditionalFlowStep s).step(n1, n2) | |||
} | |||
|
|||
predicate observeDiffInformedIncrementalMode() { any() } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this mean?
(I wrote many of these queries, I'm sure I can figure out if it's appropriate)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line enables diff-informed mode, i.e. only showing results that are in a diff range. There is currently some internal documentation here, not sure if it's documented publicly yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy with all of these. 👍
The only Swift queries that don't select dataflow sources and sinks in the usual way are the regular expression queries, which you haven't tagged (apart from regex injection, which is indeed "simple" and you have correctly tagged it).
The other regex queries use three(?) data flow configurations to track the flow of regular expression strings, regular expression objects and regular expression configuration flag, putting things together at the site of the regex eval. We probably could make those work as well with a bit of effort (I don't recommend prioritizing it).
@@ -21,8 +21,6 @@ module InsecureTlsConfig implements DataFlow::ConfigSig { | |||
predicate isAdditionalFlowStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) { | |||
any(InsecureTlsExtensionsAdditionalFlowStep s).step(nodeFrom, nodeTo) | |||
} | |||
|
|||
predicate observeDiffInformedIncrementalMode() { any() } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This surprises me. This library is only used by the swift/insecure-tls
query, which is an entirely straightforward dataflow query, so I would expect this to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the select clause in the corresponding .ql file, it seems only the sink node is used as a location source, which requires an override of getASelectedSourceLocation() { none() }
. The patch script currently misses these cases, which would explain why this one slipped through. To properly check the correctness of these configs, we need to look at the corresponding select
, which is an awkward code review flow. Another mystery is why only this query failed the diff-informed consistency check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. It would also be possible to add the source as a $@
parameter to make this a normal query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P.S. this is the run with the failing InsecureTLS diff-informed consistency check (Wrongly omitted: | :0:0:0:0
; strange location): https://github.com/github/semmle-code/actions/runs/15425113807/job/43409978713
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out the problem wasn't about selecting the source or not since we've modified the diff-informed tests to not care about omitted results. The problem was that the test selects an element without a location, leading to errors like this:
Wrongly omitted: | :0:0:0:0 | [post] self | This TLS configuration is insecure. |
That problem is orthogonal to diff-informed queries, but it's probably easiest to leave this query not diff-informed until the problem is fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. FYI fixing that particular issue will be very low priority for the foreseeable future, unless it's affecting customers. I think we should just press on and get everything else diff-informed.
It turns out that some of the generated changes in the PRs were not correct, e.g. because they should have also generated a |
Happy to re-review when this is ready. |
72da199
to
02341c0
Compare
Update: no changes since last time I opened the PR. It turns out that it's sound (but not optimally performant) to leave |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for doing this.
An auto-generated patch that enables diff-informed data flow in the obvious cases. Builds on github#18343 and github/codeql-patch#88
27dce51
to
2078a34
Compare
Sorry for the review invalidation; I removed the change note commit since diff-informed queries are not publicly documented yet. |
An auto-generated patch that enables diff-informed data flow in the obvious cases.
Builds on #18343 and https://github.com/github/codeql-patch/pull/88