Add string search circuit example using lib (#31) #81

cypriansakwa · 2025-07-24T09:06:19Z

Description

This pull request introduces a robust, circuit-friendly wrapper for substring search using the noir_string_search library. The circuit, substring_search, validates input lengths with assertions and then always calls the underlying library function, using selector logic to guarantee correct output for all cases. This approach ensures the circuit is safe for zero-knowledge proof systems by avoiding control flow branching and handling all edge cases arithmetically.

For an empty needle, the circuit returns (true, 0), following standard library behavior.
For a needle longer than the haystack, the circuit returns (false, 0).
For all other cases, it returns whether the substring was found and its index.

Selector logic is used to mask the result of the library call as appropriate, ensuring all outputs are circuit-safe and preventing assertion failures or panics.

Problem

Addresses the need for a practical example demonstrating integration with the noir_string_search library (see noir-examples#31). This implementation demonstrates best practices for integrating external Noir libraries into real-world applications, particularly in the context of zero-knowledge circuits.

Summary

Adds a wrapper circuit that asserts input validity and uses selector logic to determine output.
Always calls the underlying library, but uses selectors to mask outputs for edge cases.
Returns (true, 0) for empty needles (matching standard substring search behavior), and (false, 0) for needles longer than the haystack.
Maintains a public interface of (bool, u32) for clarity and compatibility with the noir_string_search contract.
Ensures all test cases, including edge cases, are handled gracefully with circuit-safe logic.

Additional Context

Selector logic ensures there are no control-flow branches, which is important for circuit determinism and security.
The approach is robust against future changes in the upstream library and can be easily adapted if the library's behavior changes.
All code and tests work without runtime assertion errors, panics, or constant return value warnings.

PR Checklist

I have tested the changes locally.
I have formatted the changes with Prettier and/or cargo fmt on default settings.

cypriansakwa · 2025-10-02T03:32:00Z

Hi @critesjosh and @signorecello 👋,

Just a gentle reminder to please take a moment to review this PR when you get a chance.
Your feedback will be really helpful in moving it forward. 🙏

Thanks a lot for your time and support!

critesjosh · 2025-10-02T19:19:51Z

Just did a quick pass with Claude. do you agree with this analysis?

Issues Found

Critical Logic Error in main.nr (lines 8-10, 25-37)

The documentation comments and implementation are contradictory and incorrect:

Comment says (line 9): "If needle is empty, found=true at index 0"
Implementation does (line 34): Empty needle returns found=true
Library expects: The library is called regardless of edge cases (line 23), then selector logic is applied afterward

Problem: The code calls haystack_body.substring_match(needle_body) on line 23 before checking edge cases. According to proper usage, the library should handle the search internally.
However, the selector logic afterward (lines 25-37) attempts to override the library's result for empty needles and "too long" cases.

Actual behavior with empty needle: An empty needle conventionally matches at position 0, but this depends on the library's implementation. The code returns found=true when
is_empty=true, which may be semantically incorrect for substring search (empty string is typically considered present at every position, not just index 0).

Incorrect README Documentation (lines 58-63)

The README claims:
The circuit only attempts the underlying library search if:

The needle is not empty.
The needle length does not exceed the haystack length.

Otherwise, it returns (false, 0).

This is false. The code at main.nr:23 calls substring_match() unconditionally, then applies selector logic to potentially override the result. The library is always invoked.

Missing Edge Case Tests

README line 69 claims tests cover "Needle longer than haystack" but no such test exists in test_inputs.nr. The tests only cover positive matches.

Semantic Issue: Empty Needle Behavior

Line 9 documents that empty needle returns found=true, but this is questionable. In most string search semantics, an empty needle should either:

Match at every position (standard semantics)
Be considered invalid/error
Match at position 0 only (current behavior, but unusual)

The current implementation chooses the third option without justification.

Redundant Field Conversions

Lines 29-35 use Field arithmetic for selector logic (bool_to_field, line 4-6), but this could be simplified with direct boolean logic since Noir supports conditional expressions.

What's Correct

Nargo.toml properly specifies the dependency with a version tag
Prover.toml has valid test inputs ("hello world" / "world")
The positive test cases that exist are correct
General structure and imports are appropriate
The library types (StringBody256, SubString32) are used correctly

Recommendations

Fix the logic: Either remove the wrapper logic and use the library directly, OR guard the library call to not invoke it for edge cases
Update README: Accurately describe what the code actually does
Add missing tests: Empty needle, needle longer than haystack, needle not found
Clarify semantics: Document why empty needle returns (true, 0) rather than other options
Simplify if possible: Consider if the selector logic is even necessary or if the library handles these cases appropriately

cypriansakwa · 2025-10-02T19:45:02Z

Just did a quick pass with Claude. do you agree with this analysis?

Issues Found

Critical Logic Error in main.nr (lines 8-10, 25-37)

The documentation comments and implementation are contradictory and incorrect:

Comment says (line 9): "If needle is empty, found=true at index 0"

Implementation does (line 34): Empty needle returns found=true

Library expects: The library is called regardless of edge cases (line 23), then selector logic is applied afterward

Problem: The code calls haystack_body.substring_match(needle_body) on line 23 before checking edge cases. According to proper usage, the library should handle the search internally. However, the selector logic afterward (lines 25-37) attempts to override the library's result for empty needles and "too long" cases.

Actual behavior with empty needle: An empty needle conventionally matches at position 0, but this depends on the library's implementation. The code returns found=true when is_empty=true, which may be semantically incorrect for substring search (empty string is typically considered present at every position, not just index 0).

Incorrect README Documentation (lines 58-63)

The README claims: The circuit only attempts the underlying library search if:

The needle is not empty.

The needle length does not exceed the haystack length.

Otherwise, it returns (false, 0).

This is false. The code at main.nr:23 calls substring_match() unconditionally, then applies selector logic to potentially override the result. The library is always invoked.

Missing Edge Case Tests

README line 69 claims tests cover "Needle longer than haystack" but no such test exists in test_inputs.nr. The tests only cover positive matches.

Semantic Issue: Empty Needle Behavior

Line 9 documents that empty needle returns found=true, but this is questionable. In most string search semantics, an empty needle should either:

Match at every position (standard semantics)

Be considered invalid/error

Match at position 0 only (current behavior, but unusual)

The current implementation chooses the third option without justification.

Redundant Field Conversions

Lines 29-35 use Field arithmetic for selector logic (bool_to_field, line 4-6), but this could be simplified with direct boolean logic since Noir supports conditional expressions.

What's Correct

Nargo.toml properly specifies the dependency with a version tag

Prover.toml has valid test inputs ("hello world" / "world")

The positive test cases that exist are correct

General structure and imports are appropriate

The library types (StringBody256, SubString32) are used correctly

Recommendations

Fix the logic: Either remove the wrapper logic and use the library directly, OR guard the library call to not invoke it for edge cases

Update README: Accurately describe what the code actually does

Add missing tests: Empty needle, needle longer than haystack, needle not found

Clarify semantics: Document why empty needle returns (true, 0) rather than other options

Simplify if possible: Consider if the selector logic is even necessary or if the library handles these cases appropriately

Thank you for taking the time to do such a detailed review and for running this through Claude 🙏.
I really appreciate the thorough analysis and the clear breakdown of issues — this helps me see where the implementation and documentation diverged.

I agree with your points on the logic error, README mismatch, missing edge case tests, and the semantic ambiguity around empty needles. I’ll:

Refactor the logic so edge cases are handled cleanly (either by guarding the library call or by removing redundant overrides).
Update the README to accurately reflect the actual behavior.
Add proper test coverage for empty needles, longer-needle-than-haystack, and not-found cases.
Clarify the semantic choice for empty needle behavior in both code and docs.
Simplify the selector logic if Noir’s conditional support makes it cleaner.

I’ll push these corrections in the next update.
Thanks again for the constructive feedback — it definitely makes the project stronger.

…d README, and comprehensive tests

cypriansakwa · 2025-10-06T12:50:12Z

Just did a quick pass with Claude. do you agree with this analysis?

Issues Found

Critical Logic Error in main.nr (lines 8-10, 25-37)

The documentation comments and implementation are contradictory and incorrect:

Comment says (line 9): "If needle is empty, found=true at index 0"

Implementation does (line 34): Empty needle returns found=true

Library expects: The library is called regardless of edge cases (line 23), then selector logic is applied afterward

Problem: The code calls haystack_body.substring_match(needle_body) on line 23 before checking edge cases. According to proper usage, the library should handle the search internally. However, the selector logic afterward (lines 25-37) attempts to override the library's result for empty needles and "too long" cases.
Actual behavior with empty needle: An empty needle conventionally matches at position 0, but this depends on the library's implementation. The code returns found=true when is_empty=true, which may be semantically incorrect for substring search (empty string is typically considered present at every position, not just index 0).

Incorrect README Documentation (lines 58-63)

The README claims: The circuit only attempts the underlying library search if:

The needle is not empty.

The needle length does not exceed the haystack length.

Otherwise, it returns (false, 0).
This is false. The code at main.nr:23 calls substring_match() unconditionally, then applies selector logic to potentially override the result. The library is always invoked.

Missing Edge Case Tests

README line 69 claims tests cover "Needle longer than haystack" but no such test exists in test_inputs.nr. The tests only cover positive matches.

Semantic Issue: Empty Needle Behavior

Line 9 documents that empty needle returns found=true, but this is questionable. In most string search semantics, an empty needle should either:

Match at every position (standard semantics)

Be considered invalid/error

Match at position 0 only (current behavior, but unusual)

The current implementation chooses the third option without justification.

Redundant Field Conversions

Lines 29-35 use Field arithmetic for selector logic (bool_to_field, line 4-6), but this could be simplified with direct boolean logic since Noir supports conditional expressions.

What's Correct

Nargo.toml properly specifies the dependency with a version tag

Prover.toml has valid test inputs ("hello world" / "world")

The positive test cases that exist are correct

General structure and imports are appropriate

The library types (StringBody256, SubString32) are used correctly

Recommendations

Fix the logic: Either remove the wrapper logic and use the library directly, OR guard the library call to not invoke it for edge cases

Update README: Accurately describe what the code actually does

Add missing tests: Empty needle, needle longer than haystack, needle not found

Clarify semantics: Document why empty needle returns (true, 0) rather than other options

Simplify if possible: Consider if the selector logic is even necessary or if the library handles these cases appropriately

Thank you for taking the time to do such a detailed review and for running this through Claude 🙏. I really appreciate the thorough analysis and the clear breakdown of issues — this helps me see where the implementation and documentation diverged.

I agree with your points on the logic error, README mismatch, missing edge case tests, and the semantic ambiguity around empty needles. I’ll:

Refactor the logic so edge cases are handled cleanly (either by guarding the library call or by removing redundant overrides).

Update the README to accurately reflect the actual behavior.

Add proper test coverage for empty needles, longer-needle-than-haystack, and not-found cases.

Clarify the semantic choice for empty needle behavior in both code and docs.

Simplify the selector logic if Noir’s conditional support makes it cleaner.

I’ll push these corrections in the next update. Thanks again for the constructive feedback — it definitely makes the project stronger.

Hi @critesjosh

Thank you very much for your thoughtful review and valuable feedback on this project.
I truly appreciate the time and attention you dedicated to examining both the implementation and documentation.

Following your recommendations, I have carefully revised the project to ensure it aligns fully with Noir best practices and circuit safety standards. Specifically:

The main.nr file now includes explicit input validation and safe handling of edge cases (empty or oversized substrings) before invoking the noir_string_search library.
The logic now avoids panics or assertion failures, ensuring deterministic and verifiable circuit behavior.
The project README has been expanded with clear usage instructions, detailed feature explanations, and comprehensive test examples to enhance clarity and reproducibility.

These updates make the circuit more robust, compliant with your feedback, and better suited for both educational and practical zero-knowledge use cases.

Thank you once again for your insightful feedback and guidance — it has been instrumental in refining this work.
Please let me know if there are any further adjustments or clarifications you would like to see.

…tory

initial

4e8712e

cypriansakwa marked this pull request as draft July 24, 2025 09:06

cypriansakwa added 2 commits July 24, 2025 12:11

Add safe string search example using noir_string_search (noir-lang#31)

1f78432

Update main.nr and add test_inputs.nr

34055ab

cypriansakwa marked this pull request as ready for review July 24, 2025 13:33

Refine substring search circuit: add safe edge-case handling, improve…

c98bb1d

…d README, and comprehensive tests

cypriansakwa added 2 commits October 6, 2025 16:32

Move string_search_example project files into string_search/ subdirec…

6949ae8

…tory

Add GitHub Actions workflow for string_search_example tests

387d30b

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add string search circuit example using lib (#31) #81

Add string search circuit example using lib (#31) #81

Uh oh!

cypriansakwa commented Jul 24, 2025 •

edited

Loading

Uh oh!

cypriansakwa commented Oct 2, 2025

Uh oh!

critesjosh commented Oct 2, 2025

Uh oh!

cypriansakwa commented Oct 2, 2025

Issues Found

What's Correct

Recommendations

Uh oh!

cypriansakwa commented Oct 6, 2025 •

edited

Loading

Issues Found

What's Correct

Recommendations

Uh oh!

Uh oh!

Add string search circuit example using lib (#31) #81

Are you sure you want to change the base?

Add string search circuit example using lib (#31) #81

Uh oh!

Conversation

cypriansakwa commented Jul 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Problem

Summary

Additional Context

PR Checklist

Uh oh!

cypriansakwa commented Oct 2, 2025

Uh oh!

critesjosh commented Oct 2, 2025

Issues Found

What's Correct

Recommendations

Uh oh!

cypriansakwa commented Oct 2, 2025

Issues Found

What's Correct

Recommendations

Uh oh!

cypriansakwa commented Oct 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Issues Found

What's Correct

Recommendations

Uh oh!

Uh oh!

cypriansakwa commented Jul 24, 2025 •

edited

Loading

cypriansakwa commented Oct 6, 2025 •

edited

Loading