-
Notifications
You must be signed in to change notification settings - Fork 0
8702: fix: strict parsing of decimals with zero scale #5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughRefactors decimal parsing by replacing a local iterator variable with an explicit iterator and updating loop traversals. Adds conditional logic to handle scale-0 decimal parsing: when fractional digits exist and input ends with ".0", the zero is folded into the integer part; otherwise a parse error is returned. Extends test coverage with overflow and zero-scale scenarios. Changes
Sequence Diagram(s)sequenceDiagram
participant Parser as Decimal Parser
participant Logic as Scale-0 Logic
participant Result as Parse Result
Parser->>Logic: is_e_notation == false AND scale == 0?
alt Scale is 0 and fractional digits exist
Logic->>Logic: Check if input ends with ".0"
alt Input ends with ".0" (single fractional digit)
Logic->>Logic: Fold zero into integer part
Logic->>Logic: Clear fractional state
Logic->>Result: Return success
else Other fractional digits present
Logic->>Result: Return parse error<br/>(cannot parse with scale 0)
end
else Scale is not 0 or no fractional digits
Logic->>Result: Continue standard parsing
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes The changes involve careful logic modifications to decimal parsing with conditional branching, iterator pattern adjustments requiring verification across multiple code paths, and meaningful test coverage changes. Review demands understanding of decimal number parsing semantics and validation of the scale-0 edge case handling. Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review completed. No suggestions at this time.
Pull Request ReviewSummaryThis PR fixes strict parsing of decimals with zero scale by adding validation to reject fractional values (except .0) when scale=0. The implementation is sound with good test coverage. Code Quality and Best Practices ✅Positive aspects:
Implementation Analysis ✅The core logic (lines 966-979): The code checks if scale is 0 but fractional digits were parsed, and handles the special case of .0 suffix appropriately. Analysis:
Edge Cases and Testing ✅Logic Verification:
Excellent test coverage added:
Suggested additional tests (not blockers):
Performance and Security ✅Performance: Minimal O(1) overhead, no allocations, efficient operations Security: No issues - strengthens input validation, uses safe Rust, no panics RecommendationsMedium Priority:
Low Priority:
Final Verdict: ✅ LGTM with minor suggestionsThis is a well-implemented fix that solves the problem correctly, maintains backward compatibility, has comprehensive test coverage, and follows existing patterns. Ready to merge. File: arrow-cast/src/parse.rs:966-979 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
arrow-cast/src/parse.rs (1)
898-914: Fix E-notation scale-0 bypass by applying same strictness as non-E-notation pathThe review comment correctly identifies an inconsistency. The non-E-notation path enforces scale-0 strictness (rejecting inputs like "1.2" but accepting "1.0") at lines 966-978, but the E-notation path bypasses this check entirely at lines 901 and 943. The
parse_e_notationfunction lacks scale-0 validation, allowing "1.2e0" with scale 0 to parse when it should error.Existing tests confirm the gap: scale-0 cases like
("1.234e16", "12340000000000000", 0)only test inputs without fractional parts before 'e'. No tests cover fractional E-notation with scale 0 (e.g., "1.2e0", "1.0e0").Recommend Option B (allow only
.0e...), as it mirrors the existing non-E-notation rule which accepts "1.0" with scale 0. Add the pre-validation check before bothparse_e_notationcalls (lines 901 and 943), and include test cases for "1.2e0", "1.0e0", and "1.0e-1" with scale 0 to encode the intended behavior.
🧹 Nitpick comments (3)
arrow-cast/src/parse.rs (3)
966-979: Good: strict scale-0 handling for non‑E decimals; tiny safety nit.Logic to fold a trailing ".0" and error otherwise is correct and aligns with the stated intent. Consider a minor safety/readability tweak to avoid any chance of underflow on u8:
- digits -= 1; + digits = digits.saturating_sub(1);
2771-2776: Zero-scale acceptance tests look good; add a few more edges.Consider also asserting these cases:
- ".0" => 0
- "0.0" => 0
- "-1.0" => -1
This tightens coverage around signs and missing integer parts.
2778-2789: Great rejection set; add E-notation counterparts to lock semantics.Add cases to make the E-notation behavior explicit under scale 0:
- Reject: "1.2e0", "1.00e0", "1.0e-1"
- Decide on: "1.0e0" and "1.0e10" (accept or reject per chosen policy in code)
Example test additions:
@@ - let zero_scale_err = [ + let zero_scale_err = [ "1.2", "1.00", "1.23", "1.000", "1.123", "123.4", "123.00", "123.45", + "1.2e0", "1.00e0", "1.0e-1", ];If you choose Option A (strict disallow), also add to zero_scale_err: "1.0e0", "1.0e10".
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
arrow-cast/src/parse.rs(5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: MIRI
- GitHub Check: Rustdocs are clean
- GitHub Check: Test
- GitHub Check: Check Compilation
- GitHub Check: Test on Mac
- GitHub Check: Pyarrow C Data Interface (stable, 16)
🔇 Additional comments (2)
arrow-cast/src/parse.rs (2)
878-885: Iterator refactor reads clean and preserves indices.Using enumerate().skip(signed as usize) keeps indices aligned to the original byte slice (including the sign) and works with downstream uses (e.g., point_index). Looks good.
2655-2655: Nice overflow coverage addition.Adding "12345678900.0" to overflow cases (scale 0, precision 10) is a useful regression guard, especially after folding ".0".
|
Findings
|
8702: To review by AI
Summary by CodeRabbit
Bug Fixes
Tests