-
Notifications
You must be signed in to change notification settings - Fork 82
feat(clp-s): Implement part of the parsing side of the new timestamp_parser
utility.
#1385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…several compilation errors.
WalkthroughAdds a new clp_s::timestamp_parser library (CMake target, headers, implementation, and error codes), unit tests, and build-option wiring; implements parse_timestamp to parse patterned timestamps into epoch nanoseconds and return explicit error codes. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Test as Unit Tests
participant Parser as clp_s::timestamp_parser::parse_timestamp
participant Helpers as Internal helpers
participant DateLib as date::date / chrono
participant Err as ystdlib::error_handling
Test->>Parser: parse_timestamp(timestamp, pattern, generated_pattern)
Parser->>Helpers: scan pattern, handle escapes, match literals/specifiers
alt Unsupported specifier
Parser-->>Err: return FormatSpecifierNotImplemented
Parser-->>Test: Result<Error>
else Parsed components
Parser->>DateLib: validate year/month/day & compute weekday (if provided)
alt Invalid date or mismatch
Parser-->>Err: return InvalidDate / IncompatibleTimestampPattern / InvalidTimestampPattern
Parser-->>Test: Result<Error>
else Valid date/time
Parser->>DateLib: compute epoch nanoseconds
Parser-->>Test: Result<Success>(epochtime_ns, used_pattern_view)
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: ASSERTIVE Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used📓 Path-based instructions (1)**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}⚙️ CodeRabbit configuration file
Files:
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
🔇 Additional comments (8)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (9)
components/core/CMakeLists.txt
(2 hunks)components/core/cmake/Options/options.cmake
(3 hunks)components/core/src/clp_s/CMakeLists.txt
(1 hunks)components/core/src/clp_s/timestamp_parser/CMakeLists.txt
(1 hunks)components/core/src/clp_s/timestamp_parser/ErrorCode.cpp
(1 hunks)components/core/src/clp_s/timestamp_parser/ErrorCode.hpp
(1 hunks)components/core/src/clp_s/timestamp_parser/TimestampParser.cpp
(1 hunks)components/core/src/clp_s/timestamp_parser/TimestampParser.hpp
(1 hunks)components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer
false == <expression>
rather than!<expression>
.
Files:
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
components/core/src/clp_s/timestamp_parser/ErrorCode.cpp
components/core/src/clp_s/timestamp_parser/TimestampParser.hpp
components/core/src/clp_s/timestamp_parser/ErrorCode.hpp
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp
🧠 Learnings (3)
📚 Learning: 2024-11-26T19:12:09.102Z
Learnt from: LinZhihao-723
PR: y-scope/clp#486
File: components/core/src/clp/error_handling/ErrorCode.hpp:131-134
Timestamp: 2024-11-26T19:12:09.102Z
Learning: In the error handling system (`components/core/src/clp/error_handling/ErrorCode.hpp`), the `ErrorCode` class is intentionally designed to wrap the `ErrorCodeEnum`, not to use the enum directly. This wrapper provides a general interface for adding customized error enums.
Applied to files:
components/core/src/clp_s/timestamp_parser/ErrorCode.cpp
components/core/src/clp_s/timestamp_parser/ErrorCode.hpp
📚 Learning: 2024-11-26T19:16:19.933Z
Learnt from: LinZhihao-723
PR: y-scope/clp#486
File: components/core/tests/test-error_handling.cpp:44-91
Timestamp: 2024-11-26T19:16:19.933Z
Learning: In `components/core/tests/test-error_handling.cpp` (C++), when specializing template methods of `ErrorCategory` via type aliases (e.g., `AlwaysSuccessErrorCategory`), it's acceptable to define these specializations in the current namespace, even though the primary template is in the `clp::error_handling` namespace, due to the use of type aliases and template instantiation.
Applied to files:
components/core/src/clp_s/timestamp_parser/ErrorCode.cpp
📚 Learning: 2024-11-26T19:10:05.051Z
Learnt from: LinZhihao-723
PR: y-scope/clp#486
File: components/core/src/clp/error_handling/ErrorCode.hpp:34-34
Timestamp: 2024-11-26T19:10:05.051Z
Learning: In the `ErrorCategory` template class in `ErrorCode.hpp`, the `name()` and `message(ErrorCodeEnum)` methods should not be declared as pure virtual functions because the class relies on template specializations, and making them pure virtual would make the class abstract, which is not intended.
Applied to files:
components/core/src/clp_s/timestamp_parser/ErrorCode.cpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: manylinux_2_28-x86_64-static-linked-bins
- GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
- GitHub Check: package-image
- GitHub Check: musllinux_1_2-x86_64-static-linked-bins
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: rust-checks (ubuntu-24.04)
- GitHub Check: rust-checks (ubuntu-22.04)
- GitHub Check: build-macos (macos-15, true)
- GitHub Check: build-macos (macos-14, false)
- GitHub Check: build-macos (macos-15, false)
- GitHub Check: build-macos (macos-14, true)
- GitHub Check: lint-check (ubuntu-24.04)
- GitHub Check: lint-check (macos-15)
🔇 Additional comments (5)
components/core/src/clp_s/CMakeLists.txt (1)
3-3
: Looks good.Including the subdirectory keeps the option local, and the inner CMake handles the OFF case.
components/core/src/clp_s/timestamp_parser/ErrorCode.hpp (1)
1-21
: Header is in good shape.Enum values and alias line up with the new parser errors; no gaps spotted.
components/core/src/clp_s/timestamp_parser/ErrorCode.cpp (1)
1-31
: Implementation matches the spec.Error-category name and messages cover every enum path; default guard is appreciated.
components/core/CMakeLists.txt (1)
754-754
: Guard the new link dependency.When
CLP_BUILD_CLP_S_TIMESTAMP_PARSER
is OFF no targetclp_s::timestamp_parser
exists, so this line breaks the configure step. Either add a conditional link or make the test dependency check enforce the option. The latter is discussed in options.cmake; once that’s in place this line is fine. Please wire both changes together.⛔ Skipped due to learnings
Learnt from: Bill-hbrhbr PR: y-scope/clp#1156 File: components/core/CMakeLists.txt:772-772 Timestamp: 2025-08-09T04:07:27.083Z Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.
components/core/src/clp_s/timestamp_parser/CMakeLists.txt (1)
1-25
: Consider reshaping the header dependency.Listing
../Defs.hpp
as a “source” pulls it into IDE projects, but it is still owned by the timestamp_pattern target. If someone disablesCLP_BUILD_CLP_S_TIMESTAMPPATTERN
, the file still exists yet exposes the parser to unwanted coupling. Prefer including it transitively via headers instead of compiling it. That aligns dependency lines with the Option validation suggested earlier.⛔ Skipped due to learnings
Learnt from: Bill-hbrhbr PR: y-scope/clp#1156 File: components/core/CMakeLists.txt:772-772 Timestamp: 2025-08-09T04:07:27.083Z Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.
option( | ||
CLP_BUILD_CLP_S_TIMESTAMP_PARSER | ||
"Build clp_s::timestamp_parser" | ||
ON | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick | 🔵 Trivial
Add trailing period for option description.
All other option descriptions end in a period. Matching that pattern keeps help text tidy.
🤖 Prompt for AI Agents
In components/core/cmake/Options/options.cmake around lines 86 to 90, the option
description string "Build clp_s::timestamp_parser" is missing a trailing period;
update the description to end with a period so it matches the punctuation style
of other options (e.g., "Build clp_s::timestamp_parser.").
function(validate_clp_s_timestamp_parser_dependencies) | ||
validate_clp_dependencies_for_target(CLP_BUILD_CLP_S_TIMESTAMP_PARSER | ||
CLP_BUILD_CLP_STRING_UTILS | ||
) | ||
endfunction() | ||
|
||
function(set_clp_s_timestamp_parser_dependencies) | ||
set_clp_need_flags( | ||
CLP_NEED_DATE | ||
CLP_NEED_YSTDLIB | ||
) | ||
endfunction() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing dependency on CLP_BUILD_CLP_S_TIMESTAMP_PATTERN.
The parser implementation pulls in ../Defs.hpp
, which in turn depends on the timestamp pattern library. Please require CLP_BUILD_CLP_S_TIMESTAMPPATTERN
here to avoid surprising build breaks when timestamp_pattern is disabled.
function(validate_clp_s_timestamp_parser_dependencies)
validate_clp_dependencies_for_target(CLP_BUILD_CLP_S_TIMESTAMP_PARSER
CLP_BUILD_CLP_STRING_UTILS
+ CLP_BUILD_CLP_S_TIMESTAMPPATTERN
)
endfunction()
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In components/core/cmake/Options/options.cmake around lines 368–379, the
timestamp parser functions do not require the timestamp pattern library; add
CLP_BUILD_CLP_S_TIMESTAMP_PATTERN to the validate_clp_dependencies_for_target
call in validate_clp_s_timestamp_parser_dependencies and also add the
corresponding need flag (e.g. CLP_NEED_TIMESTAMP_PATTERN) to the
set_clp_need_flags list in set_clp_s_timestamp_parser_dependencies so the
timestamp_pattern library is enabled when the parser is built.
if (CLP_BUILD_CLP_S_TIMESTAMP_PARSER) | ||
validate_clp_s_timestamp_parser_dependencies() | ||
set_clp_s_timestamp_parser_dependencies() | ||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests need the timestamp parser option enabled.
unitTest
links against clp_s::timestamp_parser
. If someone disables CLP_BUILD_CLP_S_TIMESTAMP_PARSER
but leaves testing on, CMake will fail during configure. Please add CLP_BUILD_CLP_S_TIMESTAMP_PARSER
to validate_clp_tests_dependencies
so the option gate stays consistent.
function(validate_clp_tests_dependencies)
validate_clp_dependencies_for_target(CLP_BUILD_TESTING
CLP_BUILD_CLP_REGEX_UTILS
CLP_BUILD_CLP_STRING_UTILS
CLP_BUILD_CLP_S_SEARCH_AST
CLP_BUILD_CLP_S_SEARCH_KQL
CLP_BUILD_CLP_S_SEARCH_SQL
CLP_BUILD_CLP_S_TIMESTAMPPATTERN
+ CLP_BUILD_CLP_S_TIMESTAMP_PARSER
)
endfunction()
🤖 Prompt for AI Agents
In components/core/cmake/Options/options.cmake around lines 465-468, tests
require the timestamp parser option but the validation function for tests
doesn't enforce CLP_BUILD_CLP_S_TIMESTAMP_PARSER; update
validate_clp_tests_dependencies to include a check that
CLP_BUILD_CLP_S_TIMESTAMP_PARSER is ON (or call the existing
validate_clp_s_timestamp_parser_dependencies) so that when tests are enabled the
timestamp parser option is required, and return a clear error message if it's
disabled.
src/clp_s/TimestampDictionaryWriter.hpp | ||
src/clp_s/TimestampEntry.cpp | ||
src/clp_s/TimestampEntry.hpp | ||
src/clp_s/timestamp_parser/test/test_TimestampParser.cpp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Protect test sources behind the build option.
If CLP_BUILD_CLP_S_TIMESTAMP_PARSER
is OFF, this source file should not be compiled. Consider wrapping the addition with if(CLP_BUILD_CLP_S_TIMESTAMP_PARSER)
to avoid missing-file errors when the feature is disabled.
set(SOURCE_FILES_clp_s_unitTest
...
- src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
+ $<$<BOOL:${CLP_BUILD_CLP_S_TIMESTAMP_PARSER}>:src/clp_s/timestamp_parser/test/test_TimestampParser.cpp>
...
)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
src/clp_s/timestamp_parser/test/test_TimestampParser.cpp | |
set(SOURCE_FILES_clp_s_unitTest | |
... | |
$<$<BOOL:${CLP_BUILD_CLP_S_TIMESTAMP_PARSER}>:src/clp_s/timestamp_parser/test/test_TimestampParser.cpp> | |
... | |
) |
🤖 Prompt for AI Agents
In components/core/CMakeLists.txt around line 409, the test source
src/clp_s/timestamp_parser/test/test_TimestampParser.cpp is unconditionally
listed which causes missing-file errors when CLP_BUILD_CLP_S_TIMESTAMP_PARSER is
OFF; wrap the addition of this source (or the whole target that references it)
in an if(CLP_BUILD_CLP_S_TIMESTAMP_PARSER) ... endif() block so the file is only
appended/added when the feature is enabled, or alternatively conditionally call
target_sources/add_executable only when that CMake option is TRUE.
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
Show resolved
Hide resolved
* We support the following format specifiers: | ||
* | ||
* - \y Zero-padded year in century (69-99 -> 1969-1999, 00-68 -> 2000-2068). | ||
* - \Y Zero-padded year (0000-9999). | ||
* - \B Full month name (e.g., January). | ||
* - \b Abbreviated month name (e.g., Jan). | ||
* - \m Zero-padded month (01-12). | ||
* - \d Zero-padded day in month (01-31). | ||
* - \e Space-padded day in month( 1-31). | ||
* - \a Abbreviated day in week (e.g., Mon). | ||
* - \p Part of day (AM/PM). | ||
* - \H 24-hour clock, zero-padded hour (00-23). | ||
* - \k 24-hour clock, space-padded hour ( 0-23). | ||
* - \I 12-hour clock, zero-padded hour (01-12). | ||
* - \l 12-hour clock, space-padded hour ( 1-12). | ||
* - \M Zero-padded minute (00-59). | ||
* - \S Zero-padded second (00-60) (60 for leap seconds). | ||
* - \3 Zero-padded millisecond (000-999). | ||
* - \6 Zero-padded microsecond (000000-999999). | ||
* - \9 Zero-padded nanosecond (000000000-999999999). | ||
* - \T Zero-padded fractional second, up to nanoseconds, without trailing zeroes. | ||
* - \E Epoch seconds. | ||
* - \L Epoch miLliseconds. | ||
* - \C Epoch miCroseconds. | ||
* - \N Epoch Nanoseconds. | ||
* - \z{...} Specific timezone, described by content between {}. | ||
* - \\ Literal backslash. | ||
* | ||
* We also support the following CAT sequences: | ||
* | ||
* - \Z Generic timezone -- resolves to literal content, and potentially \z{...}. | ||
* - \? Generic fractional second -- resolves to \3, \6, \9, or \T. | ||
* - \P Unknown-precision epoch time -- resolves to \E, \L, \C, or \N based on heuristic. | ||
* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Align the documented specifier list with the implemented behaviour.
The comment advertises support for \p
, \H
, \k
, \I
, \l
, \M
, \S
, \3
, \6
, \9
, \T
, \E
, \L
, \C
, \N
, \z{…}
, \Z
, \?
, and \P
, but the implementation presently returns FormatSpecifierNotImplemented
for each of them. Either trim the list to what is truly supported in this PR or implement the missing cases before shipping; otherwise, callers relying on the documented contract will hit runtime failures.
🤖 Prompt for AI Agents
components/core/src/clp_s/timestamp_parser/TimestampParser.hpp lines 31-64: the
comment block lists many specifiers that are not implemented and currently cause
FormatSpecifierNotImplemented at runtime; update the documentation to match
implemented behavior by removing or marking as unsupported the specifiers that
return FormatSpecifierNotImplemented (specifically \p, \H, \k, \I, \l, \M, \S,
\3, \6, \9, \T, \E, \L, \C, \N, \z{...}, \Z, \?, \P), or alternatively implement
those cases in the parser before shipping; pick one approach — for this PR, trim
the comment to only the specifiers the code currently handles, add a short TODO
note listing the unimplemented specifiers and reference an issue/PR for future
work, and run/update unit tests to ensure the doc/test matrix matches behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer
false == <expression>
rather than!<expression>
.
Files:
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
- GitHub Check: build-macos (macos-15, false)
- GitHub Check: build-macos (macos-15, true)
- GitHub Check: build-macos (macos-14, true)
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: musllinux_1_2-x86_64-static-linked-bins
- GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: package-image
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
- GitHub Check: manylinux_2_28-x86_64-static-linked-bins
- GitHub Check: lint-check (macos-15)
- GitHub Check: lint-check (ubuntu-24.04)
- GitHub Check: rust-checks (macos-15)
- GitHub Check: rust-checks (ubuntu-22.04)
- GitHub Check: rust-checks (ubuntu-24.04)
🔇 Additional comments (4)
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp (4)
13-31
: LGTM: Helper function declarations are clear.The documentation and signatures are appropriate for the test helpers.
33-45
: LGTM: Specifier validation logic is sound.The trailing literal technique ensures exact consumption, and the implementation correctly follows the
false == <expression>
coding guideline at line 43.
59-107
: The past review comment about inclusive bounds is based on a misunderstanding.The implementation at line 52 uses
i <= end
, which produces a closed range[begin, end]
. Therefore:
- Line 99:
generate_padded_numbers_in_range(1, 12, ...)
does generate months01
through12
(inclusive).- Lines 102 & 105:
generate_padded_numbers_in_range(1, 31, ...)
does generate days01
through31
(inclusive).The confusion arose because the function name suggests half-open
[begin, end)
semantics, but the implementation uses closed[begin, end]
semantics. The current test coverage is correct.
108-124
: LGTM: Day-of-week validation logic is correct.The timestamps align with the Unix epoch (Thursday, January 1, 1970), and the pattern
"\\d \\aa"
correctly validates day-of-week alignment. Line 123 properly follows thefalse == <expression>
coding guideline.
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
Show resolved
Hide resolved
TEST_CASE("timestamp_parser_parse_timestamp", "[clp-s][timestamp-parser]") { | ||
SECTION("Format specifiers accept valid content.") { | ||
auto const two_digit_years{generate_padded_numbers_in_range(0, 99, 2, '0')}; | ||
assert_specifier_accepts_valid_content('y', two_digit_years); | ||
|
||
auto const four_digit_years{generate_padded_numbers_in_range(0, 9999, 4, '0')}; | ||
assert_specifier_accepts_valid_content('Y', four_digit_years); | ||
|
||
std::vector<std::string> const months{ | ||
"January", | ||
"February", | ||
"March", | ||
"April", | ||
"May", | ||
"June", | ||
"July", | ||
"August", | ||
"September", | ||
"October", | ||
"November", | ||
"December" | ||
}; | ||
assert_specifier_accepts_valid_content('B', months); | ||
|
||
std::vector<std::string> const abbreviated_months{ | ||
"Jan", | ||
"Feb", | ||
"Mar", | ||
"Apr", | ||
"May", | ||
"Jun", | ||
"Jul", | ||
"Aug", | ||
"Sep", | ||
"Oct", | ||
"Nov", | ||
"Dec" | ||
}; | ||
assert_specifier_accepts_valid_content('b', abbreviated_months); | ||
|
||
auto const two_digit_months{generate_padded_numbers_in_range(1, 12, 2, '0')}; | ||
assert_specifier_accepts_valid_content('m', two_digit_months); | ||
|
||
auto const two_digit_days{generate_padded_numbers_in_range(1, 31, 2, '0')}; | ||
assert_specifier_accepts_valid_content('d', two_digit_days); | ||
|
||
auto const space_padded_days(generate_padded_numbers_in_range(1, 31, 2, ' ')); | ||
assert_specifier_accepts_valid_content('e', space_padded_days); | ||
|
||
// The parser asserts that the day of the week in the timestamp is actually correct, so we | ||
// need to include some extra date information to have days of the week line up. | ||
std::vector<std::string> abbreviated_day_in_week_timestamps{ | ||
"04 Sun", | ||
"05 Mon", | ||
"06 Tue", | ||
"07 Wed", | ||
"01 Thu", // Thursday January 1st, 1970 | ||
"02 Fri", | ||
"03 Sat" | ||
}; | ||
for (auto const& day_in_week_timestamp : abbreviated_day_in_week_timestamps) { | ||
std::string generated_pattern; | ||
auto const timestamp{fmt::format("{}a", day_in_week_timestamp)}; | ||
auto const result{parse_timestamp(timestamp, "\\d \\aa", generated_pattern)}; | ||
REQUIRE(false == result.has_error()); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick | 🔵 Trivial
Consider adding negative test cases.
The current test suite validates successful parsing of valid inputs. To improve robustness, consider adding test cases that verify expected failures (e.g., invalid month numbers, malformed patterns, mismatched day-of-week).
Example:
SECTION("Format specifiers reject invalid content.") {
std::string generated_pattern;
auto result = parse_timestamp("13a", "\\ma", generated_pattern); // Invalid month
REQUIRE(result.has_error());
result = parse_timestamp("32a", "\\da", generated_pattern); // Invalid day
REQUIRE(result.has_error());
}
🤖 Prompt for AI Agents
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp lines
59-126: add negative test cases that assert parsing fails for invalid inputs;
create a new SECTION("Format specifiers reject invalid content.") and call
parse_timestamp with examples like an out-of-range month ("13a" with pattern
"\ma"), out-of-range day ("32a" with pattern "\da"), malformed patterns, and a
timestamp whose day-of-week doesn't match, then REQUIRE that result.has_error()
is true for each case to ensure the parser correctly reports errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, since all these files are newly added, can u address all clang-tidy violations?
#ifndef CLP_S_TIMESTAMPPARSER_HPP | ||
#define CLP_S_TIMESTAMPPARSER_HPP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#ifndef CLP_S_TIMESTAMPPARSER_HPP | |
#define CLP_S_TIMESTAMPPARSER_HPP | |
#ifndef CLP_S_TIMESTAMP_PARSER_TIMESTAMPPARSER_HPP | |
#define CLP_S_TIMESTAMP_PARSER_TIMESTAMPPARSER_HPP |
Also need to update the comment in #endif
#ifndef CLP_S_TIMESTAMPPARSERERRORCODE_HPP | ||
#define CLP_S_TIMESTAMPPARSERERRORCODE_HPP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, this seems out dated.
* @param pattern A timestamp pattern made up of literals, format specifiers, and potentially CAT | ||
* sequences. | ||
* @param generated_pattern A buffer where a newly-generated timestamp pattern can be written, if | ||
* necessary. | ||
* @return On success: | ||
* - The timestamp in epoch nanoseconds. | ||
* - A string_view of the timestamp pattern that corresponds to the timestamp. | ||
* - Lifetime is least of `pattern` and `generated_pattern`. | ||
* On error: | ||
* - A `clp_s::timestamp_parser::ErrorCode`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't reviewed the content yet. But:
- We don't need to indent when wrapping to a new line (unless we recently udpated the guideline that I wasn't aware of)
- We have a guideline to document result return type. Here's an example that also returns a pair:
* @return A result containing a pair or an error code indicating the failure:
In general the rule for result return is:
A result containing xxxx on success, or an error code indicating the failure:\n List all possible error codes
, or useA void result containing xxx on success
if the return type isvoid
.
Can u go through other docstrings to ensure these are all applied?
* @param value The integer value, returned by reference. | ||
* @return Whether conversion was successful. | ||
*/ | ||
[[nodiscard]] auto | ||
convert_padded_string_to_number(std::string_view str, char padding_character, int& value) -> bool; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should modernize this by returning std::optional<int>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
components/core/src/clp_s/timestamp_parser/TimestampParser.hpp (1)
32-65
: Document only the implemented specifiers or mark unimplemented ones clearly.The documentation lists many specifiers (
\p
,\H
,\k
,\I
,\l
,\M
,\S
,\3
,\6
,\9
,\T
,\E
,\L
,\C
,\N
,\z{...}
,\Z
,\?
,\P
) that returnFormatSpecifierNotImplemented
in the implementation. This creates a mismatch between the documented API surface and actual behaviour, which can mislead users.Option 1 (preferred for this PR): Remove the unimplemented specifiers from this list and add a note:
* We support the following format specifiers: * * - \y Zero-padded year in century (69-99 -> 1969-1999, 00-68 -> 2000-2068). * - \Y Zero-padded year (0000-9999). * - \B Full month name (e.g., January). * - \b Abbreviated month name (e.g., Jan). * - \m Zero-padded month (01-12). * - \d Zero-padded day in month (01-31). * - \e Space-padded day in month( 1-31). * - \a Abbreviated day in week (e.g., Mon). - * - \p Part of day (AM/PM). - * - \H 24-hour clock, zero-padded hour (00-23). - * - \k 24-hour clock, space-padded hour ( 0-23). - * - \I 12-hour clock, zero-padded hour (01-12). - * - \l 12-hour clock, space-padded hour ( 1-12). - * - \M Zero-padded minute (00-59). - * - \S Zero-padded second (00-60) (60 for leap seconds). - * - \3 Zero-padded millisecond (000-999). - * - \6 Zero-padded microsecond (000000-999999). - * - \9 Zero-padded nanosecond (000000000-999999999). - * - \T Zero-padded fractional second, up to nanoseconds, without trailing zeroes. - * - \E Epoch seconds. - * - \L Epoch miLliseconds. - * - \C Epoch miCroseconds. - * - \N Epoch Nanoseconds. - * - \z{...} Specific timezone, described by content between {}. * - \\ Literal backslash. + * + * Additional format specifiers (\p, \H, \k, \I, \l, \M, \S, \3, \6, \9, \T, \E, \L, \C, \N, \z{...}) + * and CAT sequences (\Z, \?, \P) are planned for future implementation. * - * We also support the following CAT sequences: - * - * - \Z Generic timezone -- resolves to literal content, and potentially \z{...}. - * - \? Generic fractional second -- resolves to \3, \6, \9, or \T. - * - \P Unknown-precision epoch time -- resolves to \E, \L, \C, or \N based on a heuristic. - *Option 2 (if unimplemented specifiers must remain): Clearly mark them as unimplemented:
* We support the following format specifiers: * * - \y Zero-padded year in century (69-99 -> 1969-1999, 00-68 -> 2000-2068). * ... * - \a Abbreviated day in week (e.g., Mon). + * + * The following format specifiers are not yet implemented and will return `FormatSpecifierNotImplemented`: + * * - \p Part of day (AM/PM). * ...
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
components/core/src/clp_s/timestamp_parser/ErrorCode.hpp
(1 hunks)components/core/src/clp_s/timestamp_parser/TimestampParser.cpp
(1 hunks)components/core/src/clp_s/timestamp_parser/TimestampParser.hpp
(1 hunks)components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer
false == <expression>
rather than!<expression>
.
Files:
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
components/core/src/clp_s/timestamp_parser/TimestampParser.hpp
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp
components/core/src/clp_s/timestamp_parser/ErrorCode.hpp
🧠 Learnings (1)
📚 Learning: 2024-11-26T19:12:09.102Z
Learnt from: LinZhihao-723
PR: y-scope/clp#486
File: components/core/src/clp/error_handling/ErrorCode.hpp:131-134
Timestamp: 2024-11-26T19:12:09.102Z
Learning: In the error handling system (`components/core/src/clp/error_handling/ErrorCode.hpp`), the `ErrorCode` class is intentionally designed to wrap the `ErrorCodeEnum`, not to use the enum directly. This wrapper provides a general interface for adding customized error enums.
Applied to files:
components/core/src/clp_s/timestamp_parser/ErrorCode.hpp
🔇 Additional comments (28)
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp (8)
1-9
: LGTM! Clean includes and namespace setup.The includes are properly organized and the namespace structure is appropriate for the test file.
12-30
: LGTM! Helper function declarations are well-documented.The docstrings clearly describe the purpose and parameters of the helper functions.
32-44
: LGTM! Test helper follows coding guidelines.The function correctly uses
false == result.has_error()
as per the coding guidelines.
46-55
: LGTM! Range generation function is correctly implemented.The function now properly supports inclusive ranges with the updated precondition
begin <= end
and the loop conditioni <= end
, addressing the previous review feedback.
58-65
: LGTM! Year format specifiers are thoroughly tested.The test cases appropriately cover both 2-digit and 4-digit year formats with comprehensive ranges.
66-96
: LGTM! Month format specifiers are well-tested.The test cases cover full month names, abbreviated names, and numeric formats comprehensively.
107-124
: LGTM! Day-of-week validation is properly tested.The test correctly validates that the parser checks day-of-week alignment with actual dates, using Thursday January 1st, 1970 as the reference point. The test also follows the coding guideline by using
false == result.has_error()
.
98-106
: Inclusive upper bounds are correctly tested
generate_padded_numbers_in_range
usesi <= end
, and test ranges for months (1–12) and days (1–31) include the upper bounds.components/core/src/clp_s/timestamp_parser/ErrorCode.hpp (1)
1-21
: LGTM! Error code definitions follow the established pattern.The error code enum and type alias follow the ystdlib error handling pattern consistently. The enum values are descriptive and cover the expected error scenarios for timestamp parsing. Based on learnings, the wrapper pattern using
ErrorCode = ystdlib::error_handling::ErrorCode<ErrorCodeEnum>
is the intentional design.components/core/src/clp_s/timestamp_parser/TimestampParser.hpp (4)
1-12
: LGTM! Standard header structure.The include guard, includes, and namespace declaration are properly structured.
14-31
: LGTM! The introduction and high-level documentation are clear.The documentation effectively explains the purpose of the parser, the concept of format specifiers vs CAT sequences, and the motivation for CAT sequences.
66-84
: LGTM! The parameter and return documentation is comprehensive.The documentation clearly describes the function parameters, return value structure, and all possible error codes.
85-92
: LGTM! Function signature is well-defined.The use of
[[nodiscard]]
, proper parameter types, and Result return type align with modern C++ best practices.components/core/src/clp_s/timestamp_parser/TimestampParser.cpp (15)
1-17
: LGTM! Includes are appropriate.The standard library headers, third-party dependencies (date, string_utils, ystdlib), and local headers are all properly included.
19-72
: LGTM! Constants are well-defined.The parsing constraints and string literal arrays for days/months are clearly defined with appropriate naming and scope.
73-93
: LGTM! Helper function declarations are properly documented.The docstrings clearly describe the template parameter, parameters, and return values.
112-123
: LGTM! Prefix matching helper is correctly implemented.The function properly iterates through candidates and returns the matching index or an error code.
125-156
: LGTM! Parser initialization and escape handling are correct.The function properly initializes state variables and handles escape sequences. The use of
false == escaped
follows the coding guidelines.
158-185
: LGTM! Two-digit year parsing with century inference is correct.The implementation properly handles the 69-99 → 1969-1999 and 00-68 → 2000-2068 mapping as documented. The bounds checking and error handling are appropriate.
186-208
: LGTM! Four-digit year parsing is correct.The implementation validates the year range and properly handles padding.
209-227
: LGTM! Month name parsing is correct.Both full and abbreviated month name parsing use the helper function appropriately and convert the zero-based index to a one-based month value.
228-250
: LGTM! Numeric month parsing is correct.The implementation properly validates the month range (1-12) with appropriate error handling.
251-296
: LGTM! Day parsing handles both padding types correctly.The implementation properly handles both zero-padded (
\d
) and space-padded (\e
) days with appropriate validation.
297-306
: LGTM! Day-of-week parsing is correct.The implementation stores the parsed day-of-week index for later validation against the computed date.
307-336
: LGTM! Unimplemented specifiers and escape handling are correct.The implementation appropriately returns
FormatSpecifierNotImplemented
for specifiers not yet implemented and properly handles literal backslashes.
339-352
: LGTM! Final validation checks are appropriate.The code correctly validates that both pattern and timestamp are fully consumed and prevents mixing date-based and epoch-number representations.
354-372
: LGTM! Date validation and day-of-week verification are correct.The implementation uses the date library to validate the date and correctly verifies day-of-week alignment when provided. The arithmetic for computing the actual day-of-week index is correct (Sunday = 0).
374-379
: LGTM! Epoch time conversion is correct.The implementation properly constructs a time_point and converts it to epoch nanoseconds using chrono duration_cast.
Description
This PR implements part of the parsing side of a new timestamp utility that will allow us to consistently parse timestamps into a known precision, support timezones, and ease maintenance of timestamp patterns.
This initial PR includes all of the necessary boilerplate for the parsing side (docstring for the parsing function, error codes + error category), implements about half of the planned format specifiers, and adds basic tests to ensure that all of the format specifiers that have been implemented so far can correctly parse the kind of content that they're supposed to accept.
Tests that ensure that whole timestamps are parsed correctly will be added in a future PR.
For more context on the project, the full design doc is available upon request.
Checklist
breaking change.
Validation performed
Summary by CodeRabbit
New Features
Configuration
Tests