Skip to content

Conversation

gibber9809
Copy link
Contributor

@gibber9809 gibber9809 commented Oct 6, 2025

Description

This PR implements part of the parsing side of a new timestamp utility that will allow us to consistently parse timestamps into a known precision, support timezones, and ease maintenance of timestamp patterns.

This initial PR includes all of the necessary boilerplate for the parsing side (docstring for the parsing function, error codes + error category), implements about half of the planned format specifiers, and adds basic tests to ensure that all of the format specifiers that have been implemented so far can correctly parse the kind of content that they're supposed to accept.

Tests that ensure that whole timestamps are parsed correctly will be added in a future PR.

For more context on the project, the full design doc is available upon request.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  • Added test case to validate that all of the implemented format specifiers correctly parse the content that they're supposed to.

Summary by CodeRabbit

  • New Features

    • Added timestamp parsing API that converts formatted timestamps to epoch nanoseconds and returns structured error results for malformed or incompatible patterns.
  • Configuration

    • New build option to enable/disable the timestamp parser; enabling validates and configures required build dependencies.
  • Tests

    • Added unit tests covering format specifiers, padding variants and weekday alignment to improve reliability.

@gibber9809 gibber9809 requested review from a team and wraymo as code owners October 6, 2025 17:09
Copy link
Contributor

coderabbitai bot commented Oct 6, 2025

Walkthrough

Adds a new clp_s::timestamp_parser library (CMake target, headers, implementation, and error codes), unit tests, and build-option wiring; implements parse_timestamp to parse patterned timestamps into epoch nanoseconds and return explicit error codes.

Changes

Cohort / File(s) Summary of changes
Build option & dependency wiring
components/core/cmake/Options/options.cmake
Adds CLP_BUILD_CLP_S_TIMESTAMP_PARSER (default ON); validates dependency on CLP_BUILD_CLP_STRING_UTILS; sets dependency flags (CLP_NEED_DATE, CLP_NEED_YSTDLIB) and integrates TIMESTAMP_PARSER into dependency setup.
Top-level CLP_S build inclusion
components/core/src/clp_s/CMakeLists.txt
Adds timestamp_parser/ as a CLP_S subdirectory.
Timestamp parser CMake target
components/core/src/clp_s/timestamp_parser/CMakeLists.txt
Creates library clp_s_timestamp_parser and public alias clp_s::timestamp_parser; sets C++20, public include dirs, and private links to clp::string_utils, date::date, ysdlib::error_handling.
Error code types & implementation
components/core/src/clp_s/timestamp_parser/ErrorCode.hpp, components/core/src/clp_s/timestamp_parser/ErrorCode.cpp
Adds ErrorCodeEnum and ErrorCode alias; marks enum as error-code type; implements error-category name() and message() mapping enum values (InvalidTimestampPattern, IncompatibleTimestampPattern, InvalidDate, FormatSpecifierNotImplemented).
Timestamp parser API & implementation
components/core/src/clp_s/timestamp_parser/TimestampParser.hpp, components/core/src/clp_s/timestamp_parser/TimestampParser.cpp
Declares and implements parse_timestamp(std::string_view, std::string_view, std::string&) -> Result<std::pair<epochtime_t,std::string_view>, ErrorCode>; supports pattern specifiers (y,Y,B,b,m,d,e,a), escaping, component extraction, calendar validation via date lib, epoch-ns computation, and explicit error returns.
Unit tests
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
Adds Catch2 tests exercising specifiers, padded-number generation helpers, and day-of-week alignment checks.
Unit test integration
components/core/CMakeLists.txt
Adds src/clp_s/timestamp_parser/test/test_TimestampParser.cpp to CLP_S unit test sources and links clp_s::timestamp_parser to the test target.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Test as Unit Tests
  participant Parser as clp_s::timestamp_parser::parse_timestamp
  participant Helpers as Internal helpers
  participant DateLib as date::date / chrono
  participant Err as ystdlib::error_handling

  Test->>Parser: parse_timestamp(timestamp, pattern, generated_pattern)
  Parser->>Helpers: scan pattern, handle escapes, match literals/specifiers
  alt Unsupported specifier
    Parser-->>Err: return FormatSpecifierNotImplemented
    Parser-->>Test: Result<Error>
  else Parsed components
    Parser->>DateLib: validate year/month/day & compute weekday (if provided)
    alt Invalid date or mismatch
      Parser-->>Err: return InvalidDate / IncompatibleTimestampPattern / InvalidTimestampPattern
      Parser-->>Test: Result<Error>
    else Valid date/time
      Parser->>DateLib: compute epoch nanoseconds
      Parser-->>Test: Result<Success>(epochtime_ns, used_pattern_view)
    end
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title succinctly and accurately describes the primary change, namely adding part of the parsing functionality for the new timestamp_parser utility within the clp-s component. It follows conventional commit style and clearly conveys the scope of the feature without unnecessary detail or noise.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7d0e335 and 36ad265.

📒 Files selected for processing (1)
  • components/core/src/clp_s/timestamp_parser/TimestampParser.cpp (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/core/src/clp_s/timestamp_parser/TimestampParser.cpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: musllinux_1_2-x86_64-static-linked-bins
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
  • GitHub Check: package-image
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-static-linked-bins
  • GitHub Check: rust-checks (ubuntu-22.04)
  • GitHub Check: rust-checks (ubuntu-24.04)
  • GitHub Check: build-macos (macos-15, false)
  • GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (8)
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp (8)

1-16: LGTM!

The includes are appropriate and well-organized.


18-71: LGTM!

The constants and arrays are well-defined. The two-digit year offset logic (69+ → 1900s, 0-68 → 2000s) follows standard conventions.


73-93: LGTM!

The helper function declarations are well-documented with clear docstrings and appropriate use of [[nodiscard]] attributes.


95-110: LGTM!

The function correctly handles empty strings, strips padding while preserving at least one digit, and converts to an integer. The past review concern about returning false instead of std::nullopt has been addressed.


112-122: LGTM!

The function correctly searches for the first matching prefix and returns the appropriate result or error code.


140-140: Verify the unused number_type_representation variable.

The variable number_type_representation is declared as const bool and always false, making the checks on lines 346 and 350 effectively dead code. Given that the PR description mentions implementing "approximately half of the planned format specifiers," this appears to be intentional placeholder code for future epoch-number parsing support.

Please confirm whether this const variable should remain as a placeholder or if the implementation is incomplete.


143-336: LGTM!

The parsing loop correctly handles escape sequences, literal matching, and all implemented format specifiers. The code consistently follows the coding guideline of using false == <expression> rather than !<expression> (e.g., lines 144, 148, 168, 196, 238, 261, 284). The unimplemented format specifiers appropriately return FormatSpecifierNotImplemented, aligning with the PR's goal of implementing approximately half of the planned specifiers.


339-379: LGTM!

The validation logic is thorough and correct:

  • Trailing content check properly uses false == (line 340) per coding guidelines
  • Date validation uses the date library for correctness (line 355)
  • Day-of-week validation correctly compares the parsed value against the actual weekday derived from the date
  • Conversion to epoch nanoseconds is accurate

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 212200a and c673b7d.

📒 Files selected for processing (9)
  • components/core/CMakeLists.txt (2 hunks)
  • components/core/cmake/Options/options.cmake (3 hunks)
  • components/core/src/clp_s/CMakeLists.txt (1 hunks)
  • components/core/src/clp_s/timestamp_parser/CMakeLists.txt (1 hunks)
  • components/core/src/clp_s/timestamp_parser/ErrorCode.cpp (1 hunks)
  • components/core/src/clp_s/timestamp_parser/ErrorCode.hpp (1 hunks)
  • components/core/src/clp_s/timestamp_parser/TimestampParser.cpp (1 hunks)
  • components/core/src/clp_s/timestamp_parser/TimestampParser.hpp (1 hunks)
  • components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
  • components/core/src/clp_s/timestamp_parser/ErrorCode.cpp
  • components/core/src/clp_s/timestamp_parser/TimestampParser.hpp
  • components/core/src/clp_s/timestamp_parser/ErrorCode.hpp
  • components/core/src/clp_s/timestamp_parser/TimestampParser.cpp
🧠 Learnings (3)
📚 Learning: 2024-11-26T19:12:09.102Z
Learnt from: LinZhihao-723
PR: y-scope/clp#486
File: components/core/src/clp/error_handling/ErrorCode.hpp:131-134
Timestamp: 2024-11-26T19:12:09.102Z
Learning: In the error handling system (`components/core/src/clp/error_handling/ErrorCode.hpp`), the `ErrorCode` class is intentionally designed to wrap the `ErrorCodeEnum`, not to use the enum directly. This wrapper provides a general interface for adding customized error enums.

Applied to files:

  • components/core/src/clp_s/timestamp_parser/ErrorCode.cpp
  • components/core/src/clp_s/timestamp_parser/ErrorCode.hpp
📚 Learning: 2024-11-26T19:16:19.933Z
Learnt from: LinZhihao-723
PR: y-scope/clp#486
File: components/core/tests/test-error_handling.cpp:44-91
Timestamp: 2024-11-26T19:16:19.933Z
Learning: In `components/core/tests/test-error_handling.cpp` (C++), when specializing template methods of `ErrorCategory` via type aliases (e.g., `AlwaysSuccessErrorCategory`), it's acceptable to define these specializations in the current namespace, even though the primary template is in the `clp::error_handling` namespace, due to the use of type aliases and template instantiation.

Applied to files:

  • components/core/src/clp_s/timestamp_parser/ErrorCode.cpp
📚 Learning: 2024-11-26T19:10:05.051Z
Learnt from: LinZhihao-723
PR: y-scope/clp#486
File: components/core/src/clp/error_handling/ErrorCode.hpp:34-34
Timestamp: 2024-11-26T19:10:05.051Z
Learning: In the `ErrorCategory` template class in `ErrorCode.hpp`, the `name()` and `message(ErrorCodeEnum)` methods should not be declared as pure virtual functions because the class relies on template specializations, and making them pure virtual would make the class abstract, which is not intended.

Applied to files:

  • components/core/src/clp_s/timestamp_parser/ErrorCode.cpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-static-linked-bins
  • GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
  • GitHub Check: package-image
  • GitHub Check: musllinux_1_2-x86_64-static-linked-bins
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: rust-checks (ubuntu-24.04)
  • GitHub Check: rust-checks (ubuntu-22.04)
  • GitHub Check: build-macos (macos-15, true)
  • GitHub Check: build-macos (macos-14, false)
  • GitHub Check: build-macos (macos-15, false)
  • GitHub Check: build-macos (macos-14, true)
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: lint-check (macos-15)
🔇 Additional comments (5)
components/core/src/clp_s/CMakeLists.txt (1)

3-3: Looks good.

Including the subdirectory keeps the option local, and the inner CMake handles the OFF case.

components/core/src/clp_s/timestamp_parser/ErrorCode.hpp (1)

1-21: Header is in good shape.

Enum values and alias line up with the new parser errors; no gaps spotted.

components/core/src/clp_s/timestamp_parser/ErrorCode.cpp (1)

1-31: Implementation matches the spec.

Error-category name and messages cover every enum path; default guard is appreciated.

components/core/CMakeLists.txt (1)

754-754: Guard the new link dependency.

When CLP_BUILD_CLP_S_TIMESTAMP_PARSER is OFF no target clp_s::timestamp_parser exists, so this line breaks the configure step. Either add a conditional link or make the test dependency check enforce the option. The latter is discussed in options.cmake; once that’s in place this line is fine. Please wire both changes together.

⛔ Skipped due to learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.
components/core/src/clp_s/timestamp_parser/CMakeLists.txt (1)

1-25: Consider reshaping the header dependency.

Listing ../Defs.hpp as a “source” pulls it into IDE projects, but it is still owned by the timestamp_pattern target. If someone disables CLP_BUILD_CLP_S_TIMESTAMPPATTERN, the file still exists yet exposes the parser to unwanted coupling. Prefer including it transitively via headers instead of compiling it. That aligns dependency lines with the Option validation suggested earlier.

⛔ Skipped due to learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.

Comment on lines +86 to +90
option(
CLP_BUILD_CLP_S_TIMESTAMP_PARSER
"Build clp_s::timestamp_parser"
ON
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Add trailing period for option description.

All other option descriptions end in a period. Matching that pattern keeps help text tidy.

🤖 Prompt for AI Agents
In components/core/cmake/Options/options.cmake around lines 86 to 90, the option
description string "Build clp_s::timestamp_parser" is missing a trailing period;
update the description to end with a period so it matches the punctuation style
of other options (e.g., "Build clp_s::timestamp_parser.").

Comment on lines +368 to +379
function(validate_clp_s_timestamp_parser_dependencies)
validate_clp_dependencies_for_target(CLP_BUILD_CLP_S_TIMESTAMP_PARSER
CLP_BUILD_CLP_STRING_UTILS
)
endfunction()

function(set_clp_s_timestamp_parser_dependencies)
set_clp_need_flags(
CLP_NEED_DATE
CLP_NEED_YSTDLIB
)
endfunction()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Missing dependency on CLP_BUILD_CLP_S_TIMESTAMP_PATTERN.

The parser implementation pulls in ../Defs.hpp, which in turn depends on the timestamp pattern library. Please require CLP_BUILD_CLP_S_TIMESTAMPPATTERN here to avoid surprising build breaks when timestamp_pattern is disabled.

 function(validate_clp_s_timestamp_parser_dependencies)
     validate_clp_dependencies_for_target(CLP_BUILD_CLP_S_TIMESTAMP_PARSER
         CLP_BUILD_CLP_STRING_UTILS
+        CLP_BUILD_CLP_S_TIMESTAMPPATTERN
     )
 endfunction()

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In components/core/cmake/Options/options.cmake around lines 368–379, the
timestamp parser functions do not require the timestamp pattern library; add
CLP_BUILD_CLP_S_TIMESTAMP_PATTERN to the validate_clp_dependencies_for_target
call in validate_clp_s_timestamp_parser_dependencies and also add the
corresponding need flag (e.g. CLP_NEED_TIMESTAMP_PATTERN) to the
set_clp_need_flags list in set_clp_s_timestamp_parser_dependencies so the
timestamp_pattern library is enabled when the parser is built.

Comment on lines +465 to +468
if (CLP_BUILD_CLP_S_TIMESTAMP_PARSER)
validate_clp_s_timestamp_parser_dependencies()
set_clp_s_timestamp_parser_dependencies()
endif()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Tests need the timestamp parser option enabled.

unitTest links against clp_s::timestamp_parser. If someone disables CLP_BUILD_CLP_S_TIMESTAMP_PARSER but leaves testing on, CMake will fail during configure. Please add CLP_BUILD_CLP_S_TIMESTAMP_PARSER to validate_clp_tests_dependencies so the option gate stays consistent.

 function(validate_clp_tests_dependencies)
     validate_clp_dependencies_for_target(CLP_BUILD_TESTING
         CLP_BUILD_CLP_REGEX_UTILS
         CLP_BUILD_CLP_STRING_UTILS
         CLP_BUILD_CLP_S_SEARCH_AST
         CLP_BUILD_CLP_S_SEARCH_KQL
         CLP_BUILD_CLP_S_SEARCH_SQL
         CLP_BUILD_CLP_S_TIMESTAMPPATTERN
+        CLP_BUILD_CLP_S_TIMESTAMP_PARSER
     )
 endfunction()
🤖 Prompt for AI Agents
In components/core/cmake/Options/options.cmake around lines 465-468, tests
require the timestamp parser option but the validation function for tests
doesn't enforce CLP_BUILD_CLP_S_TIMESTAMP_PARSER; update
validate_clp_tests_dependencies to include a check that
CLP_BUILD_CLP_S_TIMESTAMP_PARSER is ON (or call the existing
validate_clp_s_timestamp_parser_dependencies) so that when tests are enabled the
timestamp parser option is required, and return a clear error message if it's
disabled.

src/clp_s/TimestampDictionaryWriter.hpp
src/clp_s/TimestampEntry.cpp
src/clp_s/TimestampEntry.hpp
src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Protect test sources behind the build option.

If CLP_BUILD_CLP_S_TIMESTAMP_PARSER is OFF, this source file should not be compiled. Consider wrapping the addition with if(CLP_BUILD_CLP_S_TIMESTAMP_PARSER) to avoid missing-file errors when the feature is disabled.

 set(SOURCE_FILES_clp_s_unitTest
     ...
-    src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
+    $<$<BOOL:${CLP_BUILD_CLP_S_TIMESTAMP_PARSER}>:src/clp_s/timestamp_parser/test/test_TimestampParser.cpp>
     ...
 )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
set(SOURCE_FILES_clp_s_unitTest
...
$<$<BOOL:${CLP_BUILD_CLP_S_TIMESTAMP_PARSER}>:src/clp_s/timestamp_parser/test/test_TimestampParser.cpp>
...
)
🤖 Prompt for AI Agents
In components/core/CMakeLists.txt around line 409, the test source
src/clp_s/timestamp_parser/test/test_TimestampParser.cpp is unconditionally
listed which causes missing-file errors when CLP_BUILD_CLP_S_TIMESTAMP_PARSER is
OFF; wrap the addition of this source (or the whole target that references it)
in an if(CLP_BUILD_CLP_S_TIMESTAMP_PARSER) ... endif() block so the file is only
appended/added when the feature is enabled, or alternatively conditionally call
target_sources/add_executable only when that CMake option is TRUE.

Comment on lines 31 to 64
* We support the following format specifiers:
*
* - \y Zero-padded year in century (69-99 -> 1969-1999, 00-68 -> 2000-2068).
* - \Y Zero-padded year (0000-9999).
* - \B Full month name (e.g., January).
* - \b Abbreviated month name (e.g., Jan).
* - \m Zero-padded month (01-12).
* - \d Zero-padded day in month (01-31).
* - \e Space-padded day in month( 1-31).
* - \a Abbreviated day in week (e.g., Mon).
* - \p Part of day (AM/PM).
* - \H 24-hour clock, zero-padded hour (00-23).
* - \k 24-hour clock, space-padded hour ( 0-23).
* - \I 12-hour clock, zero-padded hour (01-12).
* - \l 12-hour clock, space-padded hour ( 1-12).
* - \M Zero-padded minute (00-59).
* - \S Zero-padded second (00-60) (60 for leap seconds).
* - \3 Zero-padded millisecond (000-999).
* - \6 Zero-padded microsecond (000000-999999).
* - \9 Zero-padded nanosecond (000000000-999999999).
* - \T Zero-padded fractional second, up to nanoseconds, without trailing zeroes.
* - \E Epoch seconds.
* - \L Epoch miLliseconds.
* - \C Epoch miCroseconds.
* - \N Epoch Nanoseconds.
* - \z{...} Specific timezone, described by content between {}.
* - \\ Literal backslash.
*
* We also support the following CAT sequences:
*
* - \Z Generic timezone -- resolves to literal content, and potentially \z{...}.
* - \? Generic fractional second -- resolves to \3, \6, \9, or \T.
* - \P Unknown-precision epoch time -- resolves to \E, \L, \C, or \N based on heuristic.
*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Align the documented specifier list with the implemented behaviour.

The comment advertises support for \p, \H, \k, \I, \l, \M, \S, \3, \6, \9, \T, \E, \L, \C, \N, \z{…}, \Z, \?, and \P, but the implementation presently returns FormatSpecifierNotImplemented for each of them. Either trim the list to what is truly supported in this PR or implement the missing cases before shipping; otherwise, callers relying on the documented contract will hit runtime failures.

🤖 Prompt for AI Agents
components/core/src/clp_s/timestamp_parser/TimestampParser.hpp lines 31-64: the
comment block lists many specifiers that are not implemented and currently cause
FormatSpecifierNotImplemented at runtime; update the documentation to match
implemented behavior by removing or marking as unsupported the specifiers that
return FormatSpecifierNotImplemented (specifically \p, \H, \k, \I, \l, \M, \S,
\3, \6, \9, \T, \E, \L, \C, \N, \z{...}, \Z, \?, \P), or alternatively implement
those cases in the parser before shipping; pick one approach — for this PR, trim
the comment to only the specifiers the code currently handles, add a short TODO
note listing the unimplemented specifiers and reference an issue/PR for future
work, and run/update unit tests to ensure the doc/test matrix matches behavior.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c673b7d and bdcd7b1.

📒 Files selected for processing (1)
  • components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: build-macos (macos-15, false)
  • GitHub Check: build-macos (macos-15, true)
  • GitHub Check: build-macos (macos-14, true)
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: musllinux_1_2-x86_64-static-linked-bins
  • GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: package-image
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-static-linked-bins
  • GitHub Check: lint-check (macos-15)
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: rust-checks (macos-15)
  • GitHub Check: rust-checks (ubuntu-22.04)
  • GitHub Check: rust-checks (ubuntu-24.04)
🔇 Additional comments (4)
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp (4)

13-31: LGTM: Helper function declarations are clear.

The documentation and signatures are appropriate for the test helpers.


33-45: LGTM: Specifier validation logic is sound.

The trailing literal technique ensures exact consumption, and the implementation correctly follows the false == <expression> coding guideline at line 43.


59-107: The past review comment about inclusive bounds is based on a misunderstanding.

The implementation at line 52 uses i <= end, which produces a closed range [begin, end]. Therefore:

  • Line 99: generate_padded_numbers_in_range(1, 12, ...) does generate months 01 through 12 (inclusive).
  • Lines 102 & 105: generate_padded_numbers_in_range(1, 31, ...) does generate days 01 through 31 (inclusive).

The confusion arose because the function name suggests half-open [begin, end) semantics, but the implementation uses closed [begin, end] semantics. The current test coverage is correct.


108-124: LGTM: Day-of-week validation logic is correct.

The timestamps align with the Unix epoch (Thursday, January 1, 1970), and the pattern "\\d \\aa" correctly validates day-of-week alignment. Line 123 properly follows the false == <expression> coding guideline.

Comment on lines +59 to +126
TEST_CASE("timestamp_parser_parse_timestamp", "[clp-s][timestamp-parser]") {
SECTION("Format specifiers accept valid content.") {
auto const two_digit_years{generate_padded_numbers_in_range(0, 99, 2, '0')};
assert_specifier_accepts_valid_content('y', two_digit_years);

auto const four_digit_years{generate_padded_numbers_in_range(0, 9999, 4, '0')};
assert_specifier_accepts_valid_content('Y', four_digit_years);

std::vector<std::string> const months{
"January",
"February",
"March",
"April",
"May",
"June",
"July",
"August",
"September",
"October",
"November",
"December"
};
assert_specifier_accepts_valid_content('B', months);

std::vector<std::string> const abbreviated_months{
"Jan",
"Feb",
"Mar",
"Apr",
"May",
"Jun",
"Jul",
"Aug",
"Sep",
"Oct",
"Nov",
"Dec"
};
assert_specifier_accepts_valid_content('b', abbreviated_months);

auto const two_digit_months{generate_padded_numbers_in_range(1, 12, 2, '0')};
assert_specifier_accepts_valid_content('m', two_digit_months);

auto const two_digit_days{generate_padded_numbers_in_range(1, 31, 2, '0')};
assert_specifier_accepts_valid_content('d', two_digit_days);

auto const space_padded_days(generate_padded_numbers_in_range(1, 31, 2, ' '));
assert_specifier_accepts_valid_content('e', space_padded_days);

// The parser asserts that the day of the week in the timestamp is actually correct, so we
// need to include some extra date information to have days of the week line up.
std::vector<std::string> abbreviated_day_in_week_timestamps{
"04 Sun",
"05 Mon",
"06 Tue",
"07 Wed",
"01 Thu", // Thursday January 1st, 1970
"02 Fri",
"03 Sat"
};
for (auto const& day_in_week_timestamp : abbreviated_day_in_week_timestamps) {
std::string generated_pattern;
auto const timestamp{fmt::format("{}a", day_in_week_timestamp)};
auto const result{parse_timestamp(timestamp, "\\d \\aa", generated_pattern)};
REQUIRE(false == result.has_error());
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider adding negative test cases.

The current test suite validates successful parsing of valid inputs. To improve robustness, consider adding test cases that verify expected failures (e.g., invalid month numbers, malformed patterns, mismatched day-of-week).

Example:

SECTION("Format specifiers reject invalid content.") {
    std::string generated_pattern;
    auto result = parse_timestamp("13a", "\\ma", generated_pattern);  // Invalid month
    REQUIRE(result.has_error());
    
    result = parse_timestamp("32a", "\\da", generated_pattern);  // Invalid day
    REQUIRE(result.has_error());
}
🤖 Prompt for AI Agents
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp lines
59-126: add negative test cases that assert parsing fails for invalid inputs;
create a new SECTION("Format specifiers reject invalid content.") and call
parse_timestamp with examples like an out-of-range month ("13a" with pattern
"\ma"), out-of-range day ("32a" with pattern "\da"), malformed patterns, and a
timestamp whose day-of-week doesn't match, then REQUIRE that result.has_error()
is true for each case to ensure the parser correctly reports errors.

Copy link
Member

@LinZhihao-723 LinZhihao-723 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, since all these files are newly added, can u address all clang-tidy violations?

Comment on lines 1 to 2
#ifndef CLP_S_TIMESTAMPPARSER_HPP
#define CLP_S_TIMESTAMPPARSER_HPP
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#ifndef CLP_S_TIMESTAMPPARSER_HPP
#define CLP_S_TIMESTAMPPARSER_HPP
#ifndef CLP_S_TIMESTAMP_PARSER_TIMESTAMPPARSER_HPP
#define CLP_S_TIMESTAMP_PARSER_TIMESTAMPPARSER_HPP

Also need to update the comment in #endif

Comment on lines 1 to 2
#ifndef CLP_S_TIMESTAMPPARSERERRORCODE_HPP
#define CLP_S_TIMESTAMPPARSERERRORCODE_HPP
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, this seems out dated.

Comment on lines 66 to 75
* @param pattern A timestamp pattern made up of literals, format specifiers, and potentially CAT
* sequences.
* @param generated_pattern A buffer where a newly-generated timestamp pattern can be written, if
* necessary.
* @return On success:
* - The timestamp in epoch nanoseconds.
* - A string_view of the timestamp pattern that corresponds to the timestamp.
* - Lifetime is least of `pattern` and `generated_pattern`.
* On error:
* - A `clp_s::timestamp_parser::ErrorCode`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't reviewed the content yet. But:

  • We don't need to indent when wrapping to a new line (unless we recently udpated the guideline that I wasn't aware of)
  • We have a guideline to document result return type. Here's an example that also returns a pair:
    * @return A result containing a pair or an error code indicating the failure:

    In general the rule for result return is:
    A result containing xxxx on success, or an error code indicating the failure:\n List all possible error codes, or use A void result containing xxx on success if the return type is void.

Can u go through other docstrings to ensure these are all applied?

Comment on lines 60 to 64
* @param value The integer value, returned by reference.
* @return Whether conversion was successful.
*/
[[nodiscard]] auto
convert_padded_string_to_number(std::string_view str, char padding_character, int& value) -> bool;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should modernize this by returning std::optional<int>.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
components/core/src/clp_s/timestamp_parser/TimestampParser.hpp (1)

32-65: Document only the implemented specifiers or mark unimplemented ones clearly.

The documentation lists many specifiers (\p, \H, \k, \I, \l, \M, \S, \3, \6, \9, \T, \E, \L, \C, \N, \z{...}, \Z, \?, \P) that return FormatSpecifierNotImplemented in the implementation. This creates a mismatch between the documented API surface and actual behaviour, which can mislead users.

Option 1 (preferred for this PR): Remove the unimplemented specifiers from this list and add a note:

 * We support the following format specifiers:
 *
 * - \y Zero-padded year in century (69-99 -> 1969-1999, 00-68 -> 2000-2068).
 * - \Y Zero-padded year (0000-9999).
 * - \B Full month name (e.g., January).
 * - \b Abbreviated month name (e.g., Jan).
 * - \m Zero-padded month (01-12).
 * - \d Zero-padded day in month (01-31).
 * - \e Space-padded day in month( 1-31).
 * - \a Abbreviated day in week (e.g., Mon).
- * - \p Part of day (AM/PM).
- * - \H 24-hour clock, zero-padded hour (00-23).
- * - \k 24-hour clock, space-padded hour ( 0-23).
- * - \I 12-hour clock, zero-padded hour (01-12).
- * - \l 12-hour clock, space-padded hour ( 1-12).
- * - \M Zero-padded minute (00-59).
- * - \S Zero-padded second (00-60) (60 for leap seconds).
- * - \3 Zero-padded millisecond (000-999).
- * - \6 Zero-padded microsecond (000000-999999).
- * - \9 Zero-padded nanosecond (000000000-999999999).
- * - \T Zero-padded fractional second, up to nanoseconds, without trailing zeroes.
- * - \E Epoch seconds.
- * - \L Epoch miLliseconds.
- * - \C Epoch miCroseconds.
- * - \N Epoch Nanoseconds.
- * - \z{...} Specific timezone, described by content between {}.
 * - \\ Literal backslash.
+ *
+ * Additional format specifiers (\p, \H, \k, \I, \l, \M, \S, \3, \6, \9, \T, \E, \L, \C, \N, \z{...})
+ * and CAT sequences (\Z, \?, \P) are planned for future implementation.
 *
- * We also support the following CAT sequences:
- *
- * - \Z Generic timezone -- resolves to literal content, and potentially \z{...}.
- * - \? Generic fractional second -- resolves to \3, \6, \9, or \T.
- * - \P Unknown-precision epoch time -- resolves to \E, \L, \C, or \N based on a heuristic.
- *

Option 2 (if unimplemented specifiers must remain): Clearly mark them as unimplemented:

 * We support the following format specifiers:
 *
 * - \y Zero-padded year in century (69-99 -> 1969-1999, 00-68 -> 2000-2068).
 * ...
 * - \a Abbreviated day in week (e.g., Mon).
+ *
+ * The following format specifiers are not yet implemented and will return `FormatSpecifierNotImplemented`:
+ *
 * - \p Part of day (AM/PM).
 * ...
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bdcd7b1 and 7d0e335.

📒 Files selected for processing (4)
  • components/core/src/clp_s/timestamp_parser/ErrorCode.hpp (1 hunks)
  • components/core/src/clp_s/timestamp_parser/TimestampParser.cpp (1 hunks)
  • components/core/src/clp_s/timestamp_parser/TimestampParser.hpp (1 hunks)
  • components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit configuration file

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
  • components/core/src/clp_s/timestamp_parser/TimestampParser.hpp
  • components/core/src/clp_s/timestamp_parser/TimestampParser.cpp
  • components/core/src/clp_s/timestamp_parser/ErrorCode.hpp
🧠 Learnings (1)
📚 Learning: 2024-11-26T19:12:09.102Z
Learnt from: LinZhihao-723
PR: y-scope/clp#486
File: components/core/src/clp/error_handling/ErrorCode.hpp:131-134
Timestamp: 2024-11-26T19:12:09.102Z
Learning: In the error handling system (`components/core/src/clp/error_handling/ErrorCode.hpp`), the `ErrorCode` class is intentionally designed to wrap the `ErrorCodeEnum`, not to use the enum directly. This wrapper provides a general interface for adding customized error enums.

Applied to files:

  • components/core/src/clp_s/timestamp_parser/ErrorCode.hpp
🔇 Additional comments (28)
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp (8)

1-9: LGTM! Clean includes and namespace setup.

The includes are properly organized and the namespace structure is appropriate for the test file.


12-30: LGTM! Helper function declarations are well-documented.

The docstrings clearly describe the purpose and parameters of the helper functions.


32-44: LGTM! Test helper follows coding guidelines.

The function correctly uses false == result.has_error() as per the coding guidelines.


46-55: LGTM! Range generation function is correctly implemented.

The function now properly supports inclusive ranges with the updated precondition begin <= end and the loop condition i <= end, addressing the previous review feedback.


58-65: LGTM! Year format specifiers are thoroughly tested.

The test cases appropriately cover both 2-digit and 4-digit year formats with comprehensive ranges.


66-96: LGTM! Month format specifiers are well-tested.

The test cases cover full month names, abbreviated names, and numeric formats comprehensively.


107-124: LGTM! Day-of-week validation is properly tested.

The test correctly validates that the parser checks day-of-week alignment with actual dates, using Thursday January 1st, 1970 as the reference point. The test also follows the coding guideline by using false == result.has_error().


98-106: Inclusive upper bounds are correctly tested
generate_padded_numbers_in_range uses i <= end, and test ranges for months (1–12) and days (1–31) include the upper bounds.

components/core/src/clp_s/timestamp_parser/ErrorCode.hpp (1)

1-21: LGTM! Error code definitions follow the established pattern.

The error code enum and type alias follow the ystdlib error handling pattern consistently. The enum values are descriptive and cover the expected error scenarios for timestamp parsing. Based on learnings, the wrapper pattern using ErrorCode = ystdlib::error_handling::ErrorCode<ErrorCodeEnum> is the intentional design.

components/core/src/clp_s/timestamp_parser/TimestampParser.hpp (4)

1-12: LGTM! Standard header structure.

The include guard, includes, and namespace declaration are properly structured.


14-31: LGTM! The introduction and high-level documentation are clear.

The documentation effectively explains the purpose of the parser, the concept of format specifiers vs CAT sequences, and the motivation for CAT sequences.


66-84: LGTM! The parameter and return documentation is comprehensive.

The documentation clearly describes the function parameters, return value structure, and all possible error codes.


85-92: LGTM! Function signature is well-defined.

The use of [[nodiscard]], proper parameter types, and Result return type align with modern C++ best practices.

components/core/src/clp_s/timestamp_parser/TimestampParser.cpp (15)

1-17: LGTM! Includes are appropriate.

The standard library headers, third-party dependencies (date, string_utils, ystdlib), and local headers are all properly included.


19-72: LGTM! Constants are well-defined.

The parsing constraints and string literal arrays for days/months are clearly defined with appropriate naming and scope.


73-93: LGTM! Helper function declarations are properly documented.

The docstrings clearly describe the template parameter, parameters, and return values.


112-123: LGTM! Prefix matching helper is correctly implemented.

The function properly iterates through candidates and returns the matching index or an error code.


125-156: LGTM! Parser initialization and escape handling are correct.

The function properly initializes state variables and handles escape sequences. The use of false == escaped follows the coding guidelines.


158-185: LGTM! Two-digit year parsing with century inference is correct.

The implementation properly handles the 69-99 → 1969-1999 and 00-68 → 2000-2068 mapping as documented. The bounds checking and error handling are appropriate.


186-208: LGTM! Four-digit year parsing is correct.

The implementation validates the year range and properly handles padding.


209-227: LGTM! Month name parsing is correct.

Both full and abbreviated month name parsing use the helper function appropriately and convert the zero-based index to a one-based month value.


228-250: LGTM! Numeric month parsing is correct.

The implementation properly validates the month range (1-12) with appropriate error handling.


251-296: LGTM! Day parsing handles both padding types correctly.

The implementation properly handles both zero-padded (\d) and space-padded (\e) days with appropriate validation.


297-306: LGTM! Day-of-week parsing is correct.

The implementation stores the parsed day-of-week index for later validation against the computed date.


307-336: LGTM! Unimplemented specifiers and escape handling are correct.

The implementation appropriately returns FormatSpecifierNotImplemented for specifiers not yet implemented and properly handles literal backslashes.


339-352: LGTM! Final validation checks are appropriate.

The code correctly validates that both pattern and timestamp are fully consumed and prevents mixing date-based and epoch-number representations.


354-372: LGTM! Date validation and day-of-week verification are correct.

The implementation uses the date library to validate the date and correctly verifies day-of-week alignment when provided. The arithmetic for computing the actual day-of-week index is correct (Sunday = 0).


374-379: LGTM! Epoch time conversion is correct.

The implementation properly constructs a time_point and converts it to epoch nanoseconds using chrono duration_cast.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants