@stackotter stackotter commented Sep 18, 2025

What does this change do?

Implements optional table order and trivia preservation.

I have included table order preservation in this PR because fine-grained trivia preservation doesn't make much sense if large blocks of the document get reordered regardless. I can split table order preservation into a separate PR if you'd like.
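For anyone wondering what "table order preservation" amounts to, here is a minimal sketch, assuming the default table storage iterates keys in sorted order rather than source order. The `ordered_table` class and its string-only values are hypothetical and are not the container used in this PR; the point is only that a side index keeps lookups cheap while a vector remembers insertion order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical insertion-ordered table (illustration only, not the PR's container).
class ordered_table
{
    std::vector<std::pair<std::string, std::string>> entries_; // key/value pairs in source order
    std::unordered_map<std::string, std::size_t> index_;       // key -> position in entries_

  public:
    // Appends a new key at the end, or overwrites an existing key in place
    // so that its original position is kept.
    void insert_or_assign(const std::string& key, std::string value)
    {
        auto it = index_.find(key);
        if (it != index_.end())
        {
            entries_[it->second].second = std::move(value);
            return;
        }
        index_.emplace(key, entries_.size());
        entries_.emplace_back(key, std::move(value));
        assert(index_.size() == entries_.size()); // one index entry per stored pair
    }

    // Returns the keys in the order they were first inserted.
    std::vector<std::string> keys() const
    {
        std::vector<std::string> out;
        out.reserve(entries_.size());
        for (const auto& entry : entries_)
            out.push_back(entry.first);
        return out;
    }
};
```

With this layout, inserting `zebra` and then `apple` yields the keys in that order, whereas a sorted map would yield `apple` first and any comments attached to either key would move with it.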

This is my first time writing C++ (although I've written plenty of C in the past), so just let me know if there's anything I've got wrong. I think I'm starting to understand move semantics and the like, but I made quite a few mistakes earlier on (some of which led to crashes), and I probably haven't caught all of them.

Is it related to an existing bug report or feature request?

Addresses #28 (and extends its goals to include all trivia)

Todo

  • Implement table order preservation (gated behind -Dordered_tables=true)
  • Implement trivia preservation in parser (gated behind a runtime collect_trivia parameter available on all top-level parsing functions)
  • Implement trivia recreation in toml formatter (gated behind format_flags::preserve_source_trivia)
  • Support whitespace and comments
  • Preserve numeric literal encoding (e.g. 0x15 should remain hexadecimal after round-tripping)
  • Preserve string encoding (e.g. 'foo' shouldn't become '"foo"' after round-tripping)
  • Test against a wide variety of TOML documents to find syntax that fails to be successfully round-tripped
  • Sensibly infer trivia when inserting toml values into an existing document with preserved trivia.
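The literal-encoding items above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `integer_node`, `parse_integer` and `format_integer` are hypothetical names, and the parser only understands decimal and `0x` literals (no `0o`, `0b` or `_` separators). The idea is that a node keeps its original spelling next to the parsed value and drops it once the value is changed programmatically.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical integer node that remembers how it was spelled in the source.
struct integer_node
{
    std::int64_t value = 0;
    std::optional<std::string> source_text; // e.g. "0x15"; empty for values created in code

    // Changing the value invalidates the remembered spelling.
    void set(std::int64_t new_value)
    {
        value = new_value;
        source_text.reset();
    }
};

// Parses a literal and keeps its spelling. Base 0 lets std::stoll detect the
// 0x prefix; TOML's 0o/0b prefixes and '_' separators are not handled here.
integer_node parse_integer(const std::string& text)
{
    assert(!text.empty());
    return integer_node{std::stoll(text, nullptr, 0), text};
}

// Emits the original spelling when it is available, otherwise plain decimal.
std::string format_integer(const integer_node& node)
{
    return node.source_text ? *node.source_text : std::to_string(node.value);
}
```

The same pattern would apply to strings: remember which quote style the source used and fall back to a default only for values that did not come from the parser.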

Trivia inference

Trivia inference for inserted TOML values won't be perfect, of course, but it should at least respect cases where, e.g., a document only uses single quotes or always uses 4-space indentation. I'm thinking trivia preferences could be 'dragged' down through the document, so that, e.g., a new array element uses the style of the previous array element (if any). We could possibly start the 'drag' operation below the new element before wrapping around to the top, in order to pick up syntax that comes later in the document if there aren't any examples before the inserted syntax. Described like that, this approach sounds slow, but we'd probably be able to cache the dragged preferences at each node and only update them when inserting new elements.
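A rough sketch of the 'drag' idea, without the caching, might look like this. Everything here is hypothetical (the `style` struct, `styled_elements` and `infer_style` are illustration names, and a flat list of siblings stands in for the real document tree). The walk starts just below the insertion point, wraps around to the top, and lets each styled element it passes overwrite the dragged preference, so the nearest preceding example wins when one exists.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical per-element style; a real implementation would track far more.
struct style
{
    char quote = '"';     // quote character used for strings
    int indent_width = 2; // spaces per indentation level
};

// Each slot holds the style observed in the source, or nothing if the
// element carried no style information.
using styled_elements = std::vector<std::optional<style>>;

// Infers a style for a new element inserted before elems[pos].
// Visits elems[pos..n-1] first, then wraps to elems[0..pos-1]; the last
// styled element seen wins. Returns nothing if no element has a style.
std::optional<style> infer_style(const styled_elements& elems, std::size_t pos)
{
    assert(pos <= elems.size());
    const std::size_t n = elems.size();
    std::optional<style> dragged;
    for (std::size_t step = 0; step < n; ++step)
    {
        const auto& candidate = elems[(pos + step) % n];
        if (candidate)
            dragged = candidate;
    }
    return dragged;
}
```

This version is linear in the number of siblings per insertion; caching the dragged preference at each node, as suggested above, would avoid repeating the walk.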

Pre-merge checklist

  • I've read CONTRIBUTING.md
  • I've rebased my changes against the current HEAD of origin/master (if necessary)
  • I've added new test cases to verify my change
  • I've regenerated toml.hpp (how-to)
  • I've updated any affected documentation
  • I've rebuilt and run the tests with at least one of:
    • Clang 8 or higher
    • GCC 8 or higher
    • MSVC 19.20 (Visual Studio 2019) or higher
  • I've added my name to the list of contributors in README.md

@stackotter stackotter force-pushed the trivia_preservation branch 2 times, most recently from 794fb0b to ca6d419 Compare September 18, 2025 08:18
@marzer
Owner

marzer commented Sep 18, 2025

Hmmn, there's a lot going on here! First note would be: you're doing this work against a commit that's 40 commits behind the current main, so there's likely to be a lot of old code and merge issues. I'd recommend addressing that before you go any further 😅

marzer commented Sep 18, 2025

Also, another note: This will introduce a lot of overhead in the storage, so please make this entire feature opt-in at compile-time. Every single piece of it should be gated behind some sort of #if TOML_ENABLE_BLAH, as otherwise the overheads are likely to be unacceptable for existing users.
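As a minimal sketch of what that gating could look like, assuming a hypothetical macro named `EXAMPLE_ENABLE_TRIVIA` and a hypothetical `value_node` type (neither is an existing library name): the extra members only exist when the feature is switched on, so a default build pays nothing for them.

```cpp
#include <cstdint>
#include <string>

// Hypothetical feature macro for illustration; off unless the build defines it.
#ifndef EXAMPLE_ENABLE_TRIVIA
#define EXAMPLE_ENABLE_TRIVIA 0
#endif

// Hypothetical node type; only the gating pattern matters here.
struct value_node
{
    std::int64_t value = 0;

#if EXAMPLE_ENABLE_TRIVIA
    std::string leading_trivia; // whitespace and comments preceding the value
    std::string source_text;    // original spelling of the literal
#endif
};

#if !EXAMPLE_ENABLE_TRIVIA
// With the feature off, the node carries no extra storage.
static_assert(sizeof(value_node) == sizeof(std::int64_t),
              "trivia members must not exist when the feature is disabled");
#endif
```

Building with `-DEXAMPLE_ENABLE_TRIVIA=1` adds the two string members; any code that reads or writes them would need the same `#if` guard.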
