Improve error message for identifiers with leading underscore #28897

tetektoza · 2025-10-05T23:13:25Z

When users write an identifier that starts with an underscore (e.g. `let _c: u32`), the parser produces an uninformative error.

The root cause is that the lexer splits `_c` into two separate tokens, and the parser has no grammar rule matching that pattern, so LALRPOP generates a generic error listing the tokens it expected at that position (which happened to be '(' for tuple destructuring).

So this adds a new lexer token, `InvalidLeadingUnderscoreIdent`, that catches any leading underscore next to identifiers.
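To illustrate the idea, here is a minimal hand-written sketch, not the actual Leo lexer. Only the name `InvalidLeadingUnderscoreIdent` comes from this PR; the `Token` enum, the `lex` function, and the `catch_underscore` flag are hypothetical and exist only to contrast the two behaviours.

```rust
// Hypothetical token kinds; only the last variant's name comes from this PR.
#[derive(Debug, PartialEq)]
enum Token {
    Underscore,
    Identifier(String),
    InvalidLeadingUnderscoreIdent(String),
}

fn is_ident_continue(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// Tokenizes `src`. When `catch_underscore` is true, an underscore that is
/// immediately followed by identifier characters becomes one invalid token
/// instead of being split into `Underscore` + `Identifier`.
fn lex(src: &str, catch_underscore: bool) -> Vec<Token> {
    let chars: Vec<char> = src.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c == '_' {
            // Measure the run of identifier characters after the underscore.
            let mut end = i + 1;
            while end < chars.len() && is_ident_continue(chars[end]) {
                end += 1;
            }
            if catch_underscore && end > i + 1 {
                tokens.push(Token::InvalidLeadingUnderscoreIdent(
                    chars[i..end].iter().collect(),
                ));
                i = end;
            } else {
                tokens.push(Token::Underscore);
                i += 1;
            }
        } else if c.is_ascii_alphabetic() {
            let start = i;
            while i < chars.len() && is_ident_continue(chars[i]) {
                i += 1;
            }
            tokens.push(Token::Identifier(chars[start..i].iter().collect()));
        } else {
            // Skip whitespace and anything else this sketch does not model.
            i += 1;
        }
    }
    tokens
}

fn main() {
    // Without the dedicated rule, the parser sees two unrelated tokens.
    println!("{:?}", lex("_c", false));
    // With it, the parser sees one token it can report on directly.
    println!("{:?}", lex("_c", true));
}
```

With a single token covering the whole `_c` span, the diagnostic can underline both characters, as in the "After" output below.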

Before:

Error [EPAR0370005]: expected an identifier, '(' -- found '_'
    --> \\?\C:\Users\tetektoza\source\repos\leo\test_underscore_project\test_underscore\src\main.leo:3:13
     |
   3 |         let _c: u32 = a + b;
     |             ^

After:

Error [EPAR0370047]: Identifier cannot start with an underscore
    --> \\?\C:\Users\tetektoza\source\repos\leo\test_underscore_project\test_underscore\src\main.leo:3:13
     |
   3 |         let _c: u32 = a + b;
     |             ^^
     |
     = Identifiers must start with a letter.

Resolves: #28755

tetektoza · 2025-10-05T23:29:03Z

Argh, I didn't notice this before (and I should have checked first), but after pasting the before and after output into the PR message, it looks like the error message has already been improved on mainnet. 😅 Maybe the current one (which differs from the one in #28755) is good enough, and this PR and the issue could both be closed. Not sure, but posting my solution anyway.

mohammadfawaz · 2025-10-05T23:33:05Z

In https://github.com/ProvableHQ/leo/blob/mainnet/compiler/parser-lossless/src/lib.rs, we already have a function that checks identifiers. Would it be better to do this check there? I'm not sure how I feel about inserting regexes for invalid tokens into the tokenizer 🤔

Also, when adding new errors, we have this quirk that new errors should always be appended at the end of the errors file, so that old error codes don't change.
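As a rough illustration of why the ordering matters, the sketch below assumes a hypothetical scheme in which an error's code is a fixed prefix plus the error's position in the list. The `code_of` helper and the two older error names are made up for this example, and the real errors macro may derive codes differently.

```rust
/// Hypothetical scheme: an error's code is a fixed prefix followed by the
/// zero-padded index of the error in the list.
fn code_of(errors: &[&str], name: &str) -> Option<String> {
    errors
        .iter()
        .position(|e| *e == name)
        .map(|i| format!("EPAR037{:04}", i))
}

fn main() {
    // Made-up existing errors, in declaration order.
    let original = ["unexpected_token", "unexpected_eof"];
    let new_error = "identifier_cannot_start_with_underscore";

    // Inserting in the middle shifts every later error's code.
    let mut inserted = original.to_vec();
    inserted.insert(1, new_error);

    // Appending leaves all existing codes untouched.
    let mut appended = original.to_vec();
    appended.push(new_error);

    println!("before:   {:?}", code_of(&original, "unexpected_eof"));
    println!("inserted: {:?}", code_of(&inserted, "unexpected_eof"));
    println!("appended: {:?}", code_of(&appended, "unexpected_eof"));
}
```

Under that assumption, an insertion in the middle would also force every test expectation that mentions a shifted code to be regenerated.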

tetektoza · 2025-10-06T00:01:34Z

Point 1 - hmm, you're talking about `check_identifier`, right? I took a look at it, but `_c` has already become two tokens at that point, Underscore and Identifier, so the function runs only on the Identifier token. I could extend it, but that seemed counter-intuitive. It's also possible to modify the grammar, but probably with a ton of changes.

Point 2 - ahhh, good point, thanks. I can adjust it.

Let me know about point 1 and my first comment in the PR, maybe we don't need it anymore. Not sure.

mohammadfawaz · 2025-10-06T00:30:05Z

> Point 1 - hmm, you're talking about `check_identifier`, right? I took a look at it, but `_c` has already become two tokens at that point, Underscore and Identifier, so the function runs only on the Identifier token. I could extend it, but that seemed counter-intuitive. It's also possible to modify the grammar, but probably with a ton of changes.

Oh, I understand now. Yeah, then doing it in the tokenizer is probably the way to go. Thanks!

mohammadfawaz · 2025-10-06T00:50:05Z

errors/src/errors/parser/parser_errors.rs

+    @formatted
+    identifier_cannot_start_with_underscore {
+        args: (),
+        msg: "Identifier cannot start with an underscore",

Suggested change

-        msg: "Identifier cannot start with an underscore",
+        msg: "Identifiers cannot start with an underscore.",

tetektoza · 2025-10-07T00:40:38Z

Also moved the error to the end of the file as discussed, which removed a lot of the changes caused by renumbered error codes.

mohammadfawaz

LGTM!

mohammadfawaz · 2025-10-09T16:17:23Z

Closing in favour of #28909

mohammadfawaz reviewed Oct 6, 2025

tetektoza added 4 commits October 7, 2025 02:35

Add testcases that cover leading underscore in identifiers/expressions

8eaa2db

Add expectations for newly created testcases

7968953

Rewrite expectations after newly added lexer rules for underscore

019ccf1

tetektoza force-pushed the fix/28755_add_rules_to_grammar_for_underscore branch from 88a72cb to 019ccf1 Compare October 7, 2025 00:39

mohammadfawaz mentioned this pull request Oct 9, 2025

Improve error message for identifiers with leading underscore #28909

Merged

mohammadfawaz approved these changes Oct 9, 2025

mohammadfawaz closed this Oct 9, 2025
