Make {eol_}comments_re read-only and non-init arguments in ParserConfig (#352)
* [buffering] drop forced multiline match for string patterns
Previously, when scanning for matches to a regex, if the type of the
pattern was `str`, the pattern was always compiled with `re.MULTILINE`.
Recent changes to `ParserConfig` [0] changed the type used for regex
matches in generated code from `str` to `re.Pattern`. Because a
precompiled pattern is used as-is, a `comments` or `eol_comments`
definition that implicitly relied on the `re.MULTILINE` flag may now
behave differently than in previous versions.
After discussion [1], it has been determined that usage of `re` flags
within TatSu should be deprecated in favor of users specifying the
necessary flags within patterns.
As such, drop the `re.MULTILINE` flag for strings compiled on the fly.
[0]: #338
[1]: #351 (comment)
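A minimal sketch of the behavior difference described above, using only the standard `re` module. The sample text and the variable names are illustrative assumptions and are not taken from the TatSu code base; the pattern is the `eol_comments` example from the docs.

```python
import re

# Illustrative input: two lines, each ending in an end-of-line comment.
text = "a = 1  # first\nb = 2  # second\n"
pattern = r"#([^\n]*?)$"

# Assumed old behavior: a str pattern is compiled with re.MULTILINE forced on.
with_flag = re.compile(pattern, re.MULTILINE)
# Assumed new behavior: the pattern is compiled exactly as written.
without_flag = re.compile(pattern)

# Try to match a comment starting at the first '#', as a scanner would.
pos = text.index("#")
old_match = with_flag.match(text, pos)
new_match = without_flag.match(text, pos)

# With re.MULTILINE, '$' matches before the newline, so the comment is found.
old_comment = old_match.group(1) if old_match else None
# Without it, '$' only matches at the end of the whole string, so the
# first-line comment is not recognized and new_match is None.
```

Here `old_comment` is `" first"`, while `new_match` is `None`, which is why patterns that need line-by-line `$` semantics must now carry the flag themselves.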
* [grammar] make eol_comments multiline match
Make the default eol_comments regex use multiline matching.
Recent changes to `ParserConfig` [0] now use a precompiled regex (an
`re.Pattern`) instead of compiling the `str` regex on the fly.
The `Tokenizer` previously compiled every `str` regex with
`re.MULTILINE`, regardless of the options defined in the regex itself.
The flag is no longer applied automatically, so configurations that
need it must specify the option in the pattern.
[0]: #338
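As an illustration of specifying the option in the pattern, the sketch below prefixes the documented `eol_comments` example with the inline `(?m)` flag. Whether this is the exact default TatSu ships is an assumption; the sample source text is made up.

```python
import re

# Assumption: the documented eol_comments example with an inline (?m) prefix.
eol_comments = re.compile(r"(?m)#([^\n]*?)$")

# The inline flag is reflected in the compiled pattern's flags.
has_multiline = bool(eol_comments.flags & re.MULTILINE)

# Illustrative input with one comment per line.
source = "x  # one\ny  # two\n"
comments = [m.group(1) for m in eol_comments.finditer(source)]
```

With the inline flag, `has_multiline` is `True` and `comments` is `[" one", " two"]`, without relying on any flag being passed at compile time.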
* [infos] make {eol_}comments_re read-only attributes
Previously, the `eol_comments_re` and `comments_re` attributes were
public init arguments, were modifiable, and could thus become out of
sync with the `eol_comments` and `comments` attributes.
Also, with recent changes to `ParserConfig` [0], there were two ways to
initialize the regex values for the comments and eol_comments
directives: via the constructor using the `*_re` arguments, or via the
sister string arguments, relying on `__post_init__` to compile the
values. The compiled values trumped any explicit `*_re` arguments.
Now, the constructor interface has been simplified to not take either
`eol_comments_re` or `comments_re` as arguments. Callers may only use
`eol_comments` and `comments`.
The `eol_comments_re` and `comments_re` attributes are still
public, but are read-only so they are always a reflection of their
sister string values passed into the constructor.
[0]: #200
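The interface described above can be sketched roughly as follows. `CommentConfig` is a hypothetical stand-in, not the real `ParserConfig`, and deriving the compiled values through properties is only one possible way to keep them read-only and in sync; the actual implementation may differ.

```python
import dataclasses
import re
from typing import Optional


@dataclasses.dataclass
class CommentConfig:  # hypothetical class, for illustration only
    comments: Optional[str] = None
    eol_comments: Optional[str] = None

    # Properties are not dataclass fields, so they are not init arguments,
    # and without a setter they cannot be assigned to.
    @property
    def comments_re(self) -> Optional[re.Pattern]:
        return None if self.comments is None else re.compile(self.comments)

    @property
    def eol_comments_re(self) -> Optional[re.Pattern]:
        if self.eol_comments is None:
            return None
        return re.compile(self.eol_comments)


cfg = CommentConfig(eol_comments=r"(?m)#([^\n]*?)$")
pattern_text = cfg.eol_comments_re.pattern

# Assigning to the derived attribute is rejected.
try:
    cfg.comments_re = re.compile("x")
    assignment_rejected = False
except AttributeError:
    assignment_rejected = True

# Passing the derived attribute to the constructor is rejected as well.
try:
    CommentConfig(comments_re=re.compile("x"))
    init_rejected = False
except TypeError:
    init_rejected = True
```

Because the compiled pattern is derived from the string on access, it always reflects the current string value; `re.compile` caches recently compiled patterns, so repeated access stays cheap.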
* [codegen] migrate to {eol_}comments
* [ngcodegen] migrate to {eol_}comments
* [bootstrap] migrate to {eol_}comments
* [lint] resolve errors
* [docs] note {eol_}comments directive behavior changes
* [docs] update syntax to reflect {eol_}comments arguments
* [test] fix test_parse_hash to use eol_comments
* [test] explicitly use multiline match in test_patterns_with_newlines
docs/directives.rst (4 additions & 0 deletions)
@@ -29,6 +29,8 @@ Specifies a regular expression to identify and exclude inline (bracketed) commen

     @@comments :: /\(\*((?:.|\n)*?)\*\)/

+.. note::
+    Prior to 5.12.1, comments implicitly had the `(?m) <https://docs.python.org/3/library/re.html#re.MULTILINE>`_ option defined. This is no longer the case.

 ``@@eol_comments :: <regexp>``
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -39,6 +41,8 @@ Specifies a regular expression to identify and exclude end-of-line comments befo

     @@eol_comments :: /#([^\n]*?)$/

+.. note::
+    Prior to 5.12.1, eol_comments implicitly had the `(?m) <https://docs.python.org/3/library/re.html#re.MULTILINE>`_ option defined. This is no longer the case.