-
Notifications
You must be signed in to change notification settings - Fork 591
[Lua] Highlight strings used by string.{format,gsub,pack,etc} #4194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
From a quick glance over this PR I noticed:
|
Noted. I'll probably look at the tests from eregex etc.
That is correct. Anchors are only matched at the ends of strings. elsewhere they match the character. |
It has become the practice more and more to use named contexts to |
I've added (roughly) the tests from the Regular Expression package and worked on getting the syntax to work with character classes and captures now.
This is fixed now so that the anchor only matches at the very start and very end. (Because of how multiline strings work, it has to account for a possible preceding newline) @michaelblyons |
I've added some lookahead matching on the one line for immediate calls so that it can work with |
Lua/Lua.sublime-syntax
Outdated
push: | ||
- pattern-unquoted-balanced-char | ||
- pattern-unquoted-balanced-char | ||
- match: \% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume:
- match: '%.'
scope: constant.character.escape.lua
should work just fine? No?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. With the quoted ones, it needs to be in a separate pattern so that the additional prototype can end the string if it is %'
, but I'll test what happens with an embedded syntax
|
This seems to be to do with Or are you meaning in the |
Instead of something like strings:
- match: '"'
scope: punctuation.definition.string.begin
push:
- meta_scope: string.quoted.double
- match: '"'
scope: punctuation.definition.string.end
pop: 1 Use strings:
- match: '"'
scope: punctuation.definition.string.begin
push: string-double-body
string-double-body:
- meta_scope: string.quoted.double
- match: '"'
scope: punctuation.definition.string.end
pop: 1 |
- add named contexts - add `leading_wspace` var - add `first_line_language` var
added some named contexts and section markers |
- match: \[(=*)\[ | ||
scope: punctuation.definition.string.begin.lua | ||
set: multiline-string-body | ||
with_prototype: | ||
- include: fmtstr-embed | ||
|
||
- match: \' | ||
scope: punctuation.definition.string.begin.lua | ||
set: single-quoted-string-body | ||
with_prototype: | ||
- include: fmtstr-embed | ||
|
||
- match: \" | ||
scope: punctuation.definition.string.begin.lua | ||
set: double-quoted-string-body | ||
with_prototype: | ||
- include: fmtstr-embed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I'd try to avoid with_prototype
wherever possible as it is known to cause trouble every here and then by either causing complexity to explode or by just creating inclusion loops.
In this very special case it is bombs on ants, as string contexts are trivial.
Hence I'd recommend:
- match: \[(=*)\[ | |
scope: punctuation.definition.string.begin.lua | |
set: multiline-string-body | |
with_prototype: | |
- include: fmtstr-embed | |
- match: \' | |
scope: punctuation.definition.string.begin.lua | |
set: single-quoted-string-body | |
with_prototype: | |
- include: fmtstr-embed | |
- match: \" | |
scope: punctuation.definition.string.begin.lua | |
set: double-quoted-string-body | |
with_prototype: | |
- include: fmtstr-embed | |
- match: \[(=*)\[ | |
scope: punctuation.definition.string.begin.lua | |
set: multiline-fmtstr-body | |
- match: \' | |
scope: punctuation.definition.string.begin.lua | |
set: single-quoted-fmtstr-body | |
- match: \" | |
scope: punctuation.definition.string.begin.lua | |
set: double-quoted-fmtstr-body | |
single-quoted-fmtstr-body: | |
- meta_include_prototype: false | |
- meta_scope: meta.string.lua string.quoted.single.lua | |
- include: single-quoted-string-body | |
- include: fmtstr-placeholder | |
double-quoted-fmtstr-body: | |
- meta_include_prototype: false | |
- meta_scope: meta.string.lua string.quoted.double.lua | |
- include: double-quoted-string-body | |
- include: fmtstr-placeholder | |
multiline-fmtstr-body: | |
- meta_include_prototype: false | |
- meta_scope: meta.string.lua string.quoted.multiline.lua | |
- include: multiline-string-body | |
- include: fmtstr-placeholder |
Maybe a little bit more verbose implementation wise, but more efficient.
fmtstr-embed: | ||
- meta_include_prototype: false | ||
- match: '%%' | ||
scope: constant.character.escape.lua | ||
|
||
- match: '%' | ||
scope: constant.other.placeholder.lua | ||
push: fmtstr-pattern | ||
|
||
fmtstr-pattern: | ||
- meta_include_prototype: false | ||
# these characters never appear in a format pattern so it can be simpler | ||
- match: (?=[\\\"\'\]]) | ||
pop: 1 | ||
- match: '[-+#0-9. ]' | ||
scope: constant.other.placeholder.lua | ||
- match: '[cdiuoxXaAfeEgGpqs]' | ||
scope: constant.other.placeholder.lua | ||
pop: 1 | ||
- match: '.' | ||
scope: invalid.illegal.invalid-format.lua | ||
pop: 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Format string placeholder patterns should be as restrictive as possible to avoid false positives. They are also rather unlikely to be subject of interpolation and for same reason probably don't need or even shouldn't support it.
The following suggestion is a tweaked pattern from Python syntax.
fmtstr-embed: | |
- meta_include_prototype: false | |
- match: '%%' | |
scope: constant.character.escape.lua | |
- match: '%' | |
scope: constant.other.placeholder.lua | |
push: fmtstr-pattern | |
fmtstr-pattern: | |
- meta_include_prototype: false | |
# these characters never appear in a format pattern so it can be simpler | |
- match: (?=[\\\"\'\]]) | |
pop: 1 | |
- match: '[-+#0-9. ]' | |
scope: constant.other.placeholder.lua | |
- match: '[cdiuoxXaAfeEgGpqs]' | |
scope: constant.other.placeholder.lua | |
pop: 1 | |
- match: '.' | |
scope: invalid.illegal.invalid-format.lua | |
pop: 1 | |
fmtstr-placeholder: | |
- match: '%%' | |
scope: constant.character.escape.lua | |
- match: |- | |
(?x) | |
% | |
\#? # alternate form | |
0? # pad with zeros | |
\-? # left-adjust | |
\ ? # implicit sign | |
[+-]? # sign | |
\d* # width | |
(\.\d*)? # precision | |
[cdiuoxXaAfeEgGpqs%] | |
scope: constant.other.placeholder.lua | |
- match: '%[^"''\\\]]' | |
scope: invalid.illegal.invalid-format.lua |
Note: See also somewhat renamed context name.
Please double check, whether I missed something.
packstr-string: | ||
- match: \[(=*)\[ | ||
scope: punctuation.definition.string.begin.lua | ||
set: multiline-string-body | ||
with_prototype: | ||
- include: packstr-pattern | ||
|
||
- match: \' | ||
scope: punctuation.definition.string.begin.lua | ||
set: single-quoted-string-body | ||
with_prototype: | ||
- include: packstr-pattern | ||
|
||
- match: \" | ||
scope: punctuation.definition.string.begin.lua | ||
set: double-quoted-string-body | ||
with_prototype: | ||
- include: packstr-pattern | ||
|
||
packstr-pattern: | ||
- meta_include_prototype: false | ||
- match: '[=<>]|![0-9]*' | ||
scope: storage.modifier.lua | ||
|
||
- match: '[bBhHlLjJTfdnz]|[Iis][0-9]*|c[0-9]+' | ||
scope: storage.type.lua | ||
|
||
- match: '[xX]' | ||
scope: punctuation.separator.padding.lua | ||
|
||
- match: (?:[^\\\"\'\]\s]) | ||
scope: invalid.illegal.invalid-storage.lua |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previous comments apply here as well.
packstr-string: | |
- match: \[(=*)\[ | |
scope: punctuation.definition.string.begin.lua | |
set: multiline-string-body | |
with_prototype: | |
- include: packstr-pattern | |
- match: \' | |
scope: punctuation.definition.string.begin.lua | |
set: single-quoted-string-body | |
with_prototype: | |
- include: packstr-pattern | |
- match: \" | |
scope: punctuation.definition.string.begin.lua | |
set: double-quoted-string-body | |
with_prototype: | |
- include: packstr-pattern | |
packstr-pattern: | |
- meta_include_prototype: false | |
- match: '[=<>]|![0-9]*' | |
scope: storage.modifier.lua | |
- match: '[bBhHlLjJTfdnz]|[Iis][0-9]*|c[0-9]+' | |
scope: storage.type.lua | |
- match: '[xX]' | |
scope: punctuation.separator.padding.lua | |
- match: (?:[^\\\"\'\]\s]) | |
scope: invalid.illegal.invalid-storage.lua | |
packstr-string: | |
- match: \[(=*)\[ | |
scope: punctuation.definition.string.begin.lua | |
set: multiline-packstr-body | |
- match: \' | |
scope: punctuation.definition.string.begin.lua | |
set: single-quoted-packstr-body | |
- match: \" | |
scope: punctuation.definition.string.begin.lua | |
set: double-quoted-packstr-body | |
single-quoted-packstr-body: | |
- meta_include_prototype: false | |
- meta_scope: meta.string.lua string.quoted.single.lua | |
- include: single-quoted-string-body | |
- include: packstr-patterns | |
double-quoted-packstr-body: | |
- meta_include_prototype: false | |
- meta_scope: meta.string.lua string.quoted.double.lua | |
- include: double-quoted-string-body | |
- include: packstr-patterns | |
multiline-packstr-body: | |
- meta_include_prototype: false | |
- meta_scope: meta.string.lua string.quoted.multiline.lua | |
- include: multiline-string-body | |
- include: packstr-patterns | |
packstr-patterns: | |
- match: '[=<>]|![0-9]*' | |
scope: storage.modifier.lua | |
- match: '[bBhHlLjJTfdnz]|[Iis][0-9]*|c[0-9]+' | |
scope: storage.type.lua | |
- match: '[xX]' | |
scope: punctuation.separator.padding.lua | |
- match: '[^"''\\\]\s]' | |
scope: invalid.illegal.invalid-storage.lua |
- match: \[(=*)\[ | ||
scope: punctuation.definition.string.begin.lua | ||
set: | ||
- pattern-multiline-string-body | ||
- pattern-test-first-caret-multiline |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To reliably escape on proper amount of =
chars, embed in combo with pop could be used here:
- match: \[(=*)\[ | |
scope: punctuation.definition.string.begin.lua | |
set: | |
- pattern-multiline-string-body | |
- pattern-test-first-caret-multiline | |
- match: \[(=*)\[ | |
scope: meta.string.lua string.quoted.multiline.lua punctuation.definition.string.begin.lua | |
embed: pattern-multiline-string-body | |
embed_scope: meta.string.lua string.quoted.multiline.lua | |
escape: \]\1\] | |
escape_captures: | |
0: meta.string.lua string.quoted.multiline.lua punctuation.definition.string.end.lua | |
pop: 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks better than the other suggestion above, I wouldn't call it a "quoted" string, though.
pattern-test-first-caret-multiline: | ||
- meta_include_prototype: false | ||
- match: $\n? | ||
set: pattern-test-first-caret | ||
- include: pattern-test-first-caret | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Obsolete with previous change
pattern-test-first-caret-multiline: | |
- meta_include_prototype: false | |
- match: $\n? | |
set: pattern-test-first-caret | |
- include: pattern-test-first-caret |
pattern-multiline-string-body: | ||
- meta_include_prototype: false | ||
- meta_scope: meta.string.lua string.quoted.multiline.lua | ||
- include: multiline-string-body | ||
- match: '' | ||
embed: pattern-unquoted | ||
# we don't seem to be able to use the capture to get the right one | ||
# this means that patterns that contain something that matches this | ||
# will lose their context half way through. | ||
escape: (?=\]=*\]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes to simple multi-push, optionally consuming \n
directly after opening punctuation.
pattern-multiline-string-body: | |
- meta_include_prototype: false | |
- meta_scope: meta.string.lua string.quoted.multiline.lua | |
- include: multiline-string-body | |
- match: '' | |
embed: pattern-unquoted | |
# we don't seem to be able to use the capture to get the right one | |
# this means that patterns that contain something that matches this | |
# will lose their context half way through. | |
escape: (?=\]=*\]) | |
pattern-multiline-string-body: | |
- meta_include_prototype: false | |
- match: \n? | |
push: | |
- pattern-unquoted | |
- pattern-test-first-caret |
This PR adds highlighting to the strings that are passed to
string.{format,find,match,gsub,gmatch,pack,unpack,packsize}
.They are split into three categories:
Format strings
https://www.lua.org/manual/5.4/manual.html#pdf-string.format
These are done similarly to C's
sprintf
and friends, except that Luaonly supports some specifiers and adds
q
as an option.Patterns
https://www.lua.org/manual/5.4/manual.html#6.4.1
They are passed as the second argument to the four functions that take
patterns, meaning that I can also match on
(whatever_value):match 'this%s*that'
. For scopes, I have roughlyfollowed what the builtin regular expressions package does.
I haven't worked on nested groups as that got quite hard with handling
it inside the three different types of strings that Lua provides.
The special capture
()
returns a position. I'm not sure what scopethis should be. I've set it to
punctuation.section.brackets.lua
forthe moment.
Pack format strings
https://www.lua.org/manual/5.4/manual.html#6.4.2
The alignment and endian specifiers I have scoped as
storage.modifier
and I have scoped the actual type designations asstorage.type
. For the padding that can be specified withx
orX
,I've scoped it to
punctuation.separator.padding
as I didn't see anybetter one, and that helped differentiate it from ones that have
values.
Since format strings are always the first argument, I did not find a way
of handling them when the function is called as a method.
(e.g.
('%3.5f'):format(3)
).