Skip to content

Parameters break syntax highlighting by capturing punctuation #134

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
ct-martin opened this issue Jan 10, 2025 · 2 comments · May be fixed by #135
Open

Parameters break syntax highlighting by capturing punctuation #134

ct-martin opened this issue Jan 10, 2025 · 2 comments · May be fixed by #135

Comments

@ct-martin
Copy link

ct-martin commented Jan 10, 2025

Required information for this issue:

  • Your VS Code Version: 1.96.2
  • Your OS Version: Windows 11 23H2
  • Is this extension running on VS Code Remote or https://vscode.dev: No

Steps to Reproduce:

  1. Use the sample file below

Snippet of note:

  upstream big_server_com {
    server 127.0.0.3:8000 weight=5;
    server 127.0.0.3:8001 weight=5;
    server 192.168.0.1:8000;
    server 192.168.0.1:8001;
  }

VS Code doesn't properly highlight the line after a parameter is used:
image

I think what is happening is that on the final group of the regex on this line, the semicolon punctuation is being captured and then not being parsed as punctuation.

match: /(?:^|\s)(weight|max_conn|max_fails|fail_timeout|slow_start)(=)(\d[\d\.]*[bBkKmMgGtTsShHdD]?)(?:\s|;|$)/,

(Same line but in the XML)

<string>(?:^|\s)(weight|max_conn|max_fails|fail_timeout|slow_start)(=)(\d[\d\.]*[bBkKmMgGtTsShHdD]?)(?:\s|;|$)</string>

You can see it work correctly if you add a space before the semicolon like this:

    server 127.0.0.3:8001 weight=5 ; # <-- notice added space here
    server 192.168.0.1:8000;

image

This can be fixed by changing the final group from being a non-capturing group to a lookahead.

Current final group: (?:\s|;|$)

As a lookahead: (?=\s|;|$) (notice the = instead of :)


I can open a pull request for this if you'd like.


Full sample file
user       www www;  ## Default: nobody
worker_processes  5;  ## Default: 1
error_log  logs/error.log;
pid        logs/nginx.pid;
worker_rlimit_nofile 8192;

events {
  worker_connections  4096;  ## Default: 1024
}

http {
  include    conf/mime.types;
  include    /etc/nginx/proxy.conf;
  include    /etc/nginx/fastcgi.conf;
  index    index.html index.htm index.php;

  default_type application/octet-stream;
  log_format   main '$remote_addr - $remote_user [$time_local]  $status '
    '"$request" $body_bytes_sent "$http_referer" '
    '"$http_user_agent" "$http_x_forwarded_for"';
  access_log   logs/access.log  main;
  sendfile     on;
  tcp_nopush   on;
  server_names_hash_bucket_size 128; # this seems to be required for some vhosts

  server { # php/fastcgi
    listen       80;
    server_name  domain1.com www.domain1.com;
    access_log   logs/domain1.access.log  main;
    root         html;

    location ~ \.php$ {
      fastcgi_pass   127.0.0.1:1025;
    }
  }

  server { # simple reverse-proxy
    listen       80;
    server_name  domain2.com www.domain2.com;
    access_log   logs/domain2.access.log  main;

    # serve static files
    location ~ ^/(images|javascript|js|css|flash|media|static)/  {
      root    /var/www/virtual/big.server.com/htdocs;
      expires 30d;
    }

    # pass requests for dynamic content to rails/turbogears/zope, et al
    location / {
      proxy_pass      http://127.0.0.1:8080;
    }
  }

  upstream big_server_com {
    server 127.0.0.3:8000 weight=5;
    server 127.0.0.3:8001 weight=5;
    server 192.168.0.1:8000;
    server 192.168.0.1:8001;
  }

  server { # simple load balancing
    listen          80;
    server_name     big.server.com;
    access_log      logs/big.server.access.log main;

    location / {
      proxy_pass      http://big_server_com;
    }
  }
}

# From https://www.nginx.com/resources/wiki/start/topics/examples/full/
@ct-martin ct-martin changed the title Paramters break syntax highlighting; Parameters break syntax highlighting by capturing punctuation Jan 10, 2025
@slevithan
Copy link

slevithan commented Jan 13, 2025

This can be fixed by changing the final group from being a capture group to a lookahead.

Current final group: (?:\s|;|$)

As a lookahead: (?=:\s|;|$) (notice the added =)

Your analysis seems good, but note that your lookahead version should be (?=\s|;|$) (without the : after (?=).

Another option is (?=[\s;]|$). Relying less on backtracking like this is often slightly better.

And a pedantic note: (?: is a non-capturing group, not a capturing group. If by "capture group" you were trying to distinguish both capturing and non-capturing groups from lookaround, there isn't a standard term but I might call them "consumptive groups", i.e. they add characters to the match.

@ct-martin
Copy link
Author

Thanks for catching the typo, I edited the initial comment.

Re: the pedantic note, as far as I can tell they're just called "groups" (e.g. on MDN) but I agree that there could be better terms to describe how the different types of groups overlap with their effects.

@ct-martin ct-martin linked a pull request Jan 14, 2025 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants