|
| 1 | +# AGENTS.md |
| 2 | + |
| 3 | +This file provides guidance to AI/LLM agents working with code in this repository. |
| 4 | + |
| 5 | +## Build and Test Commands |
| 6 | + |
| 7 | +```bash |
| 8 | +# Generate parser from grammar (required after grammar.js changes) |
| 9 | +npx tree-sitter generate |
| 10 | + |
| 11 | +# Run all corpus tests |
| 12 | +npx tree-sitter test |
| 13 | + |
| 14 | +# Run tests for a specific grammar rule (filter by name) |
| 15 | +npx tree-sitter test --include "switch" |
| 16 | + |
| 17 | +# Parse a file to see syntax tree (useful for debugging) |
| 18 | +npm run parse path/to/file.res |
| 19 | + |
| 20 | +# Compare with ReScript compiler's AST (useful for debugging grammar issues) |
| 21 | +npx bsc -dparsetree -only-parse -ignore-parse-errors path/to/file.res |
| 22 | + |
| 23 | +# Launch interactive playground (builds WASM first) |
| 24 | +npm start |
| 25 | + |
| 26 | +# Install npm dependencies (run this first; the commands above rely on them) |
| 27 | +npm install |
| 28 | +``` |
| 29 | + |
| 30 | +## Architecture |
| 31 | + |
| 32 | +### Grammar Definition |
| 33 | + |
| 34 | +**grammar.js** - The main grammar definition using Tree-sitter's JavaScript DSL. Key sections: |
| 35 | +- `externals` - Tokens handled by the custom scanner (newlines, comments, template strings, decorators, parentheses) |
| 36 | +- `precedences` - Operator precedence rules for expressions and module paths |
| 37 | +- `conflicts` - Shift/reduce conflict resolutions for ambiguous constructs |
| 38 | +- `rules` - All grammar rules (80+, covering statements, expressions, types, modules, and JSX) |
| 39 | + |
| 40 | +### External Scanner |
| 41 | + |
| 42 | +**src/scanner.c** - Custom C scanner for context-sensitive tokens that the grammar alone cannot handle: |
| 43 | +- Significant vs insignificant newlines (statement-ending vs formatting) |
| 44 | +- Nested multiline comments (`/* /* */ */`) |
| 45 | +- Template string interpolation (`` `hello ${name}` ``) |
| 46 | +- Parenthesis nesting tracking (affects newline significance) |
| 47 | +- `list{` and `dict{` constructor detection |
| 48 | +- Decorator parsing (`@decorator` vs `@decorator(args)`) |
| 49 | + |
| 50 | +The scanner maintains state (`ScannerState`) tracking parenthesis nesting depth, whether inside quotes/backticks, and EOF reporting. |
| 51 | + |
| 52 | +### Generated Files (do not edit manually) |
| 53 | + |
| 54 | +- `src/parser.c` - Generated LR parser from grammar.json |
| 55 | +- `src/grammar.json` - Intermediate grammar representation |
| 56 | +- `src/node-types.json` - AST node type definitions |
| 57 | + |
| 58 | +### Query Files |
| 59 | + |
| 60 | +**queries/** - Tree-sitter query files in S-expression syntax: |
| 61 | +- `highlights.scm` - Syntax highlighting rules mapping AST nodes to semantic scopes |
| 62 | +- `injections.scm` - Language injection boundaries |
| 63 | +- `locals.scm` - Variable scope definitions |
| 64 | +- `textobjects.scm` - Editor text object definitions |
| 65 | + |
| 66 | +### Test Corpus |
| 67 | + |
| 68 | +**test/corpus/*.txt** - Test cases in Tree-sitter's corpus format. Each test has a name, ReScript code, and expected parse tree. Files cover: comments, decorators, expressions, functions, JSX, let bindings, literals, modules, type declarations. |
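| | + |
| | +As a rough sketch of that format, a single corpus entry might look like the block below. The test name and the node names in the expected tree (`let_declaration`, `let_binding`, `value_identifier`, `number`) are illustrative assumptions, not copied from this repository; check them against the output of `npm run parse` before committing a test. |
| | + |
| | +```text |
| | +================================================================================ |
| | +Let binding with a number literal |
| | +================================================================================ |
| | + |
| | +let x = 1 |
| | + |
| | +-------------------------------------------------------------------------------- |
| | + |
| | +(source_file |
| | +  (let_declaration |
| | +    (let_binding |
| | +      pattern: (value_identifier) |
| | +      body: (number)))) |
| | +``` |
| | + |
| | +The `=` lines delimit the test name, and the `-` line separates the ReScript input from the expected S-expression tree. |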
| 69 | + |
| 70 | +## Development Workflow |
| 71 | + |
| 72 | +1. Modify `grammar.js` to add/fix grammar rules |
| 73 | +2. Run `npx tree-sitter generate` to regenerate the parser |
| 74 | +3. Add test cases in `test/corpus/*.txt` with expected parse trees |
| 75 | +4. Run `npx tree-sitter test` to verify |
| 76 | +5. For scanner changes, edit `src/scanner.c` and rebuild |
| 77 | + |
| 78 | +## Language Bindings |
| 79 | + |
| 80 | +Bindings are provided for Node.js, Python, Rust, Go, Swift, and C in the `bindings/` directory. CI tests Rust, Python, and Go bindings automatically. |