ADR-0090: Tree-sitter Grammar and Parser Differential Fuzzing
Status
Implemented
Summary
Add a tree-sitter grammar for Gruel (housed in-tree under tree-sitter-gruel/) targeted at editor / IDE integrations (Zed, Neovim, Helix, GitHub highlighting, etc.). Keep gruel-parser (chumsky-based) as the canonical compiler parser. Guard against drift between the two grammars with a parser_differential fuzz target that asserts both parsers agree on whether a given input is syntactically valid. Add a smoke-fuzz CI job that runs every fuzz target (existing + new) for a short, fixed time budget on every PR so regressions are caught at PR time rather than the next nightly fuzz.yml run.
Context
gruel-parser is a chumsky-based parser tightly coupled to the compiler's AST shape, span model, and diagnostic surface. It is well-suited to producing high-quality compiler errors but is not directly usable by editors:
- Editors expect either an LSP that supplies syntax tokens, or a tree-sitter grammar.
- Tree-sitter grammars unlock highlighting in Zed/Neovim/Helix/Emacs/GitHub with no per-editor plugin work.
- Tree-sitter is incremental and error-tolerant, which is what an editor wants while a user is typing.
The standard pattern across production compilers (Rust, Zig, Swift) is to keep a hand-written/canonical parser for the compiler and maintain a tree-sitter grammar separately for editor tooling. The well-known cost is drift: the tree-sitter grammar silently rots whenever the canonical parser learns new syntax. The mitigation, per the Rust project's experience and others, is differential testing: feed the same program to both parsers in CI and assert they agree on acceptance.
Today, fuzz coverage runs only nightly via .github/workflows/fuzz.yml (5 minutes per target). A grammar regression introduced in a PR can sit unnoticed until the next nightly run, and by then the bisect target is a full day of merges. A short smoke-fuzz pass on every PR catches the obvious crashes immediately.
Decision
Part 1: Tree-sitter grammar layout
A new top-level directory tree-sitter-gruel/ (sibling to crates/, not a workspace member, so cargo workflows are unaffected):
tree-sitter-gruel/
├── grammar.js              # Grammar source
├── package.json            # npm metadata (for tree-sitter CLI / editor consumption)
├── src/                    # Generated by `tree-sitter generate` (committed)
│   ├── parser.c
│   ├── grammar.json
│   ├── node-types.json
│   └── tree_sitter/parser.h
├── bindings/
│   └── rust/               # Rust crate that wraps parser.c
│       ├── Cargo.toml
│       ├── build.rs        # cc-builds parser.c
│       └── lib.rs          # exposes `LANGUAGE: tree_sitter::Language`
├── queries/
│   ├── highlights.scm      # Editor highlighting
│   ├── locals.scm          # Scope / local-variable queries
│   ├── indents.scm
│   └── folds.scm
├── test/corpus/            # tree-sitter's native corpus tests
│   ├── lexical.txt
│   ├── items.txt
│   ├── expressions.txt
│   └── ...
└── README.md
Rationale for in-tree:
- Keeps grammar and canonical parser in lockstep — the same PR can update both.
- The differential fuzzer needs the tree-sitter crate as a path dependency; in-tree avoids the chicken-and-egg of versioning a separate repo while the language is still moving.
- We can split out to a standalone tree-sitter-gruel repository (the conventional location editors look for) once syntax stabilizes; nothing in this layout precludes that.
Generated src/ is committed. This means contributors do not need node + tree-sitter-cli to build the differential fuzzer or run CI — only to regenerate after editing grammar.js. A make tree-sitter-generate Make target encapsulates this.
Part 2: Grammar scope
The grammar must cover all syntax that gruel-parser accepts, structurally — i.e., enough that the differential fuzzer is meaningful rather than trivially "tree-sitter rejects everything past keyword X." Tree shape does not need to match the chumsky AST; tree-sitter produces a CST optimized for tooling queries.
Initial coverage targets:
- Lexical: all keywords, operators, literals (int/float/string/char), comments, doc comments (///)
- Items: fn, struct, enum, impl, use, @import (ADR-0026), comptime blocks at item level
- Statements: let, assignment, expression statements, return
- Expressions: literals, binary (with precedence matching the Pratt parser in chumsky_parser.rs), unary, calls, method calls, field access, indexing, struct literals, blocks, if / while / for / match, intrinsic calls (@name(...)), path expressions, comptime { ... }
- Types: named, generic params via comptime T: type syntax (ADR per [[project_no_user_generics]] memory: no user-facing <T>), references via Ref(I) / MutRef(I), arrays
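To make "precedence matching the Pratt parser" concrete, the following is a minimal, self-contained sketch of Pratt-style parsing over a toy arithmetic language. The binding powers and the function names (binding_power, parse_expr, tokenize, parse) are illustrative assumptions, not Gruel's real operator table or the contents of chumsky_parser.rs. The tree-sitter grammar would need to encode the same relative ordering and associativity, presumably through its prec.left levels, for the two parsers to agree.

```rust
/// Hypothetical binding powers, NOT Gruel's real table: higher binds tighter,
/// and left < right makes the operator left-associative.
fn binding_power(op: char) -> Option<(u8, u8)> {
    match op {
        '+' | '-' => Some((1, 2)),
        '*' | '/' => Some((3, 4)),
        _ => None,
    }
}

/// Parse `tokens[*pos..]` into an S-expression string, consuming only
/// operators whose left binding power is at least `min_bp`.
fn parse_expr(tokens: &[String], pos: &mut usize, min_bp: u8) -> Option<String> {
    let first = tokens.get(*pos)?;
    *pos += 1;
    let mut lhs = if first.as_str() == "(" {
        let inner = parse_expr(tokens, pos, 0)?;
        if tokens.get(*pos)?.as_str() != ")" {
            return None; // unclosed parenthesis
        }
        *pos += 1;
        inner
    } else if first.chars().all(|c| c.is_ascii_digit()) {
        first.clone()
    } else {
        return None; // an operator or ")" where an operand was expected
    };
    while let Some(tok) = tokens.get(*pos) {
        let Some(op) = tok.chars().next() else { break };
        let Some((left_bp, right_bp)) = binding_power(op) else { break };
        if left_bp < min_bp {
            break; // the operator belongs to an enclosing call
        }
        *pos += 1;
        let rhs = parse_expr(tokens, pos, right_bp)?;
        lhs = format!("({op} {lhs} {rhs})");
    }
    Some(lhs)
}

/// Split the input into integer, operator, and parenthesis tokens.
fn tokenize(src: &str) -> Option<Vec<String>> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c.is_ascii_digit() {
            let mut number = String::new();
            while let Some(&d) = chars.peek() {
                if !d.is_ascii_digit() {
                    break;
                }
                number.push(d);
                chars.next();
            }
            tokens.push(number);
        } else if "+-*/()".contains(c) {
            tokens.push(c.to_string());
            chars.next();
        } else {
            return None; // character outside the toy language
        }
    }
    Some(tokens)
}

/// Accept the input only if the whole token stream forms one expression.
fn parse(src: &str) -> Option<String> {
    let tokens = tokenize(src)?;
    let mut pos = 0;
    let expr = parse_expr(&tokens, &mut pos, 0)?;
    (pos == tokens.len()).then_some(expr)
}
```

With this table, parse("1 + 2 * 3") groups the multiplication first and parse("1 - 2 - 3") groups to the left. A grammar that disagreed on either point would still accept both inputs, which is why precedence is pinned by corpus tests rather than by the acceptance-only fuzzer.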
What the grammar can omit:
- Constant evaluation (purely a sema concern)
- Type inference rules (sema)
- Anything that requires resolving symbols across files
Part 3: Rust bindings
tree-sitter-gruel/bindings/rust/ is a standalone cargo crate (not a workspace member; consumed only by fuzz/ and any future tooling) that:
- Uses cc in build.rs to compile src/parser.c into a static library.
- Exposes a single pub const LANGUAGE: tree_sitter::Language for callers to plug into tree_sitter::Parser::set_language.
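// fuzz/fuzz_targets/parser_differential.rs
use tree_sitter::Parser as TsParser;
use tree_sitter_gruel::LANGUAGE;

// `source` is the program text under test.
let mut ts = TsParser::new();
ts.set_language(&LANGUAGE).unwrap();
let tree = ts.parse(source, None).unwrap();
// Accepted only if no syntax errors were recorded anywhere in the tree.
let ts_accepted = !tree.root_node().has_error();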
Part 4: Differential fuzzer
A new fuzz target parser_differential:
fuzz_target!;
Comparison criterion: acceptance only. Both parsers must agree on whether a program is syntactically valid. Tree shape, error positions, and recovery strategies are explicitly not compared — they will differ legitimately, and forcing parity there is a losing battle.
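The acceptance-only criterion can be sketched as follows. This is an illustration under stated assumptions, not the actual fuzz target: Verdict, compare_acceptance, accepts_by_counter, and accepts_with_drift are hypothetical names, and the two acceptors only check brace balance as stand-ins for gruel-parser and the tree-sitter grammar.

```rust
/// Outcome of running one input through both parsers.
#[derive(Debug, PartialEq)]
enum Verdict {
    /// Both parsers accepted, or both rejected, the input.
    Agree { accepted: bool },
    /// The parsers disagree; this is what the fuzzer should report.
    Disagree { canonical: bool, tree_sitter: bool },
}

/// Compare acceptance only. Tree shape, error positions, and recovery
/// are deliberately ignored.
fn compare_acceptance<C, T>(source: &str, canonical: C, tree_sitter: T) -> Verdict
where
    C: Fn(&str) -> bool,
    T: Fn(&str) -> bool,
{
    let c = canonical(source);
    let t = tree_sitter(source);
    if c == t {
        Verdict::Agree { accepted: c }
    } else {
        Verdict::Disagree { canonical: c, tree_sitter: t }
    }
}

/// Stand-in "canonical" acceptor: braces must balance, tracked by depth.
fn accepts_by_counter(source: &str) -> bool {
    let mut depth: i64 = 0;
    for ch in source.chars() {
        match ch {
            '{' => depth += 1,
            '}' => {
                depth -= 1;
                if depth < 0 {
                    return false; // closing brace with no opener
                }
            }
            _ => {}
        }
    }
    depth == 0
}

/// Stand-in "tree-sitter" acceptor with a deliberate drift bug:
/// it only counts braces, so it accepts a closer that precedes its opener.
fn accepts_with_drift(source: &str) -> bool {
    let opens = source.chars().filter(|&c| c == '{').count();
    let closes = source.chars().filter(|&c| c == '}').count();
    opens == closes
}
```

The two acceptors agree on "{ {} }" and on "{", but disagree on "}{". In a real fuzz target the Disagree case would presumably become a panic so that the fuzzer records the offending input.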
Input sources:
- MaybeInvalidProgram from gruel-fuzz/src/lib.rs: biases toward programs near the validity boundary, which is where parsers most often disagree.
- GruelProgram (valid): sanity check that the tree-sitter grammar accepts everything chumsky accepts.
- Raw &[u8] via UTF-8 conversion: catches lexical edge cases.
We will likely need all three as separate sub-targets or as alternatives within a single Arbitrary enum.
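One possible shape for the single-enum alternative is sketched below. Everything here is an assumption for illustration: DiffInput and its methods are hypothetical, the two wrapper structs merely stand in for the real generators in gruel-fuzz/src/lib.rs, and a real target would presumably derive Arbitrary rather than construct values by hand. The sketch also assumes strict UTF-8 conversion, so invalid byte sequences are skipped rather than lossily repaired.

```rust
/// Hypothetical stand-ins for the generators in gruel-fuzz/src/lib.rs.
/// Here each one simply wraps already-rendered source text.
struct GruelProgram(String);
struct MaybeInvalidProgram(String);

/// One fuzz input drawn from any of the three sources.
enum DiffInput {
    Valid(GruelProgram),
    NearBoundary(MaybeInvalidProgram),
    Raw(Vec<u8>),
}

impl DiffInput {
    /// Render the input as source text for both parsers.
    /// Raw bytes that are not valid UTF-8 yield `None` and are skipped.
    fn to_source(&self) -> Option<String> {
        match self {
            DiffInput::Valid(p) => Some(p.0.clone()),
            DiffInput::NearBoundary(p) => Some(p.0.clone()),
            DiffInput::Raw(bytes) => std::str::from_utf8(bytes).ok().map(str::to_owned),
        }
    }

    /// Only the known-valid generator carries an expectation beyond
    /// "both parsers agree": both parsers should accept its output.
    fn must_be_accepted(&self) -> bool {
        matches!(self, DiffInput::Valid(_))
    }
}
```

Keeping the variant around until after parsing lets the target apply the stronger check for GruelProgram inputs while still using plain agreement for the other two sources.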
Part 5: Corpus-based differential test (non-fuzz)
In addition to fuzzing, add a deterministic test that runs the differential check over every source = "..." string in crates/gruel-spec/cases/ and crates/gruel-ui-tests/cases/. This:
- Runs under make test (no nightly Rust required, unlike fuzzing)
- Catches drift immediately on PR
- Gives a fixed, reproducible regression suite
Implementation: a new integration test in tree-sitter-gruel/bindings/rust/tests/ (or a small gruel-parser-diff test crate) that walks the TOML cases and runs both parsers.
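The core of such a test might look like the sketch below. It is deliberately simplified and rests on assumptions: extract_sources and disagreements are hypothetical helpers, the scanner handles only "..." on one line and """...""" blocks without processing escape sequences, and a real test would presumably use a proper TOML parser and the two actual parsers instead of closures.

```rust
/// Collect the values of every `source = ...` key in a TOML case file.
/// Simplified on purpose: handles `"..."` on one line and `"""..."""`
/// blocks, and does not process escape sequences.
fn extract_sources(toml_text: &str) -> Vec<String> {
    let mut sources = Vec::new();
    let mut lines = toml_text.lines();
    while let Some(line) = lines.next() {
        // Skip lines that are not exactly a `source = ...` assignment.
        let Some(rest) = line.trim_start().strip_prefix("source") else { continue };
        let Some(value) = rest.trim_start().strip_prefix('=') else { continue };
        let value = value.trim();
        if let Some(first) = value.strip_prefix("\"\"\"") {
            // Opening and closing delimiters on the same line.
            if let Some(end) = first.find("\"\"\"") {
                sources.push(first[..end].to_string());
                continue;
            }
            // Multi-line string: gather lines until the closing delimiter.
            let mut body = String::new();
            if !first.is_empty() {
                body.push_str(first);
                body.push('\n');
            }
            for next in lines.by_ref() {
                if let Some(end) = next.find("\"\"\"") {
                    body.push_str(&next[..end]);
                    break;
                }
                body.push_str(next);
                body.push('\n');
            }
            sources.push(body);
        } else if let Some(inner) = value.strip_prefix('"').and_then(|v| v.strip_suffix('"')) {
            sources.push(inner.to_string());
        }
    }
    sources
}

/// Return every extracted source on which the two acceptors disagree.
fn disagreements<'a>(
    sources: &'a [String],
    canonical: impl Fn(&str) -> bool,
    tree_sitter: impl Fn(&str) -> bool,
) -> Vec<&'a str> {
    sources
        .iter()
        .map(String::as_str)
        .filter(|s| canonical(*s) != tree_sitter(*s))
        .collect()
}
```

The test body would then walk the case directories, call extract_sources on each file, and assert that disagreements returns an empty list, printing the offending sources otherwise.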
Part 6: CI smoke fuzz
Add a smoke-fuzz job to .github/workflows/ci.yml (PR-gating), distinct from the existing nightly fuzz.yml:
smoke-fuzz:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Install LLVM 22
      run: ... # same as other jobs
    - name: Install nightly + cargo-fuzz
      run: |
        rustup toolchain install nightly
        cargo +nightly install cargo-fuzz --locked
    - name: Restore fuzz corpus
      uses: actions/cache/restore@v4
      with:
        path: fuzz/corpus
        key: fuzz-corpus-${{ github.run_id }}
        restore-keys: fuzz-corpus-
    - name: Smoke fuzz each target (60s)
      run: |
        for target in lexer parser compiler structured_compiler structured_invalid comptime_differential parser_differential; do
          cargo +nightly fuzz run "$target" -- -max_total_time=60
        done
    - name: Save corpus
      uses: actions/cache/save@v4
      with:
        path: fuzz/corpus
        key: fuzz-corpus-${{ github.run_id }}
Time budget: 60 seconds per target × 7 targets = 7 minutes of pure fuzzing, plus build / cache overhead. Acceptable for PR CI; not so long that contributors avoid it.
Why on PRs, not just merge: A grammar drift caught at PR is a 5-line fix; the same drift caught 24 hours later in nightly fuzz is a bisect across N merges.
The nightly fuzz.yml is unchanged — it continues to run each target for 5 minutes (and we may extend per-target time there separately).
Implementation Phases
Phase 1: Tree-sitter scaffolding + lexical grammar
- Create tree-sitter-gruel/ with grammar.js, package.json, README.md
- Define lexical rules: identifiers, keywords, all operator tokens, integer / float / string / char literals, and //, ///, and /* */ comments
- Add test/corpus/lexical.txt
- Wire up make tree-sitter-generate and commit generated src/
Phase 2: Grammar for items, statements, types
- fn, struct, enum, impl, use, @import
- let, assignment, return, expression statements
- Type expressions (named, Ref(...), MutRef(...), arrays, comptime T: type)
- Item-level comptime { } blocks
- Expand test/corpus/
Phase 3: Grammar for expressions
- Pratt-style precedence matching chumsky_parser.rs
- All operator forms, calls, method calls, field access, indexing
- if / while / for / match, blocks
- Intrinsic calls @name(...)
- Struct literals (with the grammar disambiguation against block-expressions)
- Full test/corpus/expressions.txt
Phase 4: Rust bindings + spec-corpus differential test
- tree-sitter-gruel/bindings/rust/ crate with build.rs and lib.rs
- Integration test that walks crates/gruel-spec/cases/ + crates/gruel-ui-tests/cases/ and asserts acceptance parity
- Wire into make test
- Fix any genuine grammar gaps surfaced by the spec corpus
Phase 5: Differential fuzz target
- New fuzz/fuzz_targets/parser_differential.rs
- Add tree-sitter and tree-sitter-gruel to fuzz/Cargo.toml
- Register the binary, run locally for 5 minutes, fix anything found
- Document the target in CLAUDE.md fuzz section
Phase 6: CI smoke-fuzz job
- Add smoke-fuzz job to .github/workflows/ci.yml
- 60s per target × 7 targets
- Corpus caching across runs
- Make it a required check via repo settings (manual step, noted in PR description)
Phase 7: Editor queries + docs
- queries/highlights.scm, locals.scm, indents.scm, folds.scm
- tree-sitter-gruel/README.md with usage from Zed / Neovim / Helix
- Update CLAUDE.md "Modifying the Language" section with: "If you change syntax, also update tree-sitter-gruel/grammar.js and regenerate"
- Optional: GitHub language detection PR to github-linguist (deferred, out of scope here)
Consequences
Positive
- Editors get syntax highlighting and basic structural queries with no per-editor plugin work.
- Differential fuzzer + spec-corpus differential test catch grammar drift automatically.
- Smoke fuzz on PRs catches regressions at the time of introduction, not 24 hours later.
- The tree-sitter grammar is a stepping stone toward an LSP (incremental reparse for free).
- Generated src/ committed means most contributors never need to install node.
Negative
- Two grammars to maintain. Mitigated by the differential infrastructure, but adding new syntax now requires touching grammar.js as well. Documented in CLAUDE.md.
- CI time increases by ~8–10 minutes for the smoke-fuzz job. Acceptable given the value of catching regressions early.
- node + tree-sitter-cli required to regenerate the parser. Not required for builds or CI fuzzing; only for grammar edits.
- Acceptance-only differential will miss some real bugs (e.g., chumsky and tree-sitter parse the same program but produce wildly different structures). This is a known limitation and matches industry practice: tree comparison is impractical between a compiler AST and a tree-sitter CST.
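The acceptance-only blind spot can be illustrated with two toy "parsers" that accept exactly the same inputs but group a subtraction chain differently. The functions below (operands, eval_left, eval_right) are hypothetical stand-ins and have nothing to do with the real chumsky or tree-sitter parsers; evaluating the chain is used here only as a cheap way to make the structural difference visible.

```rust
/// Parse a chain like "10 - 4 - 3" into its operands; `None` means rejected.
fn operands(src: &str) -> Option<Vec<i64>> {
    src.split('-')
        .map(|part| part.trim().parse::<i64>().ok())
        .collect()
}

/// Stand-in parser A: groups to the left, ((10 - 4) - 3).
fn eval_left(src: &str) -> Option<i64> {
    let nums = operands(src)?;
    let (first, rest) = nums.split_first()?;
    Some(rest.iter().fold(*first, |acc, n| acc - n))
}

/// Stand-in parser B: groups to the right, (10 - (4 - 3)).
fn eval_right(src: &str) -> Option<i64> {
    let nums = operands(src)?;
    let (last, rest) = nums.split_last()?;
    Some(rest.iter().rev().fold(*last, |acc, n| n - acc))
}
```

Both functions return Some for "10 - 4 - 3" and None for "10 - x", so an acceptance-only check reports agreement, yet the left grouping yields 3 and the right grouping yields 9. Differences of this kind have to be caught by the tree-sitter corpus tests instead.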
Neutral
- Tree-sitter version pinning: we'll target a specific tree-sitter runtime version (likely 0.24+). Bumping it is a deliberate ADR-worthy decision down the line.
- In-tree vs separate repo: starts in-tree, can be split out when the language stabilizes. Editors that auto-discover grammars by repo name (tree-sitter-<lang>) won't find ours until then; this is fine for early-stage adoption.
Open Questions
- External scanner needed? Some constructs (string interpolation, raw strings, indentation-sensitive blocks) require tree-sitter's external scanner (scanner.c). Gruel currently has none of these, but it is worth a check during Phase 1.
- Should the smoke-fuzz job be required or advisory? Recommendation: required, but with a clear runbook for "the fuzzer found something" to avoid blocking unrelated PRs on flaky finds. Decision deferred to Phase 6.
- tree-sitter-gruel/bindings/rust outside the workspace: this avoids dragging tree-sitter into every cargo build. Need to verify fuzz/Cargo.toml can path-depend on a non-workspace crate cleanly. Expected to work; verifying in Phase 5.
- GitHub Linguist registration to get GitHub-native syntax highlighting: deferred to future work since it requires a stable grammar and external policy compliance.
Future Work
- LSP server for Gruel building on tree-sitter-gruel for incremental reparse. Out of scope here; this ADR is the foundation.
- Split tree-sitter-gruel to its own repository once syntax stabilizes (so editors can discover it conventionally).
- Publish to crates.io + npm for editor consumption.
- Tree-shape differential (not just acceptance): only worth pursuing if a normalized common form proves tractable.
- Extend nightly fuzz.yml to run longer (e.g., 30 min per target) now that PR smoke-fuzz handles the regression-detection role.
References
- ADR-0018: Tracing Infrastructure — pattern for adding cross-cutting tooling
- ADR-0019: Performance Dashboard — pattern for adding a CI-integrated tooling feature
- ADR-0093: gruel fmt source formatter. It also walks the chumsky AST; the formatter and the differential here both depend on the same parser output, so syntax churn that breaks one tends to surface in the other.
- tree-sitter documentation
- rustc / rust-analyzer parser duality — Rust's approach to two-parser maintenance
- Zig's std.zig.Ast shared between compiler and ZLS — counterpoint approach (unified parser); we choose differently because chumsky's AST is too compiler-specific to share