ADR-0090: Tree-sitter Grammar and Parser Differential Fuzzing

Status

Implemented

Summary

Add a tree-sitter grammar for Gruel (housed in-tree under tree-sitter-gruel/) targeted at editor / IDE integrations (Zed, Neovim, Helix, GitHub highlighting, etc.). Keep gruel-parser (chumsky-based) as the canonical compiler parser. Guard against drift between the two grammars with a parser_differential fuzz target that asserts both parsers agree on whether a given input is syntactically valid. Add a smoke-fuzz CI job that runs every fuzz target (existing and new) for a short, fixed time budget on every PR, so regressions are caught at PR time rather than at the next nightly fuzz.yml run.

Context

gruel-parser is a chumsky-based parser tightly coupled to the compiler's AST shape, span model, and diagnostic surface. It is well-suited to producing high-quality compiler errors but is not directly usable by editors:

Editors expect either an LSP that supplies syntax tokens, or a tree-sitter grammar.
Tree-sitter grammars unlock highlighting in Zed/Neovim/Helix/Emacs/GitHub with no per-editor plugin work.
Tree-sitter is incremental and error-tolerant, which is what an editor wants while a user is typing.

The standard pattern across production compilers (Rust, Zig, Swift) is to keep a hand-written/canonical parser for the compiler and maintain a tree-sitter grammar separately for editor tooling. The well-known cost is drift: the tree-sitter grammar silently rots whenever the canonical parser learns new syntax. The mitigation, per the Rust project's experience and others, is differential testing: feed the same program to both parsers in CI and assert they agree on acceptance.

Today, fuzz coverage runs only nightly via .github/workflows/fuzz.yml (5 minutes per target). A grammar regression introduced in a PR can sit unnoticed until the next nightly run, and by then a bisect has to cover a full day of merges. A short smoke-fuzz pass on every PR catches the obvious crashes immediately.

Decision

Part 1: Tree-sitter grammar layout

A new top-level directory tree-sitter-gruel/ (sibling to crates/, not a workspace member, so cargo workflows are unaffected):

tree-sitter-gruel/
├── grammar.js                 # Grammar source
├── package.json               # npm metadata (for tree-sitter CLI / editor consumption)
├── src/                       # Generated by `tree-sitter generate` — committed
│   ├── parser.c
│   ├── grammar.json
│   ├── node-types.json
│   └── tree_sitter/parser.h
├── bindings/
│   └── rust/                  # Rust crate that wraps parser.c
│       ├── Cargo.toml
│       ├── build.rs           # cc-builds parser.c
│       └── lib.rs             # exposes `LANGUAGE: tree_sitter::Language`
├── queries/
│   ├── highlights.scm         # Editor highlighting
│   ├── locals.scm             # Scope / local-variable queries
│   ├── indents.scm
│   └── folds.scm
├── test/corpus/               # tree-sitter's native corpus tests
│   ├── lexical.txt
│   ├── items.txt
│   ├── expressions.txt
│   └── ...
└── README.md

Rationale for in-tree:

Keeps grammar and canonical parser in lockstep — the same PR can update both.
The differential fuzzer needs the tree-sitter crate as a path dependency; in-tree avoids the chicken-and-egg of versioning a separate repo while the language is still moving.
We can split out to a standalone tree-sitter-gruel repository (the conventional location editors look for) once syntax stabilizes; nothing in this layout precludes that.

Generated src/ is committed. This means contributors do not need node + tree-sitter-cli to build the differential fuzzer or run CI; those tools are needed only to regenerate the parser after editing grammar.js. The make tree-sitter-generate target encapsulates this.

Part 2: Grammar scope

The grammar must cover all syntax that gruel-parser accepts, structurally — i.e., enough that the differential fuzzer is meaningful rather than trivially "tree-sitter rejects everything past keyword X." Tree shape does not need to match the chumsky AST; tree-sitter produces a CST optimized for tooling queries.

Initial coverage targets:

Lexical: all keywords, operators, literals (int/float/string/char), comments, doc comments (///)
Items: fn, struct, enum, impl, use, @import (ADR-0026), comptime blocks at item level
Statements: let, assignment, expression statements, return
Expressions: literals, binary (with precedence matching the Pratt parser in chumsky_parser.rs), unary, calls, method calls, field access, indexing, struct literals, blocks, if/while/for/match, intrinsic calls (@name(...)), path expressions, comptime { ... }
Types: named, generic params via comptime T: type syntax (no user-facing <T>; see [[project_no_user_generics]]), references via Ref(I) / MutRef(I), arrays

What the grammar can omit:

Constant evaluation (purely a sema concern)
Type inference rules (sema)
Anything that requires resolving symbols across files

Part 3: Rust bindings

tree-sitter-gruel/bindings/rust/ is a standalone cargo crate (not a workspace member; consumed only by fuzz/ and any future tooling) that:

Uses cc in build.rs to compile src/parser.c into a static library.
Exposes a single pub const LANGUAGE for callers to plug into tree_sitter::Parser::set_language; callers convert it to a tree_sitter::Language with .into(), as the snippet below does.

// fuzz/fuzz_targets/parser_differential.rs
use tree_sitter::Parser as TsParser;
use tree_sitter_gruel::LANGUAGE;

let mut ts = TsParser::new();
ts.set_language(&LANGUAGE.into()).unwrap();
let tree = ts.parse(source, None).unwrap();
let ts_accepted = !tree.root_node().has_error();

Part 4: Differential fuzzer

A new fuzz target parser_differential:

#![no_main]

use libfuzzer_sys::fuzz_target;
// Crate name assumed from gruel-fuzz/src/lib.rs.
use gruel_fuzz::MaybeInvalidProgram;
use tree_sitter::Parser as TsParser;
use tree_sitter_gruel::LANGUAGE;

fuzz_target!(|prog: MaybeInvalidProgram| {
    let source = &prog.0;

    // Path A: chumsky parser (a lexer failure counts as a rejection)
    let chumsky_accepted = match gruel_lexer::Lexer::new(source).tokenize() {
        Ok((tokens, interner)) => gruel_parser::Parser::new(tokens, interner).parse().is_ok(),
        Err(_) => false,
    };

    // Path B: tree-sitter parser (any error node in the tree counts as a rejection)
    let mut ts = TsParser::new();
    ts.set_language(&LANGUAGE.into()).unwrap();
    let tree = ts.parse(source.as_bytes(), None).unwrap();
    let ts_accepted = !tree.root_node().has_error();

    assert_eq!(
        chumsky_accepted, ts_accepted,
        "parser disagreement on:\n{}\n(chumsky={}, tree-sitter={})",
        source, chumsky_accepted, ts_accepted,
    );
});

Comparison criterion: acceptance only. Both parsers must agree on whether a program is syntactically valid. Tree shape, error positions, and recovery strategies are explicitly not compared — they will differ legitimately, and forcing parity there is a losing battle.
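
The acceptance-only criterion has four possible outcomes, and the two disagreeing ones point at different fixes. The following is a minimal sketch of that classification, not part of the fuzz target as written: Verdict, classify, and is_disagreement are hypothetical names introduced here for illustration.

```rust
/// Outcome of running one input through both parsers.
/// Hypothetical helper: the ADR itself only specifies the boolean comparison.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Verdict {
    BothAccept,
    BothReject,
    /// chumsky accepts but tree-sitter reports an error: grammar.js is too strict.
    OnlyChumskyAccepts,
    /// tree-sitter accepts but chumsky rejects: grammar.js is too permissive.
    OnlyTreeSitterAccepts,
}

/// Map the two acceptance flags onto a verdict.
fn classify(chumsky_accepted: bool, ts_accepted: bool) -> Verdict {
    match (chumsky_accepted, ts_accepted) {
        (true, true) => Verdict::BothAccept,
        (false, false) => Verdict::BothReject,
        (true, false) => Verdict::OnlyChumskyAccepts,
        (false, true) => Verdict::OnlyTreeSitterAccepts,
    }
}

impl Verdict {
    /// True exactly when the differential assertion would fail.
    fn is_disagreement(self) -> bool {
        matches!(
            self,
            Verdict::OnlyChumskyAccepts | Verdict::OnlyTreeSitterAccepts
        )
    }
}
```

Reporting the verdict in the failure message, rather than only the two booleans, would tell the reader at a glance whether the grammar needs to be loosened or tightened.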

Input sources:

MaybeInvalidProgram from gruel-fuzz/src/lib.rs — biases toward programs near the validity boundary, which is where parsers most often disagree.
GruelProgram (valid) — sanity check that the tree-sitter grammar accepts everything chumsky accepts.
Raw &[u8] via UTF-8 conversion — catches lexical edge cases.

We will likely need all three as separate sub-targets or as alternatives within a single Arbitrary enum.
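
One way the single-enum alternative could look is sketched below. Everything in it is an assumption made for illustration: DiffInput and its methods are hypothetical, plain Strings stand in for MaybeInvalidProgram and GruelProgram, the Arbitrary derive is omitted so the sketch stays dependency-free, and raw bytes that are not valid UTF-8 are skipped rather than lossily converted.

```rust
/// Hypothetical wrapper over the three input sources listed above.
/// In the real fuzz crate the first two payloads would be
/// `MaybeInvalidProgram` and `GruelProgram`; `String` stands in for both here.
enum DiffInput {
    MaybeInvalid(String),
    Valid(String),
    Raw(Vec<u8>),
}

impl DiffInput {
    /// Source text to hand to both parsers.
    /// Returns `None` for raw bytes that are not valid UTF-8 (an assumption:
    /// such inputs are skipped instead of being converted lossily).
    fn source(&self) -> Option<&str> {
        match self {
            DiffInput::MaybeInvalid(s) | DiffInput::Valid(s) => Some(s.as_str()),
            DiffInput::Raw(bytes) => std::str::from_utf8(bytes).ok(),
        }
    }

    /// Generated-valid programs carry a stronger expectation than mere
    /// agreement: both parsers should accept them.
    fn must_be_accepted(&self) -> bool {
        matches!(self, DiffInput::Valid(_))
    }
}
```

A single enum keeps one corpus and one CI entry; separate sub-targets would instead let each input source accumulate its own corpus. The sketch does not settle that trade-off.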

Part 5: Corpus-based differential test (non-fuzz)

In addition to fuzzing, add a deterministic test that runs the differential check over every source = "..." string in crates/gruel-spec/cases/ and crates/gruel-ui-tests/cases/. This:

Runs under make test (no nightly Rust required, unlike fuzzing)
Catches drift immediately on PR
Gives a fixed, reproducible regression suite

Implementation: a new integration test in tree-sitter-gruel/bindings/rust/tests/ (or a small gruel-parser-diff test crate) that walks the TOML cases and runs both parsers.
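
The shape of such a test might look like the sketch below, which uses only the standard library. The function names are hypothetical, the two parsers are passed in as closures so the sketch does not depend on either crate, and extracting the source strings from each TOML file is left out (a real test would presumably use a TOML parser for that).

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect every `.toml` file under `dir`, visiting entries in
/// sorted order so failures are reported deterministically.
fn collect_toml_files(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    let mut entries: Vec<PathBuf> = fs::read_dir(dir)?
        .map(|entry| entry.map(|e| e.path()))
        .collect::<io::Result<_>>()?;
    entries.sort();
    for path in entries {
        if path.is_dir() {
            collect_toml_files(&path, out)?;
        } else if path.extension().map_or(false, |ext| ext == "toml") {
            out.push(path);
        }
    }
    Ok(())
}

/// Run both acceptance checks over each source and return the sources on
/// which the two parsers disagree.
fn disagreements<'a>(
    sources: &[&'a str],
    chumsky_accepts: impl Fn(&str) -> bool,
    tree_sitter_accepts: impl Fn(&str) -> bool,
) -> Vec<&'a str> {
    sources
        .iter()
        .copied()
        .filter(|src| chumsky_accepts(*src) != tree_sitter_accepts(*src))
        .collect()
}
```

Collecting all disagreements before failing, instead of asserting on the first one, gives a single test run that lists every grammar gap in the corpus.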

Part 6: CI smoke fuzz

Add a smoke-fuzz job to .github/workflows/ci.yml (PR-gating), distinct from the existing nightly fuzz.yml:

smoke-fuzz:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Install LLVM 22
      run: ...  # same as other jobs
    - name: Install nightly + cargo-fuzz
      run: |
        rustup toolchain install nightly
        cargo +nightly install cargo-fuzz --locked
    - name: Restore fuzz corpus
      uses: actions/cache/restore@v4
      with:
        path: fuzz/corpus
        key: fuzz-corpus-${{ github.run_id }}
        restore-keys: fuzz-corpus-
    - name: Smoke fuzz each target (60s)
      run: |
        for target in lexer parser compiler structured_compiler structured_invalid comptime_differential parser_differential; do
          cargo +nightly fuzz run "$target" -- -max_total_time=60
        done
    - name: Save corpus
      uses: actions/cache/save@v4
      with:
        path: fuzz/corpus
        key: fuzz-corpus-${{ github.run_id }}

Time budget: 60 seconds per target × 7 targets = 7 minutes of pure fuzzing, plus build / cache overhead. Acceptable for PR CI; not so long that contributors avoid it.

Why on PRs, not just on merge: grammar drift caught on the PR that introduced it is typically a few-line fix; the same drift caught 24 hours later in nightly fuzz is a bisect across N merges.

The nightly fuzz.yml is unchanged — it continues to run each target for 5 minutes (and we may extend per-target time there separately).

Implementation Phases

Phase 1: Tree-sitter scaffolding + lexical grammar
- Create tree-sitter-gruel/ with grammar.js, package.json, README.md
- Define lexical rules: identifiers, keywords, all operator tokens, integer / float / string / char literals, and comments (line //, doc ///, block /* */)
- Add test/corpus/lexical.txt
- Wire up make tree-sitter-generate and commit generated src/
Phase 2: Grammar for items, statements, types
- fn, struct, enum, impl, use, @import
- let, assignment, return, expression statements
- Type expressions (named, Ref(...), MutRef(...), arrays, comptime T: type)
- Item-level comptime { } blocks
- Expand test/corpus/
Phase 3: Grammar for expressions
- Pratt-style precedence matching chumsky_parser.rs
- All operator forms, calls, method calls, field access, indexing
- if/while/for/match, blocks
- Intrinsic calls @name(...)
- Struct literals (with the grammar disambiguation against block-expressions)
- Full test/corpus/expressions.txt
Phase 4: Rust bindings + spec-corpus differential test
- tree-sitter-gruel/bindings/rust/ crate with build.rs and lib.rs
- Integration test that walks crates/gruel-spec/cases/ + crates/gruel-ui-tests/cases/ and asserts acceptance parity
- Wire into make test
- Fix any genuine grammar gaps surfaced by the spec corpus
Phase 5: Differential fuzz target
- New fuzz/fuzz_targets/parser_differential.rs
- Add tree-sitter and tree-sitter-gruel to fuzz/Cargo.toml
- Register the binary, run locally for 5 minutes, fix anything found
- Document the target in CLAUDE.md fuzz section
Phase 6: CI smoke-fuzz job
- Add smoke-fuzz job to .github/workflows/ci.yml
- 60s per target × 7 targets
- Corpus caching across runs
- Make it a required check via repo settings (manual step, noted in PR description)
Phase 7: Editor queries + docs
- queries/highlights.scm, locals.scm, indents.scm, folds.scm
- tree-sitter-gruel/README.md with usage from Zed / Neovim / Helix
- Update CLAUDE.md "Modifying the Language" section with: "If you change syntax, also update tree-sitter-gruel/grammar.js and regenerate"
- Optional: GitHub language detection PR to github-linguist (deferred, out of scope here)

Consequences

Positive

Editors get syntax highlighting and basic structural queries with no per-editor plugin work.
Differential fuzzer + spec-corpus differential test catch grammar drift automatically.
Smoke fuzz on PRs catches regressions at the time of introduction, not 24 hours later.
The tree-sitter grammar is a stepping stone toward an LSP (incremental reparse for free).
Generated src/ committed means most contributors never need to install node.

Negative

Two grammars to maintain. Mitigated by the differential infrastructure, but adding new syntax now requires touching grammar.js as well. Documented in CLAUDE.md.
CI time increases by ~8–10 minutes for the smoke-fuzz job. Acceptable given the value of catching regressions early.
node + tree-sitter-cli required to regenerate the parser. Not required for builds or CI fuzzing; only for grammar edits.
Acceptance-only differential will miss some real bugs (e.g., chumsky and tree-sitter parse the same program but produce wildly different structures). This is a known limitation and matches industry practice — tree comparison is impractical between a compiler AST and a tree-sitter CST.
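
To make the last limitation concrete, the sketch below shows two structurally different parses of the same token sequence, 1 + 2 * 3. It is a toy expression tree written for this illustration, not Gruel's AST or the tree-sitter CST, and it does not claim anything about Gruel's actual precedence table. Both groupings are well-formed trees, so an acceptance-only check would report agreement even if the two parsers disagreed on which one to build.

```rust
/// Minimal expression tree, purely illustrative.
#[derive(Debug, PartialEq)]
enum Expr {
    Int(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Evaluate a tree, to show that the grouping changes the meaning.
fn eval(e: &Expr) -> i64 {
    match e {
        Expr::Int(n) => *n,
        Expr::Add(a, b) => eval(a) + eval(b),
        Expr::Mul(a, b) => eval(a) * eval(b),
    }
}

/// `1 + 2 * 3` parsed with `*` binding tighter than `+`.
fn mul_binds_tighter() -> Expr {
    Expr::Add(
        Box::new(Expr::Int(1)),
        Box::new(Expr::Mul(Box::new(Expr::Int(2)), Box::new(Expr::Int(3)))),
    )
}

/// The same tokens grouped as `(1 + 2) * 3`, as a grammar with the
/// precedence levels swapped would produce.
fn add_binds_tighter() -> Expr {
    Expr::Mul(
        Box::new(Expr::Add(Box::new(Expr::Int(1)), Box::new(Expr::Int(2)))),
        Box::new(Expr::Int(3)),
    )
}
```

For editor tooling a mismatch like this mostly affects structural queries rather than highlighting, which is part of why the acceptance-only trade-off is tolerable here.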

Neutral

Tree-sitter version pinning: we'll target a specific tree-sitter runtime version (likely 0.24+). Bumping it is a deliberate ADR-worthy decision down the line.
In-tree vs separate repo: starts in-tree, can be split out when the language stabilizes. Editors that auto-discover grammars by repo name (tree-sitter-<lang>) won't find ours until then; this is fine for early-stage adoption.

Open Questions

External scanner needed? Some constructs (string interpolation, raw strings, indentation-sensitive blocks) require tree-sitter's external scanner (scanner.c). Gruel currently has none of these — but worth a check during Phase 1.
Should the smoke-fuzz job be required or advisory? Recommendation: required, but with a clear runbook for "the fuzzer found something" to avoid blocking unrelated PRs on flaky finds. Decision deferred to Phase 6.
tree-sitter-gruel/bindings/rust outside the workspace — this avoids dragging tree-sitter into every cargo build. Need to verify fuzz/Cargo.toml can path-depend on a non-workspace crate cleanly. Expected to work; verifying in Phase 5.
GitHub Linguist registration to get GitHub-native syntax highlighting — deferred to future work since it requires a stable grammar and external policy compliance.

Future Work

LSP server for Gruel building on tree-sitter-gruel for incremental reparse. Out of scope here; this ADR is the foundation.
Split tree-sitter-gruel to its own repository once syntax stabilizes (so editors can discover it conventionally).
Publish to crates.io + npm for editor consumption.
Tree-shape differential (not just acceptance) — only worth pursuing if a normalized common form proves tractable.
Extend nightly fuzz.yml to run longer (e.g., 30 min per target) now that PR smoke-fuzz handles the regression-detection role.

References

ADR-0018: Tracing Infrastructure — pattern for adding cross-cutting tooling
ADR-0019: Performance Dashboard — pattern for adding a CI-integrated tooling feature
ADR-0093: gruel fmt source formatter — also walks the chumsky AST; the formatter and the differential here both depend on the same parser output, so syntax churn that breaks one tends to surface in the other.
tree-sitter documentation
rustc / rust-analyzer parser duality — Rust's approach to two-parser maintenance
Zig's std.zig.Ast shared between compiler and ZLS — counterpoint approach (unified parser); we choose differently because chumsky's AST is too compiler-specific to share