ADR-0024 Revised: Type Intern Pool - Simplified Migration
Status
Implemented (2026-01-02) - Phases 1-4 complete. Type is now a u32 newtype with O(1) equality.
Executive Summary
After multiple failed migration attempts, we discovered that the pool is already the primary lookup mechanism for struct/enum definitions. The struct_defs and enum_defs Vecs are legacy artifacts carried around for "backwards compatibility" but aren't used in the main codepath.
This revised approach:
- Removes the Vec duplication (Phase 2A) - simple cleanup
- Keeps the Type enum unchanged - no pattern match migration needed
- Migrates arrays to the pool (Phase 2B) - the real value
- Defers the Type → TypeId rename until generics needs it (Phase 4, optional)
Context: Why Previous Attempts Failed
Original ADR-0024 Phase 4
The original plan required:
- Renaming InternedType → Type globally (~675 usages)
- Updating all pattern matches simultaneously
- Massive compiler errors eating all context (600+ errors)
Incremental Migration Approach
A previous incremental approach added a Type::Interned(TypeId) variant, but:
- Created dual representations that both needed handling
- Every pattern match needed Type::Interned(_) => panic!(...) or .normalize()
- The migration stalled with 9+ locations needing manual updates
Key Discovery: Pool Already Primary
Analysis revealed that SemaContext.get_struct_def() already uses the pool:
// sema_context.rs:370-372
The only code using the Vec directly:
- TypeContext.get_struct_def() - legacy, limited use
- Test assertions checking output.struct_defs.len()
- Logging for struct_count
This means 90%+ of struct/enum lookups already use the pool.
Revised Approach
Guiding Principles
- Keep Type enum unchanged - pattern matching works, don't break it
- Remove duplication first - the Vecs are pure overhead
- Pool is canonical - all lookups go through pool
- Defer rename to Phase 4 - only if generics specialization needs it
What Changes
| Component | Current | After Phase 2A | After Phase 2B |
|---|---|---|---|
| Sema.struct_defs | Vec<StructDef> | Removed | Removed |
| Sema.enum_defs | Vec<EnumDef> | Removed | Removed |
| TypeContext.struct_defs | Vec<StructDef> | Removed | Removed |
| SemaContext.struct_defs | Vec<StructDef> (unused) | Removed | Removed |
| SemaOutput.struct_defs | Vec<StructDef> | Pool ref | Pool ref |
| ArrayTypeRegistry | Separate | Separate | Pool |
| Type enum | 15 variants | Unchanged | Unchanged |
| Pattern matches | ~215 locations | Unchanged | Unchanged |
What Stays the Same
- Type::I32, Type::Struct(StructId) - unchanged
- All pattern matches on Type - unchanged
- StructId, EnumId newtypes - unchanged (they wrap pool indices)
- ArrayTypeId - unchanged until Phase 2B
Implementation Phases
Phase 1: Infrastructure ✅ (Already Complete)
The pool infrastructure exists and is populated:
- TypeInternPool in intern_pool.rs
- type_pool.struct_def(id) works
- SemaContext uses pool for lookups
Phase 2A: Remove Vec Duplication (NEW - Easy)
Goal: Single source of truth for struct/enum definitions.
Changes:
- Remove struct_defs: Vec<StructDef> from Sema, TypeContext, SemaContext
- Remove enum_defs: Vec<EnumDef> from same
- Update SemaOutput to provide pool access instead of Vecs
- Update tests to use type_pool.struct_count() instead of output.struct_defs.len()
- Update logging to use pool stats
Files affected: ~8-10 files, mostly deletions
Ship criterion: All tests pass, no struct_defs or enum_defs Vecs anywhere.
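As an illustration of the end state, here is a minimal sketch of a pool that is the only owner of struct definitions. The names struct_def and struct_count come from this ADR, but their signatures, the StructDef fields, and the register_struct helper are assumptions made for the example, not the actual gruel-air API.

```rust
/// Simplified stand-in for the real struct definition (fields are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct StructDef {
    pub name: String,
    pub field_count: usize,
}

/// Newtype index into the pool.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct StructId(pub u32);

/// Hypothetical, trimmed-down pool: the single owner of struct definitions.
#[derive(Default)]
pub struct TypeInternPool {
    structs: Vec<StructDef>,
}

impl TypeInternPool {
    /// Hypothetical registration helper; returns the id of the new definition.
    pub fn register_struct(&mut self, def: StructDef) -> StructId {
        let id = StructId(self.structs.len() as u32);
        self.structs.push(def);
        id
    }

    /// Lookup by id (signature assumed).
    pub fn struct_def(&self, id: StructId) -> &StructDef {
        &self.structs[id.0 as usize]
    }

    /// Stands in for `output.struct_defs.len()` in tests and logging.
    pub fn struct_count(&self) -> usize {
        self.structs.len()
    }
}
```

With this shape, Sema, TypeContext, and SemaContext only need a reference to the pool, so there is no second copy that can drift out of sync.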
Phase 2B: Migrate Arrays to Pool
Goal: Array types interned in pool, enabling parallel creation without merging.
Changes:
- Move ArrayTypeRegistry functionality into TypeInternPool
- Use type_pool.intern_array(element, len) instead of registry
- Remove ArrayTypeRegistry from SemaContext
- Arrays deduplicate automatically (same element+len = same type)
Files affected: ~5-10 files
Ship criterion: Arrays work, no separate array registry, parallel function analysis cleaner.
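The deduplication rule (same element + len = same type) can be sketched with a hash map keyed on the pair. This is an illustrative sketch only: ElemType is a stand-in for the real element type, array_def is a hypothetical accessor, and the real intern_array() in intern_pool.rs may have a different signature and storage layout.

```rust
use std::collections::HashMap;

/// Stand-in for the element type; the real pool would use the compiler's Type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ElemType {
    I32,
    Bool,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ArrayTypeId(pub u32);

/// Hypothetical, trimmed-down pool holding only array types.
#[derive(Default)]
pub struct TypeInternPool {
    arrays: Vec<(ElemType, u64)>,
    array_ids: HashMap<(ElemType, u64), ArrayTypeId>,
}

impl TypeInternPool {
    /// Returns the existing id for (element, len) or allocates a new one.
    pub fn intern_array(&mut self, element: ElemType, len: u64) -> ArrayTypeId {
        if let Some(&id) = self.array_ids.get(&(element, len)) {
            return id;
        }
        let id = ArrayTypeId(self.arrays.len() as u32);
        self.arrays.push((element, len));
        self.array_ids.insert((element, len), id);
        id
    }

    /// Hypothetical accessor: recovers (element, len) from an id.
    pub fn array_def(&self, id: ArrayTypeId) -> (ElemType, u64) {
        self.arrays[id.0 as usize]
    }
}
```

The sketch takes `&mut self` and is single-threaded; a pool shared across parallel function analysis would additionally need synchronization or interior mutability, which is left out here.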
Phase 3: Struct/Enum Unified Indexing (Optional)
Goal: StructId and EnumId are just TypeId under the hood.
Currently StructId(0) and EnumId(0) could both exist (different types). After this phase, all composite types share one index space.
Changes:
- Make StructId and EnumId aliases for a range of TypeId
- Update pattern matching on Type::Struct(id) to extract from TypeId
Complexity: Medium. May not be needed if current design works.
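One possible reading of "one index space" is sketched below: StructId and EnumId become thin wrappers around a shared TypeId, so a struct and an enum can never both be index 0. CompositeDef, add_struct, and add_enum are hypothetical names invented for the example; the ADR does not prescribe this layout.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct TypeId(pub u32);

/// Both id types wrap a TypeId drawn from the same counter.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct StructId(pub TypeId);

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct EnumId(pub TypeId);

/// Hypothetical unified entry for composite types.
#[derive(Debug, Clone, PartialEq)]
pub enum CompositeDef {
    Struct { name: String },
    Enum { name: String },
}

#[derive(Default)]
pub struct TypeInternPool {
    composites: Vec<CompositeDef>,
}

impl TypeInternPool {
    /// Appends an entry and hands out the next index in the shared space.
    fn push(&mut self, def: CompositeDef) -> TypeId {
        let id = TypeId(self.composites.len() as u32);
        self.composites.push(def);
        id
    }

    pub fn add_struct(&mut self, name: &str) -> StructId {
        StructId(self.push(CompositeDef::Struct { name: name.to_string() }))
    }

    pub fn add_enum(&mut self, name: &str) -> EnumId {
        EnumId(self.push(CompositeDef::Enum { name: name.to_string() }))
    }

    pub fn get(&self, id: TypeId) -> &CompositeDef {
        &self.composites[id.0 as usize]
    }
}
```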
Phase 4: Type Enum → TypeId (Deferred)
Goal: Replace Type enum with TypeId(u32) for O(1) comparison in generics.
Only do this when:
- Generics specialization needs canonical type comparison
- SpecializationKey { type_args: Vec<Type> } hash collisions become an issue
- We're adding Vec<T> and need to intern generic instantiations
Changes:
- Rename Type → TypeKind (the pattern-matchable form)
- Make TypeId the primary type representation
- Add TypeId::kind(&self, pool) -> TypeKind for pattern matching
- Migrate storage: ty: Type → ty: TypeId
- Migrate patterns: match ty { Type::I32 => } → match ty.kind(pool) { TypeKind::I32 => }
Complexity: High. 200+ pattern matches need updating. Only do if benefits justify cost.
Benefits of This Approach
Immediate (Phase 2A)
- Simpler codebase: Remove redundant Vec storage
- Single source of truth: Pool is canonical
- No risk: Just deletions, easy to verify
Medium-term (Phase 2B)
- Parallel array creation: No per-function merging
- Array deduplication: [i32; 5] same type everywhere
- Cleaner architecture: One registry for all composite types
Long-term (Phase 4, if needed)
- O(1) type equality: Critical for generic specialization caching
- Foundation for generics: Vec<i32> as interned type
- Future type features: Pointers, function types, etc.
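To make the specialization-caching benefit concrete, the sketch below keys a cache on a list of interned types. SpecializationKey and its type_args field are named in this ADR; the generic_fn field, SpecializationCache, and get_or_create are hypothetical and exist only to show that hashing and comparing a key reduces to hashing and comparing u32 values.

```rust
use std::collections::HashMap;

/// Interned type handle: equality and hashing operate on one u32.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Type(pub u32);

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct SpecializationKey {
    /// Hypothetical: index of the generic function being specialized.
    pub generic_fn: u32,
    pub type_args: Vec<Type>,
}

/// Hypothetical cache mapping each distinct key to an instance index.
#[derive(Default)]
pub struct SpecializationCache {
    instances: HashMap<SpecializationKey, u32>,
}

impl SpecializationCache {
    /// Returns the cached instance for `key`, or assigns the next free index.
    pub fn get_or_create(&mut self, key: SpecializationKey) -> u32 {
        let next = self.instances.len() as u32;
        *self.instances.entry(key).or_insert(next)
    }
}
```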
Comparison to Original Plan
| Aspect | Original | Revised |
|---|---|---|
| Pattern matches changed | 215+ | 0 (until Phase 4) |
| Files changed (Phase 2) | ~25 | ~10 |
| Risk of breaking changes | High | Low |
| Immediate benefit | Low (just infrastructure) | High (remove duplication) |
| Type representation | Changes immediately | Unchanged until needed |
| Generics support | Required before generics | Only if needed |
Migration Order
Phase 1 ✅ (done)
│
▼
Phase 2A: Remove Vec duplication (~1-2 hours)
│
▼
Phase 2B: Migrate arrays to pool (~2-4 hours)
│
▼
[STOP HERE unless generics needs it]
│
▼
Phase 3: Unified indexing (optional, ~2-4 hours)
│
▼
Phase 4: Type→TypeId rename (only if needed, ~8-16 hours)
Files to Change
Phase 2A (Remove Vecs)
Delete fields:
- crates/gruel-air/src/sema/mod.rs: struct_defs, enum_defs fields
- crates/gruel-air/src/sema_context.rs: struct_defs, enum_defs fields
- crates/gruel-air/src/type_context.rs: struct_defs, enum_defs fields
Update:
- crates/gruel-air/src/sema/declarations.rs: Remove .push() calls
- crates/gruel-air/src/sema/builtins.rs: Remove .push() call
- crates/gruel-air/src/sema/analysis.rs: Remove std::mem::take(&mut sema.struct_defs)
- crates/gruel-air/src/sema/mod.rs: Remove Vec cloning in build_type_context()
- crates/gruel-air/src/sema/tests.rs: Use type_pool.struct_count() instead
- crates/gruel-compiler/src/lib.rs: Use type_pool.struct_count() for logging
Phase 2B (Arrays to Pool)
- crates/gruel-air/src/intern_pool.rs: Already has intern_array()
- crates/gruel-air/src/sema_context.rs: Replace ArrayTypeRegistry with pool
- crates/gruel-air/src/sema/analysis.rs: Use type_pool.intern_array()
- crates/gruel-codegen/src/types.rs: Update array lookups
Success Criteria
Phase 2A Complete ✅ (2026-01-02)
- No struct_defs: Vec<StructDef> anywhere in codebase
- No enum_defs: Vec<EnumDef> anywhere in codebase
- All struct/enum lookups go through type_pool
- All tests pass
- ./test.sh green
Phase 2B Complete ✅ (2026-01-02)
- No ArrayTypeRegistry
- Arrays interned via type_pool.intern_array()
type_pool.intern_array() - Array deduplication works (same element+len = same ArrayTypeId)
- All tests pass
Phase 3 & 4: Type Enum → Type(u32) Migration
Status: Implemented (2026-01-02)
After completing Phase 2B, we proceeded with Phases 3 and 4 to achieve the full benefits described in the original ADR-0024:
- O(1) type equality via u32 comparison
- Foundation for generic type instantiation
- Unified type representation
Migration Strategy: "Shadow Type" Approach
The key challenge is migrating ~61 pattern match sites without creating 600+ simultaneous compilation errors. Our approach uses incremental migration with TypeKind:
Phase 3.1: Introduce TypeKind enum
Create a new TypeKind enum that mirrors the current Type enum structure:
// crates/gruel-air/src/types.rs
Why: TypeKind is the pattern-matchable representation of a Type. Separating these concerns allows incremental migration.
Phase 3.2: Add Type::kind() method
Add a method to convert Type to TypeKind:
Why: This allows pattern matches to gradually migrate from match ty { Type::I32 => } to match ty.kind() { TypeKind::I32 => } while keeping everything compiling.
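A minimal sketch of Phases 3.1 and 3.2 together, assuming a Type enum trimmed to three variants for brevity (the real enum has 15). The describe function is a hypothetical call site, included only to show a match that has already been migrated to .kind().

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct StructId(pub u32);

/// Existing representation, trimmed to three variants for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Type {
    I32,
    Bool,
    Struct(StructId),
}

/// Phase 3.1: a structural mirror of Type used only for pattern matching.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum TypeKind {
    I32,
    Bool,
    Struct(StructId),
}

impl Type {
    /// Phase 3.2: trivial variant-for-variant conversion, no pool lookup yet.
    pub fn kind(&self) -> TypeKind {
        match *self {
            Type::I32 => TypeKind::I32,
            Type::Bool => TypeKind::Bool,
            Type::Struct(id) => TypeKind::Struct(id),
        }
    }
}

/// Hypothetical migrated call site: matches on TypeKind, never on Type.
pub fn describe(ty: Type) -> &'static str {
    match ty.kind() {
        TypeKind::I32 => "integer",
        TypeKind::Bool => "boolean",
        TypeKind::Struct(_) => "struct",
    }
}
```

Because call sites like describe no longer name Type's variants, the later change to Type's internal representation does not touch them.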
Phase 3.3: Migrate pattern matches incrementally
Migrate one file at a time:
// Before:
match ty { Type::I32 => ... }
// After:
match ty.kind() { TypeKind::I32 => ... }
Benefits:
- Each file compiles and tests pass ✅
- Can ship intermediate states ✅
- Easy to back out if issues arise ✅
- Clear progress tracking (~61 match sites)
Phase 4.1: Replace Type enum with Type(InternedType)
Once all pattern matches use .kind(), replace the Type enum:
// Remove the old enum:
// pub enum Type { I8, I16, ... }
// Replace with newtype:
pub struct Type(InternedType);
Why: Now Type is just a u32 index, giving us O(1) equality. All existing pattern matches continue to work via .kind().
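The following sketch shows one way a u32-backed Type could expose constants and decode back to TypeKind. The encoding is an assumption made for the example (primitives at small fixed indices, struct ids offset by a hypothetical STRUCT_BASE); the actual layout in types.rs is not shown in this ADR, and new_struct is a hypothetical constructor.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct StructId(pub u32);

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum TypeKind {
    I32,
    Bool,
    Struct(StructId),
}

/// Type is now a plain index; the derived equality is a single u32 comparison.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Type(u32);

impl Type {
    // ASSUMPTION: primitives occupy small fixed indices.
    pub const I32: Type = Type(0);
    pub const BOOL: Type = Type(1);
    // ASSUMPTION: values at or above this base encode struct ids.
    const STRUCT_BASE: u32 = 16;

    /// Hypothetical constructor for struct types.
    pub fn new_struct(id: StructId) -> Type {
        Type(Self::STRUCT_BASE + id.0)
    }

    /// Decodes the raw index back into the pattern-matchable form.
    pub fn kind(&self) -> TypeKind {
        match self.0 {
            0 => TypeKind::I32,
            1 => TypeKind::Bool,
            n if n >= Self::STRUCT_BASE => TypeKind::Struct(StructId(n - Self::STRUCT_BASE)),
            n => panic!("unassigned type index {}", n),
        }
    }
}
```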
Phase 4.2: Update method signatures
Once Type is Type(InternedType), update methods that pattern match:
// Before (Phase 3):
// After (Phase 4, optimized):
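As a hedged illustration of the kind of optimization this step allows, the sketch below contrasts a helper that decodes through kind() with one that checks the raw index directly. It assumes, purely for the example, that integer types occupy a contiguous index range; is_integer is named in this ADR, while is_integer_via_kind and the specific indices are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TypeKind {
    I8,
    I16,
    I32,
    I64,
    Bool,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Type(u32);

impl Type {
    // ASSUMPTION: integer types occupy the contiguous indices 0..=3.
    pub const I8: Type = Type(0);
    pub const I16: Type = Type(1);
    pub const I32: Type = Type(2);
    pub const I64: Type = Type(3);
    pub const BOOL: Type = Type(4);

    pub fn kind(&self) -> TypeKind {
        match self.0 {
            0 => TypeKind::I8,
            1 => TypeKind::I16,
            2 => TypeKind::I32,
            3 => TypeKind::I64,
            4 => TypeKind::Bool,
            n => panic!("unassigned type index {}", n),
        }
    }

    /// Phase 3 style: decode to TypeKind, then pattern match.
    pub fn is_integer_via_kind(&self) -> bool {
        matches!(
            self.kind(),
            TypeKind::I8 | TypeKind::I16 | TypeKind::I32 | TypeKind::I64
        )
    }

    /// Phase 4 style: one comparison on the raw index.
    pub fn is_integer(&self) -> bool {
        self.0 <= Self::I64.0
    }
}
```

The range check is only valid while the contiguous-index assumption holds, so a layout like this would typically be guarded by a test that compares both versions over every constant.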
Success Criteria
Phase 3 Complete ✅ (2026-01-02)
- TypeKind enum exists in crates/gruel-air/src/types.rs
- Type::kind() method implemented
- All ~61 pattern match sites migrated to use .kind()
- All tests pass
- No direct pattern matches on Type enum remain
Phase 4 Complete ✅ (2026-01-02)
- Type enum removed, replaced with Type(u32) newtype
- Type::kind() decodes u32 back to TypeKind for pattern matching
- Type constants (Type::I32, etc.) defined as const Type(n)
- Helper methods (is_integer, as_struct, etc.) optimized with u32 checks
- All tests pass (1230 spec, 275 unit, 38 UI)
- O(1) type equality via u32 comparison works
Files Affected (Estimated)
Phase 3.1-3.2 (~1-2 files):
- crates/gruel-air/src/types.rs - Add TypeKind, Type::kind()
Phase 3.3 (~20 files, 61 match sites):
- crates/gruel-air/src/sema/analysis.rs (~19 matches)
- crates/gruel-air/src/sema/typeck.rs (~9 matches)
- crates/gruel-codegen/src/x86_64/cfg_lower.rs (~7 matches)
- crates/gruel-compiler/src/drop_glue.rs (~8 matches)
- crates/gruel-air/src/intern_pool.rs (~15 matches)
- ... (15 more files with 1-3 matches each)
Phase 4.1-4.2 (~3-5 files):
- crates/gruel-air/src/types.rs - Replace enum with newtype
- crates/gruel-air/src/intern_pool.rs - Add get_kind() method
- crates/gruel-air/src/lib.rs - Update exports
Comparison to Big-Bang Approach
| Aspect | Big-Bang | Shadow Type (Our Approach) |
|---|---|---|
| Compilation errors | 600+ all at once | 0 (compiles at each step) |
| Testability | Only at the end | After each file migration |
| Risk | High | Low |
| Context window | Fills with errors | Clean, focused changes |
| Reversibility | Difficult | Easy (one file at a time) |
| Progress tracking | Binary (done/not done) | Linear (~61 match sites) |
Why This Works
- TypeKind is the same structure as Type: Just a renamed copy, so semantics don't change
- Type::kind() starts trivial: Just returns the enum variant, no pool lookup
- Incremental migration: Each file can be done independently
- Final flip is mechanical: Once all matches use .kind(), replacing the enum is safe
Implementation Order (Completed)
- ✅ Add TypeKind enum to types.rs
- ✅ Add Type::kind() → TypeKind conversion
- ✅ Migrate pattern matches file by file, testing after each
- ✅ Replace Type enum with Type(u32) newtype
- ✅ Optimize Type::kind() and helper methods
- Kept TypeKind for pattern matching (provides better ergonomics than direct u32 decoding)
Appendix: Why We Proceeded with Phases 3 & 4
The revised ADR originally recommended stopping after Phase 2B and only proceeding if generics needed it. However, we implemented Phases 3 & 4 because:
- Clean foundation: Better to complete the migration while the architecture is fresh
- Original design intent: The full InternPool design provides clear benefits
- Incremental safety: Our "Shadow Type" approach mitigates the risk that caused the original deferral
- Future-proofing: O(1) type comparison and generic type instantiation will be needed eventually