ADR-0017: Emitter Instruction Abstraction
Status
Implemented
Summary
Refactor the x86-64 and aarch64 emitters to use an explicit instruction representation that captures both machine code bytes and assembly text, enabling accurate --emit asm output that includes prologue/epilogue.
Context
The current --emit asm output is misleading. It calls format_assembly() on the MIR, which only shows MIR instructions. But the actual emitter generates additional code:
- Prologue: push rbp; mov rbp, rsp; callee-saved register saves; stack allocation; parameter spills
- Epilogue augmentation: When the emitter sees mov rsp, rbp, it inserts callee-saved restoration
- Implicit instruction expansion: Some MIR instructions expand to multiple machine instructions
What --emit asm shows:
main:
mov rax, 42
ret
What actually runs:
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov rax, 42
mov rsp, rbp
pop rbp
ret
This gap makes debugging stack/ABI issues difficult.
Why This Is Hard to Fix
The difficulty reveals an architectural issue: the emitter conflates instruction selection with encoding. Each emit_* method directly pushes bytes to a buffer without leaving any record of what instruction was emitted.
There are ~72 emit methods in x86-64 and ~54 in aarch64. Adding assembly text recording to each would:
- Duplicate the instruction description (once in bytes, once in text)
- Risk drift between bytes and text
- Be error-prone and hard to maintain
Decision
Introduce an explicit EmittedInst type that represents a single emitted instruction with both its bytes and text representation. The emitter will produce a sequence of these, which can then be serialized to either bytes or assembly text.
Core Types
/// A single emitted machine instruction.
/// Result of emitting a function.
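A minimal sketch of what these two types might look like. The field names bytes, asm, and insts are assumptions, as is to_bytes(); only the type names and to_asm() appear elsewhere in this ADR.

```rust
/// A single emitted machine instruction (field names are assumptions).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EmittedInst {
    /// Encoded machine code for this instruction.
    pub bytes: Vec<u8>,
    /// Assembly text describing the same instruction.
    pub asm: String,
}

/// Result of emitting a function: its instructions in emission order.
#[derive(Debug, Default)]
pub struct EmittedCode {
    pub insts: Vec<EmittedInst>,
}

impl EmittedCode {
    /// Concatenate every instruction's bytes into the final machine code.
    pub fn to_bytes(&self) -> Vec<u8> {
        self.insts
            .iter()
            .flat_map(|i| i.bytes.iter().copied())
            .collect()
    }

    /// Render one line of assembly text per instruction.
    pub fn to_asm(&self) -> String {
        self.insts.iter().map(|i| format!("{}\n", i.asm)).collect()
    }
}
```

Because both serializations walk the same list, the bytes and the text cannot disagree about which instructions were emitted.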
Emit Method Pattern
Each emit_* method will return or record an EmittedInst:
// Before
// After
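As an illustration of the before/after shape, here is a sketch for a single instruction (ret, opcode 0xC3). OldEmitter, NewEmitter, and their fields are hypothetical names chosen for this example, not the actual emitter types.

```rust
/// Minimal stand-in for the instruction record (fields are assumptions).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EmittedInst {
    pub bytes: Vec<u8>,
    pub asm: String,
}

/// Before: bytes go straight into a buffer, leaving no record of
/// which instruction produced them.
pub struct OldEmitter {
    pub buf: Vec<u8>,
}

impl OldEmitter {
    pub fn emit_ret(&mut self) {
        self.buf.push(0xC3); // ret
    }
}

/// After: the same encoding is recorded together with its text.
pub struct NewEmitter {
    pub insts: Vec<EmittedInst>,
}

impl NewEmitter {
    pub fn emit_ret(&mut self) {
        self.insts.push(EmittedInst {
            bytes: vec![0xC3], // ret
            asm: "ret".to_string(),
        });
    }
}
```

The byte output is unchanged; the only difference is that the instruction boundary and its text survive emission.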
Helper Method
Add a helper to reduce boilerplate:
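One possible form of such a helper, sketched under the assumption that the emitter owns a Vec of EmittedInst. The name inst and the Emitter struct are hypothetical.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EmittedInst {
    pub bytes: Vec<u8>,
    pub asm: String,
}

#[derive(Default)]
pub struct Emitter {
    pub insts: Vec<EmittedInst>,
}

impl Emitter {
    /// Hypothetical helper: record bytes and text in a single call so
    /// each emit_* method stays a one-liner.
    fn inst(&mut self, bytes: &[u8], asm: impl Into<String>) {
        self.insts.push(EmittedInst {
            bytes: bytes.to_vec(),
            asm: asm.into(),
        });
    }

    pub fn emit_push_rbp(&mut self) {
        self.inst(&[0x55], "push rbp");
    }

    pub fn emit_mov_rbp_rsp(&mut self) {
        // REX.W + 89 /r with ModRM 0xE5 (reg = rsp, rm = rbp).
        self.inst(&[0x48, 0x89, 0xE5], "mov rbp, rsp");
    }
}
```

Keeping the bytes and the text as adjacent arguments of one call is what makes drift between them easy to spot in review.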
Finalization
After all instructions are emitted, compute offsets and apply fixups:
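A sketch of a two-pass finalization, under several assumptions: labels are stored as zero-byte entries, a fixup is a rel32 placeholder in the last four bytes of a branch, and the label and fixup fields are invented for this example. The real fixup model may differ.

```rust
use std::collections::HashMap;

pub struct EmittedInst {
    pub bytes: Vec<u8>,
    pub asm: String,
    /// Set when this zero-byte entry defines a label (assumption).
    pub label: Option<String>,
    /// Set when the last 4 bytes are a rel32 placeholder for this label.
    pub fixup: Option<String>,
}

/// Assign byte offsets, patch every rel32 placeholder, return the bytes.
pub fn finalize(insts: &mut [EmittedInst]) -> Result<Vec<u8>, String> {
    // Pass 1: an instruction's offset is the total size of those before it.
    let mut offsets = Vec::with_capacity(insts.len());
    let mut labels: HashMap<String, usize> = HashMap::new();
    let mut offset = 0usize;
    for inst in insts.iter() {
        offsets.push(offset);
        if let Some(name) = &inst.label {
            labels.insert(name.clone(), offset);
        }
        offset += inst.bytes.len();
    }

    // Pass 2: rel32 is measured from the end of the branching instruction.
    for (inst, &start) in insts.iter_mut().zip(&offsets) {
        if let Some(target) = &inst.fixup {
            let dest = *labels
                .get(target)
                .ok_or_else(|| format!("undefined label {target}"))?;
            let n = inst.bytes.len();
            if n < 4 {
                return Err(format!("no rel32 slot in `{}`", inst.asm));
            }
            let rel = (dest as i64 - (start + n) as i64) as i32;
            inst.bytes[n - 4..].copy_from_slice(&rel.to_le_bytes());
        }
    }
    Ok(insts.iter().flat_map(|i| i.bytes.iter().copied()).collect())
}
```

Offsets are computed only after every instruction exists, so forward references resolve the same way as backward ones.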
Prologue/Epilogue
The prologue and epilogue become explicit sequences of EmittedInst:
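A simplified sketch of these sequences for x86-64, matching the listing in the Context section. Callee-saved register saves and parameter spills are omitted, and the function names, the free-standing inst constructor, and the frame_size parameter are assumptions made for this example.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EmittedInst {
    pub bytes: Vec<u8>,
    pub asm: String,
}

/// Hypothetical constructor pairing an encoding with its text.
fn inst(bytes: &[u8], asm: impl Into<String>) -> EmittedInst {
    EmittedInst { bytes: bytes.to_vec(), asm: asm.into() }
}

/// Frame setup: save rbp, establish the frame, reserve stack space.
pub fn prologue(frame_size: u32) -> Vec<EmittedInst> {
    let mut out = vec![
        inst(&[0x55], "push rbp"),
        inst(&[0x48, 0x89, 0xE5], "mov rbp, rsp"),
    ];
    if frame_size > 0 {
        // sub rsp, imm32 (REX.W + 81 /5 id); assumes frame_size fits in i32.
        let mut bytes = vec![0x48, 0x81, 0xEC];
        bytes.extend_from_slice(&frame_size.to_le_bytes());
        out.push(inst(&bytes, format!("sub rsp, {frame_size}")));
    }
    out
}

/// Frame teardown: release the stack, restore rbp, return.
pub fn epilogue() -> Vec<EmittedInst> {
    vec![
        inst(&[0x48, 0x89, 0xEC], "mov rsp, rbp"),
        inst(&[0x5D], "pop rbp"),
        inst(&[0xC3], "ret"),
    ]
}
```

Since the prologue and epilogue are ordinary instruction records, they show up in assembly output without any special casing.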
Implementation Phases
Phase 1: Core types and x86-64 refactor - gruel-hf6s (completed)
- Add EmittedInst and EmittedCode types
- Refactor x86-64 emitter to use new pattern
- Update --emit asm to use to_asm()
- Verify byte output is identical (regression test)
Phase 2: aarch64 refactor - gruel-4rzx (depends on Phase 1)
- Apply same pattern to aarch64 emitter
- Verify byte output is identical
Phase 3: Cleanup and optimization - gruel-h8r0 (completed)
- Add shared format_offset helper to lib.rs
- Add callee_saved_size() and adjust_fp_offset() helpers to x86_64 emitter
- Add adjust_fp_offset() helper to aarch64 emitter
- Refactor offset adjustment code to use new helpers
Consequences
Positive
- Accurate --emit asm: Output matches what actually executes, including prologue/epilogue
- Single source of truth: Each instruction is described once, with both bytes and text derived from the same emit call
- Easier debugging: Can correlate assembly lines with byte offsets
- Foundation for future tools: Disassembly, instruction-level profiling, binary diffing
- Cleaner code: Explicit instruction list vs implicit byte buffer
- Testability: Can assert on instruction sequences, not just final bytes
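To illustrate the debugging and testability points, here is a sketch of an annotated listing that prefixes each assembly line with its byte offset and encoding. The listing function and its column layout are hypothetical, and the EmittedInst fields are assumed.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EmittedInst {
    pub bytes: Vec<u8>,
    pub asm: String,
}

/// Render "offset  hex-bytes  text" for each instruction.
pub fn listing(insts: &[EmittedInst]) -> String {
    let mut out = String::new();
    let mut offset = 0usize;
    for inst in insts {
        let hex: Vec<String> = inst.bytes.iter().map(|b| format!("{b:02x}")).collect();
        out.push_str(&format!("{offset:04x}  {:<12}  {}\n", hex.join(" "), inst.asm));
        // The next instruction starts where this one ends.
        offset += inst.bytes.len();
    }
    out
}
```

A test can then assert on the instruction sequence or on individual offsets rather than comparing one opaque byte vector.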
Negative
- Larger refactor: ~72 methods in x86-64, ~54 in aarch64 need updating
- Memory overhead: Vec<EmittedInst> vs Vec<u8> uses more memory during compilation
- String allocations: Assembly text requires string formatting per instruction
- Churn: Significant changes to stable code
Mitigations
- Incremental migration: Can be done method-by-method
- Regression tests: Existing tests verify byte output doesn't change
- Memory: Only during emit phase, which is fast; can optimize later if needed
- Strings: Computed eagerly for now; could be made lazy if profiling shows issues
Open Questions
- Should asm text be optional? Could use Option<String> and only populate when --emit asm is requested. Trades memory for slight complexity.
- Should we share types between backends? EmittedInst could be in a shared crate. The backends would still have their own emit methods.
- Should labels be separate? Currently mixing labels (0-byte "instructions") with real instructions. Could have enum EmittedItem { Inst(EmittedInst), Label(String) }.
- Fixup representation: Currently fixups reference byte offsets. With instruction indices, we could have a cleaner model. Worth changing?
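A sketch of how the optional-text and separate-label ideas could combine. This is one possible answer to the questions, not a decision; the to_bytes and to_asm free functions and the four-space indent are assumptions.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EmittedInst {
    pub bytes: Vec<u8>,
    /// None unless assembly output was requested.
    pub asm: Option<String>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum EmittedItem {
    Inst(EmittedInst),
    Label(String),
}

/// Labels contribute no bytes, so only instructions are serialized.
pub fn to_bytes(items: &[EmittedItem]) -> Vec<u8> {
    let mut out = Vec::new();
    for item in items {
        if let EmittedItem::Inst(inst) = item {
            out.extend_from_slice(&inst.bytes);
        }
    }
    out
}

/// Labels print flush left; instructions print indented when text exists.
pub fn to_asm(items: &[EmittedItem]) -> String {
    let mut out = String::new();
    for item in items {
        match item {
            EmittedItem::Label(name) => {
                out.push_str(name);
                out.push_str(":\n");
            }
            EmittedItem::Inst(inst) => {
                if let Some(text) = &inst.asm {
                    out.push_str("    ");
                    out.push_str(text);
                    out.push('\n');
                }
            }
        }
    }
    out
}
```

With labels as their own variant, byte serialization no longer needs to know that some entries are zero-length.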
Future Work
- Instruction-level optimizations: With explicit instruction list, could implement peephole optimizations (e.g., remove redundant moves)
- Binary diffing: Compare two compilations at instruction level
- Size analysis: Which functions generate the most code? Which patterns are expensive?
- Alternative text formats: Intel vs AT&T syntax, or custom annotations
References
- Issue: gruel-3dxp (--emit asm should show actual emitted code including prologue/epilogue)
- Current emit.rs files: crates/gruel-codegen/src/{x86_64,aarch64}/emit.rs