ADR-0085: C foreign function interface

Status

Implemented (stable in Phase 5).

Summary

C FFI via one new @mark registry entry and one new keyword:

  • @mark(c) — C ABI on fns, C layout on structs.
  • link_extern("libname") { … } — a new top-level item form (sibling of fn/struct/enum) introducing an extern-declaration scope. Items inside are implicitly body-less and implicitly @mark(c) (the only ABI in v1). The library name contributes -l<name> to the link line.

@mark(c) fn …{ … } at top level is a C export; fn …; inside a link_extern block is a C import; @mark(c) struct uses C layout and crosses the FFI boundary by value. Imported fns are called like regular Gruel fns: v1 commits to no syntactic safety gate at the call site, and that surface is deferred wholesale to the future capability ADR. Symbol renames use @link_name("…"). Allowed FFI types: numeric primitives, bool, (), Ptr(T), MutPtr(T), and @mark(c) structs.

Consumes ADR-0069's @layout(c) slot. Enum FFI (@mark(c) enum) is deferred to a follow-up ADR because it needs a target-dependent c_int type the compiler does not yet model. The link_extern block is a named lexical unit the future capability ADR can use as a witness scope.

Context

ADR-0028 added raw pointers and checked/unchecked blocks and explicitly listed C FFI as future work. Today the only way to call a C function from Gruel is to add a __gruel_* intrinsic, which forces sema, codegen, and runtime to grow in lockstep. The registry already has 60+ entries, and adding one intrinsic per C function does not scale.

ADR-0083 stabilised @mark(...) as the closed-registry directive for declaration-time-only attributes, with applicable_to set per row. ADR-0084 set the precedent of adding a second MarkerKind namespace alongside posture. ADR-0069's "Layout attributes" future-work slot reserved @layout(c) for FFI struct layout but did not commit syntax — this ADR routes C layout through @mark instead.

A capability system is on the medium-term roadmap and FFI is one of the things it will gate. The design hook (link_extern blocks as named units) leaves space for a capability witness requirement to layer on without surface-syntax change.

Three structural choices shape the surface:

  1. Block form over per-decl @mark(extern). Inside an extern-declaration scope, every item is by definition an extern — there's nothing else it could be. Pushing extern-ness to a marker on each declaration repeats information the scope already commits to, and it forces every line to repeat @mark(c, extern) for an ABI that, in v1, is also the only choice. The scope subsumes both.

  2. In-source over CLI for library linkage. For anything beyond libc the library name is part of the binding's identity. Keeping it in source (a) survives copy-paste of bindings between projects, (b) makes the platform-specific link surface auditable from the source tree, and (c) lets the compiler reason about which symbols belong to which library — useful for capability scoping later. A CLI override can come back as future work if real workflows demand it.

  3. Keyword over directive for the scope itself. link_extern("…") { … } introduces a scope and binds items; that's structural, not decoration. ADR-0083's @mark(...) directive surface is built for per-decl attributes (applicable_to, registry rows, closed namespace); extending it to scope-introducing constructs would conflate two different jobs. Putting link_extern in the grammar — alongside fn, struct, enum — costs one keyword and keeps the directive registry uniform.

Enum FFI is deferred — see [Future Work].

Decision

One marker, one keyword

pub enum MarkerKind { Posture(Posture), ThreadSafety(ThreadSafety), Abi(Abi) }
pub enum Abi { C }

BUILTIN_MARKERS rows:

Name  MarkerKind   Applicable to      Meaning
----  -----------  -----------------  ----------------------------------
c     Abi(Abi::C)  FUNCTION | STRUCT  C ABI on fns; C layout on structs.

At most one ABI marker per item. Future ABIs (system, stdcall, eventually rust) extend Abi. Future widening of applicable_to to ENUM arrives with the enum-FFI ADR. Extern-ness is not a marker — it is established by the surrounding link_extern block.

link_extern is a new reserved keyword. The grammar gains one item form: link_extern "(" string-literal ")" "{" item* "}". The library name is a single non-empty string literal; the body is a sequence of declarations (fns only in v1).

link_extern("m") {
    fn sin(x: f64) -> f64;
    fn cos(x: f64) -> f64;
    fn pow(base: f64, exp: f64) -> f64;
}

link_extern("c") {
    fn abs(x: i32) -> i32;
    fn strlen(s: Ptr(u8)) -> usize;
}

Semantics for items inside the block:

  • Resolve at link time. The fn carries no body; the declaration ends with ;.
  • Implicit @mark(c) — the only ABI in v1. (Future ABIs override per-item: @mark(stdcall) fn FooBar(...); inside the block.) Writing @mark(c) explicitly on an item is legal but redundant.
  • Default symbol name = the Gruel identifier; override with @link_name("…").
  • Only fn declarations permitted in v1. (Future: extern statics.)
  • Called like any other Gruel fn — no implicit unchecked, no checked { } required at the call site. Raw pointer operations inside a Gruel caller still require checked { } per ADR-0028; only the FFI call itself is ungated.

Semantics for the block itself:

  • The library name contributes -l<libname> to the final link line (see [Linker line construction]).
  • Empty blocks (link_extern("foo") {}) are permitted: they add the -l flag without declaring symbols, useful for libraries whose symbols are reached indirectly (e.g. through inline asm or dlsym).
  • Empty library names (link_extern("") { … }) are rejected.
  • Multiple link_extern blocks naming the same library are permitted (in the same file or across files); their items merge and the -l flag is emitted once.
  • link_extern blocks do not nest.
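
The block-level rules above can be sketched as a single well-formedness check. This is only an illustration in Rust: ItemKind, BlockError, and check_link_extern are hypothetical names, the variant names mirror the diagnostics listed later in this ADR, and the order in which the checks fire is an assumption.

```rust
/// Kinds of item that may syntactically appear in a block body
/// (hypothetical, illustrative subset).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ItemKind {
    Fn,
    Struct,
    Enum,
    LinkExtern,
}

/// Error variants named after the ADR's diagnostics.
#[derive(Debug, PartialEq)]
pub enum BlockError {
    LinkExternEmptyLibraryName,
    LinkExternNested,
    LinkExternNonFnItem { item_kind: ItemKind },
}

/// Checks one link_extern block: non-empty library name, no nesting, fns only.
/// An empty item list is accepted, matching the "empty blocks are permitted" rule.
pub fn check_link_extern(library: &str, items: &[ItemKind]) -> Result<(), BlockError> {
    if library.is_empty() {
        return Err(BlockError::LinkExternEmptyLibraryName);
    }
    for &kind in items {
        match kind {
            ItemKind::Fn => {}
            ItemKind::LinkExtern => return Err(BlockError::LinkExternNested),
            other => return Err(BlockError::LinkExternNonFnItem { item_kind: other }),
        }
    }
    Ok(())
}
```

Under these assumptions, check_link_extern("foo", &[]) succeeds, while an empty library name or a nested block is rejected before any symbol is registered.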

The two uses of @mark(c)

// Function export — Gruel fn callable from C.
@mark(c) fn my_callback(x: i32) -> i32 { x + 1 }

// Struct layout — C-compatible, by-value across FFI.
@mark(c) struct Timeval { tv_sec: i64, tv_usec: i64 }

Function imports are not in the list — they live inside link_extern blocks and pick up @mark(c) implicitly. The marker on a struct fixes layout and makes the type eligible to cross by value; the marker on a top-level fn marks it as a C export. Applying @mark(c) to an enum is rejected in v1 (MarkCOnEnum) and will be re-enabled by the enum-FFI ADR.

Body-less function declarations

A fn declaration ends with ; instead of a block body iff it appears inside a link_extern block. Body-less fns at top level are rejected with BodyLessFnOutsideLinkExtern. Fns with bodies inside link_extern blocks are rejected with LinkExternItemHasBody.
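
The "iff" rule reduces to a two-input decision table. A minimal Rust sketch follows; FnDeclError and check_fn_body are hypothetical names chosen for illustration, and only the two diagnostic names come from this ADR.

```rust
/// Error variants named after the ADR's diagnostics.
#[derive(Debug, PartialEq)]
pub enum FnDeclError {
    BodyLessFnOutsideLinkExtern,
    LinkExternItemHasBody,
}

/// A fn may omit its body if and only if it sits inside a link_extern block.
pub fn check_fn_body(has_body: bool, inside_link_extern: bool) -> Result<(), FnDeclError> {
    match (has_body, inside_link_extern) {
        // `fn f();` at top level
        (false, false) => Err(FnDeclError::BodyLessFnOutsideLinkExtern),
        // `fn f() { ... }` inside a link_extern block
        (true, true) => Err(FnDeclError::LinkExternItemHasBody),
        // ordinary fn with a body, or body-less import inside a block
        _ => Ok(()),
    }
}
```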

@mark(c) struct layout

Normative C struct rules:

  • Field order = declaration order; no reordering.
  • Each field placed at the lowest offset ≥ cursor satisfying the field's natural alignment; implicit padding inserted.
  • Struct alignment = max field alignment (1 for empty structs).
  • Struct size = cursor after last field, rounded up to struct alignment.
  • ADR-0069 niche optimisation disabled on the type.
  • No packed mode in v1.
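
The offset, size, and alignment rules above amount to the usual cursor-and-round-up computation. The following Rust sketch is one way to write it; FieldLayout, StructLayout, and c_layout are hypothetical names, not the compiler's actual Layout API, and alignments are assumed to be non-zero.

```rust
/// Size and natural alignment of one field, in bytes.
#[derive(Clone, Copy)]
pub struct FieldLayout {
    pub size: u64,
    pub align: u64,
}

/// Result of laying out a struct: one offset per field, plus total size and alignment.
#[derive(Debug, PartialEq)]
pub struct StructLayout {
    pub offsets: Vec<u64>,
    pub size: u64,
    pub align: u64,
}

/// Rounds `value` up to the next multiple of `align` (assumes align > 0).
fn round_up(value: u64, align: u64) -> u64 {
    (value + align - 1) / align * align
}

/// Lays out fields in declaration order following the rules listed above.
pub fn c_layout(fields: &[FieldLayout]) -> StructLayout {
    let mut cursor: u64 = 0;
    let mut align: u64 = 1; // empty structs have alignment 1
    let mut offsets = Vec::with_capacity(fields.len());
    for field in fields {
        // Lowest offset >= cursor that satisfies the field's alignment.
        cursor = round_up(cursor, field.align);
        offsets.push(cursor);
        cursor += field.size;
        align = align.max(field.align);
    }
    // Struct size is the cursor after the last field, rounded up to struct alignment.
    StructLayout { offsets, size: round_up(cursor, align), align }
}
```

For a struct with fields of (size, align) = (1, 1), (4, 4), (1, 1), this yields offsets 0, 4, 8, alignment 4, and size 12: three bytes of padding after the first field and three bytes of tail padding.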

Allowed field types (recursive): numeric primitives, bool, Ptr(T), MutPtr(T), fixed-size arrays of allowed types, other @mark(c) structs. Anything else is rejected with FfiAggregateHasNonCField on the field's span.
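
A rough Rust sketch of the recursive field check follows. The Ty enum is a hypothetical stand-in for the compiler's type representation, and two points are assumptions rather than statements from this ADR: pointers are accepted regardless of pointee type, and a nested @mark(c) struct is trusted because its own fields are validated at its own declaration.

```rust
/// Hypothetical, simplified type representation for illustration only.
#[derive(Debug)]
pub enum Ty {
    Int,
    Float,
    Bool,
    Ptr(Box<Ty>),
    MutPtr(Box<Ty>),
    Array(Box<Ty>, usize),
    Struct { mark_c: bool },
    Slice(Box<Ty>),
    OwnedString,
}

/// Returns true when `ty` may appear as a field of a @mark(c) struct.
pub fn is_c_field_type(ty: &Ty) -> bool {
    match ty {
        Ty::Int | Ty::Float | Ty::Bool => true,
        // Assumption: the pointee type is not constrained.
        Ty::Ptr(_) | Ty::MutPtr(_) => true,
        // Fixed-size arrays are allowed when their element type is allowed.
        Ty::Array(elem, _len) => is_c_field_type(elem),
        // Only structs that themselves carry @mark(c).
        Ty::Struct { mark_c } => *mark_c,
        Ty::Slice(_) | Ty::OwnedString => false,
    }
}

/// Returns the name of the first field that would trigger FfiAggregateHasNonCField.
pub fn first_non_c_field<'a>(fields: &'a [(&'a str, Ty)]) -> Option<&'a str> {
    fields
        .iter()
        .find(|(_, ty)| !is_c_field_type(ty))
        .map(|(name, _)| *name)
}
```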

User-defined fn drop on @mark(c) structs is rejected in v1 (FfiAggregateHasDrop). Posture inference is unchanged.

Call-site posture

Imported fns are called like any other Gruel fn — there is no FFI-specific checked/unchecked requirement.

link_extern("m") {
    fn sin(x: f64) -> f64;
}

fn compute(x: f64) -> f64 { sin(x) }

This is a deliberate scope cut. Earlier drafts implicitly marked extern fns unchecked and forced callers into checked { … }; v1 removes that gating because the capability ADR is the right place to decide what FFI gating should look like (per-block witness, per-call token, none-at-all-by-default, etc.). Shipping the gate now would either lock in an answer prematurely or produce a feature that has to be removed before the capability work lands.

ADR-0028's checked/unchecked semantics for raw pointer operations are unchanged — calling sin(2.0) is bare, but dereferencing a MutPtr(u8) inside a Gruel fn still needs checked { } as it always has.

Symbol naming

Default symbol name = the Gruel identifier. Override with @link_name("…"):

link_extern("foo") {
    @link_name("__weird_c_symbol") fn nice_name() -> i32;
}

@link_name("MY_EXPORTED_FN") @mark(c) fn my_callback(x: i32) -> i32 { x + 1 }

@link_name is valid on (a) fns inside link_extern blocks and (b) top-level @mark(c) fn …{ } exports. It is rejected elsewhere with LinkNameRequiresCAbi.
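
Symbol resolution and the placement rule can be sketched together. In this Rust illustration, FnDecl and symbol_name are hypothetical; only @link_name's behaviour and the LinkNameRequiresCAbi diagnostic name come from this ADR.

```rust
/// Hypothetical view of a fn declaration, reduced to what symbol naming needs.
pub struct FnDecl<'a> {
    pub ident: &'a str,
    pub link_name: Option<&'a str>,
    /// True for fns declared inside a link_extern block (imports).
    pub inside_link_extern: bool,
    /// True for top-level @mark(c) fns with a body (exports).
    pub mark_c_export: bool,
}

#[derive(Debug, PartialEq)]
pub enum LinkNameError {
    LinkNameRequiresCAbi,
}

/// Picks the symbol for a fn: the @link_name argument if present, else the identifier.
/// @link_name on a fn that is neither an import nor a C export is rejected.
pub fn symbol_name<'a>(decl: &FnDecl<'a>) -> Result<&'a str, LinkNameError> {
    let is_c_abi = decl.inside_link_extern || decl.mark_c_export;
    match decl.link_name {
        Some(_) if !is_c_abi => Err(LinkNameError::LinkNameRequiresCAbi),
        Some(name) => Ok(name),
        None => Ok(decl.ident),
    }
}
```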

Linker line construction

The link line is the deduplicated set of library names from all link_extern blocks across all source files, emitted as -l<name> flags after the runtime archive. Library names are sorted lexicographically; v1 makes no commitment about order-sensitive linking (workflows that need it will get explicit ordering syntax in a follow-up).

Libc and runtime support libraries continue to be linked unconditionally as today, independent of user link_extern blocks.
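
The deduplicate-then-sort step for user libraries can be sketched in a few lines of Rust. The function name link_flags is hypothetical, and "lexicographic" is interpreted here as byte-wise string ordering, which is an assumption; the runtime archive and unconditional libc linkage are outside this sketch.

```rust
use std::collections::BTreeSet;

/// Turns the library names collected from every link_extern block into
/// a deduplicated, lexicographically sorted list of -l flags.
pub fn link_flags<'a>(libraries: impl IntoIterator<Item = &'a str>) -> Vec<String> {
    // BTreeSet removes duplicates and iterates in sorted order.
    let unique: BTreeSet<&str> = libraries.into_iter().collect();
    unique.into_iter().map(|name| format!("-l{}", name)).collect()
}
```

With blocks naming "m", "c", and "m" again, this produces -lc followed by -lm, independent of the order in which the blocks appear in source.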

Allowed FFI types

Category             Types              C equivalent
-------------------  -----------------  --------------------------
Signed int           i8–i64, isize      int8_t–int64_t, intptr_t
Unsigned int         u8–u64, usize      uint8_t–uint64_t, size_t
Float                f32, f64           float, double
Bool                 bool               _Bool (1 byte at boundary)
Unit return          ()                 void
Pointer              Ptr(T), MutPtr(T)  const T*, T*
C struct (by value)  @mark(c) struct T  struct T

LLVM's default C calling convention handles small-struct-in-registers vs large-struct-via-sret automatically.

Rejected in v1, each cited at the offending span: non-@mark(c) aggregates by value, any enum across the FFI boundary (including @mark(c) enum, which is forbidden v1-wide), slices, references, owned containers (String, Vec(T), anything Drop), variadic functions.

Codegen notes

  • link_extern block fns: module.add_function(symbol, ty, Some(Linkage::External)) with no body. Symbol = @link_name arg or the Gruel identifier.
  • Exported @mark(c) fns: Linkage::External, symbol set to @link_name or the identifier, Gruel mangling suppressed.
  • bool lowers to i8 at the ABI boundary; isize/usize use pointer_sized_int_type().
  • ADR-0069's Layout gains mode: LayoutMode { Default, C }. @mark(c) structs take the C path; niches stay empty.
  • Sema's and codegen's field-offset paths both dispatch on LayoutMode. Centralising offsets in Layout (ADR-0069 OQ3) is attractive but a follow-up refactor — not a blocker.
  • Pass and return @mark(c) structs by value via the default C calling convention.
  • After lowering, walk all link_extern blocks across the compilation to compute the deduplicated, lex-sorted library set; thread through CompileOptions into the linker step.

Capability system seed

Two facts together leave the seam open:

  1. The link_extern block is a named lexical unit a future capability ADR can refer to (per-library witness) or further refine into per-declaration witnesses.
  2. v1 commits to nothing about call-site gating — extern calls are bare today.

The capability ADR has full freedom over what FFI gating looks like: introduce a checked using cap_libc { … } form, require per-fn capability tokens, leave bare calls in place, or some hybrid. None of those options requires touching this ADR's syntax; they all just add structure around the existing extern declarations.

Preview gating

PreviewFeature::CFfi (CLI: c_ffi). Gate fires on @mark(c), the link_extern keyword, and body-less fn declarations. Retires in Phase 5.

Diagnostics

  • BodyLessFnOutsideLinkExtern
  • LinkExternItemHasBody
  • LinkExternNonFnItem { item_kind }
  • LinkExternEmptyLibraryName
  • LinkExternNested
  • LinkExternDuplicateImport { library, symbol }: same symbol declared twice with mismatched signatures.
  • MarkCOnEnum: @mark(c) is not yet valid on enums; points to the future enum-FFI ADR.
  • FfiTypeNotAllowed { type_name, position }
  • FfiAggregateHasNonCField { field_name, field_type, container_kind }
  • FfiAggregateHasDrop { aggregate_name }
  • LinkNameRequiresCAbi
  • CFfiPreviewRequired

Implementation Phases

Phase 1: Marker registry + parser

  • Add Abi enum and MarkerKind::Abi variant in gruel-builtins.
  • Add c (FUNCTION | STRUCT) row to BUILTIN_MARKERS.
  • Add PreviewFeature::CFfi in gruel-util/src/error.rs.
  • Lexer: reserve link_extern as a keyword.
  • Parser: add item form link_extern "(" STRING ")" "{" item* "}". Accept body-less fn declarations (trailing ;); sema enforces "only valid inside link_extern". Recognise @link_name.
  • AST/RIR: add Item::LinkExtern { library: Symbol, items: Vec<ItemId>, span }. Lowering stamps each contained fn with link_library: Symbol, is_extern = true, abi = Abi::C (unless overridden by an explicit ABI marker on the item).
  • Spec tests under cases/items/c-ffi.toml: parse cases for each @mark(c) shape (export, struct) and each link_extern shape (single fn, multiple fns, @link_name rename, empty block); rejection cases (body-less without link_extern, body inside link_extern, non-fn item inside, empty library name, nested link_extern, @mark(c) on enum); c_ffi preview gating.
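
The lowering step in the AST/RIR bullet can be pictured as follows. This is a loose Rust sketch: LoweredFn and lower_link_extern are hypothetical, and only the stamped fields (link_library, is_extern, abi) and the Abi enum are taken from this ADR.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Abi {
    C,
}

/// Hypothetical per-fn record produced by lowering a link_extern block.
#[derive(Debug, PartialEq)]
pub struct LoweredFn {
    pub name: String,
    pub link_library: Option<String>,
    pub is_extern: bool,
    pub abi: Option<Abi>,
}

/// Stamps every fn in a block with the block's library, is_extern = true,
/// and Abi::C unless the item carries an explicit ABI marker.
/// Each input pair is (fn name, explicit ABI marker if any).
pub fn lower_link_extern(library: &str, fns: &[(&str, Option<Abi>)]) -> Vec<LoweredFn> {
    fns.iter()
        .map(|&(name, explicit)| LoweredFn {
            name: name.to_string(),
            link_library: Some(library.to_string()),
            is_extern: true,
            abi: Some(explicit.unwrap_or(Abi::C)),
        })
        .collect()
}
```

Because Abi has a single variant in v1, the override path is trivially Abi::C today; the unwrap_or only becomes meaningful once further ABIs are added.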

Phase 2: Sema

  • Register @mark(c) on fns (export path) and structs. Reject on enums with MarkCOnEnum.
  • Enforce link_extern well-formedness (no body on items, fn-only in v1, non-empty library name, no nested link_extern).
  • Deduplicate imports across blocks/files; emit LinkExternDuplicateImport on signature mismatch.
  • Validate FFI types on params/returns of every C-ABI fn — both import (inside link_extern) and export (top-level @mark(c)). Reject enums anywhere on the boundary.
  • Validate @mark(c) struct fields recursively; emit FfiAggregateHasNonCField on offending spans.
  • Reject fn __drop on @mark(c) structs.
  • Validate @link_name placement.
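
The import deduplication step could look roughly like the Rust sketch below. Import, ImportError, and dedup_imports are hypothetical names, the signature is modelled as a plain string for brevity, and keying the table on the symbol alone (rather than on library plus symbol) is an assumption this ADR does not settle.

```rust
use std::collections::HashMap;

/// Hypothetical record of one imported fn; `signature` stands in for a real fn type.
#[derive(Debug, Clone, PartialEq)]
pub struct Import {
    pub library: &'static str,
    pub symbol: &'static str,
    pub signature: &'static str,
}

#[derive(Debug, PartialEq)]
pub enum ImportError {
    LinkExternDuplicateImport { library: &'static str, symbol: &'static str },
}

/// Merges imports from all blocks and files. Redeclarations with a matching
/// signature are dropped silently; a mismatch is an error.
pub fn dedup_imports(imports: &[Import]) -> Result<Vec<Import>, ImportError> {
    let mut seen: HashMap<&str, &Import> = HashMap::new();
    let mut unique = Vec::new();
    for import in imports {
        if let Some(first) = seen.get(import.symbol) {
            if first.signature != import.signature {
                return Err(ImportError::LinkExternDuplicateImport {
                    library: import.library,
                    symbol: import.symbol,
                });
            }
            continue; // matching redeclaration: keep the first one only
        }
        seen.insert(import.symbol, import);
        unique.push(import.clone());
    }
    Ok(unique)
}
```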

Phase 3: Codegen + linker

  • Emit extern fn declarations with Linkage::External and no body. (Handled automatically by LLVM auto-declaration at the call site; is_extern skips body emission.)
  • Emit exported @mark(c) fn bodies under their literal symbol names (mangling suppressed). (Codegen emits top-level fns with their Gruel identifier as the LLVM symbol; no mangling pass is currently applied.)
  • Walk link_extern blocks post-lowering to compute the deduplicated, lex-sorted library set; thread through CompileOptions to link_system_with_warnings; emit -l<name> after the runtime archive.
  • Runtime spec tests (link_extern_libm_sqrt exercises end-to-end -lm; ffi_struct_roundtrip_gruel exercises by-value struct passing).
  • Deferred to follow-up (not in v1 scope): explicit LayoutMode::C plumbing in Layout (ADR-0069 OQ3 follow-up), and the ffi_call_div_libc and ffi_nested_c_struct runtime suites. Today's default Gruel struct layout already matches the C-ABI struct layout rules for the field types that FFI permits, so spec-test coverage for C interop is achieved without a separate layout mode.

Phase 4: Spec + golden tests

  • New spec chapter (chapter 10) describing the FFI surface: @mark(c), link_extern blocks, body-less fn form, allowed types, C layout rules for structs, link line construction. (Note: implicit-unchecked was dropped in v1 — see Decision: Call-site posture.)
  • Add spec = [...] traceability to every Phase 1–3 spec test. Normative coverage at 100%.
  • UI tests for diagnostic quality on each rejection.

Phase 5: Stabilise

  • Remove PreviewFeature::CFfi; strip preview = "c_ffi" from spec tests.
  • ADR status → implemented; record spec sections in frontmatter.

Consequences

Positive

  • Interop with the entire C ecosystem for non-enum cases; the intrinsic treadmill for OS capabilities ends.
  • Struct-by-value FFI works on day one (div_t-style returns, nested structs).
  • Library linkage lives with the declarations that depend on it; bindings survive copy-paste between projects without losing build instructions.
  • Block form eliminates the per-decl repetition: no @mark(c, extern) on every line, no separate CLI step to remember.
  • link_extern slots into the grammar alongside fn/struct/enum — readers don't need to learn a new directive shape, and the marker registry stays uniform.
  • Bidirectional from the start — import (via link_extern) and export (via top-level @mark(c)) share the same ABI surface.
  • Composes with checked/unchecked — no changes to borrow, drop, or type checking.
  • Capability-system ready; the link_extern block is a natural unit for a per-library witness (checked using cap_libc { … } is one shape that would compose).
  • Consumes ADR-0069's @layout(c) slot for struct layout — no parallel @layout directive in the backlog.
  • Source files stay target-agnostic: the link line is computed from source, but the libraries themselves are platform-conventional. Deferring enums avoids baking any platform-int shape into v1, which leaves a clean slate for the c_int work.

Negative

  • One new reserved keyword (link_extern) — small footprint but a real grammar commitment.
  • No enum FFI in v1 — the largest scope cut. Users with C tagged unions wrap them by hand using @mark(c) struct { tag: u32, payload: SomeUnionShape } (or pointer indirection) until the enum-FFI ADR lands. Plain C enum types likewise stay out of signatures; users round-trip through i32/u32 with an explicit cast at the call site.
  • No variadic functions in v1 (no printf).
  • No fn drop on @mark(c) structs — cleanup is manual MutPtr discipline.
  • No CLI escape hatch for ad-hoc library linkage — adding -lfoo requires a source-level link_extern("foo") {}. If a workflow demands a real escape hatch later, restoring a --link flag is non-breaking.
  • No syntactic FFI gate at the call site. sin(2.0) and a local add(2.0) look identical, so a reader can't tell from the call alone that one is a foreign symbol. The capability ADR is expected to add gating; until then, the FFI/non-FFI boundary is only visible at the declaration.
  • C struct layout, once shipped, is a wire-format commitment.

Neutral

  • __gruel_* runtime symbols keep their current hardcoded path; cleanup to express them via link_extern("c") { … } is mechanical but optional.
  • Abi has one variant in v1 — room for future ABIs.
  • Library link order is alphabetical-and-deduplicated in v1; deterministic but not user-controlled.

Open Questions

  1. Field-offset centralisation in Layout. ADR-0069 OQ3. Two layout modes double the parallel-construction risk; leaning yes, but as a follow-up refactor.
  2. Empty link_extern blocks. Permitted in v1 — link_extern("foo") {} adds -lfoo without declaring symbols (useful for indirect-symbol-access cases). Open question is whether to lint them when no other link_extern("foo") { … } block adds declarations, on the theory that the user probably forgot to add the imports.
  3. Duplicate imports across blocks. Same symbol declared twice with matching signatures is silently deduped; mismatched signatures error. Reasonable default; revisit if users want strict "declare once per program".
  4. @link_name naming. Going with @link_name (matches Rust's #[link_name = …]); the visual rhyme with link_extern is mild and disambiguated by the @ prefix.
  5. Static / framework linkage modes. Deferred; v1 emits -l<name> only. Future syntax TBD (link_extern(static, "foo") { … }, link_extern("foo", kind = "static") { … }, or similar).
  6. fn drop on @mark(c) structs. Forbidden in v1; could be lifted with defined cross-boundary semantics.
  7. Order-sensitive linking. Punt to future work. Most user code links 0–2 libraries where order doesn't matter; alphabetical-deduped covers it.
  8. Cross-file link_extern merging. Two files writing link_extern("m") { fn sin(...); } and link_extern("m") { fn cos(...); } should merge into one library-set entry. Confirmed yes; sema-side dedup handles it.

Future Work

  • Enum FFI (@mark(c) enum). Data-carrying tagged unions and field-less C enums across the FFI boundary. Blocks on landing a target-dependent c_int type so the discriminant matches Rust's #[repr(C)] enum and C's enum conventions. Will widen c's applicable_to to FUNCTION | STRUCT | ENUM and lift the MarkCOnEnum rejection.
  • Variadic FFI (@mark(c, variadic) inside link_extern blocks).
  • fn drop on @mark(c) structs with defined cross-boundary semantics.
  • Packed C layout (@mark(c, packed)).
  • Extern statics inside link_extern blocks (link_extern("c") { static stdin: MutPtr(u8); }).
  • Additional ABIs: system, stdcall, vectorcall, rust — explicit via @mark(...) inside link_extern blocks.
  • Static and framework linkage modes (in-source syntax TBD).
  • Order-sensitive link_extern declarations for legacy static-archive linking.
  • CLI -l<name> override as a build-time escape hatch.
  • Capability-system integration — per-link_extern-block witness, or per-fn refinement.
  • C header import (@c_import("foo.h")).

References