frontend-reentrancy #1

Merged
bngreer merged 17 commits from frontend-reentrancy into master 2026-06-03 02:03:38 +00:00
Exploration, on an isolated branch, toward making a second in-process
check_parsed_files run clean (the correctness model for #insert). Confirms the
problem is a CHAIN of session-local-identity forks, not a couple of swaps.

- reset_preload_caches (checker_entity.odin) + call at check_parsed_files top:
  nil the init_preload guard vars per session so runtime types (Allocator_Error,
  Context, Type_Info, ...) re-resolve against this session's base:runtime instead
  of a stale prior one. Fixes fork #1.
- global_file_id_counter (common.odin) + parser.odin: allocate file ids from one
  process-global monotonic counter, not a per-Parser one, so a 2nd session's ids
  don't collide with session 1's entries in the append-only global_files table
  (which would make ast_file return stale files -> import-graph crash). Fixes
  fork #2 (the clean way — unique ids, not clearing the tables).
- cmd/macrocheck: run_once + `--twice` re-entrancy test (the tool that pins each
  successive fork; currently red at fork #3).
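
A minimal sketch of fork #2, written in Python as a language-neutral model rather than the compiler's Odin code. Every name here (`PerParserIds`, `GlobalIds`, `register`, `run_session`) is hypothetical, and the "first entry for an id wins" lookup is an assumption standing in for the append-only table. It shows why restarting the id counter per session returns a stale file, while one process-wide counter does not.

```python
import itertools

# Append-only table shared by all sessions: file id -> file name (modelled).
global_files = {}

def register(file_id, name):
    # Assumption: an existing entry is never overwritten, so the first one wins.
    global_files.setdefault(file_id, name)

class PerParserIds:
    """Old scheme: every parser (session) restarts its counter at 1."""
    def __init__(self):
        self.next_id = 1

    def alloc(self):
        fid = self.next_id
        self.next_id += 1
        return fid

_global_counter = itertools.count(1)

class GlobalIds:
    """New scheme: all sessions draw from one process-wide monotonic counter."""
    def alloc(self):
        return next(_global_counter)

def run_session(ids, names):
    """Register each file, then return what a lookup by id yields."""
    seen = []
    for name in names:
        fid = ids.alloc()
        register(fid, name)
        seen.append(global_files[fid])
    return seen
```

With `PerParserIds`, a second session that parses `b.odin` gets id 1 again and the lookup returns session 1's `a.odin`. With `GlobalIds` the ids never repeat, so nothing in the table has to be cleared.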

Fork #3 (the wall): pass 2 SEGVs in check_procedure_later_info — persisted
builtin_pkg entities carry session-1 decl_info; builtin must be rebuilt per
session too. See docs/goal_a_pass_findings.md. No single-check regression
(macro suite green).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Completes the Goal A walk: macrocheck --twice now passes (pass 2 clean,
deterministic, stdlib-heavy programs, single-check unregressed). The chain was
NOT the canonical-identity wall — 4 session-local-state forks, same pattern each.

Forks 3-4 (1-2 already on this branch):
- #3 builtin packages: re-run init_universal per session (driver discipline in
  cmd/macrocheck — init_universal unconditionally rebuilds builtin/intrinsics/config).
- #4 global_after_checking_procedure_bodies: a session flag set after body-checking
  (checker.odin:2077), never cleared, that made check_procedure_later_info take a
  debug branch derefing a not-yet-linked info.decl.entity on session 2 (SEGV). Reset
  it at check_parsed_files top.
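
The shape of fork #4 can be modelled in a few lines of Python. This is a hedged sketch, not the checker's code: the `state` dict, `check_session`, and `run_twice` are hypothetical stand-ins, and the "crash" string stands in for the SEGV.

```python
# Hypothetical model of a process-global flag that outlives one check session.
state = {"after_checking_procedure_bodies": False}

def check_session(reset_at_top):
    """Run one modelled check and return "ok" or "crash"."""
    if reset_at_top:
        state["after_checking_procedure_bodies"] = False
    entity_linked = False
    # Signature pass: a stale flag selects the debug branch, which reads the
    # entity before it has been linked.
    if state["after_checking_procedure_bodies"] and not entity_linked:
        return "crash"  # stands in for the nil dereference
    entity_linked = True
    # Body pass: the flag is set once procedure bodies have been checked.
    state["after_checking_procedure_bodies"] = True
    return "ok"

def run_twice(reset_at_top):
    """Model of a --twice style run: two sessions in one process."""
    state["after_checking_procedure_bodies"] = False  # fresh process
    return [check_session(reset_at_top), check_session(reset_at_top)]
```

Without the reset, the first session passes and the second takes the stale branch. With the reset at the top of each session, both pass.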

So in-process re-check is solved -> unblocks the #insert loop (check -> run
metaprogram -> splice ^Ast -> re-check in-process, no subprocess). This is
re-check, not preload; module sharing still needs canonical identity. See
docs/goal_a_pass_findings.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`--inject` now builds a full top-level declaration `gen :: proc() { _ = dub(21) }`
entirely in-process (no source string, no parser), collects it into the live
package via check_collect_value_decl, drains the exported-entity queue, checks the
signature, and drains the deferred body through check_proc_info. The body is
genuinely type-checked: INJECT_BAD=1 swaps in `dub("x")` and the checker rejects it
("Cannot convert '"x"' to 'int' from 'untyped string'").

This is the emit-^Ast half of the metaprogram loop: a metaprogram can synthesize a
declaration and have it collected + checked in the same session, no string
marshaling, no parser re-entry. Documents the one caveat: hand-built proc/constant
entities land with .file == nil (check_collect_value_decl sets e.file only on the
variable branch), so the drain sets item.entity.file before add_entity_with_name_info.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
plan_run_test_set.md — #run as spec-pinned corpus, tiered T0..T4 (scalar fold →
typed run-result → RTTI introspection → scope-effect/metaprogram-bridge (pending)
→ robustness), driver-level cmd/runcheck harness over tests/run/, with the
i64-exit-code baseline and the typed-Exact_Value generalization called out.

plan_insert_splice.md — dynamic (computed) #insert via Model A: provisional check →
JIT-run the metaprogram (lb_run_jit_code -> ^Ast) → mutate AST in place (clone +
gensym + ordered_remove/inject, removing the directive = recursion-trap fix) →
authoritative re-check (proven re-entrant path). tests/insert/ I1..I8 including the
stale-entity "5th fork" canary.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
cmd/runcheck evaluates value-position `X :: #run f()` constants (vm2, no
backend link — same philosophy as cmd/macrocheck) and compares folded values
to sibling .expect files. Per-fixture init_universal exercises the proven
re-entrant path; multiple fixtures run cleanly in one process.

Value channel: synthesize `__frun_N :: proc() -> int { return f() }` and read
its return slot via ovm.run_program_value (host-safe + unclamped). This replaces
run_directive.odin's dead `exit(<call>)` mechanism, which routed through the
foreign exit handler -> libc.exit (would kill the harness) and capped at 0..255.
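
To make the 0..255 cap concrete, here is a small Python sketch of the two channels. The function names are hypothetical. The only outside fact used is that a POSIX process exit status keeps the low 8 bits of the value passed to exit.

```python
def via_exit_code(value):
    """What a parent process observes when the result is sent as an exit
    status: only the low 8 bits survive."""
    return value & 0xFF

def via_return_slot(value):
    """Read the result as a full 64-bit two's-complement return value."""
    raw = value & 0xFFFFFFFFFFFFFFFF
    return raw - (1 << 64) if raw >= (1 << 63) else raw
```

For example, `via_exit_code(300)` is 44 and `via_exit_code(-1)` is 255, while `via_return_slot` preserves negative and wide values. Those are the properties the t0_basic and t0_wide fixtures check.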

Tier-0 GATING (green): t0_basic (scalar incl. negative), t0_wide (full 64-bit,
proving the channel is unclamped), t0_error (bad #run body -> !error, no fold).
PENDING: pending/r2_rtti (#run reading type_info_of -> int; checks but vm2 can't
yet execute RTTI). tests/run/run.sh runs gating (must pass) + pending (reported,
non-gating); README maps each Rx.y id -> fixture -> status and scopes T1/T2/T3.

Depends on an additive ovm.run_program_value in odin-vm2/src/run.odin (left
uncommitted there atop the user's in-progress vm2 changes).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
vm2 is incomplete and must not be used as a backend; rebuild the #run harness on
backend-llvm (the ORC LLJIT run-driver).

- run_directive.odin: run_eval_consts now parses, neutralizes each `X :: #run f()`,
  synthesizes an exported `fs_run_N :: proc "c" () -> int { context =
  runtime.default_context(); return f() }`, checks, codegens, and JIT-runs every
  entry, reading its i64 return. The @(export) attribute (synthesized before check)
  makes each entry a minimum-dependency ROOT so it + its callees survive DCE and get
  external linkage; the context setup lets a #run body call ordinary Odin procs.
- backend-llvm/jit.odin: lb_run_jit_i64_multi builds the JIT once, then resolves and
  calls N `proc "c" () -> i64` symbols (lb_run_jit_i64 consumes the modules, so it
  cannot be called N times on the same modules).
- barnyard_native.odin: barnyard_jit_eval_symbols — public codegen+multi-JIT helper
  (reaches the file-private native_* setup).
- cmd/runcheck: thin driver over fs.run_eval_consts; links the full backend stack.

Tier-0 GATING green via LLVM (one process, re-entrant): t0_basic (scalar incl.
negative), t0_wide (full 64-bit), t0_error (!error). PENDING pending/r2_rtti now
EXECUTES under LLVM (returns -1) where vm2 couldn't run it — Type_Info_Struct match
still fails (JIT type-info table incomplete for a metaprogram-only type). run.sh now
builds the backend stack (needs libcranelift_wrap.a per STATUS.md); README updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Implements the dynamic #insert path: `#insert f()` where f is a metaprogram
returning a Code value. The driver (src/insert_directive.odin) runs Model A —
"check, then re-enter" — over TWO parses to avoid re-checking a mutated, already-
checked tree:

  ROUND 1 (provisional): parse, EMPTY any body containing a computed #insert
    (only the metaprograms need to run), synthesize an exported
    `fs_insert_N :: proc "c" () -> rawptr { context = runtime.default_context();
    return <operand> }`, check, codegen, JIT-run each -> a ^Ast Code node.
  ROUND 2 (authoritative): fresh parse; splice each Code node's statements
    (cloned into round-2's file; clone_ast nils ident entities so they re-resolve)
    in place of the #insert; check ONCE. The directive is gone -> no re-trigger.

- backend-llvm/jit.odin: lb_run_jit_code_multi (build JIT once, run N code entries).
- barnyard_native.odin: barnyard_jit_eval_code_symbols (codegen + multi code-JIT).
- cmd/insertcheck + tests/insert/: I1 splice+recheck, I2 spliced value USED by the
  caller, I3 spliced stmt CALLS a program proc (the identity canary — Plan-2's
  predicted "5th fork" did NOT bite), I7 negative (bad spliced #code rejected).
  4/4 green. Plan-1 runcheck regression still 3/3.

The `#insert #run gen()` seam, nested/recursive #insert, and operands closing over a
caller local are noted as not-yet-wired in tests/insert/README.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
type_info_of(T) was never broken in the metaprogram JIT — the r2_rtti fixture was
wrong. A top-level `Point :: struct{...}` is a NAMED type, so
type_info_of(Point).variant is Type_Info_Named, not Type_Info_Struct; the fixture
matched Type_Info_Struct directly (correctly false) and looked like a backend bug.
Corrected to unwrap via runtime.type_info_base; r2_rtti now reads field_count=3 at
compile time and is promoted from pending/ into gating (4/4 run suite green).
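
The fixture mistake can be modelled without the Odin runtime. In this Python sketch the classes `NamedInfo` and `StructInfo` and the field names are hypothetical; only the unwrap loop mirrors what runtime.type_info_base does (follow Named wrappers until a non-Named variant is reached).

```python
from dataclasses import dataclass

@dataclass
class StructInfo:
    field_names: tuple

@dataclass
class NamedInfo:
    name: str
    base: object  # the type this name refers to

def type_info_base(info):
    """Strip Named wrappers until the underlying type info is reached."""
    while isinstance(info, NamedInfo):
        info = info.base
    return info

# A top-level `Point :: struct{...}` is modelled as Named wrapping Struct.
# The field names are made up for illustration.
point = NamedInfo("Point", StructInfo(("x", "y", "z")))
```

Matching `point` against `StructInfo` directly is false, which is the behaviour the original fixture mistook for a backend bug. `type_info_base(point)` is a `StructInfo` with three fields.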

Diagnosis tooling: FS_JIT_DUMP_IR=<prefix> dumps each JIT'd module's .ll
(jit_build_and_load) — that showed type_info_data[110] was Type_Info_Named.

Re-entrancy fork FIXED: lb_reset_global_type_info_state() (backend general.odin),
called at the top of lb_generate_code. The process-global type-info offset cursors
(lb_global_type_info_*_index) are append-only per codegen; the in-process driver runs
lb_generate_code once per #run/#insert session, so without a reset, session N inherited
session N-1's cursors and type_info_of read the wrong giant-array slot. The reset is a
no-op for single-shot builds.

Also: reuse the #run operand node directly instead of clone_ast (avoids a needless
copy / potential type-identity split). A separate OPEN fork — a failed-check session
corrupts the next session's RTTI codegen — is worked around by ordering failing-check
fixtures last in run.sh (documented).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Traced the "failed-check poisons next session" fork: a #run/#insert session whose
check FAILED left global_error_collector.count non-zero, and
init_global_error_collector() (called at the top of each in-process session) only
RESERVED the arrays; it never reset count. So the next session's any_errors() returned
a stale true and
run_eval_consts aborted before codegen, silently (no new diagnostic). Not RTTI-
specific: it broke every fold after any failing session; only looked RTTI-ish because
the RTTI fixture happened to run after the negative one.

Fix: init_global_error_collector now atomically zeroes count/warning_count/in_block/
curr_error_value_set and clears error_values. tests/run/run.sh restores an order-
independent gating list (t0_error mid-list, RTTI after it — the case that used to
fail) and drops the "order failing fixtures last" workaround. run 4/4 + insert 4/4.
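
A minimal Python model of the poisoned-session bug and its fix. The class and function names are hypothetical, and the real collector has more fields than the two modelled here.

```python
class ErrorCollector:
    def __init__(self):
        self.count = 0
        self.errors = None

def init_old(c):
    """Old behaviour: only make sure storage exists; count is left alone."""
    if c.errors is None:
        c.errors = []

def init_fixed(c):
    """Fixed behaviour: storage exists and all per-session state is zeroed."""
    c.errors = []
    c.count = 0

def run_session(c, init, source_ok):
    """One in-process session: init, check, then fold unless errors exist."""
    init(c)
    if not source_ok:
        c.count += 1
        c.errors.append("type error")
    if c.count > 0:   # models any_errors()
        return None   # abort before codegen, with no new diagnostic
    return "folded"
```

Running good, bad, good sessions against one collector gives `["folded", None, None]` with `init_old` and `["folded", None, "folded"]` with `init_fixed`, which is why fixture order no longer matters.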

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
#run results now carry their real type, not just i64. Each entry is synthesized as
`proc "c" () -> rawptr { context = runtime.default_context(); return new_clone(f()) }`
— new_clone heap-allocates ^T and copies the value, so any type comes back through one
rawptr ABI. The driver keeps the JIT alive (barnyard_jit_eval_ptrs / lb_jit_eval_ptrs,
no dispose) and decodes the heap value per the const's CHECKED type
(operand.tav.type): int/uint/bool/enum -> i64 (sign-/zero-extended by size), f32/f64 ->
f64, string -> bytes copied out before dispose (string data lives in module rodata).
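
The type-directed decode can be sketched in Python with the standard `struct` module. This is an illustration under assumptions: `decode_scalar` and its kind strings are hypothetical, and little-endian byte order is assumed for the heap bytes.

```python
import struct

def decode_scalar(kind, raw):
    """Decode the bytes of a heap value according to its checked type.

    kind: "signed", "unsigned", "f32" or "f64" (hypothetical labels).
    raw:  the value's bytes; len(raw) is the type's size.
    """
    if kind == "signed":      # sign-extend by size
        return int.from_bytes(raw, "little", signed=True)
    if kind == "unsigned":    # zero-extend by size (also fits bool)
        return int.from_bytes(raw, "little", signed=False)
    if kind == "f32":         # widened to a Python float (f64)
        return struct.unpack("<f", raw)[0]
    if kind == "f64":
        return struct.unpack("<d", raw)[0]
    raise ValueError(f"unsupported kind: {kind}")
```

The same byte `0xFF` decodes to -1 as a one-byte signed value and to 255 as unsigned, which is why the decoder needs the checked type and cannot guess from the pointer alone.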

run_eval_consts returns map[string]Run_Value {kind, i, f, s}; runcheck's .expect now
parses 42 / 3.14 / "hello" and compares by kind (float with epsilon). New gating fixture
t1_typed (string/f64/f32/bool/enum/u64) — run suite 5/5, insert 4/4.

Backend additions are purely additive: lb_jit_eval_ptrs + lb_jit_dispose (jit.odin),
barnyard_jit_eval_ptrs + barnyard_jit_dispose (barnyard_native). Aggregates
(array/struct) decode to Invalid for now; folding typed values back into the AST
(run_fold_consts) still only handles the integer kind.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The computed-#insert walker only scanned top-level proc bodies, so `#insert`
inside an if/for/switch/case silently did nothing. Replaced proc_body_blocks +
flat collect/splice with a recursive pair (collect_in_stmt / splice_in_stmt) that
descends into BlockStmt, if (body+else), for, range, when, switch/type-switch
bodies, and case clauses — in identical DFS pre-order on both parses so the JIT'd
codes still align by index. Round 1 still empties any top-level body that contains
an #insert (anywhere nested).
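
The index alignment between the two parses can be sketched as follows. This Python model is hypothetical: a block is a list, a plain statement is a string, `("insert", operand)` is a computed #insert, and any other tuple `(kind, body)` is a control statement with one nested block. The real walker handles more shapes (else branches, case clauses).

```python
def collect(block, sites):
    """Round 1: gather #insert operands in DFS pre-order."""
    for stmt in block:
        if isinstance(stmt, tuple) and stmt[0] == "insert":
            sites.append(stmt[1])
        elif isinstance(stmt, tuple):
            collect(stmt[1], sites)
    return sites

def splice(block, codes):
    """Round 2: replace each #insert with the next code's statements.

    `codes` is an iterator over statement lists, in the order collect()
    produced the sites, so the same traversal order keeps them aligned.
    """
    out = []
    for stmt in block:
        if isinstance(stmt, tuple) and stmt[0] == "insert":
            out.extend(next(codes))
        elif isinstance(stmt, tuple):
            out.append((stmt[0], splice(stmt[1], codes)))
        else:
            out.append(stmt)
    return out

tree = ["a", ("insert", "g1"),
        ("if", ["b", ("insert", "g2"), ("for", [("insert", "g3")])]),
        "c"]
```

`collect(tree, [])` yields the operands in the order `g1, g2, g3`. Splicing the corresponding results in the same order leaves a tree with no `insert` nodes, so a later check has nothing to re-trigger.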

New gating fixtures: i4_multi (2 sites, combined), i5_nested (inside if),
i6_control (if + for + switch + deep if-in-for, 3 sites, order preserved),
i8_collide (negative: non-hygienic by design — two inserts declaring `t` →
redeclaration error; spliced names are referenceable so deliberately not gensym'd).
insert 8/8 + run 5/5; 0 crashes over 8 stability runs.

Documented two limits this surfaced (not splice-walker bugs): #code that
references splice-site locals fails (it's checked in the metaprogram scope), and
there is no fixpoint pass over a spliced #code that itself contains a computed #insert.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Dead code removed (superseded by the typed #run channel + cmd/runcheck):
- backend-llvm: lb_run_jit_i64_multi (0 callers after T1 switched to the rawptr/
  typed-value path); lb_run_jit_code_multi now delegates to lb_jit_eval_ptrs +
  lb_jit_dispose instead of duplicating the symbol-lookup loop.
- src/barnyard_native: barnyard_jit_eval_symbols (i64 multi; 0 callers).
- cmd/spike: do_crun / do_crun_eval / run_compile_time_procs — the OLD subprocess +
  vm2 #run prototype (RUNVAL text marshaling). It was broken by the Run_Value
  signature change and is superseded by cmd/runcheck (in-process backend-llvm).
  `spike macro` (the only test-used subcommand) still builds + runs.

Dedupe: the 5 identical AST constructors (mk_ident / mk_field_list /
mk_rawptr_result_list / mk_export_attr / mk_context_setup) were copy-pasted across
run_directive.odin (mk_*) and insert_directive.odin (ins_mk_*). Hoisted to one
package-private copy in src/macro_ast.odin; both drivers reuse it.

Also: removed the vm2-era stale fixtures (value/combined/side_effect, which called
the incomplete vm2's write_int) and fixed two comments referencing deleted symbols
(do_crun, lb_run_jit_i64_multi). run 5/5 + insert 8/8; spike macro green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
run_fold_consts now folds non-int #run results: mk_int_lit / mk_float_lit /
mk_str_lit (quoted, round-trips through unquote_string) / mk_bool_lit (true/false
ident), dispatched in fold_literal off the const's checked type (Run_Value gains a
`type` field; a bool decodes as Int but folds to `true`). enum + aggregates are not
folded yet (noted). This is the build-integration prerequisite: the value has to
land in the tree as a constant, not just be read.

run_fold_check + `runcheck --fold` prove the loop end-to-end: eval -> fold -> fresh
parse + check. New fixture fold_typed uses the folded consts where only a constant
works (`[N]int`, `PI + 1.0`, `MSG`, `when ON`) — checks clean, 4/4 folded.

Also fixed an eval limitation the fold test exposed: neutralizing a USED const to
`0` (e.g. `when ON` -> `when 0`) broke the eval check. run_eval_consts now empties
CONSUMER proc bodies during eval (keeping the metaprogram procs = the operands'
direct call targets), same idea as #insert's body-empty. run 5/5 + fold 1/1 +
insert 8/8.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds enum to run_fold_consts: fold a #run enum result to `TypeName(backing)` via
mk_enum_conv + named_type_name (Type_Named.name). The decoded value is the enum's
backing integer, so the conversion recovers the exact variant even for explicit-
valued enums; anonymous enums (no name) fall through to nil (unfolded). fold_typed
now also folds COL :: #run hue() -> Color(2) and uses it in `when COL == Color.Blue`
— 5/5 folded + re-checked.

Scalar fold-back is now complete: int / uint / float / string / bool / enum. Only
aggregate (array/struct → compound literals) fold-back remains. run 5/5 + fold 1/1
+ insert 8/8.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Finishes the typed fold-back. Run_Value gains an .Agg kind + nested `elems`.
decode_run_value reads aggregate heap bytes per the type — Type_Array (elem/count)
and Type_Struct (fields + computed offsets) — recursively into child Run_Values.
fold_literal's new .Agg case synthesizes a compound literal via mk_compound_lit:
type_expr emits `Name` for a named type or `[N]E` for an anonymous fixed array
(mk_array_type), and elements fold recursively, so nested struct/array nest too.
Anonymous structs (no nameable literal type) fall through to nil.
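
The recursive fold described above can be modelled as rendering a decoded value back to literal source text. This Python sketch is an approximation under assumptions: the tuple encoding of types, `type_expr`, and `fold` are hypothetical, `json.dumps` only approximates the quoting of a string literal, and the real code builds AST nodes rather than strings.

```python
import json

# Types (hypothetical encoding):
#   ("int",) ("f64",) ("string",) ("bool",)
#   ("enum", name_or_None)
#   ("array", count, elem_type)
#   ("struct", name_or_None, [field_types])

def type_expr(t):
    """Nameable literal type, or None when there is none (anonymous)."""
    if t[0] == "array":
        inner = type_expr(t[2])
        return None if inner is None else f"[{t[1]}]{inner}"
    if t[0] in ("struct", "enum"):
        return t[1]
    return t[0]

def fold(t, v):
    """Render value v of type t as literal text, or None if not foldable."""
    kind = t[0]
    if kind == "int":
        return str(v)
    if kind == "f64":
        return repr(float(v))
    if kind == "string":
        return json.dumps(v)
    if kind == "bool":
        return "true" if v else "false"
    if kind == "enum":
        return None if t[1] is None else f"{t[1]}({v})"
    # Aggregates: a compound literal whose elements fold recursively.
    head = type_expr(t)
    if head is None:
        return None
    elem_types = [t[2]] * t[1] if kind == "array" else t[2]
    parts = [fold(et, ev) for et, ev in zip(elem_types, v)]
    if any(p is None for p in parts):
        return None
    return f"{head}{{{', '.join(parts)}}}"
```

For instance, a named struct with two int fields folds to `Point{1, 2}`, a `[2][2]int` folds with each row carrying its own `[2]int` type, and an anonymous struct returns `None` because it has no nameable literal type.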

New fixture fold_agg folds [3]int / Point / Line{Point,Point} (nested struct) /
[2][2]int (nested array) and uses ARR[0] as a const array size — 4/4 folded +
re-checked. runcheck value-mode switches handle the new kind (aggregates are
exercised via --fold, not value comparison). run 5/5 + fold 2/2 + insert 8/8.

Typed fold-back is now complete: int/uint/float/string/bool/enum + array/struct.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
barnyard_resolve_directives runs between parse and the authoritative check (Model A):
one provisional compile runs all metaprograms, then mutates yard.parser in place
(run_fold_consts for #run, splice for #insert), then the real check sees a
directive-free AST. Documents the frontend delta (66 lines, all re-entrancy — no
directive logic in the frontend) and why it's a driver pass, not a frontend patch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
bngreer merged commit 7b097d69a5 into master 2026-06-03 02:03:38 +00:00
Reference
bngreer/rdnk!1