Skip to content

Defer user error handlers to a VM safepoint to prevent mid-opcode reentry#124

Closed
iliaal wants to merge 8 commits into
masterfrom
proto/gh20018-defer-errors
Closed

Defer user error handlers to a VM safepoint to prevent mid-opcode reentry#124
iliaal wants to merge 8 commits into
masterfrom
proto/gh20018-defer-errors

Conversation

@iliaal

@iliaal iliaal commented Jun 26, 2026

Copy link
Copy Markdown
Owner

Defers a user error handler raised from a user-code opcode to the next VM safepoint, after the opcode completes, so the handler cannot reenter and free or mutate state the opcode still holds a raw pointer into. The interpreter and both JITs reach that safepoint through the existing vm_interrupt mechanism. Fixes phpGH-16792, phpGH-17416, phpGH-18274, phpGH-20042, phpGH-21245, phpGH-22419. A diagnostic silenced by @ keeps its error_reporting across the deferral.

Function JIT now flushes at the interpreter's safepoints: post-call with the callee frame still on top, each jump-target block, and function entry, spilling the live SSA values before it resumes. A diagnostic deferred before a loop no longer fires on the loop's forward entry, so the interpreter, function JIT and tracing JIT produce identical output. Exception handling flushes any buffered diagnostic with the pending exception saved and restored around the handler, so nothing is dropped when a later opcode throws (this is what resolved method_exists_variation_001 and the assign_dim_011 group). Diagnostics raised from internal functions still run synchronously, since the function may gate on EG(exception) or throw right after emitting; php_sort() and the array pointer functions pin their by-ref argument across the deprecation they raise.

Under the CALL VM (Windows and macOS), a deferred error could still flush while a call frame was being assembled. The throwing handler then unwound into the half-built frame and cleanup_unfinished_calls destroyed an unsent argument slot, which is NULL there; the hybrid VM keeps a live value in that slot, so glibc/x86_64 never faulted. zend_interrupt_helper deferred the flush past EX(call) only when no zend_interrupt_function was installed, but Windows always registers a Ctrl-C interrupt function, which disabled that guard. zend_interrupt_helper now re-arms vm_interrupt and defers the flush whenever a call is being assembled, regardless of zend_interrupt_function, and the JIT interrupt stub carries the same EX(call) guard. Resolves phpGH-18262 002/003 on both platforms.

@iliaal iliaal force-pushed the proto/gh20018-defer-errors branch 6 times, most recently from b882d82 to d990619 Compare June 27, 2026 18:59
A user set_error_handler() invoked synchronously mid-opcode can free or
mutate state the opcode still holds a raw pointer into, causing a
use-after-free or assertion failure (phpGH-16792, phpGH-17416, phpGH-18274,
phpGH-20042, phpGH-21245, phpGH-22419). Buffer diagnostics raised from a user-code
opcode and run the handler at the next VM safepoint, after the opcode
completes, reusing the existing vm_interrupt mechanism. Diagnostics from
internal functions still run synchronously, since such a function may
observe the handler's effect on EG(exception) or throw immediately after
emitting the diagnostic.

A deferred diagnostic that surfaces while another error handler is running
is delivered to the user handler first, before the handler that triggered
the flush. php_sort() and the internal-pointer functions separate the
array and pin the object across the now-synchronous deprecation they raise
on a by-ref array or object argument, so a handler cannot free it.

Save the opline before dispatching a pending interrupt. The deferral arms
vm_interrupt mid-statement, so the interrupt can fire at a point such as a
forward jump after an opcode already freed a temporary. The tracing-JIT
trace recorder runs CALL-convention handlers whose opline is a local
register distinct from EX(opline); ZEND_VM_SET_OPCODE updated the register
but not EX(opline), so the interrupt helper reloaded a stale EX(opline) and
re-executed the freed temporary's opcode under tracing JIT. Saving the
opline before the dispatch, gated on the existing vm_interrupt check,
resumes at the correct instruction without affecting the common path.

The inlined tracing-JIT function-leave flushes pending diagnostics before
tearing down the frame, so a handler that throws there leaves a pending
exception and a live return temporary the teardown then double-frees. Emit
the leave's EG(exception) guard unconditionally, since the flush can always
throw, rather than only when an unused return value's destruction could.

Run a pending deferred error's handler at the next call boundary instead of
the JIT's coarser safepoint: flush EG(deferred_errors) at the top of the
tracing/function-JIT DO_*CALL codegen, so a handler that inspects or
restores state runs while that state is still live, matching the
interpreter's per-call ZEND_VM_ENTER safepoint.

Keep diagnostics raised by the call-execution opcodes (DO_FCALL,
DO_FCALL_BY_NAME, DO_UCALL) synchronous, like the INIT_* opcodes: a
#[\Deprecated] function's deprecation is emitted as the function is invoked
and its handler may throw to suppress the call, so deferring it would run
the callee body first.
@iliaal iliaal force-pushed the proto/gh20018-defer-errors branch 3 times, most recently from 4774353 to 14a9a1e Compare June 28, 2026 15:36
Function JIT only checked for pending deferred errors at calls, leaves
and loop headers, much coarser than the interpreter, which flushes at
every jump (ZEND_VM_SET_OPCODE), call, leave and on function entry
(ZEND_VM_ENTER). This left diagnostics emitted in JIT-compiled code
surfacing at a different point than under the interpreter and tracing
JIT.

Add a materializing deopt at jump-target block starts and at the first
opcode of the function (before RECV): when a deferred error is pending,
spill the live register-allocated SSA vars (phi results and non-phi
values live into the block) back to their EX_VAR slots and route to the
interrupt handler, which flushes and resumes the interpreter for the
rest of the function. Skip e-SSA Pi nodes when materializing: a Pi node
aliases its source (for example a loop increment not yet computed at the
block entry), so storing it writes a stale value into the CV slot the
interpreter resumes on.

Run the post-call interrupt check with the callee frame still on top,
matching the interpreter's ZEND_VM_FCALL_INTERRUPT_CHECK, so a
diagnostic deferred from an argument expression captures the same
backtrace under function JIT as under the interpreter. Set EX(opline)
from IP before the leave-helper flush so a backtrace through a
still-suspended JIT frame reads a valid opline instead of NULL.

Record EG(error_reporting) when a diagnostic is deferred and restore it
around the handler at flush time, so a warning silenced by @ stays
silenced regardless of which safepoint flushes it: the handler observes
the error_reporting active when the diagnostic was raised, matching
synchronous behaviour.
@iliaal iliaal force-pushed the proto/gh20018-defer-errors branch from 14a9a1e to 2fb25d1 Compare June 28, 2026 15:58
iliaal added 3 commits June 28, 2026 17:32
zend_jit_leave_func's deferred-error flush points EG(current_execute_data)
at the leaving frame so the user error handler's backtrace is correct. On
a closure leave the frame is already left (left_frame is set when its CVs
are freed), so neither restore path runs and EG(current_execute_data)
stays the popped frame after the leave; a following tracing RETURN trace
then resumes on that frame and crashes. Reproduces only under the CALL VM
(HYBRID keeps execute_data in a register). Save and restore it around the
flush so the leave's frame bookkeeping is unaffected.
When the tracing JIT resumes the VM with a pending deferred error at a
DO_*CALL opline, zend_interrupt_helper flushed it with the caller frame
current and the callee call frame only half built. A throwing user error
handler then let HANDLE_EXCEPTION free the unset call args (a wild
refcounted slot, hence the SEGV) and dropped the callee from the
backtrace. Re-arm vm_interrupt and let the call opcode's own
FCALL_INTERRUPT_CHECK flush instead, matching the interpreter, so the
handler runs with the callee frame on the stack.
The trace recorder (zend_jit_trace_execute) lacked the interpreter's
function-entry vm_interrupt check, so a deferred error armed during a
call's argument binding was serviced at the callee's RETURN, after its
body had run, instead of at its first opcode. A deferred
must-be-passed-by-reference warning whose user handler reads a backtrace
then captured the post-assignment argument value (callback(0, 1, 0)
instead of (0, 0, 0)). Flush deferred errors when the recorder enters a
function, matching the interpreter's zend_interrupt_helper.
zend_flush_deferred_errors skipped the user handler while an exception
was pending, since a handler cannot run then, but it still drained the
buffer and silently lost the errors. master surfaces these, and so does
arnaud-lb#31. Keep them buffered while an exception is pending
and flush them in ZEND_HANDLE_EXCEPTION with the exception temporarily
cleared, so a throwing handler chains onto the original exception.

Update the tests whose expectations encoded the previous drop behaviour;
the deferred errors now fire (and chain) at the exception instead of
vanishing. This also makes the function and tracing JIT agree with the
interpreter on method_exists_variation_001.
@iliaal iliaal force-pushed the proto/gh20018-defer-errors branch from e228938 to eccfa77 Compare June 29, 2026 11:59
iliaal added 2 commits June 29, 2026 09:48
The function JIT drained EG(deferred_errors) at a loop header's forward
entry, through both the loop-header timeout check and the block-start
deopt. Errors armed before the loop fired one opcode too early, before
the loop body ran. The interpreter has no loop-header flush; it drains
only at the post-call interrupt check, on leave, and at HANDLE_EXCEPTION,
so method_exists_variation_001 diverged under opcache.jit=function.

Loop headers now use a timeout-only check that services a pending
timeout or interrupt function but leaves buffered deferred errors and
EG(vm_interrupt) in place, letting them ride to the next post-call,
leave, or exception safepoint. The block-start deopt skips loop headers;
non-loop branch targets keep it, which preserves the earlier-flush
ordering they require.
Flushing a deferred error at an interrupt safepoint can run a user error
handler that throws. If a call frame is being assembled (EX(call) is set,
or the opline is a DO_*CALL), the throw unwinds into the half-built frame
and cleanup_unfinished_calls destroys an unsent argument slot. On the CALL
VM that slot is NULL and the cleanup faults; the hybrid VM keeps a live
value there, so only Windows and macOS crash.

zend_interrupt_helper already deferred the flush past EX(call), but only
when zend_interrupt_function was unset. Windows always installs a Ctrl-C
interrupt function, which disabled the ride and let the flush run into the
half-built call. Re-arm vm_interrupt and defer the flush whenever a call
is being assembled, regardless of zend_interrupt_function, and run the
interrupt function separately. The JIT interrupt stub gets the same
EX(call) guard through zend_jit_check_deferred_errors.
@iliaal

iliaal commented Jun 29, 2026

Copy link
Copy Markdown
Owner Author

Promoted upstream as php#22515. Closing this staging PR.

@iliaal iliaal closed this Jun 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Assertion failure at Zend/zend_hash.c:3076 Assertion failure Zend/zend_vm_execute.h JIT

1 participant