Fix GH-21368 crash: runtime orig_handler lookup in escape_if_undef#21710
Fix GH-21368 crash: runtime orig_handler lookup in escape_if_undef#21710iliaal wants to merge 1 commit intophp:PHP-8.4from
Conversation
|
It would be great to understand where the failure comes from. As I see, JIT_G(current_frame) is always set during trace compilation (it can't be NULL) , so this part of deduction can not be the reason of the failure. The fix generates more expensive JIT code that performs run-time resolution, I believe it should be possible to make this resolution at compile time. |
|
I can reproduce this bug consistently. If somone can help compile a Windows version, I can test this fix |
php#21368 dispatched to orig_handler via exit_info->op_array. That pointer is set at parent-trace compile time from JIT_G(current_frame)->func, and can become stale by the time a side trace compiles for that exit, producing an access violation in zend_jit_escape_if_undef on long-lived IIS+FastCGI workers. Use jit->current_op_array instead. On the crash path (zend_jit_compile_side_trace -> zend_jit_trace), zend_jit_trace_start sets it to trace_buffer->op_array, which is freshly captured for the current compilation and avoids the parent's potentially stale reference. On the zend_jit_trace_exit_to_vm path, zend_jit_deoptimizer_start leaves current_op_array unset, so set it from exit_info->op_array there. The gh21267 tests still pass.
4c71cf2 to
7c69d4b
Compare
PR updated. |
| /* We can't use trace_escape() because opcode handler may be overridden by JIT */ | ||
| zend_jit_op_array_trace_extension *jit_extension = | ||
| (zend_jit_op_array_trace_extension*)ZEND_FUNC_INFO(op_array); | ||
| (zend_jit_op_array_trace_extension*)ZEND_FUNC_INFO(jit->current_op_array); |
There was a problem hiding this comment.
Are you sure jit->current_op_array is always initialized?
It seems, it may be NULL on the following patch - zend_jit_trace_exit() -> zend_jit_trace_hot_side() -> zend_jit_blacklist_trace_exit() -> zend_jit_trace_exit_to_vm() - > zend_jit_trace_deoptimization() -> zend_jit_escape_if_undef().
May be I'm wrong...
There was a problem hiding this comment.
exit_info.op_array is written in one place, zend_jit_trace_get_exit_point (trace.c:197), and the surrounding code at trace.c:146-164 assigns op_array and stack_size together: JIT_G(current_frame) non-NULL gives a real op_array with some stack_size; the else branch sets both op_array = NULL and stack_size = 0 in the same step.
So on your path, ctx.current_op_array == NULL only when exit_info.stack_size == 0, which means zend_jit_trace_deoptimization loops over zero entries, check2 stays at -1, and zend_jit_escape_if_undef at trace.c:3606 isn't called. The NULL never reaches ZEND_FUNC_INFO(jit->current_op_array).
Can add ZEND_ASSERT(exit_info[exit_num].op_array || exit_info[exit_num].stack_size == 0) to make it explicit.
|
Should the original fix be reverted until we fully understand the issue? I don't have the time to look at the problem, and also don't have a Windows setup. Otherwise we'll miss the next release cycle (I don't feel comfortable merging this without an RC-phase). |
|
Possibly, same as you I cannot reproduce the issue due to lack of windows env, fix is made based on code analysis. Ultimately be nice to have windows build the reporter could try, maybe revert from 8.5/8.4 keep in master? And then try fully patched version for 8.4/8.5 back port after current release is out? |
Follow-up to #21368.
@vibbow reported an access violation in zend_jit_escape_if_undef on
PHP 8.5.5 VS17 x64 NTS (Windows + IIS + FastCGI), crashing at the fix
I introduced in #21368:
The crashing instruction is
mov rcx, [rax+rcx*8+0xD0], reading 0xD0from a heap address that isn't mapped. On 64-bit NTS that offset
corresponds to
zend_op_array->reserved[zend_func_info_rid], which iswhat
ZEND_FUNC_INFO(op_array)expands to.Root cause
#21368 dispatched to
orig_handlervia a compile-time load computedfrom
exit_info->op_array. That pointer is set at parent-trace compiletime from
JIT_G(current_frame)->func, and can become stale by the timea side trace compiles for that exit. Long-lived FastCGI workers with
many trace invalidation/recompile cycles match the observed failure
pattern.
Fix
Keep compile-time dispatch but resolve
orig_handleragainstjit->current_op_arrayinstead of the parent'sexit_info:zend_jit_trace_startalready setsjit->current_op_arraytotrace_buffer->op_array, which is freshly captured for the currentcompilation and doesn't depend on the parent's stale pointer.
zend_jit_trace_exit_to_vmpath,zend_jit_deoptimizer_startleaves
current_op_arrayunset, so it is now set fromexit_info->op_arraythere.This drops the
op_arrayparameter added tozend_jit_escape_if_undefin #21368.
Verification
gh21267.phptandgh21267_blacklist.phptboth pass. The originalinfinite-loop fix is preserved.
Reproducer
I couldn't build a native Linux reproducer for the stale-pointer
scenario through synthetic stress. @vibbow can reproduce reliably on
IIS+FastCGI and will validate the fix once a Windows build is
available.