
XNNPACK weights cache crashes on null pointer #18946

@jgibson2

Description

Bug

ExecuTorch crashes with SIGSEGV on Android when loading a dynamically quantized model that contains bias-less Conv2d layers. The crash occurs in XNNWeightsCache::look_up_or_insert, which calls memcmp on a null bias pointer held in the xnn_weights_cache_look_up_key struct.

The same model runs correctly on the macOS Python runtime, suggesting the weights cache is either disabled or uses a different code path there.

Root Cause

When creating a xnn_create_convolution2d_nhwc_qd8_f32_qc8w op (dynamic int8 quantization with per-channel weights), the XNNPACK weights cache builds a lookup key that includes pointers to the raw kernel and bias buffers. For bias-less convolutions, the bias pointer is null. The cache then calls memcmp on this null pointer, causing a segfault.
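To illustrate the failure mode, here is a hypothetical sketch of a lookup key that embeds raw buffer pointers; the struct layout and names are illustrative, not the actual xnn_weights_cache_look_up_key definition:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical shape of a weights-cache lookup key; field names are
// illustrative, not the real XNNPACK layout.
struct LookupKey {
  uint32_t seed;       // hash of op parameters
  const void* kernel;  // raw kernel buffer
  const void* bias;    // nullptr for a bias-less convolution
};

// The crashing pattern: comparing the referenced bias buffers byte-for-byte.
// When either bias pointer is null, memcmp dereferences address 0 and the
// process dies with SEGV_MAPERR at fault addr 0x0.
bool bias_matches(const LookupKey& a, const LookupKey& b, std::size_t bias_size) {
  return std::memcmp(a.bias, b.bias, bias_size) == 0;  // UB if either is null
}
```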

Stack Trace (Android, aarch64)

Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0

#00 pc 0x693cc   libc.so                (__memcmp_aarch64+12)
#01 pc 0x36be70  libexecutorch_jni.so   (XNNWeightsCache::look_up_or_insert +92)
#02 pc 0x410c68  libexecutorch_jni.so   [XNNPACK weights cache internal]
#03 pc 0x40c99c  libexecutorch_jni.so   [XNNPACK weights packing]
#04 pc 0x40d070  libexecutorch_jni.so   (xnn_create_convolution2d_nhwc_qd8_f32_qc8w +452)
#05 pc 0x409ae4  libexecutorch_jni.so
#06 pc 0x3d8a08  libexecutorch_jni.so
#07 pc 0x368e88  libexecutorch_jni.so   (XNNCompiler::compileModel +1168)
#08 pc 0x36ac8c  libexecutorch_jni.so   (XnnpackBackend::init +340)

Reproduction

  1. Export any model with bias-less Conv2d layers (e.g., a MobileNetV4 backbone, where 379 of 381 convolutions have no bias)
  2. Apply dynamic quantization:
    from executorch.backends.xnnpack.quantizer.xnnpack_quantizer import (
        XNNPACKQuantizer, get_symmetric_quantization_config,
    )
    quantizer = XNNPACKQuantizer()
    quantizer.set_global(get_symmetric_quantization_config(is_per_channel=True, is_dynamic=True))
  3. Lower with XnnpackPartitioner(per_op_mode=True) and export to .pte
  4. Load and execute on Android via libexecutorch_jni.so

The model runs fine on macOS Python ExecuTorch runtime.

Workaround

Add zero bias to all bias-less Conv2d modules before export:

import torch

for module in model.modules():
    if isinstance(module, torch.nn.Conv2d) and module.bias is None:
        # A zero bias leaves the output unchanged but gives the cache
        # lookup key a valid, non-null bias pointer.
        module.bias = torch.nn.Parameter(torch.zeros(module.out_channels))

This is semantically identical (zero bias is a no-op) but ensures the weights cache lookup key always has a non-null bias pointer.

Suggested Fix

XNNWeightsCache::look_up_or_insert should check for a null bias pointer before calling memcmp, or the xnn_weights_cache_look_up_key construction should handle the bias-less case (e.g., a zero-length memcmp, or skipping the bias comparison entirely).
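A minimal sketch of a null-safe comparison that could replace the raw memcmp; the function name is hypothetical, not the actual ExecuTorch API:

```cpp
#include <cstddef>
#include <cstring>

// Null-safe buffer comparison for the weights-cache lookup key (sketch).
// Treats two null pointers (both convolutions bias-less) as equal, and a
// null/non-null pair (only one op has a bias) as unequal, so memcmp is
// never called with a null argument.
bool buffers_equal(const void* a, const void* b, std::size_t size) {
  if (a == b) {
    return true;   // both null, or literally the same buffer
  }
  if (a == nullptr || b == nullptr) {
    return false;  // exactly one side has a bias: keys cannot match
  }
  return std::memcmp(a, b, size) == 0;
}
```

With this guard, a bias-less convolution's key still participates in cache lookups and two bias-less ops with identical kernels can share packed weights.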

Environment

  • ExecuTorch 1.1.0 (Android JNI build)
  • PyTorch 2.9.1 (export)
  • Android aarch64
  • Dynamic quantization with per-channel int8 weights (qd8_f32_qc8w ops)
