
XNNPACK weights cache crashes on null pointer #18946

@jgibson2

Description

Bug

ExecuTorch crashes with SIGSEGV on Android when loading a dynamically quantized model that contains bias-less Conv2d layers. The crash occurs in XNNWeightsCache::look_up_or_insert, which calls memcmp on a null bias pointer held in the xnn_weights_cache_look_up_key struct.

The same model runs correctly on the macOS Python runtime, suggesting the weights cache is either disabled or uses a different code path there.

Root Cause

When creating a xnn_create_convolution2d_nhwc_qd8_f32_qc8w op (dynamic int8 quantization with per-channel weights), the XNNPACK weights cache builds a lookup key that includes pointers to the raw kernel and bias buffers. For bias-less convolutions, the bias pointer is null. The cache then calls memcmp on this null pointer, causing a segfault.
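To illustrate the failure mode, here is a hypothetical sketch of a lookup key that embeds raw buffer pointers; the struct layout and names are illustrative, not the actual xnn_weights_cache_look_up_key definition:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical shape of a weights-cache lookup key; field names are
// illustrative, not the real XNNPACK layout.
struct LookupKey {
  uint32_t seed;       // hash of op parameters
  const void* kernel;  // raw kernel buffer
  const void* bias;    // nullptr for a bias-less convolution
};

// The crashing pattern: comparing the referenced bias buffers byte-for-byte.
// When either bias pointer is null, memcmp dereferences address 0 and the
// process dies with SEGV_MAPERR at fault addr 0x0.
bool bias_matches(const LookupKey& a, const LookupKey& b, std::size_t bias_size) {
  return std::memcmp(a.bias, b.bias, bias_size) == 0;  // UB if either is null
}
```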

Stack Trace (Android, aarch64)

Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0

#00 pc 0x693cc   libc.so                (__memcmp_aarch64+12)
#01 pc 0x36be70  libexecutorch_jni.so   (XNNWeightsCache::look_up_or_insert +92)
#02 pc 0x410c68  libexecutorch_jni.so   [XNNPACK weights cache internal]
#03 pc 0x40c99c  libexecutorch_jni.so   [XNNPACK weights packing]
#04 pc 0x40d070  libexecutorch_jni.so   (xnn_create_convolution2d_nhwc_qd8_f32_qc8w +452)
#05 pc 0x409ae4  libexecutorch_jni.so
#06 pc 0x3d8a08  libexecutorch_jni.so
#07 pc 0x368e88  libexecutorch_jni.so   (XNNCompiler::compileModel +1168)
#08 pc 0x36ac8c  libexecutorch_jni.so   (XnnpackBackend::init +340)

Reproduction

  1. Export any model with bias-less Conv2d layers (e.g., a MobileNetV4 backbone, where 379 of 381 convolutions have no bias)
  2. Apply dynamic quantization:
    from executorch.backends.xnnpack.quantizer.xnnpack_quantizer import (
        XNNPACKQuantizer, get_symmetric_quantization_config,
    )
    quantizer = XNNPACKQuantizer()
    quantizer.set_global(get_symmetric_quantization_config(is_per_channel=True, is_dynamic=True))
  3. Lower with XnnpackPartitioner(per_op_mode=True) and export to .pte
  4. Load and execute on Android via libexecutorch_jni.so

The model runs fine on macOS Python ExecuTorch runtime.

Workaround

Add zero bias to all bias-less Conv2d modules before export:

import torch

for module in model.modules():
    if isinstance(module, torch.nn.Conv2d) and module.bias is None:
        # A zero bias leaves the output unchanged but gives the cache
        # lookup key a valid, non-null bias pointer.
        module.bias = torch.nn.Parameter(torch.zeros(module.out_channels))

This is semantically identical (zero bias is a no-op) but ensures the weights cache lookup key always has a non-null bias pointer.

Suggested Fix

XNNWeightsCache::look_up_or_insert should check for a null bias pointer before calling memcmp, or the xnn_weights_cache_look_up_key construction should handle the bias-less case (e.g., a zero-length memcmp, or skipping the bias comparison entirely).
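A minimal sketch of a null-safe comparison that could replace the raw memcmp; the function name is hypothetical, not the actual ExecuTorch API:

```cpp
#include <cstddef>
#include <cstring>

// Null-safe buffer comparison for the weights-cache lookup key (sketch).
// Treats two null pointers (both convolutions bias-less) as equal, and a
// null/non-null pair (only one op has a bias) as unequal, so memcmp is
// never called with a null argument.
bool buffers_equal(const void* a, const void* b, std::size_t size) {
  if (a == b) {
    return true;   // both null, or literally the same buffer
  }
  if (a == nullptr || b == nullptr) {
    return false;  // exactly one side has a bias: keys cannot match
  }
  return std::memcmp(a, b, size) == 0;
}
```

With this guard, a bias-less convolution's key still participates in cache lookups and two bias-less ops with identical kernels can share packed weights.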

Environment

  • ExecuTorch 1.1.0 (Android JNI build)
  • PyTorch 2.9.1 (export)
  • Android aarch64
  • Dynamic quantization with per-channel int8 weights (qd8_f32_qc8w ops)
