fix: calibrate onnx custom domain findings #1639
Codex Review: Didn't find any major issues. 🚀
Summary
- Adds a low-noise path for `com.microsoft` optimized-export operators observed in public transformer ONNX exports: `FastGelu` and `SkipLayerNormalization` at `com.microsoft` opset 1 with empty overload.
- Keeps S1111 for other `com.microsoft` ops, missing/unsupported/conflicting opsets, non-empty overloads, and function-body unknowns.

Root Cause
`_is_external_custom_operator()` treated every non-standard, non-model-local, non-schema-validated domain identically. ONNX Runtime optimized exports use `com.microsoft` metadata for runtime-provided kernels, but that domain is not an ONNX standard domain and cannot be validated via `onnx.defs.has()`, so common benign `com.microsoft` nodes produced S1111 alongside truly unknown domains.

Security Tradeoff
This does not add `com.microsoft` as a trusted standard domain. The low-noise policy is exact-tuple based and requires the `com.microsoft` domain together with the operator, opset, and overload named in the Summary. Unknown domains and ambiguous vendor claims still fail closed into S1111. Python-like operators remain critical S902 regardless of domain.
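As a rough illustration of what an exact-tuple policy can look like, here is a minimal sketch. It is not the scanner's actual code: `NodeInfo`, `LOW_NOISE_TUPLES`, and `is_low_noise` are hypothetical names, and the sketch assumes the opset versions declared for a node's domain have already been collected into a tuple. Only the two operators, the domain, opset 1, and the empty overload come from this PR.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class NodeInfo:
    """Hypothetical, simplified view of an ONNX node for this sketch."""

    op_type: str
    domain: str
    overload: str
    # Every opset version the model declares for this node's domain.
    domain_opsets: tuple[int, ...]


# Exact (domain, op_type, opset, overload) tuples taken from the PR summary.
LOW_NOISE_TUPLES = frozenset(
    {
        ("com.microsoft", "FastGelu", 1, ""),
        ("com.microsoft", "SkipLayerNormalization", 1, ""),
    }
)


def is_low_noise(node: NodeInfo) -> bool:
    """Return True only when the node matches an allowlisted tuple exactly."""
    versions = set(node.domain_opsets)
    # A missing or conflicting opset declaration fails closed.
    if len(versions) != 1:
        return False
    (version,) = versions
    key = (node.domain, node.op_type, version, node.overload)
    return key in LOW_NOISE_TUPLES
```

Because the check is a set membership test on the full tuple, anything that deviates in any field (a different operator, a different opset, a non-empty overload, a different domain) falls through to the existing unknown-domain handling rather than being silently accepted.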
Pinned Real-Model QA
Baseline on the starting SHA `ab6e58ebe587197b2b46866c69f7d41d75e08a55` reproduced the false-positive path on `onnx/model_O4.onnx`:

- `sentence-transformers/all-MiniLM-L6-v2@1110a243fdf4706b3f48f1d95db1a4f5529b4d41`: `com.microsoft` nodes `SkipLayerNormalization=12`, `FastGelu=6`; S1111 count before fix: 18
- `cross-encoder/ms-marco-MiniLM-L6-v2@c5ee24cb16019beea0893ab7796b1df96625c6b8`: `com.microsoft` nodes `SkipLayerNormalization=12`, `FastGelu=6`; S1111 count before fix: 18
- `sentence-transformers/all-mpnet-base-v2@e8c3b32edf5434bc2275fc9bab85f82640a19130`: `com.microsoft` nodes `SkipLayerNormalization=25`, `FastGelu=12`; S1111 count before fix: 37

Post-fix direct QA command:
Post-fix outcome: all three pinned models still contain the expected `com.microsoft` operators, and all report `total_s1111=0`, `microsoft_s1111=0`, `metadata_custom_domains=[]`.

Validation
- `PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_onnx_scanner.py -k "microsoft or custom_domain or custom_operator" -m "not slow and not integration" --maxfail=1` → 20 passed
- `PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_onnx_scanner.py -m "not slow and not integration" --maxfail=1` → 260 passed, 3 deselected
- `MODELAUDIT_RUN_HF_REAL_MODEL_TESTS=1 PROMPTFOO_DISABLE_TELEMETRY=1 HF_HUB_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_onnx_scanner.py -k "pinned_hf" -m "slow and integration" -s --maxfail=1` → 3 passed
- `uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/` → 419 files already formatted
- `uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/` → all checks passed
- `uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/` → success, 474 source files
- `PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1` → 18075 passed, 793 skipped
- `git diff --check` → clean