fix(huggingface): avoid probes for selection-skipped shards by mldangelo-oai · Pull Request #1633 · promptfoo/modelaudit

mldangelo-oai · 2026-06-11T00:39:51Z

Summary

Fixes Hugging Face streaming selection so shard-shaped .safetensors artifacts are skipped before any remote content probe when the SafeTensors scanner is excluded by scanner selection. This prevents metadata/text or other scanner-selective runs from issuing range/body reads against large skipped SafeTensors shards.

Root Cause

_select_streamable_hf_files() correctly included files matching the selected scanner suffixes, but then the renamed-content-route loop could still call _detect_huggingface_content_route_format() on remaining files. For selected scanners such as pickle, scanner_ids_for_detected_format("safetensors") includes overlap scanners, so a large shard like model-00001-of-00002.safetensors excluded by suffix policy could still be range-read and selected after body sniffing.

Security Tradeoff

The fix is intentionally narrow: it skips only shard-shaped SafeTensors filenames when safetensors is not selected. Unknown suffixes and non-shard .safetensors files are still bounded-probed, so selected renamed malicious controls and payload.safetensors files whose bytes are pickle continue to route to the selected pickle scanner.
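A minimal sketch of the narrow rule described above. The regex and both helper names are hypothetical, not taken from the modelaudit source: a name counts as shard-shaped when it ends in -NNNNN-of-NNNNN.safetensors, and it is skipped before any remote read only when the safetensors scanner is not selected.

```python
import re

# Hypothetical pattern for names such as model-00001-of-00002.safetensors.
# The parser in modelaudit may accept a different set of shapes.
_SHARD_RE = re.compile(r".+-\d{5}-of-\d{5}\.safetensors", re.IGNORECASE)


def is_shard_shaped(path: str) -> bool:
    """Return True when the final path component looks like a SafeTensors shard."""
    name = path.rsplit("/", 1)[-1]
    return _SHARD_RE.fullmatch(name) is not None


def skip_before_probe(path: str, selected_scanners: set[str]) -> bool:
    """Skip without a remote read only for shard-shaped names when safetensors is unselected."""
    return "safetensors" not in selected_scanners and is_shard_shaped(path)
```

Under this sketch payload.safetensors and files with unknown suffixes still fall through to the bounded probe. Later comments in this thread replace the filename-only skip with content sniffing, so treat this as the initial design only.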

Real-Model QA

Pinned reproduction target:

# deepseek-ai/DeepSeek-R1 @ 56d4cbbb4d29f4355bab4b9a39ccb717a14ad5ad
PROMPTFOO_DISABLE_TELEMETRY=1 uv run python <guarded_hf_streaming_harness>

Harness behavior: forces the listed revision to 56d4cbbb4d29f4355bab4b9a39ccb717a14ad5ad, runs modelaudit scan --stream --scanners metadata,text --quiet --format json --max-size 100MB hf://deepseek-ai/DeepSeek-R1, patches requests.get to fail on any artifact body/range probe, and asserts that no .safetensors filenames are selected.
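The guard part of such a harness might look like the sketch below. The class and function names are hypothetical and the actual harness is not shown in this PR; the sketch only illustrates counting and rejecting body GETs and checking the selected filenames.

```python
from typing import Any, Callable


class BodyProbeGuard:
    """Stand-in for an HTTP GET callable that counts calls and fails on each one."""

    def __init__(self, real_get: Callable[..., Any]) -> None:
        self.real_get = real_get  # kept so the original can be restored after the run
        self.count = 0

    def __call__(self, url: str, *args: Any, **kwargs: Any) -> Any:
        self.count += 1
        raise AssertionError(f"unexpected artifact body/range GET: {url}")


def assert_no_safetensors_selected(selected: list[str]) -> None:
    """Fail when any selected filename ends with .safetensors."""
    offenders = [name for name in selected if name.lower().endswith(".safetensors")]
    assert not offenders, f"selected skipped shards: {offenders}"
```

One way to install the guard would be unittest.mock.patch("requests.get", guard) around the scan call, then reading guard.count afterwards to report the body GET count.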

Observed outcome after the final fix:

PINNED_LIST repo=deepseek-ai/DeepSeek-R1 revision=56d4cbbb4d29f4355bab4b9a39ccb717a14ad5ad files=174
SELECTED README.md
EXIT 1
BODY_GET_COUNT 0
SUMMARY {'success': True, 'files_scanned': 1, 'scanner_names': ['text']}

EXIT 1 is expected because the selected README scan records informational findings; the scan itself completed successfully, and no body or range GETs were issued against skipped shards.

Validation

uv sync --extra all-ci
uv run ruff format modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run ruff check --fix modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/utils/sources/test_huggingface.py::TestModelDownloadStreaming::test_download_model_streaming_selected_pickle_skips_safetensors_shard_without_probe tests/utils/sources/test_huggingface.py::TestModelDownloadStreaming::test_download_model_streaming_selected_pickle_preserves_safetensors_pickle_control tests/utils/sources/test_huggingface.py::TestModelDownloadStreaming::test_download_model_streaming_selected_pickle_preserves_renamed_malicious_control tests/utils/sources/test_huggingface.py::TestModelDownloadStreaming::test_download_model_streaming_selected_extension_sniffs_renamed_files tests/utils/sources/test_huggingface.py::TestModelDownloadStreaming::test_download_model_streaming_selected_extension_rejects_other_renamed_formats -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_cli.py::test_scan_huggingface_streaming_preserves_selected_extensionless_filenames tests/test_cli.py::test_scan_huggingface_streaming_passes_selected_scanner_extensions -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_safetensors_scanner.py::test_malformed_safetensors_custom_metadata_still_reports_security_flags tests/scanners/test_safetensors_scanner.py::test_oversized_header_returns_operational_exit2 tests/scanners/test_safetensors_scanner.py::test_oversized_header_triggers_limit_check -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/utils/sources/test_huggingface.py tests/test_scanner_selection.py -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_cli.py -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_streaming_scan.py -q
uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1
# post-fetch origin/main == 2f782ba1f18ab5aac4716c8c7ac6f0a16f0c6f60
uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/utils/sources/test_huggingface.py tests/test_scanner_selection.py tests/test_cli.py tests/test_streaming_scan.py -q
uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
git diff --check

mldangelo-oai · 2026-06-11T00:40:00Z

@codex review

github-actions · 2026-06-11T00:41:59Z

Performance Benchmarks

Compared 12 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 12 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 1.436s -> 1.442s (+0.4%).

| Workload | Benchmark | Target | Size | Files | Baseline | Current | Change | Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| `direct-malicious-upload` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_direct_malicious_upload` | `malicious_reduce` | 52 B | 1 | 534.5us | 514.9us | -3.7% | stable |
| `single-checkpoint-preflight` | `tests/benchmarks/test_scan_benchmarks.py::test_scan_single_checkpoint_before_load` | `single_checkpoint.pkl` | 183.0 KiB | 1 | 71.21ms | 72.75ms | +2.2% | stable |
| `warm-cache-rescan` | `tests/benchmarks/test_scan_benchmarks.py::test_scan_warm_cached_repository_rescan` | `release-candidate` | 547.3 KiB | 32 | 112.78ms | 110.33ms | -2.2% | stable |
| `mixed-model-repository` | `tests/benchmarks/test_scan_benchmarks.py::test_scan_release_candidate_repository` | `release-candidate` | 547.3 KiB | 32 | 482.76ms | 487.90ms | +1.1% | stable |
| `suspicious-pickle-intake` | `tests/benchmarks/test_scan_benchmarks.py::test_scan_suspicious_pickle_intake` | `suspicious-intake` | 183.8 KiB | 4 | 146.24ms | 144.69ms | -1.1% | stable |
| `duplicate-heavy-registry` | `tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_registry_snapshot` | `registry-snapshot` | 915.2 KiB | 13 | 394.78ms | 398.22ms | +0.9% | stable |
| `chunked-upload-stream` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_upload_stream` | `chunked_stream` | 278.2 KiB | 1 | 114.01ms | 113.22ms | -0.7% | stable |
| `nested-payload-review` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_base64]` | `nested_base64` | 98 B | 1 | 593.7us | 590.4us | -0.6% | stable |
| `clean-training-checkpoint` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_clean_training_checkpoint` | `safe_large` | 278.2 KiB | 1 | 111.21ms | 111.69ms | +0.4% | stable |
| `nested-payload-review` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_raw]` | `nested_raw` | 78 B | 1 | 584.7us | 585.9us | +0.2% | stable |
| `padded-multi-stream-upload` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_padded_multi_stream_upload` | `multi_stream_padded` | 4.1 KiB | 1 | 654.5us | 655.7us | +0.2% | stable |
| `nested-payload-review` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_hex]` | `nested_hex` | 130 B | 1 | 622.5us | 622.3us | -0.0% | stable |
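For readers checking the Change and Status columns, the arithmetic can be sketched as follows. The 15% regression threshold comes from the report; treating "improved" as the symmetric case, and the inclusive boundaries, are assumptions rather than the benchmark tool's documented behavior.

```python
def percent_change(baseline: float, current: float) -> float:
    """Relative change of current against baseline, in percent."""
    return (current - baseline) / baseline * 100.0


def classify(baseline: float, current: float, threshold_pct: float = 15.0) -> str:
    """Label a benchmark pair; the symmetric 'improved' rule is an assumption."""
    change = percent_change(baseline, current)
    if change >= threshold_pct:
        return "regression"
    if change <= -threshold_pct:
        return "improved"
    return "stable"
```

Applied to the aggregate medians, percent_change(1.436, 1.442) rounds to the reported +0.4%, well inside the threshold.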

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6f7fed2f61

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

mldangelo-oai · 2026-06-11T00:53:52Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 478f2395c2

mldangelo-oai · 2026-06-11T01:34:50Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0dab63c606

mldangelo-oai · 2026-06-11T01:59:41Z

@codex address that feedback. Fix the current P2 on exact head 0dab63c: preserve selected overlap routes such as --scanners compressed for shard-shaped SafeTensors with trailing gzip content. Add malicious and benign overlap/selection regressions, push a new head, and report tests.

mldangelo-oai · 2026-06-11T02:00:07Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 65d88c9d29

chatgpt-codex-connector · 2026-06-11T02:05:02Z

Summary

Preserved selected SafeTensors overlap routes by making shard exclusion depend on whether the selected scanner IDs overlap with scanner_ids_for_detected_format("safetensors"), so selected scanners like compressed can still receive shard-shaped SafeTensors artifacts after bounded content probing. (modelaudit/utils/sources/huggingface.py L912-L927, L1085-L1102)
Added Hugging Face streaming regressions covering benign non-overlap skipping after probe and selected compressed overlap routing for shard-shaped SafeTensors with trailing gzip content. (tests/utils/sources/test_huggingface.py L2662-L2703, L2711-L2748)
Kept malicious renamed/shard-shaped pickle coverage to ensure scanner selection still routes suspicious pickle bytes hidden behind SafeTensors shard names. (tests/utils/sources/test_huggingface.py L2756-L2788; tests/test_cli.py L4509-L4539)
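The overlap test described in this summary can be sketched as a set intersection. The mapping below is an illustrative stand-in for scanner_ids_for_detected_format; the scanner IDs listed are assumptions based on the pickle and compressed overlaps mentioned in this thread, not the real table.

```python
# Illustrative mapping only; the real one lives in modelaudit.
_ILLUSTRATIVE_ROUTES: dict[str, frozenset[str]] = {
    "safetensors": frozenset({"safetensors", "compressed", "pickle"}),
}


def scanner_ids_for_detected_format_sketch(fmt: str) -> frozenset[str]:
    """Hypothetical stand-in returning scanners that may consume a detected format."""
    return _ILLUSTRATIVE_ROUTES.get(fmt, frozenset())


def keep_detected_safetensors(selected: set[str]) -> bool:
    """Keep a probed SafeTensors shard only when a selected scanner can consume it."""
    return bool(selected & scanner_ids_for_detected_format_sketch("safetensors"))
```

With this shape, --scanners compressed keeps the shard for its trailing gzip content, while a metadata,text selection has no overlap and the shard is dropped after the bounded probe.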

Testing

✅ PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/utils/sources/test_huggingface.py::TestModelDownloadStreaming::test_download_model_streaming_selected_non_overlap_skips_detected_safetensors_shard_after_probe tests/utils/sources/test_huggingface.py::TestModelDownloadStreaming::test_download_model_streaming_selected_compressed_preserves_safetensors_shard_overlap_route tests/utils/sources/test_huggingface.py::TestModelDownloadStreaming::test_download_model_streaming_selected_pickle_routes_shard_shaped_renamed_pickle tests/test_cli.py::test_scan_huggingface_streaming_selected_pickle_scans_shard_shaped_renamed_pickle -q
✅ uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
✅ uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
✅ uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/

mldangelo-oai · 2026-06-11T02:22:08Z

@codex review

mldangelo-oai · 2026-06-11T02:23:54Z

@codex review the exact current head after preserving selected overlap routes. Recheck selected compressed/SafeTensors behavior, skipped shard reads, and malicious overlap controls.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c06226792a

mldangelo-oai · 2026-06-11T02:46:47Z

@codex review

mldangelo-oai · 2026-06-11T02:48:24Z

@codex review the exact current head fa87d8fe. Recheck skipped-shard selection, URL resolution/probe behavior, bounded acquisition, and malicious selected-shard controls.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fa87d8fe3a

mldangelo-oai · 2026-06-11T03:09:47Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c998168cd2

mldangelo-oai · 2026-06-11T03:35:45Z

@codex review

mldangelo-oai · 2026-06-11T03:44:59Z

@codex review the exact current head 4865428. Recheck shard-shaped SafeTensors selection, bounded content probing before exclusion, XGBoost/ONNX renamed route coverage, and genuine SafeTensors shard skip/cap behavior.

mldangelo-oai · 2026-06-11T05:46:21Z

@codex review the exact current head 189f5f357345ff572d19e00259935e9d3800455f. Recheck SafeTensors shard family scoping: parser full-matches the full repo path, complete declared families skip before probe only within the same path-scoped stem, incomplete cross-directory shard-shaped candidates still count against _HF_CONTENT_SNIFF_MAX_FILES, and selected-route malicious shard-shaped controls still route.
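The path-scoped family check named in this request could look roughly like the sketch below. The regex and function name are hypothetical: the pattern is full-matched against the whole repo path so the directory becomes part of the stem, and a family counts as complete only when shards 1..N share that stem.

```python
import re
from collections import defaultdict

# Hypothetical pattern; matched against the full repo path, not just the basename.
_SHARD_PATH_RE = re.compile(r"(?P<stem>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.safetensors")


def complete_shard_families(paths: list[str]) -> set[tuple[str, int]]:
    """Return (path-scoped stem, total) pairs whose shards 1..total are all present."""
    seen: dict[tuple[str, int], set[int]] = defaultdict(set)
    for path in paths:
        match = _SHARD_PATH_RE.fullmatch(path)
        if match is None:
            continue
        key = (match["stem"], int(match["total"]))
        seen[key].add(int(match["index"]))
    # A family is complete only when exactly the indices 1..total were seen.
    return {key for key, indices in seen.items() if indices == set(range(1, key[1] + 1))}
```

Shards split across directories never form a complete family here, so they would still count against a probe budget such as _HF_CONTENT_SNIFF_MAX_FILES. A later follow-up in this thread (fa86be75) removes the pre-probe skip for complete families, so filename completeness alone is not treated as proof of content.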

chatgpt-codex-connector · 2026-06-11T05:49:19Z

Codex Review: Didn't find any major issues. Keep them coming!

mldangelo-oai · 2026-06-11T10:12:46Z

Pushed additive follow-up at cbadb71d.

What changed:

Preserved selected noncanonical/zero-based/one-of-one shard-shaped .safetensors XGBoost UBJSON routes under bounded range probes while complete canonical SafeTensors shard families still skip without HEAD/GET/probe.
Added Hugging Face direct/model dry-run previews that avoid downloads/probes and return success only when there is no operational error.
Added regressions for excluded shard no-probe behavior, companion index accounting, noncanonical probe caps, gated/signed URL redaction, direct URL streaming bypass, HF dry-run, and dry-run exit codes.

Local verification:

PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/utils/sources/test_huggingface.py -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_cli.py -k "huggingface or dry_run" -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_exit_codes.py -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_streaming_scan.py tests/test_scanner_selection.py -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_core.py tests/test_core_asset_extraction.py -q
uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n 4 -m "not slow and not integration" --maxfail=1
uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
git diff --check

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cbadb71d66

mldangelo-oai · 2026-06-11T11:10:53Z

Pushed additive follow-up at ba3f314a.

What changed:

Suppressed Hugging Face dry-run preview text when JSON is emitted to stdout, preserving human previews for text/output-file modes.
Added direct-file and model dry-run JSON regressions that parse stdout with json.loads and assert no download/probe path is called.
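The stdout contract described above can be sketched as follows. The helper name and payload fields are hypothetical and chosen only for illustration; the point is that when JSON goes to stdout, nothing but the JSON object is written, so json.loads succeeds.

```python
import json
from typing import Optional


def render_dry_run_stdout(target: str, fmt: str, output_file: Optional[str] = None) -> str:
    """Return what a dry run would print to stdout (hypothetical helper)."""
    payload = {"success": True, "dry_run": True, "files_scanned": 0}
    if fmt == "json" and output_file is None:
        # JSON is written to stdout: emit only the payload, no human preview text.
        return json.dumps(payload)
    # Text mode, or JSON redirected to a file: the human preview can be shown.
    return f"Preview for {target}"
```

A regression in the style described would call json.loads on the captured stdout and assert on the parsed fields instead of searching for preview prose.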

Local verification after this follow-up:

PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/utils/sources/test_huggingface.py -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_cli.py -k "huggingface or dry_run" -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_exit_codes.py -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_streaming_scan.py tests/test_scanner_selection.py -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_core.py tests/test_core_asset_extraction.py -q
uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n 4 -m "not slow and not integration" --maxfail=1
uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
git diff --check

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ba3f314a9c

mldangelo-oai · 2026-06-11T11:49:26Z

Current-head CI on ba3f314a fails consistently on Python 3.12, Python 3.13, and Windows in:

tests/test_cli.py::test_scan_huggingface_direct_file_dry_run_does_not_download
assert "Preview for" in result.output

The command now returns a valid JSON dry-run object (success: true, dry_run: true, zero files/checks) rather than text-mode Preview for ..., so the test expectation is stale for JSON output. Update the regression to parse/assert the JSON contract and retain a separate text-format assertion if both modes are intended. Do not reintroduce human prose into JSON output. Rerun the focused CLI dry-run tests on all supported platforms, push additively, and request fresh current-head Codex review.

Run: https://github.com/promptfoo/modelaudit/actions/runs/27342690555

mldangelo-oai · 2026-06-11T11:54:59Z

Follow-up for the exact-head CI failure on ba3f314a:

Fixed the stale dry-run preview tests by pinning the human-preview cases to --format text; the JSON stdout regressions remain JSON-only and parse the emitted payload.
This is test-only and preserves the production behavior from the prior commits.

Local verification on the new head before push:

PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_cli.py::test_scan_huggingface_direct_file_dry_run_does_not_download tests/test_cli.py::test_scan_huggingface_direct_file_dry_run_json_stdout_is_parseable tests/test_cli.py::test_scan_huggingface_streaming_dry_run_uses_metadata_preview_without_download_or_probe tests/test_cli.py::test_scan_huggingface_streaming_dry_run_json_stdout_is_parseable -q (4 passed)
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_cli.py -k "huggingface or dry_run" -q (37 passed)
uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n 4 -m "not slow and not integration" --maxfail=1 (17399 passed, 1292 skipped)

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 47fa65ece8

mldangelo-oai · 2026-06-11T12:22:32Z

Independent review of exact head 47fa65ece833cb64b703ddbd17b4c2511c063477 found one P1 blocker:

Filename-complete declared SafeTensors shard families can hide renamed payloads selected by other scanners. The selector skips a complete .safetensors family before content-route detection runs. Completeness and overlap are inferred from filenames/scanner selection only; the bytes are never confirmed as SafeTensors.

Exact-head reproductions:

Complete model-00001-of-00002.safetensors / model-00002-of-00002.safetensors family whose mocked content detector returns XGBoost, with scanners=["xgboost"]: neither shard is probed or selected.
The same construction with ONNX content and ONNX selected also skips both shards.

Sniff content before applying the family skip unless the bytes are positively confirmed SafeTensors. Existing tests cover incomplete renamed families but not this complete-family case.

Validation: focused shard/XGBoost, index/budget, gated/overlap, dry-run, resource, and reconciliation slices all passed; pinned DeepSeek-R1 metadata/text QA selected only README with zero body GETs. Ruff/format/mypy passed. Four long CI lanes were still pending at closeout. The P1 bypass blocks merge.

mldangelo-oai · 2026-06-11T12:53:33Z

Pushed additive follow-up at fa86be75 for the current-head P1 shard-family finding.

What changed:

Removed the pre-probe complete SafeTensors shard-family skip, so content-routed selections no longer trust .safetensors shard filenames before bounded sniffing.
Proven SafeTensors shards still skip after content classification when selected routes cannot consume them, and they remain excluded from download/size inventory.
Added regressions for complete-looking two-shard XGBoost UBJSON payloads under both scanner-policy and extension-only selection, plus fail-closed cap behavior for high-count complete-looking shard families.
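A rough sketch of the sniff-then-skip ordering described above. The function names and return labels are hypothetical. The byte check relies on the SafeTensors layout (an 8-byte little-endian header length followed by a JSON header); a real detector would likely validate more than the first nine bytes.

```python
import struct


def looks_like_safetensors(prefix: bytes) -> bool:
    """Heuristic: 8-byte little-endian header length, then the start of a JSON object."""
    if len(prefix) < 9:
        return False
    (header_len,) = struct.unpack("<Q", prefix[:8])
    return header_len > 0 and prefix[8:9] == b"{"


def route_after_sniff(prefix: bytes, selected: set[str], safetensors_consumers: set[str]) -> str:
    """Decide from probed bytes, never from the filename alone."""
    if looks_like_safetensors(prefix):
        # Confirmed SafeTensors: keep only if a selected scanner can consume it.
        return "select" if selected & safetensors_consumers else "skip"
    # Anything else is routed by its detected content, whatever the name claims.
    return "route-by-content"
```

With this ordering, a two-shard family that looks complete but carries XGBoost or pickle bytes is never skipped on the strength of its names.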

Local verification on this head:

PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/utils/sources/test_huggingface.py -q (294 passed, 1 skipped)
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_cli.py -k "huggingface or dry_run" -q (37 passed)
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_streaming_scan.py tests/test_scanner_selection.py tests/test_core.py tests/test_core_asset_extraction.py -q (754 passed, 21 skipped)
uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n 4 -m "not slow and not integration" --maxfail=1 (17401 passed, 1292 skipped)
git diff --check

@codex review

chatgpt-codex-connector · 2026-06-11T12:56:56Z

Codex Review: Didn't find any major issues. Keep them coming!

fix(huggingface): avoid probes for selection-skipped shards

6f7fed2

chatgpt-codex-connector Bot reviewed Jun 11, 2026

Comment thread modelaudit/utils/sources/huggingface.py Outdated

fix(huggingface): preserve selected pickle controls

478f239

chatgpt-codex-connector Bot reviewed Jun 11, 2026

Comment thread modelaudit/utils/sources/huggingface.py Outdated

mldangelo-oai added 2 commits June 11, 2026 01:11

Merge remote-tracking branch 'origin/main' into mdangelo/codex/hf-fp-…

65180b2

…t13-hf-selective-skip-reads-20260610

fix(huggingface): probe shard-shaped pickle candidates

0dab63c

chatgpt-codex-connector Bot reviewed Jun 11, 2026

Comment thread modelaudit/utils/sources/huggingface.py Outdated

fix(huggingface): preserve safetensors overlap routes

65d88c9

chatgpt-codex-connector Bot reviewed Jun 11, 2026

Comment thread modelaudit/utils/sources/huggingface.py

fix(huggingface): skip irrelevant safetensors shards before probe

c062267

chatgpt-codex-connector Bot reviewed Jun 11, 2026

Comment thread modelaudit/utils/sources/huggingface.py Outdated

test(huggingface): document safetensors pickle overlap

fa87d8f

chatgpt-codex-connector Bot reviewed Jun 11, 2026

Comment thread modelaudit/utils/sources/huggingface.py Outdated

fix(huggingface): skip extension-only safetensors shards

c998168

chatgpt-codex-connector Bot reviewed Jun 11, 2026

Comment thread modelaudit/utils/sources/huggingface.py

fix(huggingface): probe shard-shaped selected routes

4865428

fix(huggingface): scope safetensors shard families by path

189f5f3

fix(huggingface): harden selective streaming skips

cbadb71

Preserve selected noncanonical shard-shaped XGBoost routes, make Hugging Face dry runs avoid downloads/probes, and add regressions for probe budgets, direct URLs, dry-run exit codes, and sanitized failures.

mldangelo-oai requested a review from mldangelo June 11, 2026 10:12

chatgpt-codex-connector Bot reviewed Jun 11, 2026

Comment thread modelaudit/cli.py Outdated

fix(cli): keep huggingface dry-run json clean

ba3f314

Suppress human preview text for Hugging Face dry runs when JSON is written to stdout and add direct-file/model regressions that parse the emitted JSON.

chatgpt-codex-connector Bot reviewed Jun 11, 2026

Comment thread modelaudit/utils/sources/huggingface.py Outdated

test(cli): pin huggingface dry-run preview format

47fa65e

Make the human-preview dry-run tests request text output explicitly so JSON defaults or config state cannot conflict with the JSON-stdout regressions.

chatgpt-codex-connector Bot reviewed Jun 11, 2026

Comment thread modelaudit/utils/sources/huggingface.py Outdated

fix(huggingface): sniff complete shard families before skipping

fa86be7

mldangelo-oai enabled auto-merge (squash) June 11, 2026 13:24

mldangelo-oai added 9 commits June 11, 2026 16:20

Merge origin/main into mdangelo/codex/hf-fp-t13-hf-selective-skip-rea…

16c7d07

…ds-20260610

Merge remote-tracking branch 'origin/main' into mdangelo/codex/hf-fp-…

0bcc1b5

…t13-hf-selective-skip-reads-20260610 # Conflicts: # modelaudit/utils/sources/huggingface.py

Merge remote-tracking branch 'origin/main' into mdangelo/codex/hf-fp-…

9c5b441

…t13-hf-selective-skip-reads-20260610

fix: fail closed on incomplete HF safetensors shards

46bd869

Merge remote-tracking branch 'origin/main' into mdangelo/codex/hf-fp-…

e244d2f

…t13-hf-selective-skip-reads-20260610 # Conflicts: # modelaudit/cli.py

Merge remote-tracking branch 'origin/main' into mdangelo/codex/hf-fp-…

3087595

…t13-hf-selective-skip-reads-20260610

Merge remote-tracking branch 'origin/main' into mdangelo/codex/hf-fp-…

f3dafba

…t13-hf-selective-skip-reads-20260610

Merge remote-tracking branch 'origin/main' into mdangelo/codex/hf-fp-…

0afb25e

…t13-hf-selective-skip-reads-20260610

Merge remote-tracking branch 'origin/main' into mdangelo/codex/hf-fp-…

38ecbfb

…t13-hf-selective-skip-reads-20260610 # Conflicts: # modelaudit/cli.py # modelaudit/utils/sources/huggingface.py # tests/utils/sources/test_huggingface.py

mldangelo-oai merged commit 94a89df into main Jun 11, 2026
29 checks passed

mldangelo-oai deleted the mdangelo/codex/hf-fp-t13-hf-selective-skip-reads-20260610 branch June 11, 2026 22:10
