fix(huggingface): avoid probes for selection-skipped shards #1633

Merged
mldangelo-oai merged 26 commits into main from mdangelo/codex/hf-fp-t13-hf-selective-skip-reads-20260610 on Jun 11, 2026
Conversation

@mldangelo-oai

@mldangelo-oai mldangelo-oai commented Jun 11, 2026

Summary

Fixes Hugging Face streaming selection so shard-shaped .safetensors artifacts are skipped before any remote content probe when the SafeTensors scanner is excluded by scanner selection. This prevents metadata/text or other scanner-selective runs from issuing range/body reads against large skipped SafeTensors shards.

Root Cause

_select_streamable_hf_files() correctly included files matching the selected scanner suffixes, but then the renamed-content-route loop could still call _detect_huggingface_content_route_format() on remaining files. For selected scanners such as pickle, scanner_ids_for_detected_format("safetensors") includes overlap scanners, so a large shard like model-00001-of-00002.safetensors excluded by suffix policy could still be range-read and selected after body sniffing.
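The control flow described above can be sketched as follows. This is a simplified illustration, not the modelaudit implementation: `select_files_buggy`, `fake_detect`, `OVERLAP`, and `SUFFIXES` are hypothetical stand-ins, and the overlap table only mirrors the claim that pickle overlaps the detected `safetensors` format.

```python
from typing import Callable, List, Optional

# Hypothetical tables; the real scanner registry is not shown in this thread.
OVERLAP = {"safetensors": {"safetensors", "pickle"}, "pickle": {"pickle"}}
SUFFIXES = {"pickle": (".pkl", ".bin"), "safetensors": (".safetensors",)}


def select_files_buggy(
    files: List[str],
    selected: List[str],
    detect_route: Callable[[str], Optional[str]],
) -> List[str]:
    """Suffix pass first, then an unconditional content-sniff fallback."""
    wanted = tuple(s for sid in selected for s in SUFFIXES.get(sid, ()))
    chosen = [f for f in files if f.endswith(wanted)]
    for name in files:
        if name in chosen:
            continue
        fmt = detect_route(name)  # stands in for the remote range read
        if fmt and OVERLAP.get(fmt, set()) & set(selected):
            chosen.append(name)  # shard re-enters selection via overlap
    return chosen


probed: List[str] = []


def fake_detect(name: str) -> Optional[str]:
    probed.append(name)  # record every simulated probe
    return "safetensors" if name.endswith(".safetensors") else None


result = select_files_buggy(
    ["weights.pkl", "model-00001-of-00002.safetensors"], ["pickle"], fake_detect
)
```

With only `pickle` selected, the shard is still probed and ends up in `result`, which is the behavior this PR set out to remove.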

Security Tradeoff

The fix is intentionally narrow: it skips only shard-shaped SafeTensors filenames when safetensors is not selected. Unknown suffixes and non-shard .safetensors files are still bounded-probed, so selected renamed malicious controls and payload.safetensors files whose bytes are pickle continue to route to the selected pickle scanner.
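A minimal sketch of the narrow filename rule described above. The exact pattern modelaudit uses is not shown in this thread; the regex below is an assumption modeled on the `model-00001-of-00002.safetensors` example, and `is_shard_shaped` and `should_probe` are hypothetical helper names.

```python
import re

# Assumed shape: <stem>-<index>-of-<total>.safetensors with five-digit fields.
_SHARD_RE = re.compile(
    r"(?P<stem>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.safetensors"
)


def is_shard_shaped(path: str) -> bool:
    """True only when the whole path matches the assumed shard pattern."""
    return _SHARD_RE.fullmatch(path) is not None


def should_probe(path: str, selected: set) -> bool:
    """Skip the probe only for shard-shaped names when safetensors is excluded."""
    if "safetensors" not in selected and is_shard_shaped(path):
        return False
    # Unknown suffixes and non-shard .safetensors files stay probe-eligible.
    return True
```

Under this rule `payload.safetensors` is still probed with `pickle` selected, so a renamed pickle payload continues to reach the pickle scanner.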

Real-Model QA

Pinned reproduction target:

# deepseek-ai/DeepSeek-R1 @ 56d4cbbb4d29f4355bab4b9a39ccb717a14ad5ad
PROMPTFOO_DISABLE_TELEMETRY=1 uv run python <guarded_hf_streaming_harness>

Harness behavior: forces the listed revision to 56d4cbbb4d29f4355bab4b9a39ccb717a14ad5ad, runs modelaudit scan --stream --scanners metadata,text --quiet --format json --max-size 100MB hf://deepseek-ai/DeepSeek-R1, patches requests.get to fail on any artifact body/range probe, and asserts that no .safetensors filenames are selected.
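The guard idea in the harness can be illustrated with the standard library alone. The harness itself is not included in this thread, so everything here is a hypothetical stand-in: `http` replaces `requests`, and the two selector functions are toy models of a probing and a non-probing selection path.

```python
from types import SimpleNamespace
from unittest import mock

# Hypothetical HTTP client standing in for the module the scanner would call.
http = SimpleNamespace(get=lambda url, **kwargs: b"")


def select_without_probe(files):
    """Toy selector that decides from filenames only."""
    return [f for f in files if f.endswith(".md")]


def select_with_probe(files):
    """Toy selector that range-reads every shard before deciding."""
    for f in files:
        if f.endswith(".safetensors"):
            http.get(f, headers={"Range": "bytes=0-1023"})
    return [f for f in files if f.endswith(".md")]


def run_guarded(selector, files):
    """Run a selector with http.get patched to fail on any body/range GET."""
    calls = []

    def fail_on_get(url, **kwargs):
        calls.append(url)
        raise AssertionError(f"unexpected body/range GET: {url}")

    with mock.patch.object(http, "get", side_effect=fail_on_get):
        selected = selector(files)
    return selected, len(calls)
```

`run_guarded(select_without_probe, files)` returns the selection with a GET count of zero, while `run_guarded(select_with_probe, files)` raises on the first shard read.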

Observed outcome after the final fix:

PINNED_LIST repo=deepseek-ai/DeepSeek-R1 revision=56d4cbbb4d29f4355bab4b9a39ccb717a14ad5ad files=174
SELECTED README.md
EXIT 1
BODY_GET_COUNT 0
SUMMARY {'success': True, 'files_scanned': 1, 'scanner_names': ['text']}

EXIT 1 is expected because the selected README scan records informational findings; the scan itself completed successfully and no skipped shard body/range GETs occurred.

Validation

uv sync --extra all-ci
uv run ruff format modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run ruff check --fix modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/utils/sources/test_huggingface.py::TestModelDownloadStreaming::test_download_model_streaming_selected_pickle_skips_safetensors_shard_without_probe tests/utils/sources/test_huggingface.py::TestModelDownloadStreaming::test_download_model_streaming_selected_pickle_preserves_safetensors_pickle_control tests/utils/sources/test_huggingface.py::TestModelDownloadStreaming::test_download_model_streaming_selected_pickle_preserves_renamed_malicious_control tests/utils/sources/test_huggingface.py::TestModelDownloadStreaming::test_download_model_streaming_selected_extension_sniffs_renamed_files tests/utils/sources/test_huggingface.py::TestModelDownloadStreaming::test_download_model_streaming_selected_extension_rejects_other_renamed_formats -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_cli.py::test_scan_huggingface_streaming_preserves_selected_extensionless_filenames tests/test_cli.py::test_scan_huggingface_streaming_passes_selected_scanner_extensions -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_safetensors_scanner.py::test_malformed_safetensors_custom_metadata_still_reports_security_flags tests/scanners/test_safetensors_scanner.py::test_oversized_header_returns_operational_exit2 tests/scanners/test_safetensors_scanner.py::test_oversized_header_triggers_limit_check -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/utils/sources/test_huggingface.py tests/test_scanner_selection.py -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_cli.py -q
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_streaming_scan.py -q
uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1
# post-fetch origin/main == 2f782ba1f18ab5aac4716c8c7ac6f0a16f0c6f60
uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/utils/sources/test_huggingface.py tests/test_scanner_selection.py tests/test_cli.py tests/test_streaming_scan.py -q
uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
git diff --check

@mldangelo-oai

@codex review

@github-actions

github-actions Bot commented Jun 11, 2026

Workflow run and artifacts

Performance Benchmarks

Compared 12 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 12 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 1.436s -> 1.442s (+0.4%).

| Workload | Benchmark | Target | Size | Files | Baseline | Current | Change | Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| direct-malicious-upload | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_direct_malicious_upload | malicious_reduce | 52 B | 1 | 534.5us | 514.9us | -3.7% | stable |
| single-checkpoint-preflight | tests/benchmarks/test_scan_benchmarks.py::test_scan_single_checkpoint_before_load | single_checkpoint.pkl | 183.0 KiB | 1 | 71.21ms | 72.75ms | +2.2% | stable |
| warm-cache-rescan | tests/benchmarks/test_scan_benchmarks.py::test_scan_warm_cached_repository_rescan | release-candidate | 547.3 KiB | 32 | 112.78ms | 110.33ms | -2.2% | stable |
| mixed-model-repository | tests/benchmarks/test_scan_benchmarks.py::test_scan_release_candidate_repository | release-candidate | 547.3 KiB | 32 | 482.76ms | 487.90ms | +1.1% | stable |
| suspicious-pickle-intake | tests/benchmarks/test_scan_benchmarks.py::test_scan_suspicious_pickle_intake | suspicious-intake | 183.8 KiB | 4 | 146.24ms | 144.69ms | -1.1% | stable |
| duplicate-heavy-registry | tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_registry_snapshot | registry-snapshot | 915.2 KiB | 13 | 394.78ms | 398.22ms | +0.9% | stable |
| chunked-upload-stream | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_upload_stream | chunked_stream | 278.2 KiB | 1 | 114.01ms | 113.22ms | -0.7% | stable |
| nested-payload-review | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_base64] | nested_base64 | 98 B | 1 | 593.7us | 590.4us | -0.6% | stable |
| clean-training-checkpoint | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_clean_training_checkpoint | safe_large | 278.2 KiB | 1 | 111.21ms | 111.69ms | +0.4% | stable |
| nested-payload-review | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_raw] | nested_raw | 78 B | 1 | 584.7us | 585.9us | +0.2% | stable |
| padded-multi-stream-upload | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_padded_multi_stream_upload | multi_stream_padded | 4.1 KiB | 1 | 654.5us | 655.7us | +0.2% | stable |
| nested-payload-review | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_hex] | nested_hex | 130 B | 1 | 622.5us | 622.3us | -0.0% | stable |
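
The change column and the aggregate figure above follow from plain percent-change arithmetic. The sketch below reproduces two of the reported numbers; the classification rule is an assumption (a symmetric 15% band), since the benchmark tool's actual logic is not shown here, and `percent_change` and `classify` are hypothetical names.

```python
def percent_change(baseline: float, current: float) -> float:
    """Relative change of current versus baseline, in percent."""
    return (current - baseline) / baseline * 100.0


def classify(baseline: float, current: float, threshold_pct: float = 15.0) -> str:
    """Assumed rule: outside the +/- threshold band is a regression or improvement."""
    change = percent_change(baseline, current)
    if change > threshold_pct:
        return "regression"
    if change < -threshold_pct:
        return "improved"
    return "stable"


# Aggregate median from the report: 1.436s -> 1.442s
aggregate = round(percent_change(1.436, 1.442), 1)
# direct-malicious-upload row: 534.5us -> 514.9us
first_row = round(percent_change(534.5, 514.9), 1)
```

Both values land well inside the 15% band, which is consistent with every row being reported as stable.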

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6f7fed2f61

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread modelaudit/utils/sources/huggingface.py Outdated
@mldangelo-oai

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 478f2395c2

Comment thread modelaudit/utils/sources/huggingface.py Outdated
@mldangelo-oai

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0dab63c606

Comment thread modelaudit/utils/sources/huggingface.py Outdated
@mldangelo-oai

@codex address that feedback. Fix the current P2 on exact head 0dab63c: preserve selected overlap routes such as --scanners compressed for shard-shaped SafeTensors with trailing gzip content. Add malicious and benign overlap/selection regressions, push a new head, and report tests.

@mldangelo-oai

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 65d88c9d29

Comment thread modelaudit/utils/sources/huggingface.py
@chatgpt-codex-connector

Summary

Testing

  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/utils/sources/test_huggingface.py::TestModelDownloadStreaming::test_download_model_streaming_selected_non_overlap_skips_detected_safetensors_shard_after_probe tests/utils/sources/test_huggingface.py::TestModelDownloadStreaming::test_download_model_streaming_selected_compressed_preserves_safetensors_shard_overlap_route tests/utils/sources/test_huggingface.py::TestModelDownloadStreaming::test_download_model_streaming_selected_pickle_routes_shard_shaped_renamed_pickle tests/test_cli.py::test_scan_huggingface_streaming_selected_pickle_scans_shard_shaped_renamed_pickle -q
  • uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/

@mldangelo-oai

@codex review

@mldangelo-oai

@codex review the exact current head after preserving selected overlap routes. Recheck selected compressed/SafeTensors behavior, skipped shard reads, and malicious overlap controls.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c06226792a

Comment thread modelaudit/utils/sources/huggingface.py Outdated
@mldangelo-oai

@codex review

@mldangelo-oai

@codex review the exact current head fa87d8fe. Recheck skipped-shard selection, URL resolution/probe behavior, bounded acquisition, and malicious selected-shard controls.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fa87d8fe3a

Comment thread modelaudit/utils/sources/huggingface.py Outdated
@mldangelo-oai

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c998168cd2

Comment thread modelaudit/utils/sources/huggingface.py
@mldangelo-oai

@codex review

@mldangelo-oai

@codex review the exact current head 4865428. Recheck shard-shaped SafeTensors selection, bounded content probing before exclusion, XGBoost/ONNX renamed route coverage, and genuine SafeTensors shard skip/cap behavior.

@mldangelo-oai

@codex review the exact current head 189f5f357345ff572d19e00259935e9d3800455f. Recheck SafeTensors shard family scoping: parser full-matches the full repo path, complete declared families skip before probe only within the same path-scoped stem, incomplete cross-directory shard-shaped candidates still count against _HF_CONTENT_SNIFF_MAX_FILES, and selected-route malicious shard-shaped controls still route.
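
The path-scoped family rule named in this request can be sketched as below. It describes the behavior under review at this head, under stated assumptions: the five-digit pattern is inferred from the filenames in this thread, `complete_families` is a hypothetical helper, and treating only one-based families of two or more shards as complete is a guess at what "canonical" means here.

```python
import re
from collections import defaultdict

# Assumed shard pattern, full-matched against the whole repo path.
_SHARD_RE = re.compile(
    r"(?P<stem>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.safetensors"
)


def complete_families(paths):
    """Return path-scoped stems whose declared shard set is fully present."""
    seen = defaultdict(set)
    for path in paths:
        match = _SHARD_RE.fullmatch(path)
        if match:
            # The stem keeps its directory, so a/model and b/model never merge.
            key = (match["stem"], int(match["total"]))
            seen[key].add(int(match["index"]))
    return {
        stem
        for (stem, total), indices in seen.items()
        if total >= 2 and indices == set(range(1, total + 1))
    }
```

Two shards split across `a/` and `b/` do not form a complete family under this rule, so each would still count as an individual probe candidate.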

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Keep them coming!

Preserve selected noncanonical shard-shaped XGBoost routes, make Hugging Face dry runs avoid downloads/probes, and add regressions for probe budgets, direct URLs, dry-run exit codes, and sanitized failures.
@mldangelo-oai

Pushed additive follow-up at cbadb71d.

What changed:

  • Preserved selected noncanonical/zero-based/one-of-one shard-shaped .safetensors XGBoost UBJSON routes under bounded range probes while complete canonical SafeTensors shard families still skip without HEAD/GET/probe.
  • Added Hugging Face direct/model dry-run previews that avoid downloads/probes and return success only when there is no operational error.
  • Added regressions for excluded shard no-probe behavior, companion index accounting, noncanonical probe caps, gated/signed URL redaction, direct URL streaming bypass, HF dry-run, and dry-run exit codes.

Local verification:

  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/utils/sources/test_huggingface.py -q
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_cli.py -k "huggingface or dry_run" -q
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_exit_codes.py -q
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_streaming_scan.py tests/test_scanner_selection.py -q
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_core.py tests/test_core_asset_extraction.py -q
  • uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n 4 -m "not slow and not integration" --maxfail=1
  • uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • git diff --check

@codex review

@mldangelo-oai mldangelo-oai requested a review from mldangelo June 11, 2026 10:12

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cbadb71d66

Comment thread modelaudit/cli.py Outdated
Suppress human preview text for Hugging Face dry runs when JSON is written to stdout and add direct-file/model regressions that parse the emitted JSON.
@mldangelo-oai

Pushed additive follow-up at ba3f314a.

What changed:

  • Suppressed Hugging Face dry-run preview text when JSON is emitted to stdout, preserving human previews for text/output-file modes.
  • Added direct-file and model dry-run JSON regressions that parse stdout with json.loads and assert no download/probe path is called.
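
The output rule in the bullets above can be sketched as follows. This is an illustration only: `emit_dry_run` is a hypothetical function, the payload keys beyond `success` and `dry_run` are assumed, and the repository name in the preview line is a placeholder.

```python
import io
import json


def emit_dry_run(fmt: str, output_file, stdout) -> None:
    """Write a dry-run result; keep human preview text out of JSON on stdout."""
    payload = {"success": True, "dry_run": True, "files_scanned": 0}
    json_to_stdout = fmt == "json" and output_file is None
    if not json_to_stdout:
        # Text mode and output-file mode still get the human preview.
        stdout.write("Preview for hf://example/repo\n")
    if fmt == "json":
        target = stdout if output_file is None else output_file
        target.write(json.dumps(payload))


# JSON to stdout must parse cleanly with no preview prose mixed in.
json_out = io.StringIO()
emit_dry_run("json", None, json_out)
parsed = json.loads(json_out.getvalue())

# Text mode keeps the preview line.
text_out = io.StringIO()
emit_dry_run("text", None, text_out)
```

A regression in the style described above would then assert on `parsed` for the JSON case and on the preview string for the text case, keeping the two contracts separate.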

Local verification after this follow-up:

  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/utils/sources/test_huggingface.py -q
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_cli.py -k "huggingface or dry_run" -q
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_exit_codes.py -q
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_streaming_scan.py tests/test_scanner_selection.py -q
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_core.py tests/test_core_asset_extraction.py -q
  • uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n 4 -m "not slow and not integration" --maxfail=1
  • uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • git diff --check

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ba3f314a9c

Comment thread modelaudit/utils/sources/huggingface.py Outdated
@mldangelo-oai

Current-head CI on ba3f314a fails consistently on Python 3.12, Python 3.13, and Windows in:

tests/test_cli.py::test_scan_huggingface_direct_file_dry_run_does_not_download
assert "Preview for" in result.output

The command now returns a valid JSON dry-run object (success: true, dry_run: true, zero files/checks) rather than text-mode Preview for ..., so the test expectation is stale for JSON output. Update the regression to parse/assert the JSON contract and retain a separate text-format assertion if both modes are intended. Do not reintroduce human prose into JSON output. Rerun the focused CLI dry-run tests on all supported platforms, push additively, and request fresh current-head Codex review.

Run: https://github.com/promptfoo/modelaudit/actions/runs/27342690555

Make the human-preview dry-run tests request text output explicitly so JSON defaults or config state cannot conflict with the JSON-stdout regressions.
@mldangelo-oai

Follow-up for the exact-head CI failure on ba3f314a:

  • Fixed the stale dry-run preview tests by pinning the human-preview cases to --format text; the JSON stdout regressions remain JSON-only and parse the emitted payload.
  • This is test-only and preserves the production behavior from the prior commits.

Local verification on the new head before push:

  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_cli.py::test_scan_huggingface_direct_file_dry_run_does_not_download tests/test_cli.py::test_scan_huggingface_direct_file_dry_run_json_stdout_is_parseable tests/test_cli.py::test_scan_huggingface_streaming_dry_run_uses_metadata_preview_without_download_or_probe tests/test_cli.py::test_scan_huggingface_streaming_dry_run_json_stdout_is_parseable -q (4 passed)
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_cli.py -k "huggingface or dry_run" -q (37 passed)
  • uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n 4 -m "not slow and not integration" --maxfail=1 (17399 passed, 1292 skipped)

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 47fa65ece8

Comment thread modelaudit/utils/sources/huggingface.py Outdated
@mldangelo-oai

Independent review of exact head 47fa65ece833cb64b703ddbd17b4c2511c063477 found one P1 blocker:

Filename-complete declared SafeTensors shard families can hide renamed payloads selected by other scanners. The selector skips a complete .safetensors family before content-route detection runs. Completeness and overlap are inferred from filenames/scanner selection only; the bytes are never confirmed as SafeTensors.

Exact-head reproductions:

  • Complete model-00001-of-00002.safetensors / model-00002-of-00002.safetensors family whose mocked content detector returns XGBoost, with scanners=["xgboost"]: neither shard is probed or selected.
  • The same construction with ONNX content and ONNX selected also skips both shards.

Sniff content before applying the family skip unless the bytes are positively confirmed SafeTensors. Existing tests cover incomplete renamed families but not this complete-family case.
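
One way to read "positively confirmed SafeTensors" is a strict check on a bounded prefix of the file. The sketch below relies on the published SafeTensors layout (an 8-byte little-endian header length followed by a JSON header); the function names, the 1 MiB header cap, and the `consumers` set are assumptions for illustration and are not taken from modelaudit.

```python
import json
import struct


def is_confirmed_safetensors(prefix: bytes, max_header: int = 1 << 20) -> bool:
    """True only if the prefix holds a complete, parseable SafeTensors header."""
    if len(prefix) < 8:
        return False
    (header_len,) = struct.unpack("<Q", prefix[:8])
    if not 2 <= header_len <= max_header or len(prefix) < 8 + header_len:
        return False  # implausible or truncated: treat as unconfirmed
    try:
        header = json.loads(prefix[8 : 8 + header_len].decode("utf-8"))
    except ValueError:  # covers both decode and JSON errors
        return False
    return isinstance(header, dict)


def may_skip_shard(
    prefix: bytes,
    selected: set,
    consumers: frozenset = frozenset({"safetensors", "compressed"}),
) -> bool:
    """Skip only confirmed SafeTensors that no selected scanner can consume."""
    return is_confirmed_safetensors(prefix) and not (selected & consumers)
```

Anything that fails the strict check, such as UBJSON or ONNX bytes behind a `.safetensors` name, stays eligible for routing to the selected scanner instead of being dropped by filename.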

Validation: focused shard/XGBoost, index/budget, gated/overlap, dry-run, resource, and reconciliation slices all passed; pinned DeepSeek-R1 metadata/text QA selected only README with zero body GETs. Ruff/format/mypy passed. Four long CI lanes were still pending at closeout. The P1 bypass blocks merge.

@mldangelo-oai

Pushed additive follow-up at fa86be75 for the current-head P1 shard-family finding.

What changed:

  • Removed the pre-probe complete SafeTensors shard-family skip, so content-routed selections no longer trust .safetensors shard filenames before bounded sniffing.
  • Proven SafeTensors shards still skip after content classification when selected routes cannot consume them, and they remain excluded from download/size inventory.
  • Added regressions for complete-looking two-shard XGBoost UBJSON payloads under both scanner-policy and extension-only selection, plus fail-closed cap behavior for high-count complete-looking shard families.

Local verification on this head:

  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/utils/sources/test_huggingface.py -q (294 passed, 1 skipped)
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_cli.py -k "huggingface or dry_run" -q (37 passed)
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/test_streaming_scan.py tests/test_scanner_selection.py tests/test_core.py tests/test_core_asset_extraction.py -q (754 passed, 21 skipped)
  • uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n 4 -m "not slow and not integration" --maxfail=1 (17401 passed, 1292 skipped)
  • git diff --check

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Keep them coming!

@mldangelo-oai mldangelo-oai enabled auto-merge (squash) June 11, 2026 13:24
…t13-hf-selective-skip-reads-20260610

# Conflicts:
#	modelaudit/utils/sources/huggingface.py
…t13-hf-selective-skip-reads-20260610

# Conflicts:
#	modelaudit/cli.py
…t13-hf-selective-skip-reads-20260610

# Conflicts:
#	modelaudit/cli.py
#	modelaudit/utils/sources/huggingface.py
#	tests/utils/sources/test_huggingface.py
@mldangelo-oai mldangelo-oai merged commit 94a89df into main Jun 11, 2026
29 checks passed
@mldangelo-oai mldangelo-oai deleted the mdangelo/codex/hf-fp-t13-hf-selective-skip-reads-20260610 branch June 11, 2026 22:10