fix: recognize GGUF BF16 tensor type #1632
Add GGUF tensor type 30 to the known GGML type table and cover BF16, unsupported tensor type, and malformed tensor metadata behavior.
|
@codex review |
Performance Benchmarks Compared
|
|
Codex Review: Didn't find any major issues. You're on a roll.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
…t11-gguf-bf16-type30-20260610
|
@codex review |
|
Codex Review: Didn't find any major issues. Bravo. |
|
New real-model regression input from origin/main |
|
Codex Review: Didn't find any major issues. You're on a roll. |
|
Real-model QA for the new Gemma GGUF case completed on current PR head.
PR head / fetch state:
Pinned model revision:
Exact full streaming command:
PROMPTFOO_DISABLE_TELEMETRY=1 uv run modelaudit scan --stream --no-cache --format json --output /tmp/pr1632-gemma4-type30-stream.json --timeout 7200 --max-size 20GB hf://google/gemma-4-26B-A4B-it-qat-q4_0-gguf
Terminal outcome:
Scanner IDs / assets:
Issue/check counts:
Fail-closed controls rechecked:
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_gguf_scanner.py -k "unknown_tensor_type or bf16 or truncated_tensor_dimensions or tensor_bounds_detects_uint64_wrap or tensor_information_byte_limit or default_tensor_count_limit_blocks_compact_resource_exhaustion or tensor_metadata_summary_is_capped_without_skipping_validation"
Result:
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_gguf_scanner.py
Result:
Conclusion: the origin/main type-30 S902 flood and operational failure are gone on the PR head for the exact Gemma revision. Unsupported tensor types, malformed tensor metadata, impossible offsets, truncation, and tensor/metadata resource controls remain bounded and fail closed in the focused and full GGUF regression suite. |
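For readers unfamiliar with the uint64-wrap control named in the test selection above, here is a minimal Python sketch of the idea. It is not the scanner's code: the function name `tensor_range_in_bounds` and its parameters are hypothetical, and the only assumption is that offsets and sizes are stored as unsigned 64-bit values in the file.

```python
UINT64_MAX = 2**64 - 1


def tensor_range_in_bounds(data_start, offset, nbytes, file_size):
    """Return True only if [data_start + offset, +nbytes) fits inside the file.

    Hypothetical helper: every input is treated as an untrusted uint64 read
    from the file, so any value outside that range is rejected outright.
    """
    for value in (data_start, offset, nbytes, file_size):
        if not 0 <= value <= UINT64_MAX:
            return False
    # Python integers do not wrap, so the true sum is available here.
    end = data_start + offset + nbytes
    if end > UINT64_MAX:
        # In fixed-width uint64 arithmetic this sum would wrap to a small
        # number and could pass a naive "end <= file_size" comparison.
        return False
    return end <= file_size
```

With these definitions, an offset of `2**64 - 64` plus a 128-byte tensor would wrap to a small in-range value under 64-bit arithmetic, but the sketch rejects it because the unwrapped end exceeds `UINT64_MAX`.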
|
New high-volume exact-main reproduction: |
|
Exact-head no-change QA for the new Ideogram GGUF type-30 reproduction is complete.
PR head:
Pinned reproduction scanned:
Pinned scan result:
Per-file tensor evidence:
Resource/timing evidence:
Regression/control validation:
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -q \
tests/scanners/test_gguf_scanner.py::test_gguf_bf16_tensor_type_30_is_supported \
tests/scanners/test_gguf_scanner.py::test_gguf_unknown_tensor_type_is_inconclusive \
tests/scanners/test_gguf_scanner.py::test_gguf_bf16_truncated_tensor_data_reports_bounds_check \
tests/scanners/test_gguf_scanner.py::test_gguf_truncated_tensor_dimensions_are_parse_inconclusive \
tests/scanners/test_gguf_scanner.py::test_gguf_scanner_tensor_bounds_detects_uint64_wrap \
tests/scanners/test_gguf_scanner.py::test_gguf_scanner_tensor_size_validation \
tests/scanners/test_gguf_scanner.py::test_gguf_scanner_invalid_tensor_dimensions \
tests/scanners/test_gguf_scanner.py::test_gguf_scanner_large_tensor_count \
tests/scanners/test_gguf_scanner.py::test_gguf_tensor_information_byte_limit_is_inconclusive \
tests/scanners/test_gguf_scanner.py::test_gguf_scanner_metadata_types \
tests/scanners/test_gguf_scanner.py::test_gguf_truncated_metadata_returns_exit2
Result:
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -q tests/scanners/test_gguf_scanner.py tests/test_gguf_sbom_integration.py
Result:
Additional checks:
Conclusion: the exact 508 type-30 tensors ( |
|
New exact-main GGUF QA inputs: |
|
Additional pinned GGUF QA: |
|
Codex Review: Didn't find any major issues. Chef's kiss. |
|
Exact-head QA follow-up for Pinned GGUF evidence was downloaded under
Main vs head scan evidence:
Interpretation: the failures on main are the BF16 type-30 false-positive flood, not a real structural/layout defect. On the PR head, the same pinned files retain tensor coverage and exit cleanly; the remaining S902 records are metadata value info checks, not structural inconclusive reasons.
Local validation:
CI/review poll at exact head: all required checks are passing or intentionally skipped; |
|
@codex add this pinned GGUF BF16 case: |
|
Exact pinned GGUF BF16 QA completed on current PR head
Pinned files downloaded under
Exact bounded scan command shape:
PROMPTFOO_DISABLE_TELEMETRY=1 uv run modelaudit scan --no-cache --scanners gguf --format json --output <out.json> --timeout 7200 --max-size 5GB <two downloaded .gguf files>
Before/after results:
Fail-closed controls rechecked on this head:
Conclusion: the exact pinned BF16/type-30 files no longer produce unknown-type floods or inconclusive GGUF structure outcomes on the PR head. Unknown/future tensor types, malformed/truncated metadata and tensor tables, tensor dimension parsing, tensor bounds/uint64 overflow, and tensor resource limits remain covered and fail closed.
@codex review |
|
Codex Review: Didn't find any major issues. Keep it up! |
Summary
Testing
|
|
Fresh pinned QA on exact
Please add this pinned model as a second real-world QA case at the current PR head and prove that type 30 reaches normal tensor-size/bounds validation with zero unknown-type findings. Keep an unknown numeric type control fail-closed. |
|
Pinned real-model QA from Hugging Face rank 250 on exact main
This is 636 compatibility false positives across the pinned repository, not malicious evidence. Please include this exact model/revision and all three types in end-to-end QA before merge. Full audit: |
|
Additional exact-main GGUF QA from Hugging Face rank 253:
Please add this exact pinned model/file to the current-head QA alongside rank 250's type-21/23/30 matrix. Full audit: |
|
Pinned rank-250 GGUF follow-up completed on additive commit
Upstream format check:
Pinned header QA:
Representative tensor evidence from the pinned tensor tables:
Exact pinned end-to-end scans:
Regression/validation:
Unknown future tensor types, malformed/truncated headers, impossible dimensions, overflowing tensor tables, unsupported encodings, and payload-offset violations remain covered by the GGUF fail-closed controls.
@codex review |
|
Codex Review: Didn't find any major issues. Breezy! |
|
Pinned rank-262 GGUF compatibility QA for
Please run the exact PR head against this pinned revision and verify both the BF16 type covered by this PR and any still-current type 35 mapping. Unknown future types should remain explicit/inconclusive, but standardized current types should not generate hundreds of false findings. Full local evidence: |
Summary
Recognize GGUF/GGML tensor type 30 as BF16 by adding it to the scanner's GGML tensor-size table as (block_size=1, type_size=2).
Root cause
GgufScanner parses tensor metadata correctly, but _GGML_TYPE_INFO did not include GGML type 30. During the second tensor validation pass, _validate_tensor_info() treated every BF16 tensor as an unknown GGML type, marked the scan inconclusive with gguf_structure_validation_failed, and returned before normal tensor size/bounds validation could run.
Security tradeoff
This only recognizes the known BF16 type. Unknown tensor types still fail closed as bounded inconclusive results, malformed tensor metadata still returns an explicit parse-incomplete inconclusive result, and malformed BF16 tensor data still reaches the existing tensor-data bounds checks.
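The behavior described above can be illustrated with a small standalone Python sketch. This is not the scanner's implementation: `GGML_TYPE_INFO`, `tensor_nbytes`, and `validate_tensor` are hypothetical names, the table is a short excerpt, and the status strings are placeholders. The only facts carried over are the `(block_size, type_size)` layout and the BF16 entry `30: (1, 2)`.

```python
from math import prod

# Hypothetical excerpt of a GGML type table:
# type id -> (block_size in elements, type_size in bytes per block).
GGML_TYPE_INFO = {
    0: (1, 4),    # F32
    1: (1, 2),    # F16
    2: (32, 18),  # Q4_0
    8: (32, 34),  # Q8_0
    30: (1, 2),   # BF16, the entry this PR adds
}


def tensor_nbytes(dims, type_id):
    """Return the tensor's byte size, or None if the type is unknown
    or the shape cannot be sized (empty, non-positive, or not a
    multiple of the block size along the first dimension)."""
    info = GGML_TYPE_INFO.get(type_id)
    if info is None:
        return None
    block_size, type_size = info
    if not dims or any(d <= 0 for d in dims):
        return None
    if dims[0] % block_size != 0:
        return None
    return prod(dims) // block_size * type_size


def validate_tensor(dims, type_id, offset, data_size):
    """Fail closed: anything that cannot be sized is inconclusive,
    and a sized tensor must still fit inside the data section."""
    nbytes = tensor_nbytes(dims, type_id)
    if nbytes is None:
        return "inconclusive"
    if offset < 0 or offset + nbytes > data_size:
        return "out_of_bounds"
    return "ok"
```

In this sketch, removing the `30` entry makes every BF16 tensor return `"inconclusive"` before the bounds check runs, which mirrors the root cause. With the entry present, a truncated BF16 tensor reaches the bounds check and is reported there, while an unrecognized type id still returns `"inconclusive"`.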
Real-model QA
Pinned repo:
Outcome:
Bounded sparse prefix fetches, with range checks before body reads:
Outcome:
Pre-fix exact-main proof from detached worktree at ab6e58ebe587197b2b46866c69f7d41d75e08a55:
Outcome:
Post-fix pinned main BF16 sparse prefix:
Outcome:
Post-fix aggregate path on pinned mmproj-BF16.gguf sparse prefix:
Outcome:
Validation
Notes
Large model artifacts were not committed. The real-model files used for QA were sparse temporary files under /tmp/modelaudit-gguf-bf16-type30, created from bounded HTTP range reads and logical truncation only.
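As a rough illustration of one way such a sparse fixture could be built, the Python sketch below writes an already-fetched prefix and then sets the file's logical size without writing the remaining bytes. The helper name `write_sparse_prefix` is hypothetical, the HTTP range read itself is out of scope, and reading "logical truncation" as extending the file to the original size is an assumption. Whether the unwritten region is stored sparsely depends on the platform and filesystem.

```python
import os
import tempfile


def write_sparse_prefix(path, prefix, logical_size):
    """Write `prefix` at offset 0, then set the file length to `logical_size`.

    Hypothetical helper: on most systems the unwritten tail reads back as
    zeros and occupies no disk blocks.
    """
    if len(prefix) > logical_size:
        raise ValueError("prefix is longer than the requested logical size")
    with open(path, "wb") as f:
        f.write(prefix)
        f.truncate(logical_size)


# Usage: a 4-byte GGUF magic prefix inside a 1 MiB logical file.
with tempfile.TemporaryDirectory() as tmp:
    fixture = os.path.join(tmp, "prefix-only.gguf")
    write_sparse_prefix(fixture, b"GGUF", 1024 * 1024)
    fixture_size = os.path.getsize(fixture)
    with open(fixture, "rb") as f:
        fixture_magic = f.read(4)
```

A fixture like this lets the header and tensor table be parsed from real bytes while the tensor data region exists only as file length, so offset and bounds validation still sees the true sizes.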