Improve Error Handling with Serialized File Detection #58

Open
SkowronskiAndrew wants to merge 1 commit into improve-error-handling from serialized-file-detect

SkowronskiAndrew (Collaborator) commented Feb 18, 2026

Summary

Enhances SerializedFile command with improved file type detection and better error reporting for analyze operations.

Key Changes

  • File Type Detection: Added robust detection for SerializedFiles, YAML formats, and archived assets with the new SerializedFileDetector, ArchiveDetector, and YamlSerializedFileDetector utilities
  • SerializedFile Command: Expanded command capabilities with better format handling and validation
  • Testing: Added comprehensive test coverage for file detection and command functionality

Testing

  • New FileDetectionTests and SerializedFileCommandTests with legacy and YAML format test data
  • Improved error reporting validation tests

Documentation

The new "header" subcommand is documented, along with a high-level explanation of the serialized file format.

Example:
$ UnityDataTool.exe sf header .\AssetBundle.buildreport
Version              22
Format               Modern (64-bit)
File Size            12,280 bytes
Metadata Size        6,085 bytes
Data Offset          6,144
Endianness           Little Endian

Example:

$ UnityDataTool.exe sf objectlist .\Scene1.unity
Error: The file is a YAML-format SerializedFile, which is not supported.
File: C:<fullpath>\Assets\Scenes\Scene1.unity

UnityDataTool only supports binary-format SerializedFiles.

Improve error reporting from serialized-file command
Use detection helpers to print more informative messages when command is run against the wrong file type.

Add a "header" subcommand to the serialized-file command that prints information from the SerializedFile header.

Detect YAML (text) format SerializedFiles and print a specific warning.

Example:
$ UnityDataTool.exe sf header .\AssetBundle.buildreport
Version              22
Format               Modern (64-bit)
File Size            12,280 bytes
Metadata Size        6,085 bytes
Data Offset          6,144
Endianness           Little Endian

Example:

$ UnityDataTool.exe sf objectlist .\Scene1.unity
Error: The file is a YAML-format SerializedFile, which is not supported.
File: C:\UnitySrc\unity\Modules\ContentBuild\Tests\ContentBuildTests\Assets\Scenes\Scene1.unity

UnityDataTool only supports binary-format SerializedFiles.

Copilot AI (Contributor) left a comment

Pull request overview

This pull request enhances the SerializedFile command with robust file type detection and improved error handling. It introduces three new detector utilities to differentiate between Unity Archives, YAML-format SerializedFiles, and binary SerializedFiles, replacing the previous archive detection code with a more maintainable, reusable implementation. The PR adds a new header subcommand that displays SerializedFile header information without requiring full file parsing.

Changes:

  • Added three new file format detector utilities (ArchiveDetector, YamlSerializedFileDetector, SerializedFileDetector) with comprehensive header parsing
  • Introduced a new header subcommand for quick SerializedFile inspection
  • Enhanced error messages to guide users when they attempt to analyze unsupported file formats (archives, YAML files)
  • Added comprehensive test coverage with 13 new unit tests and 7 integration tests
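
As a rough illustration of the signature-based detection described above, here is a minimal sketch written as a top-level C# program. It is not the PR's ArchiveDetector or YamlSerializedFileDetector code. It assumes that Unity archives begin with the ASCII signature `UnityFS` and that YAML-format SerializedFiles begin with a `%YAML` directive, optionally preceded by a UTF-8 byte order mark. `HasAsciiPrefix` and `ClassifyHead` are hypothetical names introduced for this example.

```csharp
using System;
using System.Text;

// Hypothetical helper: true if `data` contains the ASCII text `ascii` at `offset`.
static bool HasAsciiPrefix(byte[] data, int offset, string ascii)
{
    if (offset < 0 || data.Length - offset < ascii.Length) return false;
    for (int i = 0; i < ascii.Length; i++)
    {
        if (data[offset + i] != (byte)ascii[i]) return false;
    }
    return true;
}

// Hypothetical classifier over the first bytes of a file.
static string ClassifyHead(byte[] head)
{
    // Assumption: Unity archives start with the ASCII signature "UnityFS".
    if (HasAsciiPrefix(head, 0, "UnityFS")) return "archive";

    // Skip a UTF-8 byte order mark (EF BB BF) if one is present.
    bool hasBom = head.Length >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF;
    int start = hasBom ? 3 : 0;

    // Assumption: YAML-format SerializedFiles start with a "%YAML" directive.
    if (HasAsciiPrefix(head, start, "%YAML")) return "yaml";

    // Anything else would still need a binary header check before being trusted.
    return "unknown";
}

Console.WriteLine(ClassifyHead(Encoding.ASCII.GetBytes("UnityFS\0")));
Console.WriteLine(ClassifyHead(Encoding.ASCII.GetBytes("%YAML 1.1")));
```

Checking the cheap text signatures first means a command can report "archive" or "YAML" specifically instead of failing later with a generic parse error.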

Reviewed changes

Copilot reviewed 10 out of 12 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| Analyzer/Util/ArchiveDetector.cs | New utility to detect Unity Archive files by signature (replaces inline code in SerializedFileParser) |
| Analyzer/Util/YamlSerializedFileDetector.cs | New utility to detect YAML-format SerializedFiles with BOM handling |
| Analyzer/Util/SerializedFileDetector.cs | New utility to detect and parse binary SerializedFile headers, supporting legacy and modern formats |
| UnityDataTool/SerializedFileCommands.cs | Added ValidateSerializedFile method with improved error messages, new HandleHeader command, updated error handling in existing commands |
| UnityDataTool/Program.cs | Registered new header subcommand with command-line parser |
| Analyzer/SQLite/Parsers/SerializedFileParser.cs | Refactored to use new detector utilities instead of inline archive detection |
| Analyzer.Tests/FileDetectionTests.cs | Added 13 comprehensive unit tests for all three detector utilities |
| UnityDataTool.Tests/SerializedFileCommandTests.cs | Added 7 integration tests for header command and improved error handling validation |
| TestCommon/Data/YamlFormat.asset | Test data file for YAML SerializedFile detection |
| Documentation/command-serialized-file.md | Added documentation for header subcommand with examples and field descriptions |

// STEP 2: Read endianness byte
// ============================================================
//
// The m_Endianess byte indicates the endianness of the DATA section

Copilot AI Feb 20, 2026

There's a typo in the comment: "m_Endianess" should be "m_Endianness" (note the double 'n'). This matches the standard spelling of "endianness" used elsewhere in the codebase, including the SerializedFileInfo class property name.

Suggested change
- // The m_Endianess byte indicates the endianness of the DATA section
+ // The m_Endianness byte indicates the endianness of the DATA section

// Offset 4-7: UInt32 m_FileSize
// Offset 8-11: UInt32 m_Version
// Offset 12-15: UInt32 m_DataOffset
// Offset 16: UInt8 m_Endianess (only present for version >= 9)

Copilot AI Feb 20, 2026

There's a typo in the comment: "m_Endianess" should be "m_Endianness" (note the double 'n'). This matches the standard spelling of "endianness" used elsewhere in the codebase.

Suggested change
- // Offset 16: UInt8 m_Endianess (only present for version >= 9)
+ // Offset 16: UInt8 m_Endianness (only present for version >= 9)

// Offset 16-23: UInt64 m_MetadataSize
// Offset 24-31: UInt64 m_FileSize
// Offset 32-39: UInt64 m_DataOffset
// Offset 40: UInt8 m_Endianess

Copilot AI Feb 20, 2026

There's a typo in the comment: "m_Endianess" should be "m_Endianness" (note the double 'n'). This matches the standard spelling of "endianness" used elsewhere in the codebase.

Comment on lines +24 to +35
/// Version &lt; 9:
/// - 20-byte header (SerializedFileHeader32) with 32-bit offsets/sizes
/// - Layout: [header][data][metadata]
/// - Endianness byte stored at END of file, just before metadata
///
/// Version 9-21:
/// - 20-byte header (SerializedFileHeader32) with 32-bit offsets/sizes
/// - Layout: [header][metadata][data]
/// - Endianness byte at offset 16 in header
/// - Limited to 4GB file sizes
///
/// Version &gt;= 22 (kLargeFilesSupport):

Copilot AI Feb 20, 2026

HTML entities in XML documentation comments should use the literal characters instead. The &lt; and &gt; entities are unnecessarily escaped here. In C# XML documentation comments, you can use literal < and > characters in text content without causing issues, as the XML parser understands the context. Using HTML entities makes the documentation harder to read in the source code.
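
To make the header layouts quoted in the review threads above more concrete, here is a minimal sketch of a reader for them, written as a top-level C# program. It is not the PR's SerializedFileDetector. Several points are assumptions that this page does not confirm: that header fields are stored big-endian, that m_Version sits at offset 8 in both layouts, that offset 0-3 of the 32-bit layout holds m_MetadataSize, and that a nonzero endianness byte means big-endian data. `TryReadHeader` is a hypothetical name, and files with version < 9 are simply rejected.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical reader for the two header layouts quoted in the review threads.
// Assumptions: big-endian header fields, m_Version at offset 8 in both layouts,
// m_MetadataSize at offset 0-3 in the 32-bit layout, nonzero byte = big-endian data.
static bool TryReadHeader(byte[] bytes, out uint version, out ulong metadataSize,
                          out ulong fileSize, out ulong dataOffset, out bool bigEndianData)
{
    version = 0; metadataSize = 0; fileSize = 0; dataOffset = 0; bigEndianData = false;
    if (bytes.Length < 20) return false; // shorter than the 20-byte header

    version = BinaryPrimitives.ReadUInt32BigEndian(bytes.AsSpan(8, 4));
    if (version < 9) return false; // endianness byte is not in the header; not handled here

    if (version < 22)
    {
        // 32-bit layout (versions 9-21).
        metadataSize = BinaryPrimitives.ReadUInt32BigEndian(bytes.AsSpan(0, 4));
        fileSize = BinaryPrimitives.ReadUInt32BigEndian(bytes.AsSpan(4, 4));
        dataOffset = BinaryPrimitives.ReadUInt32BigEndian(bytes.AsSpan(12, 4));
        bigEndianData = bytes[16] != 0;
        return true;
    }

    // 64-bit layout (version >= 22): the endianness byte is at offset 40.
    if (bytes.Length < 41) return false;
    metadataSize = BinaryPrimitives.ReadUInt64BigEndian(bytes.AsSpan(16, 8));
    fileSize = BinaryPrimitives.ReadUInt64BigEndian(bytes.AsSpan(24, 8));
    dataOffset = BinaryPrimitives.ReadUInt64BigEndian(bytes.AsSpan(32, 8));
    bigEndianData = bytes[40] != 0;
    return true;
}

// Usage: a synthetic version-22 header filled with the values from the PR's example output.
var demo = new byte[48];
BinaryPrimitives.WriteUInt32BigEndian(demo.AsSpan(8, 4), 22);
BinaryPrimitives.WriteUInt64BigEndian(demo.AsSpan(16, 8), 6085);
BinaryPrimitives.WriteUInt64BigEndian(demo.AsSpan(24, 8), 12280);
BinaryPrimitives.WriteUInt64BigEndian(demo.AsSpan(32, 8), 6144);
if (TryReadHeader(demo, out var v, out var meta, out var size, out var offset, out var big))
{
    Console.WriteLine($"Version {v}, File Size {size}, Metadata Size {meta}, Data Offset {offset}, Big-endian data: {big}");
}
```

Reading only the fixed-size header is what lets a subcommand like `header` report these fields without parsing the rest of the file.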
