
Flag autogenerated files in build process #2078

Open
uttam282005 wants to merge 3 commits into aboutcode-org:main from uttam282005:flag-autogen-files


@uttam282005
Contributor

Issues

Changes

This pull request introduces logic to automatically detect and flag autogenerated files in the codebase, improving the accuracy of resource classification. It adds a mechanism to scan file headers for common autogenerated markers, reclassifies such files from "requires review" to "ignored not interesting," and includes robust utilities for reading file headers safely. Comprehensive tests are added to ensure the new functionality works as intended.

Autogenerated file detection and classification:

  • Added is_probably_autogenerated_resource in scanpipe/pipes/d2d.py to identify files as autogenerated based on header markers, and updated match_unmapped_resources to reclassify such files from REQUIRES_REVIEW to IGNORED_NOT_INTERESTING.
  • Defined AUTOGENERATED_FILE_MARKERS and implemented read_file_head_text utility in scanpipe/pipes/flag.py to reliably read and normalize the file header for marker matching.
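
For readers who want a concrete picture of the detection and reclassification described above, here is a minimal sketch. It is not the PR's code: the marker list, the function names (read_head_text, looks_autogenerated, reclassify), the 2048-byte head size, the status strings, and the choice to treat null bytes as "binary, return empty text" are all assumptions made for illustration.

```python
# Hypothetical markers; the PR's AUTOGENERATED_FILE_MARKERS may differ.
EXAMPLE_MARKERS = (
    "autogenerated",
    "auto-generated",
    "automatically generated",
    "generated by",
    "do not edit",
)

# Placeholder status values standing in for the real flag constants.
REQUIRES_REVIEW = "requires-review"
IGNORED_NOT_INTERESTING = "ignored-not-interesting"


def read_head_text(path, max_bytes=2048):
    """Return the lowercased start of a file, or "" if unreadable or binary."""
    try:
        with open(path, "rb") as f:
            head = f.read(max_bytes)
    except OSError:
        # Missing or unreadable files yield no text to match against.
        return ""
    if b"\x00" in head:
        # Assumption: a null byte means binary content, so skip matching.
        return ""
    # Undecodable bytes become replacement characters instead of raising.
    return head.decode("utf-8", errors="replace").lower()


def looks_autogenerated(path, markers=EXAMPLE_MARKERS):
    """Return True if any marker appears in the head of the file."""
    head = read_head_text(path)
    return any(marker in head for marker in markers)


def reclassify(status, path):
    """Downgrade only resources that currently require review."""
    if status == REQUIRES_REVIEW and looks_autogenerated(path):
        return IGNORED_NOT_INTERESTING
    return status
```

Reading only the head in binary mode keeps the check cheap on large files and avoids decode errors. Substring matching on lowercased text is simple but can produce false positives, which is presumably why the PR names its helper "probably" autogenerated.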

Testing:

  • Added unit tests for is_probably_autogenerated_resource covering positive and negative cases, and for the reclassification logic in match_unmapped_resources.
  • Added tests for read_file_head_text to ensure correct behavior with normal, non-UTF-8, missing, and null-byte-containing files.
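
The four header-reading cases listed above could be exercised roughly as follows. This is a sketch only: read_head_text is a hypothetical stand-in for the PR's read_file_head_text, and the expected results (empty string for missing files and for content with null bytes, lossy decoding for non-UTF-8 bytes) are assumptions about the intended behavior, not taken from the PR's test suite.

```python
import os
import tempfile
import unittest


def read_head_text(path, max_bytes=2048):
    """Hypothetical stand-in for the PR's read_file_head_text."""
    try:
        with open(path, "rb") as f:
            head = f.read(max_bytes)
    except OSError:
        return ""
    if b"\x00" in head:
        # Assumption: null bytes mark the file as binary.
        return ""
    return head.decode("utf-8", errors="replace").lower()


class ReadHeadTextTest(unittest.TestCase):
    def setUp(self):
        self.tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self.tmp.cleanup)

    def write(self, name, data):
        """Write raw bytes to a file in the temp directory, return its path."""
        path = os.path.join(self.tmp.name, name)
        with open(path, "wb") as f:
            f.write(data)
        return path

    def test_normal_file(self):
        path = self.write("a.py", b"# Generated by tool\n")
        self.assertIn("generated by", read_head_text(path))

    def test_non_utf8_file(self):
        # Invalid UTF-8 bytes must not raise; the ASCII part stays matchable.
        path = self.write("b.txt", b"\xff\xfe DO NOT EDIT")
        self.assertIn("do not edit", read_head_text(path))

    def test_missing_file(self):
        missing = os.path.join(self.tmp.name, "nope")
        self.assertEqual("", read_head_text(missing))

    def test_null_bytes(self):
        path = self.write("c.bin", b"generated\x00by")
        self.assertEqual("", read_head_text(path))
```

Saved as a module, this can be run with `python -m unittest`.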

Checklist

  • I have read the contributing guidelines
  • I have linked an existing issue above
  • I have added unit tests covering the new code
  • I have reviewed and understood every line of this PR

Signed-off-by: uttam282005 <[email protected]>
Signed-off-by: uttam282005 <[email protected]>
@uttam282005
Contributor Author

Hi @chinyeungli, this PR is ready for review. I’d appreciate your feedback when you have time. Thanks!

@chinyeungli chinyeungli requested a review from tdruez March 5, 2026 22:54

Development

Successfully merging this pull request may close these issues.

Flagging Autogenerated Files in "map_deploy_to_develop" Pipeline - Pypi

1 participant