Fix variable shadowing crash in get_by_tags and code quality improvements#837
Fix variable shadowing crash in get_by_tags and code quality improvements#837BEAST04289 wants to merge 5 commits intoOWASP:mainfrom
Conversation
…rror The loop variable 'node' was overwritten by the return value of self.get_nodes(). When get_nodes() returned [] (empty list), the error handler tried to access node.name on the empty list, causing AttributeError. Renamed to 'db_node' and 'resolved' to preserve the original DB node reference for error logging. Fixes OWASP#836
There was a problem hiding this comment.
Pull request overview
Fixes a crash in Node_collection.get_by_tags() caused by variable shadowing, and applies a few small code-quality cleanups discovered during review (typos, dead code, redundant import).
Changes:
- Prevent variable shadowing in
get_by_tags()so error logging can’t crash whenget_nodes()returns an empty list. - Remove unreachable code in
add_node()and clean up redundant imports / misspelled local variable names. - Rename typos in variables used by gap analysis and standard registration flow.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| application/database/db.py | Fixes get_by_tags() variable shadowing crash; removes unreachable return after raise. |
| application/web/web_main.py | Removes redundant oscal_utils, redis import line (keeps the remaining import). |
| application/utils/gap_analysis.py | Fixes misspelled local variable pentalty → penalty. |
| application/cmd/cre_main.py | Fixes pending_stadards typo and updates references to the corrected variable name. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| logger.error( | ||
| f"Could not find gap analysis job for for {importing_name} and {standard_name} putting {standard_name} back in the queue" | ||
| ) |
There was a problem hiding this comment.
Typo in log message: duplicated word for ("job for for …").
| logger.error( | ||
| f"Could not find gap analysis job for for {importing_name} and {standard_name} putting {standard_name} back in the queue" | ||
| ) |
There was a problem hiding this comment.
Typo in log message: duplicated word for ("job for for …").
| @@ -295,7 +295,7 @@ def register_standard( | |||
| logger.error( | |||
| f"Could not find gap analysis job for for {importing_name} and {standard_name} putting {standard_name} back in the queue" | |||
| ) | |||
| pending_stadards.append(standard_name) | |||
| pending_standards.append(standard_name) | |||
|
|
|||
There was a problem hiding this comment.
This code appends back into pending_standards while iterating over it. Mutating the list during iteration can lead to unexpected behavior (including repeatedly re-processing the same standard if the exception keeps happening). Consider using a separate retry list/queue (e.g., to_retry.append(...) then extend after the loop), or iterating over a snapshot copy.
| @@ -310,7 +310,7 @@ def register_standard( | |||
| logger.error( | |||
| f"Could not find gap analysis job for for {importing_name} and {standard_name} putting {standard_name} back in the queue" | |||
| ) | |||
| pending_stadards.append(standard_name) | |||
| pending_standards.append(standard_name) | |||
| redis.wait_for_jobs(jobs) | |||
There was a problem hiding this comment.
This code appends back into pending_standards while iterating over it. Mutating the list during iteration can lead to unexpected behavior (including repeatedly re-processing the same standard if the exception keeps happening). Consider using a separate retry list/queue (e.g., to_retry.append(...) then extend after the loop), or iterating over a snapshot copy.
application/database/db.py
Outdated
| else: | ||
| logger.fatal( | ||
| "db.get_node returned None for" | ||
| "db.get_node returned None for " |
There was a problem hiding this comment.
The fatal log message in this branch is misleading: get_nodes() never returns None (it returns an empty list when nothing matches), and the message references db.get_node rather than get_nodes. This makes the log inaccurate and can confuse debugging; consider updating the wording to reflect an empty result from get_nodes() (and the correct function name).
| "db.get_node returned None for " | |
| "get_nodes() returned no documents for " |
| nodes = Node.query.filter(*nodes_where_clause).all() or [] | ||
| for node in nodes: | ||
| node = self.get_nodes( | ||
| name=node.name, | ||
| section=node.section, | ||
| subsection=node.subsection, | ||
| version=node.version, | ||
| link=node.link, | ||
| ntype=node.ntype, | ||
| sectionID=node.section_id, | ||
| for db_node in nodes: | ||
| resolved = self.get_nodes( | ||
| name=db_node.name, | ||
| section=db_node.section, | ||
| subsection=db_node.subsection, | ||
| version=db_node.version, | ||
| link=db_node.link, | ||
| ntype=db_node.ntype, | ||
| sectionID=db_node.section_id, | ||
| ) | ||
| if node: | ||
| documents.extend(node) | ||
| if resolved: | ||
| documents.extend(resolved) | ||
| else: | ||
| logger.fatal( | ||
| "db.get_node returned None for" | ||
| "db.get_node returned None for " | ||
| "Node %s:%s:%s that exists, BUG!" | ||
| % (node.name, node.section, node.section_id) | ||
| % (db_node.name, db_node.section, db_node.section_id) | ||
| ) |
There was a problem hiding this comment.
There is test coverage for get_by_tags() (see application/tests/db_test.py::test_get_by_tags), but it doesn’t exercise the failure mode fixed here (where get_nodes() returns an empty list for a DB row that matched the initial tag query). Please add a regression test that simulates get_nodes() returning [] for a matching Node row to ensure get_by_tags() doesn’t crash when logging the error path.
276bb39 to
7ae6362
Compare
Summary
Fixes a crash bug and several code quality issues found during codebase review for GSoC 2026 Module B proposal.
Closes #836
Changes
Bug Fix: Variable shadowing crash in get_by_tags() (application/database/db.py)
The loop variable node was overwritten by the return value of self.get_nodes(). When get_nodes() returned [] (empty list), the error handler tried to access node.name on the empty list, causing AttributeError. Renamed to db_node and resolved to preserve the original DB node reference for error logging.
Code Quality Fixes
Testing