fix(swebench-multimodal): create output.report.json for consistency#331
fix(swebench-multimodal): create output.report.json for consistency#331juanmichelini merged 4 commits intomainfrom
Conversation
…ith other benchmarks The swebench-multimodal evaluation was only creating report.json, while other benchmarks (swebench, commit0, etc.) create output.report.json. This caused the push-to-index workflow to fail when trying to find output.report.json. This change copies report.json to output.report.json after evaluation completes, matching the behavior of other benchmarks. Fixes the issue where GitHub Actions run 21077636459 failed to create a PR in openhands-index-results because the script couldn't find output.report.json.
|
Looks like there are a few issues preventing this PR from being merged!
If you'd like me to help, just leave a comment, like Feel free to include any additional details that might help me get this PR into a better state. You can manage your notification settings |
|
@OpenHands please fix the failing actions on PR #331 at branch |
|
I'm on it! juanmichelini can track my progress at all-hands.dev |
The shutil module was used but not imported, causing pre-commit checks to fail. Co-authored-by: openhands <openhands@all-hands.dev>
SummaryI fixed the failing pre-commit check on PR #331 by adding the missing IssueThe pre-commit check was failing because the code used FixAdded Changes Made
Verification
The PR should now pass the pre-commit checks when CI runs again. |
| if report_path.exists(): | ||
| shutil.copy(str(report_path), str(dest_report_path)) | ||
| logger.info(f"Copied report file to: {dest_report_path}") | ||
| else: | ||
| logger.warning( | ||
| f"Report file not found at expected location: {report_path}" | ||
| ) |
There was a problem hiding this comment.
@OpenHands no defensive programming here: remove the if else
There was a problem hiding this comment.
I'm on it! simonrosenberg can track my progress at all-hands.dev
There was a problem hiding this comment.
Done! I've addressed the PR comment by removing the defensive if-else check for the report.json copy operation.
Changes made:
- Removed the
if report_path.exists():check and theelsebranch with the warning - The code now directly calls
shutil.copy()without checking if the file exists first
Final diff shows:
- Added
shutilimport - Added 6 lines that copy
report.jsontooutput.report.jsonwithout defensive checks
The changes have been committed and pushed to the fix-swebench-multimodal-report-filename branch, updating PR #331.
Co-authored-by: openhands <openhands@all-hands.dev>
|
@juanmichelini removed the defensive if:else so now the code fails if the report.json is not where it was supposed to be. I think that's OK? |
Problem
The swebench-multimodal evaluation was only creating
report.json, while other benchmarks (swebench, commit0, gaia, etc.) createoutput.report.json. This inconsistency caused the push-to-index workflow to fail when trying to findoutput.report.json.Evidence: GitHub Actions run 21077636459 failed with:
Solution
After the multimodal evaluation completes, copy
report.jsontooutput.report.jsonto match the behavior of other benchmarks.This follows the same pattern used in:
benchmarks/swebench/eval_infer.py(lines 258-267)Testing
The fix ensures that the archive structure matches what the push-to-index script expects:
output.report.jsonalongsideoutput.jsonlreport.jsonstill exists)