Feature/issue 1274 notebook format check#1299
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a new script, check/nbformat, to validate and fix Jupyter notebook formatting using tensorflow_docs.tools.nbformat, and updates the project's dependency requirements and lockfiles. The review highlights two critical issues: the inefficiency of invoking the format checker in a loop rather than passing all files at once, and the breaking of environment compatibility for older Python versions caused by regenerating lockfiles with Python 3.13 instead of the project's minimum supported version.
|
Hey @mhucka any feedback on this also? No stress, just checking in. Thanks. |
|
@rosspeili Thank you for the contribution, and sorry for the long delay in getting back to this. The merge conflicts are because files in the repo changed in the meantime, which is my fault for not reviewing this sooner, so I will take the time to fix it in a moment and push an updated version to your branch. (I hope you don't mind – it will save everyone time.) A bigger problem is that this does not work. How did you test this? This is what happens when I run check/nbformat: |
|
Hey @mhucka no stress at all. Appreciate the review and comments as always. As disclosed also in previews PRs, the PR description indeed is formatted with AI, based on a dump, but will make it more concise or just drop the original blurp. Code is hand written and passed through AI for review before pr. I used existing tests with py 3.13, but will check again and come back to you asap. Please do feel free to push to my branch if available. Or suggest and I'll approve. 🙏 |
Add check/nbformat, a new bash script that checks the format of all tracked Jupyter notebook (.ipynb) files using the TensorFlow Docs notebook format checker (tensorflow_docs.tools.nbformat). The script: - Discovers all .ipynb files tracked by git via git ls-files - Runs python -m tensorflow_docs.tools.nbformat on each notebook - Defaults to check-only mode (non-destructive) - Accepts --fix to reformat notebooks in place - Accepts --help to print usage information - Exits non-zero if any notebook fails the format check Wire check/nbformat into: - check/all: called unconditionally alongside check/mypy and check/shellcheck, since notebook format is structural and not related to which Python files changed - check/shellcheck: added to the required_shell_scripts list (in sorted order) so shellcheck verifies the new script is present Add tensorflow-docs to dev_tools/requirements/deps/pytest.txt alongside the existing nbformat dep, since both are needed for notebook-related checks and belong in the same env grouping. Add a notebook-checks CI job to .github/workflows/ci.yaml that triggers when .ipynb files change and runs check/nbformat using the pytest.env.txt environment (which includes tensorflow-docs). Also add the new job to report-results so it is a required check.
Co-authored-by: Michael Hucka <mhucka@google.com>
…1274) The checker called a non-existent tensorflow_docs.tools.nbformat module. Align with Cirq by using nbfmt, regenerate lockfiles on current main, and format all tracked notebooks so CI can pass.
39ff7df to
b52511c
Compare
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
|
@mhucka thanks again for the patience and tips. Rebased onto current main and fixed check/nbformat, I had the wrong module ( Let me know if I am missing anything. <3 |
mhucka
left a comment
There was a problem hiding this comment.
Thanks for the quick work. This is better! There's just a small niggle in check/nbformat; please check the comment there.
I didn't anticipate that the notebooks would actually not conform to the format and would produce a lot of diffs. It is better to split out the formatting differences to a separate PR because then the PR commit can be adde to .git-blame-ignore-revs so that git blame output is less noisy. So, could we please proceed as follows?
|
One more minor thing. In the PR description, it's better to omit this statement:
When this PR is merged, the changes will be squashed in the git history and it will not be easy for readers to figure out what this comment refers to. The description only needs to refer to the PR as it stands and not the history of the PR. (I think you may have intended the comment to be helpful information for reviewers of the updated PR. That's kind, and I understand the impulse, but nevertheless, it turns out it's actually counter-productive in the long run. I used to do the same thing, and had to unlearn this particular habit …) |
|
Understood and respected. ✅️ |
Pass . to nbfmt instead of a hard-coded path list (nbfmt is recursive and only touches .ipynb). Add notebook_files output for consistency with other paths-filter variables.
Notebook nbfmt diffs are in a separate PR to merge first for git-blame-ignore-revs. This PR adds only the check script, CI, and deps.
Closes #1274
Adds
check/nbformatto validate tracked.ipynbfiles withtensorflow_docs.tools.nbfmt(repo root, Cirq-style), wired intocheck/all, shellcheck, and CI. Addstensorflow-docsto the pytest and dev lockfiles.Notebook formatting is in #1322 and should merge before this PR.