Skip to content

Fix deepcopy for And, Or, and Not expressions#79

Merged
robreeves merged 5 commits intolinkedin:li-0.11from
robreeves:fix-deepcopy
Apr 29, 2026
Merged

Fix deepcopy for And, Or, and Not expressions#79
robreeves merged 5 commits intolinkedin:li-0.11from
robreeves:fix-deepcopy

Conversation

@robreeves
Copy link
Copy Markdown
Collaborator

@robreeves robreeves commented Apr 28, 2026

Rationale for this change

copy.deepcopy() on And, Or, and Not expressions raises TypeError because Pydantic v2's default __deepcopy__ calls cls.__new__(cls) with no args, but these classes require positional arguments in __new__.

The fix adds __deepcopy__ to And, Or, and Not that deepcopy their fields and call the constructor directly.

Apache change to minimize the divergence here apache/iceberg-python#3295

Are these changes tested?

Yes. 11 new unit tests covering all expression types, balanced trees, nested expressions, and deepcopy followed by pickle.

Are there any user-facing changes?

Yes,copy.deepcopy() on expressions now works instead of raising.

Pydantic v2's BaseModel.__deepcopy__ calls cls.__new__(cls) with no
args, but And, Or, and Not define __new__ with required positional
arguments. These tests reproduce the TypeError on deepcopy.
Add __deepcopy__ to BooleanExpression that reconstructs the expression
via model_fields, and returns self for Singleton subclasses. Fixes
TypeError from Pydantic v2's BaseModel.__deepcopy__ calling
cls.__new__(cls) with no args on And, Or, and Not which require
positional arguments in __new__.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses failures when copy.deepcopy() is applied to PyIceberg boolean expression trees under Pydantic v2, particularly for compound expressions (And/Or/Not) with __new__ signatures requiring arguments, and for Singleton-based expressions that must not be reconstructed.

Changes:

  • Add a BooleanExpression.__deepcopy__ implementation that returns self for Singleton subclasses and reconstructs other expressions by deep-copying model fields.
  • Add unit tests covering deepcopy across several expression types, nested/balanced trees, and deepcopy→pickle roundtrips.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
pyiceberg/expressions/__init__.py Introduces custom __deepcopy__ for all BooleanExpression subclasses to avoid Pydantic v2 __new__-without-args failures and handle Singleton immutability.
tests/expressions/test_expressions.py Adds targeted tests validating deepcopy behavior for multiple expression shapes and pickle compatibility.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pyiceberg/expressions/__init__.py Outdated
Comment thread tests/expressions/test_expressions.py
Instead of a generic __deepcopy__ on BooleanExpression with Pydantic
internals, add simple __deepcopy__ methods to the three classes that
need them. Also add identity assertions for non-Singleton copies.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pyiceberg/expressions/__init__.py
Comment thread pyiceberg/expressions/__init__.py
Comment thread pyiceberg/expressions/__init__.py
Comment thread pyiceberg/expressions/__init__.py
@robreeves robreeves marked this pull request as ready for review April 28, 2026 21:45
@robreeves robreeves changed the title Fix deepcopy for BooleanExpression subclasses Fix deepcopy for And, Or, and Not expressions Apr 28, 2026
Copy link
Copy Markdown
Collaborator

@ShreyeshArangath ShreyeshArangath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but are we also planning to add support for deep copy in Bound* and In expressions later?

@robreeves
Copy link
Copy Markdown
Collaborator Author

LGTM, but are we also planning to add support for deep copy in Bound* and In expressions later?

Those already work because they don't have custom __new__ with required args so Pydantic's default __deepcopy__ is sufficient

@robreeves robreeves merged commit 33e0e11 into linkedin:li-0.11 Apr 29, 2026
15 checks passed
robreeves added a commit to linkedin/openhouse that referenced this pull request Apr 29, 2026
## Summary

Update li-pyiceberg dependency version to 0.11.5 for bug fix
linkedin/iceberg-python#79

## Changes

- [ ] Client-facing API Changes
- [ ] Internal API Changes
- [x] Bug Fixes
- [ ] New Features
- [ ] Performance Improvements
- [ ] Code Style
- [ ] Refactoring
- [ ] Documentation
- [ ] Tests

## Testing Done
<!--- Check any relevant boxes with "x" -->

- [ ] Manually Tested on local docker setup. Please include commands
ran, and their output.
- [ ] Added new tests for the changes made.
- [ ] Updated existing tests to reflect the changes made.
- [x] No tests added or updated. Please explain why. If unsure, please
feel free to ask for help.
- [ ] Some other form of testing like staging or soak time in
production. Please explain.

Dependency bump. Ran existing tests.

# Additional Information

- [ ] Breaking Changes
- [ ] Deprecations
- [ ] Large PR broken into smaller PRs, and PR plan linked in the
description.

For all the boxes checked, include additional details of the changes
made in this pull request.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants