-
-
Notifications
You must be signed in to change notification settings - Fork 211
[ENH] Refactor Extension
#1590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[ENH] Refactor Extension
#1590
Conversation
for more information, see https://pre-commit.ci
… into refactor-extension
fkiraly
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a clean refactor (although it is missing tests)
Mid-term, I think the executor should get removed, but that would be more than a simple refactor. Reason being since the current concept couples "executor" to "serializer" strongly. The same serialized object could be executed in multiple contextx.
Either way, I would only refactor and not redesign in the current PR.
We should also add tests for proper functioning.
I agree. I have added the tests for the base classes and registry. I have a question: This refactor means nothing and the public API change as mentioned in the PR description won't work unless we also refactor the openml extensions such as
Note: We are planning to move the extensions to |
I think you do not give yourself enough credit.
In a pure refactor, the behaviour of the class does not change. So you would replace the current extension class with instances of your new design, just to keep current behaviour exactly as it is. Mid-term, I agree that
Yes, I think that is a good next step (in a separate PR) - we ought to ensure that we strictly test:
As per the consensus from discussion on 2026-01-05, we can merge back but these are two things we must ensure.
How would that look like? I am not sure if I understand the pattern here. Refactoring across GitHub repositories seems like a lot more work to me. I can see how the first steps of a refactor would work purely within
Yes, although the end state would definitely not be in the current design (which I think has a lot of problems like internal coupling, lack of flexibility, etc). |
Summary
This PR refactors the existing
Extensionabstraction into three focused base classes:ModelSerializerModelExecutorOpenMLAPIConnectorIn addition, it introduces a registry-based API resolver that automatically determines the appropriate connector for a given estimator instance or OpenML flow. The resolver inspects the input at runtime and returns the matching connector, removing the need for users to know about or manually about estimator/flow-specific extension.
This significantly simplifies the user-facing API, users can work directly with estimators or flows without needing to know which extension is responsible for handling them.
And, the existing
Extensionclass is kept intact, ensuring that all current user-facing APIs remain fully backward compatible.As part of this change, existing OpenML extensions need to be refactored to adopt the new abstractions and take advantage of the simplified API.
User-facing APIs
Estimator Instance to Flow
Before
After (co-exists with the Before API for now)
Flow to Estimator Instance
Before
After (co-exists with the Before API for now)