TEZ-4688 TEZ-4648: Tez upgrade to Hadoop 3.5.0 and jersey 2.x#474
TEZ-4688 TEZ-4648: Tez upgrade to Hadoop 3.5.0 and jersey 2.x#474maheshrajus wants to merge 9 commits into
Conversation
|
FYI, |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
|
@abstractdog @ayushtkn Can you please review and provide comments if any. Thanks ! |
|
@maheshrajus , can you please rebase this PR and resolve merge conflicts? |
|
@Aggarwal-Raghav Hi, i rebased and pushed the commit just now. Could you please take a look at this? Thanks ! |
| return ReflectionUtils.createClazzInstance(authenticatorClazzName); | ||
| } | ||
|
|
||
| private static class TokenAuthenticatedURLConnectionFactory implements HttpURLConnectionFactory { |
There was a problem hiding this comment.
Why TokenAuthenticatedURLConnectionFactory was removed won't it break things on Hadoop delegation token auth?
There was a problem hiding this comment.
I checked and added Hadoop delegation token auth related checks in latest commit. Please check and let me know if any other scenarios we can cover. thanks !
There was a problem hiding this comment.
I haven't checked this one in detail yet, please let us know @Aggarwal-Raghav if you're fine with @maheshrajus's analysis, so we can maybe move forward
There was a problem hiding this comment.
@maheshrajus , I'm not in favour of removing TokenAuthenticatedURLConnectionFactory and PseudoAuthenticatedURLConnectionFactory class .
They were implementing HttpURLConnectionFactory which is jersey 1.x and going forward they should implement HttpUrlConnectorProvider.ConnectionFactory jersey 2.x
To summarize, in this PR you are creating a client without plugging ConnectionFactory
Can you check this patch on top your PR?
Timeline.patch
There was a problem hiding this comment.
Fixed. @Aggarwal-Raghav Can you please check and confirm. Thanks !
| private Client getHttpClient() { | ||
| if (httpClient == null) { | ||
| ClientConfig config = new DefaultClientConfig(JSONRootElementProvider.App.class); | ||
| HttpURLConnectionFactory urlFactory = new PseudoAuthenticatedURLConnectionFactory(); |
There was a problem hiding this comment.
If it was removed because HttpURLConnectionFactory was in jersey 1.x, Can we use HttpUrlConnectorProvider ?
There was a problem hiding this comment.
@Aggarwal-Raghav PseudoAuthenticatedURLConnectionFactory removed in Jersey 2.x and for Pseudo auth strategy for env where delegation token is not supported (hadoop 2.4). So creating client as per below code will be sufficient in my opinion. Please let me know if anything we can check add over here.
return ClientBuilder.newClient();
There was a problem hiding this comment.
I haven't checked this one in detail yet, please let us know @Aggarwal-Raghav if you're fine with @maheshrajus's analysis, so we can maybe move forward
There was a problem hiding this comment.
Same comment as in TimelineReaderFactory the client created is a vanila client. I disagee with the statement
PseudoAuthenticatedURLConnectionFactory removed in Jersey 2.x and for Pseudo auth strategy for env where delegation token is not supported (hadoop 2.4)
I have 2 questions on this statement:
- "PseudoAuthenticatedURLConnectionFactory removed in Jersey 2.x" :
PseudoAuthenticatedURLConnectionFactoryis TEZ internal class, the implementation should have been updated to the Jersey2.xConnectionFactoryinterface, not deleted. - "Pseudo auth strategy for env where delegation token is not supported (hadoop 2.4)" , then why in hadoop-3.4.1 it was still present in TEZ code?
There was a problem hiding this comment.
Fixed. @Aggarwal-Raghav Can you please check and confirm. Thanks !
|
The explicit dependency of |
This comment was marked as outdated.
This comment was marked as outdated.
@Aggarwal-Raghav Better this we will handle separately. thanks ! |
|
💔 -1 overall
This message was automatically generated. |
|
@abstractdog @ayushtkn @Aggarwal-Raghav |
abstractdog
left a comment
There was a problem hiding this comment.
left a few comments @maheshrajus
can you also edit the description to summarize what happened in the PR? I can see that there are a lot of changes in only 4-5 types, so some bullet points would help to get the big picture here
| private Client getHttpClient() { | ||
| if (httpClient == null) { | ||
| ClientConfig config = new DefaultClientConfig(JSONRootElementProvider.App.class); | ||
| HttpURLConnectionFactory urlFactory = new PseudoAuthenticatedURLConnectionFactory(); |
There was a problem hiding this comment.
I haven't checked this one in detail yet, please let us know @Aggarwal-Raghav if you're fine with @maheshrajus's analysis, so we can maybe move forward
| </dependency> | ||
| <dependency> | ||
| <groupId>org.junit.vintage</groupId> | ||
| <artifactId>junit-vintage-engine</artifactId> |
There was a problem hiding this comment.
some modules introduce junit-jupiter-api only, some modules introduce both junit-jupiter-api and junit-vintage-engine, can you explain the reason for this difference for clarity's sake? ideally, I would expect the same pattern amongst different modules of Tez project (given it's not that huge)
There was a problem hiding this comment.
After jersey 2.x upgrade some tests are failed and while fixing the tests we added these dependency.
This module still primarily uses JUnit 4 style tests, but after the upgrade the tests required JUnit 5 platform compatibility dependencies (junit-jupiter-api and junit-vintage-engine) to execute correctly.
junit-vintage-engine is needed to run the existing JUnit 4 tests on the JUnit 5 platform/runtime. Some other modules only required junit-jupiter-api, likely due to differences in their existing test dependency setup.
I limited the changes to the minimum required(junit-jupiter-api) for the affected modules instead of introducing both.
We can raise followup ticket for this to check and fix separately.
There was a problem hiding this comment.
the test dependencies are added into parent pom file.
| return ReflectionUtils.createClazzInstance(authenticatorClazzName); | ||
| } | ||
|
|
||
| private static class TokenAuthenticatedURLConnectionFactory implements HttpURLConnectionFactory { |
There was a problem hiding this comment.
I haven't checked this one in detail yet, please let us know @Aggarwal-Raghav if you're fine with @maheshrajus's analysis, so we can maybe move forward
|
@Aggarwal-Raghav Can you please check and confirm your opinion on above points which we discussed ? Thanks ! |
|
@maheshrajus , I'll check this by EOD or tomorrow. |
|
🎊 +1 overall
This message was automatically generated. |
| return ReflectionUtils.createClazzInstance(authenticatorClazzName); | ||
| } | ||
|
|
||
| private static class TokenAuthenticatedURLConnectionFactory implements HttpURLConnectionFactory { |
There was a problem hiding this comment.
@maheshrajus , I'm not in favour of removing TokenAuthenticatedURLConnectionFactory and PseudoAuthenticatedURLConnectionFactory class .
They were implementing HttpURLConnectionFactory which is jersey 1.x and going forward they should implement HttpUrlConnectorProvider.ConnectionFactory jersey 2.x
To summarize, in this PR you are creating a client without plugging ConnectionFactory
Can you check this patch on top your PR?
Timeline.patch
| private Client getHttpClient() { | ||
| if (httpClient == null) { | ||
| ClientConfig config = new DefaultClientConfig(JSONRootElementProvider.App.class); | ||
| HttpURLConnectionFactory urlFactory = new PseudoAuthenticatedURLConnectionFactory(); |
There was a problem hiding this comment.
Same comment as in TimelineReaderFactory the client created is a vanila client. I disagee with the statement
PseudoAuthenticatedURLConnectionFactory removed in Jersey 2.x and for Pseudo auth strategy for env where delegation token is not supported (hadoop 2.4)
I have 2 questions on this statement:
- "PseudoAuthenticatedURLConnectionFactory removed in Jersey 2.x" :
PseudoAuthenticatedURLConnectionFactoryis TEZ internal class, the implementation should have been updated to the Jersey2.xConnectionFactoryinterface, not deleted. - "Pseudo auth strategy for env where delegation token is not supported (hadoop 2.4)" , then why in hadoop-3.4.1 it was still present in TEZ code?
|
@Aggarwal-Raghav Thanks for your inputs on this PR. Let me check and confirm the above discussed points. |
|
💔 -1 overall
This message was automatically generated. |
|
@Aggarwal-Raghav Fixed your review comments. Can you please check and confirm at your free time. Thanks ! |
|
the latest job (https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-474/9/console) failed with: which is strange as my job is currently successfully passing this point (https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-504/6/) @Aggarwal-Raghav: you had similar failure recently, can you recall if it was a flaky thing or if it was resolved with some magic, rebase, etc.? |
my last run also passed (https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-507/1/console), @maheshrajus , please try to force push again. |
|
@abstractdog @Aggarwal-Raghav After force push the CI pipeline is running fine. Thanks ! |
|
💔 -1 overall
This message was automatically generated. |
What changes were proposed in this PR?
This PR upgrades Tez to support:
As part of the upgrade, several ATS Timeline API response format differences were observed between older Hadoop versions and Hadoop 3.5.0. This PR adds compatibility handling for those schema variations.
Main updates include:
Handled ATS compatibility differences including:
otherinfo↔otherInfoentity↔entityIdentitytype↔entityTypedomain/id↔domainIdAdditional fixes:
How was this patch tested?
jersey upgrade changes taken from PR : #429 by @abstractdog as they need for hadoop 3.5.0 upgrade.