Fix the pipeline failure about java - spring - ci #48488
Draft
/azp run java - spring - ci
Azure Pipelines could not run because the pipeline triggers exclude this branch/path.
Pull request overview
Adds a timeout to Maven Surefire forked JVMs in the Azure SDK Java client parent POM to prevent CI jobs from hanging indefinitely when a forked test process crashes (e.g., corrupted channel / dead fork).
Changes:
- Configure maven-surefire-plugin with forkedProcessTimeoutInSeconds=600 in azure-client-sdk-parent to fail fast instead of waiting forever on a dead forked JVM.
Description
The Spring CI pipeline job "Test ubuntu2404_121_NotFromSource_TestsOnly" intermittently hangs until the 60-minute pipeline timeout kills it. The failure is not deterministic — retrying the pipeline typically resolves it.
Root Cause Analysis
Investigation traced the hang to a JVM crash (EXCEPTION_ACCESS_VIOLATION in Class.getDeclaredConstructors0) occurring inside a Surefire forked JVM during Spring Boot test context creation. The crash happens under high concurrent class-loading pressure when:
- Maven runs multiple modules in parallel (-T 1C)
- Each module forks a JVM for testing (forkCount=1)
- JUnit 5 parallel test execution is enabled (parallelizeTests=concurrent)
- Mockito/byte-buddy dynamically generates proxy bytecode alongside Spring's ApplicationContext proxy creation
When the forked JVM crashes, Surefire reports "Corrupted channel by directly writing to native stream in forked JVM" but has no timeout configured — so Maven waits indefinitely for the dead process, causing the pipeline to hang until the Azure DevOps job timeout (60 minutes) kills the entire build.
Fix
Added forkedProcessTimeoutInSeconds=600 to the maven-surefire-plugin configuration in azure-client-sdk-parent. This gives each forked test JVM a 10-minute timeout. If a fork crashes or hangs, Surefire will detect the timeout, kill the process, and report a clear error instead of waiting indefinitely.
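For reference, the change can be sketched as the POM fragment below. This is a minimal illustration, not the exact diff: forkedProcessTimeoutInSeconds is a standard maven-surefire-plugin parameter, but where the plugin block sits in azure-client-sdk-parent (build/plugins versus pluginManagement) and whatever configuration already surrounds it are assumptions here.

```xml
<!-- Sketch only: the enclosing <build>/<plugins> placement is assumed,
     not copied from azure-client-sdk-parent. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-surefire-plugin</artifactId>
      <configuration>
        <!-- Kill a forked test JVM that has not finished within
             600 seconds (10 minutes) and fail the build. -->
        <forkedProcessTimeoutInSeconds>600</forkedProcessTimeoutInSeconds>
      </configuration>
    </plugin>
  </plugins>
</build>
```

Surefire exposes the same parameter as the user property surefire.timeout, so a different value could likely be tried for a single run with -Dsurefire.timeout=... as long as the POM does not hard-code the element.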
Verification
The 600-second (10-minute) timeout is generous enough that no legitimate test run should hit it — the full Spring autoconfigure module (1073 tests) completes in ~32 seconds locally.
When a fork does crash/hang, the pipeline will now fail fast with a descriptive timeout error and free the CI agent, rather than consuming the full 60-minute job timeout.
All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines