
[FLINK-39490][web] Improve feedback when cluster is unreachable #27971

Open
spuru9 wants to merge 4 commits into apache:master from spuru9:fix-lint-and-interceptors

spuru9 (Contributor) commented Apr 19, 2026

What is the purpose of the change

When the Flink cluster becomes unreachable (JobManager down, network blip, proxy returning 5xx), the dashboard gives no clear signal: pages sit with stale data and scattered per-request errors. This change detects persistent unreachability and surfaces a single, clear notification.

Brief change log

  • Track consecutive network failures (HTTP status 0 or >= 500) in StatusService and show a single persistent "Network Error" toast after 5 consecutive failures. The toast auto-dismisses on the next successful response. If the user dismisses it during an ongoing outage, it reappears after another 5 consecutive failures.
  • Minor fixes in the same area: JobService.loadJob now returns EMPTY on error instead of breaking the Job Detail view, and the header badge refreshes immediately when a new server error arrives.
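The failure-streak behaviour described in the first bullet can be sketched roughly as follows. This is a minimal, framework-free illustration and not the code from this PR: `UnreachableTracker`, `isNetworkFailure`, `record`, `onUserDismiss`, and the two callbacks are hypothetical names, and treating 4xx responses as proof that the cluster is reachable is an assumption. Only the threshold of 5 and the status rule (0 or >= 500) come from the description above.

```typescript
// Hypothetical sketch; class, method, and callback names are illustrative only.
const FAILURE_THRESHOLD = 5; // value taken from the PR description

/** Status 0 (no response at all) or any 5xx counts as a network failure. */
function isNetworkFailure(status: number): boolean {
  return status === 0 || status >= 500;
}

class UnreachableTracker {
  private consecutiveFailures = 0;
  private toastVisible = false;
  private readonly showToast: () => void;
  private readonly hideToast: () => void;

  constructor(showToast: () => void, hideToast: () => void) {
    this.showToast = showToast;
    this.hideToast = hideToast;
  }

  /** Feed in the HTTP status of every completed request. */
  record(status: number): void {
    if (!isNetworkFailure(status)) {
      // Assumption: any non-failure response (including 4xx) means the
      // cluster answered, so the streak ends and the toast is dismissed.
      this.consecutiveFailures = 0;
      if (this.toastVisible) {
        this.toastVisible = false;
        this.hideToast();
      }
      return;
    }
    this.consecutiveFailures += 1;
    // Show the toast once per streak, never once per failed request.
    if (this.consecutiveFailures >= FAILURE_THRESHOLD && !this.toastVisible) {
      this.toastVisible = true;
      this.showToast();
    }
  }

  /** Call when the user closes the toast by hand. */
  onUserDismiss(): void {
    this.toastVisible = false;
    // Reset so that a full new streak is needed before the toast reappears.
    this.consecutiveFailures = 0;
  }
}
```

Resetting the counter on manual dismissal is one way to get the "reappears after another 5 failures" behaviour; the actual change may track this differently.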

Verifying this change

Verified manually: kill the JobManager and confirm the toast appears once after ~5 failed polls; restart and confirm it auto-dismisses; dismiss manually during an outage and confirm it reappears after another 5 failures.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no

flinkbot (Collaborator) commented Apr 19, 2026

CI report:

Bot commands

The @flinkbot bot supports the following commands:

  • @flinkbot run azure: re-run the last Azure build

spuru9 marked this pull request as ready for review on April 19, 2026, 21:32

spuru9 (Contributor, Author) commented Apr 21, 2026

@och5351 Could you take a look at this too, if possible?

