[BUG] Inflated ReadyReplicas During Rolling Updates in k8s #4498
Open
Labels
bug, kubernetes, operator
Description
MCPServer status reports an inflated number of ReadyReplicas during rolling updates.
This occurs because the reconciliation logic counts all pods matching the toolhive-name label, including pods that are terminating but still marked as ready.
Environment
- Operator Version: v0.14.1
- CRD Version: v0.14.1
Current Behavior
During rollout:
- Old pods enter Terminating state but may still be counted as Ready
- New pods start concurrently
- ReadyReplicas temporarily exceeds desired replica count
Example:
- Desired replicas: 2
- Reported ReadyReplicas: 3 or 4 during rollout
Expected Behavior
- ReadyReplicas should reflect only active, non-terminating, ready pods
- ReadyReplicas should not exceed the desired replica count under normal conditions
Root Cause
- Status logic counts every pod that carries the matching label, including pods with a non-nil DeletionTimestamp (terminating pods)
Proposed Solutions
1. Filter Terminating Pods
Update updateMCPServerStatus:
- Exclude pods where DeletionTimestamp is set
2. Improve Label Selectors
Introduce clearer labels:
- toolhive-tool-type: proxy
- toolhive-tool-type: mcp-server
This allows:
- More accurate filtering
- Potential separation of proxy vs backend status
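As a rough illustration of how the proposed label could narrow pod selection, the sketch below matches pods on both `toolhive-name` and `toolhive-tool-type`. The helper names `selectorFor` and `matches` and the server name `fetch` are hypothetical, and plain maps stand in for pod labels. A real implementation would more likely pass the same key/value pairs to the client's label-matching list options.

```go
package main

import "fmt"

const (
	labelName     = "toolhive-name"      // existing label referenced in this issue
	labelToolType = "toolhive-tool-type" // proposed label, not yet applied by the operator
)

// selectorFor builds an equality-based selector for one MCPServer and one
// tool type ("proxy" or "mcp-server").
func selectorFor(serverName, toolType string) map[string]string {
	return map[string]string{
		labelName:     serverName,
		labelToolType: toolType,
	}
}

// matches reports whether podLabels contains every key/value pair in selector.
func matches(podLabels, selector map[string]string) bool {
	for key, want := range selector {
		got, ok := podLabels[key]
		if !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	proxyPod := map[string]string{labelName: "fetch", labelToolType: "proxy"}
	backendPod := map[string]string{labelName: "fetch", labelToolType: "mcp-server"}

	sel := selectorFor("fetch", "proxy")
	fmt.Println(matches(proxyPod, sel), matches(backendPod, sel)) // prints: true false
}
```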
3. Optional Enhancement
Consider reporting the following separately for better observability:
- Proxy replicas
- Backend replicas