Skip to content

Bump cmf-sdk-go v0.0.5 to v0.0.6 and fix breaking changes#3320

Open
Paras Negi (paras-negi-flink) wants to merge 6 commits intomainfrom
fix/cmf-sdk-v0.0.6-compat
Open

Bump cmf-sdk-go v0.0.5 to v0.0.6 and fix breaking changes#3320
Paras Negi (paras-negi-flink) wants to merge 6 commits intomainfrom
fix/cmf-sdk-v0.0.6-compat

Conversation

@paras-negi-flink
Copy link
Copy Markdown

@paras-negi-flink Paras Negi (paras-negi-flink) commented Apr 14, 2026

What

Bumps cmf-sdk-go from v0.0.5 → v0.0.6 and handles the two breaking changes that ship with it. Applies to Confluent Platform on-prem only.

Breaking change 1: ComputePool.Status type

The generator now produces *map[string]map[string]interface{} for ComputePool.Status (caused by an inline OpenAPI schema with additionalProperties: true instead of a $ref to ComputePoolStatus). The real CMF API returns a flat status like {"phase":"RUNNING","message":null}, which json.Unmarshal cannot fit into the declared inner map[string]interface{}, so the SDK's Execute() fails on every valid 200 response — breaking compute-pool list/describe/create.

Fix: a fallback in pkg/flink/cmf_rest_client.go. When the SDK returns an error but the HTTP status is 2xx, we re-read the buffered response body and parse it ourselves:

  • unmarshalComputePool embeds cmfsdk.ComputePool and shadows Status with json.RawMessage. Everything except status unmarshals via the default path; status is captured as raw JSON, the phase is extracted, and re-encoded into the nested format ({"phase":{"value":"RUNNING"}}) that the existing extractComputePoolPhase helper expects.
  • unmarshalComputePoolsPage does the same for paginated list responses.
  • If the fallback parse also fails, the body is re-buffered so parseSdkError can still produce a useful error message.

Breaking change 2: PostEnvironment.Name type

Changed from string to *string with GetName()/SetName() accessors. Call sites in command_environment_create.go and command_environment_update.go migrated, with nil-name guards in CreateEnvironment/UpdateEnvironment.

Test handler rewrite (forced, not optional)

test/test-server/flink_onprem_handler.go had to be modified because two types it used no longer compile against v0.0.6:

  • cmfsdk.ComputePoolStatus{Phase: phase} — this struct type no longer exists in v0.0.6 (replaced by the generic map).
  • environment.Name reads — Name is now *string, so all reads were migrated to environment.GetName().

While rewriting createComputePool, output is changed to a raw map[string]interface{} with flat status ("phase": "RUNNING") so the mock matches the real CMF API response. This ensures the CLI's fallback path is actually exercised in CI rather than being dead code that only runs in production.

Scope

  • ~55 lines of workaround in pkg/flink/cmf_rest_client.go, isolated to the 3 compute-pool methods
  • No API surface change for consumers of CmfRestClient
  • Golden file changes are null{} only (cosmetic, from test mock defaulting)

Release Notes

Breaking Changes

  • None

New Features

  • None

Bug Fixes

  • Fixed on-prem Flink environment and compute pool commands to work with cmf-sdk-go v0.0.6 SDK type changes.

Checklist

  • I have successfully built and used a custom CLI binary, without linter issues from this PR.
  • I have clearly specified in the What section below whether this PR applies to Confluent Cloud, Confluent Platform, or both.
  • I have verified this PR in Confluent Cloud pre-prod or production environment, if applicable.
  • I have verified this PR in Confluent Platform on-premises environment, if applicable.
  • I have attached manual CLI verification results or screenshots in the Test & Review section below.
  • I have added appropriate CLI integration or unit tests for any new or updated commands and functionality.
  • I confirm that this PR introduces no breaking changes or backward compatibility issues.
  • I have indicated the potential customer impact if something goes wrong in the Blast Radius section below.
  • I have put checkmarks below confirming that the feature associated with this PR is enabled in:
    • Confluent Cloud prod
    • Confluent Cloud stag
    • Confluent Platform
    • Check this box if the feature is enabled for certain organizations only

Blast Radius

Scope is limited to on-prem Flink compute pool and environment commands; Confluent Cloud commands (which use ccloud-sdk-go-v2, not cmf-sdk-go) are completely unaffected.

Tests

 /Users/parasnegi/cli/dist/confluent_darwin_arm64_v8.0/confluent flink compute-pool
  describe pool --url http://localhost:8080 --environment test
  +---------------+--------------------------+
  | Creation Time | 2026-03-25T14:08:09.371Z |
  | Name          | pool                     |
  | Type          | DEDICATED                |
  | Phase         | DEDICATED                |
  +---------------+--------------------------+

  ---
  /Users/parasnegi/cli/dist/confluent_darwin_arm64_v8.0/confluent flink compute-pool
  describe pool --url http://localhost:8080 --environment test -o json

  {
    "apiVersion": "cmf.confluent.io/v1",
    "kind": "ComputePool",
    "metadata": {
      "creationTimestamp": "2026-03-25T14:08:09.371Z",
      "uid": "7cbd1475-53b4-4b71-ba8d-af3e4d7ee6d7",
      "labels": {},
      "annotations": {}
    },
    "spec": {
      "type": "DEDICATED",
      "clusterSpec": {
        "flinkConfiguration": {
          "execution.checkpointing.interval": "2000",
          "taskmanager.numberOfTaskSlots": "4"
        },
        "flinkVersion": "v1_19",
        "image": "519856050701.dkr.ecr.us-west-2.amazonaws.com/docker/dev/confluentinc/cp
  -flink-sql:0.0119-SNAPSHOT",
        "jobManager": {
          "resource": {
            "cpu": 0.5,
            "memory": "1024m"
          }
        },
        "podTemplate": {
          "spec": {
            "imagePullSecrets": [
              {
                "name": "confluent-registry"
              }
            ]
          }
        },
        "taskManager": {
          "resource": {
            "cpu": 0.5,
            "memory": "1024m"
          }
        }
      }
    },
    "status": {
      "phase": "DEDICATED"
    }
  }

ComputePool.Status changed from *ComputePoolStatus to
*map[string]map[string]interface{} and PostEnvironment.Name changed
from string to *string. This updates all call sites to use SDK
getters/setters, adds extractComputePoolPhase() helper for the
generic status map, adds empty-name guards to environment create/update,
and updates test handler and golden files accordingly.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Copilot AI review requested due to automatic review settings April 14, 2026 15:51
@confluent-cla-assistant
Copy link
Copy Markdown

🎉 All Contributor License Agreements have been signed. Ready to merge.
Please push an empty commit if you would like to re-run the checks to verify CLA status for all contributors.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the CLI to accommodate breaking changes introduced in cmf-sdk-go v0.0.6, specifically around PostEnvironment.Name becoming a pointer and ComputePool.Status becoming a generic nested map. This keeps on-prem Flink environment and compute-pool commands working and aligns test fixtures with the updated API shapes.

Changes:

  • Bump github.com/confluentinc/cmf-sdk-go from v0.0.5 to v0.0.6 (go.mod/go.sum).
  • Update environment create/update to use SDK getters/setters and add empty-name validation in the CMF REST client.
  • Add extractComputePoolPhase() to read compute pool phase from the new generic status map, and update on-prem compute-pool commands + test server + golden fixtures accordingly.

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
pkg/flink/cmf_rest_client.go Uses PostEnvironment.GetName() and validates non-empty environment names for create/update.
internal/flink/command_environment_create.go Uses postEnvironment.SetName() to match pointer-based SDK field.
internal/flink/command_environment_update.go Uses postEnvironment.SetName() to match pointer-based SDK field.
internal/flink/command_compute_pool.go Adds extractComputePoolPhase() and updates compute-pool serialization to use it.
internal/flink/command_compute_pool_list_onprem.go Uses extractComputePoolPhase() for list output.
internal/flink/command_compute_pool_describe_onprem.go Uses extractComputePoolPhase() for describe output.
internal/flink/command_compute_pool_create_onprem.go Uses extractComputePoolPhase() for create output.
test/test-server/flink_onprem_handler.go Updates compute-pool and environment handlers to emit/consume new SDK shapes.
test/fixtures/output/flink/compute-pool/list-json.golden Updates expected JSON (clusterSpec now {} instead of null).
test/fixtures/output/flink/compute-pool/describe-success-json.golden Updates expected JSON (clusterSpec now {} instead of null).
go.mod Bumps cmf-sdk-go dependency version.
go.sum Updates checksums for cmf-sdk-go v0.0.6.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/flink/command_compute_pool.go Outdated
The JSON marshal/unmarshal fallback path can never succeed given the SDK
type map[string]map[string]interface{} — marshaling always produces
nested objects which cannot unmarshal into map[string]string. Remove the
unreachable code and stderr warnings to keep the function side-effect free.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
SDK v0.0.6 types ComputePool.Status as *map[string]map[string]interface{}
but the API returns a flat object like {"phase":"RUNNING"}, causing the
SDK's Execute() to fail on valid 200 responses. Add fallback helpers that
re-parse the buffered response body when this occurs.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The previous commit rewrote createComputePool to a raw map and explicitly
set clusterSpec to an empty map, which serialized as {} in golden JSON
outputs (vs the prior null). Dropping the clusterSpec key from the mock
leaves ComputePool.Spec.ClusterSpec at its zero value (nil map), which
serializes as null — matching the original golden files.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The fallback pattern (re-read buffered response body, attempt manual parse,
re-buffer on failure) was duplicated across CreateComputePool,
DescribeComputePool, and ListComputePools. Extract readFallbackBody and
a generic tryComputePoolFallback helper so each call site collapses to
a three-line check, and the failure/re-buffer discipline lives in one
place.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@sonarqube-confluent
Copy link
Copy Markdown

@rmetzger
Copy link
Copy Markdown

What caused breaking change 2?

Comment on lines +63 to +76
// extractComputePoolPhase extracts the phase string from the generic status map.
// ComputePool.Status is typed as *map[string]map[string]interface{} in SDK v0.0.6.
func extractComputePoolPhase(pool cmfsdk.ComputePool) string {
if pool.Status == nil {
return ""
}
status := pool.GetStatus()
if phaseMap, ok := status["phase"]; ok {
if value, ok := phaseMap["value"].(string); ok {
return value
}
}
return ""
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make sense to move this utility into the SDK?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that we don't do human code push in cmf-go-sdk, but if we want to : Adds a custom UnmarshalJSON method on the ComputePool type (in a new file v1/model_compute_pool_custom.go ) seems to be the better fix, which completely removes "custom unmarshaling with json.RawMessage embedding" from cli repo

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that we don't do human code push in cmf-go-sdk, but if we want to

My only worry about human pushes in the sdk repo is that the code generator is overwriting human written code / utils.
But when we started the go sdk repo, I actually had additional, non generated code for shared utilities in mind.

}

if sdkComputePool.Status != nil {
if phase := extractComputePoolPhase(sdkComputePool); phase != "" {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we always call this method, also when the CLI runs against older CMF versions? Probably yes? because the response has not changed, only the SDK.

}

// readFallbackBody returns the HTTP response body when the SDK's Execute() returned
// an error despite a successful 2xx status — the ComputePool deserialization bug.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is "the ComputePool deserialization bug"? I don't think this is a bug?
It's just that the SDK's interfaces have changed

@rmetzger
Copy link
Copy Markdown

err != nil && statusCode in [200,300) fires on any decode failure, not just the known Status-shape bug. Scenarios it could mask:

  1. SDK adds a new broken field unrelated to Status → fallback "succeeds" by ignoring that field, we silently return a partially-correct pool
  2. Response truncation on a 2xx → fallback parse fails, we fall through to parseSdkError which reports the wrong cause

narrow the error signature. The v0.0.6 bug produces a specific *json.UnmarshalTypeError on the Status field. Match on that:

var typeErr *json.UnmarshalTypeError
if !errors.As(err, &typeErr) || typeErr.Field != "status" {
    return nil   // not the bug we're working around
}

Brittle if the SDK changes error wrapping, but explicitly scoped to the known bug. Any other decode failure now surfaces correctly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants