fix: restore post-connect fallback for multi-network stacks on API < 1.44#13629
fix: restore post-connect fallback for multi-network stacks on API < 1.44#13629jotka wants to merge 4 commits intodocker:mainfrom
Conversation
…1.44 Docker Compose 5.1.0 broke multi-network stacks on Docker daemons older than API 1.44 (Docker < 23.0, e.g. Synology DSM 7.1/7.2) with error: "Container cannot be connected to network endpoints" Root cause: defaultNetworkSettings() was refactored to unconditionally include ALL networks in EndpointsConfig for ContainerCreate, without the API version guard that limited this to daemons >= 1.44. The post-connect NetworkConnect() fallback that handled old daemons was also removed. Fix: - In defaultNetworkSettings() (create.go): only add extra networks to EndpointsConfig when apiVersion >= 1.44 - In createMobyContainer() (convergence.go): restore the post-connect fallback that calls NetworkConnect() for each extra network when apiVersion < 1.44 Fixes docker#13628 Signed-off-by: jarek <[email protected]>
8f5763e to
af22a2b
Compare
- Add APIVersion field to createConfigs so createMobyContainer can use the version already fetched by getCreateConfigs instead of calling RuntimeVersion a second time. - Move ContainerInspect after the NetworkConnect loop so the returned container.Summary includes all attached networks. - Add unit tests for the < 1.44 code path in both defaultNetworkSettings and createMobyContainer. Signed-off-by: Jarek Krochmalski <[email protected]>
af22a2b to
7895a5a
Compare
|
Good catch to flag — I had the same instinct initially. But after scanning the codebase, this comparison is safe due to a structural guarantee in compose-go:
The only case where the loop runs is the standard That said, the codebase itself identifies primary by position ( for i, networkKey := range serviceNetworks {
if i == 0 {
continue // primary, already in ContainerCreate
}Functionally equivalent here, but arguably more explicit about intent. Let me know. |
There was a problem hiding this comment.
Pull request overview
Restores Docker Compose’s backward-compatible multi-network container creation behavior for Docker Engine API versions older than 1.44, preventing ContainerCreate failures on legacy daemons (e.g., Docker 20.10 / Synology DSM).
Changes:
- Reintroduced API-version gating so
ContainerCreateonly includes multipleEndpointsConfigentries whenapiVersion >= 1.44. - Restored the legacy post-create network attachment path (
NetworkConnect) forapiVersion < 1.44, and ensuredContainerInspecthappens after legacy network connects. - Added/updated unit tests covering the
< 1.44behavior in both network config generation and container creation.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/compose/create.go | Adds APIVersion to create configs and gates extra EndpointsConfig entries behind apiVersion >= 1.44. |
| pkg/compose/convergence.go | Restores legacy NetworkConnect loop for < 1.44 before inspecting the container. |
| pkg/compose/api_versions.go | Documents/defines the apiVersion144 threshold and its behavioral implications. |
| pkg/compose/create_test.go | Adds a subtest asserting only the primary network is included for API 1.43. |
| pkg/compose/convergence_test.go | Adds a legacy API test asserting NetworkConnect is used and ContainerInspect occurs after connects. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
glours
left a comment
There was a problem hiding this comment.
Hey @jotka
Thanks for this fix, unfortunately it could introduced creation of unwanted orphan containers when the process can't create endpoint or attach to networks.
There is also a separate issue where the fallback decision is based on the daemon’s advertised max API version instead of the negotiated client API version; that part has been handled separately in PR #13690.
| epSettings, err := createEndpointSettings(project, service, number, networkKey, cfgs.Links, opts.UseNetworkAliases) | ||
| if err != nil { | ||
| return created, err | ||
| } | ||
| if _, err := s.apiClient().NetworkConnect(ctx, mobyNetworkName, client.NetworkConnectOptions{ | ||
| Container: response.ID, | ||
| EndpointConfig: epSettings, | ||
| }); err != nil { | ||
| return created, err |
There was a problem hiding this comment.
For API < 1.44, this PR now creates the container first and then attaches secondary networks one by one here. If one of these NetworkConnect() calls fails, the function returns immediately and leaves the just-created container behind on the primary network. That means the legacy fallback is no longer atomic on failure, and a later up / recreate can hit a name conflict against the orphaned container.
I think we should clean up the created container on NetworkConnect failure before returning.
There was a problem hiding this comment.
Good catch, thanks! Added ContainerRemove with Force:true on both error paths (createEndpointSettings and NetworkConnect). Also added a test for the failure case. Re the API version negotiation — noted, will rebase on top of #13690 once it lands.
When NetworkConnect fails for a secondary network on API < 1.44, remove the just-created container before returning the error. This prevents orphan containers that cause name conflicts on retry.
Problem
Docker Compose 5.1.0 broke multi-network stacks on Docker daemons older than API 1.44 (Docker Engine < 23.0), including Synology DSM 7.1 (API 1.41) and DSM 7.2 (API 1.43). The error is:
This is a regression from 5.0.2 which worked correctly on these daemons.
Closes #13628
Context
I maintain Dockhand, a Docker management application. Many of our users run Dockhand on Synology NAS devices (DSM 7.1/7.2), which ship with Docker 20.10 (API 1.41). This regression broke multi-network stack deployments for all of them when we updated docker-compose to 5.1.0. We had to cap docker-compose at 5.0.2 as a workaround while investigating the root cause.
Root Cause
Both regressions were introduced in commit bfb5511 (
go.mod: bump github.com/moby/moby/api v1.53.0, moby/client v0.2.2):1.
defaultNetworkSettings()increate.goThe
versions.GreaterThanOrEqualTo(version, APIVersion144)guard was removed during the moby API bump, makingdefaultNetworkSettingsunconditionally include ALL networks inEndpointsConfigforContainerCreate. Old daemons (< API 1.44) only accept a single entry and reject the request with the error above.2. The post-connect fallback was removed from
convergence.goThe same commit removed the
NetworkConnectfallback loop fromcreateMobyContainer(). Previously (since aaa7ef6), afterContainerCreate, the code checkedversions.LessThan(apiVersion, "1.44")and connected extra networks one-by-one viaNetworkConnect(). This entire block was dropped during the API migration.Fix
First commit (
4f14ab8): Restore backward compatRestores the two behaviors from 5.0.2:
create.go—defaultNetworkSettings(): Only include extra networks inEndpointsConfigwhenapiVersion >= 1.44(guarded by!versions.LessThan(version, apiVersion144)).convergence.go—createMobyContainer(): Restore the post-connect fallback. After container creation, whenapiVersion < 1.44, iterate the service's extra networks and callNetworkConnect()for each, skipping the primary network (already configured viaNetworkModeinContainerCreate).Second commit (
7895a5a)Pass
APIVersionthroughcreateConfigs: AddedAPIVersionfield to thecreateConfigsstruct, populated from the existingapiVersionlocal ingetCreateConfigs(). This eliminates the redundants.RuntimeVersion(ctx)call increateMobyContainer.Reorder
ContainerInspectafterNetworkConnect: Moved the inspect call after theNetworkConnectloop so the returnedcontainer.Summaryincludes all attached networks.Unit tests for the < 1.44 code path:
TestDefaultNetworkSettingssub-test with version"1.43"— assertsEndpointsConfigcontains only the primary network.TestCreateMobyContainerLegacyAPI— assertsContainerCreategets only the primary network,NetworkConnectis called for the secondary, andContainerInspectruns afterNetworkConnect.Testing
TestRunHook_ConsoleSizefailure is a pre-existing PTY crash on macOS/arm64, unrelated to this change and present onmainbefore this fix)ContainerCreatepath