Skip to content

[shimV2] [VM controller] refactor + adds Update + Device controllers#2663

Merged
rawahars merged 5 commits intomicrosoft:mainfrom
rawahars:vm_controller_changes
Apr 16, 2026
Merged

[shimV2] [VM controller] refactor + adds Update + Device controllers#2663
rawahars merged 5 commits intomicrosoft:mainfrom
rawahars:vm_controller_changes

Conversation

@rawahars
Copy link
Copy Markdown
Contributor

@rawahars rawahars commented Apr 6, 2026

Summary

In this change, we are adding the following changes-

  • Following the shim V2 standard, we are declaring the interfaces at the place where the APIs are used. Therefore, we are deleting all the interfaces from the existing packages.
  • We are refactoring the vm controller to bring it to the same standard as rest of the controllers.

Additionally, we are adding-

  • Update functionality with LCOW and WCOW specific implementation
  • RuntimeID API to return the HCS compute ID
  • APIs to obtain device controllers
    • SCSI controller which is initialized eagerly on VM create
    • VPCI controller which is lazy initialized
    • Plan9 controller which is lazy initialized specifically for lcow builds

ID string

// HCSDocument specifies the HCS schema document used to create the VM.
HCSDocument *hcsschema.ComputeSystem
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self, I would have imagined the Create call is what makes this. How does the caller have it for create opts?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we build the document in the service layer and pass it along to the VM controller.
Ref-

hcsDocument, sandboxOptions, err := lcow.BuildSandboxConfig(ctx, ShimName, request.BundlePath, shimOpts, &sandboxSpec)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Lets chat on this one. I'm not sure

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline- Created an issue for refactor- #2687

Comment thread internal/controller/vm/vm.go Outdated
// NetworkController returns a new controller for managing network devices on the VM.
// Since we have a namespace per pod, we create a new controller per call.
func (c *Controller) NetworkController() *network.Controller {
return network.New(c.uvm, c.guest, c.guest)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New? Why not same one each time?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So networking is local to each pod i.e. each pod has exactly one network namespace and associated endpoints in them. Therefore, it seemed more natural to have a new controller for each pod which controlled per-pod settings.

Device controllers such as SCSI or Plan9 are different since they need holistic view of the VM for vm-host side allocation as well as guest side mount paths.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Lets chat on this one too

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline- Created an issue for refactor- #2688


// Iterate over the well-known controller GUIDs so the slice index gives us
// the correct controller number directly.
for ctrlIdx, guid := range guestrequest.ScsiControllerGuids {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont get this? Are you saying we force all controllers to have the same guid?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can have maximum 4 SCSI controllers and each has a specific GUID.
Reference-

// V5 GUIDs for SCSI controllers

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still dont get it. Why cant you just do doc.devices.scsi[guid]; !ok thats an empty guid you can allocate

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we iterated over the map directly, we'd have the GUID but not the controller number. We'd then need to reverse-lookup the index in ScsiControllerGuids for every GUID.

Honestly, this is because we use controller index which is an index into guestrequest.ScsiControllerGuids. We could have used controller GUIDs directly, but then slotting logic would have became complicated.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline- Created an issue for refactor- #2689

Comment thread internal/controller/vm/vm_lcow.go Outdated
Comment thread internal/controller/vm/vm_unsupported.go Outdated
Comment thread internal/controller/vm/vm_wcow.go Outdated
@rawahars rawahars force-pushed the vm_controller_changes branch from 59b1602 to 50bcc09 Compare April 15, 2026 07:08
Following the shim V2 standard, we are declaring the interfaces at the place where the APIs are used. Therefore, we are deleting all the interfaces from the existing packages.

Signed-off-by: Harsh Rawat <harshrawat@microsoft.com>
In this commit, we are refactoring the vm controller to bring it to the same standard as rest of the controllers.

Additionally, we are adding-
- Update functionality
- RuntimeID API
- APIs to obtain device controllers

Signed-off-by: Harsh Rawat <harshrawat@microsoft.com>
Signed-off-by: Harsh Rawat <harshrawat@microsoft.com>
Signed-off-by: Harsh Rawat <harshrawat@microsoft.com>
@rawahars rawahars force-pushed the vm_controller_changes branch 5 times, most recently from c66dd9d to 16ce43e Compare April 16, 2026 04:59
Signed-off-by: Harsh Rawat <harshrawat@microsoft.com>
@rawahars rawahars force-pushed the vm_controller_changes branch from 16ce43e to 00c98b1 Compare April 16, 2026 05:24
@rawahars
Copy link
Copy Markdown
Contributor Author

Rebased the PR to mainline which should eliminate PR checks.
No functional changes to the PR in the last push.

The added changes are for unblocking CI errors.

Copy link
Copy Markdown
Contributor

@jterry75 jterry75 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved remaining 3 will be follow ups.

@rawahars rawahars merged commit f8e5b68 into microsoft:main Apr 16, 2026
19 checks passed
@rawahars rawahars deleted the vm_controller_changes branch April 16, 2026 19:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants