M #-: Add Open vSwitch role #168

Open
sk4zuzu wants to merge 14 commits into master from add-openvswitch-role

@sk4zuzu
Collaborator

@sk4zuzu sk4zuzu commented Feb 12, 2026

No description provided.

@sk4zuzu sk4zuzu requested review from rsmontero and tinova February 12, 2026 15:33
@sk4zuzu sk4zuzu force-pushed the add-openvswitch-role branch from 80280d3 to 40bc416 Compare March 12, 2026 10:25
@sk4zuzu sk4zuzu changed the title M #-: Add Open vSwitch role (WIP) M #-: Add Open vSwitch role Mar 12, 2026
@sk4zuzu sk4zuzu marked this pull request as ready for review March 12, 2026 10:29
@sk4zuzu sk4zuzu force-pushed the add-openvswitch-role branch 2 times, most recently from c816666 to 6a91a50 Compare March 23, 2026 17:18
@dann1
Collaborator

dann1 commented Apr 8, 2026

Hi @sk4zuzu. I've noticed some issues with the systemd unit execution when no ipv4 configuration is declared on the bridge.

If the ipv4 configuration is missing from the ovs.br section in the inventory

            addrs:
              - cidr: "{{ ansible_default_ipv4.address ~ '/' ~ ansible_default_ipv4.prefix }}"
                metric: 400
            gw: "{{ ansible_default_ipv4.gateway }}"
            dns: ["{{ ansible_default_ipv4.gateway }}"]

this causes the systemd unit to fail with a bash error

[root@sm15 ovs]# systemctl status opennebula-ovs.service
× opennebula-ovs.service - OVS Bridge Interface Network configuration
     Loaded: loaded (/etc/systemd/system/opennebula-ovs.service; disabled; preset: disabled)
     Active: failed (Result: exit-code) since Thu 2026-03-19 16:05:11 UTC; 20min ago
   Main PID: 14378 (code=exited, status=2)
        CPU: 648ms

Mar 19 16:05:11 sm15.local opennebula-ovs.sh[14378]: Stopping and disabling NetworkManager
Mar 19 16:05:11 sm15.local systemctl[14526]: Removed "/etc/systemd/system/multi-user.target.wants/NetworkManager.service".
Mar 19 16:05:11 sm15.local systemctl[14526]: Removed "/etc/systemd/system/dbus-org.freedesktop.nm-dispatcher.service".
Mar 19 16:05:11 sm15.local systemctl[14526]: Removed "/etc/systemd/system/network-online.target.wants/NetworkManager-wait-online.service".
Mar 19 16:05:11 sm15.local opennebula-ovs.sh[14378]: Flushing ovsbr0
Mar 19 16:05:11 sm15.local opennebula-ovs.sh[14378]: Bringing up ovsbr0
Mar 19 16:05:11 sm15.local opennebula-ovs.sh[14378]: /usr/local/sbin/opennebula-ovs.sh: line 97: syntax error near unexpected token `fi'
Mar 19 16:05:11 sm15.local systemd[1]: opennebula-ovs.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Mar 19 16:05:11 sm15.local systemd[1]: opennebula-ovs.service: Failed with result 'exit-code'.
Mar 19 16:05:11 sm15.local systemd[1]: Failed to start OVS Bridge Interface Network configuration.

The "compiled bash" has an empty if body in this case (only comments, which bash rejects)

[root@sm15 ovs]# tail -n +90 /usr/local/sbin/opennebula-ovs.sh

# --- Nameserver configuration

if type -p systemctl resolvectl &>/dev/null && systemctl is-active --quiet systemd-resolved; then
    # dpdk0
    # dpdk1
    # ovsbr0
fi

# --- Networking refresh

# ovsbr0

# ---

log 'OVS network configuration completed successfully'
exit 0
[root@sm15 ovs]# if type -p systemctl resolvectl &>/dev/null && systemctl is-active --quiet systemd-resolved; then
    # dpdk0
    # dpdk1
    # ovsbr0
fi
-bash: syntax error near unexpected token `fi'

This also leads to losing connectivity post reboot since NetworkManager is disabled.

In general, we shouldn't tinker with ipv4-related networking configuration if we don't define it on the bridge. We could possibly also do this transparently, by detecting whether the interfaces used as bridge ports have IP configuration, but perhaps this is too complex.
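
One possible way to keep the rendered script valid when a template loop emits only comments is to start the block with the no-op builtin `:`. The snippet below is a minimal sketch, not the role's actual template: both script strings are made up for illustration and are only syntax-checked with `bash -n`, never executed.

```shell
#!/usr/bin/env bash
# Sketch: an if-body that contains only comments is a syntax error in bash;
# a leading no-op ':' makes the same block parse.

# Hypothetical rendered output when no nameserver config is declared.
broken_script='if true; then
    # ovsbr0
fi'

# Same block with a no-op placeholder as the first command.
fixed_script='if true; then
    : # no-op keeps the block non-empty
    # ovsbr0
fi'

# check_syntax SCRIPT: parse only (bash -n), never execute.
check_syntax() {
    printf '%s\n' "$1" | bash -n 2>/dev/null
}

if check_syntax "$broken_script"; then broken_rc=0; else broken_rc=$?; fi
if check_syntax "$fixed_script"; then fixed_rc=0; else fixed_rc=$?; fi

echo "broken: rc=$broken_rc, fixed: rc=$fixed_rc"
```

In a jinja template the same effect could be had by emitting the `:` line unconditionally before the loop, or by wrapping the whole `if` in a template condition so it is not rendered at all when there is nothing to configure.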

@dann1
Collaborator

dann1 commented Apr 8, 2026

We also need support for traffic mirroring. We validated the method to issue mirroring as follows

# ovs-vsctl add-port ovsbr0 gre0 -- set interface gre0 type=gre options:remote_ip=192.168.150.1 -- --id=@p get port gre0 -- --id=@m create mirror name=gre-mirror  select-all=true output-port=@p  -- set bridge ovsbr0 mirrors=@m


# ovs-vsctl list mirror
_uuid               : 6cb0a084-8446-4654-bab0-e1e16c503256
external_ids        : {}
name                : gre-mirror
output_port         : 540897bf-bbda-4fe7-9395-88b0fc97afb3
output_vlan         : []
select_all          : true
select_dst_port     : []
select_src_port     : []
select_vlan         : []
snaplen             : []
statistics          : {tx_bytes=1573748, tx_packets=2822}

The bash logic to create mirrors

if [[ -n $OVS_MIRROR ]]; then
    ovs-vsctl add-port $BRIDGE $GRE \
    -- set interface $GRE type=gre options:remote_ip=$MIRROR_HOST \
    -- --id=@p get port $GRE \
    -- --id=@m create mirror name=$OVS_MIRROR select-all=true output-port=@p \
    -- set bridge $BRIDGE mirrors=@m
fi

We can split that into port creation

ovs-vsctl add-port $BRIDGE $GRE -- set interface $GRE type=gre -- set interface $GRE options=remote_ip=$MIRROR_HOST

and mirror creation

ovs-vsctl --id=@p get port $GRE \
-- --id=@m create mirror name=$OVS_MIRROR select-all=true output-port=@p \
-- set bridge $BRIDGE mirrors=@m

Example output of bridge state

[root@sm15 ~]# ovs-vsctl show
f4eabec7-0b73-4a45-9024-a640953fde18
    Bridge ovsbr0
        datapath_type: netdev
        Port ovsbr0
            tag: 741
            Interface ovsbr0
                type: internal
        Port gre1
            Interface gre1
                type: gre
                options: {remote_ip="10.0.1.15"}
        Port bond0
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:01:00.0"}
            Interface dpdk1
                type: dpdk
                options: {dpdk-devargs="0000:01:00.1"}
    ovs_version: "3.6.0-12.el9fdp"

I tried setting that in the ovs.iface section, but it looks like this is for physical interfaces

        port:
          ovsbr0:
            set:
              - tag: 741
        iface:
          ovsbr0:
            set:
              - mtu_request: 1500
          gre1:
            set:
              - type: >-
                  gre options:remote_ip=sm14 -- --id@p get port gre1 -- --id=@m create mirror name=m0
                  select-all=true output-port=@p -- set bridge ovsbr0 mirrors=@m

which yields this error

TASK [opennebula.deploy.openvswitch : Query udev for PCI addresses] **************************************************************************************************************
Tuesday 07 April 2026  12:46:25 +0200 (0:00:00.355)       0:00:09.019 *********
fatal: [sm15]: FAILED! =>
    changed: false
    cmd:
    - udevadm
    - info
    - --query=property
    - --property=ID_PATH
    - --value
    - -p
    - /sys/class/net/gre1
    delta: '0:00:00.008522'
    end: '2026-04-07 10:46:25.909986'
    msg: non-zero return code
    rc: 1
    start: '2026-04-07 10:46:25.901464'
    stderr: 'Unknown device "/sys/class/net/gre1": No such device'
    stderr_lines: <omitted>
    stdout: ''
    stdout_lines: <omitted>

So, in general, we need the ability to add GRE mirrors, and it looks like the jinja logic doesn't handle mirroring.

I propose this interface to address this. Does it make sense?

First we declare the gre port as an interface. This requires the current logic to skip the udev query

      ovs:
        iface:
          gre1:
            set:
              - type: gre
              - options: remote_ip=10.0.1.15

then we declare the mirror

      ovs:
        mirror:
          m0:
            set:
              - output-port: gre1
              - select_all: true
            bridge: ovsbr0

The mirror cannot be created without adding it to a bridge in the same command; otherwise it would be possible to declare it in the bridge set

        br:
          ovsbr0:
            ports: [bond0, gre1]
            set:
              - datapath_type: netdev
              - mirrors: m0

@sk4zuzu
Collaborator Author

sk4zuzu commented Apr 8, 2026

@dann1 the first problem is just a mistake, I believe this worked correctly at the beginning, but then I modified the code and probably missed it. Second problem with the udev error, the code simply needs a better whitelist of interface types that are not purely OVS resources to do checks only for them. Third problem "The mirror cannot be created without adding it to a bridge in the same command", then yes your suggestion makes sense to implement the ovs.mirror thing. 👍

@dann1
Collaborator

dann1 commented Apr 8, 2026

There is also an edge case with DPDK. For Mellanox cards, the mlx5_core driver can be used for DPDK interfaces. If the device is bound to vfio-pci instead, adding the port outputs errors

[root@sm15 ~]# ovs-vsctl add-br ovsbr0 -- set bridge ovsbr0 datapath_type=netdev
[root@sm15 ~]# ovs-vsctl add-port ovsbr0 vmnic2 -- set Interface vmnic2 type=dpdk options:dpdk-devargs=0000:81:00.0
[root@sm15 ~]# ovs-vsctl show
f4eabec7-0b73-4a45-9024-a640953fde18
    Bridge ovsbr0
        datapath_type: netdev
        Port ovsbr0
            Interface ovsbr0
                type: internal
        Port vmnic2
            Interface vmnic2
                type: dpdk
                options: {dpdk-devargs="0000:81:00.0"}
    ovs_version: "3.6.0-12.el9fdp"
[root@sm15 ~]# ovs-vsctl add-port ovsbr0 paco111 -- set Interface paco111 type=dpdk options:dpdk-devargs=0000:81:00.1
[root@sm15 ~]# ovs-vsctl show
f4eabec7-0b73-4a45-9024-a640953fde18
    Bridge ovsbr0
        datapath_type: netdev
        Port ovsbr0
            Interface ovsbr0
                type: internal
        Port paco111
            Interface paco111
                type: dpdk
                options: {dpdk-devargs="0000:81:00.1"}
        Port vmnic2
            Interface vmnic2
                type: dpdk
                options: {dpdk-devargs="0000:81:00.0"}
    ovs_version: "3.6.0-12.el9fdp"
[root@sm15 ~]# dpdk-devbind.py --status-dev net

Network devices using kernel driver
===================================
0000:01:00.0 'BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller 16d8' numa_node=0 if=vmnic0 drv=bnxt_en unused=vfio-pci *Active*
0000:01:00.1 'BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller 16d8' numa_node=0 if=vmnic3 drv=bnxt_en unused=vfio-pci
0000:81:00.0 'MT27710 Family [ConnectX-4 Lx] 1015' numa_node=0 if=vmnic2 drv=mlx5_core unused=vfio-pci
0000:81:00.1 'MT27710 Family [ConnectX-4 Lx] 1015' numa_node=0 if=paco111 drv=mlx5_core unused=vfio-pci
[root@sm15 ~]# ovs-vsctl del-port ovsbr0 vmnic2
[root@sm15 ~]# dpdk-devbind.py -b vfio-pci 0000:81:00.0
[root@sm15 ~]# ovs-vsctl add-port ovsbr0 vmnic2 -- set Interface vmnic2 type=dpdk options:dpdk-devargs=0000:81:00.0
ovs-vsctl: Error detected while setting up 'vmnic2': vmnic2: Cannot get flow control parameters: Input/output error.  See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch".
[root@sm15 ~]# ovs-vsctl show
f4eabec7-0b73-4a45-9024-a640953fde18
    Bridge ovsbr0
        datapath_type: netdev
        Port ovsbr0
            Interface ovsbr0
                type: internal
        Port vmnic2
            Interface vmnic2
                type: dpdk
                options: {dpdk-devargs="0000:81:00.0"}
                error: "vmnic2: Cannot get flow control parameters: Input/output error"
        Port paco111
            Interface paco111
                type: dpdk
                options: {dpdk-devargs="0000:81:00.1"}
    ovs_version: "3.6.0-12.el9fdp"
[root@sm15 ~]#

So maybe we can do some driver detection prior to the binding and skip it if the driver is DPDK capable. As of now we only know this is an issue with the mlx5_core driver.
Or have a driver established in the interface options in order to control the binding, like

          vmnic0:
            set:
              - type: dpdk
              - options:dpdk-devargs: '0000:01:00.0'
              - mtu_request: 9126
            driver: vfio-pci
          paco111:
            set:
              - type: dpdk
              - options:dpdk-devargs: '0000:81:00.0'
              - mtu_request: 9126
            driver: mlx5_core
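
The driver detection idea could be sketched as below. This is an illustration under assumptions, not code from the role: `current_pci_driver` and `needs_vfio_bind` are hypothetical helper names, the sysfs root is passed as a parameter so the logic can be tried against a fake tree, and mlx5_core is the only driver listed because it is the only one known to be affected so far.

```shell
#!/usr/bin/env bash
# current_pci_driver SYSFS_ROOT PCI_ADDR
# Prints the name of the kernel driver bound to the device, or nothing
# if no driver is bound. On a real host SYSFS_ROOT would be /sys.
current_pci_driver() {
    local link="$1/bus/pci/devices/$2/driver"
    [ -L "$link" ] || return 0
    basename "$(readlink "$link")"
}

# needs_vfio_bind DRIVER
# Returns 0 (true) unless the driver is known to work with DPDK while bound.
needs_vfio_bind() {
    case "$1" in
        mlx5_core) return 1 ;;  # per this thread: keep it bound
        *)         return 0 ;;
    esac
}

# Demo against a throwaway fake sysfs tree (addresses taken from the thread).
demo_root=$(mktemp -d)
trap 'rm -rf "$demo_root"' EXIT
mkdir -p "$demo_root/bus/pci/devices/0000:81:00.0" \
         "$demo_root/bus/pci/devices/0000:01:00.0" \
         "$demo_root/bus/pci/drivers/mlx5_core" \
         "$demo_root/bus/pci/drivers/bnxt_en"
ln -s ../../drivers/mlx5_core "$demo_root/bus/pci/devices/0000:81:00.0/driver"
ln -s ../../drivers/bnxt_en   "$demo_root/bus/pci/devices/0000:01:00.0/driver"

for addr in 0000:81:00.0 0000:01:00.0; do
    drv=$(current_pci_driver "$demo_root" "$addr")
    if needs_vfio_bind "$drv"; then
        echo "$addr ($drv): would bind to vfio-pci"
    else
        echo "$addr ($drv): would skip binding"
    fi
done
```

An explicit per-interface driver key, as proposed in the YAML above, would simply take precedence over whatever such a detection step returns.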

@sk4zuzu
Collaborator Author

sk4zuzu commented Apr 8, 2026

@dann1 I see, so it's a similar issue to the one with SR-IOV, so unhardcoding the driver is required I guess. 🤔
So yes, the ability to override the driver is a good idea imo. 👍

@sk4zuzu
Collaborator Author

sk4zuzu commented Apr 9, 2026

@dann1 Please take a look at the last 3 commits, they should be handling:

the first problem is just a mistake, I believe this worked correctly at the beginning, but then I modified the code and probably missed it.

Second problem with the udev error, the code simply needs a better whitelist of interface types that are not purely OVS resources to do checks only for them.

Or have a driver established in the interface options in order to control the binding

For the Mellanox card you can use driver: omit I guess 🤔, this is modelled after #177.

I'll try to implement the ovs.mirror thing next. 👌 😇

@rsmontero
Member

Mirroring port traffic in the OVS should be implemented at the API level / ovs driver and not in the configuration phase. I will skip this in one-deploy, at least for now.

@dann1
Copy link
Copy Markdown
Collaborator

dann1 commented Apr 14, 2026

Systemd unit doesn't fail anymore 👌 Some more feedback.

Checksum offload

I see the following in the jinja scripting logic

{% for br in _dpdk_br %}
log 'Disabling checksum offloading for {{ br }}'
ethtool --offload '{{ br }}' tx off rx off || log 'WARNING: Failed to disable checksum offloading for {{ br }}'
{% endfor %}

While testing before the ovs role automation existed, we didn't need this change. Is there some unwanted behavior when not disabling the offloading in the DPDK bridge? We did, however, find a seemingly related problem.

When consuming DPDK bridges with regular OVS system interfaces (instead of vhostuserclient interfaces), that is, consuming the OVS virtual network backed by a DPDK bridge but with BRIDGE_TYPE=openvswitch, we had to run the same command within the Guest OS of the VM so that TCP connections could be established to the Guest OS. However, these TCP issues only appeared with this sort of "mixed" datapath.

NetworkManager and Netplan

For the use case that warranted this OVS role, disabling the network configuration logic is perfect. In this case the management interfaces are the same ones becoming bonds.

However, if we don't declare ipv4 configuration over the bridge, then there is no need to disable the OS network management logic. If the server management interfaces are not used as OVS bridge interfaces, then the host becomes unreachable on the next power cycle.

[root@sm15 ~]# ovs-vsctl show
f4eabec7-0b73-4a45-9024-a640953fde18
    Bridge ovsbr0
        datapath_type: netdev
        Port bond0
            Interface dpdk1
                type: dpdk
                options: {dpdk-devargs="0000:01:00.1"}
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:01:00.0"}
        Port ovsbr0
            tag: 741
            Interface ovsbr0
                type: internal
    ovs_version: "3.6.0-12.el9fdp"
[root@sm15 ~]# systemctl status opennebula-ovs.service
● opennebula-ovs.service - OVS Bridge Interface Network configuration
     Loaded: loaded (/etc/systemd/system/opennebula-ovs.service; enabled; preset: disabled)
     Active: active (exited) since Tue 2026-04-14 10:56:39 UTC; 28min ago
    Process: 2299279 ExecStart=/usr/local/sbin/opennebula-ovs.sh (code=exited, status=0/SUCCESS)
   Main PID: 2299279 (code=exited, status=0/SUCCESS)
        CPU: 628ms

Apr 14 10:56:39 sm15.local ovs-vsctl[2299398]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -- --if-exists set Interface ovsbr0 mtu_request=1500
Apr 14 10:56:39 sm15.local opennebula-ovs.sh[2299279]: Reconfiguring interface dpdk0
Apr 14 10:56:39 sm15.local ovs-vsctl[2299399]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -- --if-exists set Interface dpdk0 type=dpdk -- --if-exists set Interface dpdk0 options:d>
Apr 14 10:56:39 sm15.local opennebula-ovs.sh[2299279]: Reconfiguring interface dpdk1
Apr 14 10:56:39 sm15.local ovs-vsctl[2299400]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -- --if-exists set Interface dpdk1 type=dpdk -- --if-exists set Interface dpdk1 options:d>
Apr 14 10:56:39 sm15.local opennebula-ovs.sh[2299279]: Disabling checksum offloading for ovsbr0
Apr 14 10:56:39 sm15.local opennebula-ovs.sh[2299279]: Flushing ovsbr0
Apr 14 10:56:39 sm15.local opennebula-ovs.sh[2299279]: Bringing up ovsbr0
Apr 14 10:56:39 sm15.local opennebula-ovs.sh[2299279]: OVS network configuration completed successfully
Apr 14 10:56:39 sm15.local systemd[1]: Finished OVS Bridge Interface Network configuration.
[root@sm15 ~]# ip addr show vmnic2
5: vmnic2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 7c:c2:55:88:48:fa brd ff:ff:ff:ff:ff:ff
    inet 10.0.1.16/24 brd 10.0.1.255 scope global noprefixroute vmnic2
       valid_lft forever preferred_lft forever
    inet6 fe80::7ec2:55ff:fe88:48fa/64 scope link
       valid_lft forever preferred_lft forever
[root@sm15 ~]# nmcli con show vmnic2
Error: NetworkManager is not running.

Do you think it is reasonable to avoid this deactivation if no addrs|gw|dns is declared in the inventory?

Permissions

In rhel, the openvswitch daemon runs as a dedicated user. When creating VMs with NICs in a DPDK bridge, SELinux denies libvirt the socket creation and openvswitch the socket connection.
To address this we had to label the filesystem where the socket resides, apply a custom selinux module and issue posix acls. We also made some changes in 7.2 regarding where these sockets are stored to avoid placing them on a system datastore mountpoint. In the 7.2 packaging we are also creating this directory in the install scripts

# Create DPDK vhost sockets directory
mkdir -p /var/lib/one/vhost-sockets
chown oneadmin:oneadmin /var/lib/one/vhost-sockets

In the project-specific logic we made a compatible-ish ovs role that runs these tasks. It also consumes a helper role that allows injecting selinux modules and labels. Since we need this to make VMs over DPDK work in rhel, we need this logic in the role as well.

Would you like me to upload this ansible logic to a branch for a reference of the needed changes ?

@sk4zuzu
Collaborator Author

sk4zuzu commented Apr 14, 2026

@dann1 Hi, thank you for feedback! 👍 😍

  1. Disabling checksum offloading is required for Debian/Ubuntu OVS/DPDK; for some reason OVS/DPDK coming from RHEL has this sorted out out-of-the-box. If this is not done in Debian/Ubuntu then only ICMP works and every other packet is rejected because of an invalid checksum. I guess we can do this only where it's really required. If you have a better suggestion on what can be done to make it work without ethtool, pls let me know 🙏 🥹 (that's the only way I've found)
  2. NetworkManager/Netplan I think you're right, it could be suppressed, it would be like an additional mixed mode. 🤔
  3. Yes please.
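
Limiting the ethtool workaround from item 1 to Debian-like hosts might look roughly like this. It is only a sketch under assumptions: `is_debian_like` is a hypothetical helper, the detection relies on the `ID` and `ID_LIKE` fields of os-release, and the role may well make this decision on the Ansible side instead.

```shell
#!/usr/bin/env bash
# is_debian_like [OS_RELEASE_FILE]
# Returns 0 if the os-release file describes Debian or a derivative.
is_debian_like() {
    local file="${1:-/etc/os-release}"
    [ -r "$file" ] || return 1
    # Source in a subshell so the variables do not leak into the caller.
    (
        ID='' ID_LIKE=''
        . "$file"
        case " $ID $ID_LIKE " in
            *" debian "*) exit 0 ;;
            *)            exit 1 ;;
        esac
    )
}

# Demo with two made-up os-release files.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
printf 'ID=ubuntu\nID_LIKE=debian\n' > "$demo_dir/ubuntu"
printf 'ID="almalinux"\nID_LIKE="rhel centos fedora"\n' > "$demo_dir/alma"

for f in ubuntu alma; do
    if is_debian_like "$demo_dir/$f"; then
        echo "$f: would run: ethtool --offload <bridge> tx off rx off"
    else
        echo "$f: would leave checksum offloading untouched"
    fi
done
```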

We actually have a more serious problem it seems: another needed change is integration with the infra playbook, as there's a scenario where FE VMs are expected to use <interface type='vhostuser'>, but the OVS/DPDK config does not exist at this point. I think we need to split this role in a similar way to how the pci role is split, or just move it entirely to the pre playbook. So despite having been approved, the role needs more work. 👌 😇

@sk4zuzu
Collaborator Author

sk4zuzu commented Apr 14, 2026

There is yet another problem of conflicting OVS/DPDK and SR-IOV roles when their configs point to overlapping PCI devices... 😞 This one is a bit awkward to solve.

sk4zuzu added 8 commits April 14, 2026 17:58
- Handle both kernel and DPDK networking in OVS
- Allow for creation of arbitrary number of bonds and bridges
- Assert that 'ovs' structure is valid before applying any changes
- Fix up checksum offloading for Debian-like distros
- Make OVS configuration persistent (across reboots)

Signed-off-by: Michal Opala <sk4zuzu@gmail.com>
- Use external_ids to mark and manage only subset of OVS resources
- Auto-cleanup OVS ports and bridges managed by one-deploy
- Allow for interface and bridge re-configurations (limited)
- Re-assembly bond resources on interface changes (fix)
- Make it possible to move interfaces between different OVS bridges / bonds
- Improve readability of opennebula-ovs.sh.jinja template

Signed-off-by: Michal Opala <sk4zuzu@gmail.com>
- Allow for setting options on "internal" ports
- Recognize "internal" ports automatically matching br names
- Update README.md to include "internal" port example
Signed-off-by: Michal Opala <sk4zuzu@gmail.com>
- Add ovs.port dictionary
- Remove all/ALL prefixes from variable names and messages

Signed-off-by: Michal Opala <sk4zuzu@gmail.com>
Signed-off-by: Michal Opala <sk4zuzu@gmail.com>
Signed-off-by: Michal Opala <sk4zuzu@gmail.com>
Signed-off-by: Michal Opala <sk4zuzu@gmail.com>
@dann1
Collaborator

dann1 commented Apr 15, 2026

  1. For rhel this change didn't affect (at a glance) the VMs' connectivity. Since it looks like it depends on the distribution, I think we can keep it. Maybe signal it with a variable. While diagnosing the TCP issues I found this TSO doc in OVS. It is prefaced as experimental, but it suggests that TSO can be delegated to OVS. I'll tinker a bit more with it on rhel ovs + mixed datapath. ethtool was the only way I found as well.
  2. Uploaded the changes to the selinux branch. It worked for us, but perhaps the implementation could be done better. It is not linked to the main playbook here, has some dedicated test playbooks.

We actually have a more serious problem it seems: another needed change is integration with the infra playbook, as there's a scenario where FE VMs are expected to use <interface type='vhostuser'>, but the OVS/DPDK config does not exist at this point. I think we need to split this role in a similar way to how the pci role is split, or just move it entirely to the pre playbook. So despite having been approved, the role needs more work. 👌 😇

In this case we can assume the bridge already exists (someone did manual configuration that one-deploy would do), and verify it with an assertion. But yes, this is more of an infra role and ideally we would have it at an early execution to avoid chicken and egg. Agree.

There is yet another problem of conflicting OVS/DPDK and SR-IOV roles when their configs point to overlapping PCI devices... 😞 This one is a bit awkward to solve.

What is the issue here ? AFAIK, the PCI device would either be used for extracting VFs, PT or DPDK

@sk4zuzu
Collaborator Author

sk4zuzu commented Apr 15, 2026

The issue is that PCI passthrough, both in one-deploy and in OpenNebula itself, uses wildcards, so you can effectively set the driver for a range of devices. On top of that you can then run OVS/DPDK and claim the same PCI devices twice (by mistake). If a device is already in use, either driverctl or dpdk-devbind.py may hang forever, which is a horrible UX.

I think the fix here should be that the PCI passthrough part from the helper/pci role is executed first; then we compare findings from the lspci_devices fact (produced by the helper/pci role) with what the user declared for dpdk and fail early without executing any operations on any devices. 🤔

Or we could remove driver management from ovs role and rely on pci role exclusively.

Or finally maybe removing the --force flag from invocation of dpdk-devbind.py would be the lesser evil.

WDYT? 🤔
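
The fail-early comparison could be sketched like this. It is an illustration only: `overlapping_pci` is a hypothetical helper, and the two address lists are made-up stand-ins for the devices claimed through the lspci_devices fact and the devices declared for DPDK. It assumes any wildcards have already been expanded into concrete PCI addresses.

```shell
#!/usr/bin/env bash
# overlapping_pci LIST_A LIST_B
# Each argument is a whitespace-separated list of PCI addresses.
# Prints the addresses present in both lists, one per line, sorted.
overlapping_pci() {
    # De-duplicate each list on its own, so any address that still
    # appears twice in the merged stream must come from both lists.
    {
        printf '%s\n' "$1" | tr -s '[:space:]' '\n' | sort -u
        printf '%s\n' "$2" | tr -s '[:space:]' '\n' | sort -u
    } | sed '/^$/d' | sort | uniq -d
}

passthrough_devs='0000:01:00.0 0000:01:00.1 0000:81:00.0'
dpdk_devs='0000:81:00.0 0000:81:00.1'

conflicts=$(overlapping_pci "$passthrough_devs" "$dpdk_devs")
if [ -n "$conflicts" ]; then
    echo "ERROR: PCI devices claimed twice:" $conflicts >&2
    # A real script would exit non-zero here, before touching any device.
fi
```

Running the check before any driverctl or dpdk-devbind.py call is the point of the design: nothing is rebound until the two declarations are known to be disjoint.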

sk4zuzu added 2 commits April 15, 2026 16:44
- Move 'global' repository management to pre.yml
- Rely on already existing per-role repository management
- Use helper/pci role to pre-configure PCI/SR-IOV early
- Move openvswitch role invocation to pre.yml

Signed-off-by: Michal Opala <sk4zuzu@gmail.com>
Signed-off-by: Michal Opala <sk4zuzu@gmail.com>
Signed-off-by: Michal Opala <sk4zuzu@gmail.com>
@sk4zuzu sk4zuzu force-pushed the add-openvswitch-role branch from 9088758 to 4cb38e2 Compare April 16, 2026 09:44
sk4zuzu added 2 commits April 16, 2026 12:30
Signed-off-by: Michal Opala <sk4zuzu@gmail.com>
Signed-off-by: Michal Opala <sk4zuzu@gmail.com>
@sk4zuzu
Collaborator Author

sk4zuzu commented Apr 16, 2026

@dann1 Hi :)

NetworkManager/Netplan I think you're right, it could be suppressed, it would be like an additional mixed mode.

I handled this differently: after some testing I think it's better to move exclusively to OVS (I don't want to maintain a double-standard kind of solution, as I would have to make it reversible both ways, which is a small nightmare 🤗), and just require the user to define some IP address -> 3421a66

Uploaded the changes to the selinux branch. It worked for us, but perhaps the implementation could be done better. It is not linked to the main playbook here, has some dedicated test playbooks.

I looked at this one, I think we'd do this in another PR. 🤔

There is yet another problem of conflicting OVS/DPDK and SR-IOV roles when their configs point to overlapping PCI devices... 😞 This one is a bit awkward to solve.

I think this commit 9acb255 should help 👍😇

We actually have a more serious problem it seems: another needed change is integration with the infra playbook, as there's a scenario where FE VMs are expected to use <interface type='vhostuser'>, but the OVS/DPDK config does not exist at this point.

I managed to make it so all non-nebula PCI/SR-IOV and OVS/DPDK code is run from the pre playbook, which helps with testing and sets the stage for infra role.

@dann1
Collaborator

dann1 commented Apr 17, 2026

I handled this differently: after some testing I think it's better to move exclusively to OVS (I don't want to maintain a double-standard kind of solution, as I would have to make it reversible both ways, which is a small nightmare 🤗), and just require the user to define some IP address -> 3421a66

Host connectivity can still be borked if you give it a placeholder IP, but it seems super reasonable to avoid maintenance nightmares. In the role doc maybe we should state that using this role means handing over the IP configuration to opennebula-ovs.

OK, since the selinux changes will come in another PR, I think we are good to merge. Do you want an issue open for selinux or will you make a PR directly? I just don't want it to be forgotten since VMs need this.

Thank you VERY much for your patience 🤗

Signed-off-by: Michal Opala <sk4zuzu@gmail.com>

4 participants