The openstack-operator automates the deployment of an OpenStack dataplane. A dataplane is a collection of nodes that will be used for hosting OpenStack workloads. The openstack-operator prepares the nodes with enough operating system configuration so that they are ready for hosting other required OpenStack services and workloads.

See contributing for notes for developers and contributors, running the operator, building the documentation, etc.

See design for details about the dataplane design.

Creating a DataPlane documents how to create a dataplane.

The documentation source is kept within the openstack-operator repo in the docs directory. The full generated documentation from that source is available at https://openstack-k8s-operators.github.io/openstack-operator/.

Data Plane Design

The openstack-operator provisions and configures nodes that make up the OpenStack data plane. The data plane consists of nodes that host end-user workloads and applications. Depending on the OpenStack deployment, these data plane nodes are often compute nodes, but may also be storage nodes, networker nodes, or other types of nodes.

The openstack-operator provides a Kubernetes-like abstraction and API for deploying the data plane. It uses the openstack-baremetal-operator to optionally provision bare metal nodes. It then creates Kubernetes jobs that execute Ansible to deploy, configure, and orchestrate software on the nodes. The software is typically RPM-based or container-based, with containers run by the podman container runtime.

External Data Plane Management (EDPM) is the concept of using Ansible in this manner to configure software on data plane nodes. Ansible is used instead of the native Kubernetes workload APIs (Deployment, Job, Pod, and so on) and kubelet. While the Ansible executions themselves run on the Kubernetes cluster as native Kubernetes workloads, they communicate with the data plane nodes over SSH and use various Ansible modules to deploy software on those nodes.

CRD Design and Resources

The openstack-operator exposes the concepts of OpenStackDataPlaneNodeSets, OpenStackDataPlaneServices, and OpenStackDataPlaneDeployments as CRDs:

The OpenStackDataPlaneNodeSet CRD is used to describe a logical grouping of nodes of a similar type. A node can only be defined in one NodeSet. This is analogous to the concept of "roles" in TripleO. An OpenStack data plane is likely to consist of multiple OpenStackDataPlaneNodeSet resources to describe groups of nodes that are configured differently.

Similarities within an OpenStackDataPlaneNodeSet are defined by the user, and could be of a small scope (Ansible port), or a large scope (same network config, nova config, provisioning config, etc). The properties that all nodes in an OpenStackDataPlaneNodeSet share are set in the nodeTemplate field of the OpenStackDataPlaneNodeSet spec. Node-specific parameters are then defined under the nodes section, in the entry for that node. Options defined there override the values inherited from the nodeTemplate.
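
The override behavior described above can be pictured with a small Python sketch. This is only an illustration of the idea, not the operator's implementation: the function name, the variable values, and the assumption of a simple shallow merge are all hypothetical.

```python
def effective_ansible_vars(template_vars, node_vars):
    """Layer node-specific variables over the shared nodeTemplate variables."""
    merged = dict(template_vars)  # start from the values every node inherits
    merged.update(node_vars)      # node-specific values win on conflict
    return merged


# Hypothetical values, chosen only for illustration.
template_vars = {"ansible_port": 22, "ctlplane_mtu": 1500}
node_vars = {"ctlplane_ip": "192.168.122.100", "ctlplane_mtu": 9000}

print(effective_ansible_vars(template_vars, node_vars))
```

In this sketch the node keeps the inherited ansible_port, gains its own ctlplane_ip, and replaces the inherited ctlplane_mtu with its own value.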

Dividing and assigning nodes to different OpenStackDataPlaneNodeSets is a design decision by the user. Nodes that are configured mostly the same, are of the same hardware, and serve the same purpose are likely candidates for being in the same OpenStackDataPlaneNodeSet. Hardware differences or differences in purpose (compute vs. networker), on the other hand, lend themselves to nodes being in different OpenStackDataPlaneNodeSets.

OpenStackDataPlaneNodeSet implements a baremetal provisioning interface to provision the nodes if requested. The baremetalSetTemplate field is used to describe the baremetal configuration of the nodes and is used to provision the initial OS on the set of nodes.

The OpenStackDataPlaneService CRD is an abstraction that combines Ansible content and configuration from Kubernetes ConfigMaps and Secrets. The Ansible content is typically a playbook from edpm-ansible, but can be any Ansible play content. The ConfigMaps and Secrets are typically generated from OpenStack control plane operators, but could be any configuration data that needs to be consumed by the Ansible content.

An OpenStackDataPlaneNodeSet has a list of services that contain the OpenStackDataPlaneService resources for the nodes in that OpenStackDataPlaneNodeSet. Using the services list, users can customize the software that is deployed on the OpenStackDataPlaneNodeSet nodes.

The OpenStackDataPlaneDeployment CRD is used to start an Ansible execution for the list of OpenStackDataPlaneNodeSets on the OpenStackDataPlaneDeployment. Each OpenStackDataPlaneDeployment models a single Ansible execution, and once the execution is successful, the OpenStackDataPlaneDeployment does not automatically execute Ansible again, even if the OpenStackDataPlaneDeployment or related OpenStackDataPlaneNodeSet resources are changed. In order to start another Ansible execution, another OpenStackDataPlaneDeployment resource needs to be created. In this manner, the user maintains explicit control over when Ansible actually executes through the creation of OpenStackDataPlaneDeployment resources.

Creating the data plane

The OpenStack DataPlane consists of CentOS nodes. Use the OpenStackDataPlaneNodeSet custom resource definition (CRD) to create the custom resources (CRs) that define the nodes and the layout of the data plane. You can use pre-provisioned nodes, or provision bare metal nodes as part of the data plane creation and deployment process.

To create and deploy a data plane, you must perform the following tasks:

  1. Create a Secret CR for Ansible to use to execute commands on the data plane nodes.

  2. Create the OpenStackDataPlaneNodeSet CRs that define the nodes and layout of the data plane.

  3. Create the OpenStackDataPlaneDeployment CRs that trigger the Ansible execution to deploy and configure software.

Prerequisites

  • A functional control plane, created with the OpenStack Operator.

  • Pre-provisioned nodes must be configured with an SSH public key in the $HOME/.ssh/authorized_keys file for a user with passwordless sudo privileges.

  • For bare metal nodes that are not pre-provisioned and must be provisioned when creating the OpenStackDataPlaneNodeSet resource:

    • Cluster Baremetal Operator (CBO) is installed and configured for provisioning.

    • BareMetalHost resources are registered, inspected, and have the label app:openstack.

  • You are logged on to a workstation that has access to the RHOCP cluster as a user with cluster-admin privileges.

  • OpenShift CLI (oc) 4.14 or higher

Creating the SSH key secrets

You must generate SSH keys and create an SSH key Secret custom resource (CR) for each key to enable the following functionality:

  • Ansible management of the CentOS nodes on the data plane. Ansible executes commands with this user and key.

  • Migration of instances between Compute nodes.

The Secret CRs are used by the data plane nodes to enable secure access between nodes.

Procedure
  1. Create the SSH key pair for Ansible:

    $ KEY_FILE_NAME=<key_file_name>
    $ ssh-keygen -f $KEY_FILE_NAME -N "" -t rsa -b 4096
    • Replace <key_file_name> with the name to use for the key pair.

  2. Create the Secret CR for Ansible and apply it to the cluster:

    $ SECRET_NAME=<secret_name>
    $ oc create secret generic $SECRET_NAME \
    --save-config \
    --dry-run=client \
    [--from-file=authorized_keys=$KEY_FILE_NAME.pub \]
    --from-file=ssh-privatekey=$KEY_FILE_NAME \
    --from-file=ssh-publickey=$KEY_FILE_NAME.pub \
    -n openstack \
    -o yaml | oc apply -f-
    • Replace <secret_name> with the name you want to use for the Secret resource.

    • Include the --from-file=authorized_keys option for bare metal nodes that must be provisioned when creating the data plane.

  3. Create the SSH key pair for instance migration:

    $ ssh-keygen -f ./id -t ecdsa-sha2-nistp521 -N ''
  4. Create the Secret CR for migration and apply it to the cluster:

    $ oc create secret generic nova-migration-ssh-key \
    --save-config \
    --dry-run=client \
    --from-file=ssh-privatekey=id \
    --from-file=ssh-publickey=id.pub \
    -n openstack \
    -o yaml | oc apply -f-
  5. Verify that the Secret CRs are created:

    $ oc describe secret $SECRET_NAME

Creating a set of data plane nodes

You use the OpenStackDataPlaneNodeSet CRD to define the data plane and the data plane nodes. An OpenStackDataPlaneNodeSet custom resource (CR) represents a set of nodes of the same type that have similar configuration, comparable to the concept of a "role" in a director-deployed Red Hat OpenStack Platform (RHOSP) environment.

Create an OpenStackDataPlaneNodeSet CR for each logical grouping of nodes in your data plane, for example, nodes grouped by hardware, location, or networking. You can define as many node sets as necessary for your deployment. Each node can be included in only one OpenStackDataPlaneNodeSet CR. Each node set can be connected to only one Compute cell. By default, node sets are connected to cell1. If your control plane includes additional Compute cells, you must specify the cell to which the node set is connected.
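
Because each node can be included in only one node set, it can be useful to check a planned layout before writing the CRs. The following Python sketch is one way to do that on a workstation. The node set names and the dictionary layout are hypothetical, and the sketch compares node names only; it does not reproduce how the operator itself validates node sets.

```python
from collections import defaultdict


def nodes_in_multiple_sets(node_sets):
    """Return {node: [node set names]} for nodes listed in more than one set."""
    owners = defaultdict(list)
    for set_name, nodes in node_sets.items():
        for node in nodes:
            owners[node].append(set_name)
    return {node: sets for node, sets in owners.items() if len(sets) > 1}


# Hypothetical planned layout: node set name -> node names.
planned = {
    "compute-a": ["edpm-compute-0", "edpm-compute-1"],
    "compute-b": ["edpm-compute-1", "edpm-compute-2"],
}

for node, sets in nodes_in_multiple_sets(planned).items():
    print(f"{node} is listed in more than one node set: {', '.join(sets)}")
```

An empty result means that no node name is shared between the planned node sets.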

Procedure
  1. Create an OpenStackDataPlaneNodeSet CR and save it to a file named openstack-edpm.yaml on your workstation:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    metadata:
      name: openstack-edpm-ipam
    spec:
      ...
  2. The sample OpenStackDataPlaneNodeSet CR is connected to cell1 by default. If you added additional Compute cells to the control plane and you want to connect the node set to one of the other cells, then you must create a custom service for the node set that includes the Secret CR for the cell:

    1. Create a custom nova service that includes the Secret CR for the cell to connect to:

      apiVersion: dataplane.openstack.org/v1beta1
      kind: OpenStackDataPlaneService
      metadata:
        name: nova-cell-custom
      spec:
        label: dataplane-deployment-custom-service
        playbook: osp.edpm.nova
        ...
        secrets:
          - nova-cell2-compute-config (1)
      1 The Secret CR generated by the control plane for the cell.

      For information about how to create a custom service, see Creating a custom service.

    2. Replace the nova service in your OpenStackDataPlaneNodeSet CR with your custom nova service:

      apiVersion: dataplane.openstack.org/v1beta1
      kind: OpenStackDataPlaneNodeSet
      metadata:
        name: openstack-edpm-ipam
      spec:
        services:
          - configure-network
          - validate-network
          - install-os
          - configure-os
          - run-os
          - ovn
          - libvirt
          - nova-cell-custom
          - telemetry
      Do not change the order of the default services.
  3. Update the Secret to the SSH key secret that you created to enable Ansible to connect to the data plane nodes:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    metadata:
      name: openstack-edpm-ipam
    spec:
      nodeTemplate:
        ansibleSSHPrivateKeySecret: <secret-key>
    • Replace <secret-key> with the name of the SSH key Secret CR you created in Creating the SSH key secrets, for example, dataplane-ansible-ssh-private-key-secret.

  4. Optional: Configure the node set for a Compute feature or workload. For more information, see Configuring a node set for a Compute feature or workload.

  5. Optional: The sample OpenStackDataPlaneNodeSet CR that you copied includes the minimum common configuration required for a set of nodes in this group under the nodeTemplate section. Each node in this OpenStackDataPlaneNodeSet inherits this configuration. You can edit the configured values as required, and you can add additional configuration.

    For information about the properties you can use to configure common node attributes, see OpenStackDataPlaneNodeSet CR properties.

  6. Optional: The sample OpenStackDataPlaneNodeSet CR you copied applies the single NIC VLANs network configuration by default to the data plane nodes. You can edit the template that is applied. For example, to configure the data plane for multiple NICs, copy the contents of the roles/edpm_network_config/templates/multiple_nics/multiple_nics.j2 file and add it to your openstack-edpm.yaml file:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    metadata:
      name: openstack-edpm-ipam
    spec:
      ...
      nodeTemplate:
        ...
        ansible:
          ansibleVars:
            edpm_network_config_template: |
                  ---
                  network_config:
                  - type: interface
                    name: nic1
                    mtu: {{ ctlplane_mtu }}
                    dns_servers: {{ ctlplane_dns_nameservers }}
                    domain: {{ dns_search_domains }}
                    routes: {{ ctlplane_host_routes }}
                    use_dhcp: false
                    addresses:
                    - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_subnet_cidr }}
                  {% for network in nodeset_networks %}
                  {% if network not in ["external", "tenant"] %}
                  - type: interface
                    name: nic{{ loop.index + 1 }}
                    mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
                    use_dhcp: false
                    addresses:
                    - ip_netmask:
                        {{ lookup('vars', networks_lower[network] ~ '_ip') }}/{{ lookup('vars', networks_lower[network] ~ '_cidr') }}
                    routes: {{ lookup('vars', networks_lower[network] ~ '_host_routes') }}
                  {% elif 'external_bridge' in nodeset_tags|default([]) %}
                  - type: ovs_bridge
                  {% if network == 'external' %}
                    name: {{ neutron_physical_bridge_name }}
                  {% else %}
                    name: {{ 'br-' ~ networks_lower[network] }}
                  {% endif %}
                    mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
                    dns_servers: {{ ctlplane_dns_nameservers }}
                    use_dhcp: false
                    addresses:
                    - ip_netmask:
                        {{ lookup('vars', networks_lower[network] ~ '_ip') }}/{{ lookup('vars', networks_lower[network] ~ '_cidr') }}
                    routes: {{ lookup('vars', networks_lower[network] ~ '_host_routes') }}
                    members:
                    - type: interface
                      name: nic{{ loop.index + 1 }}
                      mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
                      use_dhcp: false
                      primary: true
                  {% endif %}
                  {% endfor %}
  7. If your nodes are bare metal, you must configure the bare metal template. For more information, see Provisioning bare metal data plane nodes.

  8. Optional: The sample OpenStackDataPlaneNodeSet CR you copied includes default node configurations under the nodes section. You can add additional nodes, and edit the configured values as required. For example, to add node-specific Ansible variables that customize the node, add the following configuration to your openstack-edpm.yaml file:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneNodeSet
    metadata:
      name: openstack-edpm-ipam
    spec:
      ...
      nodeTemplate:
        ...
        ansible:
          ...
          ansibleVars:
            rhc_release: 9.2
            rhc_repositories:
                - {name: "*", state: disabled}
                - {name: "rhel-9-for-x86_64-baseos-eus-rpms", state: enabled}
                - {name: "rhel-9-for-x86_64-appstream-eus-rpms", state: enabled}
                - {name: "rhel-9-for-x86_64-highavailability-eus-rpms", state: enabled}
                - {name: "openstack-17.1-for-rhel-9-x86_64-rpms", state: enabled}
                - {name: "fast-datapath-for-rhel-9-x86_64-rpms", state: enabled}
                - {name: "openstack-dev-preview-for-rhel-9-x86_64-rpms", state: enabled}
      ...
      nodes:
        edpm-compute-0: (1)
          hostName: edpm-compute-0
          ansible:
            ansibleHost: 192.168.122.100
            ansibleVars: (2)
              ctlplane_ip: 192.168.122.100
              internalapi_ip: 172.17.0.100
              storage_ip: 172.18.0.100
              tenant_ip: 172.19.0.100
              fqdn_internalapi: edpm-compute-0.example.com
        edpm-compute-1:
          hostName: edpm-compute-1
          ansible:
            ansibleHost: 192.168.122.101
            ansibleVars:
              ctlplane_ip: 192.168.122.101
              internalapi_ip: 172.17.0.101
              storage_ip: 172.18.0.101
              tenant_ip: 172.19.0.101
              fqdn_internalapi: edpm-compute-1.example.com
    1 The node definition reference, for example, edpm-compute-0. Each node in the node set must have a node definition.
    2 Node-specific Ansible variables that customize the node.
    • Nodes defined within the nodes section can configure the same Ansible variables that are configured in the nodeTemplate section. Where an Ansible variable is configured for both a specific node and within the nodeTemplate section, the node-specific values override those from the nodeTemplate section.

    • You do not need to replicate all the nodeTemplate Ansible variables for a node to override the default and set some node-specific values. You only need to configure the Ansible variables you want to override for the node.

    For information about the properties you can use to configure node attributes, see OpenStackDataPlaneNodeSet CR properties. For example OpenStackDataPlaneNodeSet CR nodes definitions, see Example OpenStackDataPlaneNodeSet CR for pre-provisioned nodes or Example OpenStackDataPlaneNodeSet CR for bare metal nodes.

  9. Optional: Customize the container images used by the edpm-ansible roles. The following example shows the default images:

    spec:
      ...
      nodeTemplate:
        ...
        ansible:
          ...
          ansibleVars:
            edpm_iscsid_image: "quay.io/podified-antelope-centos9/openstack-iscsid:current-podified"
            edpm_logrotate_crond_image: "quay.io/podified-antelope-centos9/openstack-cron:current-podified"
            edpm_ovn_controller_agent_image: "quay.io/podified-antelope-centos9/openstack-ovn-controller:current-podified"
            edpm_ovn_metadata_agent_image: "quay.io/podified-antelope-centos9/openstack-neutron-metadata-agent-ovn:current-podified"
            edpm_frr_image: "quay.io/podified-antelope-centos9/openstack-frr:current-podified"
            edpm_ovn_bgp_agent_image: "quay.io/podified-antelope-centos9/openstack-ovn-bgp-agent:current-podified"
            telemetry_node_exporter_image: "quay.io/prometheus/node-exporter:v1.5.0"
            edpm_telemetry_kepler_image: "quay.io/sustainable_computing_io/kepler"
            edpm_libvirt_image: "quay.io/podified-antelope-centos9/openstack-nova-libvirt:current-podified"
            edpm_nova_compute_image: "quay.io/podified-antelope-centos9/openstack-nova-compute:current-podified"
            edpm_neutron_sriov_image: "quay.io/podified-antelope-centos9/openstack-neutron-sriov-agent:current-podified"
            edpm_multipathd_image: "quay.io/podified-antelope-centos9/openstack-multipathd:current-podified"
  10. Save the openstack-edpm.yaml definition file.

  11. Create the data plane resources:

    $ oc create -f openstack-edpm.yaml
  12. Verify that the data plane resources have been created:

    $ oc get openstackdataplanenodeset
    NAME                  STATUS   MESSAGE
    openstack-edpm-ipam   False    Deployment not started
  13. Verify that the Secret resource was created for the node set:

    $ oc get secret | grep openstack-edpm-ipam
    dataplanenodeset-openstack-edpm-ipam Opaque 1 3m50s
  14. Verify the services were created:

    $ oc get openstackdataplaneservice
    NAME                AGE
    configure-network   6d7h
    configure-os        6d6h
    install-os          6d6h
    run-os              6d6h
    validate-network    6d6h
    ovn                 6d6h
    libvirt             6d6h
    nova                6d6h
    telemetry           6d6h

Composable services

Composable services with openstack-operator provide a way for users to customize services that are deployed on dataplane nodes. It is possible to "compose" a set of services such that the dataplane deployment can be customized in whichever ways are needed.

Composing services can take different forms. The interfaces in openstack-operator allow for:

  • Enabling/disabling services

  • Ordering services

  • Developing custom services

For the purposes of the interfaces in openstack-operator, a service is an Ansible execution that manages a software deployment (installation, configuration, execution, etc) on dataplane nodes. The Ansible content that makes up each service is defined by the service itself. Each service is a resource instance of the OpenStackDataPlaneService CRD.

openstack-operator provided services

openstack-operator provides a default list of services that will be deployed on dataplane nodes. The services list is set on the OpenStackDataPlaneNodeSet CRD.

The default list of services as they will appear on the services field on an OpenStackDataPlaneNodeSet spec is:

services:
  - redhat
  - download-cache
  - bootstrap
  - configure-network
  - validate-network
  - install-os
  - configure-os
  - run-os
  - libvirt
  - nova
  - ovn
  - neutron-metadata
  - telemetry

If the services field is omitted from the OpenStackDataPlaneNodeSet spec, then the above list will be used.

The associated OpenStackDataPlaneService resources are reconciled during OpenStackDataPlaneNodeSet reconciliation if the service is in the NodeSet's services list.

The DataPlane Operator also includes the following services that are not enabled by default:

Service Description

ceph-client

Include this service to configure data plane nodes as clients of a Red Hat Ceph Storage server. Include between the install-os and configure-os services. The OpenStackDataPlaneNodeSet CR must include the following configuration to access the Red Hat Ceph Storage secrets:

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
spec:
  ...
  nodeTemplate:
    extraMounts:
    - extraVolType: Ceph
      volumes:
      - name: ceph
        secret:
          secretName: ceph-conf-files
      mounts:
      - name: ceph
        mountPath: "/etc/ceph"
        readOnly: true

ceph-hci-pre

Include this service to prepare data plane nodes to host Red Hat Ceph Storage in an HCI configuration. For more information, see Configuring a Hyperconverged Infrastructure environment.

configure-ovs-dpdk

Include this service to configure OvS DPDK on EDPM nodes. This service is needed to enable OvS DPDK on the compute nodes.

neutron-sriov

Include this service to run a Neutron SR-IOV NIC agent on the data plane nodes.

neutron-metadata

Include this service to run the Neutron OVN Metadata agent on the data plane nodes. This agent is required to provide metadata services to the Compute nodes.

neutron-ovn

Include this service to run the Neutron OVN agent on the data plane nodes. This agent is required to provide QoS to hardware offloaded ports on the Compute nodes.

neutron-dhcp

Include this service to run a Neutron DHCP agent on the data plane nodes.

For more information about the available default services, see https://github.com/openstack-k8s-operators/openstack-operator/tree/main/config/services.

You can enable and disable services for an OpenStackDataPlaneNodeSet resource.

Do not change the order of the default service deployments.

You can use the OpenStackDataPlaneService CRD to create custom services that you can deploy on your data plane nodes. Add each custom service to the services list at the position where it must be executed relative to the default services. For more information, see Creating a custom service.
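
When you edit a services list by hand, a quick check that the default services still appear in their original relative order can catch mistakes early. The following Python sketch assumes the default list exactly as printed earlier in this section; the function name and the hello-world service are hypothetical, and the check is not something the operator provides.

```python
# Default services, copied from the list printed earlier in this section.
DEFAULT_SERVICES = [
    "redhat", "download-cache", "bootstrap", "configure-network",
    "validate-network", "install-os", "configure-os", "run-os",
    "libvirt", "nova", "ovn", "neutron-metadata", "telemetry",
]


def keeps_default_order(services, defaults=DEFAULT_SERVICES):
    """True if the default services in `services` keep their relative order.

    Services that are not in `defaults` (custom or optional ones) are
    ignored, so they can be placed anywhere, and defaults can be left out.
    """
    positions = [defaults.index(name) for name in services if name in defaults]
    return positions == sorted(positions)


# A custom service placed first, and ceph-client placed between
# install-os and configure-os, both leave the default order intact.
custom = ["hello-world"] + DEFAULT_SERVICES
with_ceph = DEFAULT_SERVICES[:6] + ["ceph-client"] + DEFAULT_SERVICES[6:]

print(keeps_default_order(custom))
print(keeps_default_order(with_ceph))
print(keeps_default_order(["nova", "libvirt"]))
```

The first two calls print True because only non-default services were added. The last call prints False because nova is listed before libvirt.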

You can view the details of a service by viewing the YAML representation of the resource:

$ oc get openstackdataplaneservice configure-network -o yaml

Overriding services for the deployment

The list of services that will be deployed when an OpenStackDataPlaneDeployment is created is set on each OpenStackDataPlaneNodeSet that is included in the nodeSets list on the OpenStackDataPlaneDeployment.

This allows for deploying a different set of services on different OpenStackDataPlaneNodeSets using the same OpenStackDataPlaneDeployment resource simultaneously. It also maintains the association between services and nodeSets on the nodeSet itself. This association is important when nodeSets are used to group nodes with hardware and configuration differences that require deploying different services on different nodeSets.

In some specific cases, you may need to override which services are deployed on all nodeSets included in an OpenStackDataPlaneDeployment. These cases vary, but are often related to day 2 workflows such as update, upgrade, and scale out. In these cases, you may need to execute a smaller subset of services, or just a single service, across all nodeSets in the OpenStackDataPlaneDeployment.

The servicesOverride field on OpenStackDataPlaneDeployment allows for this behavior. Setting this field changes which services are deployed when the OpenStackDataPlaneDeployment is created. If the field is set, only the services listed in the field are deployed on all nodeSets.

If the deployment was configured with tlsEnabled set to True in the original OpenStackDataPlaneNodeSet CR, it is recommended to also add install-certs to the servicesOverride list. That service installs certificates that the other services may require on the destination nodes.

The following example OpenStackDataPlaneDeployment resource illustrates using servicesOverride to perform a pre-upgrade task of executing just the ovn service.

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-edpm-pre-upgrade-ovn
spec:

  nodeSets:
    - openstack-edpm

  # Only the services listed here are executed, overriding any services value
  # on the openstack-edpm nodeSet.
  # The install-certs service is added here to install certificates
  # potentially required by the ovn service.
  servicesOverride:
    - install-certs
    - ovn
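
The selection rule described in this section can be summarized in a short Python sketch. It is a simplified model under stated assumptions, not the operator's code: the function name, the node set names, and the dictionary layout are hypothetical.

```python
def services_to_deploy(node_sets, services_override=None):
    """Map each node set name to the services a deployment would execute.

    node_sets: {node set name: that node set's own services list}
    services_override: the deployment's servicesOverride list, if set
    """
    if services_override:
        # The override replaces the services list of every node set.
        return {name: list(services_override) for name in node_sets}
    # Without an override, each node set keeps its own services list.
    return {name: list(services) for name, services in node_sets.items()}


# Hypothetical node sets with different services lists.
node_sets = {
    "openstack-edpm": ["configure-network", "install-os", "ovn", "nova"],
    "openstack-edpm-networker": ["configure-network", "install-os", "ovn"],
}

print(services_to_deploy(node_sets))
print(services_to_deploy(node_sets, ["install-certs", "ovn"]))
```

In the second call every node set gets the same two services, regardless of what its own services list contains.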

Creating a custom service

You can use the OpenStackDataPlaneService CRD to create custom services to deploy on your data plane nodes.

Do not create a custom service with the same name as one of the default services. If a custom service name matches a default service name, the default service values overwrite the custom service values during OpenStackDataPlaneNodeSet reconciliation.

You specify the Ansible execution for your service with either an Ansible playbook or by including the free-form play contents directly in the spec section of the service.

You cannot use both an Ansible playbook and an Ansible play in the same service.

Procedure
  1. Create an OpenStackDataPlaneService CR and save it to a YAML file on your workstation, for example custom-service.yaml:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneService
    metadata:
      name: custom-service
    spec:
      label: dataplane-deployment-custom-service
  2. Specify the Ansible commands to create the custom service, by referencing an Ansible playbook or by including the Ansible play in the spec:

    • Specify the Ansible playbook to use:

      apiVersion: dataplane.openstack.org/v1beta1
      kind: OpenStackDataPlaneService
      metadata:
        name: custom-service
      spec:
        label: dataplane-deployment-custom-service
        playbook: osp.edpm.configure_os

    For information about how to create an Ansible playbook, see Creating a playbook.

    • Specify the Ansible play as a string that uses Ansible playbook syntax:

      apiVersion: dataplane.openstack.org/v1beta1
      kind: OpenStackDataPlaneService
      metadata:
        name: custom-service
      spec:
        label: dataplane-deployment-custom-service
        playbookContents: |
          - hosts: all
            tasks:
              - name: Hello World!
                shell: "echo Hello World!"
                register: output
              - name: Show output
                debug:
                  msg: "{{ output.stdout }}"
              - name: Hello World role
                import_role:
                  name: hello_world
  3. Optional: To override the default container image used by the ansible-runner execution environment with a custom image that uses additional Ansible content for a custom service, build and include a custom ansible-runner image. For information, see Building a custom ansible-runner image.

  4. Optional: Designate and configure a node set for a Compute feature or workload. For more information, see Configuring a node set for a Compute feature or workload.

  5. Optional: Specify DataSource resources to use to pass ConfigMaps or Secrets into the OpenStackAnsibleEE job. When the optional field is true on a DataSource configMapRef or secretRef, the resource is optional, and an error won’t occur when it doesn’t exist.

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneService
    metadata:
      name: custom-service
    spec:
      ...
      playbookContents: |
        ...
      dataSources:
        - configMapRef:
            name: hello-world-cm-0
        - secretRef:
            name: hello-world-secret-0
        - secretRef:
            name: hello-world-secret-1
            # This secret is optional, it does not need to exist.
            optional: true

    A mount is created for each ConfigMap and Secret in the OpenStackAnsibleEE pod with a filename that matches the resource value. The mounts are created under /var/lib/openstack/configs/<service name>.

  6. Optional: It may be necessary to run some services on all node sets at the same time. Set the deployOnAllNodeSets field to true for these services. If such a service is repeated in multiple node set specs included in a deployment, it runs only once and is ignored in the services lists of the subsequent node sets.

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneService
    metadata:
      name: custom-global-service
    spec:
      label: custom-global-service
      playbookContents: |
        - hosts: localhost
          gather_facts: no
          name: global play
          tasks:
            - name: Sleep
              command: sleep 1
              delegate_to: localhost
      deployOnAllNodeSets: true
  7. Optional: Specify the edpmServiceType field for the service. Different custom services may use the same Ansible content to manage the same EDPM service (such as ovn or nova). The DataSources, TLS certificates, and CA certificates need to be mounted at the same locations so they can be found by the Ansible content even when using a custom service. edpmServiceType is used to create this association. The value is the name of the default service that uses the same Ansible content as the custom service. If multiple services with the same edpmServiceType are listed in a node set or deployment spec, only the first one is used and the later ones are ignored.

    For example, a custom service that uses the edpm_ovn ansible content from edpm-ansible would set edpmServiceType to ovn, which matches the default ovn service name provided by openstack-operator.

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneService
    metadata:
      name: custom-ovn-service
    spec:
      edpmServiceType: ovn
  8. Create the custom service:

    $ oc apply -f custom-service.yaml
  9. Verify that the custom service is created:

    $ oc get openstackdataplaneservice <custom_service_name> -o yaml
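
The run-once behaviour of deployOnAllNodeSets described in step 6 can be pictured with a short Python sketch. This is not the operator's code: the function name plan_runs, the data layout, the "all-nodesets" marker, and the example nodeset names are assumptions made for illustration only.

```python
def plan_runs(nodesets, global_services):
    """Return (target, service) pairs in execution order.

    nodesets: dict mapping a nodeset name to its ordered services list.
    global_services: names of services with deployOnAllNodeSets set to true.
    A global service is planned once, the first time it is seen, and is
    skipped when it appears again in a later nodeset.
    """
    seen_global = set()
    plan = []
    for nodeset, services in nodesets.items():
        for service in services:
            if service in global_services:
                if service in seen_global:
                    continue  # already planned once for all nodesets
                seen_global.add(service)
                plan.append(("all-nodesets", service))
            else:
                plan.append((nodeset, service))
    return plan


if __name__ == "__main__":
    # Hypothetical nodesets that both list the global service.
    example = {
        "nodeset-a": ["custom-global-service", "bootstrap"],
        "nodeset-b": ["custom-global-service", "bootstrap"],
    }
    for target, service in plan_runs(example, {"custom-global-service"}):
        print(target, service)
```

In this sketch the global service appears once in the plan, while bootstrap is planned separately for each nodeset.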
Enabling a custom service

To add a custom service to be executed as part of an OpenStackDataPlaneNodeSet deployment, add the service name to the services field list on the NodeSet. Add the service name in the order that it should be executed relative to the other services. This example shows adding the hello-world service as the first service to execute for the openstack-edpm NodeSet.

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm
spec:
  services:
    - hello-world
    - redhat
    - download-cache
    - bootstrap
    - configure-network
    - validate-network
    - install-os
    - configure-os
    - run-os
    - ovn
    - neutron-metadata
    - libvirt
    - nova
  nodes:
    edpm-compute:
      ansible:
        ansibleHost: 172.20.12.67
        ansibleSSHPrivateKeySecret: dataplane-ansible-ssh-private-key-secret
        ansibleUser: cloud-admin
        ansibleVars:
          ansible_ssh_transfer_method: scp
          ctlplane_ip: 172.20.12.67
          external_ip: 172.20.12.76
          fqdn_internalapi: edpm-compute-1.example.com
          internalapi_ip: 172.17.0.101
          storage_ip: 172.18.0.101
          tenant_ip: 172.10.0.101
      hostName: edpm-compute-0
      networkConfig: {}
      nova:
        cellName: cell1
        deploy: true
        novaInstance: nova
  nodeTemplate: {}

When customizing the services list, the default list of services must be reproduced and then customized if the intent is to still deploy those services. If only the hello-world service were listed, it would be the only service deployed.

Exercise caution when including a service that is meant to be executed on every NodeSet in the list. Some services may behave in unexpected ways when executed multiple times on the same node.
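
Because the whole list must be reproduced, it can help to derive the customized list from a copy of the existing one instead of retyping it. The following Python sketch illustrates that idea. The helper name with_custom_service is hypothetical, and EXAMPLE_SERVICES is copied from the example above; it is not read from the operator.

```python
# Services copied from the example NodeSet above, without hello-world.
EXAMPLE_SERVICES = [
    "redhat", "download-cache", "bootstrap", "configure-network",
    "validate-network", "install-os", "configure-os", "run-os",
    "ovn", "neutron-metadata", "libvirt", "nova",
]


def with_custom_service(services, custom, before=None):
    """Return a new list with `custom` inserted before the named service.

    When `before` is None the custom service is placed first. A ValueError
    is raised if `before` is not in the list or `custom` is already listed.
    """
    if custom in services:
        raise ValueError(f"{custom} is already in the services list")
    result = list(services)
    index = 0 if before is None else result.index(before)
    result.insert(index, custom)
    return result


if __name__ == "__main__":
    for name in with_custom_service(EXAMPLE_SERVICES, "hello-world"):
        print(f"    - {name}")
```

The printed lines can be pasted under the services field of the NodeSet spec.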

Configuring a node set for a Compute feature or workload

You can designate a node set for a particular Compute feature or workload. To designate and configure a node set for a feature, complete the following tasks:

  1. Create a ConfigMap CR to configure the Compute nodes.

  2. Create a custom nova service for the feature that runs the osp.edpm.nova playbook.

  3. Include the ConfigMap CR in the custom nova service.

Procedure
  1. Create a ConfigMap CR to configure the Compute nodes. For example, to enable CPU pinning on the Compute nodes, create the following ConfigMap object:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: nova-cpu-pinning-configmap
      namespace: openstack
    data:
      25-nova-cpu-pinning.conf: |
        [compute]
        cpu_shared_set = 2,6
        cpu_dedicated_set = 1,3,5,7

    When the service is deployed, it adds the configuration to /etc/nova/nova.conf.d/ in the nova_compute container.

    For more information on creating ConfigMap objects, see Creating and using config maps.

    You can use a Secret to create the custom configuration instead if the configuration includes sensitive information, such as passwords or certificates that are required for certification.
  2. Create a custom nova service for the feature. For information about how to create a custom service, see Creating a custom service.

  3. Add the ConfigMap CR to the custom nova service:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneService
    metadata:
      name: nova-cpu-pinning-service
    spec:
      label: dataplane-deployment-custom-service
      playbook: osp.edpm.nova
      configMaps:
        - nova-cpu-pinning-configmap
  4. Specify the Secret CR for the cell to which the node set that runs this service connects:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneService
    metadata:
      name: nova-cpu-pinning-service
    spec:
      label: dataplane-deployment-custom-service
      playbook: osp.edpm.nova
      configMaps:
        - nova-cpu-pinning-configmap
      secrets:
        - nova-cell1-compute-config
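
Before applying a CPU pinning ConfigMap, the snippet stored under its data key can be checked locally. The Python sketch below parses the [compute] section from the example and reports CPUs that appear in both cpu_shared_set and cpu_dedicated_set. It assumes the two sets are meant to be disjoint, handles only comma-separated values and simple ranges such as 4-7, and uses hypothetical function names.

```python
import configparser


def parse_cpu_set(spec):
    """Expand a simple CPU list such as "1,3,5-7" into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return cpus


def overlapping_cpus(conf_text):
    """Return the CPUs listed in both the shared and the dedicated set."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    compute = parser["compute"]
    shared = parse_cpu_set(compute.get("cpu_shared_set", ""))
    dedicated = parse_cpu_set(compute.get("cpu_dedicated_set", ""))
    return shared & dedicated


if __name__ == "__main__":
    snippet = "[compute]\ncpu_shared_set = 2,6\ncpu_dedicated_set = 1,3,5,7\n"
    print(sorted(overlapping_cpus(snippet)))
```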

Building a custom ansible-runner image

You can override the default container image used by the ansible-runner execution environment with your own custom image when you need additional Ansible content for a custom service.

Procedure
  1. Create a Containerfile that adds the custom content to the default image:

    FROM quay.io/openstack-k8s-operators/openstack-ansibleee-runner:latest
    COPY my_custom_role /usr/share/ansible/roles/my_custom_role
  2. Build and push the image to a container registry:

    $ podman build -t quay.io/example_user/my_custom_image:latest .
    $ podman push quay.io/example_user/my_custom_image:latest
  3. Specify your new container image as the image that the ansible-runner execution environment uses, so that the additional Ansible content that your custom service requires, such as Ansible roles or modules, is available:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneService
    metadata:
      name: custom-service
    spec:
      label: dataplane-deployment-custom-service
      openStackAnsibleEERunnerImage: quay.io/example_user/my_custom_image:latest (1)
      playbookContents: |
    1 Your container image that the ansible-runner execution environment uses to execute Ansible.

Example OpenStackDataPlaneNodeSet CR for pre-provisioned nodes

The following example OpenStackDataPlaneNodeSet CR creates a set of generic Compute nodes with some node-specific configuration.

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm-ipam
  namespace: openstack
spec:
  env: (1)
    - name: ANSIBLE_FORCE_COLOR
      value: "True"
  networkAttachments: (2)
    - ctlplane
  nodeTemplate: (3)
    ansible:
      ansibleUser: cloud-admin (4)
      ansibleVars: (5)
        edpm_network_config_template: | (6)
          ---
          {% set mtu_list = [ctlplane_mtu] %}
          {% for network in nodeset_networks %}
          {{ mtu_list.append(lookup('vars', networks_lower[network] ~ '_mtu')) }}
          {%- endfor %}
          {% set min_viable_mtu = mtu_list | max %}
          network_config:
          - type: ovs_bridge
            name: {{ neutron_physical_bridge_name }}
            mtu: {{ min_viable_mtu }}
            use_dhcp: false
            dns_servers: {{ ctlplane_dns_nameservers }}
            domain: {{ dns_search_domains }}
            addresses:
            - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
            routes: {{ ctlplane_host_routes }}
            members:
            - type: interface
              name: nic1
              mtu: {{ min_viable_mtu }}
              # force the MAC address of the bridge to this interface
              primary: true
          {% for network in nodeset_networks %}
            - type: vlan
              mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
              vlan_id: {{ lookup('vars', networks_lower[network] ~ '_vlan_id') }}
              addresses:
              - ip_netmask:
                  {{ lookup('vars', networks_lower[network] ~ '_ip') }}/{{ lookup('vars', networks_lower[network] ~ '_cidr') }}
              routes: {{ lookup('vars', networks_lower[network] ~ '_host_routes') }}
          {% endfor %}
        edpm_nodes_validation_validate_controllers_icmp: false
        edpm_nodes_validation_validate_gateway_icmp: false
        edpm_sshd_allowed_ranges:
          - 192.168.122.0/24
        enable_debug: false
        gather_facts: false
        neutron_physical_bridge_name: br-ex
        neutron_public_interface_name: eth0
    ansibleSSHPrivateKeySecret: dataplane-ansible-ssh-private-key-secret (7)
  nodes:
    edpm-compute-0: (8)
      ansible:
        ansibleHost: 192.168.122.100
      hostName: edpm-compute-0
      networks:
        - defaultRoute: true
          fixedIP: 192.168.122.100
          name: ctlplane
          subnetName: subnet1
        - name: internalapi
          subnetName: subnet1
        - name: storage
          subnetName: subnet1
        - name: tenant
          subnetName: subnet1
  preProvisioned: true (9)
  services: (10)
    - bootstrap
    - download-cache
    - configure-network
    - validate-network
    - install-os
    - configure-os
    - ssh-known-hosts
    - run-os
    - reboot-os
    - install-certs
    - ovn
    - neutron-metadata
    - libvirt
    - nova
    - telemetry
  tlsEnabled: true
1 Optional: A list of environment variables to pass to the pod.
2 The networks the ansibleee-runner connects to, specified as a list of netattach resource names.
3 The common configuration to apply to all nodes in this set of nodes.
4 The user associated with the secret you created in Creating the SSH key secrets.
5 The Ansible variables that customize the set of nodes. For a complete list of Ansible variables, see https://openstack-k8s-operators.github.io/edpm-ansible/.
6 The network configuration template to apply to nodes in the set. For sample templates, see https://github.com/openstack-k8s-operators/edpm-ansible/tree/main/roles/edpm_network_config/templates.
7 The name of the secret that you created in Creating the SSH key secrets.
8 The node definition reference, for example, edpm-compute-0. Each node in the node set must have a node definition.
9 Specify if the nodes in this set are pre-provisioned, or if they must be provisioned when creating the resource.
10 The services that are deployed on the data plane nodes in this OpenStackDataPlaneNodeSet CR.
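
The MTU handling in the edpm_network_config_template above can be read as follows: the template collects the MTU of the control plane network and of every network in nodeset_networks, then applies the largest value to the bridge and its primary interface. The Python sketch below mirrors that calculation outside of Jinja. The function signature and the sample values are assumptions for illustration; the variable naming (<network>_mtu, networks_lower) follows the template.

```python
def min_viable_mtu(ctlplane_mtu, nodeset_networks, networks_lower, node_vars):
    """Mirror the template: gather all MTUs and return the largest one.

    networks_lower maps a network name to its lower-case variable prefix,
    and node_vars holds variables such as "storage_mtu".
    """
    mtu_list = [ctlplane_mtu]
    for network in nodeset_networks:
        mtu_list.append(node_vars[networks_lower[network] + "_mtu"])
    return max(mtu_list)


if __name__ == "__main__":
    # Hypothetical values, not taken from a real deployment.
    print(min_viable_mtu(
        ctlplane_mtu=1500,
        nodeset_networks=["InternalApi", "Storage"],
        networks_lower={"InternalApi": "internalapi", "Storage": "storage"},
        node_vars={"internalapi_mtu": 1500, "storage_mtu": 9000},
    ))
```

Although the variable is named min_viable_mtu, it is computed with the max filter: the bridge needs an MTU at least as large as the largest MTU of the VLANs it carries.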

Example OpenStackDataPlaneNodeSet CR for bare metal nodes

The following example OpenStackDataPlaneNodeSet CR creates a set of generic Compute nodes with some node-specific configuration.

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm-ipam
  namespace: openstack
spec:
  baremetalSetTemplate: (1)
    bmhLabelSelector:
      app: openstack
    cloudUserName: cloud-admin
    ctlplaneInterface: enp1s0
  env: (2)
    - name: ANSIBLE_FORCE_COLOR
      value: "True"
  networkAttachments: (3)
    - ctlplane
  nodeTemplate: (4)
    ansible:
      ansibleUser: cloud-admin (5)
      ansibleVars: (6)
        edpm_network_config_template: | (7)
          ---
          {% set mtu_list = [ctlplane_mtu] %}
          {% for network in nodeset_networks %}
          {{ mtu_list.append(lookup('vars', networks_lower[network] ~ '_mtu')) }}
          {%- endfor %}
          {% set min_viable_mtu = mtu_list | max %}
          network_config:
          - type: ovs_bridge
            name: {{ neutron_physical_bridge_name }}
            mtu: {{ min_viable_mtu }}
            use_dhcp: false
            dns_servers: {{ ctlplane_dns_nameservers }}
            domain: {{ dns_search_domains }}
            addresses:
            - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
            routes: {{ ctlplane_host_routes }}
            members:
            - type: interface
              name: nic1
              mtu: {{ min_viable_mtu }}
              # force the MAC address of the bridge to this interface
              primary: true
          {% for network in nodeset_networks %}
            - type: vlan
              mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
              vlan_id: {{ lookup('vars', networks_lower[network] ~ '_vlan_id') }}
              addresses:
              - ip_netmask:
                  {{ lookup('vars', networks_lower[network] ~ '_ip') }}/{{ lookup('vars', networks_lower[network] ~ '_cidr') }}
              routes: {{ lookup('vars', networks_lower[network] ~ '_host_routes') }}
          {% endfor %}
        edpm_nodes_validation_validate_controllers_icmp: false
        edpm_nodes_validation_validate_gateway_icmp: false
        edpm_sshd_allowed_ranges:
          - 192.168.111.0/24
        enable_debug: false
        gather_facts: false
        neutron_physical_bridge_name: br-ex
        neutron_public_interface_name: eth0
    ansibleSSHPrivateKeySecret: dataplane-ansible-ssh-private-key-secret (8)
    networks: (9)
      - defaultRoute: true
        name: ctlplane
        subnetName: subnet1
      - name: internalapi
        subnetName: subnet1
      - name: storage
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1
  nodes:
    edpm-compute-0: (10)
      hostName: edpm-compute-0
  preProvisioned: false
  services: (11)
    - bootstrap
    - download-cache
    - configure-network
    - validate-network
    - install-os
    - configure-os
    - ssh-known-hosts
    - run-os
    - reboot-os
    - install-certs
    - ovn
    - neutron-metadata
    - libvirt
    - nova
    - telemetry
  tlsEnabled: true
1 Configure the bare metal template for bare metal nodes that must be provisioned when creating the resource.
2 Optional: A list of environment variables to pass to the pod.
3 The networks the ansibleee-runner connects to, specified as a list of netattach resource names.
4 The common configuration to apply to all nodes in this set of nodes.
5 The user associated with the secret you created in Creating the SSH key secrets.
6 The Ansible variables that customize the set of nodes. For a complete list of Ansible variables, see https://openstack-k8s-operators.github.io/edpm-ansible/.
7 The network configuration template to apply to nodes in the set. For sample templates, see https://github.com/openstack-k8s-operators/edpm-ansible/tree/main/roles/edpm_network_config/templates.
8 The name of the secret that you created in Creating the SSH key secrets.
9 Networks for the bare metal nodes.
10 The node definition reference, for example, edpm-compute-0. Each node in the node set must have a node definition.
11 The services that are deployed on the data plane nodes in this OpenStackDataPlaneNodeSet CR.

Data plane conditions and states

Each data plane resource has a series of conditions within its status subresource that indicate the overall state of the resource, including its deployment progress.

For an OpenStackDataPlaneNodeSet, until an OpenStackDataPlaneDeployment has been started and finished successfully, the Ready condition is False. When the deployment succeeds, the Ready condition is set to True. A subsequent deployment sets the Ready condition to False until the deployment succeeds, when the Ready condition is set to True.

Table 1. OpenStackDataPlaneNodeSet CR conditions
Condition Description

Ready

  • "True": The OpenStackDataPlaneNodeSet CR is successfully deployed.

  • "False": The deployment is not yet requested or has failed, or there are other failed conditions.

SetupReady

"True": All setup tasks for a resource are complete. Setup tasks include verifying the SSH key secret, verifying other fields on the resource, and creating the Ansible inventory for each resource. Each service-specific condition is set to "True" when that service completes deployment. You can check the service conditions to see which services have completed their deployment, or which services failed.

DeploymentReady

"True": The NodeSet has been successfully deployed.

InputReady

"True": The required inputs are available and ready.

NodeSetDNSDataReady

"True": DNSData resources are ready.

NodeSetIPReservationReady

"True": The IPSet resources are ready.

NodeSetBaremetalProvisionReady

"True": Bare metal nodes are provisioned and ready.

Table 2. OpenStackDataPlaneNodeSet status fields
Status field Description

Deployed

  • "True": The OpenStackDataPlaneNodeSet CR is successfully deployed.

  • "False": The deployment is not yet requested or has failed, or there are other failed conditions.

DNSClusterAddresses

CtlplaneSearchDomain

Table 3. OpenStackDataPlaneDeployment CR conditions
Condition Description

Ready

  • "True": The data plane is successfully deployed.

  • "False": The data plane deployment failed, or there are other failed conditions.

DeploymentReady

"True": The data plane is successfully deployed.

InputReady

"True": The required inputs are available and ready.

<NodeSet> Deployment Ready

"True": The deployment has succeeded for the named NodeSet, indicating all services for the NodeSet have succeeded.

<NodeSet> <Service> Deployment Ready

"True": The deployment has succeeded for the named NodeSet and Service. Each <NodeSet> <Service> Deployment Ready specific condition is set to "True" as that service completes successfully for the named NodeSet. Once all services are complete for a NodeSet, the <NodeSet> Deployment Ready condition is set to "True". The service conditions indicate which services have completed their deployment, or which services failed and for which NodeSets.

Table 4. OpenStackDataPlaneDeployment status fields
Status field Description

Deployed

  • "True": The data plane is successfully deployed. All Services for all NodeSets have succeeded.

  • "False": The deployment is not yet requested or has failed, or there are other failed conditions.

Table 5. OpenStackDataPlaneService CR conditions
Condition Description

Ready

  • "True": The service has been created and is ready for use.

  • "False": The service has failed to be created.

Provisioning bare metal data plane nodes

Provisioning bare metal nodes on the data plane is supported with the Red Hat OpenShift Container Platform (RHOCP) Cluster Baremetal Operator (CBO). The CBO is a RHOCP Operator responsible for deploying all the components that are required to provision bare metal nodes within the RHOCP cluster, including the Bare Metal Operator (BMO) and Ironic containers.

Installer-Provisioned Infrastructure

CBO is enabled by default on RHOCP clusters that are installed with the baremetal installer-provisioned infrastructure. You can configure installer-provisioned clusters with a provisioning network to enable both virtual media and network boot installations. You can alternatively configure an installer-provisioned cluster without a provisioning network so that only virtual media provisioning is possible.

Assisted Installer Provisioned Infrastructure

You can enable CBO on clusters installed with the Assisted Installer, and you can manually add the provisioning network to the Assisted Installer cluster after installation.

User Provisioned Infrastructure

You can activate CBO on RHOCP clusters installed with user-provisioned infrastructure by creating a Provisioning CR. You cannot add a provisioning network to a user-provisioned cluster.

For user-provisioned infrastructure, you must create the Provisioning CR manually, as shown below:

apiVersion: metal3.io/v1alpha1
kind: Provisioning
metadata:
  name: provisioning-configuration
spec:
  provisioningNetwork: "Disabled"
  watchAllNamespaces: false

BMO manages the available hosts on clusters and performs the following operations:

  • Inspects node hardware details and reports them to the corresponding BareMetalHost CR. This includes information about CPUs, RAM, disks and NICs.

  • Provisions nodes with a specific image.

  • Cleans node disk contents before and after provisioning.

Provisioning Nodes with OpenStackDataPlaneNodeSet

Before deploying dataplane nodes on baremetal, ensure that CBO has been enabled or activated on the cluster, as described above for each installer type.

Create a sufficient number of BareMetalHost (BMH) CRs for the EDPM nodes, and ensure that they are in the Available state (after inspection). By default, baremetal-operator looks for BMHs in the openshift-machine-api namespace.

Patch the Provisioning resource to watch all namespaces by setting watchAllNamespaces: true, because the secrets are created in the openstack namespace even when the BMHs are in the openshift-machine-api namespace.

$ oc patch provisioning provisioning-configuration --type merge -p '{"spec":{"watchAllNamespaces": true }}'

Sample BMH spec:

apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: edpm-compute-01
  namespace: openstack
  labels:
    app: openstack
    workload: compute
spec:
  bmc:
    address: redfish+http://192.168.111.1:8000/redfish/v1/Systems/e8efd888-f844-4fe0-9e2e-498f4ab7806d
    credentialsName: node-bmc-secret
  bootMACAddress: 00:c7:e4:a7:e7:f3
  bootMode: UEFI
  online: false

Set the BMH labels appropriately for the desired nodes so that they can be selected by the bmhLabelSelector in the OpenStackDataPlaneNodeSet spec.

For virtual-media provisioning, the BMC address must use the virtual-media scheme, as shown below.

bmc:
  address: redfish-virtualmedia+http://192.168.111.1:8000/redfish/v1/Systems/e8efd888-f844-4fe0-9e2e-498f4ab7806d

To provision the baremetal nodes for EDPM, the OpenStackDataPlaneNodeSet spec must include the baremetalSetTemplate section, as shown below. In addition to bmhLabelSelector, you can provide the hardwareReqs field to select appropriate BMHs. To select a particular BMH for a node, provide bmhLabelSelector in that node's section of the OpenStackDataPlaneNodeSet spec. These labels are used in addition to the labels set in baremetalSetTemplate to select BMHs for the node.

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm
spec:
  baremetalSetTemplate:
    bmhLabelSelector:
      app: openstack
      workload: compute
    ctlplaneInterface: enp1s0
    cloudUserName: cloud-admin
  nodes:
    edpm-compute-0:
      hostName: edpm-compute-0
      ansible:
        ansibleHost: 192.168.122.100
      bmhLabelSelector:
        nodeName: edpm-compute-01
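
One way to read the selection rule above is that a BMH qualifies for a node only when it carries every label from baremetalSetTemplate and every label from the node's own bmhLabelSelector. The Python sketch below expresses that reading. It is an illustration, not the operator's implementation, and the function names and BMH label sets are hypothetical.

```python
def matches(bmh_labels, template_selector, node_selector=None):
    """Return True if the BMH has all template and node-level labels."""
    required = dict(template_selector)
    required.update(node_selector or {})
    return all(bmh_labels.get(key) == value for key, value in required.items())


def candidates(hosts, template_selector, node_selector=None):
    """Return the sorted names of BMHs whose labels satisfy both selectors."""
    return sorted(
        name for name, labels in hosts.items()
        if matches(labels, template_selector, node_selector)
    )


if __name__ == "__main__":
    # Hypothetical BMHs and their labels.
    hosts = {
        "bmh-a": {"app": "openstack", "workload": "compute",
                  "nodeName": "edpm-compute-01"},
        "bmh-b": {"app": "openstack", "workload": "compute"},
    }
    template = {"app": "openstack", "workload": "compute"}
    print(candidates(hosts, template))
    print(candidates(hosts, template, {"nodeName": "edpm-compute-01"}))
```

Under this reading, a BMH that should be picked by a node-level selector needs the node-level label in addition to the template labels.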
Relevant Status Condition

The NodeSetBaremetalProvisionReady condition in the status conditions reflects the status of baremetal provisioning, as shown below.

$ oc get openstackdataplanenodeset openstack-edpm-ipam -o json | jq '.status.conditions[] | select(.type=="NodeSetBaremetalProvisionReady")'
{
  "lastTransitionTime": "2024-02-01T04:41:58Z",
  "message": "NodeSetBaremetalProvisionReady ready",
  "reason": "Ready",
  "status": "True",
  "type": "NodeSetBaremetalProvisionReady"
}
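
The same lookup can be done without jq. The Python sketch below works on the JSON printed by oc get ... -o json and relies only on the status.conditions layout shown in the output above. The function names are hypothetical.

```python
import json


def find_condition(resource, cond_type):
    """Return the condition dict with the given type, or None."""
    for cond in resource.get("status", {}).get("conditions", []):
        if cond.get("type") == cond_type:
            return cond
    return None


def failing_conditions(resource):
    """Return the types of all conditions whose status is not "True"."""
    conditions = resource.get("status", {}).get("conditions", [])
    return [c.get("type") for c in conditions if c.get("status") != "True"]


if __name__ == "__main__":
    # Sample document shaped like the output above.
    sample = json.loads("""
    {"status": {"conditions": [
        {"type": "NodeSetBaremetalProvisionReady", "status": "True",
         "reason": "Ready", "message": "NodeSetBaremetalProvisionReady ready"},
        {"type": "Ready", "status": "False",
         "reason": "Requested", "message": "Deployment not started"}
    ]}}
    """)
    print(find_condition(sample, "NodeSetBaremetalProvisionReady")["status"])
    print(failing_conditions(sample))
```

The reason and message values of the second condition in the sample are invented for the demonstration.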

Deploying the data plane

You use the OpenStackDataPlaneDeployment CRD to configure the services on the data plane nodes and deploy the data plane. Create an OpenStackDataPlaneDeployment custom resource (CR) that deploys each of your OpenStackDataPlaneNodeSet CRs.

Procedure
  1. Create an OpenStackDataPlaneDeployment CR and save it to a file named openstack-edpm-deploy.yaml on your workstation.

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: edpm-deployment-ipam
    spec:
      nodeSets:
        - openstack-edpm-ipam
        - <nodeSet_name>
        - ...
        - <nodeSet_name>
    • Replace <nodeSet_name> with the names of the OpenStackDataPlaneNodeSet CRs that you want to include in your data plane deployment.

    • You can optionally provide ansibleJobNodeSelector to run the Ansible jobs on a specific set of OCP worker nodes, for example, worker nodes with the ctlplane network, because the Ansible jobs require the ctlplane NAD.

      apiVersion: dataplane.openstack.org/v1beta1
      kind: OpenStackDataPlaneDeployment
      metadata:
        name: edpm-deployment-ipam
      spec:
        nodeSets:
          - openstack-edpm-ipam
          - <nodeSet_name>
          - ...
          - <nodeSet_name>
        ansibleJobNodeSelector:
          nodeWith: ctlplane
  2. Save the openstack-edpm-deploy.yaml deployment file.

  3. Deploy the data plane:

    $ oc create -f openstack-edpm-deploy.yaml

    You can view the Ansible logs while the deployment executes:

    $ oc get pod -l app=openstackansibleee -w
    $ oc logs -l app=openstackansibleee -f --max-log-requests 20
  4. Verify that the data plane is deployed:

    $ oc get openstackdataplanedeployment
    NAME                   STATUS   MESSAGE
    edpm-deployment-ipam   True     Setup Complete

    $ oc get openstackdataplanenodeset
    NAME                  STATUS   MESSAGE
    openstack-edpm-ipam   True     NodeSet Ready

    For information on the meaning of the returned status, see Data plane conditions and states.

    If the status indicates that the data plane has not been deployed, then troubleshoot the deployment. For information, see Troubleshooting the data plane creation and deployment.

  5. Map the Compute nodes to the Compute cell that they are connected to:

    $ oc rsh nova-cell0-conductor-0 nova-manage cell_v2 discover_hosts --verbose

    If you did not create additional cells, this command maps the Compute nodes to cell1.

  6. Access the remote shell for the openstackclient pod and verify that the deployed Compute nodes are visible on the control plane:

    $ oc rsh -n openstack openstackclient
    $ openstack hypervisor list
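
When a deployment covers many node sets, the CR from step 1 can be generated instead of written by hand. The Python sketch below builds the same structure as the examples above and prints it as JSON, which oc create -f accepts in the same way as YAML. The function name build_deployment is hypothetical; the field names are the ones used in this procedure.

```python
import json


def build_deployment(name, node_sets, node_selector=None):
    """Build an OpenStackDataPlaneDeployment manifest as a dict."""
    spec = {"nodeSets": list(node_sets)}
    if node_selector:
        # Optional: restricts where the Ansible jobs are scheduled.
        spec["ansibleJobNodeSelector"] = dict(node_selector)
    return {
        "apiVersion": "dataplane.openstack.org/v1beta1",
        "kind": "OpenStackDataPlaneDeployment",
        "metadata": {"name": name},
        "spec": spec,
    }


if __name__ == "__main__":
    manifest = build_deployment(
        "edpm-deployment-ipam",
        ["openstack-edpm-ipam"],
        node_selector={"nodeWith": "ctlplane"},
    )
    print(json.dumps(manifest, indent=2))
```

Redirect the output to a file and pass that file to oc create -f.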

Persistent logs

To enable persistent logging:

  1. Create a PersistentVolume and a PersistentVolumeClaim.

  2. Mount /runner/artifacts into the PersistentVolume through the extraMounts field.

Example:
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
spec:
  ...
  nodeTemplate:
    extraMounts:
      - extraVolType: Logs
        volumes:
        - name: ansible-logs
          persistentVolumeClaim:
            claimName: <PersistentVolumeClaim name>
        mounts:
        - name: ansible-logs
          mountPath: "/runner/artifacts"

Accessing the logs

  1. Query for pods with the OpenStackAnsibleEE label

    oc get pods -l app=openstackansibleee

    Sample output:

    configure-network-edpm-compute-j6r4l   0/1     Completed           0          3m36s
    validate-network-edpm-compute-6g7n9    0/1     Pending             0          0s
    validate-network-edpm-compute-6g7n9    0/1     ContainerCreating   0          11s
    validate-network-edpm-compute-6g7n9    1/1     Running             0          13s
  2. Open a shell in a pod

    When a pod is running:

    oc rsh validate-network-edpm-compute-6g7n9

    When a pod is not running:

    oc debug configure-network-edpm-compute-j6r4l
  3. List the directories under /runner/artifacts

    ls /runner/artifacts

    Sample output:

    configure-network-edpm-compute
    validate-network-edpm-compute
  4. Get the stdout of the desired artifact

    cat /runner/artifacts/configure-network-edpm-compute/stdout
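
Steps 3 and 4 can be combined to read every job's output in one pass. The Python sketch below walks the artifacts directory and collects each stdout file. It assumes Python is available wherever the directory is mounted and that each artifact directory contains a stdout file, as in the example above. The function name collect_stdout is hypothetical.

```python
from pathlib import Path


def collect_stdout(artifacts_dir="/runner/artifacts"):
    """Map each artifact directory name to the text of its stdout file."""
    root = Path(artifacts_dir)
    logs = {}
    if not root.is_dir():
        return logs  # nothing mounted at this path
    for entry in sorted(root.iterdir()):
        stdout = entry / "stdout"
        if entry.is_dir() and stdout.is_file():
            logs[entry.name] = stdout.read_text(errors="replace")
    return logs


if __name__ == "__main__":
    for job, text in collect_stdout().items():
        print(f"==> {job} <==")
        print(text)
```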

Troubleshooting data plane creation and deployment

Each data plane deployment in the environment has associated services. Each of these services has a job condition message that matches the current status of the AnsibleEE job executing for that service. This information can be used to troubleshoot deployments when services are not deploying or operating correctly.

Procedure
  1. Determine the name and status of all deployments:

    $ oc get openstackdataplanedeployment

    The following example output shows two deployments currently in progress:

    $ oc get openstackdataplanedeployment
    
    NAME                   NODESETS             STATUS   MESSAGE
    openstack-edpm-ipam1   ["openstack-edpm"]   False    Deployment in progress
    openstack-edpm-ipam2   ["openstack-edpm"]   False    Deployment in progress
  2. Determine the name and status of all services and their job condition:

    $ oc get openstackansibleee

    The following example output shows all services and their job condition for all current deployments:

    $ oc get openstackansibleee
    
    NAME                             NETWORKATTACHMENTS   STATUS   MESSAGE
    bootstrap-openstack-edpm         ["ctlplane"]         True     Job completed
    download-cache-openstack-edpm    ["ctlplane"]         False    Job in progress
    repo-setup-openstack-edpm        ["ctlplane"]         True     Job completed
    validate-network-another-osdpd   ["ctlplane"]         False    Job in progress
  3. Filter for the name and service for a specific deployment:

    $ oc get openstackansibleee -l openstackdataplanedeployment=<deployment_name>
    • Replace <deployment_name> with the name of the deployment to use to filter the services list.

      The following example filters the list to only show services and their job condition for the openstack-edpm-ipam1 deployment:

      $ oc get openstackansibleee -l openstackdataplanedeployment=openstack-edpm-ipam1
      
      NAME                            NETWORKATTACHMENTS   STATUS   MESSAGE
      bootstrap-openstack-edpm        ["ctlplane"]         True     Job completed
      download-cache-openstack-edpm   ["ctlplane"]         False    Job in progress
      repo-setup-openstack-edpm       ["ctlplane"]         True     Job completed
Job Condition Messages

AnsibleEE jobs have an associated condition message that indicates the current state of the service job. This condition message is displayed in the MESSAGE field of the oc get openstackansibleee command output. Jobs return one of the following conditions when queried:

  • Job not started: The job has not started.

  • Job in progress: The job is currently running.

  • Job completed: The job execution is complete.

  • Job error occured <error_message>: The job execution stopped unexpectedly. The <error_message> is replaced with a specific error message.

To further investigate a service displaying a particular job condition message, use the command oc logs job/<service> to display the logs associated with that service. For example, to display the logs for the repo-setup-openstack-edpm service, use the command oc logs job/repo-setup-openstack-edpm.
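
When scripting around these messages, it can be convenient to reduce them to a small set of states. The Python sketch below maps the MESSAGE strings listed above to short labels. The labels (pending, running, done, failed, unknown) and the function names are this example's own and are not values reported by the operator.

```python
def classify_job_message(message):
    """Map a job condition message to a coarse state label."""
    message = message.strip()
    if message == "Job not started":
        return "pending"
    if message == "Job in progress":
        return "running"
    if message == "Job completed":
        return "done"
    if message.startswith("Job error"):
        return "failed"  # the rest of the message carries the error text
    return "unknown"


def unfinished(jobs):
    """Return names of jobs that are not done, given (name, message) pairs."""
    return [name for name, msg in jobs if classify_job_message(msg) != "done"]


if __name__ == "__main__":
    jobs = [
        ("bootstrap-openstack-edpm", "Job completed"),
        ("download-cache-openstack-edpm", "Job in progress"),
    ]
    print(unfinished(jobs))
```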

Check service pod status reports

During reconciliation of OpenStackDataPlaneDeployment resources, Kubernetes Pods associated with OpenStackAnsibleEE jobs are marked with the label openstackdataplanedeployment=<OpenStackDataPlaneDeployment.Name>. This allows selection and monitoring of these pods using CLI commands.

When encountering failures within OpenStackAnsibleEE jobs, the resulting Kubernetes Pod reports will be formatted with an error message in the following manner: openstackansibleee job <POD_NAME> failed due to <ERROR> with message: <ERROR_MSG>.

These reports can provide valuable insights into the cause of the failure and aid in resolving related issues.
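
Because the report follows a fixed pattern, its parts can be extracted with a regular expression. The Python sketch below is based only on the format quoted above. The function name parse_report and the sample pod name, error, and message are hypothetical.

```python
import re

# Pattern derived from the documented report format:
# openstackansibleee job <POD_NAME> failed due to <ERROR> with message: <ERROR_MSG>
REPORT = re.compile(
    r"openstackansibleee job (?P<pod>\S+) failed due to (?P<error>.+?) "
    r"with message: (?P<message>.*)",
    re.DOTALL,
)


def parse_report(text):
    """Return a dict with pod, error, and message, or None if no match."""
    match = REPORT.search(text)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        "openstackansibleee job bootstrap-openstack-edpm-abc12 failed due to "
        "BackoffLimitExceeded with message: Job has reached the specified "
        "backoff limit"
    )
    print(parse_report(sample))
```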

Deploying an OpenStackDataPlaneNodeSet with Internal TLS Enabled

When an OpenStackDataPlaneNodeSet is deployed with TLS Enabled, communications between dataplane services and with control plane services can be encrypted using TLS connections.

The openstack-operator can generate the needed certificates for all compute nodes in the nodeset. This section describes how to enable this functionality and how dataplane services (including custom services) can take advantage of it.

In addition, the OpenStackDataPlaneService spec has an attribute that allows a CACert TLS bundle to be provided.

Prerequisites

Certificates for dataplane services are generated by cert-manager issuers, which are referenced in the OpenStackDataPlaneService attributes below. These issuers must be created beforehand.

In addition, OpenStackDataPlaneService contains an attribute that allows the deployer to specify a secret containing a TLS CA bundle. This secret should also be created beforehand.

Basic deployment steps

  1. Create the issuers and cacert bundle secrets as described in the prerequisites above. These were likely created as part of the control plane deployment. (TODO - link to issuer/cacert docs when available)

  2. Enable TLS on the OpenStackDataPlaneNodeSet and create the nodeset. Ensure that the install-certs or similar service (described below) is in the list of services before any services that require certs.

  3. Deploy the OpenStackDataPlane. Certs should be created and copied to the compute nodes in the nodeset.

Enabling TLS on an OpenStackDataPlaneNodeSet

The OpenStackDataPlaneNodeSet has an attribute tlsEnabled, which defaults to false. The certificate generation code is executed only if this attribute is set to true.

OpenStackDataPlaneService attributes

Certificate generation is controlled by several attributes in the OpenStackDataPlaneService specification. An example is provided below.

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: libvirt
spec:
  label: libvirt
  playbook: osp.edpm.libvirt
  tlsCerts:
    default:
      contents:
        - dnsnames
        - ips
      networks:
        - CtlPlane
      keyUsages:
        - digital signature
        - key encipherment
        - server auth
        - client auth
      issuer: osp-rootca-issuer-internal
  caCerts: combined-ca-bundle

A more minimal configuration is provided below:

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: service1
spec:
  label: service1
  playbook: osp.edpm.service1
  tlsCerts:
    default:
      contents:
        - dnsnames
  caCerts: combined-ca-bundle
caCerts

This optional attribute is a string pointing to the secret containing the TLS CA certificate bundle to be mounted for the dataplane service. This secret is expected to be created in the same namespace (default: openstack) beforehand.

tlsCerts

Not all dataplane services will require TLS certificates. For example, dataplane services that install the OS or download caches do not need TLS certificates.

tlsCerts is a map of certificates to be generated for the service. By convention, the default pre-defined services use "default" as the hash key if only one certificate is needed, but this is not required. Ultimately, the hash key will be part of the path where the cert is located.

Most services will only require one certificate. Some though, like libvirt, may require more than one certificate.

If tlsCerts is defined (and tlsEnabled is set on the nodeset), certs will be generated as prescribed by the following attributes:

contents

This attribute describes what information is included in the subject alternative names (SAN) in the certificate. At this time, only two values are possible ("dnsnames" and "ips"). In the libvirt example, both values are specified. This attribute is required.

networks

This attribute describes which networks will be added to the SAN. For instance, in the libvirt example configuration, the DNS name and IP address for the node on the CtlPlane network will be added to the SAN. If networks is not defined, the relevant contents for all networks will be added to the SAN. So, in the configuration for service1 above, DNS names for all networks on the node are added to the SAN.

issuer

This attribute corresponds to the label for the certmanager issuer that is used to issue the certificate. The label can be different from the name of the issuer. There can be only one issuer with the specified label. If more than one issuer has the label, an error is generated. If the issuer attribute is not set, as in the configuration for service1, the certificates are issued with the default root CA for internal TLS as defined in lib-common, which is set to the label "osp-rootca-issuer-internal" for the rootca-internal issuer.

keyUsages

This attribute is a list of key usages to be included as key usage extensions in the certificate. The strings that correspond to valid usages are provided by the certmanager API. If this attribute is not provided, the default set of key usages as defined in lib-common will be used. These are "key encipherment", "digital signature" and "server auth". In the above examples, we see that libvirt defines this attribute because the "client auth" key usage is also needed.
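
To illustrate how these attributes combine when a service needs more than one certificate, here is a sketch only. The service name my-service, its playbook, and the second hash key client are hypothetical and are not taken from any shipped service.

```yaml
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: my-service                  # hypothetical service name
spec:
  label: my-service
  playbook: osp.edpm.my_service     # hypothetical playbook
  tlsCerts:
    default:
      # Only contents is required; networks, issuer and keyUsages fall back
      # to the defaults described above.
      contents:
        - dnsnames
    client:                         # hypothetical second hash key
      contents:
        - dnsnames
        - ips
      networks:
        - CtlPlane
      keyUsages:
        - digital signature
        - key encipherment
        - client auth
      issuer: osp-rootca-issuer-internal
  caCerts: combined-ca-bundle
```

Because the hash key becomes part of the certificate path, the two certificates in this sketch would be expected under /var/lib/openstack/certs/my-service/default and /var/lib/openstack/certs/my-service/client on each node.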

addCertMounts

This attribute specifies whether or not to mount the certificates and keys generated for all dataplane services (for all nodes in the nodeset) in the ansible EE pod for the service. This attribute defaults to false.

The current design has a special dataplane service "install-certs" that is expected to run before any services that need certificates, and which has this attribute set to true. The purpose of this dataplane service is to copy the certs to the correct place on the compute nodes. This dataplane service is described in more detail below.
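
A service that sets this attribute might look like the following sketch. The playbook name is an assumption derived from the osp.edpm.install_certs role mentioned in this document, and the definition shipped with the operator may differ.

```yaml
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: install-certs
spec:
  label: install-certs
  # Assumed playbook name, based on the osp.edpm.install_certs role.
  playbook: osp.edpm.install_certs
  # Mount the generated certificate and key secrets for all dataplane
  # services into the ansible EE pod that runs this service.
  addCertMounts: true
```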

The gritty details

How the certificates are generated

When tlsEnabled is set to True on the nodeset, and tlsCerts is defined for the dataplane service, certificates will be requested from the certmanager issuer designated in the issuer attribute (or a default) as described above.

The contents of the certificate (subject name, subject alternative names, etc.) are defined using the contents and issuer attributes as described above.

The certificates are generated when an OpenStackDataPlaneDeployment is created, but before any ansible EE pods are created.

When the certificates are created, certmanager stores the certificates in secrets which are named "cert-<service_name>-<hash_key>-<node_name>-#". The # symbol represents the secret number, beginning with 0. Kubernetes distributions, such as Red Hat OpenShift Container Platform, have a maximum secret size of 1 MiB. If the size of the created certificates and keys is larger than the maximum size of a single secret, then multiple secrets are created. Each secret receives its number and contains the certificate, key and cacert.

The certificates for all the nodes in the node set for a given service are collected in secrets named "<nodeset>-<service_name>-<hash_key>-certs-#", where the # symbol represents the generated secret number that starts at 0. These secrets are mounted in the ansibleEE when addCertMounts is enabled.

How the certificates are transferred to the compute nodes

A dataplane service ("install-certs") has been added to copy the certificates to the compute nodes. As noted above, this service has the addCertMounts attribute set to true. It is expected that this service will be executed before any other services that require TLS certs.

The service:

  • Mounts the <nodeset>-<service_name>-<hash_key>-certs-# secrets for all services that have tlsCertsEnabled set to "true".

  • For each node, calls the osp.edpm.install_certs role which copies all the certificates and keys for that node to /var/lib/openstack/certs/<service_name>/<hash_key>. The cacert bundles are copied to /var/lib/openstack/cacerts/<service_name>.

Code should then be added to each service’s ansible role to use the certs as needed. For example, in libvirt’s role, we move the certs and keys to standard locations on the compute host. Other roles may mount the certs and keys into their containers using kolla or otherwise. The certs and keys for all the services are available as needed for all services.

What happens when the certificates are renewed?

The secrets that store the certificates and keys that are generated by certmanager (which are named cert-<service_name>-<hash_key>-<node_name>) are owned by certmanager. When they are created, they are labelled using "osdp-service", "osdp-service-cert-key" and "osdpns" to indicate the dataplane service, hash key and nodeset respectively.
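
For orientation, one such secret might look roughly like the sketch below. The secret name and label values are illustrative, built from the libvirt example and the openstack-edpm-ipam nodeset used in this document. The type and data keys are those that cert-manager normally writes for an issued certificate.

```yaml
apiVersion: v1
kind: Secret
metadata:
  # Illustrative name following the cert-<service_name>-<hash_key>-<node_name> pattern.
  name: cert-libvirt-default-edpm-compute-0
  labels:
    osdp-service: libvirt             # dataplane service
    osdp-service-cert-key: default    # hash key from tlsCerts
    osdpns: openstack-edpm-ipam       # nodeset
type: kubernetes.io/tls
data:
  ca.crt: <base64 encoded CA certificate>
  tls.crt: <base64 encoded certificate>
  tls.key: <base64 encoded private key>
```

The labels make it possible to select all certificate secrets that belong to a given service or nodeset with a label selector.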

At the end of the deployment, these secrets are hashed and the values are stored in the secretHashes status field of the nodeset and deployment. In this way, these cert secrets are treated in exactly the same way as any other dataplane service related secrets.

Certmanager will automatically renew certificates prior to their expiration, which will result in modifications to the secrets.

The deployer can periodically review the hashes for these secrets to determine if any of them have changed - this is currently expected to be a manual process - and then may choose to invoke a new deployment to update the certificates and keys.

How to enable cert generation for your dataplane service

Based on the above description, the steps are pretty straightforward.

  1. Add a tlsCerts attribute to your dataplane service. Set the contents, networks and issuer according to your needs. The service1 configuration is a minimal specification and will provide a cert with dnsNames for all the interfaces of the compute node in the SAN, issued by the internal TLS CA. This is probably sufficient for most use cases.

  2. Add a caCerts attribute that references the secret containing the CACert bundle. This attribute can be added to mount a CACert bundle even if no cert generation is needed.

  3. The "install-certs" service should run before your service. It will copy the certs and cacerts to a standard location. See the section above.

  4. Modify your role to do something with the generated certs.

Scaling DataPlane

Scaling Out

Scale out of an existing dataplane with more edpm nodes can be achieved by adding new nodes to the nodes section of the OpenStackDataPlaneNodeSet spec. Ensure that there are enough BMHs in Available state in the required namespace with the desired labels for baremetal nodes.

Pre-Provisioned:
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm-ipam
spec:
  preProvisioned: True
  nodes:
  ...
    edpm-compute-2:
      hostName: edpm-compute-2
      ansible:
        ansibleHost: 192.168.122.102
      networks:
      - name: CtlPlane
        subnetName: subnet1
        defaultRoute: true
        fixedIP: 192.168.122.102
      - name: InternalApi
        subnetName: subnet1
      - name: Storage
        subnetName: subnet1
      - name: Tenant
        subnetName: subnet1
  ...
Baremetal:
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm-ipam
spec:
  nodes:
  ...
    edpm-compute-2:
      hostName: edpm-compute-2
  ...
To deploy on the additional nodes, a new OpenStackDataPlaneDeployment CR should be created with the OpenStackDataPlaneNodeSet in the nodeSets section.
New OpenStackDataPlaneDeployment:
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: new-deployment # Do not re-use the name from previous OpenStackDataPlaneDeployment
spec:
 nodeSets:
   - openstack-edpm-ipam # scaled out nodeset name
Before applying the new OpenStackDataPlaneDeployment CR, verify that the OpenStackDataPlaneNodeSet in the nodeSets section has reached the SetupReady status.
$ oc wait openstackdataplanenodeset openstack-edpm-ipam --for condition=SetupReady --timeout=10m

Once the deployment completes, the OpenStackDataPlaneNodeSet is ready, as shown below.

$ oc get openstackdataplanenodeset openstack-edpm-ipam
NAME                  STATUS   MESSAGE
openstack-edpm-ipam   True     NodeSet Ready

If the scaled out nodes are compute nodes, once the OpenStackDataPlaneNodeSet reaches NodeSet Ready, nova-manage cell_v2 discover_hosts should be run to make the new compute nodes show up in the hypervisor list and become usable.

$ oc rsh nova-cell0-conductor-0 nova-manage cell_v2 discover_hosts --verbose
Found 2 cell mappings.
Skipping cell0 since it does not contain hosts.
Getting computes from cell 'cell1': 75f666d2-922d-45af-8bdb-d897a1bc9b1c
Checking host mapping for compute host 'edpm-compute-2': 6afda7af-2953-4400-842c-a327a0e43a74
Creating host mapping for compute host 'edpm-compute-2': 6afda7af-2953-4400-842c-a327a0e43a74
Found 1 unmapped computes in cell: 75f666d2-922d-45af-8bdb-d897a1bc9b1c

$ oc rsh openstackclient openstack hypervisor list
+--------------------------------------+-------------------------------------+-----------------+-----------------+-------+
| ID                                   | Hypervisor Hostname                 | Hypervisor Type | Host IP         | State |
+--------------------------------------+-------------------------------------+-----------------+-----------------+-------+
| cc05372a-27bd-4b33-985e-b0009c9e515e | edpm-compute-1.ctlplane.example.com | QEMU            | 192.168.221.101 | up    |
| 5e3f7b5d-39fd-430c-80d1-084086bdccde | edpm-compute-0.ctlplane.example.com | QEMU            | 192.168.221.100 | up    |
| 6afda7af-2953-4400-842c-a327a0e43a74 | edpm-compute-2.ctlplane.example.com | QEMU            | 192.168.221.102 | up    |
+--------------------------------------+-------------------------------------+-----------------+-----------------+-------+

Scaling Out with different configuration

If the deployment needs to be scaled out to nodes that require a different configuration (e.g. different kernel args, network config, or openstack config) than the one in the current OpenStackDataPlaneNodeSet, the new nodes cannot be added to the existing OpenStackDataPlaneNodeSet. Instead, create a new OpenStackDataPlaneNodeSet that contains the new nodes and their configuration. Then create a new OpenStackDataPlaneDeployment that points to both the existing and the new OpenStackDataPlaneNodeSets to trigger the scale out.

If only the new OpenStackDataPlaneNodeSet is included in the new OpenStackDataPlaneDeployment, the scale out appears to succeed but is incomplete, and VM move operations between nodes in the different OpenStackDataPlaneNodeSets will fail.
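
A deployment for this case could be sketched as follows. The deployment name and the name of the second nodeset are hypothetical; openstack-edpm-ipam is the existing nodeset from the earlier examples.

```yaml
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: scale-out-new-config     # hypothetical; do not re-use a previous deployment name
spec:
  nodeSets:
    - openstack-edpm-ipam        # existing nodeset
    - openstack-edpm-ipam-new    # hypothetical new nodeset with the different configuration
```

Listing both nodesets is what lets the deployment update the existing nodes as well as the new ones.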

Scaling In

The procedure for removing edpm nodes from the dataplane involves some manual cleanup steps after the workload has been evacuated.

To remove edpm compute nodes, perform the following steps.

Disable nova-compute service
$ oc rsh openstackclient openstack compute service list
+--------------------------------------+----------------+------------------------+----------+---------+-------+----------------------------+
| ID                                   | Binary         | Host                   | Zone     | Status  | State | Updated At                 |
+--------------------------------------+----------------+------------------------+----------+---------+-------+----------------------------+
| 11105d9b-9ef7-4d6f-8d17-6eb8db175d76 | nova-conductor | nova-cell1-conductor-0 | internal | enabled | up    | 2024-02-01T03:59:42.000000 |
| 31e2ee14-a124-4e02-b11d-87c2cdca3c56 | nova-compute   | edpm-compute-1         | nova     | enabled | up    | 2024-02-01T03:59:38.000000 |
| bd031e6e-89d8-4839-b345-5f124ec4c07e | nova-compute   | edpm-compute-0         | nova     | enabled | up    | 2024-02-01T03:59:37.000000 |
| f70912f9-eaaa-4caa-906f-a38e20667af4 | nova-compute   | edpm-compute-2         | nova     | enabled | up    | 2024-02-01T03:59:38.000000 |
| 8a4622c3-0fb8-498a-81d8-a9c23c0be5fc | nova-conductor | nova-cell0-conductor-0 | internal | enabled | up    | 2024-02-01T03:59:37.000000 |
| 5ad386ec-ac2d-4238-a671-d9402432d326 | nova-scheduler | nova-scheduler-0       | internal | enabled | up    | 2024-02-01T03:59:38.000000 |
+--------------------------------------+----------------+------------------------+----------+---------+-------+----------------------------+

$ oc rsh openstackclient openstack compute service set edpm-compute-2 nova-compute --disable
Stop ovn and nova-compute containers

SSH to the edpm node to be removed and stop the containers.

$ ssh -i out/edpm/ansibleee-ssh-key-id_rsa cloud-admin@192.168.221.102

[cloud-admin@edpm-compute-2 ~]$ sudo systemctl stop edpm_ovn_controller

[cloud-admin@edpm-compute-2 ~]$ sudo systemctl stop edpm_ovn_metadata_agent

[cloud-admin@edpm-compute-2 ~]$ sudo systemctl stop edpm_nova_compute
Delete network agents

Delete the agents for the compute nodes to be removed.

$ oc rsh openstackclient openstack network agent list

+--------------------------------------+------------------------------+----------------+-------------------+-------+-------+----------------+
| ID                                   | Agent Type                   | Host           | Availability Zone | Alive | State | Binary         |
+--------------------------------------+------------------------------+----------------+-------------------+-------+-------+----------------+
| d2b9e5d0-a406-41c2-9bc3-e74aaf113450 | OVN Controller Gateway agent | worker-0       |                   | :-)   | UP    | ovn-controller |
| 9529e28e-522e-48f6-82e2-c5caf1cf5a14 | OVN Controller Gateway agent | worker-1       |                   | :-)   | UP    | ovn-controller |
| 91bd4981-1e81-4fe8-b628-8581add36f13 | OVN Controller agent         | edpm-compute-1 |                   | :-)   | UP    | ovn-controller |
| bdc1dd13-586f-4553-90d6-14348f6be150 | OVN Controller agent         | edpm-compute-0 |                   | :-)   | UP    | ovn-controller |
| f7bb5520-27df-470b-9566-0aa7e5fef583 | OVN Controller agent         | edpm-compute-2 |                   | :-)   | UP    | ovn-controller |
+--------------------------------------+------------------------------+----------------+-------------------+-------+-------+----------------+

$ oc rsh openstackclient openstack network agent delete f7bb5520-27df-470b-9566-0aa7e5fef583
Delete nova-compute service

Delete nova-compute service for the removed node.

$ oc rsh openstackclient openstack compute service delete f70912f9-eaaa-4caa-906f-a38e20667af4

$ oc rsh openstackclient openstack hypervisor list
+--------------------------------------+-------------------------------------+-----------------+-----------------+-------+
| ID                                   | Hypervisor Hostname                 | Hypervisor Type | Host IP         | State |
+--------------------------------------+-------------------------------------+-----------------+-----------------+-------+
| cc05372a-27bd-4b33-985e-b0009c9e515e | edpm-compute-1.ctlplane.example.com | QEMU            | 192.168.221.101 | up    |
| 5e3f7b5d-39fd-430c-80d1-084086bdccde | edpm-compute-0.ctlplane.example.com | QEMU            | 192.168.221.100 | up    |
+--------------------------------------+-------------------------------------+-----------------+-----------------+-------+
Patch OpenStackDataPlaneNodeSet to remove node

Once the cleanup is complete, patch OpenStackDataPlaneNodeSet CR to remove the nodes from the nodes section.

$ oc patch openstackdataplanenodeset/openstack-edpm --type json --patch '[{ "op": "remove", "path": "/spec/nodes/edpm-compute-2" }]'
openstackdataplanenodeset.dataplane.openstack.org/openstack-edpm patched

For a baremetal provisioned node, this starts de-provisioning the removed node.

$ oc get bmh
NAME         STATE            CONSUMER              ONLINE   ERROR   AGE
compute-01   provisioned      openstack-edpm        true             2d21h
compute-02   provisioned      openstack-edpm        true             2d21h
compute-03   deprovisioning                         false            43h

Scaling In by removing a NodeSet

If a full OpenStackDataPlaneNodeSet has to be removed, perform the steps above (disable the nova-compute services, stop the ovn and nova-compute containers on the nodes, delete the network agents, and delete the nova-compute services) for each compute node. The OpenStackDataPlaneNodeSet CR can then be deleted. If this OpenStackDataPlaneNodeSet is the only one listing the ssh-known-hosts service, then this service needs to be added to one or more of the remaining OpenStackDataPlaneNodeSets. To remove the ssh host keys of the removed nodes from the other nodes, create a new OpenStackDataPlaneDeployment that points to all the remaining OpenStackDataPlaneNodeSets.
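
The two follow-up changes can be sketched as below. The resource names are hypothetical, the services field name is an assumption, and the nodeset is shown only as a fragment that highlights the added service.

```yaml
# Fragment of a remaining nodeset: add ssh-known-hosts to its services list
# if the deleted nodeset was the only one that listed it.
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm-remaining   # hypothetical remaining nodeset
spec:
  services:
    # Existing services for this nodeset stay in place (omitted here).
    - ssh-known-hosts
---
# New deployment that points to all remaining nodesets, so that the ssh host
# keys of the removed nodes are cleaned up on the other nodes.
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: remove-nodeset-cleanup     # hypothetical; do not re-use a previous name
spec:
  nodeSets:
    - openstack-edpm-remaining
```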

AnsibleEE runner variables

A number of variables can be used to modify the behavior of the AnsibleEE runner executing a given job, such as timeouts and caching. These are expanded in the env/settings file and are further documented in the Ansible Runner docs.

All of these variables are set to sensible defaults. Nevertheless, some changes may be necessary depending on the particulars of individual deployments.

Ansible variables

The list of ansible variables that can be set under ansibleVars is extensive. To understand what variables are available for each service, see the documentation in the Create OpenStackDataPlaneServices section.

Common configurations that can be enabled with ansibleVars are also documented at Common Configurations.

In the case of ansibleVars, the value set on an individual node is merged with the value from the nodeTemplate. As a result, the entire value of ansibleVars from the nodeTemplate does not need to be reproduced for each node just to set a few node-specific values.
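
The merge can be pictured with the following sketch. The variable names and values are hypothetical, and the assumption that the node-level value wins when a key appears in both places is inferred from the description above rather than stated by it.

```yaml
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm-ipam
spec:
  nodeTemplate:
    ansible:
      ansibleVars:
        # Shared by every node in the nodeset (hypothetical variables).
        edpm_config_var1: value1
        edpm_config_var2: value2
  nodes:
    edpm-compute-0:
      hostName: edpm-compute-0
      ansible:
        ansibleHost: 192.168.122.100
        ansibleVars:
          # Only the node-specific value is listed; edpm_config_var1 is
          # still taken from the nodeTemplate.
          edpm_config_var2: node-specific-value
```

Under that assumption, edpm-compute-0 would end up with edpm_config_var1 set to value1 and edpm_config_var2 set to node-specific-value.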

Importing ansible variables

ansibleVarsFrom allows you to set ansible variables for an OpenStackDataPlaneNodeSet by referencing either a ConfigMap or a Secret. When you use ansibleVarsFrom, all the key-value pairs in the referenced ConfigMap or Secret are set as ansible variables for the OpenStackDataPlaneNodeSet. You can also specify a common prefix string, which is prepended to each key.

Example:

Adding ansible variables from ConfigMap:

  1. Create a ConfigMap containing the ansible variables

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: common-edpm-vars
    data:
      edpm_config_var1: value1
      edpm_config_var2: value2
  2. Update the ansibleVarsFrom with the ConfigMap name

    ansibleVarsFrom:
      - configMapRef:
            name: common-edpm-vars
Example:

Execute subscription-manager register from corresponding Secret

  1. Create a Secret containing the credentials

    apiVersion: v1
    kind: Secret
    metadata:
      name: subscription-manager
    data:
      username: <base64 encoded username>
      password: <base64 encoded password>
  2. Update the ansibleVarsFrom with the Secret name, and ansibleVars with the variables generated from the Secret

    ansibleVarsFrom:
      - prefix: subscription_manager_
        secretRef:
          name: subscription-manager
    ansibleVars:
        edpm_bootstrap_command: |
          subscription-manager register --username {{ subscription_manager_username }} --password {{ subscription_manager_password }}

    If the same key is defined in both ansibleVars and ansibleVarsFrom, the value defined in ansibleVars takes precedence.
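
As a small sketch of that precedence rule, re-using the common-edpm-vars ConfigMap from the first example (the overriding value is hypothetical):

```yaml
ansibleVarsFrom:
  - configMapRef:
      # Provides edpm_config_var1: value1 and edpm_config_var2: value2.
      name: common-edpm-vars
ansibleVars:
  # Duplicate key: this value is used instead of the one from the ConfigMap.
  edpm_config_var2: overridden-value
```

Here edpm_config_var1 would keep the value from the ConfigMap, while edpm_config_var2 would be set to overridden-value.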

Common Configurations

This page documents some of the common configurations that can be enabled through ansible variables. The ansible variables that affect the configuration of the ansible executions are set in the ansibleVars field on the dataplane resources.

The full set of ansible variables available for configuration are documented within each role in the edpm-ansible repository.

Initial bootstrap command

Variable: edpm_bootstrap_command Type: string Role: edpm_bootstrap

The edpm_bootstrap_command variable can be used to pass one or more shell commands that will be executed as early as possible in the deployment as part of the configure-network service. If the services list is customized with services that execute prior to configure-network, then the commands specified by edpm_bootstrap_command run after those custom services.

The string value for edpm_bootstrap_command is passed directly to the ansible shell. As such, when using multiple shell commands, the | character must be used to preserve new lines in the YAML value:

edpm_bootstrap_command: |
    command 1
    command 2
    etc.
Using edpm_bootstrap_command for system registration

edpm_bootstrap_command can be used to perform system registration in order to enable needed package repositories. Choose a registration method (either Portal or Satellite) and refer to the provided links below for instructions to create the registration commands.

Red Hat Customer Portal registration

The registration commands for the Red Hat Customer Portal are documented at https://access.redhat.com/solutions/253273.

Red Hat Satellite registration

If you are not using Satellite version 6.13, refer to the documentation for the version of Satellite that is in use.

Customizing container image locations

The container images used by the various roles from edpm-ansible can pull from customized locations. The ansible variables used to set the locations and their default values are:

edpm_iscsid_image: quay.io/podified-antelope-centos9/openstack-iscsid
edpm_logrotate_crond_image: quay.io/podified-antelope-centos9/openstack-cron
edpm_ovn_controller_agent_image: quay.io/podified-antelope-centos9/openstack-ovn-controller
edpm_frr_image: quay.io/podified-antelope-centos9/openstack-frr
edpm_ovn_bgp_agent_image: quay.io/podified-antelope-centos9/openstack-ovn-bgp-agent
edpm_ovn_bgp_agent_local_ovn_nb_db_image: quay.io/podified-antelope-centos9/openstack-ovn-nb-db-server
edpm_ovn_bgp_agent_local_ovn_sb_db_image: quay.io/podified-antelope-centos9/openstack-ovn-sb-db-server
edpm_ovn_bgp_agent_local_ovn_northd_image: quay.io/podified-antelope-centos9/openstack-ovn-northd
edpm_ovn_bgp_agent_local_ovn_controller_image: quay.io/podified-antelope-centos9/openstack-ovn-controller
edpm_telemetry_node_exporter_image: quay.io/prometheus/node-exporter
edpm_telemetry_kepler_image: quay.io/sustainable_computing_io/kepler
edpm_telemetry_ceilometer_compute_image: quay.io/podified-antelope-centos9/openstack-ceilometer-compute
edpm_telemetry_ceilometer_ipmi_image: quay.io/podified-antelope-centos9/openstack-ceilometer-ipmi
edpm_nova_compute_image: quay.io/podified-antelope-centos9/openstack-nova-compute
edpm_neutron_sriov_image: quay.io/podified-antelope-centos9/openstack-neutron-sriov-agent
edpm_multipathd_image: quay.io/podified-antelope-centos9/openstack-multipathd
edpm_neutron_dhcp_image: quay.io/podified-antelope-centos9/openstack-neutron-dhcp-agent
edpm_neutron_metadata_agent_image: quay.io/podified-antelope-centos9/openstack-neutron-metadata-agent-ovn
edpm_neutron_ovn_agent_image: quay.io/podified-antelope-centos9/openstack-neutron-ovn-agent
edpm_swift_proxy_image: quay.io/podified-antelope-centos9/openstack-swift-proxy-server
edpm_swift_account_image: quay.io/podified-antelope-centos9/openstack-swift-account
edpm_swift_container_image: quay.io/podified-antelope-centos9/openstack-swift-container
edpm_swift_object_image: quay.io/podified-antelope-centos9/openstack-swift-object

Set any of the above ansible variables within the ansibleVars sections of OpenStackDataPlaneNodeSet to customize the container image locations.
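
For instance, pulling two of the images from a local mirror might be sketched like this. The registry host registry.example.com and the mirror path are hypothetical, and the variables are placed under nodeTemplate so that they apply to every node in the nodeset.

```yaml
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm-ipam
spec:
  nodeTemplate:
    ansible:
      ansibleVars:
        # registry.example.com/mirror is a hypothetical local mirror.
        edpm_nova_compute_image: registry.example.com/mirror/openstack-nova-compute
        edpm_iscsid_image: registry.example.com/mirror/openstack-iscsid
```

Variables that are not overridden keep their default locations.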

Network Isolation

Network Isolation refers to the practice of separating network traffic by function, and configuring the networks on dataplane nodes. Nodes will need connectivity to various control plane services running on OCP. These services may be bound to different networks. Each of those networks needs to be configured as required on dataplane nodes.

For further details on the network architecture of the control plane, see https://github.com/openstack-k8s-operators/docs/blob/main/networking.md.

Configuring networking with edpm_network_config

The edpm_network_config ansible role is responsible for configuring networking on dataplane nodes.

The edpm_network_config_template variable specifies the contents of a jinja2 template that describes the networking configuration to be applied. The template itself also contains variables that can be used to customize the networking configuration for a specific node (IP addresses, interface names, routes, etc). See template examples provided in the nic-config-samples directory: https://github.com/openstack-k8s-operators/openstack-operator/tree/main/config/samples/nic-config-samples.

These samples can be used inline within the OpenStackDataPlaneNodeSet CR under the ansibleVars section (see our current sample files for examples of the inline implementation).

The following is an example ansibleVars field that shows defining the variables that configure the edpm_network_config role.

ansibleVars:
  ctlplane_ip: 192.168.122.100
  internalapi_ip: 172.17.0.100
  storage_ip: 172.18.0.100
  tenant_ip: 172.19.0.100
  fqdn_internalapi: edpm-compute-0.example.com
  edpm_network_config_template: |
    ---
    {% set mtu_list = [ctlplane_mtu] %}
    {% for network in nodeset_networks %}
    {{ mtu_list.append(lookup('vars', networks_lower[network] ~ '_mtu')) }}
    {%- endfor %}
    {% set min_viable_mtu = mtu_list | max %}
    network_config:
    - type: ovs_bridge
      name: {{ neutron_physical_bridge_name }}
      mtu: {{ min_viable_mtu }}
      use_dhcp: false
      dns_servers: {{ ctlplane_dns_nameservers }}
      domain: {{ dns_search_domains }}
      addresses:
      - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
      routes: {{ ctlplane_host_routes }}
      members:
      - type: interface
        name: nic1
        mtu: {{ min_viable_mtu }}
        # force the MAC address of the bridge to this interface
        primary: true
    {% for network in nodeset_networks %}
      - type: vlan
        mtu: {{ lookup('vars', networks_lower[network] ~ '_mtu') }}
        vlan_id: {{ lookup('vars', networks_lower[network] ~ '_vlan_id') }}
        addresses:
        - ip_netmask:
            {{ lookup('vars', networks_lower[network] ~ '_ip') }}/{{ lookup('vars', networks_lower[network] ~ '_cidr') }}
        routes: {{ lookup('vars', networks_lower[network] ~ '_host_routes') }}
    {% endfor %}

This configuration would be applied by the configure-network service when it’s executed.

Network attachment definitions

The NetworkAttachmentDefinition resource is used to describe how pods can be attached to different networks. Network attachment definitions can be specified on the OpenStackDataPlaneNodeSet resource using the NetworkAttachments field.

The network attachments are used to describe which networks will be connected to the pod that is running ansible-runner. They do not enable networks on the dataplane nodes themselves. For example, adding the internalapi network attachment to NetworkAttachments means the ansible-runner pod will be connected to the internalapi network. This can enable scenarios where ansible needs to connect to different networks.
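
A sketch of the internalapi scenario described above follows. The networkAttachments key is the assumed YAML spelling of the NetworkAttachments field, and the listed names are assumed to match NetworkAttachmentDefinition resources that already exist in the namespace.

```yaml
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm-ipam
spec:
  # Networks attached to the ansible-runner pod only; this does not
  # configure any network on the dataplane nodes themselves.
  networkAttachments:
    - ctlplane
    - internalapi
```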

Interacting with Ansible

When a dataplane service is executed during a deployment, a corresponding Kubernetes Job is created. This Job runs the ansible execution associated with the service.

During reconciliation a Job resource is created which in turn creates a Pod resource. The pod is started with an Ansible Execution Environment image, and runs ansible-runner.

Retrieving and inspecting Ansible Execution Jobs

The Kubernetes jobs are labelled with the name of the OpenStackDataPlaneDeployment. Jobs for each OpenStackDataPlaneDeployment can be seen by listing jobs by the label:

$ oc get job -l openstackdataplanedeployment=edpm-compute
NAME                                                 STATUS     COMPLETIONS   DURATION   AGE
bootstrap-edpm-compute-openstack-edpm-ipam           Complete   1/1           78s        25h
configure-network-edpm-compute-openstack-edpm-ipam   Complete   1/1           37s        25h
configure-os-edpm-compute-openstack-edpm-ipam        Complete   1/1           66s        25h
download-cache-edpm-compute-openstack-edpm-ipam      Complete   1/1           64s        25h
install-certs-edpm-compute-openstack-edpm-ipam       Complete   1/1           46s        25h
install-os-edpm-compute-openstack-edpm-ipam          Complete   1/1           57s        25h
libvirt-edpm-compute-openstack-edpm-ipam             Complete   1/1           2m37s      25h
neutron-metadata-edpm-compute-openstack-edpm-ipam    Complete   1/1           61s        25h
nova-edpm-compute-openstack-edpm-ipam                Complete   1/1           3m20s      25h
ovn-edpm-compute-openstack-edpm-ipam                 Complete   1/1           78s        25h
run-os-edpm-compute-openstack-edpm-ipam              Complete   1/1           33s        25h
ssh-known-hosts-edpm-compute                         Complete   1/1           19s        25h
telemetry-edpm-compute-openstack-edpm-ipam           Complete   1/1           2m5s       25h
validate-network-edpm-compute-openstack-edpm-ipam    Complete   1/1           16s        25h

Logs can be checked using oc logs -f job/<job-name>. For example, if we want to check the logs from the configure-network job:

 $ oc logs -f jobs/configure-network-edpm-compute-openstack-edpm-ipam | tail -n2
PLAY RECAP *********************************************************************
edpm-compute-0             : ok=22   changed=0    unreachable=0    failed=0    skipped=17   rescued=0    ignored=0

Controlling the Ansible execution

The ansible tags, skip-tags, and limit options can be specified for a deployment.

The fields in OpenStackDataPlaneDeployment that correspond to these options are:

ansibleTags
ansibleSkipTags
ansibleLimit

The syntax for each of these fields matches the syntax that ansible accepts on the command line for ansible-playbook and ansible-runner.

Example usage of these fields:

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-edpm
spec:
  ansibleTags: containers
  ansibleSkipTags: packages
  ansibleLimit: compute1*,compute2*

The above example translates to an ansible command with the following arguments:

--tags containers --skip-tags packages --limit compute1*,compute2*
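
The mapping from the three fields to command-line arguments can be sketched as a small shell function. This is only an illustration of the translation described above, not the operator's implementation; the function name build_ansible_args is hypothetical. Fields that are left empty are skipped.

```shell
# Hypothetical helper: translate ansibleTags, ansibleSkipTags, and
# ansibleLimit values into the equivalent ansible command-line arguments.
build_ansible_args() {
  args=""
  if [ -n "$1" ]; then args="$args --tags $1"; fi
  if [ -n "$2" ]; then args="$args --skip-tags $2"; fi
  if [ -n "$3" ]; then args="$args --limit $3"; fi
  # Strip the leading space left by the concatenation.
  printf '%s\n' "${args# }"
}

# Values taken from the example OpenStackDataPlaneDeployment above.
build_ansible_args "containers" "packages" "compute1*,compute2*"
```

Quoting the limit pattern keeps the local shell from expanding the wildcard characters before the value is used.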

Hashes

NodeSet Config Changes

The operator computes a hash of the inputs in the nodes and nodeTemplate sections of the OpenStackDataPlaneNodeSet spec. This hash is stored in the status.configHash field. If the current value of configHash differs from deployedConfigHash, a new OpenStackDataPlaneDeployment is needed to roll out the changes:

$ oc get osdpns -o yaml | yq '.items[0].status.configHash'
"n648hd6h88hc7h86hc7h568h585h79h5"

This field can be used to inform user decisions about when a new deployment is needed to reconcile the changes to the NodeSet.
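
The comparison between configHash and deployedConfigHash can be scripted. The following is a minimal sketch; the function name needs_redeploy is hypothetical, and the commented oc and yq commands assume a single NodeSet in the current namespace, as in the example above.

```shell
# Hypothetical helper: compare the current and last-deployed config hashes.
needs_redeploy() {
  config_hash="$1"
  deployed_hash="$2"
  if [ "$config_hash" != "$deployed_hash" ]; then
    echo "redeploy needed"
  else
    echo "up to date"
  fi
}

# On a live cluster, the two values could be read like this:
#   config=$(oc get osdpns -o yaml | yq '.items[0].status.configHash')
#   deployed=$(oc get osdpns -o yaml | yq '.items[0].status.deployedConfigHash')
#   needs_redeploy "$config" "$deployed"

# Demonstration with identical hashes, which reports "up to date".
needs_redeploy "n648hd6h88hc7h86hc7h568h585h79h5" "n648hd6h88hc7h86hc7h568h585h79h5"
```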

OpenStackDataPlaneNodeSet deployment hashes

Each OpenStackDataPlaneService can optionally have an associated list of ConfigMaps and Secrets that are mounted as file data into the OpenStackAnsibleEE job started to deploy that service. The ansible content then is able to consume those files as necessary. See Configuring a custom service for more details.

When an OpenStackDataPlaneDeployment succeeds, the computed hash of each ConfigMap and Secret for each OpenStackDataPlaneService that was deployed is saved on the status of each OpenStackDataPlaneNodeSet referenced by the OpenStackDataPlaneDeployment.

These hashes can be compared against the current hash of the ConfigMap or Secret to see if there is newer input data that has not been deployed to the OpenStackDataPlaneNodeSet. For example, if the hash of the nova-cell1-compute-config Secret in the OpenStackDataPlaneNodeSet status is different from the hash of nova-cell1-compute-config in the novacell/nova-cell1 status, then there is nova-compute control plane configuration data that needs to be deployed to the EDPM compute nodes.

For example, the following hashes are saved on the OpenStackDataPlaneNodeSet status after a typical deployment:

$ oc get openstackdataplanenodeset openstack-edpm -o yaml

<snip>
status:
  conditions:
    <snip>
  configMapHashes:
    ovncontroller-config: n655h5...
  secretHashes:
    neutron-dhcp-agent-neutron-config: n9ch5...
    neutron-ovn-metadata-agent-neutron-config: n588h...
    neutron-sriov-agent-neutron-config: n648h...
    nova-cell1-compute-config: n576h...
    nova-metadata-neutron-config: n56fh...

Using IPAM and Internal DNS Service

To use IPAM and the DNS service with the data plane, a NetConfig CR should exist with the required networks, subnets, and their allocation pools, and the DNS service should be enabled in the OpenStackControlPlane CR.

When using IPAM, networks for the Node/NodeSet can be defined in the OpenStackDataPlaneNodeSet CR either in the nodes or nodeTemplate section.

For a predictable IP address, add the network in the nodes section and set the desired address as fixedIP.

<snip>
    nodes:
      edpm-compute-0:
        hostName: edpm-compute-0
        ansible:
          ansibleHost: 192.168.122.100
        networks:
        - name: ctlplane
          subnetName: subnet1
          defaultRoute: true
          fixedIP: 192.168.122.100
        - name: internalapi
          subnetName: subnet1
        - name: storage
          subnetName: subnet1
        - name: tenant
          subnetName: subnet1
<snip>
-------
<snip>
    nodeTemplate:
      networks:
      - name: ctlplane
        subnetName: subnet1
        defaultRoute: true
      - name: internalapi
        subnetName: subnet1
      - name: storage
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1
<snip>

Relevant Status Conditions

The NodeSetIPReservationReady and NodeSetDNSDataReady conditions in the status conditions reflect the status of IPReservation and DNSData, as shown below.

$ oc get openstackdataplanenodeset openstack-edpm -o json | jq '.status.conditions[] | select(.type=="NodeSetIPReservationReady")'
{
  "lastTransitionTime": "2024-01-31T12:16:21Z",
  "message": "NodeSetIPReservationReady ready",
  "reason": "Ready",
  "status": "True",
  "type": "NodeSetIPReservationReady"
}

$ oc get openstackdataplanenodeset openstack-edpm-ipam -o json | jq '.status.conditions[] | select(.type=="NodeSetDNSDataReady")'
{
  "lastTransitionTime": "2024-01-31T12:16:21Z",
  "message": "NodeSetDNSDataReady ready",
  "reason": "Ready",
  "status": "True",
  "type": "NodeSetDNSDataReady"
}
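
Instead of inspecting the conditions with jq, a script can block until both conditions are met by using oc wait. This is a sketch under assumptions: the function name wait_for_ipam_conditions and the default timeout of 300s are illustrative choices, not part of the operator.

```shell
# Hypothetical helper: wait until both IPAM-related conditions are True
# on the given OpenStackDataPlaneNodeSet.
wait_for_ipam_conditions() {
  nodeset="$1"
  timeout="${2:-300s}"   # assumed default, adjust as needed
  for cond in NodeSetIPReservationReady NodeSetDNSDataReady; do
    oc wait openstackdataplanenodeset "$nodeset" \
      --for=condition="$cond" --timeout="$timeout" || return 1
  done
}
```

For example, wait_for_ipam_conditions openstack-edpm 120s returns a non-zero status if either condition is not met within two minutes.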

Hotfixing the data plane

You can update the OpenStack data plane when hotfix content is available. Hotfix content can be delivered as RPM packages or container images.

You apply a container hotfix to the data plane nodes by updating any running containers to run from container images where the hotfix content has been applied. Container hotfix content can be delivered as either RPMs or already updated container images.

How the software is installed on the data plane nodes determines which of the following methods you need to use to apply the hotfix content:

  • Node software was installed by using RPMs: Apply the hotfix to the RPM content.

  • Node software was installed by using container images: Apply the hotfix to the container content with either RPMs or container images.

Hotfixing the data plane RPM content

You install RPM hotfix content directly on to the data plane nodes.

Procedure
  1. Obtain the RPM hotfix content from the source and store it locally:

    $ mkdir -p <hotfix_id>/rpms
    $ cp /path/to/hotfix/*.rpm <hotfix_id>/rpms
    • Replace <hotfix_id> with a hotfix identifier such as a Jira issue, for example osprh-0000.

  2. Copy the RPM hotfix content to the affected data plane nodes:

    $ ssh <ssh_user>@<data_plane_node> mkdir -p /tmp/<hotfix_id>/rpms
    $ scp <hotfix_id>/rpms/*.rpm <ssh_user>@<data_plane_node>:/tmp/<hotfix_id>/rpms
    • Replace <ssh_user> with the SSH user name.

    • Replace <data_plane_node> with the hostname or IP for the data plane node.

    • Replace <hotfix_id> with a hotfix identifier such as a Jira issue, for example osprh-0000.

    Repeat this step for each data plane node that the hotfix must be applied to.

  3. Update the RPM hotfix content on the affected data plane nodes.

    $ ssh <ssh_user>@<data_plane_node>
    $ sudo dnf install -y /tmp/<hotfix_id>/rpms/*.rpm
    • Replace <ssh_user> with the SSH user name.

    • Replace <data_plane_node> with the hostname or IP for the data plane node.

    • Replace <hotfix_id> with a hotfix identifier such as a Jira issue, for example osprh-0000.

  4. Perform any additional custom steps that are detailed in the hotfix instructions to complete the application of the RPM hotfix content.
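
When several nodes need the same RPM hotfix, steps 2 and 3 can be wrapped in a loop. The following is a sketch only; the function name apply_rpm_hotfix is hypothetical, and it assumes that the hotfix RPMs are already stored locally in <hotfix_id>/rpms as described in step 1.

```shell
# Hypothetical helper: copy and install RPM hotfix content on each node.
# Usage: apply_rpm_hotfix <ssh_user> <hotfix_id> <data_plane_node>...
apply_rpm_hotfix() {
  ssh_user="$1"
  hotfix_id="$2"
  shift 2
  for node in "$@"; do
    # Create the target directory, copy the RPMs, then install them.
    ssh "$ssh_user@$node" mkdir -p "/tmp/$hotfix_id/rpms" || return 1
    scp "$hotfix_id"/rpms/*.rpm "$ssh_user@$node:/tmp/$hotfix_id/rpms" || return 1
    # The wildcard is quoted so that it is expanded on the remote node.
    ssh "$ssh_user@$node" "sudo dnf install -y /tmp/$hotfix_id/rpms/*.rpm" || return 1
  done
}
```

The function stops at the first failing node so that a partial failure is noticed before continuing. Any additional custom steps from the hotfix instructions still need to be performed separately.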

Hotfixing the data plane container content with RPMs

When container hotfix content is delivered as RPMs, you must update the container images manually.

Procedure
  1. From a RHEL workstation, server, or virtual machine, ensure the following packages are installed:

    • buildah

    • podman

  2. From a RHEL workstation, server, or virtual machine, collect the hotfix RPMs into a new directory:

    $ mkdir -p <hotfix_id>/rpms
    $ cp /path/to/hotfix/*.rpm <hotfix_id>/rpms
    • Replace <hotfix_id> with a hotfix identifier such as a Jira issue, for example osprh-0000.

  3. Create a container image tagged with your registry account details and a hotfix identifier:

    $ updated_container="<updated_container_registry>/<updated_container_project>/<container_image>:<hotfix_id>"
    $ container=$(buildah from <container_registry>/<container_project>/<container_image>:<container_tag>)
    $ buildah run --user root $container mkdir -p /<hotfix_id>/rpms
    $ buildah copy --user root $container <hotfix_id>/rpms/*.rpm /<hotfix_id>/rpms
    $ buildah run --user root $container sh -c 'rpm -F /<hotfix_id>/rpms/*.rpm'
    $ buildah commit $container $updated_container
    $ buildah push $updated_container
    • Replace <hotfix_id> with a hotfix identifier such as a Jira issue, for example osprh-0000.

    • Replace <updated_container_registry> with a container registry to serve the updated container image. The OCP internal container image registry can be used.

    • Replace <updated_container_project> with a container project to use for the updated container image.

    • Replace <container_project> with the container project for the container being updated.

    • Replace <container_registry> with the container registry for the container being updated.

    • Replace <container_image> with the container image being updated.

    • Replace <container_tag> with the container tag being updated.

      The values for <updated_container_registry> and <container_registry> can be the same. The values for <updated_container_project> and <container_project> can be the same. The container images will be differentiated based on the value of their tags.
  4. Hotfix the updated container image on the affected data plane nodes. Use the Hotfixing the data plane container content with images procedure to apply the hotfixed container image.

Hotfixing the data plane container content with images

When container hotfix content is delivered as images, the container processes need to be restarted to use the new images. This is accomplished by creating a new OpenStackDataPlaneDeployment.

Procedure
  1. Optional: Prepare the container hotfix image in a container registry where the image can be pulled by affected data plane nodes:

    $ podman pull <container_registry>/<container_project>/<container_image>:<container_tag>
    $ podman tag <container_registry>/<container_project>/<container_image>:<container_tag> <updated_container_registry>/<updated_container_project>/<container_image>:<container_tag>
    $ podman push <updated_container_registry>/<updated_container_project>/<container_image>:<container_tag>
    • Replace <container_registry> with the source registry for the hotfixed container image.

    • Replace <container_project> with the source project for the hotfixed container image.

    • Replace <container_image> with the hotfixed container image.

    • Replace <container_tag> with the tag for the hotfixed container image.

    • Replace <updated_container_registry> with a container registry to serve the hotfixed container image. You can use the OpenShift internal container image registry.

    • Replace <updated_container_project> with a container project to use for the hotfixed container image.

  2. Update the affected OpenStackDataPlaneNodeSet resources by customizing the container locations to the hotfixed container locations. For more information about how to set the hotfixed container locations, see Customizing container image locations.

  3. Create a new OpenStackDataPlaneDeployment resource that deploys the affected OpenStackDataPlaneNodeSet resources. For more information about how to create OpenStackDataPlaneDeployment resources, see Deploying the data plane.

    You can restrict the list of services for the OpenStackDataPlaneDeployment to only those affected by the hotfix by using the servicesOverride field. For more information, see Overriding services for the deployment.
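
The optional image preparation in step 1 can be wrapped in a small function when several hotfixed images must be copied to another registry. This is a sketch; the function name mirror_hotfix_image is hypothetical, and both arguments are full image references in the form used in step 1.

```shell
# Hypothetical helper: pull a hotfixed image, retag it for the serving
# registry, and push it there.
# Usage: mirror_hotfix_image <source_image_ref> <updated_image_ref>
mirror_hotfix_image() {
  src="$1"
  dst="$2"
  podman pull "$src" || return 1
  podman tag "$src" "$dst" || return 1
  podman push "$dst"
}
```

The function returns a non-zero status as soon as one of the podman commands fails, so it can be used safely in a loop over multiple images.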

Updating the data plane

You can perform a minor update of your OpenStack data plane environment to keep it updated with the latest packages and containers.

You must coordinate the minor update of the OpenStack data plane environment with a minor update of the control plane. OVN containers on the data plane nodes should not be updated until OVN containers on the control plane have been updated.

See OpenStackVersion and Open vSwitch update for more information.

Updating OVN on the data plane

You update OVN content (containers) on the data plane once OVN on the control plane has been updated.

Procedure
  1. Validate that OVN has been updated on the control plane.

    $ oc wait openstackversion <openstack_ctlplane_name> --for=condition=MinorUpdateOVNControlplane
    • Replace <openstack_ctlplane_name> with the name of the OpenStack control plane resource.

      The following example output shows the condition has been met:

      openstackversion.core.openstack.org/openstack-galera-network-isolation condition met
  2. Create an OpenStackDataPlaneDeployment CR and save it to a file named openstack-edpm-update.yaml on your workstation.

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: edpm-deployment-ipam-update
    spec:
      nodeSets:
        - openstack-edpm-ipam
        - <nodeSet_name>
        - ...
        - <nodeSet_name>
      servicesOverride:
        - ovn
    • Replace <nodeSet_name> with the names of the OpenStackDataPlaneNodeSet CRs that you want to include in your data plane minor update.

      The servicesOverride field is set to include only ovn because the ovn service must be updated first in isolation. If you use a custom service to manage ovn, then use that custom service name instead of ovn in servicesOverride. Additionally, if other custom services must be updated at the same time as ovn, then they can be included in servicesOverride as well.
  3. Save the openstack-edpm-update.yaml deployment file.

  4. Update the data plane:

    $ oc create -f openstack-edpm-update.yaml
  5. Verify that the data plane update deployment succeeded:

    $ oc get openstackdataplanedeployment
    NAME                          STATUS   MESSAGE
    edpm-deployment-ipam          True     Setup Complete
    edpm-deployment-ipam-update   True     Setup Complete

Once OVN has been updated on the data plane, the rest of the control plane minor update will automatically proceed. Once the control plane minor update is finished, the rest of the data plane can be updated.
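
The update deployment manifest from step 2 can also be generated from a list of NodeSet names instead of being edited by hand. The following is a sketch; the function name render_update_deployment is hypothetical, and the field layout is copied from the example manifest in the procedure.

```shell
# Hypothetical helper: print an OpenStackDataPlaneDeployment manifest that
# deploys a single service override on the given NodeSets.
# Usage: render_update_deployment <name> <service> <nodeSet_name>...
render_update_deployment() {
  name="$1"
  service="$2"
  shift 2
  cat <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: $name
spec:
  nodeSets:
EOF
  for nodeset in "$@"; do
    printf '    - %s\n' "$nodeset"
  done
  printf '  servicesOverride:\n    - %s\n' "$service"
}

# Print the manifest for the OVN update of one NodeSet.
render_update_deployment edpm-deployment-ipam-update ovn openstack-edpm-ipam
```

The output can be saved to a file or piped directly to oc create -f -. The same function can produce the manifest for the remaining services by passing update as the service name.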

Troubleshooting

See Troubleshooting data plane creation and deployment for troubleshooting any deployment failures.

Updating other services on the data plane

Once OVN has been updated on the control plane and data plane, and the rest of the control plane has completed updating, you update the rest of the services on the data plane.

Procedure
  1. Validate that the rest of the minor update has completed on the control plane.

    $ oc wait openstackversion <openstack_ctlplane_name> --for=condition=MinorUpdateControlplane
    • Replace <openstack_ctlplane_name> with the name of the OpenStack control plane resource.

      The following example output shows the condition has been met:

      openstackversion.core.openstack.org/openstack-galera-network-isolation condition met
  2. Create an OpenStackDataPlaneDeployment CR and save it to a file named openstack-edpm-update-services.yaml on your workstation.

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: edpm-deployment-ipam-update-services
    spec:
      nodeSets:
        - openstack-edpm-ipam
        - <nodeSet_name>
        - ...
        - <nodeSet_name>
      servicesOverride:
        - update
    • Replace <nodeSet_name> with the names of the OpenStackDataPlaneNodeSet CRs that you want to include in your data plane minor update.

      The servicesOverride field is set to include only update. The update service applies only the tasks needed to update the packages and containers on the EDPM nodes. When using custom services, include those here as well, or their equivalent custom services that apply the needed update tasks.
  3. Save the openstack-edpm-update-services.yaml deployment file.

  4. Update the data plane:

    $ oc create -f openstack-edpm-update-services.yaml
  5. Verify that the data plane update deployment succeeded:

    $ oc get openstackdataplanedeployment
    NAME                                   STATUS   MESSAGE
    edpm-deployment-ipam                   True     Setup Complete
    edpm-deployment-ipam-update            True     Setup Complete
    edpm-deployment-ipam-update-services   True     Setup Complete

Troubleshooting

See Troubleshooting data plane creation and deployment for troubleshooting any deployment failures.

Deploying OpenStack in a disconnected environment

Process

Deploying in disconnected environments can be achieved largely by following the OpenShift documentation for mirroring OLM Operators: https://docs.openshift.com/container-platform/4.16/installing/disconnected_install/installing-mirroring-installation-images.html#olm-mirror-catalog_installing-mirroring-installation-images

Technical Implementation

The details provided in this section are for informational purposes only. Users should not need to interact with anything additional after completing the above mentioned OLM mirroring process.

The openstack-operator contains a list of related images that will ensure all required images for the deployment are mirrored following the above OpenShift process. Once images are mirrored, the ImageContentSourcePolicy custom resource (CR) is created. This process results in a MachineConfig called 99-master-generated-registries being updated in the cluster. The 99-master-generated-registries MachineConfig contains a registries.conf file that is applied to all of the OpenShift nodes in the cluster.

In order for dataplane nodes to integrate cleanly with this process, openstack-operator checks for the existence of an ImageContentSourcePolicy. If one is found, it will read the registries.conf file from the 99-master-generated-registries MachineConfig. The openstack-operator will then set two variables in the Ansible inventory for the nodes.

edpm_podman_disconnected_ocp
edpm_podman_registries_conf

edpm_podman_disconnected_ocp is a boolean variable that is used to conditionally render registries.conf on the dataplane nodes during the deployment, while edpm_podman_registries_conf contains the contents of the registries.conf file that were acquired from the MachineConfig in the cluster. The contents of this file will be written to /etc/containers/registries.conf on each of the dataplane nodes. This ensures that the dataplane nodes are configured in a consistent manner with the OpenShift nodes.

Since this configuration file is lifted directly from OpenShift, the dataplane nodes also have the same requirements as OpenShift for images - such as using image digests rather than image tags. This can be seen in the Ansible inventory secret for each of the OpenStackDataPlaneNodeSet objects in the cluster. Using multipathd as an example:

        edpm_podman_registries_conf: |
          [...]
            [[registry]]
              prefix = ""
              location = "registry.redhat.io/rhoso/openstack-multipathd-rhel9"

              [[registry.mirror]]
                location = "quay-mirror-registry.example.net:8443/olm/rhoso-openstack-multipathd-rhel9"
                pull-from-mirror = "digest-only"
          [...]

Note that the pull-from-mirror parameter is set to digest-only. This means that any attempt by podman to pull an image by a digest will result in the image being pulled from the specified mirror.

Accordingly, image references in the OpenStackVersion CR are provided in the digest format, for example the multipathd image:

$ oc get openstackversion -o jsonpath='{.items[].status.containerImages.edpmMultipathdImage}'
"registry.redhat.io/rhoso/openstack-multipathd-rhel9@sha256:7df2e1ebe4ec6815173e49157848a63d28a64ffb0db8de6562c4633c0fbcdf3f"

Since all images are in the digest format for the OpenStackVersion resource, there is no additional action required by users to work in a disconnected environment.
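
If you customize image locations yourself, it can be useful to check that a reference uses the digest format before relying on the mirror configuration. The following is a minimal sketch; the function name is_digest_ref is hypothetical, and the check assumes a sha256 digest of 64 lowercase hexadecimal characters, as in the example above.

```shell
# Hypothetical helper: succeed only if the image reference ends in
# @sha256:<64 lowercase hex characters>.
is_digest_ref() {
  digest="${1##*@sha256:}"
  # If nothing was stripped, the reference has no @sha256: part.
  [ "$digest" != "$1" ] || return 1
  [ "${#digest}" -eq 64 ] || return 1
  case "$digest" in
    *[!0-9a-f]*) return 1 ;;
  esac
  return 0
}

ref="registry.redhat.io/rhoso/openstack-multipathd-rhel9@sha256:7df2e1ebe4ec6815173e49157848a63d28a64ffb0db8de6562c4633c0fbcdf3f"
if is_digest_ref "$ref"; then
  echo "digest reference"
else
  echo "not a digest reference"
fi
```

A reference that uses a tag instead of a digest fails the check, which indicates that it would not match a mirror configured with pull-from-mirror set to digest-only.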

Custom Resources

Sub Resources

OpenStackDataPlaneDeployment

OpenStackDataPlaneDeployment is the Schema for the openstackdataplanedeployments API. The OpenStackDataPlaneDeployment name must be a valid RFC1123 name as it is used in labels.

Field | Description | Scheme | Required
metadata | | metav1.ObjectMeta | false
spec | | OpenStackDataPlaneDeploymentSpec | false
status | | OpenStackDataPlaneDeploymentStatus | false

OpenStackDataPlaneDeploymentList

OpenStackDataPlaneDeploymentList contains a list of OpenStackDataPlaneDeployment

Field | Description | Scheme | Required
metadata | | metav1.ListMeta | false
items | | []OpenStackDataPlaneDeployment | true

OpenStackDataPlaneDeploymentSpec

OpenStackDataPlaneDeploymentSpec defines the desired state of OpenStackDataPlaneDeployment

Field | Description | Scheme | Required
nodeSets | NodeSets is the list of NodeSets deployed | []string | true
backoffLimit | BackoffLimit defines the maximum number of retried executions (defaults to 6). | *int32 | false
preserveJobs | PreserveJobs - do not delete jobs after they finish, for example to check logs. Default: true | bool | false
ansibleTags | AnsibleTags for ansible execution | string | false
ansibleLimit | AnsibleLimit for ansible execution | string | false
ansibleSkipTags | AnsibleSkipTags for ansible execution | string | false
ansibleExtraVars | AnsibleExtraVars for ansible execution | map[string]json.RawMessage | false
servicesOverride | ServicesOverride list | []string | false
deploymentRequeueTime | Time before the deployment is requeued in seconds | int | true
ansibleJobNodeSelector | AnsibleJobNodeSelector to target subset of worker nodes running the ansible jobs | map[string]string | false

OpenStackDataPlaneDeploymentStatus

OpenStackDataPlaneDeploymentStatus defines the observed state of OpenStackDataPlaneDeployment

Field | Description | Scheme | Required
nodeSetConditions | NodeSetConditions | map[string]condition.Conditions | false
ansibleEEHashes | AnsibleEEHashes | map[string]string | false
configMapHashes | ConfigMapHashes | map[string]string | false
secretHashes | SecretHashes | map[string]string | false
nodeSetHashes | NodeSetHashes | map[string]string | false
containerImages | ContainerImages | map[string]string | false
conditions | Conditions | condition.Conditions | false
observedGeneration | ObservedGeneration - the most recent generation observed for this Deployment. If the observed generation is less than the spec generation, then the controller has not processed the latest changes. | int64 | false
deployedVersion | DeployedVersion | string | false
deployed | Deployed | bool | false

OpenStackDataPlaneNodeSet

OpenStackDataPlaneNodeSet is the Schema for the openstackdataplanenodesets API. The OpenStackDataPlaneNodeSet name must be a valid RFC1123 name as it is used in labels.

Field | Description | Scheme | Required
metadata | | metav1.ObjectMeta | false
spec | | OpenStackDataPlaneNodeSetSpec | false
status | | OpenStackDataPlaneNodeSetStatus | false

OpenStackDataPlaneNodeSetList

OpenStackDataPlaneNodeSetList contains a list of OpenStackDataPlaneNodeSets

Field | Description | Scheme | Required
metadata | | metav1.ListMeta | false
items | | []OpenStackDataPlaneNodeSet | true

OpenStackDataPlaneNodeSetSpec

OpenStackDataPlaneNodeSetSpec defines the desired state of OpenStackDataPlaneNodeSet

Field | Description | Scheme | Required
baremetalSetTemplate | BaremetalSetTemplate Template for BaremetalSet for the NodeSet | baremetalv1.OpenStackBaremetalSetTemplateSpec | false
nodeTemplate | NodeTemplate - node attributes specific to nodes defined by this resource. These attributes can be overridden at the individual node level, else take their defaults from values in this section. | NodeTemplate | true
nodes | Nodes - Map of Node Names and node specific data. Values here override defaults in the upper level section. | map[string]NodeSection | true
env | Env is a list containing the environment variables to pass to the pod. Variables modifying behavior of AnsibleEE can be specified here. | []corev1.EnvVar | false
networkAttachments | NetworkAttachments is a list of NetworkAttachment resource names to pass to the ansibleee resource, which allows connecting the ansibleee runner to the given network | []string | false
services | Services list | []string | true
tags | Tags - Additional tags for NodeSet | []string | false
secretMaxSize | SecretMaxSize - Maximum size in bytes of a Kubernetes secret. This size is currently situated around 1 MiB (nearly 1 MB). | int | true
preProvisioned | PreProvisioned - Set to true if the nodes have been Pre Provisioned. | bool | false
tlsEnabled | TLSEnabled - Whether the node set has TLS enabled. | bool | true

OpenStackDataPlaneNodeSetStatus

OpenStackDataPlaneNodeSetStatus defines the observed state of OpenStackDataPlaneNodeSet

Field | Description | Scheme | Required
conditions | Conditions | condition.Conditions | false
deploymentStatuses | DeploymentStatuses | map[string]condition.Conditions | false
allHostnames | AllHostnames | map[string]map[infranetworkv1.NetNameStr]string | false
allIPs | AllIPs | map[string]map[infranetworkv1.NetNameStr]string | false
configMapHashes | ConfigMapHashes | map[string]string | false
secretHashes | SecretHashes | map[string]string | false
dnsClusterAddresses | DNSClusterAddresses | []string | false
containerImages | ContainerImages | map[string]string | false
ctlplaneSearchDomain | CtlplaneSearchDomain | string | false
configHash | ConfigHash - holds the current hash of the NodeTemplate and Node sections of the struct. This hash is used to determine when new Ansible executions are required to roll out config changes. | string | false
deployedConfigHash | DeployedConfigHash - holds the hash of the NodeTemplate and Node sections of the struct that was last deployed. This hash is used to determine when new Ansible executions are required to roll out config changes. | string | false
inventorySecretName | InventorySecretName Name of a secret containing the ansible inventory | string | false
observedGeneration | ObservedGeneration - the most recent generation observed for this NodeSet. If the observed generation is less than the spec generation, then the controller has not processed the latest changes. | int64 | false
deployedVersion | DeployedVersion | string | false

OpenStackDataPlaneService

OpenStackDataPlaneService is the Schema for the openstackdataplaneservices API. The OpenStackDataPlaneService name must be a valid RFC1123 name as it is used in labels.

Field | Description | Scheme | Required
metadata | | metav1.ObjectMeta | false
spec | | OpenStackDataPlaneServiceSpec | false
status | | OpenStackDataPlaneServiceStatus | false

OpenStackDataPlaneServiceList

OpenStackDataPlaneServiceList contains a list of OpenStackDataPlaneService

Field | Description | Scheme | Required
metadata | | metav1.ListMeta | false
items | | []OpenStackDataPlaneService | true

OpenStackDataPlaneServiceSpec

OpenStackDataPlaneServiceSpec defines the desired state of OpenStackDataPlaneService

Field | Description | Scheme | Required
dataSources | DataSources list of DataSource objects to mount as ExtraMounts for the OpenStackAnsibleEE | []DataSource | false
tlsCerts | TLSCerts tls certs to be generated | map[string]OpenstackDataPlaneServiceCert | false
playbookContents | PlaybookContents is an inline playbook contents that ansible will run on execution. | string | false
playbook | Playbook is a path to the playbook that ansible will run on this execution | string | false
role | Role is a path to the role that ansible will run on this execution | string | false
caCerts | CACerts - Secret containing the CA certificate chain | string | true
openStackAnsibleEERunnerImage | OpenStackAnsibleEERunnerImage image to use as the ansibleEE runner image | string | false
certsFrom | CertsFrom - Service name used to obtain TLSCert and CACerts data. If both CertsFrom and either TLSCert or CACerts is set, then those fields take precedence. | string | false
addCertMounts | AddCertMounts - Whether to add cert mounts | bool | true
deployOnAllNodeSets | DeployOnAllNodeSets - whether the service should be deployed across all nodesets. This will override the default target of a service play, setting it to all. | bool | false
containerImageFields | ContainerImageFields - list of container image fields names that this service deploys. The field names should match the ContainerImages struct field names from github.com/openstack-k8s-operators/openstack-operator/apis/core/v1beta1 | []string | false
edpmServiceType | EDPMServiceType - service type, which typically corresponds to one of the default service names (such as nova, ovn, etc). Also typically corresponds to the ansible role name (without the "edpm_" prefix) used to manage the service. If not set, will default to the OpenStackDataPlaneService name. | string | false

OpenStackDataPlaneServiceStatus

OpenStackDataPlaneServiceStatus defines the observed state of OpenStackDataPlaneService

Field | Description | Scheme | Required
conditions | Conditions | condition.Conditions | false

OpenstackDataPlaneServiceCert

OpenstackDataPlaneServiceCert defines the property of a TLS cert issued for a dataplane service

Field | Description | Scheme | Required
contents | Contents of the certificate. This is a list of strings for properties that are needed in the cert | []string | true
networks | Networks to include in SNI for the cert | []infranetworkv1.NetNameStr | false
issuer | Issuer is the label for the issuer to issue the cert. Only one issuer should have this label | string | false
keyUsages | KeyUsages to be added to the issued cert | []certmgrv1.KeyUsage | false
edpmRoleServiceName | EDPMRoleServiceName is the value of the _service_name variable from the edpm-ansible role where this certificate is used. For example if the certificate is for edpm_ovn from edpm-ansible, EDPMRoleServiceName must be ovn, which matches the edpm_ovn_service_name variable from the role. If not set, OpenStackDataPlaneService.Spec.EDPMServiceType is used. If OpenStackDataPlaneService.Spec.EDPMServiceType is not set, then OpenStackDataPlaneService.Name is used. | string | false

AnsibleEESpec

AnsibleEESpec is a specification of the ansible EE attributes

Field | Description | Scheme | Required
extraMounts | ExtraMounts containing files which can be mounted into an Ansible Execution Pod | []storage.VolMounts | false
env | Env is a list containing the environment variables to pass to the pod | []corev1.EnvVar | false
extraVars | ExtraVars for ansible execution | map[string]json.RawMessage | false
dnsConfig | DNSConfig for setting dnsservers | *corev1.PodDNSConfig | false
networkAttachments | NetworkAttachments is a list of NetworkAttachment resource names to pass to the ansibleee resource, which allows connecting the ansibleee runner to the given network | []string | true
openStackAnsibleEERunnerImage | OpenStackAnsibleEERunnerImage image to use as the ansibleEE runner image | string | false
ansibleTags | AnsibleTags for ansible execution | string | false
ansibleLimit | AnsibleLimit for ansible execution | string | false
ansibleSkipTags | AnsibleSkipTags for ansible execution | string | false
ServiceAccountName | ServiceAccountName specifies the service account that the ansible execution runs with. If not specified, it runs with the default serviceaccount | string | false

AnsibleOpts

AnsibleOpts defines a logical grouping of Ansible related configuration options.

Field | Description | Scheme | Required
ansibleUser | AnsibleUser SSH user for Ansible connection | string | true
ansibleHost | AnsibleHost SSH host for Ansible connection | string | false
ansibleVars | AnsibleVars for configuring ansible | map[string]json.RawMessage | false
ansibleVarsFrom | AnsibleVarsFrom is a list of sources to populate ansible variables from. Values defined by an AnsibleVars with a duplicate key take precedence. | []DataSource | false
ansiblePort | AnsiblePort SSH port for Ansible connection | int | false

ConfigMapEnvSource

ConfigMapEnvSource selects a ConfigMap to populate the environment variables with. The contents of the target ConfigMap’s Data field will represent the key-value pairs as environment variables.

Field | Description | Scheme | Required
optional | Specify whether the ConfigMap must be defined | *bool | false

DataSource

DataSource represents the source of a set of ConfigMaps/Secrets

Field | Description | Scheme | Required
prefix | An optional identifier to prepend to each key in the ConfigMap. Must be a C_IDENTIFIER. | string | false
configMapRef | The ConfigMap to select from | *ConfigMapEnvSource | false
secretRef | The Secret to select from | *SecretEnvSource | false

LocalObjectReference

LocalObjectReference contains enough information to let you locate the referenced object inside the same namespace.

Field Description Scheme Required

name

Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names

string

false

NodeSection

NodeSection defines the top level attributes inherited by nodes in the CR.

Field Description Scheme Required

networks

Networks - Instance networks

[]infranetworkv1.IPSetNetwork

false

bmhLabelSelector

BmhLabelSelector allows for a sub-selection of BaremetalHosts based on arbitrary labels for a node.

map[string]string

false

userData

UserData - node-specific user-data

*corev1.SecretReference

false

networkData

NetworkData - node-specific network-data

*corev1.SecretReference

false

ansible

Ansible is the group of Ansible related configuration options.

AnsibleOpts

false

hostName

HostName - node name

string

false

managementNetwork

ManagementNetwork - Name of network to use for management (SSH/Ansible)

string

false
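The NodeSection fields might be combined for a single node roughly as shown below. This sketch assumes that node entries live in a `nodes` mapping keyed by node name under the node set spec, which is not stated in this section. The node name, label, network name, and address are hypothetical. The `networks` list is omitted because the fields of IPSetNetwork are not described here.

```yaml
# Hypothetical per-node entry; the surrounding "nodes" mapping is an assumption.
nodes:
  example-compute-0:                 # hypothetical node key
    hostName: example-compute-0      # node name
    managementNetwork: ctlplane      # hypothetical network used for SSH/Ansible
    bmhLabelSelector:                # sub-selects BaremetalHosts by label
      example-label: example-value
    ansible:                         # AnsibleOpts for this node
      ansibleHost: 192.0.2.10
```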

NodeTemplate

NodeTemplate is a specification of the node attributes that override top level attributes.

Field Description Scheme Required

extraMounts

ExtraMounts contains files that can be mounted into an Ansible Execution Pod.

[]storage.VolMounts

false

networks

Networks - Instance networks

[]infranetworkv1.IPSetNetwork

false

userData

UserData - node-specific user-data

*corev1.SecretReference

false

networkData

NetworkData - node-specific network-data

*corev1.SecretReference

false

ansibleSSHPrivateKeySecret

AnsibleSSHPrivateKeySecret is the name of a secret containing the private SSH key used to connect to the node. The named secret must store the key under Secret.data.ssh-privatekey. See https://kubernetes.io/docs/concepts/configuration/secret/#ssh-authentication-secrets

string

true

managementNetwork

ManagementNetwork - Name of network to use for management (SSH/Ansible)

string

true

ansible

Ansible is the group of Ansible related configuration options.

AnsibleOpts

false
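A NodeTemplate that sets both required fields, together with a secret of the form described for ansibleSSHPrivateKeySecret, might look like the sketch below. The `nodeTemplate` parent key, the secret name, the network name, and the user name are assumptions for illustration, and the key material is left as a placeholder.

```yaml
# Hypothetical fragment; names and values are placeholders.
nodeTemplate:
  ansibleSSHPrivateKeySecret: example-ssh-key-secret   # required: name of the Secret below
  managementNetwork: ctlplane                          # required: network used for SSH/Ansible
  ansible:
    ansibleUser: cloud-admin
    ansiblePort: 22
---
# Secret holding the private key under data.ssh-privatekey.
apiVersion: v1
kind: Secret
metadata:
  name: example-ssh-key-secret
data:
  ssh-privatekey: <base64-encoded private key>         # placeholder, not a real key
```

Values set in the template apply to every node in the set unless a node overrides them with its own NodeSection fields.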

SecretEnvSource

SecretEnvSource selects a Secret to populate the environment variables with. The contents of the target Secret’s Data field will represent the key-value pairs as environment variables.

Field Description Scheme Required

optional

Specify whether the Secret must be defined

*bool

false