Adopting a Red Hat OpenStack Platform 17.1 deployment

1. Red Hat OpenStack Services on OpenShift Antelope adoption overview

Adoption is the process of migrating a OpenStack (OSP) control plane to Red Hat OpenStack Services on OpenShift Antelope, and then completing an in-place upgrade of the data plane. You can retain existing infrastructure investments and modernize your OSP deployment on a containerized OpenShift foundation. To ensure that you understand the entire adoption process and how to sufficiently prepare your OSP environment, review the prerequisites, adoption process, and post-adoption tasks.

Read the whole adoption guide before you start the adoption to ensure that you understand the procedure. Prepare the necessary configuration snippets for each OSP service in advance, and test the migration in a representative test environment before you apply it to production.

1.1. Adoption limitations

Before you proceed with the adoption, check which features are Technology Previews or unsupported.

Technology Preview

The following features are Technology Previews and have not been tested within the context of the Red Hat OpenStack Services on OpenShift (RHOSO) adoption:

FC-based drivers for Block Storage service (cinder)

The following Compute service (nova) features are Technology Previews:
NUMA-aware vswitches
PCI passthrough by flavor
SR-IOV trusted virtual functions
vGPU
Emulated virtual Trusted Platform Module (vTPM)
UEFI
AMD SEV
Direct download from Rados Block Device (RBD)
File-backed memory
Defining a custom inventory of resources in a YAML file, provider.yaml

Unsupported features

The adoption process does not support the following features:

Distributed Compute Node (DCN) architecture with storage services at remote or edge sites
DNS-as-a-service (designate)
Load-balancing service (octavia)
Adopting Border Gateway Protocol (BGP) environments to the RHOSO data plane
Adopting a Federal Information Processing Standards (FIPS) environment

1.2. Adoption prerequisites

Before you begin the adoption procedure, complete the following prerequisites:

Planning information

Review the Adoption limitations.
Review the OpenShift requirements, data plane node requirements, Compute node requirements, and so on. For more information, see Planning your deployment.
Review the adoption-specific networking requirements. For more information, see Configuring the network for the RHOSO deployment.
Review the adoption-specific storage requirements. For more information, see Storage requirements.
Review how to customize your deployed control plane with the services that are required for your environment. For more information, see Customizing the Red Hat OpenStack Services on OpenShift deployment.
Familiarize yourself with the following OCP concepts that are used during adoption:
Familiarize yourself with mapping RHOSO versions to OpenStack Operators and OpenStackVersion custom resources (CRs). For more information, see the Red Hat Knowledgebase article How RHOSO versions map to OpenStack Operators and OpenStackVersion CRs.

Back-up information

Back up your OpenStack (OSP) environment by using one of the following options:
- The Relax-and-Recover tool. For more information, see Backing up the undercloud and the control plane nodes by using the Relax-and-Recover tool in Backing up and restoring the undercloud and control plane nodes.
- The Snapshot and Revert tool. For more information, see Backing up your Red Hat OpenStack Platform cluster by using the Snapshot and Revert tool in Backing up and restoring the undercloud and control plane nodes.
- A third-party backup and recovery tool. For more information about certified backup and recovery tools, see the Red Hat Ecosystem Catalog.
Back up the configuration files from the OSP services and TripleO on your file system. For more information, see Pulling the configuration from a TripleO deployment.

Compute

Upgrade your Compute nodes to Red Hat Enterprise Linux 9.2. For more information, see Upgrading all Compute nodes to RHEL 9.2 in Framework for upgrades (16.2 to 17.1).
On your Compute hosts, the systemd-container package must be installed and the systemd-machined service must be running. For more information about how to verify that the package is installed and that the service is running, see Installing the systemd-container package on Compute hosts.

ML2/OVS

If you use the Modular Layer 2 plug-in with Open vSwitch mechanism driver (ML2/OVS), migrate it to the Modular Layer 2 plug-in with Open Virtual Networking (ML2/OVN) mechanism driver. For more information, see Migrating to the OVN mechanism driver.

Tools

The oc and podman command line tools are installed on your workstation.
Make sure to set the correct RHOSO project namespace in which to run commands.
```
$ oc project openstack
```

OSP release

The OSP cloud is updated to the 17.1.4 release or later. For more information, see Performing a minor update of Red Hat OpenStack Platform.

OSP hosts

All control plane and data plane hosts of the OSP cloud are up and running, and continue to run throughout the adoption procedure.

1.3. Guidelines for planning the adoption

When planning to adopt a Red Hat OpenStack Services on OpenShift (RHOSO) Antelope environment, consider the scope of the change. An adoption is similar in scope to a data center upgrade. Different firmware levels, hardware vendors, hardware profiles, networking interfaces, storage interfaces, and so on affect the adoption process and can cause changes in behavior during the adoption.

Review the following guidelines to adequately plan for the adoption and increase the chance that you complete the adoption successfully:

All commands in the adoption documentation are examples. Do not copy and paste the commands without understanding what the commands do.

To minimize the risk of an adoption failure, reduce the number of environmental differences between the staging environment and the production sites.
If the staging environment is not representative of the production sites or if a staging environment is not available, you must plan to include contingency time in case the adoption fails.
Review your custom OpenStack (OSP) service configuration at every major release.
- Every major release upgrades through multiple OpenStack releases.
- Each major release might deprecate configuration options or change the format of the configuration.
Prepare a Method of Procedure (MOP) that is specific to your environment to reduce the risk of variance or omitted steps when running the adoption process.
You can use representative hardware in a staging environment to prepare a MOP and validate any content changes.
- Include a cross-section of firmware versions, additional interface or device hardware, and any additional software in the representative staging environment to ensure that it is broadly representative of the variety that is present in the production environments.
- Ensure that you validate any Red Hat Enterprise Linux update or upgrade in the representative staging environment.
Use Satellite for localized and version-pinned RPM content where your data plane nodes are located.
In the production environment, use the content that you tested in the staging environment.

1.4. Adoption process overview

Familiarize yourself with the steps of the adoption process and the optional post-adoption tasks.

Main adoption process

Migrate TLS everywhere (TLS-e) to the Red Hat OpenStack Services on OpenShift (RHOSO) deployment.
Migrate your existing databases to the new control plane.
Adopt your Red Hat OpenStack Platform 17.1 control plane services to the new RHOSO 18.0 deployment.
Adopt the RHOSO 18.0 data plane.
Migrate the Object Storage service (swift) to the RHOSO nodes.
Migrate the Ceph cluster.

Post-adoption tasks

Optional: Run tempest to verify that the entire adoption process is working correctly. For more information, see Validating and troubleshooting the deployed cloud.
After adoption, Red Hat OpenStack Services on OpenShift (RHOSO) data plane nodes run Red Hat Enterprise Linux (RHEL) 9.2. The data plane nodes can remain on RHEL 9.2; however, you must perform a system update to use the full feature set from the release, and to align your environment with the maximum support lifecycle of RHOSO.
- You can perform a system update any time after you complete the adoption procedure.
- You can defer the system update to a separate maintenance window.
- You can perform the system update on one nodeset at a time. For example, you can update one nodeset from RHEL 9.2 to RHEL 9.4 in one maintenance window, and then update a different nodeset in another maintenance window later.
  
  For more information about updating your environment, see Updating your environment to the latest maintenance release.
Optional: Verify that you migrated all services from the Controller nodes, and then power off the nodes. If any services are still running in the Controller nodes, such as Open Virtual Networking (ML2/OVN), Object Storage service (swift), or Ceph, do not power off the nodes.
If you enabled the high availability for Compute instances (Instance HA) service, remove the Pacemaker components from your Compute nodes. For more information, see Enabling the high availability for Compute instances service.

1.5. Overview of Distributed Compute Node adoption

The process to adopt Distributed Compute Node (DCN) deployment from OpenStack (OSP) to Red Hat OpenStack Services on OpenShift (RHOSO) requires additional adoption tasks:

You must map a multi-stack deployment to multiple node sets.

You must map additional networking configurations.

Multi-stack to multi-node set mapping

In TripleO deployments, DCN environments use multiple Heat stacks:

The Central stack is templating for Controllers and central Compute nodes.

An edge stack is templating for Edge Compute nodes in a stack. There is one stack per DCN site.

When you perform an adoption, map TripleO stacks to OpenStackDataPlaneNodeSet custom resources (CRs):

Table 1. Mapping TripleO stacks to RHOSO nodesets
TripleO stack	RHOSO nodeset	Availability zone
Central stack (Compute role)	`openstack-edpm` or `openstack-cell1`	az-central
DCN1 stack (ComputeDcn1 role)	`openstack-edpm-dcn1` or `openstack-cell1-dcn1`	az-dcn1
DCN2 stack (ComputeDcn2 role)	`openstack-edpm-dcn2` or `openstack-cell1-dcn2`	az-dcn2

Keep all node sets in the same Nova cell to maintain unified scheduling through a shared cell. The default cell is cell1.

Key differences from standard adoption

The following table summarizes the differences between standard adoption and DCN adoption:

Table 2. Comparison of standard and DCN adoption
Aspect	Standard adoption	DCN adoption
Director stacks	Single stack	Multiple stacks (central + edge sites)
Network topology	Flat L2 networks	Routed L3 networks with multiple subnets
Data plane node sets	Single node set	Multiple node sets (one per site minimum)
Network routes	Usually not required	Required for inter-site connectivity
Physnets	Single physnet (e.g., `datacentre`)	Multiple physnets (e.g., `leaf0`, `leaf1`, `leaf2`)
Availability zones	Often single AZ	Multiple AZs (one per site)
OVN bridge mappings	Single mapping	Site-specific mappings
Provider networks	Single segment	Multi-segment routed provider networks

Requirements for DCN adoption

Before adopting a DCN deployment, ensure you have:

Network topology information for all sites (IP ranges, VLANs, gateways)
Inter-site routing configuration (routes between site subnets)
Mapping of TripleO roles to availability zones
OVN bridge mapping configuration for each site

The adoption of the control plane must complete before adopting any data plane nodes. However, once the control plane is adopted, the edge site data plane adoptions can proceed in parallel with the central site data plane adoption.

DCN Adoption workflow overview

The adoption of a Distributed Compute Node (DCN) deployment from OpenStack (OSP) to Red Hat OpenStack Services on OpenShift (RHOSO)

Control plane adoption: Adopt all control plane services from the central TripleO stack to the RHOSO control plane. This is identical to standard adoption.
Network configuration: Configure multi-subnet NetConfig and NetworkAttachmentDefinition CRs to support all site networks.
Data plane node set creation: Create separate OpenStackDataPlaneNodeSet CRs for each site, each with site-specific network configurations:
- Network subnet references
- OVN bridge mappings (physnets)
- Inter-site routing configuration
Data plane deployment: Deploy all node sets. The edge site node sets can be deployed in parallel after the central site control plane is adopted.

Additional resources

Configuring spine-leaf networks for the Red Hat OpenStack Services on OpenShift deployment

1.6. Installing the `systemd-container` package on Compute hosts

Before you adopt the Red Hat OpenStack Services on OpenShift (RHOSO) data plane, you must verify that the systemd-container package is installed and that systemd-machined is running on all the Compute hosts. You must install the systemd-container package on each Compute host that does not have this package.

Procedure

List the instances that are running on the host:

$ sudo machinectl list

Sample output

MACHINE                  CLASS SERVICE      OS VERSION ADDRESSES
qemu-1-instance-000000b9 vm    libvirt-qemu -  -       -
qemu-2-instance-000000c2 vm    libvirt-qemu -  -       -

2 machines listed.

Verify that the systemd-machined service is running:

$ sudo systemctl status systemd-machined.service

Sample output

systemd-machined.service - Virtual Machine and Container Registration Service
     Loaded: loaded (/usr/lib/systemd/system/systemd-machined.service; static)
     Active: active (running) since Mon 2025-06-16 11:42:07 EDT; 2min 48s ago
       Docs: man:systemd-machined.service(8)
             man:org.freedesktop.machine1(5)
   Main PID: 136614 (systemd-machine)
     Status: "Processing requests..."
      Tasks: 1 (limit: 838860)
     Memory: 1.4M
        CPU: 33ms
     CGroup: /system.slice/systemd-machined.service
             └─136614 /usr/lib/systemd/systemd-machined

Jun 16 11:42:07 computehost001 systemd[1]: Starting Virtual Machine and Container Registration Service...
Jun 16 11:42:07 computehost001 systemd[1]: Started Virtual Machine and Container Registration Service.
Jun 16 11:43:44 computehost001 systemd-machined[136614]: New machine qemu-1-instance-000000b9.
Jun 16 11:43:51 computehost001 systemd-machined[136614]: New machine qemu-2-instance-000000c2.

If the systemd-machined service is running, skip the rest of this procedure. Ensure that you verify that the systemd-machined service is running each Compute node host in the cluster.

If the systemd-machined service is not running, before you can install the systemd-container package, live migrate all virtual machines from the host. For more information about live migration, see Rebooting Compute nodes in Performing a minor update of Red Hat OpenStack Platform.
Install the systemd-container on the host:
- If you upgraded your environment from an earlier version of OpenStack, reboot the Compute host to automatically install the systemd-container.
- If you deployed a new RHOSO environment, install the systemd-container manually by using the following command. Rebooting the Compute host is not required:
  $ sudo dnf -y install systemd-container
  If your Compute host is not running a virtual machine, you can install the systemd-container automatically or manually.
Repeat this procedure on each Compute host in the cluster where the systemd-machined service is not running.

1.7. Identity service authentication

If you have custom policies enabled, complete the following steps for adoption:

Remove custom policies.
Run the adoption.
Re-add custom policies by using the new SRBAC syntax.

Red Hat does not support customized roles or policies. Syntax errors or misapplied authorization can negatively impact security or usability. If you need customized roles or policies in your production environment, contact Red Hat support for a support exception before you begin the adoption.

After you adopt a TripleO-based OpenStack deployment to a Red Hat OpenStack Services on OpenShift deployment, the Identity service performs user authentication and authorization by using Secure RBAC (SRBAC). If SRBAC is already enabled, then there is no change to how you perform operations. If SRBAC is disabled, then adopting a TripleO-based OpenStack deployment might change how you perform operations due to changes in API access policies.

For more information on SRBAC, see Secure role based access control in Red Hat OpenStack Services on OpenShift in Performing security operations.

1.8. Configuring the network for the Red Hat OpenStack Services on OpenShift deployment

When you adopt a new Red Hat OpenStack Services on OpenShift (RHOSO) deployment, you must align the network configuration with the adopted cluster to maintain connectivity for existing workloads.

Perform the following tasks to incorporate the existing network configuration:

Configure OpenShift worker nodes to align VLAN tags and IP Address Management (IPAM) configuration with the existing deployment.
Configure control plane services to use compatible IP ranges for service and load-balancing IP addresses.
Configure data plane nodes to use corresponding compatible configuration for VLAN tags and IPAM.

When configuring nodes and services, the general approach is as follows:

For IPAM, you can either reuse subnet ranges from the existing deployment or, if there is a shortage of free IP addresses in existing subnets, define new ranges for the new control plane services. If you define new ranges, you configure IP routing between the old and new ranges. For more information, see Planning your IPAM configuration.
For VLAN tags, always reuse the configuration from the existing deployment.

For more information about the network architecture and configuration, see Preparing networks for Red Hat OpenStack Services on OpenShift in Deploying Red Hat OpenStack Services on OpenShift and About networking in Networking.

1.8.1. Retrieving the network configuration from your existing deployment

You must determine which isolated networks are defined in your existing deployment. After you retrieve your network configuration, you have the following information:

A list of isolated networks that are used in the existing deployment.
For each of the isolated networks, the VLAN tag and IP ranges used for dynamic address allocation.
A list of existing IP address allocations that are used in the environment. When reusing the existing subnet ranges to host the new control plane services, these addresses are excluded from the corresponding allocation pools.

Procedure

Find the network configuration in the network_data.yaml file. For example:

- name: InternalApi
  mtu: 1500
  vip: true
  vlan: 20
  name_lower: internal_api
  dns_domain: internal.mydomain.tld.
  service_net_map_replace: internal
  subnets:
    internal_api_subnet:
      ip_subnet: '172.17.0.0/24'
      allocation_pools: [{'start': '172.17.0.4', 'end': '172.17.0.250'}]

Retrieve the VLAN tag that is used in the vlan key and the IP range in the ip_subnet key for each isolated network from the network_data.yaml file. When reusing subnet ranges from the existing deployment for the new control plane services, the ranges are split into separate pools for control plane services and load-balancer IP addresses.
Use the tripleo-ansible-inventory.yaml file to determine the list of IP addresses that are already consumed in the adopted environment. For each listed host in the file, make a note of the IP and VIP addresses that are consumed by the node. For example:
```
Standalone:
  hosts:
    standalone:
      ...
      internal_api_ip: 172.17.0.100
    ...
  ...
standalone:
  children:
    Standalone: {}
  vars:
    ...
    internal_api_vip: 172.17.0.2
    ...
```
In this example, the 172.17.0.2 and 172.17.0.100 values are consumed and are not available for the new control plane services until the adoption is complete.
Repeat this procedure for each isolated network and each host in the configuration.

1.8.2. Planning your IPAM configuration

In a Red Hat OpenStack Services on OpenShift (RHOSO) deployment, each service that is deployed on the OpenShift worker nodes requires an IP address from the IP Address Management (IPAM) pool. In a OpenStack (OSP) deployment, all services that are hosted on a Controller node share the same IP address.

The RHOSO control plane has different requirements for the number of IP addresses that are made available for services. Depending on the size of the IP ranges that are used in the existing RHOSO deployment, you might reuse these ranges for the RHOSO control plane.

The total number of IP addresses that are required for the new control plane services in each isolated network is calculated as the sum of the following:

The number of OCP worker nodes. Each worker node requires 1 IP address in the NodeNetworkConfigurationPolicy custom resource (CR).
The number of IP addresses required for the data plane nodes. Each node requires an IP address from the NetConfig CRs.
The number of IP addresses required for control plane services. Each service requires an IP address from the NetworkAttachmentDefinition CRs. This number depends on the number of replicas for each service.
The number of IP addresses required for load balancer IP addresses. Each service requires a Virtual IP address from the IPAddressPool CRs.

For example, a simple single worker node OCP deployment with Red Hat OpenShift Local has the following IP ranges defined for the internalapi network:

1 IP address for the single worker node
1 IP address for the data plane node
NetworkAttachmentDefinition CRs for control plane services: X.X.X.30-X.X.X.70 (41 addresses)
IPAllocationPool CRs for load balancer IPs: X.X.X.80-X.X.X.90 (11 addresses)

This example shows a total of 54 IP addresses allocated to the internalapi allocation pools.

The requirements might differ depending on the list of OSP services to be deployed, their replica numbers, and the number of OCP worker nodes and data plane nodes.

Additional IP addresses might be required in future OSP releases, so you must plan for some extra capacity for each of the allocation pools that are used in the new environment.

After you determine the required IP pool size for the new deployment, you can choose to define new IP address ranges or reuse your existing IP address ranges. Regardless of the scenario, the VLAN tags in the existing deployment are reused in the new deployment. Ensure that the VLAN tags are properly retained in the new configuration. For more information, see Configuring isolated networks.

Configuring new subnet ranges

If you are using IPv6, you can reuse existing subnet ranges in most cases. For more information about existing subnet ranges, see Reusing existing subnet ranges.

You can define new IP ranges for control plane services that belong to a different subnet that is not used in the existing cluster. Then you configure link local IP routing between the existing and new subnets to enable existing and new service deployments to communicate. This involves using the TripleO mechanism on a pre-adopted cluster to configure additional link local routes. This enables the data plane deployment to reach out to OpenStack (OSP) nodes by using the existing subnet addresses. You can use new subnet ranges with any existing subnet configuration, and when the existing cluster subnet ranges do not have enough free IP addresses for the new control plane services.

You must size the new subnet appropriately to accommodate the new control plane services. There are no specific requirements for the existing deployment allocation pools that are already consumed by the OSP environment.

Defining a new subnet for Storage and Storage management is not supported because Compute service (nova) and Ceph do not allow modifying those networks during adoption.

In the following procedure, you configure NetworkAttachmentDefinition custom resources (CRs) to use a different subnet from what is configured in the network_config section of the OpenStackDataPlaneNodeSet CR for the same networks. The new range in the NetworkAttachmentDefinition CR is used for control plane services, while the existing range in the OpenStackDataPlaneNodeSet CR is used to manage IP Address Management (IPAM) for data plane nodes.

The values that are used in the following procedure are examples. Use values that are specific to your configuration.

Procedure

Configure link local routes on the existing deployment nodes for the control plane subnets. This is done through TripleO configuration:
```
network_config:
  - type: ovs_bridge
    name: br-ctlplane
    routes:
    - ip_netmask: 0.0.0.0/0
      next_hop: 192.168.1.1
    - ip_netmask: 172.31.0.0/24
      next_hop: 192.168.1.100
```
- ip_netmask defines the new control plane subnet.
- next_hop defines the control plane IP address of the existing data plane node.
  
  Repeat this configuration for other networks that need to use different subnets for the new and existing parts of the deployment.

Apply the new configuration to every OSP node:

(undercloud)$ openstack overcloud network provision \
 --output  <deployment_file> \
[--templates <templates_directory>]/home/stack/templates/<networks_definition_file>

(undercloud)$ openstack overcloud node provision \
 --stack <stack> \
 --network-config \
 --output <deployment_file> \
[--templates <templates_directory>]/home/stack/templates/<node_definition_file>

Optional: Include the --templates option to use your own templates instead of the default templates located in /usr/share/openstack-tripleo-heat-templates. Replace <templates_directory> with the path to the directory that contains your templates.
Replace <stack> with the name of the stack for which the bare-metal nodes are provisioned. If not specified, the default is overcloud.
Include the --network-config optional argument to provide the network definitions to the cli-overcloud-node-network-config.yaml Ansible playbook. The cli-overcloud-node-network-config.yaml playbook uses the os-net-config tool to apply the network configuration on the deployed nodes. If you do not use --network-config to provide the network definitions, then you must configure the {{role.name}}NetworkConfigTemplate parameters in your network-environment.yaml file, otherwise the default network definitions are used.
Replace <deployment_file> with the name of the heat environment file to generate for inclusion in the deployment command, for example /home/stack/templates/overcloud-baremetal-deployed.yaml.

Replace <node_definition_file> with the name of your node definition file, for example, overcloud-baremetal-deploy.yaml. Ensure that the network_config_update variable is set to true in the node definition file.

Network configuration changes are not applied by default to avoid the risk of network disruption. You must enforce the changes by setting the StandaloneNetworkConfigUpdate: true in the TripleO configuration files.

Confirm that there are new link local routes to the new subnet on each node. For example:
```
# ip route | grep 172
172.31.0.0/24 via 192.168.122.100 dev br-ctlplane
```
You also must configure link local routes to existing deployment on Red Hat OpenStack Services on OpenShift (RHOSO) worker nodes. This is achieved by adding routes entries to the NodeNetworkConfigurationPolicy CRs for each network. For example:
```
  - destination: 192.168.122.0/24
    next-hop-interface: ospbr
```
- destination defines the original subnet of the isolated network on the data plane.
- next-hop-interface defines the OpenShift worker network interface that corresponds to the isolated network on the data plane.
  
  As a result, the following route is added to your OCP nodes:
  # ip route | grep 192 192.168.122.0/24 dev ospbr proto static scope link

Later, during the data plane adoption, in the network_config section of the OpenStackDataPlaneNodeSet CR, add the same link local routes for the new control plane subnet ranges. For example:

  nodeTemplate:
    ansible:
      ansibleUser: root
      ansibleVars:
        additional_ctlplane_host_routes:
        - ip_netmask: 172.31.0.0/24
          next_hop: '{{ ctlplane_ip }}'
        edpm_network_config_template: |
          network_config:
          - type: ovs_bridge
            routes: {{ ctlplane_host_routes + additional_ctlplane_host_routes }}
            ...

List the IP addresses that are used for the data plane nodes in the existing deployment as ansibleHost and fixedIP. For example:

  nodes:
    standalone:
      ansible:
        ansibleHost: 192.168.122.100
        ansibleUser: ""
      hostName: standalone
      networks:
      - defaultRoute: true
        fixedIP: 192.168.122.100
        name: ctlplane
        subnetName: subnet1

Do not change OSP node IP addresses during the adoption process. List previously used IP addresses in the fixedIP fields for each node entry in the nodes section of the OpenStackDataPlaneNodeSet CR.

Expand the SSH range for the firewall configuration to include both subnets to allow SSH access to data plane nodes from both subnets:
```
  edpm_sshd_allowed_ranges:
  - 192.168.122.0/24
  - 172.31.0.0/24
```
This provides SSH access from the new subnet to the OSP nodes as well as the OSP subnets.

Reusing existing subnet ranges

You can reuse existing subnet ranges if they have enough IP addresses to allocate to the new control plane services. You configure the new control plane services to use the same subnet as you used in the OpenStack (OSP) environment, and configure the allocation pools that are used by the new services to exclude IP addresses that are already allocated to existing cluster nodes. By reusing existing subnets, you avoid additional link local route configuration between the existing and new subnets.

If your existing subnets do not have enough IP addresses in the existing subnet ranges for the new control plane services, you must create new subnet ranges. For more information, see Using new subnet ranges.

No special routing configuration is required to reuse subnet ranges. However, you must ensure that the IP addresses that are consumed by OSP services do not overlap with the new allocation pools configured for Red Hat OpenStack Services on OpenShift control plane services.

If you are especially constrained by the size of the existing subnet, you may have to apply elaborate exclusion rules when defining allocation pools for the new control plane services. For more information, see Configuring isolated networks.

1.8.3. Configuring isolated networks

Before you begin replicating your existing VLAN and IPAM configuration in the Red Hat OpenStack Services on OpenShift (RHOSO) environment, you must have the following IP address allocations for the new control plane services:

1 IP address for each isolated network on each OpenShift worker node. You configure these IP addresses in the NodeNetworkConfigurationPolicy custom resources (CRs) for the OCP worker nodes. For more information, see Configuring OCP worker nodes.
1 IP range for each isolated network for the data plane nodes. You configure these ranges in the NetConfig CRs for the data plane nodes. For more information, see Configuring data plane nodes.
1 IP range for each isolated network for control plane services. These ranges enable pod connectivity for isolated networks in the NetworkAttachmentDefinition CRs. For more information, see Configuring the networking for control plane services.
1 IP range for each isolated network for load balancer IP addresses. These IP ranges define load balancer IP addresses for MetalLB in the IPAddressPool CRs. For more information, see Configuring the networking for control plane services.

The exact list and configuration of isolated networks in the following procedures should reflect the actual OpenStack environment. The number of isolated networks might differ from the examples used in the procedures. The IPAM scheme might also differ. Only the parts of the configuration that are relevant to configuring networks are shown. The values that are used in the following procedures are examples. Use values that are specific to your configuration.

Configuring isolated networks on OCP worker nodes

To connect service pods to isolated networks on OpenShift worker nodes that run OpenStack services, physical network configuration on the hypervisor is required.

This configuration is managed by the NMState operator, which uses NodeNetworkConfigurationPolicy custom resources (CRs) to define the desired network configuration for the nodes.

In IPv6, OpenShift worker nodes need a /64 prefix allocation due to OVN limitations (RFC 4291). For dynamic IPv6 configuration, you need to change the prefix allocation on the Router Advertisement settings. If you want to use manual configuration for IPv6, define a similar CR to the NodeNetworkConfigurationPolicy CR example in this procedure, and define an IPv6 address and disable IPv4. Because the constraint for the /64 prefix did not exist in TripleO, your OSP control plane network might not have enough capacity to allocate these networks. If that is the case, allocate a prefix that fits a large enough number of addresses, for example, /60. The prefix depends on the number of worker nodes you have.

Procedure

For each OCP worker node, define a NodeNetworkConfigurationPolicy CR that describes the desired network configuration. For example:

apiVersion: v1
items:
- apiVersion: nmstate.io/v1
  kind: NodeNetworkConfigurationPolicy
  spec:
    desiredState:
      interfaces:
      - description: internalapi vlan interface
        ipv4:
          address:
          - ip: 172.17.0.10
            prefix-length: 24
          dhcp: false
          enabled: true
        ipv6:
          enabled: false
        name: enp6s0.20
        state: up
        type: vlan
        vlan:
          base-iface: enp6s0
          id: 20
          reorder-headers: true
      - description: storage vlan interface
        ipv4:
          address:
          - ip: 172.18.0.10
            prefix-length: 24
          dhcp: false
          enabled: true
        ipv6:
          enabled: false
        name: enp6s0.21
        state: up
        type: vlan
        vlan:
          base-iface: enp6s0
          id: 21
          reorder-headers: true
      - description: tenant vlan interface
        ipv4:
          address:
          - ip: 172.19.0.10
            prefix-length: 24
          dhcp: false
          enabled: true
        ipv6:
          enabled: false
        name: enp6s0.22
        state: up
        type: vlan
        vlan:
          base-iface: enp6s0
          id: 22
          reorder-headers: true
    nodeSelector:
      kubernetes.io/hostname: ocp-worker-0
      node-role.kubernetes.io/worker: ""

For environments that are enabled with border gateway protocol (BGP), you might need to add additional routes in the NodeNetworkConfigurationPolicy CR so that OCP worker nodes can reach the OpenStack Controller nodes and Compute nodes over the control plane and internal API networks.

When you configure the OCP worker nodes network in the NodeNetworkConfigurationPolicy CR, add routes for each of the following networks:

External network (for example, 172.31.0.0/24)
Control plane network (for example, 192.168.188.0/24)
BGP main network (for example, 99.99.0.0/16)

The following example shows the routes.config section from a NodeNetworkConfigurationPolicy CR for a worker node with BGP configured. In this example, 100.64.0.17 and 100.65.0.17 are the IP addresses of the leaf switches that are connected to the specific OCP node:

    routes:
      config:
      - destination: 99.99.0.0/16
        next-hop-address: 100.64.0.17
        next-hop-interface: enp7s0
        weight: 200
      - destination: 99.99.0.0/16
        next-hop-address: 100.65.0.17
        next-hop-interface: enp8s0
        weight: 200
      - destination: 172.31.0.0/24
        next-hop-address: 100.64.0.17
        next-hop-interface: enp7s0
        weight: 200
      - destination: 172.31.0.0/24
        next-hop-address: 100.65.0.17
        next-hop-interface: enp8s0
        weight: 200
      - destination: 192.168.188.0/24
        next-hop-address: 100.64.0.17
        next-hop-interface: enp7s0
        weight: 200
      - destination: 192.168.188.0/24
        next-hop-address: 100.65.0.17
        next-hop-interface: enp8s0
        weight: 200

Configuring isolated networks on control plane services

After the NMState operator creates the desired hypervisor network configuration for isolated networks, you must configure the OpenStack (OSP) services to use the configured interfaces. You define a NetworkAttachmentDefinition custom resource (CR) for each isolated network. In some clusters, these CRs are managed by the Cluster Network Operator, in which case you use Network CRs instead. For more information, see Cluster Network Operator in Networking.

Procedure

Define a NetworkAttachmentDefinition CR for each isolated network. For example:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: internalapi
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "internalapi",
      "type": "macvlan",
      "master": "enp6s0.20",
      "ipam": {
        "type": "whereabouts",
        "range": "172.17.0.0/24",
        "range_start": "172.17.0.20",
        "range_end": "172.17.0.50"
      }
    }

Ensure that the interface name and IPAM range match the configuration that you used in the NodeNetworkConfigurationPolicy CRs.

Optional: When reusing existing IP ranges, you can exclude part of the range that is used in the existing deployment by using the exclude parameter in the NetworkAttachmentDefinition pool. For example:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: internalapi
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "internalapi",
      "type": "macvlan",
      "master": "enp6s0.20",
      "ipam": {
        "type": "whereabouts",
        "range": "172.17.0.0/24",
        "range_start": "172.17.0.20",
        "range_end": "172.17.0.50",
        "exclude": [
          "172.17.0.24/32",
          "172.17.0.44/31"
        ]
      }
    }

spec.config.ipam.range_start defines the start of the IP range.
spec.config.ipam.range_end defines the end of the IP range.
spec.config.ipam.exclude excludes part of the IP range. This example excludes IP addresses 172.17.0.24/32 and 172.17.0.44/31 from the allocation pool.

If your OSP services require load balancer IP addresses, define the pools for these services in an IPAddressPool CR. For example:

The load balancer IP addresses belong to the same IP range as the control plane services, and are managed by MetalLB. This pool should also be aligned with the OSP configuration.
```
- apiVersion: metallb.io/v1beta1
  kind: IPAddressPool
  spec:
    addresses:
    - 172.17.0.60-172.17.0.70
```
Define IPAddressPool CRs for each isolated network that requires load balancer IP addresses.
Optional: When reusing existing IP ranges, you can exclude part of the range by listing multiple entries in the addresses section of the IPAddressPool. For example:
```
- apiVersion: metallb.io/v1beta1
  kind: IPAddressPool
  spec:
    addresses:
    - 172.17.0.60-172.17.0.64
    - 172.17.0.66-172.17.0.70
```
The example above would exclude the 172.17.0.65 address from the allocation pool.

For environments that are enabled with border gateway protocol (BGP), add routes to the NetworkAttachmentDefinition CRs so that the pods can communicate with the OpenStack Controller nodes and Compute nodes over the isolated networks. This is similar to the routes that should be added to the NodeNetworkConfigurationPolicy CRs in BGP environments. For more information about isolated networks, see Configuring isolated networks on RHOCP worker nodes. The following example shows a NetworkAttachmentDefinition CR for the storage network with routes:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: storage
  namespace: openstack
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "storage",
      "type": "bridge",
      "isDefaultGateway": false,
      "isGateway": true,
      "forceAddress": false,
      "hairpinMode": true,
      "ipMasq": false,
      "bridge": "storage",
      "ipam": {
        "type": "whereabouts",
        "range": "172.18.0.0/24",
        "range_start": "172.18.0.30",
        "range_end": "172.18.0.70",
        "routes": [
           {"dst": "172.31.0.0/24", "gw": "172.18.0.1"},
           {"dst": "192.168.188.0/24", "gw": "172.18.0.1"},
           {"dst": "99.99.0.0/16", "gw": "172.18.0.1"}
        ]
      }
    }

Configuring isolated networks on data plane nodes

Data plane nodes are configured by the OpenStack Operator and your OpenStackDataPlaneNodeSet custom resources (CRs). The OpenStackDataPlaneNodeSet CRs define your desired network configuration for the nodes.

Your Red Hat OpenStack Services on OpenShift (RHOSO) network configuration should reflect the existing OpenStack (OSP) network setup. You must pull the network_data.yaml files from each OSP node and reuse them when you define the OpenStackDataPlaneNodeSet CRs. The format of the configuration does not change, so you can put network templates under edpm_network_config_template variables, either for all nodes or for each node.

Procedure

Configure a NetConfig CR with your desired VLAN tags and IPAM configuration. For example:

apiVersion: network.openstack.org/v1beta1
kind: NetConfig
metadata:
  name: netconfig
spec:
  networks:
  - name: internalapi
    dnsDomain: internalapi.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.17.0.250
        start: 172.17.0.100
      cidr: 172.17.0.0/24
      vlan: 20
  - name: storage
    dnsDomain: storage.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.18.0.250
        start: 172.18.0.100
      cidr: 172.18.0.0/24
      vlan: 21
  - name: tenant
    dnsDomain: tenant.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.19.0.250
        start: 172.19.0.100
      cidr: 172.19.0.0/24
      vlan: 22

where:

spec.networks: Specifies the networks composition. The networks composition must match the source cloud configuration to avoid data plane connectivity downtime.

Optional: In the NetConfig CR, list multiple ranges for the allocationRanges field to exclude some of the IP addresses, for example, to accommodate IP addresses that are already consumed by the adopted environment:

apiVersion: network.openstack.org/v1beta1
kind: NetConfig
metadata:
  name: netconfig
spec:
  networks:
  - name: internalapi
    dnsDomain: internalapi.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.17.0.199
        start: 172.17.0.100
      - end: 172.17.0.250
        start: 172.17.0.201
      cidr: 172.17.0.0/24
      vlan: 20

This example excludes the 172.17.0.200 address from the pool.

1.9. Configuring spine-leaf networks for the Red Hat OpenStack Services on OpenShift deployment

When you adopt a OpenStack (OSP) deployment with spine-leaf networking, like a Distributed Compute Node (DCN) architecture, you must each L2 network segment with a separate IP subnet and create create routed provider networks. Traffic between sites is routed at L3 through spine routers or similar network infrastructure.

You must configure routing for Compute nodes at edge sites to connect with control plane services, such as RabbitMQ or the database at the central site. The cloud will not function correctly without routes configured.

DHCP relay is not supported in adopted Red Hat OpenStack Services on OpenShift (RHOSO) environments with spine-leaf topologies. This affects bare-metal provisioning scenarios that use PXE boot.

If you need to provision bare-metal nodes at edge sites, use Redfish virtual media or similar BMC virtual media features instead of PXE boot.

Table 3. Example routes required on DCN1 Compute nodes
Destination network	Next hop	Purpose
172.17.0.0/24	172.17.10.1	Route to central internalapi
172.17.20.0/24	172.17.10.1	Route to DCN2 internalapi
172.18.0.0/24	172.18.10.1	Route to central storage
172.18.20.0/24	172.18.10.1	Route to DCN2 storage

You configure these routes in the edpm_network_config_template within the OpenStackDataPlaneNodeSet custom resource (CR) for each site.

Table 4. Example network topology for a three-site DCN deployment
Network	Central site	DCN1 site	DCN2 site
Control plane	192.168.122.0/24	192.168.133.0/24	192.168.144.0/24
Internal API	172.17.0.0/24	172.17.10.0/24	172.17.20.0/24
Storage	172.18.0.0/24	172.18.10.0/24	172.18.20.0/24
Tenant	172.19.0.0/24	172.19.10.0/24	172.19.20.0/24

When you adopt a spine-leaf deployment, you configure the NetConfig CR with multiple subnets for each service network. Each subnet represents a different site.

Example NetConfig with multiple subnets per network

apiVersion: network.openstack.org/v1beta1
kind: NetConfig
metadata:
  name: netconfig
spec:
  networks:
  - name: ctlplane
    dnsDomain: ctlplane.example.com
    subnets:
    - name: subnet1              # Central site
      allocationRanges:
      - end: 192.168.122.120
        start: 192.168.122.100
      cidr: 192.168.122.0/24
      gateway: 192.168.122.1
    - name: ctlplanedcn1         # DCN1 site
      allocationRanges:
      - end: 192.168.133.120
        start: 192.168.133.100
      cidr: 192.168.133.0/24
      gateway: 192.168.133.1
    - name: ctlplanedcn2         # DCN2 site
      allocationRanges:
      - end: 192.168.144.120
        start: 192.168.144.100
      cidr: 192.168.144.0/24
      gateway: 192.168.144.1
  - name: internalapi
    dnsDomain: internalapi.example.com
    subnets:
    - name: subnet1              # Central site
      allocationRanges:
      - end: 172.17.0.250
        start: 172.17.0.100
      cidr: 172.17.0.0/24
      vlan: 20
    - name: internalapidcn1      # DCN1 site
      allocationRanges:
      - end: 172.17.10.250
        start: 172.17.10.100
      cidr: 172.17.10.0/24
      vlan: 30
    - name: internalapidcn2      # DCN2 site
      allocationRanges:
      - end: 172.17.20.250
        start: 172.17.20.100
      cidr: 172.17.20.0/24
      vlan: 40

Each network defines multiple subnets, one for each site.
Each site uses unique VLAN IDs. In this example, central uses VLANs 20-23, DCN1 uses VLANs 30-33, and DCN2 uses VLANs 40-43.
The subnet naming convention typically uses subnet1 for the central site and site-specific names like internalapidcn1 for edge sites.

Because the sites are geopgraphically distributed, each site requires its own provider network (physnet). The Networking service (neutron) must be configured to recognize all physnets.

Example Neutron ML2 configuration for multiple physnets

[ml2_type_vlan]
network_vlan_ranges = leaf0:1:1000,leaf1:1:1000,leaf2:1:1000

[neutron]
physnets = leaf0,leaf1,leaf2

leaf0 corresponds to the central site.
leaf1 corresponds to the DCN1 site.
leaf2 corresponds to the DCN2 site.

When you create routed provider networks in RHOSO, you create network segments that map to these physnets:

Segment for central: physnet=leaf0, subnet=192.168.122.0/24
Segment for DCN1: physnet=leaf1, subnet=192.168.133.0/24
Segment for DCN2: physnet=leaf2, subnet=192.168.144.0/24

Additional resources

Configuring control plane networking for spine-leaf topologies

1.10. Storage requirements

Storage in a OpenStack (OSP) deployment refers to the following types:

The storage that is needed for the service to run
The storage that the service manages

Before you can deploy the services in Red Hat OpenStack Services on OpenShift (RHOSO), you must review the storage requirements, plan your OpenShift node selection, prepare your OCP nodes, and so on.

1.10.1. Storage driver certification

Before you adopt your OpenStack deployment to a Red Hat OpenStack Services on OpenShift (RHOSO) Antelope deployment, confirm that your deployed storage drivers are certified for use with RHOSO Antelope.

For information on software certified for use with RHOSO Antelope, see the Red Hat Ecosystem Catalog.

1.10.2. Block Storage service guidelines

Prepare to adopt your Block Storage service (cinder):

Take note of the Block Storage service back ends that you use.
Determine all the transport protocols that the Block Storage service back ends use, such as RBD, iSCSI, FC, NFS, NVMe-TCP, and so on. You must consider them when you place the Block Storage services and when the right storage transport-related binaries are running on the OpenShift nodes. For more information about each storage transport protocol, see OCP preparation for Block Storage service adoption.

Use a Block Storage service volume service to deploy each Block Storage service volume back end.

For example, you have an LVM back end, a Ceph back end, and two entries in cinderVolumes, and you cannot set global defaults for all volume services. You must define a service for each of them:

apiVersion: core.openstack.org/v1beta1
kind: OpenStackControlPlane
metadata:
  name: openstack
spec:
  cinder:
    enabled: true
    template:
      cinderVolumes:
        lvm:
          customServiceConfig: |
            [DEFAULT]
            debug = True
            [lvm]
< . . . >
        ceph:
          customServiceConfig: |
            [DEFAULT]
            debug = True
            [ceph]
< . . . >

Check that all configuration options are still valid for RHOSO Antelope version. Configuration options might be deprecated, removed, or added. This applies to both back-end driver-specific configuration options and other generic options.

There are two ways to prepare a Block Storage service configuration for adoption. You can customize the configuration or prepare a quick configuration. There is no difference in how Block Storage service operates with both methods, but you should customize the configuration whenever possible.

Preparing the Block Storage service by using an agnostic configuration file

The following process is a more straightforward approach:

Procedure

Create an agnostic configuration file removing any specifics from the old deployment’s cinder.conf file, such as the connection in the [database] section, the transport_url and log_dir in [DEFAULT], the whole [coordination] and [barbican] sections, and so on.
Assuming the configuration has sensitive information, drop the modified contents of the whole file into a Secret.
Reference this secret in all the services, creating a Block Storage service (cinder) volumes section for each backend and just adding the respective enabled_backends option.

Add external files as mentioned in the last bullet of the tailor-made configuration explanation.

Example of what the quick and dirty configuration patch would look like:

   spec:
     cinder:
       enabled: true
       template:
         cinderAPI:
           customServiceConfigSecrets:
             - cinder-conf
         cinderScheduler:
           customServiceConfigSecrets:
             - cinder-conf
         cinderBackup:
           customServiceConfigSecrets:
             - cinder-conf
         cinderVolume:
           lvm1:
             customServiceConfig: |
               [DEFAULT]
               enabled_backends = lvm1
             customServiceConfigSecrets:
               - cinder-conf
           lvm2:
             customServiceConfig: |
               [DEFAULT]
               enabled_backends = lvm2
             customServiceConfigSecrets:
               - cinder-conf

About the Block Storage service configuration generation helper tool

Creating the right Block Storage service (cinder) configuration files to deploy by using Operators can sometimes be complex. There is a helper tool that can create a draft of the files from a cinder.conf file.

This tool is not meant to be an automation tool. You can use the tool to get a general understanding of how to create the configuration files, or to point out some potential pitfalls and reminders.

The tool requires the PyYAML Python package to be installed (pip install PyYAML).

This cinder-cfg.py script defaults to reading the cinder.conf file from the current directory (unless --config option is used) and outputs files to the current directory (unless --out-dir option is used).

In the output directory, you always get a cinder.patch file with the Cinder- specific configuration patch to apply to the OpenStackControlPlane custom resource, but you might also get an additional file called cinder-prereq.yaml with some Secrets and MachineConfigs, and an openstackversion.yaml file with the OpenStackVersion sample.

Example of an invocation setting input and output explicitly to the defaults for a Ceph backend:

$ python cinder-cfg.py --config cinder.conf --out-dir ./
WARNING:root:The {block_storage} is configured to use ['/etc/cinder/policy.yaml'] as policy file, please ensure this file is available for the control plane {block_storage} services using "extraMounts" or remove the option.

WARNING:root:Deployment uses Ceph, so make sure the Ceph credentials and configuration are present in OpenShift as a secret and then use the extra volumes to make them available in all the services that would need them.

WARNING:root:You were using user ['nova'] to talk to Nova, but in podified using the service keystone username is preferred in this case ['cinder']. Dropping that configuration.

WARNING:root:ALWAYS REVIEW RESULTS, OUTPUT IS JUST A ROUGH DRAFT!!

Output written at ./: cinder.patch

The script outputs some warnings to let you know that you might need to do some things manually, such as add the custom policy or provide the Ceph configuration files, and also let you know that the service_user was removed.

A different example when using multiple backends, one of them being a 3PAR FC could be:

$ python cinder-cfg.py --config cinder.conf --out-dir ./
WARNING:root:The {block_storage} is configured to use ['/etc/cinder/policy.yaml'] as policy file, please ensure this file is available for the control plane Block Storage services using "extraMounts" or remove the option.

ERROR:root:Backend hpe_fc requires a vendor container image, but there is no certified image available yet. Patch will use the last known image for reference, but IT WILL NOT WORK

WARNING:root:Deployment uses Ceph, so make sure the Ceph credentials and configuration are present in OpenShift as a secret and then use the extra volumes to make them available in all the services that would need them.

WARNING:root:You were using user ['nova'] to talk to Nova, but in podified using the service keystone username is preferred, in this case ['cinder']. Dropping that configuration.

WARNING:root:Configuration is using FC, please ensure all your OpenShift nodes have HBAs or use labels to ensure that Volume and Backup services are scheduled on nodes with HBAs.

WARNING:root:ALWAYS REVIEW RESULTS, OUTPUT IS JUST A ROUGH DRAFT!!

Output written at ./: cinder.patch, cinder-prereq.yaml

In this case there are additional messages. The following list provides an explanation of each one:

There is one message mentioning how this backend driver needs external vendor dependencies so the standard container image will not work. Unfortunately this image is still not available, so an older image is used in the output patch file for reference. You can then replace this image with one that you build or with a Red Hat official image once the image is available. In this case you can see in your cinder.patch file that has an OpenStackVersion object:
```
apiVersion: core.openstack.org/v1beta1
kind: OpenStackVersion
metadata:
  name: openstack
spec:
  customContainerImages:
    cinderVolumeImages:
      hpe-fc:
        containerImage: registry.connect.redhat.com/hpe3parcinder/openstack-cinder-volume-hpe3parcinder17-0
```
The name of the OpenStackVersion must match the name of your OpenStackControlPlane, so in your case it may be other than openstack.
The FC message reminds you that this transport protocol requires specific HBA cards to be present on the nodes where Block Storage services are running.
In this case it has created the cinder-prereq.yaml file and within the file there is one MachineConfig and one Secret. The MachineConfig is called 99-master-cinder-enable-multipathd and like the name suggests enables multipathing on all the OCP worker nodes. The Secret is called openstackcinder-volumes-hpe_fc and contains the 3PAR backend configuration because it has sensitive information (credentials). The cinder.patch file uses the following configuration:
```
   cinderVolumes:
      hpe-fc:
        customServiceConfigSecrets:
        - openstackcinder-volumes-hpe_fc
```

1.10.3. Limitations for adopting the Block Storage service

Before you begin the Block Storage service (cinder) adoption, review the following limitations:

There is no global nodeSelector option for all Block Storage service volumes. You must specify the nodeSelector for each back end.
There are no global customServiceConfig or customServiceConfigSecrets options for all Block Storage service volumes. You must specify these options for each back end.
Support for Block Storage service back ends that require kernel modules that are not included in Red Hat Enterprise Linux is not tested in Red Hat OpenStack Services on OpenShift (RHOSO).

1.10.4. OCP preparation for Block Storage service adoption

Before you deploy OpenStack (OSP) in OpenShift nodes, ensure that the networks are ready, that you decide which OCP nodes to restrict, and that you make any necessary changes to the OCP nodes.

Node selection

You might need to restrict the OCP nodes where the Block Storage service volume and backup services run.

An example of when you need to restrict nodes for a specific Block Storage service is when you deploy the Block Storage service with the LVM driver. In that scenario, the LVM data where the volumes are stored only exists in a specific host, so you need to pin the Block Storage-volume service to that specific OCP node. Running the service on any other OCP node does not work. You cannot use the OCP host node name to restrict the LVM back end. You need to identify the LVM back end by using a unique label, an existing label, or a new label:

$ oc label nodes worker0 lvm=cinder-volumes

apiVersion: core.openstack.org/v1beta1
kind: OpenStackControlPlane
metadata:
  name: openstack
spec:
  secret: osp-secret
  storageClass: local-storage
  cinder:
    enabled: true
    template:
      cinderVolumes:
        lvm-iscsi:
          nodeSelector:
            lvm: cinder-volumes
< . . . >

For more information about node selection, see About node selectors.

If your nodes do not have enough local disk space for temporary images, you can use a remote NFS location by setting the extra volumes feature, extraMounts.

Transport protocols

Some changes to the storage transport protocols might be required for OCP:

If you use a MachineConfig to make changes to OCP nodes, the nodes reboot.
Check the back-end sections that are listed in the enabled_backends configuration option in your cinder.conf file to determine the enabled storage back-end sections.
Depending on the back end, you can find the transport protocol by viewing the volume_driver or target_protocol configuration options.

The iscsid service, multipathd service, and NVMe-TCP kernel modules start automatically on data plane nodes.

NFS

OCP connects to NFS back ends without additional changes.

Rados Block Device and Ceph

OCP connects to Ceph back ends without additional changes. You must provide credentials and configuration files to the services.

iSCSI

To connect to iSCSI volumes, the iSCSI initiator must run on the OCP hosts where the volume and backup services run. The Linux Open iSCSI initiator does not support network namespaces, so you must only run one instance of the service for the normal OCP usage, as well as the OCP CSI plugins and the OSP services.

If you are not already running iscsid on the OCP nodes, then you must apply a MachineConfig. For example:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
    service: cinder
  name: 99-master-cinder-enable-iscsid
spec:
  config:
    ignition:
      version: 3.2.0
    systemd:
      units:
      - enabled: true
        name: iscsid.service

If you use labels to restrict the nodes where the Block Storage services run, you must use a MachineConfigPool to limit the effects of the MachineConfig to the nodes where your services might run. For more information, see About node selectors.
If you are using a single node deployment to test the process, replace worker with master in the MachineConfig.
For production deployments that use iSCSI volumes, configure multipathing for better I/O.

FC

The Block Storage service volume and Block Storage service backup services must run in an OCP host that has host bus adapters (HBAs). If some nodes do not have HBAs, then use labels to restrict where these services run. For more information, see About node selectors.
If you have virtualized OCP clusters that use FC you need to expose the host HBAs inside the virtual machine.
For production deployments that use FC volumes, configure multipathing for better I/O.

NVMe-TCP

To connect to NVMe-TCP volumes, load NVMe-TCP kernel modules on the OCP hosts.

If you do not already load the nvme-fabrics module on the OCP nodes where the volume and backup services are going to run, then you must apply a MachineConfig. For example:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
    service: cinder
  name: 99-master-cinder-load-nvme-fabrics
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        - path: /etc/modules-load.d/nvme_fabrics.conf
          overwrite: false
          # Mode must be decimal, this is 0644
          mode: 420
          user:
            name: root
          group:
            name: root
          contents:
            # Source can be a http, https, tftp, s3, gs, or data as defined in rfc2397.
            # This is the rfc2397 text/plain string format
            source: data:,nvme-fabrics

If you use labels to restrict the nodes where Block Storage services run, use a MachineConfigPool to limit the effects of the MachineConfig to the nodes where your services run. For more information, see About node selectors.
If you use a single node deployment to test the process, replace worker with master in the MachineConfig.
Only load the nvme-fabrics module because it loads the transport-specific modules, such as TCP, RDMA, or FC, as needed.

For production deployments that use NVMe-TCP volumes, use multipathing. For NVMe-TCP volumes OCP uses native multipathing, called ANA.

After the OCP nodes reboot and load the nvme-fabrics module, you can confirm that the operating system is configured and that it supports ANA by checking the host:

$ cat /sys/module/nvme_core/parameters/multipath

ANA does not use the Linux Multipathing Device Mapper, but OCP requires multipathd to run on Compute nodes for the Compute service (nova) to be able to use multipathing. Multipathing is automatically configured on data plane nodes when they are provisioned.

Multipathing

Use multipathing for iSCSI and FC protocols. To configure multipathing on these protocols, you perform the following tasks:
- Prepare the OCP hosts
- Configure the Block Storage services
- Prepare the Compute service nodes
- Configure the Compute service

To prepare the OCP hosts, ensure that the Linux Multipath Device Mapper is configured and running on the OCP hosts by using MachineConfig. For example:

# Includes the /etc/multipathd.conf contents and the systemd unit changes
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
    service: cinder
  name: 99-master-cinder-enable-multipathd
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        - path: /etc/multipath.conf
          overwrite: false
          # Mode must be decimal, this is 0600
          mode: 384
          user:
            name: root
          group:
            name: root
          contents:
            # Source can be a http, https, tftp, s3, gs, or data as defined in rfc2397.
            # This is the rfc2397 text/plain string format
            source: data:,defaults%20%7B%0A%20%20user_friendly_names%20no%0A%20%20recheck_wwid%20yes%0A%20%20skip_kpartx%20yes%0A%20%20find_multipaths%20yes%0A%7D%0A%0Ablacklist%20%7B%0A%7D
    systemd:
      units:
      - enabled: true
        name: multipathd.service

If you use labels to restrict the nodes where Block Storage services run, you need to use a MachineConfigPool to limit the effects of the MachineConfig to only the nodes where your services run. For more information, see About node selectors.
If you are using a single node deployment to test the process, replace worker with master in the MachineConfig.
Cinder volume and backup are configured by default to use multipathing.

1.10.5. Preparing the Block Storage service by customizing the configuration

The high level explanation of the tailor-made approach is:

Determine what part of the configuration is generic for all the Block Storage services and remove anything that would change when deployed in OpenShift, such as the connection in the [database] section, the transport_url and log_dir in the [DEFAULT] sections, the whole [coordination] and [barbican] sections. The remaining generic configuration goes into the customServiceConfig option, or a Secret custom resource (CR) and is then used in the customServiceConfigSecrets section, at the cinder: template: level.
Determine if there is a scheduler-specific configuration and add it to the customServiceConfig option in cinder: template: cinderScheduler.
Determine if there is an API-specific configuration and add it to the customServiceConfig option in cinder: template: cinderAPI.
If the Block Storage service backup is deployed, add the Block Storage service backup configuration options to customServiceConfig option, or to a Secret CR that you can add to customServiceConfigSecrets section at the cinder: template: cinderBackup: level. Remove the host configuration in the [DEFAULT] section to support multiple replicas later.
Determine the individual volume back-end configuration for each of the drivers. The configuration is in the specific driver section, and it includes the [backend_defaults] section and FC zoning sections if you use them. The Block Storage service operator does not support a global customServiceConfig option for all volume services. Each back end has its own section under cinder: template: cinderVolumes, and the configuration goes in the customServiceConfig option or in a Secret CR and is then used in the customServiceConfigSecrets section.
If any of the Block Storage service volume drivers require a custom vendor image, find the location of the image in the Red Hat Ecosystem Catalog, and create or modify an OpenStackVersion CR to specify the custom image by using the key from the cinderVolumes section.

For example, if you have the following configuration:
```
spec:
  cinder:
    enabled: true
    template:
      cinderVolume:
        pure:
          customServiceConfigSecrets:
            - openstack-cinder-pure-cfg
< . . . >
```
Then the OpenStackVersion CR that describes the container image for that back end looks like the following example:
```
apiVersion: core.openstack.org/v1beta1
kind: OpenStackVersion
metadata:
  name: openstack
spec:
  customContainerImages:
    cinderVolumeImages:
      pure: registry.connect.redhat.com/purestorage/openstack-cinder-volume-pure-rhosp-18-0'
```
The name of the OpenStackVersion must match the name of your OpenStackControlPlane CR.
If your Block Storage services use external files, for example, for a custom policy, or to store credentials or SSL certificate authority bundles to connect to a storage array, make those files available to the right containers. Use Secrets or ConfigMap to store the information in OCP and then in the extraMounts key. For example, for Ceph credentials that are stored in a Secret called ceph-conf-files, you patch the top-level extraMounts key in the OpenstackControlPlane CR:
```
spec:
  extraMounts:
  - extraVol:
    - extraVolType: Ceph
      mounts:
      - mountPath: /etc/ceph
        name: ceph
        readOnly: true
      propagation:
      - CinderVolume
      - CinderBackup
      - Glance
      volumes:
      - name: ceph
        projected:
          sources:
          - secret:
              name: ceph-conf-files
```

For a service-specific file, such as the API policy, you add the configuration on the service itself. In the following example, you include the CinderAPI configuration that references the policy you are adding from a ConfigMap called my-cinder-conf that has a policy key with the contents of the policy:

spec:
  cinder:
    enabled: true
    template:
      cinderAPI:
        customServiceConfig: |
           [oslo_policy]
           policy_file=/etc/cinder/api/policy.yaml
      extraMounts:
      - extraVol:
        - extraVolType: Ceph
          mounts:
          - mountPath: /etc/cinder/api
            name: policy
            readOnly: true
          propagation:
          - CinderAPI
          volumes:
          - name: policy
            projected:
              sources:
              - configMap:
                  name: my-cinder-conf
                  items:
                    - key: policy
                      path: policy.yaml

1.10.6. Changes to CephFS through NFS

Before you begin the adoption, review the following information to understand the changes to CephFS through NFS between OpenStack (OSP) and Red Hat OpenStack Services on OpenShift (RHOSO) Antelope:

If the OSP deployment uses CephFS through NFS as a back end for Shared File Systems service (manila), you cannot directly import the ceph-nfs service on the OSP Controller nodes into RHOSO Antelope. In RHOSO Antelope, the Shared File Systems service only supports using a clustered NFS service that is directly managed on the Ceph cluster. Adoption with the ceph-nfs service involves a data path disruption to existing NFS clients.
On OSP , Pacemaker manages the high availability of the ceph-nfs service. This service is assigned a Virtual IP (VIP) address that is also managed by Pacemaker. The VIP is typically created on an isolated StorageNFS network. The Controller nodes have ordering and collocation constraints established between this VIP, ceph-nfs, and the Shared File Systems service (manila) share manager service. Prior to adopting Shared File Systems service, you must adjust the Pacemaker ordering and collocation constraints to separate the share manager service. This establishes ceph-nfs with its VIP as an isolated, standalone NFS service that you can decommission after completing the RHOSO adoption.
In Ceph Reef, a native clustered Ceph NFS service has to be deployed on the Ceph cluster by using the Ceph Orchestrator prior to adopting the Shared File Systems service. This NFS service eventually replaces the standalone NFS service from OSP in your deployment. When the Shared File Systems service is adopted into the RHOSO Antelope environment, it establishes all the existing exports and client restrictions on the new clustered Ceph NFS service. Clients can continue to read and write data on existing NFS shares, and are not affected until the old standalone NFS service is decommissioned. After the service is decommissioned, you can re-mount the same share from the new clustered Ceph NFS service during a scheduled downtime.
To ensure that NFS users are not required to make any networking changes to their existing workloads, assign an IP address from the same isolated StorageNFS network to the clustered Ceph NFS service. NFS users only need to discover and re-mount their shares by using new export paths. When the adoption is complete, RHOSO users can query the Shared File Systems service API to list the export locations on existing shares to identify the preferred paths to mount these shares. These preferred paths correspond to the new clustered Ceph NFS service in contrast to other non-preferred export paths that continue to be displayed until the old isolated, standalone NFS service is decommissioned.
When you migrate your workloads from the old NFS service, you must ensure that exports are not consumed from both the old NFS service and the new clustered Ceph NFS service at the same time. This simultaneous access to both services is considered dangerous and bypasses the protections for concurrent access that is ensured by the NFS protocol. When you migrate the workloads to use exports from the new NFS service, you must ensure that you migrate the use of each export entirely so that no part of the workload stays connected to the old NFS service.
You can no longer control the old Pacemaker-managed ceph-nfs service through the OpenStack TripleO after the control plane adoption is complete. This means that there is no support for updating the NFS Ganesha software, or changing any configuration. While data is protected from server crashes or restarts, high availability and data recovery is still limited, and these maintenance issues are no longer visible to Shared File Systems service.
Cloud administrators must ensure a reasonably short window to switch over all end-user workloads to the new NFS service.
While the old ceph-nfs service only supported NFS version 4.1 and later, the new clustered NFS service supports NFS protocols 3 and 4.1 and later. Mixing protocol versions with an export results in unintended consequences. You should mount a given share across all clients by using a consistent NFS protocol version.

For more information on setting up a clustered NFS service, see Creating an NFS Ganesha cluster.

1.11. Ceph prerequisites

Before you migrate your Ceph cluster daemons from your Controller nodes, complete the following tasks in your OpenStack environment:

Upgrade your Ceph cluster to release Reef. For more information, see Upgrading Red Hat Ceph Storage 6 to 7 in Framework for upgrades (16.2 to 17.1).
Your Ceph Reef deployment is managed by cephadm.
The undercloud is still available, and the nodes and networks are managed by TripleO.
If you use an externally deployed Ceph cluster, you must recreate a ceph-nfs cluster in the target nodes as well as propogate the StorageNFS network.

Complete the prerequisites for your specific Ceph environment:

Ceph with monitoring stack components
Ceph RGW
Ceph RBD
NFS Ganesha

1.11.1. Completing prerequisites for a Ceph cluster with monitoring stack components

Before you migrate a Ceph cluster with monitoring stack components, you must gather monitoring stack information, review and update the container image registry, and remove the undercloud container images.

In addition to updating the container images related to the monitoring stack, you must update the configuration entry related to the container_image_base. This has an impact on all the Ceph daemons that rely on the undercloud images. New daemons are deployed by using the new image registry location that is configured in the Ceph cluster.

Procedure

Gather the current status of the monitoring stack. Verify that the hosts have no monitoring label, or grafana, prometheus, or alertmanager, in cases of a per daemons placement evaluation:

The entire relocation process is driven by cephadm and relies on labels to be assigned to the target nodes, where the daemons are scheduled.

[tripleo-admin@controller-0 ~]$ sudo cephadm shell -- ceph orch host ls

HOST                    	ADDR       	LABELS                 	STATUS
cephstorage-0.redhat.local  192.168.24.11  osd mds
cephstorage-1.redhat.local  192.168.24.12  osd mds
cephstorage-2.redhat.local  192.168.24.47  osd mds
controller-0.redhat.local   192.168.24.35  _admin mon mgr
controller-1.redhat.local   192.168.24.53  mon _admin mgr
controller-2.redhat.local   192.168.24.10  mon _admin mgr
6 hosts in cluster

Confirm that the cluster is healthy and that both ceph orch ls and ceph orch ps return the expected number of deployed daemons.

Review and update the container image registry:

If you run the Ceph externalization procedure after you migrate the OpenStack control plane, update the container images in the Ceph Storage cluster configuration. The current container images point to the undercloud registry, which might not be available anymore. Because the undercloud is not available after adoption is complete, replace the undercloud-provided images with an alternative registry. In case the desired option is to rely on the default images shipped by cephadm, remove the following config options from the Ceph Storage cluster.

$ ceph config dump
...
...
mgr   advanced  mgr/cephadm/container_image_alertmanager    undercloud-0.ctlplane.redhat.local:8787/ceph/alertmanager:v0.25.0
mgr   advanced  mgr/cephadm/container_image_base            undercloud-0.ctlplane.redhat.local:8787/ceph/ceph:v18
mgr   advanced  mgr/cephadm/container_image_grafana         undercloud-0.ctlplane.redhat.local:8787/ceph/ceph-grafana:9.4.7
mgr   advanced  mgr/cephadm/container_image_node_exporter   undercloud-0.ctlplane.redhat.local:8787/ceph/node-exporter:v1.5.0
mgr   advanced  mgr/cephadm/container_image_prometheus      undercloud-0.ctlplane.redhat.local:8787/ceph/prometheus:v2.43.0

Remove the undercloud container images:

$ cephadm shell -- ceph config rm mgr mgr/cephadm/container_image_base \
for i in prometheus grafana alertmanager node_exporter; do \
    cephadm shell -- ceph config rm mgr mgr/cephadm/container_image_$i \
done

1.11.2. Completing prerequisites for Ceph RGW migration

Complete the following prerequisites before you begin the Ceph Object Gateway (RGW) migration.

Procedure

Check the current status of the Ceph nodes:

(undercloud) [stack@undercloud-0 ~]$ metalsmith list


    +------------------------+    +----------------+
    | IP Addresses           |    |  Hostname      |
    +------------------------+    +----------------+
    | ctlplane=192.168.24.25 |    | cephstorage-0  |
    | ctlplane=192.168.24.10 |    | cephstorage-1  |
    | ctlplane=192.168.24.32 |    | cephstorage-2  |
    | ctlplane=192.168.24.28 |    | compute-0      |
    | ctlplane=192.168.24.26 |    | compute-1      |
    | ctlplane=192.168.24.43 |    | controller-0   |
    | ctlplane=192.168.24.7  |    | controller-1   |
    | ctlplane=192.168.24.41 |    | controller-2   |
    +------------------------+    +----------------+

Full List of Resources:
  * ip-192.168.24.46	(ocf:heartbeat:IPaddr2):     	Started controller-0
  * ip-10.0.0.103   	(ocf:heartbeat:IPaddr2):     	Started controller-1
  * ip-172.17.1.129 	(ocf:heartbeat:IPaddr2):     	Started controller-2
  * ip-172.17.3.68  	(ocf:heartbeat:IPaddr2):     	Started controller-0
  * ip-172.17.4.37  	(ocf:heartbeat:IPaddr2):     	Started controller-1
  * Container bundle set: haproxy-bundle

[undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp17-openstack-haproxy:pcmklatest]:
    * haproxy-bundle-podman-0   (ocf:heartbeat:podman):  Started controller-2
    * haproxy-bundle-podman-1   (ocf:heartbeat:podman):  Started controller-0
    * haproxy-bundle-podman-2   (ocf:heartbeat:podman):  Started controller-1

Identify the ranges of the storage networks. The following is an example and the values might differ in your environment:

[heat-admin@controller-0 ~]$ ip -o -4 a

1: lo	inet 127.0.0.1/8 scope host lo\   	valid_lft forever preferred_lft forever
2: enp1s0	inet 192.168.24.45/24 brd 192.168.24.255 scope global enp1s0\   	valid_lft forever preferred_lft forever
2: enp1s0	inet 192.168.24.46/32 brd 192.168.24.255 scope global enp1s0\   	valid_lft forever preferred_lft forever
7: br-ex	inet 10.0.0.122/24 brd 10.0.0.255 scope global br-ex\   	valid_lft forever preferred_lft forever
8: vlan70	inet 172.17.5.22/24 brd 172.17.5.255 scope global vlan70\   	valid_lft forever preferred_lft forever
8: vlan70	inet 172.17.5.94/32 brd 172.17.5.255 scope global vlan70\   	valid_lft forever preferred_lft forever
9: vlan50	inet 172.17.2.140/24 brd 172.17.2.255 scope global vlan50\   	valid_lft forever preferred_lft forever
10: vlan30	inet 172.17.3.73/24 brd 172.17.3.255 scope global vlan30\   	valid_lft forever preferred_lft forever
10: vlan30	inet 172.17.3.68/32 brd 172.17.3.255 scope global vlan30\   	valid_lft forever preferred_lft forever
11: vlan20	inet 172.17.1.88/24 brd 172.17.1.255 scope global vlan20\   	valid_lft forever preferred_lft forever
12: vlan40	inet 172.17.4.24/24 brd 172.17.4.255 scope global vlan40\   	valid_lft forever preferred_lft forever

br-ex represents the External Network, where in the current environment, HAProxy has the front-end Virtual IP (VIP) assigned.
vlan30 represents the Storage Network, where the new RGW instances should be started on the Ceph Storage nodes.

Identify the network that you previously had in HAProxy and propagate it through TripleO to the Ceph Storage nodes. Use this network to reserve a new VIP that is owned by Ceph as the entry point for the RGW service.

$ less /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg
...
...
listen ceph_rgw
  bind 10.0.0.103:8080 transparent
  bind 172.17.3.68:8080 transparent
  mode http
  balance leastconn
  http-request set-header X-Forwarded-Proto https if { ssl_fc }
  http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
  http-request set-header X-Forwarded-Port %[dst_port]
  option httpchk GET /swift/healthcheck
  option httplog
  option forwardfor
  server controller-0.storage.redhat.local 172.17.3.73:8080 check fall 5 inter 2000 rise 2
  server controller-1.storage.redhat.local 172.17.3.146:8080 check fall 5 inter 2000 rise 2
  server controller-2.storage.redhat.local 172.17.3.156:8080 check fall 5 inter 2000 rise 2

Confirm that the network is used as an HAProxy front end. The following example shows that controller-0 exposes the services by using the external network, which is absent from the Ceph nodes. You must propagate the external network through TripleO:
```
[controller-0]$ ip -o -4 a

...
7: br-ex	inet 10.0.0.106/24 brd 10.0.0.255 scope global br-ex\   	valid_lft forever preferred_lft forever
...
```
If the target nodes are not managed by director, you cannot use this procedure to configure the network. An administrator must manually configure all the required networks.

Propagate the HAProxy front-end network to Ceph Storage nodes.

In the NIC template that you use to define the ceph-storage network interfaces, add the new config section in the Ceph network configuration template file, for example, /home/stack/composable_roles/network/nic-configs/ceph-storage.j2:

---
network_config:
- type: interface
  name: nic1
  use_dhcp: false
  dns_servers: {{ ctlplane_dns_nameservers }}
  addresses:
  - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
  routes: {{ ctlplane_host_routes }}
- type: vlan
  vlan_id: {{ storage_mgmt_vlan_id }}
  device: nic1
  addresses:
  - ip_netmask: {{ storage_mgmt_ip }}/{{ storage_mgmt_cidr }}
  routes: {{ storage_mgmt_host_routes }}
- type: interface
  name: nic2
  use_dhcp: false
  defroute: false
- type: vlan
  vlan_id: {{ storage_vlan_id }}
  device: nic2
  addresses:
  - ip_netmask: {{ storage_ip }}/{{ storage_cidr }}
  routes: {{ storage_host_routes }}
- type: ovs_bridge
  name: {{ neutron_physical_bridge_name }}
  dns_servers: {{ ctlplane_dns_nameservers }}
  domain: {{ dns_search_domains }}
  use_dhcp: false
  addresses:
  - ip_netmask: {{ external_ip }}/{{ external_cidr }}
  routes: {{ external_host_routes }}
  members: []
  - type: interface
    name: nic3
    primary: true

Add the External Network to the bare metal file, for example, /home/stack/composable_roles/network/baremetal_deployment.yaml that is used by metalsmith:

Ensure that network_config_update is enabled for network propagation to the target nodes when os-net-config is triggered.

- name: CephStorage
  count: 3
  hostname_format: cephstorage-%index%
  instances:
  - hostname: cephstorage-0
  name: ceph-0
  - hostname: cephstorage-1
  name: ceph-1
  - hostname: cephstorage-2
  name: ceph-2
  defaults:
  profile: ceph-storage
  network_config:
      template: /home/stack/composable_roles/network/nic-configs/ceph-storage.j2
      network_config_update: true
  networks:
  - network: ctlplane
      vif: true
  - network: storage
  - network: storage_mgmt
  - network: external

Configure the new network on the bare metal nodes:

(undercloud) [stack@undercloud-0]$ openstack overcloud node provision \
   -o overcloud-baremetal-deployed-0.yaml \
   --stack overcloud \
   --network-config -y \
  $PWD/composable_roles/network/baremetal_deployment.yaml

Verify that the new network is configured on the Ceph Storage nodes:

[root@cephstorage-0 ~]# ip -o -4 a

1: lo	inet 127.0.0.1/8 scope host lo\   	valid_lft forever preferred_lft forever
2: enp1s0	inet 192.168.24.54/24 brd 192.168.24.255 scope global enp1s0\   	valid_lft forever preferred_lft forever
11: vlan40	inet 172.17.4.43/24 brd 172.17.4.255 scope global vlan40\   	valid_lft forever preferred_lft forever
12: vlan30	inet 172.17.3.23/24 brd 172.17.3.255 scope global vlan30\   	valid_lft forever preferred_lft forever
14: br-ex	inet 10.0.0.133/24 brd 10.0.0.255 scope global br-ex\   	valid_lft forever preferred_lft forever

1.11.3. Completing prerequisites for a Ceph RBD migration

Complete the following prerequisites before you begin the Ceph Rados Block Device (RBD) migration.

The target CephStorage or ComputeHCI nodes are configured to have both storage and storage_mgmt networks. This ensures that you can use both Ceph public and cluster networks from the same node. From OpenStack and later you do not have to run a stack update.
NFS Ganesha is migrated from a TripleO deployment to cephadm. For more information, see Creating an NFS Ganesha cluster.
Ceph Metadata Server, monitoring stack, Ceph Object Gateway, and any other daemon that is deployed on Controller nodes.
The Ceph cluster is healthy, and the ceph -s command returns HEALTH_OK.

Run os-net-config on the bare metal node and configure additional networks:

If target nodes are CephStorage, ensure that the network is defined in the bare metal file for the CephStorage nodes, for example, /home/stack/composable_roles/network/baremetal_deployment.yaml:

- name: CephStorage
count: 2
instances:
- hostname: oc0-ceph-0
name: oc0-ceph-0
- hostname: oc0-ceph-1
name: oc0-ceph-1
defaults:
networks:
- network: ctlplane
vif: true
- network: storage_cloud_0
subnet: storage_cloud_0_subnet
- network: storage_mgmt_cloud_0
subnet: storage_mgmt_cloud_0_subnet
network_config:
template: templates/single_nic_vlans/single_nic_vlans_storage.j2

Add the missing network:

$ openstack overcloud node provision \
-o overcloud-baremetal-deployed-0.yaml --stack overcloud-0 \
/--network-config -y --concurrency 2 /home/stack/metalsmith-0.yaml

Verify that the storage network is configured on the target nodes:

(undercloud) [stack@undercloud ~]$ ssh heat-admin@192.168.24.14 ip -o -4 a
1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
5: br-storage    inet 192.168.24.14/24 brd 192.168.24.255 scope global br-storage\       valid_lft forever preferred_lft forever
6: vlan1    inet 192.168.24.14/24 brd 192.168.24.255 scope global vlan1\       valid_lft forever preferred_lft forever
7: vlan11    inet 172.16.11.172/24 brd 172.16.11.255 scope global vlan11\       valid_lft forever preferred_lft forever
8: vlan12    inet 172.16.12.46/24 brd 172.16.12.255 scope global vlan12\       valid_lft forever preferred_lft forever

1.11.4. Creating an NFS Ganesha cluster

If you use CephFS through NFS with the Shared File Systems service (manila), you must create a new clustered NFS service on the Ceph cluster. This service replaces the standalone, Pacemaker-controlled ceph-nfs service that you use in OpenStack (OSP) .

Procedure

Identify the Ceph nodes to deploy the new clustered NFS service, for example, cephstorage-0, cephstorage-1, cephstorage-2.

You must deploy this service on the StorageNFS isolated network so that you can mount your existing shares through the new NFS export locations. You can deploy the new clustered NFS service on your existing CephStorage nodes or HCI nodes, or on new hardware that you enrolled in the Ceph cluster.

If you deployed your Ceph nodes with TripleO, propagate the StorageNFS network to the target nodes where the ceph-nfs service is deployed.

If the target nodes are not managed by director, you cannot use this procedure to configure the network. An administrator must manually configure all the required networks.
1. Identify the node definition file, overcloud-baremetal-deploy.yaml, that is used in the OSP environment. See Deploying an Overcloud with Network Isolation with TripleO and Applying network configuration changes after deployment for the background to these tasks.
2. Edit the networks that are associated with the Ceph Storage nodes to include the StorageNFS network:
  - name: CephStorage count: 3 hostname_format: cephstorage-%index% instances: - hostname: cephstorage-0 name: ceph-0 - hostname: cephstorage-1 name: ceph-1 - hostname: cephstorage-2 name: ceph-2 defaults: profile: ceph-storage network_config: template: /home/stack/network/nic-configs/ceph-storage.j2 network_config_update: true networks: - network: ctlplane vif: true - network: storage - network: storage_mgmt - network: storage_nfs
3. Edit the network configuration template file, for example, /home/stack/network/nic-configs/ceph-storage.j2, for the Ceph Storage nodes to include an interface that connects to the StorageNFS network:
  - type: vlan device: nic2 vlan_id: {{ storage_nfs_vlan_id }} addresses: - ip_netmask: {{ storage_nfs_ip }}/{{ storage_nfs_cidr }} routes: {{ storage_nfs_host_routes }}
4. Update the Ceph Storage nodes:
  $ openstack overcloud node provision \ --stack overcloud \ --network-config -y \ -o overcloud-baremetal-deployed-storage_nfs.yaml \ --concurrency 2 \ /home/stack/network/baremetal_deployment.yaml
  When the update is complete, ensure that a new interface is created in theCeph Storage nodes and that they are tagged with the VLAN that is associated with StorageNFS.
Identify the IP address from the StorageNFS network to use as the Virtual IP address (VIP) for the Ceph NFS service:
```
$ openstack port list -c "Fixed IP Addresses" --network storage_nfs
```
In a running cephadm shell, identify the hosts for the NFS service:
```
$ ceph orch host ls
```
Label each host that you identified. Repeat this command for each host that you want to label:
```
$ ceph orch host label add <hostname> nfs
```
- Replace <hostname> with the name of the host that you identified.

Create the NFS cluster:

$ ceph nfs cluster create cephfs \
    "label:nfs" \
    --ingress \
    --virtual-ip=<VIP> \
    --ingress-mode=haproxy-protocol

Replace <VIP> with the VIP for the Ceph NFS service.

You must set the ingress-mode argument to haproxy-protocol. No other ingress-mode is supported. This ingress mode allows you to enforce client restrictions through the Shared File Systems service.

For more information on deploying the clustered Ceph NFS service, see the ceph orchestrator documentation.

Check the status of the NFS cluster:

$ ceph nfs cluster ls
$ ceph nfs cluster info cephfs

1.12. Preparing an Instance HA deployment for adoption

To enable the high availability for Compute instances (Instance HA) service after you adopt the Red Hat OpenStack Services on OpenShift (RHOSO) Antelope data plane, perform the following preparation tasks:

Create a fencing configuration file to use after you adopt the RHOSO data plane.
Prevent Pacemaker from monitoring or recovering the Compute nodes.

1.12.1. Maintaining the Instance HA functionality after adoption

To maintain the high availability for Compute instances (Instance HA) functionality after you adopt Red Hat OpenStack Services on OpenShift Antelope, create a fencing configuration file to use in your adopted environment.

Procedure

Gather the fencing information from the fencing.yaml file in your OpenStack (OSP) cluster.

Retrieve the OSP stonith configuration from any of your overcloud Controller nodes:

$ sudo pcs config

Stonith Devices:
...
  Resource: stonith-fence_ipmilan-525400dde4f7 (class=stonith
      type=fence_ipmilan)
    Attributes: stonith-fence_ipmilan-525400dde4f7-instance_attributes
      delay=20
      ipaddr=172.16.0.1
      ipport=6231
      lanplus=true
      login=admin
      passwd=password
      pcmk_host_list=compute-1
    Operations:
      monitor: stonith-fence_ipmilan-525400dde4f7-monitor-interval-60s
        interval=60s
  Resource: stonith-fence_ipmilan-525400819ad3 (class=stonith
      type=fence_ipmilan)
    Attributes: stonith-fence_ipmilan-525400819ad3-instance_attributes
      delay=20
      ipaddr=172.16.0.1
      ipport=6230
      lanplus=true
      login=admin
      passwd=password
      pcmk_host_list=compute-0
    Operations:
      monitor: stonith-fence_ipmilan-525400819ad3-monitor-interval-60s
        interval=60s
...

Generate the fencing configuration file:
- To install the script that automatically generates this file, see How do I automatically generate fencing secret for RHOSO18 instanceha from a osp17.1 cluster that I want to adopt?.
- To create the fencing configuration file manually, see Configuring the fencing of Compute nodes in Configuring high availability for instances.

1.12.2. Preventing Pacemaker from monitoring Compute nodes

You must disable Pacemaker so that it does not monitor your Compute nodes during the adoption. For example, if a network issue occurs during the adoption, Pacemaker attempts to reboot the Compute nodes to recover them, which breaks the adoption.

Procedure

Retrieve the names of the Compute remote resources:

$ sudo pcs stonith |grep -B1 stonith-fence_compute-fence-nova |grep Target |awk -F ': ' '{print $2}'

Disable the stonith and pacemaker_remote resources on each Compute remote resource:
```
$ sudo pcs property set stonith-enabled=false
$ sudo pcs resource disable <compute_remote_resource>
```
where:

<compute_remote_resource>

Specifies the name of the Compute remote resource in your environment.

Retrieve the name of the Compute stonith resources:

$ sudo pcs stonith |grep Level |grep fence_compute |awk '{print $4}' |awk -F ',' '{print $1}' |sort |uniq

Remove the Compute node pacemaker_remote and fencing resources:

$ sudo pcs stonith disable stonith-fence_compute-fence-nova
$ sudo pcs stonith disable <compute_stonith_resource>
$ sudo pcs stonith delete <compute_stonith_resource>
$ sudo pcs resource delete <compute_remote_resource>
$ sudo pcs resource disable compute-unfence-trigger-clone
$ sudo pcs resource delete compute-unfence-trigger-clone
$ sudo pcs resource disable nova-evacuate
$ sudo pcs resource delete nova-evacuate

where:

<compute_stonith_resource>: Specifies the name of the Compute stonith resource in your environment.

1.13. Comparing configuration files between deployments

To help you manage the configuration for your TripleO and OpenStack (OSP) services, you can compare the configuration files between your TripleO deployment and the Red Hat OpenStack Services on OpenShift (RHOSO) cloud by using the os-diff tool.

Prerequisites

Golang is installed and configured on your environment:

dnf install -y golang-github-openstack-k8s-operators-os-diff

Procedure

Configure the /etc/os-diff/os-diff.cfg file and the /etc/os-diff/ssh.config file according to your environment. To allow os-diff to connect to your clouds and pull files from the services that you describe in the config.yaml file, you must set the following options in the os-diff.cfg file:
```
[Default]

local_config_dir=/tmp/
service_config_file=config.yaml

[Tripleo]

ssh_cmd=ssh -F ssh.config
director_host=standalone
container_engine=podman
connection=ssh
remote_config_path=/tmp/tripleo
local_config_path=/tmp/

[Openshift]

ocp_local_config_path=/tmp/ocp
connection=local
ssh_cmd=""
```
- ssh_cmd=ssh -F ssh.config instructs os-diff to access your TripleO host through SSH. The default value is ssh -F ssh.config. However, you can set the value without an ssh.config file, for example, ssh -i /home/user/.ssh/id_rsa stack@my.undercloud.local.
- director_host=standalone specifies the host to use to access your cloud, and the podman/docker binary is installed and allowed to interact with the running containers. You can leave this key blank.

If you use a host file to connect to your cloud, configure the ssh.config file to allow os-diff to access your OSP environment, for example:

Host *
    IdentitiesOnly yes

Host virthost
    Hostname virthost
    IdentityFile ~/.ssh/id_rsa
    User root
    StrictHostKeyChecking no
    UserKnownHostsFile=/dev/null


Host standalone
    Hostname standalone
    IdentityFile ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa
    User root
    StrictHostKeyChecking no
    UserKnownHostsFile=/dev/null

Host crc
    Hostname crc
    IdentityFile ~/.ssh/id_rsa
    User stack
    StrictHostKeyChecking no
    UserKnownHostsFile=/dev/null

Replace <path to SSH key> with the path to your SSH key. You must provide a value for IdentityFile to get full working access to your OSP environment.

If you use an inventory file to connect to your cloud, generate the ssh.config file from your Ansible inventory, for example, tripleo-ansible-inventory.yaml file:
```
$ os-diff configure -i tripleo-ansible-inventory.yaml -o ssh.config --yaml
```

Verification

Test your connection:
```
$ ssh -F ssh.config standalone
```

2. Migrating TLS-e to the RHOSO deployment

If you enabled TLS everywhere (TLS-e) in your OpenStack (OSP) deployment, you must migrate TLS-e to the Red Hat OpenStack Services on OpenShift (RHOSO) deployment.

The RHOSO deployment uses the cert-manager operator to issue, track, and renew the certificates. In the following procedure, you extract the CA signing certificate from the FreeIPA instance that you use to provide the certificates in the OSP environment, and then import them into cert-manager in the RHOSO environment. As a result, you minimize the disruption on the Compute nodes because you do not need to install a new chain of trust.

You then decommission the previous FreeIPA node and no longer use it to issue certificates. This might not be possible if you use the IPA server to issue certificates for non-OSP systems.

The following procedure was reproduced on a FreeIPA 4.10.1 server. The location of the files and directories might change depending on the version.
If the signing keys are stored in an hardware security module (HSM) instead of an NSS shared database (NSSDB), and the keys are retrievable, special HSM utilities might be required.

Prerequisites

Your OSP deployment is using TLS-e.
Ensure that the back-end services on the new deployment are not started yet.
Define the following shell variables. The values are examples and refer to a single-node standalone TripleO deployment. Replace these example values with values that are correct for your environment:
```
IPA_SSH="ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100 podman exec -ti freeipa-server"
```
In this example the FreeIPA instance is running on a separate host, in a container.

Procedure

To locate the CA certificate and key, list all the certificates inside your NSSDB:

$IPA_SSH certutil -L -d /etc/pki/pki-tomcat/alias

The -L option lists all certificates.

The -d option specifies where the certificates are stored.

The command produces an output similar to the following example:

Certificate Nickname                                         Trust Attributes
                                                             SSL,S/MIME,JAR/XPI

caSigningCert cert-pki-ca                                    CTu,Cu,Cu
ocspSigningCert cert-pki-ca                                  u,u,u
Server-Cert cert-pki-ca                                      u,u,u
subsystemCert cert-pki-ca                                    u,u,u
auditSigningCert cert-pki-ca                                 u,u,Pu

Export the certificate and key from the /etc/pki/pki-tomcat/alias directory. The following example uses the caSigningCert cert-pki-ca certificate:

$IPA_SSH pk12util -o /tmp/freeipa.p12 -n 'caSigningCert\ cert-pki-ca' -d /etc/pki/pki-tomcat/alias -k /etc/pki/pki-tomcat/alias/pwdfile.txt -w /etc/pki/pki-tomcat/alias/pwdfile.txt

The command generates a P12 file with both the certificate and the key. The /etc/pki/pki-tomcat/alias/pwdfile.txt file contains the password that protects the key. You can use the password to both extract the key and generate the new file, /tmp/freeipa.p12. You can also choose another password. If you choose a different password for the new file, replace the parameter of the -w option, or use the -W option followed by the password, in clear text.

With that file, you can also get the certificate and the key by using the openssl pkcs12 command.

Create the secret that contains the root CA:
```
$ oc create secret generic rootca-internal
```

Import the certificate and the key from FreeIPA:

$ oc patch secret rootca-internal -p="{\"data\":{\"ca.crt\": \"`$IPA_SSH openssl pkcs12 -in /tmp/freeipa.p12 -passin file:/etc/pki/pki-tomcat/alias/pwdfile.txt -nokeys | openssl x509 | base64 -w 0`\"}}"

$ oc patch secret rootca-internal -p="{\"data\":{\"tls.crt\": \"`$IPA_SSH openssl pkcs12 -in /tmp/freeipa.p12 -passin file:/etc/pki/pki-tomcat/alias/pwdfile.txt -nokeys | openssl x509 | base64 -w 0`\"}}"

$ oc patch secret rootca-internal -p="{\"data\":{\"tls.key\": \"`$IPA_SSH openssl pkcs12 -in /tmp/freeipa.p12 -passin file:/etc/pki/pki-tomcat/alias/pwdfile.txt -nocerts -noenc | openssl rsa | base64 -w 0`\"}}"

Create the cert-manager issuer and reference the secret:

$ oc apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: rootca-internal
  labels:
    osp-rootca-issuer-public: ""
    osp-rootca-issuer-internal: ""
    osp-rootca-issuer-libvirt: ""
    osp-rootca-issuer-ovn: ""
spec:
  ca:
    secretName: rootca-internal
EOF

Delete the previously created p12 files:
```
$IPA_SSH rm /tmp/freeipa.p12
```

Verification

Verify that the necessary resources are created:

$ oc get issuers

$ oc get secret rootca-internal -o yaml

After the adoption is complete, the cert-manager operator issues new certificates and updates the secrets with the new certificates. As a result, the pods on the control plane automatically restart in order to obtain the new certificates. On the data plane, you must manually initiate a new deployment and restart certain processes to use the new certificates. The old certificates remain active until both the control plane and data plane obtain the new certificates.

3. Migrating databases to the control plane

To begin creating the control plane, enable back-end services and import the databases from your original OpenStack deployment.

3.1. Retrieving topology-specific service configuration

Before you migrate your databases to the Red Hat OpenStack Services on OpenShift (RHOSO) control plane, retrieve the topology-specific service configuration from your OpenStack (OSP) environment. You need this configuration for the following reasons:

To check your current database for inaccuracies
To ensure that you have the data you need before the migration
To compare your OSP database with the adopted RHOSO database

Prerequisites

Define the following shell variables. Replace the example values with values that are correct for your environment:

If you use IPv6, define the SOURCE_MARIADB_IP value without brackets. For example, SOURCE_MARIADB_IP=fd00:bbbb::2.

$ PASSWORD_FILE="$HOME/overcloud-passwords.yaml"
$ MARIADB_IMAGE=quay.io/podified-antelope-centos9/openstack-mariadb:current-podified
$ declare -A TRIPLEO_PASSWORDS
$ CELLS="default cell1 cell2"
$ for CELL in $(echo $CELLS); do
>    TRIPLEO_PASSWORDS[$CELL]="$PASSWORD_FILE"
> done
$ declare -A SOURCE_DB_ROOT_PASSWORD
$ for CELL in $(echo $CELLS); do
>     SOURCE_DB_ROOT_PASSWORD[$CELL]=$(cat ${TRIPLEO_PASSWORDS[$CELL]} | grep ' MysqlRootPassword:' | awk -F ': ' '{ print $2; }')
> done

Define the following shell variables. Replace the example values with values that are correct for your environment:

$ MARIADB_CLIENT_ANNOTATIONS='--annotations=k8s.v1.cni.cncf.io/networks=internalapi'
$ MARIADB_RUN_OVERRIDES="$MARIADB_CLIENT_ANNOTATIONS"

For environments that are enabled with border gateway protocol (BGP), the network annotation must include a default route to enable proper routing. Use the following instead:

$ MARIADB_CLIENT_ANNOTATIONS='--annotations=k8s.v1.cni.cncf.io/networks=[{"name":"internalapi","namespace":"openstack","default-route":["<172.17.0.1>"]}]'
$ MARIADB_RUN_OVERRIDES="$MARIADB_CLIENT_ANNOTATIONS"

where:

<172.17.0.1>: Replace with the gateway IP address of your internalapi network.

$ CONTROLLER1_SSH="ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100"

$ declare -A SOURCE_MARIADB_IP
$ SOURCE_MARIADB_IP[default]=*<galera cluster VIP>*
$ SOURCE_MARIADB_IP[cell1]=*<galera cell1 cluster VIP>*
$ SOURCE_MARIADB_IP[cell2]=*<galera cell2 cluster VIP>*
# ...

Provide CONTROLLER1_SSH settings with SSH connection details for any non-cell Controller of the source TripleO cloud.
For each cell that is defined in CELLS, replace SOURCE_MARIADB_IP[*]= ..., with the records lists for the cell names and VIP addresses of MariaDB Galera clusters, including the cells, of the source TripleO cloud.

To get the values for SOURCE_MARIADB_IP, query the puppet-generated configurations in a Controller and CellController node:

$ sudo grep -rI 'listen mysql' -A10 /var/lib/config-data/puppet-generated/ | grep bind

The source cloud always uses the same password for cells databases. For that reason, the same passwords file is used for all cells stacks. However, split-stack topology allows using different passwords files for each stack.

Procedure

If your source OSP environment uses border gateway protocol (BGP) for Layer 3 networking, create a BGPConfiguration custom resource to enable BGP routing:
```
$ cat << EOF > bgp.yaml
apiVersion: network.openstack.org/v1beta1
kind: BGPConfiguration
metadata:
  name: bgpconfiguration
  namespace: openstack
spec: {}
EOF

$ oc apply -f bgp.yaml
```
The BGPConfiguration resource enables BGP route advertisement between the OpenShift cluster and the source cloud, which is necessary for the mariadb-client pod to reach the source MariaDB cluster.
Create a persistent mariadb-client pod for database operations:
```
$ oc delete pod mariadb-client || true
$ oc run mariadb-client ${MARIADB_RUN_OVERRIDES} -q --image ${MARIADB_IMAGE} --restart=Never -- /usr/bin/sleep infinity
```
This creates a long-running pod that is used for all subsequent database operations, avoiding the need to create temporary pods for each command.

Wait for the mariadb-client pod to be able to reach the source MariaDB:

$ oc rsh mariadb-client mysql -rsh "${SOURCE_MARIADB_IP[default]}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[default]}" -e 'select 1;'

For BGP-enabled environments, this command might take a few moments to succeed while BGP routes are advertised and propagated through the network. The mariadb-client pod needs to receive the route to the source MariaDB IP address through BGP before it can establish a connection. If the command fails, wait a few seconds and retry. The connection should succeed once the BGP route advertisement is complete.

For standard (non-BGP) deployments, this command should succeed immediately.

Export the shell variables for the following outputs and test the connection to the OSP database:

$ unset PULL_OPENSTACK_CONFIGURATION_DATABASES
$ declare -xA PULL_OPENSTACK_CONFIGURATION_DATABASES
$ for CELL in $(echo $CELLS); do
>     PULL_OPENSTACK_CONFIGURATION_DATABASES[$CELL]=$(oc rsh mariadb-client \
>         mysql -rsh "${SOURCE_MARIADB_IP[$CELL]}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}" -e 'SHOW databases;')
> done

If the connection is successful, the expected output is nothing.

The nova, nova_api, and nova_cell0 databases are included in the same database host for the main overcloud Orchestration service (heat) stack. Additional cells use the nova database of their local Galera clusters.

Run mysqlcheck on the OSP database to check for inaccuracies:

$ unset PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK
$ declare -xA PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK
$ run_mysqlcheck() {
>     PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK=$(oc rsh mariadb-client \
>         mysqlcheck --all-databases -h ${SOURCE_MARIADB_IP[$CELL]} -u root -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}" | grep -v OK)
> }
$ for CELL in $(echo $CELLS); do
>     run_mysqlcheck $CELL
> done
$ if [ "$PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK" != "" ]; then
>     for CELL in $(echo $CELLS); do
>         MYSQL_UPGRADE=$(oc rsh mariadb-client \
>             mysql_upgrade --skip-version-check -v -h ${SOURCE_MARIADB_IP[$CELL]} -u root -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}")
>         # rerun mysqlcheck to check if problem is resolved
>         run_mysqlcheck
>     done
> fi

Get the Compute service (nova) cell mappings:

export PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS=$(oc rsh mariadb-client \
    mysql -rsh "${SOURCE_MARIADB_IP[default]}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[default]}" nova_api -e \
    'select uuid,name,transport_url,database_connection,disabled from cell_mappings;')

Get the hostnames of the registered Compute services:

$ unset PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES
$ declare -xA PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES
$ for CELL in $(echo $CELLS); do
>     PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES[$CELL]=$(oc rsh mariadb-client \
>         mysql -rsh "${SOURCE_MARIADB_IP[$CELL]}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}" -e \
>             "select host from nova.services where services.binary='nova-compute' and deleted=0;")
> done

Get the list of the mapped Compute service cells:

export PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS=$($CONTROLLER1_SSH sudo podman exec -it nova_conductor nova-manage cell_v2 list_cells)

Store the exported variables for future use:

$ unset SRIOV_AGENTS
$ declare -xA SRIOV_AGENTS
$ for CELL in $(echo $CELLS); do
>     RCELL=$CELL
>     [ "$CELL" = "$DEFAULT_CELL_NAME" ] && RCELL=default
>     cat > ~/.source_cloud_exported_variables_$CELL << EOF
> unset PULL_OPENSTACK_CONFIGURATION_DATABASES
> unset PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK
> unset PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES
> declare -xA PULL_OPENSTACK_CONFIGURATION_DATABASES
> declare -xA PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK
> declare -xA PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES
> PULL_OPENSTACK_CONFIGURATION_DATABASES[$CELL]="$(oc rsh mariadb-client \
>     mysql -rsh ${SOURCE_MARIADB_IP[$RCELL]} -uroot -p${SOURCE_DB_ROOT_PASSWORD[$RCELL]} -e 'SHOW databases;')"
> PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK[$CELL]="$(oc rsh mariadb-client \
>     mysqlcheck --all-databases -h ${SOURCE_MARIADB_IP[$RCELL]} -u root -p${SOURCE_DB_ROOT_PASSWORD[$RCELL]} | grep -v OK)"
> PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES[$CELL]="$(oc rsh mariadb-client \
>     mysql -rsh ${SOURCE_MARIADB_IP[$RCELL]} -uroot -p${SOURCE_DB_ROOT_PASSWORD[$RCELL]} -e \
>     "select host from nova.services where services.binary='nova-compute' and deleted=0;")"
> if [ "$RCELL" = "default" ]; then
>     PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS="$(oc rsh mariadb-client \
>         mysql -rsh ${SOURCE_MARIADB_IP[$RCELL]} -uroot -p${SOURCE_DB_ROOT_PASSWORD[$RCELL]} nova_api -e \
>             'select uuid,name,transport_url,database_connection,disabled from cell_mappings;')"
>     PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS="$($CONTROLLER1_SSH sudo podman exec -it nova_conductor nova-manage cell_v2 list_cells)"
> fi
> EOF
> done
$ chmod 0600 ~/.source_cloud_exported_variables*

declare -xA SRIOV_AGENTS gets the neutron-sriov-nic-agent configuration to use for the data plane adoption if neutron-sriov-nic-agent agents are running in your OSP deployment.

Clean up the mariadb-client pod:
```
$ oc delete pod mariadb-client
```
The mariadb-client pod is no longer needed after all the data is exported and stored.

Next steps

This configuration and the exported values are required later, during the data plane adoption post-checks. After the OSP control plane services are shut down, if any of the exported values are lost, re-running the export command fails because the control plane services are no longer running on the source cloud, and the data cannot be retrieved. To avoid data loss, preserve the exported values in an environment file before shutting down the control plane services.

3.2. Deploying back-end services

Create the OpenStackControlPlane custom resource (CR) with the basic back-end services deployed, and disable all the OpenStack (OSP) services. This CR is the foundation of the control plane.

Prerequisites

The cloud that you want to adopt is running, and it is on OSP 17.1.4 or later.
All control plane and data plane hosts of the source cloud are running, and continue to run throughout the adoption procedure.
The openstack-operator is deployed, but OpenStackControlPlane is not deployed.

For developer/CI environments, the OSP operator can be deployed by running make openstack inside install_yamls repo.

For production environments, the deployment method will likely be different.
If you enabled TLS everywhere (TLS-e) on the OSP environment, you must copy the tls root CA from the OSP environment to the rootca-internal issuer.
There are free PVs available for Galera and RabbitMQ.

For developer/CI environments driven by install_yamls, make sure you’ve run make crc_storage.

Set the desired admin password for the control plane deployment. This can be the admin password from your original deployment or a different password:

ADMIN_PASSWORD=SomePassword

To use the existing OSP deployment password:

declare -A TRIPLEO_PASSWORDS
TRIPLEO_PASSWORDS[default]="$HOME/overcloud-passwords.yaml"
ADMIN_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' AdminPassword:' | awk -F ': ' '{ print $2; }')

Set the service password variables to match the original deployment. Database passwords can differ in the control plane environment, but you must synchronize the service account passwords.

For example, in developer environments with TripleO Standalone, the passwords can be extracted:

AODH_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' AodhPassword:' | awk -F ': ' '{ print $2; }')
BARBICAN_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' BarbicanPassword:' | awk -F ': ' '{ print $2; }')
CEILOMETER_METERING_SECRET=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' CeilometerMeteringSecret:' | awk -F ': ' '{ print $2; }')
CEILOMETER_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' CeilometerPassword:' | awk -F ': ' '{ print $2; }')
CINDER_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' CinderPassword:' | awk -F ': ' '{ print $2; }')
DESIGNATE_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' DesignatePassword:' | awk -F ': ' '{ print $2; }')
GLANCE_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' GlancePassword:' | awk -F ': ' '{ print $2; }')
HEAT_AUTH_ENCRYPTION_KEY=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' HeatAuthEncryptionKey:' | awk -F ': ' '{ print $2; }')
HEAT_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' HeatPassword:' | awk -F ': ' '{ print $2; }')
HEAT_STACK_DOMAIN_ADMIN_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' HeatStackDomainAdminPassword:' | awk -F ': ' '{ print $2; }')
IRONIC_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' IronicPassword:' | awk -F ': ' '{ print $2; }')
MANILA_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' ManilaPassword:' | awk -F ': ' '{ print $2; }')
NEUTRON_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' NeutronPassword:' | awk -F ': ' '{ print $2; }')
NOVA_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' NovaPassword:' | awk -F ': ' '{ print $2; }')
OCTAVIA_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' OctaviaPassword:' | awk -F ': ' '{ print $2; }')
PLACEMENT_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' PlacementPassword:' | awk -F ': ' '{ print $2; }')
SWIFT_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' SwiftPassword:' | awk -F ': ' '{ print $2; }')

Procedure

Ensure that you are using the OpenShift namespace where you want the control plane to be deployed:
```
$ oc project openstack
```
Create the OSP secret.

The procedure for this will vary, but in developer/CI environments you use install_yamls:
```
# in install_yamls
make input
```
If the $ADMIN_PASSWORD is different than the password you set in osp-secret, amend the AdminPassword key in the osp-secret:
```
$ oc set data secret/osp-secret "AdminPassword=$ADMIN_PASSWORD"
```

Set service account passwords in osp-secret to match the service account passwords from the original deployment:

$ oc set data secret/osp-secret "AodhPassword=$AODH_PASSWORD"
$ oc set data secret/osp-secret "BarbicanPassword=$BARBICAN_PASSWORD"
$ oc set data secret/osp-secret "CeilometerPassword=$CEILOMETER_PASSWORD"
$ oc set data secret/osp-secret "CinderPassword=$CINDER_PASSWORD"
$ oc set data secret/osp-secret "DesignatePassword=$DESIGNATE_PASSWORD"
$ oc set data secret/osp-secret "GlancePassword=$GLANCE_PASSWORD"
$ oc set data secret/osp-secret "HeatAuthEncryptionKey=$HEAT_AUTH_ENCRYPTION_KEY"
$ oc set data secret/osp-secret "HeatPassword=$HEAT_PASSWORD"
$ oc set data secret/osp-secret "HeatStackDomainAdminPassword=$HEAT_STACK_DOMAIN_ADMIN_PASSWORD"
$ oc set data secret/osp-secret "IronicPassword=$IRONIC_PASSWORD"
$ oc set data secret/osp-secret "IronicInspectorPassword=$IRONIC_PASSWORD"
$ oc set data secret/osp-secret "ManilaPassword=$MANILA_PASSWORD"
$ oc set data secret/osp-secret "MetadataSecret=$METADATA_SECRET"
$ oc set data secret/osp-secret "NeutronPassword=$NEUTRON_PASSWORD"
$ oc set data secret/osp-secret "NovaPassword=$NOVA_PASSWORD"
$ oc set data secret/osp-secret "OctaviaPassword=$OCTAVIA_PASSWORD"
$ oc set data secret/osp-secret "PlacementPassword=$PLACEMENT_PASSWORD"
$ oc set data secret/osp-secret "SwiftPassword=$SWIFT_PASSWORD"

Deploy the OpenStackControlPlane CR. Ensure that you only enable the DNS, Galera, Memcached, and RabbitMQ services. All other services must be disabled:

$ oc apply -f - <<EOF
apiVersion: core.openstack.org/v1beta1
kind: OpenStackControlPlane
metadata:
  name: openstack
spec:
  secret: osp-secret
  storageClass: <storage_class>

  barbican:
    enabled: false
    template:
      barbicanAPI: {}
      barbicanWorker: {}
      barbicanKeystoneListener: {}

  cinder:
    enabled: false
    template:
      cinderAPI: {}
      cinderScheduler: {}
      cinderBackup: {}
      cinderVolumes: {}

  dns:
    template:
      override:
        service:
          metadata:
            annotations:
              metallb.universe.tf/address-pool: ctlplane
              metallb.universe.tf/allow-shared-ip: ctlplane
              metallb.universe.tf/loadBalancerIPs: <loadBalancer_IP>

          spec:
            type: LoadBalancer
      options:
      - key: server
        values:
        - 192.168.122.1
      replicas: 1

  glance:
    enabled: false
    template:
      glanceAPIs: {}

  heat:
    enabled: false
    template: {}

  horizon:
    enabled: false
    template: {}

  ironic:
    enabled: false
    template:
      ironicConductors: []

  keystone:
    enabled: false
    template: {}

  manila:
    enabled: false
    template:
      manilaAPI: {}
      manilaScheduler: {}
      manilaShares: {}

  galera:
    enabled: true
    templates:
      openstack:
        secret: osp-secret
        replicas: 3
        storageRequest: 5G
      openstack-cell1:
        secret: osp-secret
        replicas: 3
        storageRequest: 5G
      openstack-cell2:
        secret: osp-secret
        replicas: 1
        storageRequest: 5G
      openstack-cell3:
        secret: osp-secret
        replicas: 1
        storageRequest: 5G
  memcached:
    enabled: true
    templates:
      memcached:
        replicas: 3

  neutron:
    enabled: false
    template: {}

  nova:
    enabled: false
    template: {}

  ovn:
    enabled: false
    template:
      ovnController:
        networkAttachment: tenant
        nodeSelector:
          node: non-existing-node-name
      ovnNorthd:
        replicas: 0
      ovnDBCluster:
        ovndbcluster-nb:
          replicas: 3
          dbType: NB
          networkAttachment: internalapi
        ovndbcluster-sb:
          replicas: 3
          dbType: SB
          networkAttachment: internalapi

  placement:
    enabled: false
    template: {}

  rabbitmq:
    templates:
      rabbitmq:
        override:
          service:
            metadata:
              annotations:
                metallb.universe.tf/address-pool: internalapi
                metallb.universe.tf/loadBalancerIPs: <loadBalancer_IP>
            spec:
              type: LoadBalancer
      rabbitmq-cell1:
        persistence:
          storage: 1G
        override:
          service:
            metadata:
              annotations:
                metallb.universe.tf/address-pool: internalapi
                metallb.universe.tf/loadBalancerIPs: <loadBalancer_IP>

            spec:
              type: LoadBalancer
      rabbitmq-cell2:
        persistence:
          storage: 1G
        override:
          service:
            metadata:
              annotations:
                metallb.universe.tf/address-pool: internalapi
                metallb.universe.tf/loadBalancerIPs: <loadBalancer_IP>
            spec:
              type: LoadBalancer
      rabbitmq-cell3:
        persistence:
          storage: 1G
        override:
          service:
            metadata:
              annotations:
                metallb.universe.tf/address-pool: internalapi
                metallb.universe.tf/loadBalancerIPs: <loadBalancer_IP>
            spec:
              type: LoadBalancer
  telemetry:
    enabled: false
  tls:
    podLevel:
      enabled: false
    ingress:
      enabled: false
  swift:
    enabled: false
    template:
      swiftRing:
        ringReplicas: 3
      swiftStorage:
        replicas: 0
      swiftProxy:
        replicas: 2
EOF

where:

<storage_class>

Specifies an existing storage class in your OCP cluster.

<loadBalancer_IP>

Specifies the LoadBalancer IP address. If you use IPv6, change the load balancer IPs to the IPs in your environment, for example:

...
metallb.universe.tf/allow-shared-ip: ctlplane
metallb.universe.tf/loadBalancerIPs: fd00:aaaa::80
...
metallb.universe.tf/address-pool: internalapi
metallb.universe.tf/loadBalancerIPs: fd00:bbbb::85
...
metallb.universe.tf/address-pool: internalapi
metallb.universe.tf/loadBalancerIPs: fd00:bbbb::86

galera.openstack-cell1 provides the required infrastructure database and messaging services for the Compute cells, for example, cell1, cell2, and cell3. Adjust the values for fields such as replicas, storage, or storageRequest, for each Compute cell as needed.

spec.tls specifies whether TLS-e is enabled. If you enabled TLS-e in your OSP environment, set tls to the following:

spec:
  ...
  tls:
    podLevel:
      enabled: true
      internal:
        ca:
          customIssuer: rootca-internal
      libvirt:
        ca:
          customIssuer: rootca-internal
      ovn:
        ca:
          customIssuer: rootca-internal
    ingress:
      ca:
        customIssuer: rootca-internal
      enabled: true

Verification

Verify that the Galera and RabbitMQ status is Running for all defined cells:

$ RENAMED_CELLS="cell1 cell2 cell3"
$ oc get pod openstack-galera-0 -o jsonpath='{.status.phase}{"\n"}'
$ oc get pod rabbitmq-server-0 -o jsonpath='{.status.phase}{"\n"}'
$ for CELL in $(echo $RENAMED_CELLS); do
>     oc get pod openstack-$CELL-galera-0 -o jsonpath='{.status.phase}{"\n"}'
>     oc get pod rabbitmq-$CELL-server-0 -o jsonpath='{.status.phase}{"\n"}'
> done

The given cells names are later referred to by using the environment variable RENAMED_CELLS. During the database migration procedure, the Nova cells are renamed. RENAMED_CELLS variable represents the new cell names used in the RHOSO deployment.

Ensure that the statuses of all the Rabbitmq and Galera CRs are Setup complete:

$ oc get Rabbitmqs,Galera
NAME                                                                  STATUS   MESSAGE
rabbitmq.rabbitmq.openstack.org/rabbitmq                              True     Setup complete
rabbitmq.rabbitmq.openstack.org/rabbitmq-cell1                        True     Setup complete

NAME                                                               READY   MESSAGE
galera.mariadb.openstack.org/openstack                             True     Setup complete
galera.mariadb.openstack.org/openstack-cell1                       True     Setup complete

Verify that the OpenStackControlPlane CR is waiting for deployment of the openstackclient pod:

$ oc get OpenStackControlPlane openstack
NAME        STATUS    MESSAGE
openstack   Unknown   OpenStackControlPlane Client not started

3.3. Configuring a Ceph back end

If your OpenStack (OSP) deployment uses a Ceph back end for any service, such as Image Service (glance), Block Storage service (cinder), Compute service (nova), or Shared File Systems service (manila), you must configure the custom resources (CRs) to use the same back end in the Red Hat OpenStack Services on OpenShift (RHOSO) Antelope deployment.

To run ceph commands, you must use SSH to connect to a Ceph node and run sudo cephadm shell. This generates a Ceph orchestrator container that enables you to run administrative commands against the Ceph Storage cluster. If you deployed the Ceph Storage cluster by using TripleO, you can launch the cephadm shell from an OSP Controller node.

Prerequisites

The OpenStackControlPlane CR is created.
If your OSP deployment uses the Shared File Systems service, the openstack keyring is updated. Modify the openstack user so that you can use it across all OSP services:
```
ceph auth caps client.openstack \
  mgr 'allow *' \
  mon 'allow r, profile rbd' \
  osd 'profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=images, allow rw pool manila_data'
```
Using the same user across all services makes it simpler to create a common Ceph secret that includes the keyring and ceph.conf file and propagate the secret to all the services that need it.

The following shell variables are defined. Replace the following example values with values that are correct for your environment:

CEPH_SSH="ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100"
CEPH_KEY=$($CEPH_SSH "cat /etc/ceph/ceph.client.openstack.keyring | base64 -w 0")
CEPH_CONF=$($CEPH_SSH "cat /etc/ceph/ceph.conf | base64 -w 0")

Procedure

Create the ceph-conf-files secret that includes the Ceph configuration:

$ oc apply -f - <<EOF
apiVersion: v1
data:
  ceph.client.openstack.keyring: $CEPH_KEY
  ceph.conf: $CEPH_CONF
kind: Secret
metadata:
  name: ceph-conf-files
type: Opaque
EOF

The content of the file should be similar to the following example:

apiVersion: v1
kind: Secret
metadata:
  name: ceph-conf-files
stringData:
  ceph.client.openstack.keyring: |
    [client.openstack]
        key = <secret key>
        caps mgr = "allow *"
        caps mon = "allow r, profile rbd"
        caps osd = "pool=vms, profile rbd pool=volumes, profile rbd pool=images, allow rw pool manila_data'
  ceph.conf: |
    [global]
    fsid = 7a1719e8-9c59-49e2-ae2b-d7eb08c695d4
    mon_host = 10.1.1.2,10.1.1.3,10.1.1.4

mon_host specifies the addresses of the cluster’s monitors. If you use IPv6, use brackets for the mon_host. For example: mon_host = [v2:[fd00:cccc::100]:3300/0,v1:[fd00:cccc::100]:6789/0]

In your OpenStackControlPlane CR, inject ceph.conf and ceph.client.openstack.keyring to the OSP services that are defined in the propagation list. For example:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  extraMounts:
    - name: v1
      region: r1
      extraVol:
        - propagation:
          - CinderVolume
          - CinderBackup
          - GlanceAPI
          - ManilaShare
          extraVolType: Ceph
          volumes:
          - name: ceph
            projected:
              sources:
              - secret:
                  name: ceph-conf-files
          mounts:
          - name: ceph
            mountPath: "/etc/ceph"
            readOnly: true
'

3.4. Stopping OpenStack services

Before you start the Red Hat OpenStack Services on OpenShift (RHOSO) adoption, you must stop the OpenStack (OSP) services to avoid inconsistencies in the data that you migrate for the data plane adoption. Inconsistencies are caused by resource changes after the database is copied to the new deployment.

You should not stop the infrastructure management services yet, such as:

Database
RabbitMQ
HAProxy Load Balancer
Ceph-nfs
Compute service
Containerized modular libvirt daemons
Object Storage service (swift) back-end services

Prerequisites

Ensure that there no long-running tasks that require the services that you plan to stop, such as instance live migrations, volume migrations, volume creation, backup and restore, attaching, detaching, and other similar operations:

$ openstack server list --all-projects -c ID -c Status |grep -E '\| .+ing \|'
$ openstack volume list --all-projects -c ID -c Status |grep -E '\| .+ing \|'| grep -vi error
$ openstack volume backup list --all-projects -c ID -c Status |grep -E '\| .+ing \|' | grep -vi error
$ openstack share list --all-projects -c ID -c Status |grep -E '\| .+ing \|'| grep -vi error
$ openstack image list -c ID -c Status |grep -E '\| .+ing \|'

Collect the services topology-specific configuration. For more information, see Retrieving topology-specific service configuration.
Define the following shell variables. The values are examples and refer to a single node standalone TripleO deployment. Replace these example values with values that are correct for your environment:
```
CONTROLLER1_SSH="ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100"
```

Specify the IP addresses of all Controller nodes, for example:

CONTROLLER1_SSH="ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.103"
CONTROLLER2_SSH="ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.106"
CONTROLLER3_SSH="ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.109"

Procedure

If your deployment enables CephFS through NFS as a back end for Shared File Systems service (manila), remove the following Pacemaker ordering and co-location constraints that govern the Virtual IP address of the ceph-nfs service and the manila-share service:

# check the co-location and ordering constraints concerning "manila-share"
sudo pcs constraint list --full

# remove these constraints
sudo pcs constraint remove colocation-openstack-manila-share-ceph-nfs-INFINITY
sudo pcs constraint remove order-ceph-nfs-openstack-manila-share-Optional

Disable OSP control plane services:

# Update the services list to be stopped
ServicesToStop=("tripleo_aodh_api.service"
                "tripleo_aodh_api_cron.service"
                "tripleo_aodh_evaluator.service"
                "tripleo_aodh_listener.service"
                "tripleo_aodh_notifier.service"
                "tripleo_ceilometer_agent_central.service"
                "tripleo_ceilometer_agent_notification.service"
                "tripleo_designate_api.service"
                "tripleo_designate_backend_bind9.service"
                "tripleo_designate_central.service"
                "tripleo_designate_mdns.service"
                "tripleo_designate_producer.service"
                "tripleo_designate_worker.service"
                "tripleo_octavia_api.service"
                "tripleo_octavia_health_manager.service"
                "tripleo_octavia_rsyslog.service"
                "tripleo_octavia_driver_agent.service"
                "tripleo_octavia_housekeeping.service"
                "tripleo_octavia_worker.service"
                "tripleo_horizon.service"
                "tripleo_keystone.service"
                "tripleo_barbican_api.service"
                "tripleo_barbican_worker.service"
                "tripleo_barbican_keystone_listener.service"
                "tripleo_cinder_api.service"
                "tripleo_cinder_api_cron.service"
                "tripleo_cinder_scheduler.service"
                "tripleo_cinder_volume.service"
                "tripleo_cinder_backup.service"
                "tripleo_collectd.service"
                "tripleo_glance_api.service"
                "tripleo_gnocchi_api.service"
                "tripleo_gnocchi_metricd.service"
                "tripleo_gnocchi_statsd.service"
                "tripleo_manila_api.service"
                "tripleo_manila_api_cron.service"
                "tripleo_manila_scheduler.service"
                "tripleo_neutron_api.service"
                "tripleo_placement_api.service"
                "tripleo_nova_api_cron.service"
                "tripleo_nova_api.service"
                "tripleo_nova_conductor.service"
                "tripleo_nova_metadata.service"
                "tripleo_nova_scheduler.service"
                "tripleo_nova_vnc_proxy.service"
                "tripleo_aodh_api.service"
                "tripleo_aodh_api_cron.service"
                "tripleo_aodh_evaluator.service"
                "tripleo_aodh_listener.service"
                "tripleo_aodh_notifier.service"
                "tripleo_ceilometer_agent_central.service"
                "tripleo_ceilometer_agent_compute.service"
                "tripleo_ceilometer_agent_ipmi.service"
                "tripleo_ceilometer_agent_notification.service"
                "tripleo_ovn_cluster_northd.service"
                "tripleo_ironic_neutron_agent.service"
                "tripleo_ironic_api.service"
                "tripleo_ironic_inspector.service"
                "tripleo_ironic_conductor.service"
                "tripleo_ironic_inspector_dnsmasq.service"
                "tripleo_ironic_pxe_http.service"
                "tripleo_ironic_pxe_tftp.service"
                "tripleo_unbound.service")

PacemakerResourcesToStop=("openstack-cinder-volume"
                          "openstack-cinder-backup"
                          "openstack-manila-share")

echo "Stopping systemd OpenStack services"
for service in ${ServicesToStop[*]}; do
    for i in {1..3}; do
        SSH_CMD=CONTROLLER${i}_SSH
        if [ ! -z "${!SSH_CMD}" ]; then
            echo "Stopping the $service in controller $i"
            if ${!SSH_CMD} sudo systemctl is-active $service; then
                ${!SSH_CMD} sudo systemctl stop $service
            fi
        fi
    done
done

echo "Checking systemd OpenStack services"
for service in ${ServicesToStop[*]}; do
    for i in {1..3}; do
        SSH_CMD=CONTROLLER${i}_SSH
        if [ ! -z "${!SSH_CMD}" ]; then
            if ! ${!SSH_CMD} systemctl show $service | grep ActiveState=inactive >/dev/null; then
                echo "ERROR: Service $service still running on controller $i"
            else
                echo "OK: Service $service is not running on controller $i"
            fi
        fi
    done
done

echo "Stopping pacemaker OpenStack services"
for i in {1..3}; do
    SSH_CMD=CONTROLLER${i}_SSH
    if [ ! -z "${!SSH_CMD}" ]; then
        echo "Using controller $i to run pacemaker commands"
        for resource in ${PacemakerResourcesToStop[*]}; do
            if ${!SSH_CMD} sudo pcs resource config $resource &>/dev/null; then
                echo "Stopping $resource"
                ${!SSH_CMD} sudo pcs resource disable $resource
            else
                echo "Service $resource not present"
            fi
    done
        break
    fi
done

echo "Checking pacemaker OpenStack services"
for i in {1..3}; do
    SSH_CMD=CONTROLLER${i}_SSH
    if [ ! -z "${!SSH_CMD}" ]; then
        echo "Using controller $i to run pacemaker commands"
        for resource in ${PacemakerResourcesToStop[*]}; do
            if ${!SSH_CMD} sudo pcs resource config $resource &>/dev/null; then
                if ! ${!SSH_CMD} sudo pcs resource status $resource | grep Started; then
                    echo "OK: Service $resource is stopped"
                else
                    echo "ERROR: Service $resource is started"
                fi
            fi
        done
        break
    fi
    done

If the status of each service is OK, then the services stopped successfully.

3.5. Migrating databases to MariaDB instances

Migrate your databases from the original OpenStack (OSP) deployment to the MariaDB instances in the OpenShift cluster.

Prerequisites

Ensure that the control plane MariaDB and RabbitMQ are running, and that no other control plane services are running.
Retrieve the topology-specific service configuration. For more information, see Retrieving topology-specific service configuration.
Stop the OSP services. For more information, see Stopping OpenStack services.
Ensure that there is network routability between the original MariaDB and the MariaDB for the control plane.

Define the following shell variables. Replace the following example values with values that are correct for your environment:

$ STORAGE_CLASS=local-storage
$ MARIADB_IMAGE=quay.io/podified-antelope-centos9/openstack-mariadb:current-podified


$ CELLS="default cell1 cell2"
$ DEFAULT_CELL_NAME="cell3"
$ RENAMED_CELLS="cell1 cell2 $DEFAULT_CELL_NAME"

$ CHARACTER_SET=utf8 #
$ COLLATION=utf8_general_ci

$ declare -A PODIFIED_DB_ROOT_PASSWORD
$ for CELL in $(echo "super $RENAMED_CELLS"); do
>   PODIFIED_DB_ROOT_PASSWORD[$CELL]=$(oc get -o json secret/osp-secret | jq -r .data.DbRootPassword | base64 -d)
> done

$ declare -A PODIFIED_MARIADB_IP
$ for CELL in $(echo "super $RENAMED_CELLS"); do
>   if [ "$CELL" = "super" ]; then
>     PODIFIED_MARIADB_IP[$CELL]=$(oc get svc --selector "mariadb/name=openstack" -ojsonpath='{.items[0].spec.clusterIP}')
>   else
>     PODIFIED_MARIADB_IP[$CELL]=$(oc get svc --selector "mariadb/name=openstack-$CELL" -ojsonpath='{.items[0].spec.clusterIP}')
>   fi
> done

$ declare -A TRIPLEO_PASSWORDS
$ for CELL in $(echo $CELLS); do
>   if [ "$CELL" = "default" ]; then
>     TRIPLEO_PASSWORDS[default]="$HOME/overcloud-passwords.yaml"
>   else
>     # in a split-stack source cloud, it should take a stack-specific passwords file instead
>     TRIPLEO_PASSWORDS[$CELL]="$HOME/overcloud-passwords.yaml"
>   fi
> done

$ declare -A SOURCE_DB_ROOT_PASSWORD
$ for CELL in $(echo $CELLS); do
>   SOURCE_DB_ROOT_PASSWORD[$CELL]=$(cat ${TRIPLEO_PASSWORDS[$CELL]} | grep ' MysqlRootPassword:' | awk -F ': ' '{ print $2; }')
> done

$ declare -A SOURCE_MARIADB_IP
$ SOURCE_MARIADB_IP[default]=*<galera cluster VIP>*
$ SOURCE_MARIADB_IP[cell1]=*<galera cell1 cluster VIP>*
$ SOURCE_MARIADB_IP[cell2]=*<galera cell2 cluster VIP>*
# ...

$ declare -A SOURCE_GALERA_MEMBERS_DEFAULT
$ SOURCE_GALERA_MEMBERS_DEFAULT=(
>   ["standalone.localdomain"]=172.17.0.100
>   # [...]=...
> )
$ declare -A SOURCE_GALERA_MEMBERS_CELL1
$ SOURCE_GALERA_MEMBERS_CELL1=(
>   # ...
> )
$ declare -A SOURCE_GALERA_MEMBERS_CELL2
$ SOURCE_GALERA_MEMBERS_CELL2=(
>   # ...
> )

CELLS and RENAMED_CELLS represent changes that are going to be made after you import the databases. The default cell takes a new name from DEFAULT_CELL_NAME. In a multi-cell adoption scenario, default cell might retain its original default name as well.
CHARACTER_SET and COLLATION should match the source database. If they do not match, then foreign key relationships break for any tables that are created in the future as part of the database sync.
SOURCE_MARIADB_IP[X]= ... includes the data for each cell that is defined in CELLS. Provide records for the cell names and VIP addresses of MariaDB Galera clusters.
<galera_cell1_cluster_VIP> defines the VIP of your galera cell1 cluster.
<galera_cell2_cluster_VIP> defines the VIP of your galera cell2 cluster, and so on.
SOURCE_GALERA_MEMBERS_CELL<X>, defines the names of the MariaDB Galera cluster members and their IP address for each cell defined in CELLS. Replace ["standalone.localdomain"]="172.17.0.100" with the real hosts data.

A standalone TripleO environment only creates a default cell, which should be the only CELLS value in this case. The DEFAULT_CELL_NAME value should be cell1.

The super is the top-scope Nova API upcall database instance. A super conductor connects to that database. In subsequent examples, the upcall and cells databases use the same password that is defined in osp-secret. Old passwords are only needed to prepare the data exports.

To get the values for SOURCE_MARIADB_IP, query the puppet-generated configurations in the Controller and CellController nodes:
```
$ sudo grep -rI 'listen mysql' -A10 /var/lib/config-data/puppet-generated/ | grep bind
```
To get the values for SOURCE_GALERA_MEMBERS_*, query the puppet-generated configurations in the Controller and CellController nodes:
```
$ sudo grep -rI 'listen mysql' -A10 /var/lib/config-data/puppet-generated/ | grep server
```
The source cloud always uses the same password for cells databases. For that reason, the same passwords file is used for all cells stacks. However, split-stack topology allows using different passwords files for each stack.

Prepare the MariaDB adoption helper pod:

Create a temporary volume claim and a pod for the database data copy. Edit the volume claim storage request if necessary, to give it enough space for the overcloud databases:

$ oc apply -f - <<EOF
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mariadb-data
spec:
  storageClassName: $STORAGE_CLASS
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: mariadb-copy-data
  annotations:
    openshift.io/scc: anyuid
    k8s.v1.cni.cncf.io/networks: internalapi
  labels:
    app: adoption
spec:
  containers:
  - image: $MARIADB_IMAGE
    command: [ "sh", "-c", "sleep infinity"]
    name: adoption
    volumeMounts:
    - mountPath: /backup
      name: mariadb-data
  securityContext:
    allowPrivilegeEscalation: false
    capabilities:
      drop: ALL
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - name: mariadb-data
    persistentVolumeClaim:
      claimName: mariadb-data
EOF

Wait for the pod to be ready:

$ oc wait --for condition=Ready pod/mariadb-copy-data --timeout=30s

Procedure

Check that the source Galera database clusters in each cell have its members online and synced:

for CELL in $(echo $CELLS); do
  MEMBERS=SOURCE_GALERA_MEMBERS_$(echo ${CELL}|tr '[:lower:]' '[:upper:]')[@]
  for i in "${!MEMBERS}"; do
    echo "Checking for the database node $i WSREP status Synced"
    oc rsh mariadb-copy-data mysql \
      -h "$i" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}" \
      -e "show global status like 'wsrep_local_state_comment'" | \
      grep -qE "\bSynced\b"
  done
done

Each additional Compute service (nova) v2 cell runs a dedicated Galera database cluster, so the command checks each cell.

Get the count of source databases with the NOK (not-OK) status:

for CELL in $(echo $CELLS); do
  oc rsh mariadb-copy-data mysql -h "${SOURCE_MARIADB_IP[$CELL]}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}" -e "SHOW databases;"
done

Check that mysqlcheck had no errors:

for CELL in $(echo $CELLS); do
  set +u
  . ~/.source_cloud_exported_variables_$CELL
  set -u
  test -z "${PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK[$CELL]}" || [ "${PULL_OPENSTACK_CONFIGURATION_MYSQLCHECK_NOK[$CELL]}" = " " ] && echo "OK" || echo "CHECK FAILED"
done

Test the connection to the control plane upcall and cells databases:

for CELL in $(echo "super $RENAMED_CELLS"); do
  oc rsh mariadb-copy-data mysql -rsh "${PODIFIED_MARIADB_IP[$CELL]}" -uroot -p"${PODIFIED_DB_ROOT_PASSWORD[$CELL]}" -e 'SHOW databases;'
done

You must transition Compute services that you import later into a superconductor architecture by deleting the old service records in the cell databases, starting with cell1. New records are registered with different hostnames that are provided by the Compute service operator. All Compute services, except the Compute agent, have no internal state, and you can safely delete their service records. You also need to rename the former default cell to DEFAULT_CELL_NAME.

Create a dump of the original databases:

for CELL in $(echo $CELLS); do
  oc rsh mariadb-copy-data << EOF
    mysql -h"${SOURCE_MARIADB_IP[$CELL]}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}" \
    -N -e "show databases" | grep -E -v "schema|mysql|gnocchi|aodh" | \
    while read dbname; do
      echo "Dumping $CELL cell \${dbname}";
      mysqldump -h"${SOURCE_MARIADB_IP[$CELL]}" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}" \
        --single-transaction --complete-insert --skip-lock-tables --lock-tables=0 \
        "\${dbname}" > /backup/"${CELL}.\${dbname}".sql;
    done
EOF
done

Restore the databases from .sql files into the control plane MariaDB:

for CELL in $(echo $CELLS); do
  RCELL=$CELL
  [ "$CELL" = "default" ] && RCELL=$DEFAULT_CELL_NAME
  oc rsh mariadb-copy-data << EOF
    declare -A db_name_map
    db_name_map['nova']="nova_$RCELL"
    db_name_map['ovs_neutron']='neutron'
    db_name_map['ironic-inspector']='ironic_inspector'
    declare -A db_cell_map
    db_cell_map['nova']="nova_$DEFAULT_CELL_NAME"
    db_cell_map["nova_$RCELL"]="nova_$RCELL"
    declare -A db_server_map
    db_server_map['default']=${PODIFIED_MARIADB_IP['super']}
    db_server_map["nova"]=${PODIFIED_MARIADB_IP[$DEFAULT_CELL_NAME]}
    db_server_map["nova_$RCELL"]=${PODIFIED_MARIADB_IP[$RCELL]}
    declare -A db_server_password_map
    db_server_password_map['default']=${PODIFIED_DB_ROOT_PASSWORD['super']}
    db_server_password_map["nova"]=${PODIFIED_DB_ROOT_PASSWORD[$DEFAULT_CELL_NAME]}
    db_server_password_map["nova_$RCELL"]=${PODIFIED_DB_ROOT_PASSWORD[$RCELL]}
    cd /backup
    for db_file in \$(ls ${CELL}.*.sql); do
      db_name=\$(echo \${db_file} | awk -F'.' '{ print \$2; }')
      [[ "$CELL" != "default" && ! -v "db_cell_map[\${db_name}]" ]] && continue
      if [[ "$CELL" == "default" && -v "db_cell_map[\${db_name}]" ]] ; then
        target=$DEFAULT_CELL_NAME
      elif [[ "$CELL" == "default" && ! -v "db_cell_map[\${db_name}]" ]] ; then
        target=super
      else
        target=$RCELL
      fi
      renamed_db_file="\${target}_new.\${db_name}.sql"
      mv -f \${db_file} \${renamed_db_file}
      if [[ -v "db_name_map[\${db_name}]" ]]; then
        echo "renaming $CELL cell \${db_name} to \$target \${db_name_map[\${db_name}]}"
        db_name=\${db_name_map[\${db_name}]}
      fi
      db_server=\${db_server_map["default"]}
      if [[ -v "db_server_map[\${db_name}]" ]]; then
        db_server=\${db_server_map[\${db_name}]}
      fi
      db_password=\${db_server_password_map['default']}
      if [[ -v "db_server_password_map[\${db_name}]" ]]; then
        db_password=\${db_server_password_map[\${db_name}]}
      fi
      echo "creating $CELL cell \${db_name} in \$target \${db_server}"
      mysql -h"\${db_server}" -uroot "-p\${db_password}" -e \
        "CREATE DATABASE IF NOT EXISTS \${db_name} DEFAULT \
        CHARACTER SET ${CHARACTER_SET} DEFAULT COLLATE ${COLLATION};"
      echo "importing $CELL cell \${db_name} into \$target \${db_server} from \${renamed_db_file}"
      mysql -h "\${db_server}" -uroot "-p\${db_password}" "\${db_name}" < "\${renamed_db_file}"
    done
    if [ "$CELL" = "default" ] ; then
      mysql -h "\${db_server_map['default']}" -uroot -p"\${db_server_password_map['default']}" -e \
        "update nova_api.cell_mappings set name='$DEFAULT_CELL_NAME' where name='default';"
    fi
    mysql -h "\${db_server_map["nova_$RCELL"]}" -uroot -p"\${db_server_password_map["nova_$RCELL"]}" -e \
      "delete from nova_${RCELL}.services where host not like '%nova_${RCELL}-%' and services.binary != 'nova-compute';"
EOF
done

db_name_map defines which common databases to rename when importing them.
db_cell_map defines which cells databases to import, and how to rename them, if needed.
db_cell_map["nova_$RCELL"]="nova_$RCELL" omits importing special cell0 databases of the cells, as its contents cannot be consolidated during adoption.
db_server_map defines which databases to import into which servers, usually dedicated for cells.
db_server_password_map defines the root passwords map for database servers. You can only use the same password for now.
renamed_db_file="\${target}_new.\${db_name}.sql" assigns which databases to import into which hosts when extracting databases from the default cell.

Verification

Compare the following outputs with the topology-specific service configuration. For more information, see Retrieving topology-specific service configuration.

Check that the databases are imported correctly:

$ set +u
$ . ~/.source_cloud_exported_variables_default
$ set -u
$ dbs=$(oc exec openstack-galera-0 -c galera -- mysql -rs -uroot -p"${PODIFIED_DB_ROOT_PASSWORD['super']}" -e 'SHOW databases;')
$ echo $dbs | grep -Eq '\bkeystone\b' && echo "OK" || echo "CHECK FAILED"
$ echo $dbs | grep -Eq '\bneutron\b' && echo "OK" || echo "CHECK FAILED"
$ echo "${PULL_OPENSTACK_CONFIGURATION_DATABASES[@]}" | grep -Eq '\bovs_neutron\b' && echo "OK" || echo "CHECK FAILED"
$ novadb_mapped_cells=$(oc exec openstack-galera-0 -c galera -- mysql -rs -uroot -p"${PODIFIED_DB_ROOT_PASSWORD['super']}" \
>   nova_api -e 'select uuid,name,transport_url,database_connection,disabled from cell_mappings;')
> uuidf='\S{8,}-\S{4,}-\S{4,}-\S{4,}-\S{12,}'
> default=$(printf "%s\n" "$PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS" | sed -rn "s/^($uuidf)\s+default\b.*$/\1/p")
> difference=$(diff -ZNua \
>   <(printf "%s\n" "$PULL_OPENSTACK_CONFIGURATION_NOVADB_MAPPED_CELLS") \
>   <(printf "%s\n" "$novadb_mapped_cells")) || true
> if [ "$DEFAULT_CELL_NAME" != "default" ]; then
>   printf "%s\n" "$difference" | grep -qE "^\-$default\s+default\b" && echo "OK" || echo "CHECK FAILED"
>   printf "%s\n" "$difference" | grep -qE "^\+$default\s+$DEFAULT_CELL_NAME\b" && echo "OK" || echo "CHECK FAILED"
>   [ $(grep -E "^[-\+]$uuidf" <<<"$difference" | wc -l) -eq 2 ] && echo "OK" || echo "CHECK FAILED"
> else
>   [ "x$difference" = "x" ] && echo "OK" || echo "CHECK FAILED"
> fi
> for CELL in $(echo $RENAMED_CELLS); do
>   RCELL=$CELL
>   [ "$CELL" = "$DEFAULT_CELL_NAME" ] && RCELL=default
>   set +u
>   . ~/.source_cloud_exported_variables_$RCELL
>   set -u
>   c1dbs=$(oc exec openstack-$CELL-galera-0 -c galera -- mysql -rs -uroot -p${PODIFIED_DB_ROOT_PASSWORD[$CELL]} -e 'SHOW databases;')
>   echo $c1dbs | grep -Eq "\bnova_${CELL}\b" && echo "OK" || echo "CHECK FAILED"
>   novadb_svc_records=$(oc exec openstack-$CELL-galera-0 -c galera -- mysql -rs -uroot -p${PODIFIED_DB_ROOT_PASSWORD[$CELL]} \
>     nova_$CELL -e "select host from services where services.binary='nova-compute' and deleted=0 order by host asc;")
>   diff -Z <(echo "x$novadb_svc_records") <(echo "x${PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES[@]}") && echo "OK" || echo "CHECK FAILED"
> done

echo "${PULL_OPENSTACK_CONFIGURATION_DATABASES[@]}" | grep -Eq '\bovs_neutron\b' && echo "OK" || echo "CHECK FAILED" ensures that the Networking service (neutron) database is renamed from ovs_neutron.
nova_api -e 'select uuid,name,transport_url,database_connection,disabled from cell_mappings;') ensures that the default cell is renamed to $DEFAULT_CELL_NAME, and the cell UUIDs are retained.
for CELL in $(echo $RENAMED_CELLS); do ensures that the registered Compute services names have not changed.
c1dbs=$(oc exec openstack-$CELL-galera-0 -c galera -- mysql -rs -uroot -p${PODIFIED_DB_ROOT_PASSWORD[$CELL]} -e 'SHOW databases;') ensures Compute service cells databases are extracted to separate database servers, and renamed from nova to nova_cell<X>.
diff -Z <(echo "x$novadb_svc_records") <(echo "x${PULL_OPENSTACK_CONFIGURATION_NOVA_COMPUTE_HOSTNAMES[@]}") && echo "OK" || echo "CHECK FAILED" ensures that the registered Compute service name has not changed.

Delete the mariadb-data pod and the mariadb-copy-data persistent volume claim that contains the database backup:

Consider taking a snapshot of them before deleting.
```
$ oc delete pod mariadb-copy-data
$ oc delete pvc mariadb-data
```

During the pre-checks and post-checks, the mariadb-client pod might return a pod security warning related to the restricted:latest security context constraint. This warning is due to default security context constraints and does not prevent the admission controller from creating a pod. You see a warning for the short-lived pod, but it does not interfere with functionality. For more information, see About pod security standards and warnings.

3.6. Migrating OVN data

Migrate the data in the OVN databases from the original OpenStack deployment to ovsdb-server instances that are running in the OpenShift cluster.

Prerequisites

The OpenStackControlPlane resource is created.
NetworkAttachmentDefinition custom resources (CRs) for the original cluster are defined. Specifically, the internalapi network is defined.
The original Networking service (neutron) and OVN northd are not running.
There is network routability between the control plane services and the adopted cluster.
The cloud is migrated to the Modular Layer 2 plug-in with Open Virtual Networking (ML2/OVN) mechanism driver.

Define the following shell variables. Replace the example values with values that are correct for your environment:

STORAGE_CLASS_NAME=local-storage
OVSDB_IMAGE=quay.io/podified-antelope-centos9/openstack-ovn-base:current-podified
SOURCE_OVSDB_IP=172.17.0.100 # For IPv4
SOURCE_OVSDB_IP=[fd00:bbbb::100] # For IPv6

To get the value to set SOURCE_OVSDB_IP, query the puppet-generated configurations in a Controller node:

$ grep -rI 'ovn_[ns]b_conn' /var/lib/config-data/puppet-generated/

Procedure

Prepare a temporary PersistentVolume claim and the helper pod for the OVN backup. Adjust the storage requests for a large database, if needed:

$ oc apply -f - <<EOF
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: ovn-data-cert
spec:
  commonName: ovn-data-cert
  secretName: ovn-data-cert
  issuerRef:
    name: rootca-internal
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ovn-data
spec:
  storageClassName: $STORAGE_CLASS
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: ovn-copy-data
  annotations:
    openshift.io/scc: anyuid
    k8s.v1.cni.cncf.io/networks: internalapi
  labels:
    app: adoption
spec:
  containers:
  - image: $OVSDB_IMAGE
    command: [ "sh", "-c", "sleep infinity"]
    name: adoption
    volumeMounts:
    - mountPath: /backup
      name: ovn-data
    - mountPath: /etc/pki/tls/misc
      name: ovn-data-cert
      readOnly: true
  securityContext:
    allowPrivilegeEscalation: false
    capabilities:
      drop: ALL
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - name: ovn-data
    persistentVolumeClaim:
      claimName: ovn-data
  - name: ovn-data-cert
    secret:
      secretName: ovn-data-cert
EOF

Wait for the pod to be ready:

$ oc wait --for=condition=Ready pod/ovn-copy-data --timeout=30s

If the podified internalapi cidr is different than the source internalapi cidr, add an iptables accept rule on the Controller nodes:

$ $CONTROLLER1_SSH sudo iptables -I INPUT -s {PODIFIED_INTERNALAPI_NETWORK} -p tcp -m tcp --dport 6641 -m conntrack --ctstate NEW -j ACCEPT
$ $CONTROLLER1_SSH sudo iptables -I INPUT -s {PODIFIED_INTERNALAPI_NETWORK} -p tcp -m tcp --dport 6642 -m conntrack --ctstate NEW -j ACCEPT

Back up your OVN databases:

If you did not enable TLS everywhere, run the following command:

$ oc exec ovn-copy-data -- bash -c "ovsdb-client backup tcp:$SOURCE_OVSDB_IP:6641 > /backup/ovs-nb.db"
$ oc exec ovn-copy-data -- bash -c "ovsdb-client backup tcp:$SOURCE_OVSDB_IP:6642 > /backup/ovs-sb.db"

If you enabled TLS everywhere, run the following command:

$ oc exec ovn-copy-data -- bash -c "ovsdb-client backup --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$SOURCE_OVSDB_IP:6641 > /backup/ovs-nb.db"
$ oc exec ovn-copy-data -- bash -c "ovsdb-client backup --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$SOURCE_OVSDB_IP:6642 > /backup/ovs-sb.db"

Start the control plane OVN database services prior to import, with northd disabled:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  ovn:
    enabled: true
    template:
      ovnDBCluster:
        ovndbcluster-nb:
          replicas: 3
          dbType: NB
          storageRequest: 10G
          networkAttachment: internalapi
        ovndbcluster-sb:
          replicas: 3
          dbType: SB
          storageRequest: 10G
          networkAttachment: internalapi
      ovnNorthd:
        replicas: 0
'

Wait for the OVN database services to reach the Running phase:

$ oc wait --for=jsonpath='{.status.phase}'=Running pod --selector=service=ovsdbserver-nb
$ oc wait --for=jsonpath='{.status.phase}'=Running pod --selector=service=ovsdbserver-sb

Fetch the OVN database IP addresses on the clusterIP service network:

PODIFIED_OVSDB_NB_IP=$(oc get svc --selector "statefulset.kubernetes.io/pod-name=ovsdbserver-nb-0" -ojsonpath='{.items[0].spec.clusterIP}')
PODIFIED_OVSDB_SB_IP=$(oc get svc --selector "statefulset.kubernetes.io/pod-name=ovsdbserver-sb-0" -ojsonpath='{.items[0].spec.clusterIP}')

If you are using IPv6, adjust the address to the format expected by ovsdb-* tools:

PODIFIED_OVSDB_NB_IP=[$PODIFIED_OVSDB_NB_IP]
PODIFIED_OVSDB_SB_IP=[$PODIFIED_OVSDB_SB_IP]

Upgrade the database schema for the backup files:

If you did not enable TLS everywhere, use the following command:

$ oc exec ovn-copy-data -- bash -c "ovsdb-client get-schema tcp:$PODIFIED_OVSDB_NB_IP:6641 > /backup/ovs-nb.ovsschema && ovsdb-tool convert /backup/ovs-nb.db /backup/ovs-nb.ovsschema"
$ oc exec ovn-copy-data -- bash -c "ovsdb-client get-schema tcp:$PODIFIED_OVSDB_SB_IP:6642 > /backup/ovs-sb.ovsschema && ovsdb-tool convert /backup/ovs-sb.db /backup/ovs-sb.ovsschema"

If you enabled TLS everywhere, use the following command:

$ oc exec ovn-copy-data -- bash -c "ovsdb-client get-schema --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$PODIFIED_OVSDB_NB_IP:6641 > /backup/ovs-nb.ovsschema && ovsdb-tool convert /backup/ovs-nb.db /backup/ovs-nb.ovsschema"
$ oc exec ovn-copy-data -- bash -c "ovsdb-client get-schema --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$PODIFIED_OVSDB_SB_IP:6642 > /backup/ovs-sb.ovsschema && ovsdb-tool convert /backup/ovs-sb.db /backup/ovs-sb.ovsschema"

Restore the database backup to the new OVN database servers:

If you did not enable TLS everywhere, use the following command:

$ oc exec ovn-copy-data -- bash -c "ovsdb-client restore tcp:$PODIFIED_OVSDB_NB_IP:6641 < /backup/ovs-nb.db"
$ oc exec ovn-copy-data -- bash -c "ovsdb-client restore tcp:$PODIFIED_OVSDB_SB_IP:6642 < /backup/ovs-sb.db"

If you enabled TLS everywhere, use the following command:

$ oc exec ovn-copy-data -- bash -c "ovsdb-client restore --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$PODIFIED_OVSDB_NB_IP:6641 < /backup/ovs-nb.db"
$ oc exec ovn-copy-data -- bash -c "ovsdb-client restore --ca-cert=/etc/pki/tls/misc/ca.crt --private-key=/etc/pki/tls/misc/tls.key --certificate=/etc/pki/tls/misc/tls.crt ssl:$PODIFIED_OVSDB_SB_IP:6642 < /backup/ovs-sb.db"

Check that the data was successfully migrated by running the following commands against the new database servers, for example:
```
$ oc exec -it ovsdbserver-nb-0 -- ovn-nbctl show
$ oc exec -it ovsdbserver-sb-0 -- ovn-sbctl list Chassis
```

Start the control plane ovn-northd service to keep both OVN databases in sync:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  ovn:
    enabled: true
    template:
      ovnNorthd:
        replicas: 1
'

If you are running OVN gateway services on OCP nodes, enable the control plane ovn-controller service:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  ovn:
    enabled: true
    template:
      ovnController:
        nicMappings:
          physNet: NIC

physNet defines the name of your physical network. NIC is the name of the physical interface that is connected to your physical network.

Running OVN gateways on OCP nodes might be prone to data plane downtime during Open vSwitch upgrades. Consider running OVN gateways on dedicated Networker data plane nodes for production deployments instead.

Delete the ovn-data helper pod and the temporary PersistentVolumeClaim that is used to store OVN database backup files:

$ oc delete --ignore-not-found=true pod ovn-copy-data
$ oc delete --ignore-not-found=true pvc ovn-data

Consider taking a snapshot of the ovn-data helper pod and the temporary PersistentVolumeClaim before deleting them. For more information, see About volume snapshots in OpenShift Container Platform storage overview.

Stop the adopted OVN database servers:

ServicesToStop=("tripleo_ovn_cluster_north_db_server.service"
                "tripleo_ovn_cluster_south_db_server.service")

echo "Stopping systemd OpenStack services"
for service in ${ServicesToStop[*]}; do
    for i in {1..3}; do
        SSH_CMD=CONTROLLER${i}_SSH
        if [ ! -z "${!SSH_CMD}" ]; then
            echo "Stopping the $service in controller $i"
            if ${!SSH_CMD} sudo systemctl is-active $service; then
                ${!SSH_CMD} sudo systemctl stop $service
            fi
        fi
    done
done

echo "Checking systemd OpenStack services"
for service in ${ServicesToStop[*]}; do
    for i in {1..3}; do
        SSH_CMD=CONTROLLER${i}_SSH
        if [ ! -z "${!SSH_CMD}" ]; then
            if ! ${!SSH_CMD} systemctl show $service | grep ActiveState=inactive >/dev/null; then
                echo "ERROR: Service $service still running on controller $i"
            else
                echo "OK: Service $service is not running on controller $i"
            fi
        fi
    done
done

4. Adopting OpenStack control plane services

Adopt your OpenStack control plane services to deploy them in the Red Hat OpenStack Services on OpenShift (RHOSO) Antelope control plane.

4.1. Adopting the Identity service

To adopt the Identity service (keystone), you patch an existing OpenStackControlPlane custom resource (CR) where the Identity service is disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack (OSP) environment.

Prerequisites

Create the keystone secret that includes the Fernet keys that were copied from the OSP environment:

$ oc apply -f - <<EOF
apiVersion: v1
data:
  CredentialKeys0: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/keystone/etc/keystone/credential-keys/0 | base64 -w 0)
  CredentialKeys1: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/keystone/etc/keystone/credential-keys/1 | base64 -w 0)
  FernetKeys0: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/keystone/etc/keystone/fernet-keys/0 | base64 -w 0)
  FernetKeys1: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/keystone/etc/keystone/fernet-keys/1 | base64 -w 0)
kind: Secret
metadata:
  name: keystone
type: Opaque
EOF

Procedure

Patch the OpenStackControlPlane CR to deploy the Identity service:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  keystone:
    enabled: true
    apiOverride:
      route: {}
    template:
      override:
        service:
          internal:
            metadata:
              annotations:
                metallb.universe.tf/address-pool: internalapi
                metallb.universe.tf/allow-shared-ip: internalapi
                metallb.universe.tf/loadBalancerIPs: <172.17.0.80>
            spec:
              type: LoadBalancer
      databaseInstance: openstack
      secret: osp-secret
'

where:

<172.17.0.80>: Specifies the load balancer IP in your environment. If you use IPv6, change the load balancer IP to the load balancer IP in your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.

Create an alias to use the openstack command in the Red Hat OpenStack Services on OpenShift (RHOSO) deployment:
```
$ alias openstack="oc exec -t openstackclient -- openstack"
```

Remove services and endpoints that still point to the OSP control plane, excluding the Identity service and its endpoints:

$ openstack endpoint list | grep keystone | awk '/admin/{ print $2; }' | xargs ${BASH_ALIASES[openstack]} endpoint delete || true
> for service in aodh heat heat-cfn barbican cinderv3 glance designate gnocchi manila manilav2 neutron nova placement swift ironic-inspector ironic octavia; do
>  openstack service list | awk "/ $service /{ print \$2; }" | xargs -r ${BASH_ALIASES[openstack]} service delete || true
> done

Verification

Verify that you can access the OpenStackClient pod. For more information, see Accessing the OpenStackClient pod in Maintaining the Red Hat OpenStack Services on OpenShift deployment.
Confirm that the Identity service endpoints are defined and are pointing to the control plane FQDNs:
```
$ openstack endpoint list | grep keystone
```

Wait for the OpenStackControlPlane resource to become Ready:

$ oc wait --for=condition=Ready --timeout=1m OpenStackControlPlane openstack

4.2. Configuring LDAP with domain-specific drivers

If you need to integrate the Identity service (keystone) with one or more LDAP servers using domain-specific configurations, you can enable domain-specific drivers and provide the necessary LDAP settings.

This involves two main steps:

Create the secret that holds the domain-specific LDAP configuration files that the Identity service uses. Each file within the secret corresponds to an LDAP domain.
Patch the OpenStackControlPlane custom resource (CR) to enable domain-specific drivers for the Identity service and mount a secret that contains the LDAP configurations.

Procedure

To create the keystone-domains secret that stores the actual LDAP configuration files that Identity service uses, create a local file that includes your LDAP configuration, for example, keystone.myldapdomain.conf:

The following example file includes the configuration for a single LDAP domain. If you have multiple LDAP domains, create a configuration file for each, for example, keystone.DOMAIN_ONE.conf, keystone.DOMAIN_TWO.conf.

[identity]
driver = ldap
[ldap]
url = ldap://<ldap_server_host>:<ldap_server_port>
user = <bind_dn_user>
password = <bind_dn_password>
suffix = <user_tree_dn>
query_scope = sub
# User configuration
user_tree_dn = <user_tree_dn>
user_objectclass = <user_object_class>
user_id_attribute = <user_id_attribute>
user_name_attribute = <user_name_attribute>
user_mail_attribute = <user_mail_attribute>
user_enabled_attribute = <user_enabled_attribute>
user_enabled_default = true
# Group configuration
group_tree_dn = <group_tree_dn>
group_objectclass = <group_object_class>
group_id_attribute = <group_id_attribute>
group_name_attribute = <group_name_attribute>
group_member_attribute = <group_member_attribute>
group_members_are_ids = true

Replace the values, such as <ldap_server_host>, <bind_dn_user>, <user_tree_dn>, and so on, with your LDAP server details.

Create the secret from this file:

$ oc create secret generic keystone-domains --from-file=<keystone.DOMAIN_NAME.conf>

Replace <keystone.DOMAIN_NAME.conf> with the name of your local configuration file. If applicable, include additional configuration files by using the --from-file option. After creating the secret, you can remove the local configuration file if it is no longer needed, or store it securely.

The name of the file that you provide to --from-file, for example keystone.DOMAIN_NAME.conf, is critical. The Identity service uses this filename to map incoming authentication requests for a domain to the correct LDAP configuration. Ensure that DOMAIN_NAME matches the name of the domain you are configuring in the Identity service.

Patch the OpenStackControlPlane CR:

$ oc patch openstackcontrolplane <cr_name> --type=merge -p '
spec:
  keystone:
    template:
      customServiceConfig: |
          [identity]
          domain_specific_drivers_enabled = true
      extraMounts:
        - name: v1
          region: r1
          extraVol:
            - propagation:
              - Keystone
              extraVolType: Conf
              volumes:
              - name: keystone-domains
                secret:
                  secretName: keystone-domains
              mounts:
              - name: keystone-domains
                mountPath: "/etc/keystone/domains"
                readOnly: true

Replace <cr_name> with the name of your OpenStackControlPlane CR (for example, openstack).
This patch does the following:
- Sets spec.keystone.template.customServiceConfig. Ensure that you do not overwrite any previously defined value.
- Defines spec.keystone.template.extraMounts to mount a secret named keystone-domains into the Identity service pods at /etc/keystone/domains. This secret contains your LDAP configuration files.
  
  You might need to wait a few minutes for the changes to propagate and for the Identity service pods to be updated.

Verification

Verify that users from the LDAP domain are accessible:
```
$ oc exec -t openstackclient -- openstack user list --domain <domain_name>
```
- Replace <domain_name> with your LDAP domain name.
  
  This command returns a list of users from your LDAP server.
Verify that groups from the LDAP domain are accessible:
```
$ oc exec -t openstackclient -- openstack group list --domain <domain_name>
```
This command returns a list of groups from your LDAP server.
Test authentication with an LDAP user:
```
$ oc exec -t openstackclient -- openstack --os-auth-url <keystone_auth_url> --os-identity-api-version 3 --os-user-domain-name <domain_name> --os-username <ldap_username> --os-password <ldap_password> token issue
```
- Replace <keystone_auth_url> with the Identity service authentication URL.
- Replace <ldap_username> and <ldap_password> with valid LDAP user credentials.
  
  If successful, this command returns a token, confirming that LDAP authentication is working correctly.
Verify group membership for an LDAP user:
```
$ oc exec -t openstackclient -- openstack group contains user --group-domain <domain_name> --user-domain <domain_name> <group_name> <username>
```
- Replace <domain_name>, <group_name>, and <username> with the appropriate values from your LDAP server.
  
  This command verifies that the user is properly associated with the group through LDAP.

4.3. Adopting the Key Manager service

To adopt the Key Manager service (barbican), you patch an existing OpenStackControlPlane custom resource (CR) where Key Manager service is disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack (OSP) environment. You configure the Key Manager service to use the simple_crypto back end.

The Key Manager service adoption is complete if you see the following results:

The BarbicanAPI, BarbicanWorker, and BarbicanKeystoneListener services are up and running.
Keystone endpoints are updated, and the same crypto plugin of the source cloud is available.

To configure hardware security module (HSM) integration with Proteccio HSM, see Adopting the Key Manager service with Proteccio HSM integration.

Procedure

Add the kek secret:

$ oc set data secret/osp-secret "BarbicanSimpleCryptoKEK=$($CONTROLLER1_SSH "python3 -c \"import configparser; c = configparser.ConfigParser(); c.read('/var/lib/config-data/puppet-generated/barbican/etc/barbican/barbican.conf'); print(c['simple_crypto_plugin']['kek'])\"")"

Patch the OpenStackControlPlane CR to deploy the Key Manager service:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  barbican:
    enabled: true
    apiOverride:
      route: {}
    template:
      databaseInstance: openstack
      databaseAccount: barbican
      messagingBus:
        cluster: rabbitmq
      secret: osp-secret
      simpleCryptoBackendSecret: osp-secret
      serviceAccount: barbican
      serviceUser: barbican
      passwordSelectors:
        service: BarbicanPassword
        simplecryptokek: BarbicanSimpleCryptoKEK
      barbicanAPI:
        replicas: 1
        override:
          service:
            internal:
              metadata:
                annotations:
                  metallb.universe.tf/address-pool: internalapi
                  metallb.universe.tf/allow-shared-ip: internalapi
                  metallb.universe.tf/loadBalancerIPs: <172.17.0.80>
              spec:
                type: LoadBalancer
      barbicanWorker:
        replicas: 1
      barbicanKeystoneListener:
        replicas: 1
'

where:

<172.17.0.80>: Specifies the load balancer IP in your environment. If you use IPv6, change the load balancer IP to the load balancer IP in your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.

Verification

Ensure that the Identity service (keystone) endpoints are defined and are pointing to the control plane FQDNs:
```
$ openstack endpoint list | grep key-manager
```

Ensure that Barbican API service is registered in the Identity service:

$ openstack service list | grep key-manager

$ openstack endpoint list | grep key-manager

List the secrets:
```
$ openstack secret list
```

4.4. Adopting the Key Manager service with HSM integration

Adopt the Key Manager service (barbican) from TripleO to Red Hat OpenStack Services on OpenShift (RHOSO) when your source environment includes hardware security module (HSM) integration to preserve HSM functionality and maintain access to HSM-backed secrets. HSM provides enhanced security for cryptographic operations by storing encryption keys in dedicated hardware devices.

For additional information about the Key Manager service before you start the adoption, see the following resources:

Key Manager service service configuration documentation
Hardware security module vendor-specific documentation
OpenStack Barbican PKCS#11 plugin documentation

4.4.1. Key Manager service HSM adoption approaches

The Key Manager service (barbican) adoption approach depends on your source TripleO environment configuration.

Use the standard adoption approach if your environment includes only the simple_crypto plugin for secret storage and has no HSM integration.
Use the HSM-enabled adoption approach if your source environment has HSM integration that uses Public Key Cryptography Standard (PKCS) #11, Key Management Interoperability Protocol (KMIP), or other HSM back ends alongside simple_crypto.

Standard adoption approach
Uses the existing Key Manager service adoption procedure
Migrates a simple crypto back-end configuration
Provides a single-step adoption process
Is suitable for development, testing, and standard production environments

HSM-enabled adoption approach
Uses the enhanced barbican_adoption role with HSM awareness
Configures HSM integration through a simple boolean flag (barbican_hsm_enabled: true)
Automatically creates required Kubernetes secrets (hsm-login and proteccio-data)
Preserves HSM metadata during database migration
Supports both simple crypto and HSM back ends in the target environment
Requires HSM-specific configuration variables and custom container images with HSM client libraries (built using the rhoso_proteccio_hsm Ansible role)
Uses HSM client certificates and configuration files accessible via URLs
Requires proper HSM partition and key configuration that matches your source environment
The HSM-enabled adoption approach currently supports:
- Proteccio (Eviden Trustway): Fully supported with PKCS#11 integration
- Luna (Thales): PKCS#11 support available
- nCipher (Entrust): PKCS#11 support available

HSM adoption requires additional configuration steps, including:

Custom Barbican container images with HSM client libraries that are built using the rhoso_proteccio_hsm Ansible role
HSM client certificates and configuration files that are accessible by using URLs
Proper HSM partition and key configuration that matches your source environment

These approaches are mutually exclusive. Choose an approach based on your source environment configuration.

Source environment characteristic Approach Rationale

Only simple_crypto back-end configured

Standard adoption

No HSM complexity needed

HSM integration present (PKCS#11, KMIP, and so on)

HSM-enabled adoption

Preserves HSM functionality and secrets

Development or testing environment

Standard adoption

Simpler setup and maintenance

Production with compliance requirements

HSM-enabled adoption

Maintains security compliance

Unknown back-end configuration

Check source environment first

Determine appropriate approach

4.4.2. Adopting the Key Manager service with Proteccio HSM integration

To adopt the Key Manager service (barbican) with Proteccio hardware security module (HSM) integration, you use the enhanced Barbican adoption role with HSM support enabled through a configuration flag. This approach preserves HSM integration while adopting all existing secrets from your source OpenStack (OSP) environment. When you run the data plane adoption tests with HSM support enabled, the adoption process performs the following actions:

Extracts the simple crypto KEK from the source configuration.
Creates the required HSM secrets (hsm-login and proteccio-data) in the target namespace.
Deploys Barbican with HSM-enabled configuration by using the PKCS#11 plugin.
Verifies the HSM functionality and secret migration. When you run the data plane adoption tests with HSM support enabled, the adoption process performs the following actions:
Extracts the simple crypto KEK from the source configuration.
Creates the required HSM secrets (hsm-login and proteccio-data) in the target namespace.
Deploys Barbican with HSM-enabled configuration by using the PKCS#11 plugin.
Verifies the HSM functionality and secret migration.

The Key Manager service Proteccio HSM adoption is complete if you see the following results:

The BarbicanAPI and BarbicanWorker services are up and running with HSM-enabled configuration.
All secrets from the source OSP environment are available in Red Hat OpenStack Services on OpenShift (RHOSO) Antelope.
The PKCS11 crypto plugin is available alongside simple_crypto for new secret storage.
HSM functionality is verified and operational.

If your environment does not include Proteccio HSM, to adopt the Key Manager service by using simple_crypto, see Adopting the Key Manager service.

The enhanced Key Manager service adoption role supports HSM configuration through a simple boolean flag. This approach integrates seamlessly with the standard data plane adoption framework while providing HSM support.

Prerequisites

You have a running TripleO environment with Proteccio HSM integration (the source cloud).
You have a Single Node OpenShift or OpenShift Local running in the OpenShift cluster.
You have SSH access to the source TripleO undercloud and Controller nodes.
You have configured HSM variables in your adoption configuration files.
Custom Key Manager service container images with the Proteccio client libraries are available in your registry.

The HSM adoption process requires proper configuration of HSM-related variables. The adoption role automatically creates the required Kubernetes secrets (hsm-login and proteccio-data) when barbican_hsm_enabled is set to true. Ensure that your environment includes the following:

All HSM-related variables are properly set in your configuration files
The Proteccio client ISO, certificates, and configuration files are accessible from the configured URLs
Custom Key Manager service images with Proteccio client are built and available in your container registry

Without proper HSM configuration, your HSM-protected secrets become inaccessible after adoption.

Procedure

Configure HSM integration variables in your adoption configuration (Zuul job vars or CI framework configuration):

# Enable HSM integration for the Barbican adoption role
barbican_hsm_enabled: true

# HSM login credentials
proteccio_login_password: "your_hsm_password"

# Kubernetes secret names (defaults shown)
proteccio_login_secret_name: "hsm-login"
proteccio_client_data_secret_name: "proteccio-data"

# HSM partition and key configuration
cifmw_hsm_proteccio_partition: "VHSM1"
cifmw_hsm_mkek_label: "adoption_mkek_1"
cifmw_hsm_hmac_label: "adoption_hmac_1"
cifmw_hsm_proteccio_library_path: "/usr/lib64/libnethsm.so"
cifmw_hsm_key_wrap_mechanism: "CKM_AES_CBC_PAD"

# HSM client sources (URLs to download Proteccio client files)
cifmw_hsm_proteccio_client_src: "<URL_of_Proteccio_ISO_file>"
cifmw_hsm_proteccio_conf_src: "<URL_of_proteccio.rc_config_file>"
cifmw_hsm_proteccio_client_crt_src: "<URL_of_client_certificate_file>"
cifmw_hsm_proteccio_client_key_src: "<URL_of_client_certificate_key>"
cifmw_hsm_proteccio_server_crt_src:
  - "<URL_of_HSM_certificate_file>"

where:

<URL_of_Proteccio_ISO_file>: Specifies the full URL (including "http://" or "https://") of the Proteccio client ISO image file.
<URL_of_proteccio.rc_config_file>: Specifies the full URL (including "http://" or "https://") of the proteccio.rc configuration in your RHOSO environment.
<URL_of_client_certificate_file>: Specifies the full URL (including "http://" or "https://") of the HSM client certificate file.
<URL_of_client_certificate_key>: Specifies the full URL (including "http://" or "https://") of the client key file.
<URL_of_HSM_certificate_file>: Specifies the full URL (including "http://" or "https://") of the HSM certificate file.

Run the data plane adoption tests with HSM support enabled:

Verification

Ensure that the Identity service (keystone) endpoints are defined and are pointing to the control plane FQDNs:
```
$ openstack endpoint list | grep key-manager
```
Ensure that the Barbican API service is registered in the Identity service:
```
$ openstack service list | grep key-manager
```
Verify that all secrets from the source environment are available:
```
$ openstack secret list
```

Confirm that Barbican services are running:

$ oc get pods -n openstack -l service=barbican -o wide

Test secret creation to verify HSM functionality:

$ openstack secret store --name adoption-verification --payload 'HSM adoption successful'

Verify that the HSM back end is operational:
```
$ openstack secret get <secret_id> --payload
```
where:

<secret_id>

Specifies the ID of the HSM secret.

4.4.3. Adopting the Key Manager service with HSM integration

When your source TripleO environment includes hardware security module (HSM) integration, you must use the HSM-enabled adoption approach to preserve HSM functionality and maintain access to HSM-backed secrets.

Prerequisites

The source TripleO environment with HSM integration is configured.
HSM client software and certificates are available from accessible URLs.
The target Red Hat OpenStack Services on OpenShift (RHOSO) environment with HSM infrastructure is accessible.
HSM-enabled Key Manager service (barbican) container images are built and available in your registry.

If you use the automated adoption process by setting barbican_hsm_enabled: true, the required HSM secrets (hsm-login and proteccio-data) are created automatically. You only need to manually create the secret when you perform the manual adoption steps.

Procedure

Confirm that your source environment configuration includes HSM integration:
```
$ ssh tripleo-admin@controller-0.ctlplane \
  "sudo cat /var/lib/config-data/puppet-generated/barbican/etc/barbican/barbican.conf | grep -A5 '\[.*plugin\]'"
```
If you see [p11_crypto_plugin] or other HSM-specific sections, continue with the HSM adoption.

Extract the simple crypto key encryption keys (KEK) from your source environment:

$ SIMPLE_CRYPTO_KEK=$(ssh tripleo-admin@controller-0.ctlplane \
  "sudo python3 -c \"import configparser; c = configparser.ConfigParser(); c.read('/var/lib/config-data/puppet-generated/barbican/etc/barbican/barbican.conf'); print(c['simple_crypto_plugin']['kek'])\"")

Add the KEK to the target environment:

$ oc set data secret/osp-secret "BarbicanSimpleCryptoKEK=${SIMPLE_CRYPTO_KEK}"

If you are not using the automated adoption, create HSM-specific secrets in the target environment:

# Create HSM login credentials secret
$ oc create secret generic hsm-login \
  --from-literal=PKCS11Pin=<your_hsm_password> \
  -n openstack

# Create HSM client configuration and certificates secret
$ oc create secret generic proteccio-data \
  --from-file=client.crt=<path_to_client_cert> \
  --from-file=client.key=<path_to_client_key> \
  --from-file=10_8_60_93.CRT=<path_to_server_cert> \
  --from-file=proteccio.rc=<path_to_hsm_config> \
  -n openstack

where:

<your_hsm_password>

Specifies the HSM password for your RHOSO environment.

<path_to_client_cert>

Specifies the path to the HSM client certificate.

<path_to_client_key>

Specifies the path to the client key.

<path_to_server_cert>

Specifies the path to the server certificate.

<path_to_hsm_config>

Specifies the path to your HSM configuration in your RHOSO environment.

When you use the automated adoption by setting barbican_hsm_enabled: true, the barbican_adoption role creates these secrets automatically. The secret names default to hsm-login and proteccio-data.

Patch the OpenStackControlPlane custom resource to deploy Key Manager service with HSM support:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  barbican:
    enabled: true
    apiOverride:
      route: {}
    template:
      databaseInstance: openstack
      databaseAccount: barbican
      rabbitMqClusterName: rabbitmq
      secret: osp-secret
      simpleCryptoBackendSecret: osp-secret
      serviceAccount: barbican
      serviceUser: barbican
      passwordSelectors:
        database: BarbicanDatabasePassword
        service: BarbicanPassword
        simplecryptokek: BarbicanSimpleCryptoKEK
      customServiceConfig: |
        [p11_crypto_plugin]
        plugin_name = PKCS11
        library_path = /usr/lib64/libnethsm.so
        token_labels = VHSM1
        mkek_label = adoption_mkek_1
        hmac_label = adoption_hmac_1
        encryption_mechanism = CKM_AES_CBC
        hmac_key_type = CKK_GENERIC_SECRET
        hmac_keygen_mechanism = CKM_GENERIC_SECRET_KEY_GEN
        hmac_mechanism = CKM_SHA256_HMAC
        key_wrap_mechanism = CKM_AES_CBC_PAD
        key_wrap_generate_iv = true
        always_set_cka_sensitive = true
        os_locking_ok = false
      globalDefaultSecretStore: pkcs11
      enabledSecretStores: ["simple_crypto", "pkcs11"]
      pkcs11:
        loginSecret: hsm-login
        clientDataSecret: proteccio-data
        clientDataPath: /etc/proteccio
      barbicanAPI:
        replicas: 1
        override:
          service:
            internal:
              metadata:
                annotations:
                  metallb.universe.tf/address-pool: internalapi
                  metallb.universe.tf/allow-shared-ip: internalapi
                  metallb.universe.tf/loadBalancerIPs: 172.17.0.80
              spec:
                type: LoadBalancer
      barbicanWorker:
        replicas: 1
      barbicanKeystoneListener:
        replicas: 1
'

library_path specifies the path to the PKCS#11 library, for example, /usr/lib64/libnethsm.so for Proteccio).
token_labels specifies the HSM partition name, for example, VHSM1.
mkek_label and hmac_label specify key labels that are configured in the HSM.
loginSecret specifies the name of the Kubernetes secret that contains the HSM PIN.
clientDataSecret specifies the name of the Kubernetes secret that contains the HSM certificates and configuration.

Verification

Verify that both secret stores are available:
```
$ openstack secret store list
```

Test the HSM back-end functionality:

$ openstack secret store --name "hsm-test-$(date +%s)" \
  --payload "test-payload" \
  --algorithm aes --mode cbc --bit-length 256

Verify that the migrated secrets are accessible:
```
$ openstack secret list
```

Check that the Key Manager service services are operational:

$ oc get pods -l service=barbican
NAME                                          READY   STATUS    RESTARTS      AGE
barbican-api-5d65949b4-xhkd7                  2/2     Running   7 (10m ago)   29d
barbican-keystone-listener-687cbdc77d-4kjnk   2/2     Running   3 (11m ago)   29d
barbican-worker-5c4b947d5c-l9jdh              2/2     Running   3 (11m ago)   29d

HSM adoption preserves both simple crypto and HSM-backed secrets. The migration process maintains HSM metadata and secret references, ensuring continued access to existing secrets while enabling new secrets to use either back-end.

4.4.4. Troubleshooting Key Manager HSM adoption

Review troubleshooting guidance for common issues that you might encounter while you perform the HSM-enabled Key Manager (Barbican) service adoption.

If issues persist after following the troubleshooting guide:

Collect adoption logs and configuration for analysis.
Check the HSM vendor documentation for vendor-specific troubleshooting.
Verify HSM server status and connectivity independently.
Review the adoption summary report for additional diagnostic information.

Resolving configuration validation failures

If the adoption fails with validation errors about placeholder values, replace the placeholder values with your environment’s configuration values.

Example error:

TASK [Validate all required variables are set] ****
fatal: [localhost]: FAILED! => {
    "msg": "Required variable proteccio_certs_path contains placeholder value."
}

Procedure

Edit your hardware security module configuration in the Zuul job vars or CI framework configuration file.

Check the following key variables and replace all placeholder values with actual configuration values for your environment:

cifmw_hsm_password: <your_actual_hsm_password>
cifmw_barbican_proteccio_partition: <VHSM1>
cifmw_barbican_proteccio_mkek_label: <your_mkek_label>
cifmw_barbican_proteccio_hmac_label: <your_hmac_label>
cifmw_hsm_proteccio_client_src: <https://your-server/path/to/Proteccio.iso>
cifmw_hsm_proteccio_conf_src: <https://your-server/path/to/proteccio.rc>

Verify that no placeholder values remain in your configuration.

Resolving missing HSM file prerequisites

If the adoption fails because hardware security module (HSM) certificates or client software cannot be found, update your configuration to point to the files in their specific locations.

Example error:

TASK [Validate Proteccio prerequisites exist] ****
fatal: [localhost]: FAILED! => {
    "msg": "Proteccio client ISO not found: /opt/proteccio/Proteccio3.06.05.iso"
}

Procedure

Verify that all required HSM files are accessible from the configured URLs. For example:

$ curl -I https://your-server/path/to/Proteccio3.06.05.iso
$ curl -I https://your-server/path/to/proteccio.rc
$ curl -I https://your-server/path/to/client.crt
$ curl -I https://your-server/path/to/client.key

If the files are in different locations, update the URL variables in your configuration. For example:

cifmw_hsm_proteccio_client_src: "https://correct-server/path/to/Proteccio3.06.05.iso"
cifmw_hsm_proteccio_conf_src: "https://correct-server/path/to/proteccio.rc"
cifmw_hsm_proteccio_client_crt_src: "https://correct-server/path/to/client.crt"
cifmw_hsm_proteccio_client_key_src: "https://correct-server/path/to/client.key"

Check the network connectivity and authentication to ensure that the URLs are accessible from the CI environment.

Resolving source environment connectivity issues

If the adoption cannot connect to the source OpenStack environment to extract the configuration, check your SSH connectivity to the source Controller node and update the configuration if needed.

Example error:

TASK [detect source environment HSM configuration] ****
fatal: [localhost]: FAILED! => {
    "msg": "SSH connection to source environment failed"
}

Procedure

Verify SSH connectivity to the source Controller node:

$ ssh -o StrictHostKeyChecking=no tripleo-admin@controller-0.ctlplane

Update the controller1_ssh variable if needed:
```
$ controller1_ssh: "ssh -o StrictHostKeyChecking=no tripleo-admin@<controller_ip>"
```
where:

<controller_ip>

Specifies the IP address of your Controller node.
Ensure that the SSH keys are properly configured for passwordless access.

Resolving HSM secret creation failures

If hardware security module (HSM) secrets cannot be created in the target environment, check whether you need to update the names of your secrets in your source configuration file.

Example error:

TASK [Create HSM secrets in target environment] ****
fatal: [localhost]: FAILED! => {
    "msg": "Failed to create secret proteccio-data"
}

Procedure

Verify target environment access:

$ export KUBECONFIG=/path/to/.kube/config
$ oc get secrets -n openstack

Check if secrets already exist:

$ oc get secret proteccio-data hsm-login -n openstack

If secrets exist with different names, update the configuration variables:

proteccio_login_secret_name: "your-hsm-login-secret"
proteccio_client_data_secret_name: "your-proteccio-data-secret"

Resolving custom image registry issues

If custom Barbican images cannot be pushed to or pulled from the configured registry, you can verify the authentication, test image push permissions, and then update the configuration as needed.

Example error:

TASK [Create Proteccio-enabled Barbican images] ****
fatal: [localhost]: FAILED! => {
    "msg": "Failed to push image to registry"
}

Procedure

Verify registry authentication:
```
$ podman login <registry_url>
```
where:

<registry_url>

Specifies the URL of your configured registry.
Test image push permissions:
```
$ podman tag hello-world <registry>/<namespace>/test:latest
$ podman push <registry>/<namespace>/test:latest
```
where:

<registry>

Specifies the name of your registry server.

<namespace>

Specifies the namespace of your container image.

Update registry configuration variables if needed:

cifmw_update_containers_registry: "your-registry:5001"
cifmw_update_containers_org: "your-namespace"
cifmw_image_registry_verify_tls: false

Resolving HSM back-end detection failures

If the adoption role cannot detect hardware security module (HSM) configuration in the source environment, you must force the HSM adoption.

Example error:

TASK [detect source environment HSM configuration] ****
ok: [localhost] => {
    "msg": "No HSM configuration found - using standard adoption"
}

Procedure

Manually verify that the HSM configuration exists in the source environment:

$ ssh tripleo-admin@controller-0.ctlplane \
  "sudo grep -A 10 '\[p11_crypto_plugin\]' \
  /var/lib/config-data/puppet-generated/barbican/etc/barbican/barbican.conf"

If HSM is configured but not detected, force HSM adoption by setting the barbican_hsm_enabled variable:
```
# In your Zuul job vars or CI framework configuration
barbican_hsm_enabled: true
```
This configuration ensures that the barbican_adoption role uses the HSM-enabled patch for Key Manager service (barbican) deployment.

Resolving database migration issues

If hardware security module (HSM) metadata is not preserved during database migration, check the database logs for any errors and verify that the source database includes the HSM secrets.

Example error:

TASK [Verify database migration preserves HSM references] ****
ok: [localhost] => {
    "msg": "HSM secrets found in migrated database: 0"
}

Procedure

Verify that the source database contains the HSM secrets:

$ ssh tripleo-admin@controller-0.ctlplane \
  "sudo mysql barbican -e 'SELECT COUNT(*) FROM secret_store_metadata WHERE key=\"plugin_name\" AND value=\"PKCS11\";'"

Check the database migration logs for errors:

$ oc logs deployment/barbican-api | grep -i migration

If the migration failed, restore the database from backup and retry.

Resolving service startup failures

If the Key Manager service (barbican) services fail to start after the hardware security module (HSM) configuration is applied, check the configuration in the pod.

Example error:

$ oc get pods -l service=barbican
NAME                           READY   STATUS    RESTARTS   AGE
barbican-api-xyz               0/1     Error     0          2m

Procedure

Check pod logs for HSM connectivity issues:
```
$ oc logs barbican-api-xyz
```

Verify HSM library is accessible:

$ oc exec barbican-api-xyz -- ls -la /usr/lib64/libnethsm.so

Check HSM configuration in the pod:

$ oc exec barbican-api-xyz -- cat /etc/proteccio/proteccio.rc

Resolving performance and connectivity issues

If the hardware security module (HSM) operations are slow or fail intermittently, check the HSM connectivity and monitor the HSM server logs.

Procedure

Test HSM connectivity from Key Manager service (barbican) pods:

$ oc exec barbican-api-xyz -- pkcs11-tool --module /usr/lib64/libnethsm.so --list-slots

Check HSM server connectivity:
```
$ oc exec barbican-api-xyz -- nc -zv <hsm_server_ip> <hsm_port>
```
where:

<hsm_server_ip>

Specifies the IP address of the HSM server.

<hsm_port>

Specifies the port of your HSM server.
Monitor HSM server logs for authentication or capacity issues.

4.4.5. Troubleshooting Key Manager service Proteccio HSM adoption

Use this reference to troubleshoot common issues that might occur during Key Manager service (barbican) adoption with Proteccio HSM integration. If Proteccio HSM issues persist, consult the Eviden Trustway documentation and ensure that HSM server configuration matches the client settings.

Resolving prerequisite validation failures

If the adoption script fails during the prerequisites check, verify that your configuration includes all the required Proteccio files and that the HSM Ansible role is available.

Example error:

ERROR: Required file proteccio_files/YOUR_CERT_FILE not found
ERROR: Cannot connect to OpenShift cluster
ERROR: Proteccio HSM Ansible role not found

Procedure

Verify that all required Proteccio files are present:
```
$ ls -la /path/to/your/proteccio_files/
```
Ensure that your configured certificate files, private key, HSM certificate file, and configuration file exist as specified in your proteccio_required_files configuration.

Test OpenShift cluster connectivity:

$ oc cluster-info
$ oc get pods -n openstack

Verify that the HSM Ansible role is available:

$ ls -la /path/to/your/roles/ansible-role-rhoso-proteccio-hsm/

Resolving SSH connection failures to the source environment

If you cannot connect to the source TripleO environment, verify your SSH key access and test the SSH commands that the adoption uses.

Example error:

Warning: Permanently added 'YOUR_UNDERCLOUD_HOST' (ED25519) to the list of known hosts.
Permission denied (publickey).

Procedure

Verify SSH key access to the undercloud:

$ ssh YOUR_UNDERCLOUD_HOST echo "Connection test"

Test the specific SSH commands used by the adoption:

$ sudo ssh -t YOUR_UNDERCLOUD_HOST 'sudo -u stack bash -lc "echo test"'
$ sudo ssh -t YOUR_UNDERCLOUD_HOST 'sudo -u stack ssh -t tripleo-admin@YOUR_CONTROLLER_HOST.ctlplane "echo test"'

If the connection fails, verify the SSH configuration and ensure that the undercloud hostname resolves correctly.

Resolving database import failures

If the source database export or import fails, check the the source Galera container, database connectivity, and the source Key Manager service (barbican) configuration.

Example error:

Error: no container with name or ID "galera-bundle-podman-0" found
mysqldump: Got error: 1045: "Access denied for user 'barbican'@'localhost'"

Procedure

Verify that the source Galera container is running:

$ sudo ssh -t YOUR_UNDERCLOUD_HOST 'sudo -u stack ssh -t tripleo-admin@YOUR_CONTROLLER_HOST.ctlplane "sudo podman ps | grep galera"'

Test database connectivity with the extracted credentials:

$ sudo ssh -t YOUR_UNDERCLOUD_HOST 'sudo -u stack ssh -t tripleo-admin@YOUR_CONTROLLER_HOST.ctlplane "sudo podman exec galera-bundle-podman-0 mysql -u barbican -p<password> -e \"SELECT 1;\""'

where:

<password>: Specifies your database password.

Check the source Key Manager service configuration for the correct database password:

$ sudo ssh -t YOUR_UNDERCLOUD_HOST 'sudo -u stack ssh -t tripleo-admin@YOUR_CONTROLLER_HOST.ctlplane "sudo grep connection /var/lib/config-data/puppet-generated/barbican/etc/barbican/barbican.conf"'

Resolving custom image pull failures

If Proteccio custom images fail to pull or start, verify image registry access, image pull secrets, and registry authentication.

Example error:

Failed to pull image "<Your Custom Pod Image and Tag>": rpc error
Pod has unbound immediate PersistentVolumeClaims

Procedure

Verify image registry access:
```
$ podman pull <custom_pod_image_and_tag>
```
where:

<custom_pod_image_and_tag>

Specifies your custom pod image and the image tag.
Check image pull secrets and registry authentication:
```
$ oc get secrets -n openstack | grep pull
$ oc describe pod <barbican_pod_name> -n openstack
```
where:

<barbican_pod_name>

Specifies your Barbican pod name.
Verify that the OpenStackVersion resource was applied correctly:
```
$ oc get openstackversion openstack -n openstack -o yaml
```

Resolving HSM certificate mounting issues

If Proteccio client certificates are not properly mounted in pods, check the secret creation and ensure that the Key Manager service (barbican) configuration includes the correct volume mounts.

Example error:

$ oc exec <barbican-pod> -c barbican-api -- ls -la /etc/proteccio/
ls: cannot access '/etc/proteccio/': No such file or directory

Procedure

Verify that the proteccio-data secret was created correctly:
```
$ oc describe secret proteccio-data -n openstack
```

Check that the secret contains the expected files:

$ oc get secret proteccio-data -n openstack -o yaml

Verify that the Key Manager service configuration includes the correct volume mounts:
```
$ oc get barbican barbican -n openstack -o yaml | grep -A10 pkcs11
```

Resolving service startup failures

If the Key Manager service (barbican) services fail to start after the hardware security module (HSM) configuration is applied, check the configuration in the pod.

Example error:

$ oc get pods -l service=barbican
NAME                           READY   STATUS    RESTARTS   AGE
barbican-api-xyz               0/1     Error     0          2m

Procedure

Check pod logs for HSM connectivity issues:
```
$ oc logs barbican-api-xyz
```

Verify HSM library is accessible:

$ oc exec barbican-api-xyz -- ls -la /usr/lib64/libnethsm.so

Check HSM configuration in the pod:

$ oc exec barbican-api-xyz -- cat /etc/proteccio/proteccio.rc

Resolving adoption verification failures

If the secrets from the source environment are not accessible after adoption, verify that the database import completed successfully, test API connectivity, and check for schema adoption issues.

Example error:

$ openstack secret list
# Returns empty list or HTTP 500 errors

Procedure

Verify that the database import completed successfully:

$ oc exec openstack-galera-0 -n openstack -- mysql -u root -p<password> barbican -e "SELECT COUNT(*) FROM secrets;"

where:

<password>: Specifies your database password.

Check for schema adoption issues:

$ oc logs job.batch/barbican-db-sync -n openstack

Test API connectivity:

$ oc exec openstackclient -n openstack -- curl -s -k -H "X-Auth-Token: $(openstack token issue -f value -c id)" https://barbican-internal.openstack.svc:9311/v1/secrets

Verify that projects and users were adopted correctly, as secrets are project-scoped.

4.4.6. Rolling back the HSM adoption

If the hardware security module (HSM) adoption fails, you can restore your environment to its original state and attempt the adoption again.

Procedure

Restore the Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 database backup:

$ oc exec -i openstack-galera-0 -n openstack -- mysql -u root -p<password> barbican < /path/to/your/backups/rhoso18_barbican_backup.sql

where:

<password>: Specifies your database password.

Reset to standard images:

$ oc delete openstackversion openstack -n openstack

Restore the base control plane configuration:

$ oc apply -f /path/to/your/base_controlplane.yaml

Next steps

To avoid additional issues when attempting your adoption again, consider the following suggestions:

Check the adoption logs that are stored in your configured working directory with timestamped summary reports.
For HSM-specific issues, consult the Proteccio documentation and verify HSM connectivity from the target environment.
Run the adoption in dry-run mode (./run_proteccio_adoption.sh option 3) to validate the environment before making changes.

4.5. Adopting the Networking service

To adopt the Networking service (neutron), you patch an existing OpenStackControlPlane custom resource (CR) that has the Networking service disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack (OSP) environment.

The Networking service adoption is complete if you see the following results:

The NeutronAPI service is running.
The Identity service (keystone) endpoints are updated, and the same back end of the source cloud is available.

The Networking service adoption follows a similar pattern to Keystone.

Prerequisites

Ensure that Single Node OpenShift or OpenShift Local is running in the OpenShift cluster.
Adopt the Identity service. For more information, see Adopting the Identity service.
Migrate your OVN databases to ovsdb-server instances that run in the OpenShift cluster. For more information, see Migrating OVN data.

Procedure

Patch the OpenStackControlPlane CR to deploy the Networking service:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  neutron:
    enabled: true
    apiOverride:
      route: {}
    template:
      override:
        service:
          internal:
            metadata:
              annotations:
                metallb.universe.tf/address-pool: internalapi
                metallb.universe.tf/allow-shared-ip: internalapi
                metallb.universe.tf/loadBalancerIPs: <172.17.0.80>
            spec:
              type: LoadBalancer
      databaseInstance: openstack
      databaseAccount: neutron
      secret: osp-secret
      networkAttachments:
      - internalapi
'

where:

<172.17.0.80>

Specifies the load balancer IP in your environment. If you use IPv6, change the load balancer IP to the load balancer IP in your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.

If you used the neutron-dhcp-agent in your OSP deployment and you still need to use it after adoption, you must enable the dhcp_agent_notification for the neutron-api service:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
 spec:
  neutron:
    template:
      customServiceConfig: |
        [DEFAULT]
        dhcp_agent_notification = True
'

Verification

Inspect the resulting Networking service pods:
```
$ oc get pods -l service=neutron
```

Ensure that the Neutron API service is registered in the Identity service:

$ openstack service list | grep network

$ openstack endpoint list | grep network

| 6a805bd6c9f54658ad2f24e5a0ae0ab6 | regionOne | neutron      | network      | True    | public    | http://neutron-public-openstack.apps-crc.testing  |
| b943243e596847a9a317c8ce1800fa98 | regionOne | neutron      | network      | True    | internal  | http://neutron-internal.openstack.svc:9696        |

Create sample resources so that you can test whether the user can create networks, subnets, ports, or routers:

$ openstack network create net
$ openstack subnet create --network net --subnet-range 10.0.0.0/24 subnet
$ openstack router create router

4.6. Configuring control plane networking for spine-leaf topologies

If you are adopting a spine-leaf or Distributed Compute Node (DCN) deployment, update the control plane networking for communication across sites. Add subnets for remote sites to your existing NetConfig custom resource (CR) and update NetworkAttachmentDefinition CRs with routes to enable connectivity between the central control plane and remote sites.

Prerequisites

You have deployed the Red Hat OpenStack Services on OpenShift (RHOSO) control plane.
You have configured a NetConfig CR for the central site. For more information, see Configuring isolated networks.
You have the network topology information for all remote sites, including:
- IP address ranges for each service network at each site
- VLAN IDs for each service network at each site
- Gateway addresses for inter-site routing

Procedure

Update your existing NetConfig CR to add subnets for each remote site. Each service network must include a subnet for the central site and each remote site. Use unique VLAN IDs for each site. For example:

Central site: VLANs 20-23
Edge site 1: VLANs 30-33

Edge site 2: VLANs 40-43

apiVersion: network.openstack.org/v1beta1
kind: NetConfig
metadata:
  name: netconfig
spec:
  networks:
  - name: ctlplane
    dnsDomain: ctlplane.example.com
    subnets:
    - name: <subnet1>
      allocationRanges:
      - end: 192.168.122.120
        start: 192.168.122.100
      cidr: 192.168.122.0/24
      gateway: 192.168.122.1
    - name: <ctlplanesite1>
      allocationRanges:
      - end: 192.168.133.120
        start: 192.168.133.100
      cidr: 192.168.133.0/24
      gateway: 192.168.133.1
    - name: <ctlplanesite2>
      allocationRanges:
      - end: 192.168.144.120
        start: 192.168.144.100
      cidr: 192.168.144.0/24
      gateway: 192.168.144.1
  - name: internalapi
    dnsDomain: internalapi.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.17.0.250
        start: 172.17.0.100
      cidr: 172.17.0.0/24
      vlan: 20
    - name: internalapisite1
      allocationRanges:
      - end: 172.17.10.250
        start: 172.17.10.100
      cidr: 172.17.10.0/24
      vlan: 30
    - name: internalapisite2
      allocationRanges:
      - end: 172.17.20.250
        start: 172.17.20.100
      cidr: 172.17.20.0/24
      vlan: 40
  - name: storage
    dnsDomain: storage.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.18.0.250
        start: 172.18.0.100
      cidr: 172.18.0.0/24
      vlan: 21
    - name: storagesite1
      allocationRanges:
      - end: 172.18.10.250
        start: 172.18.10.100
      cidr: 172.18.10.0/24
      vlan: 31
    - name: storagesite2
      allocationRanges:
      - end: 172.18.20.250
        start: 172.18.20.100
      cidr: 172.18.20.0/24
      vlan: 41
  - name: tenant
    dnsDomain: tenant.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.19.0.250
        start: 172.19.0.100
      cidr: 172.19.0.0/24
      vlan: 22
    - name: tenantsite1
      allocationRanges:
      - end: 172.19.10.250
        start: 172.19.10.100
      cidr: 172.19.10.0/24
      vlan: 32
    - name: tenantsite2
      allocationRanges:
      - end: 172.19.20.250
        start: 172.19.20.100
      cidr: 172.19.20.0/24
      vlan: 42

where:

<subnet1>

Specifies a user-defined subnet name for the central site subnet.

<ctlplanesite1>

Specifies a user-defined subnet for the first DCN edge site.

<ctlplanesite2>

Specifies a user-defined subnet for the second DCN edge site.

You must have the storagemgmt network on OpenShift nodes when using DCN with Swift storage. It is not necessary when using Red Hat Ceph Storage.

Update the NetworkAttachmentDefinition CR for the internalapi network to include routes to remote site subnets. These routes fields enable control plane pods attached to the internalapi network, such as OVN Southbound database, to communicate with Compute nodes at remote sites through the central site gateway, and are required for DCN:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: internalapi
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "internalapi",
      "type": "macvlan",
      "master": "internalapi",
      "ipam": {
        "type": "whereabouts",
        "range": "172.17.0.0/24",
        "range_start": "172.17.0.30",
        "range_end": "172.17.0.70",
        "routes": [
          { "dst": "172.17.10.0/24", "gw": "172.17.0.1" },
          { "dst": "172.17.20.0/24", "gw": "172.17.0.1" }
        ]
      }
    }

Update the NetworkAttachmentDefinition CR for the ctlplane network to include routes to remote site subnets:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: ctlplane
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "ctlplane",
      "type": "macvlan",
      "master": "ospbr",
      "ipam": {
        "type": "whereabouts",
        "range": "192.168.122.0/24",
        "range_start": "192.168.122.30",
        "range_end": "192.168.122.70",
        "routes": [
          { "dst": "192.168.133.0/24", "gw": "192.168.122.1" },
          { "dst": "192.168.144.0/24", "gw": "192.168.122.1" }
        ]
      }
    }

Update the NetworkAttachmentDefinition CR for the storage network to include routes to remote site subnets:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: storage
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "storage",
      "type": "macvlan",
      "master": "storage",
      "ipam": {
        "type": "whereabouts",
        "range": "172.18.0.0/24",
        "range_start": "172.18.0.30",
        "range_end": "172.18.0.70",
        "routes": [
          { "dst": "172.18.10.0/24", "gw": "172.18.0.1" },
          { "dst": "172.18.20.0/24", "gw": "172.18.0.1" }
        ]
      }
    }

Update the NetworkAttachmentDefinition CR for the tenant network to include routes to remote site subnets:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: tenant
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "tenant",
      "type": "macvlan",
      "master": "tenant",
      "ipam": {
        "type": "whereabouts",
        "range": "172.19.0.0/24",
        "range_start": "172.19.0.30",
        "range_end": "172.19.0.70",
        "routes": [
          { "dst": "172.19.10.0/24", "gw": "172.19.0.1" },
          { "dst": "172.19.20.0/24", "gw": "172.19.0.1" }
        ]
      }
    }

Adjust the IP ranges, subnets, and gateway addresses in all NAD configurations to match your network topology. The master interface name must match the interface on the OpenShift nodes where the VLAN is configured.

If you have already deployed OVN services, restart the OVN Southbound database pods to pick up the new routes:
```
$ oc delete pod -l service=ovsdbserver-sb
```
The pods are automatically recreated with the updated network configuration.
Configure the Networking service (neutron) to recognize all site physnets. In the OpenStackControlPlane CR, ensure the Networking service configuration includes all physnets:
```
apiVersion: core.openstack.org/v1beta1
kind: OpenStackControlPlane
metadata:
  name: openstack
spec:
  neutron:
    template:
      customServiceConfig: |
        [ml2_type_vlan]
        network_vlan_ranges = leaf0:1:1000,leaf1:1:1000,leaf2:1:1000
        [ovn]
        ovn_emit_need_to_frag = false
```
where:

leaf0

Represents the physnet for the central site.

leaf1

Represents the physnet for the first remote site.

leaf2

Represents the physnet for the second remote site.

Adjust the physnet names to match your OpenStack deployment. Common conventions include leaf0/leaf1/leaf2 or datacentre/dcn1/dcn2.

Verification

Verify that the NetConfig CR is created with all subnets:

$ oc get netconfig netconfig -o yaml | grep -A2 "name: subnet1\|name: .*site"

Verify that each NetworkAttachmentDefinition includes routes to remote site subnets:

for nad in ctlplane internalapi storage tenant; do
  echo "=== $nad ==="
  oc get net-attach-def $nad -o jsonpath='{.spec.config}' | jq '.ipam.routes
done

After restarting OVN SB pods, verify they have routes to remote site subnets:

$ oc exec $(oc get pod -l service=ovsdbserver-sb -o name | head -1) -- ip route show | grep 172.17

Sample output:

172.17.10.0/24 via 172.17.0.1 dev internalapi
172.17.20.0/24 via 172.17.0.1 dev internalapi

4.7. Adopting the Object Storage service

If you are using Object Storage as a service, adopt the Object Storage service (swift) to the Red Hat OpenStack Services on OpenShift (RHOSO) environment. If you are using the Object Storage API of the Ceph Object Gateway (RGW), skip the following procedure.

Prerequisites

The Object Storage service storage back-end services are running in the OpenStack (OSP) deployment.
The storage network is properly configured on the OpenShift cluster. For more information, see Preparing Red Hat OpenShift Container Platform for Red Hat OpenStack Services on OpenShift in Deploying Red Hat OpenStack Services on OpenShift.

Procedure

Create the swift-conf secret that includes the Object Storage service hash path suffix and prefix:

$ oc apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: swift-conf
type: Opaque
data:
  swift.conf: $($CONTROLLER1_SSH sudo cat /var/lib/config-data/puppet-generated/swift/etc/swift/swift.conf | base64 -w0)
EOF

Create the swift-ring-files ConfigMap that includes the Object Storage service ring files:

$ oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: swift-ring-files
binaryData:
  swiftrings.tar.gz: $($CONTROLLER1_SSH "cd /var/lib/config-data/puppet-generated/swift/etc/swift && tar cz *.builder *.ring.gz backups/ | base64 -w0")
  account.ring.gz: $($CONTROLLER1_SSH "base64 -w0 /var/lib/config-data/puppet-generated/swift/etc/swift/account.ring.gz")
  container.ring.gz: $($CONTROLLER1_SSH "base64 -w0 /var/lib/config-data/puppet-generated/swift/etc/swift/container.ring.gz")
  object.ring.gz: $($CONTROLLER1_SSH "base64 -w0 /var/lib/config-data/puppet-generated/swift/etc/swift/object.ring.gz")
EOF

Patch the OpenStackControlPlane custom resource to deploy the Object Storage service:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  swift:
    enabled: true
    template:
      memcachedInstance: memcached
      swiftRing:
        ringReplicas: 3
      swiftStorage:
        replicas: 0
        networkAttachments:
        - storage
        storageClass: local-storage
        storageRequest: 10Gi
      swiftProxy:
        secret: osp-secret
        replicas: 2
        passwordSelectors:
          service: SwiftPassword
        serviceUser: swift
        override:
          service:
            internal:
              metadata:
                annotations:
                  metallb.universe.tf/address-pool: internalapi
                  metallb.universe.tf/allow-shared-ip: internalapi
                  metallb.universe.tf/loadBalancerIPs: <172.17.0.80>
              spec:
                type: LoadBalancer
        networkAttachments:
        - storage
'

spec.swift.swiftStorage.storageClass must match the RHOSO deployment storage class.
metallb.universe.tf/loadBalancerIPs: <172.17.0.80> specifies the load balancer IP in your environment. If you use IPv6, change the load balancer IP to the load balancer IP in your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.
spec.swift.swiftProxy.networkAttachments must match the network attachment for the previous Object Storage service configuration from the OSP deployment.

Verification

Inspect the resulting Object Storage service pods:
```
$ oc get pods -l component=swift-proxy
```

Verify that the Object Storage proxy service is registered in the Identity service (keystone):

$ openstack service list | grep swift
| b5b9b1d3c79241aa867fa2d05f2bbd52 | swift    | object-store |

$ openstack endpoint list | grep swift
| 32ee4bd555414ab48f2dc90a19e1bcd5 | regionOne | swift        | object-store | True    | public    | https://swift-public-openstack.apps-crc.testing/v1/AUTH_%(tenant_id)s |
| db4b8547d3ae4e7999154b203c6a5bed | regionOne | swift        | object-store | True    | internal  | http://swift-internal.openstack.svc:8080/v1/AUTH_%(tenant_id)s        |

Verify that you are able to upload and download objects:

$ openstack container create test
+---------------------------------------+-----------+------------------------------------+
| account                               | container | x-trans-id                         |
+---------------------------------------+-----------+------------------------------------+
| AUTH_4d9be0a9193e4577820d187acdd2714a | test      | txe5f9a10ce21e4cddad473-0065ce41b9 |
+---------------------------------------+-----------+------------------------------------+

$ openstack object create test --name obj <(echo "Hello World!")
+--------+-----------+----------------------------------+
| object | container | etag                             |
+--------+-----------+----------------------------------+
| obj    | test      | d41d8cd98f00b204e9800998ecf8427e |
+--------+-----------+----------------------------------+

$ openstack object save test obj --file -
Hello World!

The Object Storage data is still stored on the existing OSP nodes. For more information about migrating the actual data from the OSP deployment to the RHOSO deployment, see Migrating the Object Storage service (swift) data from OSP to Red Hat OpenStack Services on OpenShift (RHOSO) nodes.

4.8. Adopting the Image service

To adopt the Image Service (glance) you patch an existing OpenStackControlPlane custom resource (CR) that has the Image service disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack (OSP) environment.

The Image service adoption is complete if you see the following results:

The GlanceAPI service up and running.
The Identity service endpoints are updated, and the same back end of the source cloud is available.

To complete the Image service adoption, ensure that your environment meets the following criteria:

You have a running TripleO environment (the source cloud).
You have a Single Node OpenShift or OpenShift Local that is running in the OpenShift cluster.
Optional: You can reach an internal/external Ceph cluster by both crc and TripleO.

If you have image quotas in OSP , these quotas are transferred to Red Hat OpenStack Services on OpenShift (RHOSO) Antelope because the image quota system in Antelope is disabled by default. For more information about enabling image quotas in Antelope, see Configuring image quotas in Customizing persistent storage. If you enable image quotas in RHOSO Antelope, the new quotas replace the legacy quotas from OSP .

4.8.1. Adopting the Image service that is deployed with a Object Storage service back end

Adopt the Image Service (glance) that you deployed with an Object Storage service (swift) back end in the OpenStack (OSP) environment. The control plane glanceAPI instance is deployed with the following configuration. You use this configuration in the patch manifest that deploys the Image service with the object storage back end:

..
spec
  glance:
   ...
      customServiceConfig: |
          [DEFAULT]
          enabled_backends = default_backend:swift
          [glance_store]
          default_backend = default_backend
          [default_backend]
          swift_store_create_container_on_put = True
          swift_store_auth_version = 3
          swift_store_auth_address = {{ .KeystoneInternalURL }}
          swift_store_endpoint_type = internalURL
          swift_store_user = service:glance
          swift_store_key = {{ .ServicePassword }}

Prerequisites

You have completed the previous adoption steps.

Procedure

Create a new file, for example, glance_swift.patch, and include the following content:

spec:
  glance:
    enabled: true
    apiOverride:
      route: {}
    template:
      secret: osp-secret
      databaseInstance: openstack
      storage:
        storageRequest: 10G
      customServiceConfig: |
        [DEFAULT]
        enabled_backends = default_backend:swift
        [glance_store]
        default_backend = default_backend
        [default_backend]
        swift_store_create_container_on_put = True
        swift_store_auth_version = 3
        swift_store_auth_address = {{ .KeystoneInternalURL }}
        swift_store_endpoint_type = internalURL
        swift_store_user = service:glance
        swift_store_key = {{ .ServicePassword }}
      glanceAPIs:
        default:
          replicas: 1
          override:
            service:
              internal:
                metadata:
                  annotations:
                    metallb.universe.tf/address-pool: internalapi
                    metallb.universe.tf/allow-shared-ip: internalapi
                    metallb.universe.tf/loadBalancerIPs: <172.17.0.80>
                spec:
                  type: LoadBalancer
          networkAttachments:
            - storage

where:

<172.17.0.80>

The Object Storage service as a back end establishes a dependency with the Image service. Any deployed GlanceAPI instances do not work if the Image service is configured with the Object Storage service that is not available in the OpenStackControlPlane custom resource. After the Object Storage service, and in particular SwiftProxy, is adopted, you can proceed with the GlanceAPI adoption. For more information, see Adopting the Object Storage service.

Verify that SwiftProxy is available:

$ oc get pod -l component=swift-proxy | grep Running
swift-proxy-75cb47f65-92rxq   3/3     Running   0

Patch the GlanceAPI service that is deployed in the control plane:

$ oc patch openstackcontrolplane openstack --type=merge --patch-file=glance_swift.patch

4.8.2. Adopting the Image service that is deployed with a Block Storage service back end

Adopt the Image Service (glance) that you deployed with a Block Storage service (cinder) back end in the OpenStack (OSP) environment. The control plane glanceAPI instance is deployed with the following configuration. You use this configuration in the patch manifest that deploys the Image service with the block storage back end:

..
spec
  glance:
   ...
      customServiceConfig: |
          [DEFAULT]
          enabled_backends = default_backend:cinder
          [glance_store]
          default_backend = default_backend
          [default_backend]
          description = Default cinder backend
          cinder_store_auth_address = {{ .KeystoneInternalURL }}
          cinder_store_user_name = {{ .ServiceUser }}
          cinder_store_password = {{ .ServicePassword }}
          cinder_store_project_name = service
          cinder_catalog_info = volumev3::internalURL
          cinder_use_multipath = true
          [oslo_concurrency]
          lock_path = /var/lib/glance/tmp

Prerequisites

You have completed the previous adoption steps.

Procedure

Create a new file, for example glance_cinder.patch, and include the following content:

spec:
  glance:
    enabled: true
    apiOverride:
      route: {}
    template:
      secret: osp-secret
      databaseInstance: openstack
      storage:
        storageRequest: 10G
      customServiceConfig: |
        [DEFAULT]
        enabled_backends = default_backend:cinder
        [glance_store]
        default_backend = default_backend
        [default_backend]
        description = Default cinder backend
        cinder_store_auth_address = {{ .KeystoneInternalURL }}
        cinder_store_user_name = {{ .ServiceUser }}
        cinder_store_password = {{ .ServicePassword }}
        cinder_store_project_name = service
        cinder_catalog_info = volumev3::internalURL
        cinder_use_multipath = true
        [oslo_concurrency]
        lock_path = /var/lib/glance/tmp
      glanceAPIs:
        default:
          replicas: 1
          override:
            service:
              internal:
                metadata:
                  annotations:
                    metallb.universe.tf/address-pool: internalapi
                    metallb.universe.tf/allow-shared-ip: internalapi
                    metallb.universe.tf/loadBalancerIPs: <172.17.0.80>
                spec:
                  type: LoadBalancer
          networkAttachments:
            - storage

where:

<172.17.0.80>

Specifies the load balancer IP. If you use IPv6, change the load balancer IP to the load balancer IP in your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.

The Block Storage service as a back end establishes a dependency with the Image service. Any deployed GlanceAPI instances do not work if the Image service is configured with the Block Storage service that is not available in the OpenStackControlPlane custom resource. After the Block Storage service, and in particular CinderVolume, is adopted, you can proceed with the GlanceAPI adoption. For more information, see Adopting the Block Storage service.

Verify that CinderVolume is available:

$ oc get pod -l component=cinder-volume | grep Running
cinder-volume-75cb47f65-92rxq   3/3     Running   0

Patch the GlanceAPI service that is deployed in the control plane:

$ oc patch openstackcontrolplane openstack --type=merge --patch-file=glance_cinder.patch

4.8.3. Adopting the Image service that is deployed with an NFS back end

Adopt the Image Service (glance) that you deployed with an NFS back end. To complete the following procedure, ensure that your environment meets the following criteria:

The Storage network is propagated to the OpenStack (OSP) control plane.
The Image service can reach the Storage network and connect to the nfs-server through the port 2049.

Prerequisites

You have completed the previous adoption steps.

In the source cloud, verify the NFS parameters that the overcloud uses to configure the Image service back end. Specifically, in yourTripleO heat templates, find the following variables that override the default content that is provided by the glance-nfs.yaml file in the /usr/share/openstack-tripleo-heat-templates/environments/storage directory:

GlanceBackend: file
GlanceNfsEnabled: true
GlanceNfsShare: 192.168.24.1:/var/nfs

In this example, the GlanceBackend variable shows that the Image service has no notion of an NFS back end. The variable is using the File driver and, in the background, the filesystem_store_datadir. The filesystem_store_datadir is mapped to the export value provided by the GlanceNfsShare variable instead of /var/lib/glance/images/. If you do not export the GlanceNfsShare through a network that is propagated to the adopted Red Hat OpenStack Services on OpenShift (RHOSO) control plane, you must stop the nfs-server and remap the export to the storage network. Before doing so, ensure that the Image service is stopped in the source Controller nodes. In the control plane, as per the (network isolation diagram, the Image service is attached to the Storage network, propagated via the associated NetworkAttachmentsDefinition custom resource, and the resulting Pods have already the right permissions to handle the Image service traffic through this network.

In a deployed OSP control plane, you can verify that the network mapping matches with what has been deployed in the TripleO-based environment by checking both the NodeNetworkConfigPolicy (nncp) and the NetworkAttachmentDefinition (net-attach-def). The following is an example of the output that you should check in the OpenShift environment to make sure that there are no issues with the propagated networks:

$ oc get nncp
NAME                        STATUS      REASON
enp6s0-crc-8cf2w-master-0   Available   SuccessfullyConfigured

$ oc get net-attach-def
NAME
ctlplane
internalapi
storage
tenant

$ oc get ipaddresspool -n metallb-system
NAME          AUTO ASSIGN   AVOID BUGGY IPS   ADDRESSES
ctlplane      true          false             ["192.168.122.80-192.168.122.90"]
internalapi   true          false             ["172.17.0.80-172.17.0.90"]
storage       true          false             ["172.18.0.80-172.18.0.90"]
tenant        true          false             ["172.19.0.80-172.19.0.90"]

Procedure

Adopt the Image service and create a new default GlanceAPI instance that is connected with the existing NFS share:

$ cat << EOF > glance_nfs_patch.yaml

spec:
  extraMounts:
  - extraVol:
    - extraVolType: Nfs
      mounts:
      - mountPath: /var/lib/glance/images
        name: nfs
      propagation:
      - Glance
      volumes:
      - name: nfs
        nfs:
          path: <exported_path>
          server: <ip_address>
    name: r1
    region: r1
  glance:
    enabled: true
    template:
      databaseInstance: openstack
      customServiceConfig: |
        [DEFAULT]
        enabled_backends = default_backend:file
        [glance_store]
        default_backend = default_backend
        [default_backend]
        filesystem_store_datadir = /var/lib/glance/images/
      storage:
        storageRequest: 10G
      keystoneEndpoint: nfs
      glanceAPIs:
        nfs:
          replicas: 3
          type: single
          override:
            service:
              internal:
                metadata:
                  annotations:
                    metallb.universe.tf/address-pool: internalapi
                    metallb.universe.tf/allow-shared-ip: internalapi
                    metallb.universe.tf/loadBalancerIPs: <172.17.0.80>
                spec:
                  type: LoadBalancer
          networkAttachments:
          - storage
EOF

where:

<exported_path>: Specifies the exported path in the nfs-server.
<ip_address>: Specifies the IP address that you use to communicate with the nfs-server.
<172.17.0.80>: Specifies the load balancer IP in your environment. If you use IPv6, change the load balancer IP to the load balancer IP in your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.

Patch the OpenStackControlPlane CR to deploy the Image service with an NFS back end:

$ oc patch openstackcontrolplane openstack --type=merge --patch-file glance_nfs_patch.yaml

Patch the OpenStackControlPlane CR to remove the default Image service:

$ oc patch openstackcontrolplane openstack --type=json -p="[{'op': 'remove', 'path': '/spec/glance/template/glanceAPIs/default'}]"

Verification

When GlanceAPI is active, confirm that you can see a single API instance:

$ oc get pods -l service=glance
NAME                      READY   STATUS    RESTARTS
glance-nfs-single-0   2/2     Running   0
glance-nfs-single-1   2/2     Running   0
glance-nfs-single-2   2/2     Running   0

Ensure that the description of the pod reports the following output:

Mounts:
...
  nfs:
    Type:      NFS (an NFS mount that lasts the lifetime of a pod)
    Server:    {{ server ip address }}
    Path:      {{ nfs export path }}
    ReadOnly:  false
...

Check that the mountpoint that points to /var/lib/glance/images is mapped to the expected nfs server ip and nfs path that you defined in the new default GlanceAPI instance:

$ oc rsh -c glance-api glance-default-single-0

sh-5.1# mount
...
...
{{ ip address }}:/var/nfs on /var/lib/glance/images type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.18.0.5,local_lock=none,addr=172.18.0.5)
...
...

Confirm that the UUID is created in the exported directory on the NFS node. For example:

$ oc rsh openstackclient
$ openstack image list

sh-5.1$  curl -L -o /tmp/cirros-0.6.3-x86_64-disk.img http://download.cirros-cloud.net/0.6.3/cirros-0.6.3-x86_64-disk.img
...
...

sh-5.1$ openstack image create --container-format bare --disk-format raw --file /tmp/cirros-0.6.3-x86_64-disk.img cirros
...
...

sh-5.1$ openstack image list
+--------------------------------------+--------+--------+
| ID                                   | Name   | Status |
+--------------------------------------+--------+--------+
| 634482ca-4002-4a6d-b1d5-64502ad02630 | cirros | active |
+--------------------------------------+--------+--------+

On the nfs-server node, the same uuid is in the exported /var/nfs:
```
$ ls /var/nfs/
634482ca-4002-4a6d-b1d5-64502ad02630
```

4.8.4. Adopting the Image service that is deployed with a Ceph back end

Adopt the Image Service (glance) that you deployed with a Ceph back end. Use the customServiceConfig parameter to inject the right configuration to the GlanceAPI instance.

Prerequisites

You have completed the previous adoption steps.

Ensure that the Ceph-related secret (ceph-conf-files) is created in the openstack namespace and that the extraMounts property of the OpenStackControlPlane custom resource (CR) is configured properly. For more information, see Configuring a Ceph back end.

$ cat << EOF > glance_patch.yaml
spec:
  glance:
    enabled: true
    template:
      databaseInstance: openstack
      customServiceConfig: |
        [DEFAULT]
        enabled_backends=default_backend:rbd
        [glance_store]
        default_backend=default_backend
        [default_backend]
        rbd_store_ceph_conf=/etc/ceph/ceph.conf
        rbd_store_user=openstack
        rbd_store_pool=images
        store_description=Ceph glance store backend.
      storage:
        storageRequest: 10G
      glanceAPIs:
        default:
          replicas: 3
          override:
            service:
              internal:
                metadata:
                  annotations:
                    metallb.universe.tf/address-pool: internalapi
                    metallb.universe.tf/allow-shared-ip: internalapi
                    metallb.universe.tf/loadBalancerIPs: <172.17.0.80>
                spec:
                  type: LoadBalancer
          networkAttachments:
          - storage
EOF

where:

<172.17.0.80>: Specifies the load balancer IP. If you use IPv6, change the load balancer IP to the load balancer IP in your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.

If you backed up your OpenStack (OSP) services configuration file from the original environment, you can compare it with the confgiuration file that you adopted and ensure that the configuration is correct. For more information, see Pulling the configuration from a TripleO deployment.

os-diff diff /tmp/collect_tripleo_configs/glance/etc/glance/glance-api.conf glance_patch.yaml --crd

This command produces the difference between both ini configuration files.

Procedure

Patch the OpenStackControlPlane CR to deploy the Image service with a Ceph back end:

$ oc patch openstackcontrolplane openstack --type=merge --patch-file glance_patch.yaml

4.8.5. Verifying the Image service adoption

Verify that you adopted the Image Service (glance) to the Red Hat OpenStack Services on OpenShift (RHOSO) Antelope deployment.

Procedure

Test the Image service from the OpenStack CLI. You can compare and ensure that the configuration is applied to the Image service pods:
```
$ os-diff diff /etc/glance/glance.conf.d/02-config.conf glance_patch.yaml --frompod -p glance-api
```
If no line appears, then the configuration is correct.

Inspect the resulting Image service pods:

GLANCE_POD=`oc get pod |grep glance-default | cut -f 1 -d' ' | head -n 1`
oc exec -t $GLANCE_POD -c glance-api -- cat /etc/glance/glance.conf.d/02-config.conf

[DEFAULT]
enabled_backends=default_backend:rbd
[glance_store]
default_backend=default_backend
[default_backend]
rbd_store_ceph_conf=/etc/ceph/ceph.conf
rbd_store_user=openstack
rbd_store_pool=images
store_description=Ceph glance store backend.

If you use a Ceph back end, ensure that the Ceph secrets are mounted:

$ oc exec -t $GLANCE_POD -c glance-api -- ls /etc/ceph
ceph.client.openstack.keyring
ceph.conf

Check that the service is active, and that the endpoints are updated in the OSP CLI:

$ oc rsh openstackclient
$ openstack service list | grep image

| fc52dbffef36434d906eeb99adfc6186 | glance    | image        |

$ openstack endpoint list | grep image

| 569ed81064f84d4a91e0d2d807e4c1f1 | regionOne | glance       | image        | True    | internal  | http://glance-internal-openstack.apps-crc.testing   |
| 5843fae70cba4e73b29d4aff3e8b616c | regionOne | glance       | image        | True    | public    | http://glance-public-openstack.apps-crc.testing     |

Check that the images that you previously listed in the source cloud are available in the adopted service:

$ openstack image list
+--------------------------------------+--------+--------+
| ID                                   | Name   | Status |
+--------------------------------------+--------+--------+
| c3158cad-d50b-452f-bec1-f250562f5c1f | cirros | active |
+--------------------------------------+--------+--------+

Test that you can create an image on the adopted service:

(openstack)$ alias openstack="oc exec -t openstackclient -- openstack"
(openstack)$ curl -L -o /tmp/cirros-0.6.3-x86_64-disk.img http://download.cirros-cloud.net/0.6.3/cirros-0.6.3-x86_64-disk.img
    qemu-img convert -O raw /tmp/cirros-0.6.3-x86_64-disk.img /tmp/cirros-0.6.3-x86_64-disk.img.raw
    openstack image create --container-format bare --disk-format raw --file /tmp/cirros-0.6.3-x86_64-disk.img.raw cirros2
    openstack image list
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   273  100   273    0     0   1525      0 --:--:-- --:--:-- --:--:--  1533
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 15.5M  100 15.5M    0     0  17.4M      0 --:--:-- --:--:-- --:--:-- 17.4M

+------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
| Field            | Value                                                                                                                                      |
+------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
| container_format | bare                                                                                                                                       |
| created_at       | 2023-01-31T21:12:56Z                                                                                                                       |
| disk_format      | raw                                                                                                                                        |
| file             | /v2/images/46a3eac1-7224-40bc-9083-f2f0cd122ba4/file                                                                                       |
| id               | 46a3eac1-7224-40bc-9083-f2f0cd122ba4                                                                                                       |
| min_disk         | 0                                                                                                                                          |
| min_ram          | 0                                                                                                                                          |
| name             | cirros                                                                                                                                     |
| owner            | 9f7e8fdc50f34b658cfaee9c48e5e12d                                                                                                           |
| properties       | os_hidden='False', owner_specified.openstack.md5='', owner_specified.openstack.object='images/cirros', owner_specified.openstack.sha256='' |
| protected        | False                                                                                                                                      |
| schema           | /v2/schemas/image                                                                                                                          |
| status           | queued                                                                                                                                     |
| tags             |                                                                                                                                            |
| updated_at       | 2023-01-31T21:12:56Z                                                                                                                       |
| visibility       | shared                                                                                                                                     |
+------------------+--------------------------------------------------------------------------------------------------------------------------------------------+

+--------------------------------------+--------+--------+
| ID                                   | Name   | Status |
+--------------------------------------+--------+--------+
| 46a3eac1-7224-40bc-9083-f2f0cd122ba4 | cirros2| active |
| c3158cad-d50b-452f-bec1-f250562f5c1f | cirros | active |
+--------------------------------------+--------+--------+


(openstack)$ oc rsh ceph
sh-4.4$ ceph -s
r  cluster:
    id:     432d9a34-9cee-4109-b705-0c59e8973983
    health: HEALTH_OK

  services:
    mon: 1 daemons, quorum a (age 4h)
    mgr: a(active, since 4h)
    osd: 1 osds: 1 up (since 4h), 1 in (since 4h)

  data:
    pools:   5 pools, 160 pgs
    objects: 46 objects, 224 MiB
    usage:   247 MiB used, 6.8 GiB / 7.0 GiB avail
    pgs:     160 active+clean

sh-4.4$ rbd -p images ls
46a3eac1-7224-40bc-9083-f2f0cd122ba4
c3158cad-d50b-452f-bec1-f250562f5c1f

4.9. Adopting the Placement service

To adopt the Placement service, you patch an existing OpenStackControlPlane custom resource (CR) that has the Placement service disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack (OSP) environment.

Prerequisites

You import your databases to MariaDB instances on the control plane. For more information, see Migrating databases to MariaDB instances.
You adopt the Identity service (keystone). For more information, see Adopting the Identity service.

Procedure

Patch the OpenStackControlPlane CR to deploy the Placement service:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  placement:
    enabled: true
    apiOverride:
      route: {}
    template:
      databaseInstance: openstack
      databaseAccount: placement
      secret: osp-secret
      override:
        service:
          internal:
            metadata:
              annotations:
                metallb.universe.tf/address-pool: internalapi
                metallb.universe.tf/allow-shared-ip: internalapi
                metallb.universe.tf/loadBalancerIPs: <172.17.0.80>
            spec:
              type: LoadBalancer
'

where:

<172.17.0.80>: Specifies the load balancer IP in your environment. If you use IPv6, change the load balancer IP to the load balancer IP in your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.

Verification

Check that the Placement service endpoints are defined and pointing to the control plane FQDNs, and that the Placement API responds:

$ alias openstack="oc exec -t openstackclient -- openstack"

$ openstack endpoint list | grep placement


# Without OpenStack CLI placement plugin installed:
$ PLACEMENT_PUBLIC_URL=$(openstack endpoint list -c 'Service Name' -c 'Service Type' -c URL | grep placement | grep public | awk '{ print $6; }')
$ oc exec -t openstackclient -- curl "$PLACEMENT_PUBLIC_URL"

# With OpenStack CLI placement plugin installed:
$ openstack resource class list

4.10. Adopting the Bare Metal Provisioning service

Review information about your Bare Metal Provisioning service (ironic) configuration and then adopt the Bare Metal Provisioning service to the Red Hat OpenStack Services on OpenShift control plane.

4.10.1. Bare Metal Provisioning service configurations

You configure the Bare Metal Provisioning service (ironic) by using configuration snippets. For more information about configuring the control plane with the Bare Metal Provisioning service, see Customizing the Red Hat OpenStack Services on OpenShift deployment.

Some Bare Metal Provisioning service configuration is overridden in TripleO, for example, PXE Loader file names are often overridden at intermediate layers. You must pay attention to the settings you apply in your Red Hat OpenStack Services on OpenShift (RHOSO) deployment. The ironic-operator applies a reasonable working default configuration, but if you override them with your prior configuration, your experience might not be ideal or your new Bare Metal Provisioning service fails to operate. Similarly, additional configuration might be necessary, for example, if you enable and use additional hardware types in your ironic.conf file.

The model of reasonable defaults includes commonly used hardware-types and driver interfaces. For example, the redfish-virtual-media boot interface and the ramdisk deploy interface are enabled by default. If you add new bare metal nodes after the adoption is complete, the driver interface selection occurs based on the order of precedence in the configuration if you do not explicitly set it on the node creation request or as an established default in the ironic.conf file.

Some configuration parameters do not need to be set on an individual node level, for example, network UUID values, or they are centrally configured in the ironic.conf file, as the setting controls security behavior.

It is critical that you maintain the following parameters that you configured and formatted as [section] and parameter name from the prior deployment to the new deployment. These parameters that govern the underlying behavior and values in the previous configuration would have used specific values if set.

[neutron]cleaning_network
[neutron]provisioning_network
[neutron]rescuing_network
[neutron]inspection_network
[conductor]automated_clean
[deploy]erase_devices_priority
[deploy]erase_devices_metadata_priority
[conductor]force_power_state_during_sync

You can set the following parameters individually on a node. However, you might choose to use embedded configuration options to avoid the need to set the parameters individually when creating or managing bare metal nodes. Check your prior ironic.conf file for these parameters, and if set, apply a specific override configuration.

[conductor]bootloader
[conductor]rescue_ramdisk
[conductor]rescue_kernel
[conductor]deploy_kernel
[conductor]deploy_ramdisk

The instances of kernel_append_params, formerly pxe_append_params in the [pxe] and [redfish] configuration sections, are used to apply boot time options like "console" for the deployment ramdisk and as such often must be changed.

You cannot migrate hardware types that are set with the ironic.conf file enabled_hardware_types parameter, and hardware type driver interfaces starting with staging- into the adopted configuration.

4.10.2. Deploying the Bare Metal Provisioning service

To deploy the Bare Metal Provisioning service (ironic), you patch an existing OpenStackControlPlane custom resource (CR) that has the Bare Metal Provisioning service disabled. The ironic-operator applies the configuration and starts the Bare Metal Provisioning services. After the services are running, the Bare Metal Provisioning service automatically begins polling the power state of the bare-metal nodes that it manages.

By default, RHOSO versions Antelope and later of the Bare Metal Provisioning service include a new multi-tenant aware role-based access control (RBAC) model. As a result, bare-metal nodes might be missing when you run the openstack baremetal node list command after you adopt the Bare Metal Provisioning service. Your nodes are not deleted. Due to the increased access restrictions in the RBAC model, you must identify which project owns the missing bare-metal nodes and set the owner field on each missing bare-metal node.

Prerequisites

You have imported the service databases into the control plane database.
The Bare Metal Provisioning service is disabled in the RHOSO Antelope. The following command should return a string of false:
```
$ oc get openstackcontrolplanes.core.openstack.org <name> -o jsonpath='{.spec.ironic.enabled}'
```
- Replace <name> with the name of your existing OpenStackControlPlane CR, for example, openstack-control-plane.
The Identity service (keystone), Networking service (neutron), and Image Service (glance) are operational.

If you use the Bare Metal Provisioning service in a Bare Metal as a Service configuration, do not adopt the Compute service (nova) before you adopt the Bare Metal Provisioning service.
For the Bare Metal Provisioning service conductor services, the services must be able to reach Baseboard Management Controllers of hardware that is configured to be managed by the Bare Metal Provisioning service. If this hardware is unreachable, the nodes might enter "maintenance" state and be unavailable until connectivity is restored later.

You have downloaded the ironic.conf file locally:

$CONTROLLER1_SSH cat /var/lib/config-data/puppet-generated/ironic/etc/ironic/ironic.conf > ironic.conf

This configuration file must come from one of the Controller nodes and not a TripleO undercloud node. The TripleO undercloud node operates with different configuration that does not apply when you adopt the Overcloud Ironic deployment.

If you are adopting the Ironic Inspector service, you need the value of the IronicInspectorSubnets TripleO parameter. Use the same values to populate the dhcpRanges parameter in the RHOSO environment.
You have defined the following shell variables. Replace the following example values with values that apply to your environment:
```
$ alias openstack="oc exec -t openstackclient -- openstack"
```

Procedure

Patch the OpenStackControlPlane custom resource (CR) to deploy the Bare Metal Provisioning service:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  ironic:
    enabled: true
    template:
      rpcTransport: oslo
      databaseInstance: openstack
      ironicAPI:
        replicas: 1
        override:
          service:
            internal:
              metadata:
                annotations:
                  metallb.universe.tf/address-pool: internalapi
                  metallb.universe.tf/allow-shared-ip: internalapi
                  metallb.universe.tf/loadBalancerIPs: <loadBalancer_IP>
              spec:
                type: LoadBalancer
      ironicConductors:
      - replicas: 1
        networkAttachments:
          - baremetal
        provisionNetwork: baremetal
        storageRequest: 10G
        customServiceConfig: |
          [neutron]
          cleaning_network=<cleaning network uuid>
          provisioning_network=<provisioning network uuid>
          rescuing_network=<rescuing network uuid>
          inspection_network=<introspection network uuid>
          [conductor]
          automated_clean=true
      ironicInspector:
        replicas: 1
        inspectionNetwork: baremetal
        networkAttachments:
          - baremetal
        dhcpRanges:
          - name: inspector-0
            cidr: 172.20.1.0/24
            start: 172.20.1.190
            end: 172.20.1.199
            gateway: 172.20.1.1
        serviceUser: ironic-inspector
        databaseAccount: ironic-inspector
        passwordSelectors:
          database: IronicInspectorDatabasePassword
          service: IronicInspectorPassword
      ironicNeutronAgent:
        replicas: 1
        messagingBus:
          cluster: rabbitmq
      secret: osp-secret
'

where:

<loadBalancer_IP>: Specifies the load balancer IP in your environment. If you use IPv6, change the load balancer IP to the load balancer IP in your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.

Wait for the Bare Metal Provisioning service control plane services CRs to become ready:
```
$ oc wait --for condition=Ready --timeout=300s ironics.ironic.openstack.org ironic
```

Verify that the individual services are ready:

$ oc wait --for condition=Ready --timeout=300s ironicapis.ironic.openstack.org ironic-api
$ oc wait --for condition=Ready --timeout=300s ironicconductors.ironic.openstack.org ironic-conductor
$ oc wait --for condition=Ready --timeout=300s ironicinspectors.ironic.openstack.org ironic-inspector
$ oc wait --for condition=Ready --timeout=300s ironicneutronagents.ironic.openstack.org ironic-ironic-neutron-agent

Update the DNS Nameservers on the provisioning, cleaning, and rescue networks:

For name resolution to work for Bare Metal Provisioning service operations, you must set the DNS nameserver to use the internal DNS servers in the RHOSO control plane:
```
$ openstack subnet set --dns-nameserver 192.168.122.80 provisioning-subnet
```

Verify that no Bare Metal Provisioning service nodes are missing from the node list:

$ openstack baremetal node list

If the openstack baremetal node list command output reports an incorrect power status, wait a few minutes and re-run the command to see if the output syncs with the actual state of the hardware being managed. The time required for the Bare Metal Provisioning service to review and reconcile the power state of bare-metal nodes depends on the number of operating conductors through the replicas parameter and which are present in the Bare Metal Provisioning service deployment being adopted.

If any Bare Metal Provisioning service nodes are missing from the openstack baremetal node list command, temporarily disable the new RBAC policy to see the nodes again:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  ironic:
    enabled: true
    template:
      databaseInstance: openstack
      ironicAPI:
        replicas: 1
        customServiceConfig: |
          [oslo_policy]
          enforce_scope=false
          enforce_new_defaults=false
'

After this configuration is applied, the operator restarts the Ironic API service and disables the new RBAC policy that is enabled by default.

View the bare-metal nodes that do not have an owner assigned:

$ openstack baremetal node list --long -c UUID -c Owner -c 'Provisioning State'

Assign all bare-metal nodes with no owner to a new project, for example, the admin project:

ADMIN_PROJECT_ID=$(openstack project show -c id -f value --domain default admin)
for node in $(openstack baremetal node list -f json -c UUID -c Owner | jq -r '.[] | select(.Owner == null) | .UUID'); do openstack baremetal node set --owner $ADMIN_PROJECT_ID $node; done

Re-apply the default RBAC by removing the customServiceConfig section or by setting the following values in the customServiceConfig section to true. For example:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  ironic:
    enabled: true
    template:
      databaseInstance: openstack
      ironicAPI:
        replicas: 1
        customServiceConfig: |
          [oslo_policy]
          enforce_scope=true
          enforce_new_defaults=true
'

Verification

Verify the list of endpoints:
```
$ openstack endpoint list |grep ironic
```
Verify the list of bare-metal nodes:
```
$ openstack baremetal node list
```

Reset the deploy images on all bare-metal nodes to use the new centrally configured images:

After adoption, bare-metal nodes might still reference the old deployment’s kernel and ramdisk images in their driver_info fields. Resetting these values causes the Bare Metal Provisioning service to use the new centrally configured deploy_kernel and deploy_ramdisk values from the ironic.conf file.

for node in $(openstack baremetal node list -c UUID -f value); do
  openstack baremetal node set $node \
    --driver-info deploy_ramdisk= \
    --driver-info deploy_kernel=
done

4.11. Adopting the Compute service

To adopt the Compute service (nova), you patch an existing OpenStackControlPlane custom resource (CR) where the Compute service is disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack (OSP) environment. The following procedure describes a single-cell setup.

Prerequisites

You have completed the previous adoption steps.
You have defined the following shell variables. Replace the following example values with the values that are correct for your environment:
```
alias openstack="oc exec -t openstackclient -- openstack"

DEFAULT_CELL_NAME="cell3"
RENAMED_CELLS="cell1 cell2 $DEFAULT_CELL_NAME"
```
- The source cloud default cell takes a new $DEFAULT_CELL_NAME. In a multi-cell adoption scenario, the default cell might retain its original name,DEFAULT_CELL_NAME=default, or become renamed as a cell that is free for use. Do not use other existing cell names for DEFAULT_CELL_NAME, except for default.
- If you deployed the source cloud with a default cell, and want to rename it during adoption, define the new name that you want to use, as shown in the following example:
  DEFAULT_CELL_NAME="cell1" RENAMED_CELLS="cell1"

Procedure

Patch the OpenStackControlPlane CR to deploy the Compute service:

This procedure assumes that Compute service metadata is deployed on the top level and not on each cell level. If the OSP deployment has a per-cell metadata deployment, adjust the following patch as needed. You cannot run the metadata service in cell0. To enable the metadata services of a local cell, set the enabled property in the metadataServiceTemplate field of the local cell to true in the OpenStackControlPlane CR.

$ rm -f celltemplates
$ for CELL in $(echo $RENAMED_CELLS); do
>  cat >> celltemplates << EOF
>       ${CELL}:
>         hasAPIAccess: true
>         cellDatabaseAccount: nova-$CELL
>         cellDatabaseInstance: openstack-$CELL
>         cellMessageBusInstance: rabbitmq-$CELL
>         metadataServiceTemplate:
>           enabled: false
>           override:
>               service:
>                 metadata:
>                   annotations:
>                     metallb.universe.tf/address-pool: internalapi
>                     metallb.universe.tf/allow-shared-ip: internalapi
>                     metallb.universe.tf/loadBalancerIPs: 172.17.0.$(( 79 + ${CELL##*cell} ))
>                 spec:
>                   type: LoadBalancer
>           customServiceConfig: |
>             [workarounds]
>             disable_compute_service_check_for_ffu=true
>         conductorServiceTemplate:
>           customServiceConfig: |
>             [workarounds]
>             disable_compute_service_check_for_ffu=true
>EOF
>done

$ cat > oscp-patch.yaml << EOF
>spec:
>  nova:
>   enabled: true
>   apiOverride:
>     route: {}
>   template:
>     secret: osp-secret
>     apiDatabaseAccount: nova-api
>     apiServiceTemplate:
>       override:
>         service:
>           internal:
>             metadata:
>               annotations:
>                 metallb.universe.tf/address-pool: internalapi
>                 metallb.universe.tf/allow-shared-ip: internalapi
>                 metallb.universe.tf/loadBalancerIPs: <172.17.0.80>
>             spec:
>               type: LoadBalancer
>       customServiceConfig: |
>         [workarounds]
>         disable_compute_service_check_for_ffu=true
>     metadataServiceTemplate:
>       enabled: true
>       override:
>         service:
>           metadata:
>             annotations:
>               metallb.universe.tf/address-pool: internalapi
>               metallb.universe.tf/allow-shared-ip: internalapi
>               metallb.universe.tf/loadBalancerIPs: <172.17.0.80>
>           spec:
>             type: LoadBalancer
>       customServiceConfig: |
>         [workarounds]
>         disable_compute_service_check_for_ffu=true
>     schedulerServiceTemplate:
>       customServiceConfig: |
>         [workarounds]
>         disable_compute_service_check_for_ffu=true
>     cellTemplates:
>       cell0:
>         hasAPIAccess: true
>         cellDatabaseAccount: nova-cell0
>         cellDatabaseInstance: openstack
>         cellMessageBusInstance: rabbitmq
>         conductorServiceTemplate:
>           customServiceConfig: |
>             [workarounds]
>             disable_compute_service_check_for_ffu=true
>EOF
$ cat celltemplates >> oscp-patch.yaml
$ oc patch openstackcontrolplane openstack  --type=merge --patch-file=oscp-patch.yaml

${CELL}.hasAPIAccess specifies upcall access to the API. In the source cloud, cells are always configured with the main Nova API database upcall access. You can disable upcall access to the API by setting hasAPIAccess to false. However, do not make changes to the API during adoption.
${CELL}.cellDatabaseInstance specifies the database instance that is used by the cell. The database instance names must match the names that are defined in the OpenStackControlPlane CR that you created in when you deployed the back-end services as described in Deploying back-end services.
${CELL}.cellMessageBusInstance specifies the message bus instance that is used by the cell. The message bus instance names must match the names that are defined in the OpenStackControlPlane CR.
metallb.universe.tf/loadBalancerIPs: <172.17.0.80> specifies the load balancer IP in your environment. If you use IPv6, change the load balancer IP to the load balancer IP in your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.

If you are adopting the Compute service with the Bare Metal Provisioning service (ironic), append the novaComputeTemplates field with the following content in each cell in the Compute service CR patch. For example:

        cell1:
          novaComputeTemplates:
            standalone:
              customServiceConfig: |
                [DEFAULT]
                host = <hostname>
                [workarounds]
                disable_compute_service_check_for_ffu=true
              computeDriver: ironic.IronicDriver
        ...

Replace <hostname> with the hostname of the node that is running the ironic Compute driver in the source cloud.

Wait for the CRs for the Compute control plane services to be ready:

$ oc wait --for condition=Ready --timeout=300s Nova/nova

The local Conductor services are started for each cell, while the superconductor runs in cell0. Note that disable_compute_service_check_for_ffu is mandatory for all imported Compute services until the external data plane is imported, and until the Compute services are fast-forward upgraded. For more information, see Adopting Compute services to the RHOSO data plane and Upgrading Compute services.

Verification

Check that Compute service endpoints are defined and pointing to the control plane FQDNs, and that the Nova API responds:
```
$ openstack endpoint list | grep nova
$ openstack server list
```
- Compare the outputs with the topology-specific configuration in Retrieving topology-specific service configuration.
Query the superconductor to check that the expected cells exist, and compare it to its pre-adoption values:
```
for CELL in $(echo $CELLS); do
set +u
. ~/.source_cloud_exported_variables_$CELL
set -u
RCELL=$CELL
[ "$CELL" = "default" ] && RCELL=$DEFAULT_CELL_NAME

echo "comparing $CELL to $RCELL"
echo $PULL_OPENSTACK_CONFIGURATION_NOVAMANAGE_CELL_MAPPINGS | grep -F "| $CELL |"
oc rsh nova-cell0-conductor-0 nova-manage cell_v2 list_cells | grep -F "| $RCELL |"
done
```
The following changes are expected for each cell:
- The cellX nova database and username become nova_cellX.
- The default cell is renamed to DEFAULT_CELL_NAME. The default cell might retain the original name if there are multiple cells.
- The RabbitMQ transport URL no longer uses guest.

At this point, the Compute service control plane services do not control the existing Compute service workloads. The control plane manages the data plane only after the data adoption process is completed. For more information, see Adopting Compute services to the RHOSO data plane.

To import external Compute services to the RHOSO data plane, you must upgrade them first. For more information, see Adopting Compute services to the RHOSO data plane, and Performing a fast-forward upgrade on Compute services.

4.12. Adopting the Block Storage service

To adopt a TripleO-deployed Block Storage service (cinder), create the manifest based on the existing cinder.conf file, deploy the Block Storage service, and validate the new deployment.

Prerequisites

You have reviewed the Block Storage service limitations. For more information, see Limitations for adopting the Block Storage service.
You have planned the placement of the Block Storage services.
You have prepared the OpenShift nodes where the volume and backup services run. For more information, see OCP preparation for Block Storage service adoption.
The Block Storage service (cinder) is stopped.
The service databases are imported into the control plane MariaDB.
The Identity service (keystone) is adopted.
If your OpenStack deployment included the Key Manager service (barbican), the Key Manager service is adopted.
The Storage network is correctly configured on the OCP cluster.

The contents of cinder.conf file. Download the file so that you can access it locally:

$CONTROLLER1_SSH cat /var/lib/config-data/puppet-generated/cinder/etc/cinder/cinder.conf > cinder.conf

Procedure

Create a new file, for example, cinder_api.patch, and apply the configuration:

$ oc patch openstackcontrolplane openstack --type=merge --patch-file=<patch_name>

Replace <patch_name> with the name of your patch file.

The following example shows a cinder_api.patch file:

spec:
  extraMounts:
  - extraVol:
    - extraVolType: Ceph
      mounts:
      - mountPath: /etc/ceph
        name: ceph
        readOnly: true
      propagation:
      - CinderVolume
      - CinderBackup
      - Glance
      volumes:
      - name: ceph
        projected:
          sources:
          - secret:
              name: ceph-conf-files
  cinder:
    enabled: true
    apiOverride:
      route: {}
    template:
      databaseInstance: openstack
      databaseAccount: cinder
      secret: osp-secret
      cinderAPI:
        override:
          service:
            internal:
              metadata:
                annotations:
                  metallb.universe.tf/address-pool: internalapi
                  metallb.universe.tf/allow-shared-ip: internalapi
                  metallb.universe.tf/loadBalancerIPs: <172.17.0.80>
              spec:
                type: LoadBalancer
        replicas: 1
        customServiceConfig: |
          [DEFAULT]
          default_volume_type=tripleo
      cinderScheduler:
        replicas: 0
      cinderBackup:
        networkAttachments:
        - storage
        replicas: 0
      cinderVolumes:
        ceph:
          networkAttachments:
          - storage
          replicas: 0

where:

<172.17.0.80>: Specifies the load balancer IP in your environment. If you use IPv6, change the load balancer IP to the load balancer IP in your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.

Retrieve the list of the previous scheduler and backup services:

$ openstack volume service list

+------------------+------------------------+------+---------+-------+----------------------------+
| Binary           | Host                   | Zone | Status  | State | Updated At                 |
+------------------+------------------------+------+---------+-------+----------------------------+
| cinder-scheduler | standalone.localdomain | nova | enabled | down  | 2024-11-04T17:47:14.000000 |
| cinder-backup    | standalone.localdomain | nova | enabled | down  | 2024-11-04T17:47:14.000000 |
| cinder-volume    | hostgroup@tripleo_ceph | nova | enabled | down  | 2024-11-04T17:47:14.000000 |
+------------------+------------------------+------+---------+-------+----------------------------+

Remove services for hosts that are in the down state:
```
$ oc exec -t cinder-api-0 -c cinder-api -- cinder-manage service remove <service_binary> <service_host>
```
- Replace <service_binary> with the name of the binary, for example, cinder-backup.
- Replace <service_host> with the host name, for example, cinder-backup-0.

Deploy the scheduler, backup, and volume services:

Create another file, for example, cinder_services.patch, and apply the configuration:
```
$ oc patch openstackcontrolplane openstack --type=merge --patch-file=<patch_name>
```
Replace <patch_name> with the name of your patch file.

The following example shows a cinder_services.patch file for a Ceph RBD deployment:

spec:
  cinder:
    enabled: true
    template:
      cinderScheduler:
        replicas: 1
      cinderBackup:
        networkAttachments:
        - storage
        replicas: 1
        customServiceConfig: |
          [DEFAULT]
          backup_driver=cinder.backup.drivers.ceph.CephBackupDriver
          backup_ceph_conf=/etc/ceph/ceph.conf
          backup_ceph_user=openstack
          backup_ceph_pool=backups
      cinderVolumes:
        ceph:
          networkAttachments:
          - storage
          replicas: 1
          customServiceConfig: |
            [tripleo_ceph]
            backend_host=hostgroup
            volume_backend_name=tripleo_ceph
            volume_driver=cinder.volume.drivers.rbd.RBDDriver
            rbd_ceph_conf=/etc/ceph/ceph.conf
            rbd_user=openstack
            rbd_pool=volumes
            rbd_flatten_volume_from_snapshot=False
            report_discard_supported=True

Ensure that you use the same configuration group name for the driver that you used in the source cluster. In this example, the driver configuration group in customServiceConfig is called tripleo_ceph because it reflects the value of the configuration group name in the cinder.conf file of the source OpenStack cluster.

Configure the NetApp NFS Block Storage volume service:

Create a secret that includes sensitive information such as hostnames, passwords, and usernames to access the third-party NetApp NFS storage. You can find the credentials in the cinder.conf file that was generated from the TripleO deployment:

$ oc apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  labels:
    service: cinder
    component: cinder-volume
  name: cinder-volume-ontap-secrets
type: Opaque
stringData:
  ontap-cinder-secrets: |
    [tripleo_netapp]
    netapp_login= netapp_username
    netapp_password= netapp_password
    netapp_vserver= netapp_vserver
    nas_host= netapp_nfsip
    nas_share_path=/netapp_nfspath
    netapp_pool_name_search_pattern=(netapp_poolpattern)
EOF

Patch the OpenStackControlPlane CR to deploy NetApp NFS Block Storage volume back end:

$ oc patch openstackcontrolplane openstack --type=merge --patch-file=<cinder_netappNFS.patch>

Replace <cinder_netappNFS.patch> with the name of the patch file for your NetApp NFS Block Storage volume back end.

The following example shows a cinder_netappNFS.patch file that configures a NetApp NFS Block Storage volume service:

spec:
  cinder:
    enabled: true
    template:
      cinderVolumes:
        ontap-nfs:
          networkAttachments:
            - storage
          customServiceConfig: |
            [tripleo_netapp]
            volume_backend_name=ontap-nfs
            volume_driver=cinder.volume.drivers.netapp.common.NetAppDriver
            nfs_snapshot_support=true
            nas_secure_file_operations=false
            nas_secure_file_permissions=false
            netapp_server_hostname= netapp_backendip
            netapp_server_port=80
            netapp_storage_protocol=nfs
            netapp_storage_family=ontap_cluster
          customServiceConfigSecrets:
          - cinder-volume-ontap-secrets

Configure the NetApp iSCSI Block Storage volume service:

Create a secret that includes sensitive information such as hostnames, passwords, and usernames to access the third-party NetApp iSCSI storage. You can find the credentials in the cinder.conf file that was generated from the TripleO deployment:

$ oc apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  labels:
    service: cinder
    component: cinder-volume
  name: cinder-volume-ontap-secrets
type: Opaque
stringData:
  ontap-cinder-secrets: |
    [tripleo_netapp]
    netapp_server_hostname = netapp_host
    netapp_login = netapp_username
    netapp_password = netapp_password
    netapp_vserver = netapp_vserver
    netapp_pool_name_search_pattern=(netapp_poolpattern)
EOF

Patch the OpenStackControlPlane custom resource (CR) to deploy the NetApp iSCSI Block Storage volume back end:

$ oc patch openstackcontrolplane openstack --type=merge --patch-file=<cinder_netappISCSI.patch>

Replace <cinder_netappISCSI.patch> with the name of the patch file for your NetApp iSCSI Block Storage volume back end.

The following example shows a cinder_netappISCSI.patch file that configures a NetApp iSCSI Block Storage volume service:

spec:
  cinder:
    enabled: true
    template:
      cinderVolumes:
        ontap-iscsi:
          networkAttachments:
            - storage
          customServiceConfig: |
            [tripleo_netapp]
            volume_backend_name=ontap-iscsi
            volume_driver=cinder.volume.drivers.netapp.common.NetAppDriver
            netapp_storage_protocol=iscsi
            netapp_storage_family=ontap_cluster
            consistencygroup_support=True
          customServiceConfigSecrets:
            - cinder-volume-ontap-secrets

Check if all the services are up and running:

$ openstack volume service list

+------------------+--------------------------+------+---------+-------+----------------------------+
| Binary           | Host                     | Zone | Status  | State | Updated At                 |
+------------------+--------------------------+------+---------+-------+----------------------------+
| cinder-volume    | hostgroup@tripleo_netapp | nova | enabled | up    | 2023-06-28T17:00:03.000000 |
| cinder-scheduler | cinder-scheduler-0       | nova | enabled | up    | 2023-06-28T17:00:02.000000 |
| cinder-backup    | cinder-backup-0          | nova | enabled | up    | 2023-06-28T17:00:01.000000 |
+------------------+--------------------------+------+---------+-------+----------------------------+

Apply the DB data migrations:

You are not required to run the data migrations at this step, but you must run them before the next upgrade. However, for adoption, you can run the migrations now to ensure that there are no issues before you run production workloads on the deployment.

$ oc exec -it cinder-scheduler-0 -- cinder-manage db online_data_migrations

Verification

Ensure that the openstack alias is defined:

$ alias openstack="oc exec -t openstackclient -- openstack"

Confirm that Block Storage service endpoints are defined and pointing to the control plane FQDNs:
```
$ openstack endpoint list --service <endpoint>
```
- Replace <endpoint> with the name of the endpoint that you want to confirm.
Confirm that the Block Storage services are running:
```
$ openstack volume service list
```
Cinder API services do not appear in the list. However, if you get a response from the openstack volume service list command, that means at least one of the cinder API services is running.

Confirm that you have your previous volume types, volumes, snapshots, and backups:

$ openstack volume type list
$ openstack volume list
$ openstack volume snapshot list
$ openstack volume backup list

To confirm that the configuration is working, perform the following steps:

Create a volume from an image to check that the connection to Image Service (glance) is working:
```
$ openstack volume create --image cirros --bootable --size 1 disk_new
```

Back up the previous attached volume:

$ openstack --os-volume-api-version 3.47 volume create --backup <backup_name>

Replace <backup_name> with the name of your new backup location.

You do not boot a Compute service (nova) instance by using the new volume from image or try to detach the previous volume because the Compute service and the Block Storage service are still not connected.

4.13. Adopting the Dashboard service

To adopt the Dashboard service (horizon), you patch an existing OpenStackControlPlane custom resource (CR) that has the Dashboard service disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack environment.

Prerequisites

You adopted Memcached. For more information, see Deploying back-end services.
You adopted the Identity service (keystone). For more information, see Adopting the Identity service.

Procedure

Patch the OpenStackControlPlane CR to deploy the Dashboard service:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  horizon:
    enabled: true
    apiOverride:
      route: {}
    template:
      memcachedInstance: memcached
      secret: osp-secret
'

Verification

Verify that the Dashboard service instance is successfully deployed and ready:
```
$ oc get horizon
```

Confirm that the Dashboard service is reachable and returns a 200 status code:

PUBLIC_URL=$(oc get horizon horizon -o jsonpath='{.status.endpoint}')
curl --silent --output /dev/stderr --head --write-out "%{http_code}" "$PUBLIC_URL/dashboard/auth/login/?next=/dashboard/" -k | grep 200

4.14. Adopting the Shared File Systems service

The Shared File Systems service (manila) in Red Hat OpenStack Services on OpenShift (RHOSO) provides a self-service API to create and manage file shares. File shares (or "shares"), are built for concurrent read/write access from multiple clients. This makes the Shared File Systems service essential in cloud environments that require a ReadWriteMany persistent storage.

File shares in RHOSO require network access. Ensure that the networking in the OpenStack (OSP) environment matches the network plans for your new cloud after adoption. This ensures that tenant workloads remain connected to storage during the adoption process. The Shared File Systems service control plane services are not in the data path. Shutting down the API, scheduler, and share manager services do not impact access to existing shared file systems.

Typically, storage and storage device management are separate networks. Shared File Systems services only need access to the storage device management network. For example, if you used a Ceph Storage cluster in the deployment, the "storage" network refers to the Ceph Storage cluster’s public network, and the Shared File Systems service’s share manager service needs to be able to reach it.

The Shared File Systems service supports the following storage networking scenarios:

You can directly control the networking for your respective file shares.
The RHOSO administrator configures the storage networking.

4.14.1. Guidelines for preparing the Shared File Systems service configuration

To deploy Shared File Systems service (manila) on the control plane, you must copy the original configuration file from the OpenStack deployment. You must review the content in the file to make sure you are adopting the correct configuration for Red Hat OpenStack Services on OpenShift (RHOSO) Antelope. Not all of the content needs to be brought into the new cloud environment.

Review the following guidelines for preparing your Shared File Systems service configuration file for adoption:

The Shared File Systems service operator sets up the following configurations and can be ignored:
- Database-related configuration ([database])
- Service authentication (auth_strategy, [keystone_authtoken])
- Message bus configuration (transport_url, control_exchange)
- The default paste config (api_paste_config)
- Inter-service communication configuration ([neutron], [nova], [cinder], [glance] [oslo_messaging_*])
Ignore the osapi_share_listen configuration. In Red Hat OpenStack Services on OpenShift (RHOSO) Antelope, you rely on OpenShift routes and ingress.
Check for policy overrides. In RHOSO Antelope, the Shared File Systems service ships with a secure default Role-based access control (RBAC), and overrides might not be necessary. Please review RBAC defaults by using the Oslo policy generator tool.

If a custom policy is necessary, you must provide it as a ConfigMap. The following example spec illustrates how you can set up a ConfigMap called manila-policy with the contents of a file called policy.yaml:

  spec:
    manila:
      enabled: true
      template:
        manilaAPI:
          customServiceConfig: |
             [oslo_policy]
             policy_file=/etc/manila/policy.yaml
        extraMounts:
        - extraVol:
          - extraVolType: Undefined
            mounts:
            - mountPath: /etc/manila/
              name: policy
              readOnly: true
            propagation:
            - ManilaAPI
            volumes:
            - name: policy
              projected:
                sources:
                - configMap:
                    name: manila-policy
                    items:
                      - key: policy
                        path: policy.yaml

The value of the host option under the [DEFAULT] section must be hostgroup.
To run the Shared File Systems service API service, you must add the enabled_share_protocols option to the customServiceConfig section in manila: template: manilaAPI.
If you have scheduler overrides, add them to the customServiceConfig section in manila: template: manilaScheduler.
If you have multiple storage back-end drivers configured with OSP , you need to split them up when deploying RHOSO Antelope. Each storage back-end driver needs to use its own instance of the manila-share service.

If a storage back-end driver needs a custom container image, find it in the Red Hat Ecosystem Catalog, and create or modify an OpenStackVersion custom resource (CR) to specify the custom image using the same custom name.

The following example shows a manila spec from the OpenStackControlPlane CR that includes multiple storage back-end drivers, where only one is using a custom container image:

  spec:
    manila:
      enabled: true
      template:
        manilaAPI:
          customServiceConfig: |
            [DEFAULT]
            enabled_share_protocols = nfs
          replicas: 3
        manilaScheduler:
          replicas: 3
        manilaShares:
         netapp:
           customServiceConfig: |
             [DEFAULT]
             debug = true
             enabled_share_backends = netapp
             host = hostgroup
             [netapp]
             driver_handles_share_servers = False
             share_backend_name = netapp
             share_driver = manila.share.drivers.netapp.common.NetAppDriver
             netapp_storage_family = ontap_cluster
             netapp_transport_type = http
           replicas: 1
         pure:
            customServiceConfig: |
             [DEFAULT]
             debug = true
             enabled_share_backends=pure-1
             host = hostgroup
             [pure-1]
             driver_handles_share_servers = False
             share_backend_name = pure-1
             share_driver = manila.share.drivers.purestorage.flashblade.FlashBladeShareDriver
             flashblade_mgmt_vip = 203.0.113.15
             flashblade_data_vip = 203.0.10.14
            replicas: 1

The following example shows the OpenStackVersion CR that defines the custom container image:

apiVersion: core.openstack.org/v1beta1
kind: OpenStackVersion
metadata:
  name: openstack
spec:
  customContainerImages:
    cinderVolumeImages:
      pure: registry.connect.redhat.com/purestorage/openstack-manila-share-pure-rhosp-18-0

The name of the OpenStackVersion CR must match the name of your OpenStackControlPlane CR.

If you are providing sensitive information, such as passwords, hostnames, and usernames, use OCP secrets, and the customServiceConfigSecrets key. You can use customConfigSecrets in any service. If you use third party storage that requires credentials, create a secret that is referenced in the manila CR/patch file by using the customServiceConfigSecrets key. For example:

Create a file that includes the secrets, for example, netapp_secrets.conf:

$ cat << __EOF__ > ~/netapp_secrets.conf

[netapp]
netapp_server_hostname = 203.0.113.10
netapp_login = fancy_netapp_user
netapp_password = secret_netapp_password
netapp_vserver = mydatavserver
__EOF__

$ oc create secret generic osp-secret-manila-netapp --from-file=~/<secret>

Replace <secret> with the name of the file that includes your secrets, for example, netapp_secrets.conf.

Add the secret to any Shared File Systems service file in the customServiceConfigSecrets section. The following example adds the osp-secret-manila-netapp secret to the manilaShares service:

  spec:
    manila:
      enabled: true
      template:
        < . . . >
        manilaShares:
         netapp:
           customServiceConfig: |
             [DEFAULT]
             debug = true
             enabled_share_backends = netapp
             host = hostgroup
             [netapp]
             driver_handles_share_servers = False
             share_backend_name = netapp
             share_driver = manila.share.drivers.netapp.common.NetAppDriver
             netapp_storage_family = ontap_cluster
             netapp_transport_type = http
           customServiceConfigSecrets:
             - osp-secret-manila-netapp
           replicas: 1
    < . . . >

4.14.2. Deploying the Shared File Systems service on the control plane

Copy the Shared File Systems service (manila) configuration from the OpenStack (OSP) deployment, and then deploy the Shared File Systems service on the control plane.

Prerequisites

The Shared File Systems service systemd services such as api, cron, and scheduler are stopped. For more information, see Stopping OpenStack services.
If the deployment uses CephFS through NFS as a storage back end, the Pacemaker ordering and collocation constraints are adjusted. For more information, see Stopping OpenStack services.
The Shared File Systems service Pacemaker service (openstack-manila-share) is stopped. For more information, see Stopping OpenStack services.
The database migration is complete. For more information, see Migrating databases to MariaDB instances.
The OpenShift nodes where the manila-share service is to be deployed can reach the management network that the storage system is in.
If the deployment uses CephFS through NFS as a storage back end, a new clustered Ceph NFS service is deployed on the Ceph Storage cluster with the help of Ceph orchestrator. For more information, see Creating a Ceph NFS cluster.
Services such as the Identity service (keystone) and memcached are available prior to adopting the Shared File Systems services.
If you enabled tenant-driven networking by setting driver_handles_share_servers=True, the Networking service (neutron) is deployed.
Define the CONTROLLER1_SSH environment variable if it hasn’t been defined already. Replace the following example values with values that are correct for your environment:
```
$ CONTROLLER1_SSH="ssh -i <path to SSH key> root@<node IP>"
```

Procedure

Copy the configuration file from OSP for reference:

$ CONTROLLER1_SSH cat /var/lib/config-data/puppet-generated/manila/etc/manila/manila.conf | awk '!/^ *#/ && NF' > ~/manila.conf

Review the configuration file for configuration changes that were made since OSP . For more information on preparing this file for Red Hat OpenStack Services on OpenShift (RHOSO), see Guidelines for preparing the Shared File Systems service configuration.

Create a patch file for the OpenStackControlPlane CR to deploy the Shared File Systems service. The following example manila.patch file uses native CephFS:

$ cat << __EOF__ > ~/manila.patch
spec:
  manila:
    enabled: true
    apiOverride:
      route: {}
    template:
      databaseInstance: openstack
      databaseAccount: manila
      secret: osp-secret
      manilaAPI:
        replicas: 3
        customServiceConfig: |
          [DEFAULT]
          enabled_share_protocols = cephfs
        override:
          service:
            internal:
              metadata:
                annotations:
                  metallb.universe.tf/address-pool: internalapi
                  metallb.universe.tf/allow-shared-ip: internalapi
                  metallb.universe.tf/loadBalancerIPs: <172.17.0.80>
              spec:
                type: LoadBalancer
      manilaScheduler:
        replicas: 3
      manilaShares:
        cephfs:
          replicas: 1
          customServiceConfig: |
            [DEFAULT]
            enabled_share_backends = tripleo_ceph
            host = hostgroup
            [cephfs]
            driver_handles_share_servers=False
            share_backend_name=cephfs
            share_driver=manila.share.drivers.cephfs.driver.CephFSDriver
            cephfs_conf_path=/etc/ceph/ceph.conf
            cephfs_auth_id=openstack
            cephfs_cluster_name=ceph
            cephfs_volume_mode=0755
            cephfs_protocol_helper_type=CEPHFS
          networkAttachments:
              - storage
      extraMounts:
      - name: v1
        region: r1
        extraVol:
          - propagation:
            - ManilaShare
          extraVolType: Ceph
          volumes:
          - name: ceph
            secret:
              secretName: ceph-conf-files
          mounts:
          - name: ceph
            mountPath: "/etc/ceph"
            readOnly: true
__EOF__

metallb.universe.tf/loadBalancerIPs:<172.17.0.80> specifies the load balancer IP in your environment. If you use IPv6, change the load balancer IP to the load balancer IP in your environment, for example, metallb.universe.tf/loadBalancerIPs: fd00:bbbb::80.
share_backend_name specifies the names of the back ends to use in Red Hat OpenStack Services on OpenShift (RHOSO). Ensure that the names of the back ends are the same as they were in OSP .
networkAttachments specifies the appropriate storage management network. For example, the manilaShares instance with the CephFS back-end driver is connected to the storage network.

extraMounts specifies additional files to add to any of the services. For example, when using Ceph, you can add the Shared File Systems service Ceph user’s keyring file as well as the ceph.conf configuration file.

The following example patch file uses CephFS through NFS:

$ cat << __EOF__ > ~/manila.patch
spec:
  manila:
    enabled: true
    apiOverride:
      route: {}
    template:
      databaseInstance: openstack
      secret: osp-secret
      manilaAPI:
        replicas: 3
        customServiceConfig: |
          [DEFAULT]
          enabled_share_protocols = cephfs
        override:
          service:
            internal:
              metadata:
                annotations:
                  metallb.universe.tf/address-pool: internalapi
                  metallb.universe.tf/allow-shared-ip: internalapi
                  metallb.universe.tf/loadBalancerIPs: <172.17.0.80>
              spec:
                type: LoadBalancer
      manilaScheduler:
        replicas: 3
      manilaShares:
        cephfs:
          replicas: 1
          customServiceConfig: |
            [DEFAULT]
            enabled_share_backends = cephfs
            host = hostgroup
            [cephfs]
            driver_handles_share_servers=False
            share_backend_name=tripleo_ceph
            share_driver=manila.share.drivers.cephfs.driver.CephFSDriver
            cephfs_conf_path=/etc/ceph/ceph.conf
            cephfs_auth_id=openstack
            cephfs_cluster_name=ceph
            cephfs_protocol_helper_type=NFS
            cephfs_nfs_cluster_id=cephfs
            cephfs_ganesha_server_ip=172.17.5.47
          networkAttachments:
              - storage
__EOF__

Prior to adopting the manilaShares service for CephFS through NFS, ensure that you create a clustered Ceph NFS service. The name of the service must be cephfs_nfs_cluster_id. The cephfs_nfs_cluster_id option is set with the name of the NFS cluster created on Ceph.
The cephfs_ganesha_server_ip option is preserved from the configuration on the OSP environment.

Patch the OpenStackControlPlane CR:
```
$ oc patch openstackcontrolplane openstack --type=merge --patch-file=~/<manila.patch>
```
- Replace <manila.patch> with the name of your patch file.

Verification

Inspect the resulting Shared File Systems service pods:
```
$ oc get pods -l service=manila
```

Check that the Shared File Systems API service is registered in the Identity service (keystone):

$ openstack service list | grep manila

$ openstack endpoint list | grep manila

| 1164c70045d34b959e889846f9959c0e | regionOne | manila       | share        | True    | internal  | http://manila-internal.openstack.svc:8786/v1/%(project_id)s        |
| 63e89296522d4b28a9af56586641590c | regionOne | manilav2     | sharev2      | True    | public    | https://manila-public-openstack.apps-crc.testing/v2                |
| af36c57adcdf4d50b10f484b616764cc | regionOne | manila       | share        | True    | public    | https://manila-public-openstack.apps-crc.testing/v1/%(project_id)s |
| d655b4390d7544a29ce4ea356cc2b547 | regionOne | manilav2     | sharev2      | True    | internal  | http://manila-internal.openstack.svc:8786/v2                       |

Test the health of the service:

$ openstack share service list
$ openstack share pool list --detail

Check existing workloads:

$ openstack share list
$ openstack share snapshot list

You can create further resources:

$ openstack share create cephfs 10 --snapshot mysharesnap --name myshareclone
$ openstack share create nfs 10 --name mynfsshare
$ openstack share export location list mynfsshare

4.14.3. Decommissioning the OpenStack standalone Ceph NFS service

If your deployment uses CephFS through NFS, you must decommission the OpenStack(OSP) standalone NFS service. Since future software upgrades do not support the previous NFS service, ensure that the decommissioning period is short.

Prerequisites

You identified the new export locations for your existing shares by querying the Shared File Systems API.
You unmounted and remounted the shared file systems on each client to stop using the previous NFS server.
If you are consuming the Shared File Systems service shares with the Shared File Systems service CSI plugin for OpenShift, you migrated the shares by scaling down the application pods and scaling them back up.

Clients that are creating new workloads cannot use share exports through the previous NFS service. The Shared File Systems service no longer communicates with the previous NFS service, and cannot apply or alter export rules on the previous NFS service.

Procedure

Remove the cephfs_ganesha_server_ip option from the manila-share service configuration:

This restarts the manila-share process and removes the export locations that applied to the previous NFS service from all the shares.

$ cat << __EOF__ > ~/manila.patch
spec:
  manila:
    enabled: true
    apiOverride:
      route: {}
    template:
      manilaShares:
        cephfs:
          replicas: 1
          customServiceConfig: |
            [DEFAULT]
            enabled_share_backends = cephfs
            host = hostgroup
            [cephfs]
            driver_handles_share_servers=False
            share_backend_name=cephfs
            share_driver=manila.share.drivers.cephfs.driver.CephFSDriver
            cephfs_conf_path=/etc/ceph/ceph.conf
            cephfs_auth_id=openstack
            cephfs_cluster_name=ceph
            cephfs_protocol_helper_type=NFS
            cephfs_nfs_cluster_id=cephfs
          networkAttachments:
              - storage
__EOF__

Patch the OpenStackControlPlane custom resource:
```
$ oc patch openstackcontrolplane openstack --type=merge --patch-file=~/<manila.patch>
```
- Replace <manila.patch> with the name of your patch file.
Clean up the standalone ceph-nfs service from the OSP control plane nodes by disabling and deleting the Pacemaker resources associated with the service:

You can defer this step until after RHOSO Antelope is operational. During this time, you cannot decommission the Controller nodes.
```
$ sudo pcs resource disable ceph-nfs
$ sudo pcs resource disable ip-<VIP>
$ sudo pcs resource unmanage ceph-nfs
$ sudo pcs resource unmanage ip-<VIP>
```
- Replace <VIP> with the IP address assigned to the ceph-nfs service in your environment.

4.15. Adopting the Orchestration service

To adopt the Orchestration service (heat), you patch an existing OpenStackControlPlane custom resource (CR), where the Orchestration service is disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack (OSP) environment.

After you complete the adoption process, you have CRs for Heat, HeatAPI, HeatEngine, and HeatCFNAPI, and endpoints within the Identity service (keystone) to facilitate these services.

The Heat Adoption follows a similar workflow to Keystone.

Prerequisites

The source TripleO environment is running.
The target OpenShift environment is running.
You adopted MariaDB and the Identity service.
If your existing Orchestration service stacks contain resources from other services such as Networking service (neutron), Compute service (nova), Object Storage service (swift), and so on, adopt those sevices before adopting the Orchestration service.

Procedure

Retrieve the existing auth_encryption_key and service passwords. You use these passwords to patch the osp-secret. In the following example, the auth_encryption_key is used as HeatAuthEncryptionKey and the service password is used as HeatPassword:

[stack@rhosp17 ~]$ grep -E 'HeatPassword|HeatAuth|HeatStackDomainAdmin' ~/overcloud-deploy/overcloud/overcloud-passwords.yaml
  HeatAuthEncryptionKey: Q60Hj8PqbrDNu2dDCbyIQE2dibpQUPg2
  HeatPassword: dU2N0Vr2bdelYH7eQonAwPfI3
  HeatStackDomainAdminPassword: dU2N0Vr2bdelYH7eQonAwPfI3

[stack@rhosp17 ~]$ ansible -i overcloud-deploy/overcloud/config-download/overcloud/tripleo-ansible-inventory.yaml overcloud-controller-0 -m shell -a "grep auth_encryption_key /var/lib/config-data/puppet-generated/heat/etc/heat/heat.conf | grep -Ev '^#|^$'" -b
overcloud-controller-0 | CHANGED | rc=0 >>
auth_encryption_key=Q60Hj8PqbrDNu2dDCbyIQE2dibpQUPg2

Encode the password to Base64 format:

$ echo Q60Hj8PqbrDNu2dDCbyIQE2dibpQUPg2 | base64
UTYwSGo4UHFickROdTJkRENieUlRRTJkaWJwUVVQZzIK

Patch the osp-secret to update the HeatAuthEncryptionKey and HeatPassword parameters. These values must match the values in the TripleO Orchestration service configuration:

$ oc patch secret osp-secret --type='json' -p='[{"op" : "replace" ,"path" : "/data/HeatAuthEncryptionKey" ,"value" : "UTYwSGo4UHFickROdTJkRENieUlRRTJkaWJwUVVQZzIK"}]'
secret/osp-secret patched

Patch the OpenStackControlPlane CR to deploy the Orchestration service:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  heat:
    enabled: true
    apiOverride:
      route: {}
    template:
      databaseInstance: openstack
      databaseAccount: heat
      secret: osp-secret
      memcachedInstance: memcached
      passwordSelectors:
        authEncryptionKey: HeatAuthEncryptionKey
        service: HeatPassword
        stackDomainAdminPassword: HeatStackDomainAdminPassword
'

Verification

Ensure that the statuses of all the CRs are Setup complete:

$ oc get Heat,HeatAPI,HeatEngine,HeatCFNAPI
NAME                           STATUS   MESSAGE
heat.heat.openstack.org/heat   True     Setup complete

NAME                                  STATUS   MESSAGE
heatapi.heat.openstack.org/heat-api   True     Setup complete

NAME                                        STATUS   MESSAGE
heatengine.heat.openstack.org/heat-engine   True     Setup complete

NAME                                        STATUS   MESSAGE
heatcfnapi.heat.openstack.org/heat-cfnapi   True     Setup complete

Check that the Orchestration service is registered in the Identity service:

$ oc exec -it openstackclient -- openstack service list -c Name -c Type
+------------+----------------+
| Name       | Type           |
+------------+----------------+
| heat       | orchestration  |
| glance     | image          |
| heat-cfn   | cloudformation |
| ceilometer | Ceilometer     |
| keystone   | identity       |
| placement  | placement      |
| cinderv3   | volumev3       |
| nova       | compute        |
| neutron    | network        |
+------------+----------------+

$ oc exec -it openstackclient -- openstack endpoint list --service=heat -f yaml
- Enabled: true
  ID: 1da7df5b25b94d1cae85e3ad736b25a5
  Interface: public
  Region: regionOne
  Service Name: heat
  Service Type: orchestration
  URL: http://heat-api-public-openstack-operators.apps.okd.bne-shift.net/v1/%(tenant_id)s
- Enabled: true
  ID: 414dd03d8e9d462988113ea0e3a330b0
  Interface: internal
  Region: regionOne
  Service Name: heat
  Service Type: orchestration
  URL: http://heat-api-internal.openstack-operators.svc:8004/v1/%(tenant_id)s

Check that the Orchestration service engine services are running:

$ oc exec -it openstackclient -- openstack orchestration service list -f yaml
- Binary: heat-engine
  Engine ID: b16ad899-815a-4b0c-9f2e-e6d9c74aa200
  Host: heat-engine-6d47856868-p7pzz
  Hostname: heat-engine-6d47856868-p7pzz
  Status: up
  Topic: engine
  Updated At: '2023-10-11T21:48:01.000000'
- Binary: heat-engine
  Engine ID: 887ed392-0799-4310-b95c-ac2d3e6f965f
  Host: heat-engine-6d47856868-p7pzz
  Hostname: heat-engine-6d47856868-p7pzz
  Status: up
  Topic: engine
  Updated At: '2023-10-11T21:48:00.000000'
- Binary: heat-engine
  Engine ID: 26ed9668-b3f2-48aa-92e8-2862252485ea
  Host: heat-engine-6d47856868-p7pzz
  Hostname: heat-engine-6d47856868-p7pzz
  Status: up
  Topic: engine
  Updated At: '2023-10-11T21:48:00.000000'
- Binary: heat-engine
  Engine ID: 1011943b-9fea-4f53-b543-d841297245fd
  Host: heat-engine-6d47856868-p7pzz
  Hostname: heat-engine-6d47856868-p7pzz
  Status: up
  Topic: engine
  Updated At: '2023-10-11T21:48:01.000000'

Verify that you can see your Orchestration service stacks:

$ openstack stack list -f yaml
- Creation Time: '2023-10-11T22:03:20Z'
  ID: 20f95925-7443-49cb-9561-a1ab736749ba
  Project: 4eacd0d1cab04427bc315805c28e66c9
  Stack Name: test-networks
  Stack Status: CREATE_COMPLETE
  Updated Time: null

4.16. Adopting the Load-balancing service

To adopt the Load-balancing service (octavia), you patch an existing OpenStackControlPlane custom resource (CR) where the Load-balancing service is disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack (OSP) environment. After completing the data plane adoption, you must trigger a failover of existing load balancers to upgrade their amphora virtual machines to use the new image and to establish connectivity with the new control plane.

Procedure

Migrate the server certificate authority (CA) passphrase from the previous deployment:

SERVER_CA_PASSPHRASE=$($CONTROLLER1_SSH grep ^ca_private_key_passphrase /var/lib/config-data/puppet-generated/octavia/etc/octavia/octavia.conf)

oc apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: octavia-ca-passphrase
type: Opaque
data:
  server-ca-passphrase: $(echo -n $SERVER_CA_PASSPHRASE | base64 -w0)
EOF

To isolate the management network, add the network interface for the VLAN base interface:

$ oc get --no-headers nncp | cut -f 1 -d ' ' | grep -v nncp-dns | while read; do

interfaces=$(oc get nncp $REPLY -o jsonpath="{.spec.desiredState.interfaces[*].name}")

(echo $interfaces | grep -w -q "octbr\|enp6s0.24") || \
        oc patch nncp $REPLY --type json --patch '
[{
    "op": "add",
    "path": "/spec/desiredState/interfaces/-",
    "value": {
      "description": "Octavia VLAN host interface",
      "name": "enp6s0.24",
      "state": "up",
      "type": "vlan",
      "vlan": {
        "base-iface": "<enp6s0>",
        "id": 24
        }
    }
},
{
    "op": "add",
    "path": "/spec/desiredState/interfaces/-",
    "value": {
      "description": "Octavia Bridge",
      "mtu": <mtu>,
      "state": "up",
      "type": "linux-bridge",
      "name": "octbr",
      "bridge": {
        "options": { "stp": { "enabled": "false" } },
        "port": [ { "name": "enp6s0.24" } ]
        }
    }
}]'

done

where:

<enp6s0>: Specifies the name of the network interface in your OCP setup.
<mtu>: Specifies the mtu value in your environment.

To connect pods that manage load balancer virtual machines (amphorae) and the OpenvSwitch pods the OVN operator manages, configure the Load-balancing service network attachment definition:

$ cat  octavia-nad.yaml << EOF_CAT
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  labels:
    osp/net: octavia
  name: octavia
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "octavia",
      "type": "bridge",
      "bridge": "octbr",
      "ipam": {
        "type": "whereabouts",
        "range": "172.23.0.0/24",
        "range_start": "172.23.0.30",
        "range_end": "172.23.0.70",
        "routes": [
           {
             "dst": "172.24.0.0/16",
             "gw" : "172.23.0.150"
           }
         ]
      }
    }
EOF_CAT

Create the NetworkAttachmentDefinition CR:
```
$ oc apply -f octavia-nad.yaml
```

Enable the Load-balancing service in OCP:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  ovn:
    template:
      ovnController:
        networkAttachment: tenant
        nicMappings:
          octavia: octbr
  octavia:
    enabled: true
    template:
      amphoraImageContainerImage: quay.io/gthiemonge/octavia-amphora-image
      octaviaHousekeeping:
        networkAttachments:
          - octavia
      octaviaHealthManager:
        networkAttachments:
          - octavia
      octaviaWorker:
        networkAttachments:
          - octavia
'

Wait for the Load-balancing service control plane services CRs to be ready:

$ oc wait --for condition=Ready --timeout=600s octavia.octavia.openstack.org/octavia

Ensure that the Load-balancing service is registered in the Identity service:

$ alias openstack="oc exec -t openstackclient -- openstack"
$ openstack service list | grep load-balancer
| bd078ca6f90c4b86a48801f45eb6f0d7 | octavia   | load-balancer |
$ openstack endpoint list --service load-balancer
+----------------------------------+-----------+--------------+---------------+---------+-----------+---------------------------------------------------+
| ID                               | Region    | Service Name | Service Type  | Enabled | Interface | URL                                               |
+----------------------------------+-----------+--------------+---------------+---------+-----------+---------------------------------------------------+
| f1ae7756b6164baf9cb82a1a670067a2 | regionOne | octavia      | load-balancer | True    | public    | https://octavia-public-openstack.apps-crc.testing |
| ff3222b4621843669e89843395213049 | regionOne | octavia      | load-balancer | True    | internal  | http://octavia-internal.openstack.svc:9876        |
+----------------------------------+-----------+--------------+---------------+---------+-----------+---------------------------------------------------+

Next steps

After you complete the data plane adoption, you must upgrade existing load balancers and remove old resources. For more information, see Post-adoption tasks for the Load-balancing service.

4.17. Adopting Telemetry services

To adopt Telemetry services, you patch an existing OpenStackControlPlane custom resource (CR) that has Telemetry services disabled to start the service with the configuration parameters that are provided by the OpenStack (OSP) environment.

If you adopt Telemetry services, the observability solution that is used in the OSP environment, Service Telemetry Framework, is removed from the cluster. The new solution is deployed in the Red Hat OpenStack Services on OpenShift (RHOSO) environment, allowing for metrics, and optionally logs, to be retrieved and stored in the new back ends.

You cannot automatically migrate old data because different back ends are used. Metrics and logs are considered short-lived data and are not intended to be migrated to the RHOSO environment. For information about adopting legacy autoscaling stack templates to the RHOSO environment, see Adopting Autoscaling services.

Prerequisites

The TripleO environment is running (the source cloud).
The Single Node OpenShift or OpenShift Local is running in the OpenShift cluster.
Previous adoption steps are completed.

Procedure

Patch the OpenStackControlPlane CR to deploy cluster-observability-operator:

$ oc create -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: cluster-observability-operator
  namespace: openshift-operators
spec:
  channel: stable
  installPlanApproval: Automatic
  name: cluster-observability-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF

Wait for the installation to succeed:

$ oc wait --for jsonpath="{.status.phase}"=Succeeded csv --namespace=openshift-operators -l operators.coreos.com/cluster-observability-operator.openshift-operators

Patch the OpenStackControlPlane CR to deploy Ceilometer services:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  telemetry:
    enabled: true
    template:
      ceilometer:
        passwordSelector:
          ceilometerService: CeilometerPassword
        enabled: true
        secret: osp-secret
        serviceUser: ceilometer
'

Enable the metrics storage back end:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  telemetry:
    template:
      metricStorage:
        enabled: true
        monitoringStack:
          alertingEnabled: true
          scrapeInterval: 30s
          storage:
            strategy: persistent
            retention: 24h
            persistent:
              pvcStorageRequest: 20G
'

Verification

Verify that the alertmanager and prometheus pods are available:

$ oc get pods -l alertmanager=metric-storage
NAME                            READY   STATUS    RESTARTS   AGE
alertmanager-metric-storage-0   2/2     Running   0          46s
alertmanager-metric-storage-1   2/2     Running   0          46s

$ oc get pods -l prometheus=metric-storage
NAME                          READY   STATUS    RESTARTS   AGE
prometheus-metric-storage-0   3/3     Running   0          46s

Inspect the resulting Ceilometer pods:

CEILOMETETR_POD=`oc get pods -l service=ceilometer | tail -n 1 | cut -f 1 -d' '`
oc exec -t $CEILOMETETR_POD -c ceilometer-central-agent -- cat /etc/ceilometer/ceilometer.conf

Inspect enabled pollsters:

$ oc get secret ceilometer-config-data -o jsonpath="{.data['polling\.yaml\.j2']}"  | base64 -d

Optional: Override default pollsters according to the requirements of your environment:

$ oc patch openstackcontrolplane controlplane --type=merge --patch '
spec:
  telemetry:
    template:
      ceilometer:
          defaultConfigOverwrite:
            polling.yaml.j2: |
              ---
              sources:
                - name: pollsters
                  interval: 100
                  meters:
                    - volume.*
                    - image.size
          enabled: true
          secret: osp-secret
'

Next steps

Optional: Patch the OpenStackControlPlane CR to include logging:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  telemetry:
    template:
      logging:
        enabled: false
        ipaddr: 172.17.0.80
        port: 10514
        cloNamespace: openshift-logging
'

4.18. Adopting autoscaling services

To adopt services that enable autoscaling, you patch an existing OpenStackControlPlane custom resource (CR) where the Alarming services (aodh) are disabled. The patch starts the service with the configuration parameters that are provided by the OpenStack environment.

Prerequisites

The source TripleO environment is running.
A Single Node OpenShift or OpenShift Local is running in the OpenShift cluster.
You have adopted the following services:
- MariaDB
- Identity service (keystone)
- Orchestration service (heat)
- Telemetry service

Procedure

Patch the OpenStackControlPlane CR to deploy the autoscaling services:

$ oc patch openstackcontrolplane openstack --type=merge --patch '
spec:
  telemetry:
    enabled: true
    template:
      autoscaling:
        enabled: true
        aodh:
          passwordSelector:
            aodhService: AodhPassword
          databaseAccount: aodh
          databaseInstance: openstack
          secret: osp-secret
          serviceUser: aodh
        heatInstance: heat
'

Inspect the aodh pods:

$ AODH_POD=`oc get pods -l service=aodh | tail -n 1 | cut -f 1 -d' '`
$ oc exec -t $AODH_POD -c aodh-api -- cat /etc/aodh/aodh.conf

Check whether the aodh API service is registered in the Identity service:

$ openstack endpoint list | grep aodh
| d05d120153cd4f9b8310ac396b572926 | regionOne | aodh  | alarming  | True    | internal  | http://aodh-internal.openstack.svc:8042  |
| d6daee0183494d7a9a5faee681c79046 | regionOne | aodh  | alarming  | True    | public    | http://aodh-public.openstack.svc:8042    |

Optional: Create aodh alarms with the PrometheusAlarm alarm type:

You must use the PrometheusAlarm alarm type instead of GnocchiAggregationByResourcesAlarm.

$ openstack alarm create --name high_cpu_alarm \
--type prometheus \
--query "(rate(ceilometer_cpu{resource_name=~'cirros'})) * 100" \
--alarm-action 'log://' \
--granularity 15 \
--evaluation-periods 3 \
--comparison-operator gt \
--threshold 7000000000

Verify that the alarm is enabled:

$ openstack alarm list
+--------------------------------------+------------+------------------+-------------------+----------+
| alarm_id                             | type       | name             | state  | severity | enabled  |
+--------------------------------------+------------+------------------+-------------------+----------+
| 209dc2e9-f9d6-40e5-aecc-e767ce50e9c0 | prometheus | prometheus_alarm |   ok   |    low   |   True   |
+--------------------------------------+------------+------------------+-------------------+----------+

4.19. Pulling the configuration from a TripleO deployment

Before you start the data plane adoption workflow, back up the configuration from the OpenStack (OSP) services and TripleO. You can then use the files during the configuration of the adopted services to ensure that nothing is missed or misconfigured.

Prerequisites

The os-diff tool is installed and configured. For more information, see Comparing configuration files between deployments.

All the services are describes in a yaml file:

service config parameters

Procedure

Update your ssh parameters according to your environment in the os-diff.cfg. Os-diff uses the ssh parameters to connect to your TripleO node, and then query and download the configuration files:
```
ssh_cmd=ssh -F ssh.config standalone
container_engine=podman
connection=ssh
remote_config_path=/tmp/tripleo
```
Ensure that the ssh command you provide in ssh_cmd parameter is correct and includes key authentication.

Enable the services that you want to include in the /etc/os-diff/config.yaml file, and disable the services that you want to exclude from the file. Ensure that you have the correct permissions to edit the file:

$ chown ospng:ospng /etc/os-diff/config.yaml

The following example enables the default Identity service (keystone) to be included in the /etc/os-diff/config.yaml file:

# service name and file location
services:
  # Service name
  keystone:
    # Bool to enable/disable a service (not implemented yet)
    enable: true
    # Pod name, in both OCP and podman context.
    # It could be strict match or will only just grep the podman_name
    # and work with all the pods which matched with pod_name.
    # To enable/disable use strict_pod_name_match: true/false
    podman_name: keystone
    pod_name: keystone
    container_name: keystone-api
    # pod options
    # strict match for getting pod id in TripleO and podman context
    strict_pod_name_match: false
    # Path of the config files you want to analyze.
    # It could be whatever path you want:
    # /etc/<service_name> or /etc or /usr/share/<something> or even /
    # @TODO: need to implement loop over path to support multiple paths such as:
    # - /etc
    # - /usr/share
    path:
      - /etc/
      - /etc/keystone
      - /etc/keystone/keystone.conf
      - /etc/keystone/logging.conf

Repeat this step for each OSP service that you want to disable or enable.

If you use non-containerized services, such as the ovs-external-ids, pull the configuration or the command output. For example:

services:
  ovs_external_ids:
    hosts:
      - standalone
    service_command: "ovs-vsctl list Open_vSwitch . | grep external_ids | awk -F ': ' '{ print $2; }'"
    cat_output: true
    path:
      - ovs_external_ids.json
    config_mapping:
      ovn-bridge-mappings: edpm_ovn_bridge_mappings
      ovn-bridge: edpm_ovn_bridge
      ovn-encap-type: edpm_ovn_encap_type
      ovn-monitor-all: ovn_monitor_all
      ovn-remote-probe-interval: edpm_ovn_remote_probe_interval
      ovn-ofctrl-wait-before-clear: edpm_ovn_ofctrl_wait_before_clear

You must correctly configure an SSH configuration file or equivalent for non-standard services, such as OVS. The ovs_external_ids service does not run in a container, and the OVS data is stored on each host of your cloud, for example, controller_1/controller_2/, and so on.

hosts specifies the list of hosts, for example, compute-1, compute-2.
service_command: "ovs-vsctl list Open_vSwitch . | grep external_ids | awk -F ': ' '{ print $2; }'" runs against the hosts.
cat_output: true provides os-diff with the output of the command and stores the output in a file that is specified by the key path.
config_mapping provides a mapping between, in this example, the data plane custom resource definition and the ovs-vsctl output.
ovn-bridge-mappings: edpm_ovn_bridge_mappings must be a list of strings, for example, ["datacentre:br-ex"].
1. Compare the values:
  $ os-diff diff ovs_external_ids.json edpm.crd --crd --service ovs_external_ids
  For example, to check the /etc/yum.conf on every host, you must put the following statement in the config.yaml file. The following example uses a file called yum_config:
  services: yum_config: hosts: - undercloud - controller_1 - compute_1 - compute_2 service_command: "cat /etc/yum.conf" cat_output: true path: - yum.conf

Pull the configuration:

The following command pulls all the configuration files that are included in the /etc/os-diff/config.yaml file. You can configure os-diff to update this file automatically according to your running environment by using the --update or --update-only option. These options set the podman information into the config.yaml for all running containers. The podman information can be useful later, when all the OpenStack services are turned off.

Note that when the config.yaml file is populated automatically you must provide the configuration paths manually for each service.

# will only update the /etc/os-diff/config.yaml
os-diff pull --update-only

# will update the /etc/os-diff/config.yaml and pull configuration
os-diff pull --update

# will update the /etc/os-diff/config.yaml and pull configuration
os-diff pull

The configuration is pulled and stored by default in the following directory:

/tmp/tripleo/

Verification

Verify that you have a directory for each service configuration in your local path:
```
  ▾ tmp/
    ▾ tripleo/
      ▾ glance/
      ▾ keystone/
```

4.20. Rolling back the control plane adoption

If you encountered a problem and are unable to complete the adoption of the OpenStack (OSP) control plane services, you can roll back the control plane adoption.

Do not attempt the rollback if you altered the data plane nodes in any way. You can only roll back the control plane adoption if you altered the control plane.

During the control plane adoption, services on the OSP control plane are stopped but not removed. The databases on the OSP control plane are not edited during the adoption procedure. The Red Hat OpenStack Services on OpenShift (RHOSO) control plane receives a copy of the original control plane databases. The rollback procedure assumes that the data plane has not yet been modified by the adoption procedure, and it is still connected to the OSP control plane.

The rollback procedure consists of the following steps:

Restoring the functionality of the OSP control plane.
Removing the partially or fully deployed RHOSO control plane.

Procedure

To restore the source cloud to a working state, start the OSP control plane services that you previously stopped during the adoption procedure:

ServicesToStart=("tripleo_horizon.service"
                 "tripleo_keystone.service"
                 "tripleo_barbican_api.service"
                 "tripleo_barbican_worker.service"
                 "tripleo_barbican_keystone_listener.service"
                 "tripleo_cinder_api.service"
                 "tripleo_cinder_api_cron.service"
                 "tripleo_cinder_scheduler.service"
                 "tripleo_cinder_volume.service"
                 "tripleo_cinder_backup.service"
                 "tripleo_designate_api.service"
                 "tripleo_designate_backend_bind9.service"
                 "tripleo_designate_central.service"
                 "tripleo_designate_mdns.service"
                 "tripleo_designate_producer.service"
                 "tripleo_designate_worker.service"
                 "tripleo_glance_api.service"
                 "tripleo_manila_api.service"
                 "tripleo_manila_api_cron.service"
                 "tripleo_manila_scheduler.service"
                 "tripleo_neutron_api.service"
                 "tripleo_placement_api.service"
                 "tripleo_nova_api_cron.service"
                 "tripleo_nova_api.service"
                 "tripleo_nova_conductor.service"
                 "tripleo_nova_metadata.service"
                 "tripleo_nova_scheduler.service"
                 "tripleo_nova_vnc_proxy.service"
                 "tripleo_aodh_api.service"
                 "tripleo_aodh_api_cron.service"
                 "tripleo_aodh_evaluator.service"
                 "tripleo_aodh_listener.service"
                 "tripleo_aodh_notifier.service"
                 "tripleo_ceilometer_agent_central.service"
                 "tripleo_ceilometer_agent_compute.service"
                 "tripleo_ceilometer_agent_ipmi.service"
                 "tripleo_ceilometer_agent_notification.service"
                 "tripleo_ovn_cluster_north_db_server.service"
                 "tripleo_ovn_cluster_south_db_server.service"
                 "tripleo_ovn_cluster_northd.service"
                 "tripleo_octavia_api.service"
                 "tripleo_octavia_health_manager.service"
                 "tripleo_octavia_rsyslog.service"
                 "tripleo_octavia_driver_agent.service"
                 "tripleo_octavia_housekeeping.service"
                 "tripleo_octavia_worker.service"
                 "tripleo_unbound.service")

PacemakerResourcesToStart=("galera-bundle"
                           "haproxy-bundle"
                           "rabbitmq-bundle"
                           "openstack-cinder-volume"
                           "openstack-cinder-backup"
                           "openstack-manila-share")

echo "Starting systemd OpenStack services"
for service in ${ServicesToStart[*]}; do
    for i in {1..3}; do
        SSH_CMD=CONTROLLER${i}_SSH
        if [ ! -z "${!SSH_CMD}" ]; then
            if ${!SSH_CMD} sudo systemctl is-enabled $service &> /dev/null; then
                echo "Starting the $service in controller $i"
                ${!SSH_CMD} sudo systemctl start $service
            fi
        fi
    done
done

echo "Checking systemd OpenStack services"
for service in ${ServicesToStart[*]}; do
    for i in {1..3}; do
        SSH_CMD=CONTROLLER${i}_SSH
        if [ ! -z "${!SSH_CMD}" ]; then
            if ${!SSH_CMD} sudo systemctl is-enabled $service &> /dev/null; then
                if ! ${!SSH_CMD} systemctl show $service | grep ActiveState=active >/dev/null; then
                    echo "ERROR: Service $service is not running on controller $i"
                else
                    echo "OK: Service $service is running in controller $i"
                fi
            fi
        fi
    done
done

echo "Starting pacemaker OpenStack services"
for i in {1..3}; do
    SSH_CMD=CONTROLLER${i}_SSH
    if [ ! -z "${!SSH_CMD}" ]; then
        echo "Using controller $i to run pacemaker commands"
        for resource in ${PacemakerResourcesToStart[*]}; do
            if ${!SSH_CMD} sudo pcs resource config $resource &>/dev/null; then
                echo "Starting $resource"
                ${!SSH_CMD} sudo pcs resource enable $resource
            else
                echo "Service $resource not present"
            fi
        done
        break
    fi
done

echo "Checking pacemaker OpenStack services"
for i in {1..3}; do
    SSH_CMD=CONTROLLER${i}_SSH
    if [ ! -z "${!SSH_CMD}" ]; then
        echo "Using controller $i to run pacemaker commands"
        for resource in ${PacemakerResourcesToStop[*]}; do
            if ${!SSH_CMD} sudo pcs resource config $resource &>/dev/null; then
                if ${!SSH_CMD} sudo pcs resource status $resource | grep Started >/dev/null; then
                    echo "OK: Service $resource is started"
                else
                    echo "ERROR: Service $resource is stopped"
                fi
            fi
        done
        break
    fi
done

If the Ceph NFS service is running on the deployment as a Shared File Systems service (manila) back end, you must restore the Pacemaker order and colocation constraints for the openstack-manila-share service:

$ sudo pcs constraint order start ceph-nfs then openstack-manila-share kind=Optional id=order-ceph-nfs-openstack-manila-share-Optional
$ sudo pcs constraint colocation add openstack-manila-share with ceph-nfs score=INFINITY id=colocation-openstack-manila-share-ceph-nfs-INFINITY

Verify that the source cloud is operational again, for example, you can run openstack CLI commands such as openstack server list, or check that you can access the Dashboard service (horizon).

Remove the partially or fully deployed control plane so that you can attempt the adoption again later:

$ oc delete --ignore-not-found=true --wait=false openstackcontrolplane/openstack
$ oc patch openstackcontrolplane openstack --type=merge --patch '
> metadata:
>   finalizers: []
> ' || true
>
>while oc get pod | grep rabbitmq-server-0; do
>    sleep 2
>done
>while oc get pod | grep openstack-galera-0; do
>    sleep 2
>done

$ oc delete --ignore-not-found=true --wait=false pod mariadb-copy-data
$ oc delete --ignore-not-found=true --wait=false pvc mariadb-data
$ oc delete --ignore-not-found=true --wait=false pod ovn-copy-data
$ oc delete --ignore-not-found=true secret osp-secret

Next steps

After you restore the OSP control plane services, their internal state might have changed. Before you retry the adoption procedure, verify that all the control plane resources are removed and that there are no leftovers which could affect the following adoption procedure attempt. You must not use previously created copies of the database contents in another adoption attempt. You must make a new copy of the latest state of the original source database contents. For more information about making new copies of the database, see Migrating databases to the control plane.

5. Adopting the data plane

Adopting the Red Hat OpenStack Services on OpenShift (RHOSO) data plane involves the following steps:

Stop any remaining services on the OpenStack (OSP) control plane.
Deploy the required custom resources.
Perform a fast-forward upgrade on Compute services from OSP to RHOSO Antelope.
Adopt Networker services to the RHOSO data plane.

After the RHOSO control plane manages the newly deployed data plane, you must not re-enable services on the OSP control plane and data plane. If you re-enable services, workloads are managed by two control planes or two data planes, resulting in data corruption, loss of control of existing workloads, inability to start new workloads, or other issues.

5.1. Stopping infrastructure management and Compute services

You must stop cloud database nodes and messaging nodes on the OpenStack control plane. Do not stop nodes that are running the following roles:

Compute
Storage
Networker
Controller if running OVN Controller Gateway agent network agent

The following procedure applies to a standalone TripleO deployment. You must stop the Pacemaker services on your host so that you can install libvirt packages when the Compute roles are adopted as data plane nodes. Modular libvirt daemons no longer run in podman containers on data plane nodes.

Prerequisites

Define the shell variables. Replace the following example values with values that apply to your environment:
```
CONTROLLER1_SSH="ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@<controller-1 IP>"
# ...
# ...
EDPM_PRIVATEKEY_PATH="~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa"
```
- CONTROLLER<X>_SSH defines the SSH connection details for all Controller nodes, including cell Controller nodes, of the source TripleO cloud.

Procedure

Stop the Pacemaker services:

PacemakerResourcesToStop=(
                "galera-bundle"
                "haproxy-bundle"
                "rabbitmq-bundle")

echo "Stopping pacemaker services"
for i in {1..3}; do
    SSH_CMD=CONTROLLER${i}_SSH
    if [ ! -z "${!SSH_CMD}" ]; then
        echo "Using controller $i to run pacemaker commands"
        for resource in ${PacemakerResourcesToStop[*]}; do
            if ${!SSH_CMD} sudo pcs resource config $resource; then
                ${!SSH_CMD} sudo pcs resource disable $resource
            fi
        done
        break
    fi
done

5.2. Adopting Compute services to the RHOSO data plane

Adopt your Compute (nova) services to the Red Hat OpenStack Services on OpenShift (RHOSO) data plane.

Prerequisites

You have stopped the remaining control plane nodes, repositories, and packages on the Compute service (nova) hosts. For more information, see Stopping infrastructure management and Compute services.
You have configured the Ceph back end for the NovaLibvirt service. For more information, see Configuring a Ceph back end.

You have configured IP Address Management (IPAM):

$ oc apply -f - <<EOF
apiVersion: network.openstack.org/v1beta1
kind: NetConfig
metadata:
  name: netconfig
spec:
  networks:
  - name: ctlplane
    dnsDomain: ctlplane.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 192.168.122.120
        start: 192.168.122.100
      - end: 192.168.122.200
        start: 192.168.122.150
      cidr: 192.168.122.0/24
      gateway: 192.168.122.1
  - name: internalapi
    dnsDomain: internalapi.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.17.0.250
        start: 172.17.0.100
      cidr: 172.17.0.0/24
      vlan: 20
  - name: External
    dnsDomain: external.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 10.0.0.250
        start: 10.0.0.100
      cidr: 10.0.0.0/24
      gateway: 10.0.0.1
  - name: storage
    dnsDomain: storage.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.18.0.250
        start: 172.18.0.100
      cidr: 172.18.0.0/24
      vlan: 21
  - name: storagemgmt
    dnsDomain: storagemgmt.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.20.0.250
        start: 172.20.0.100
      cidr: 172.20.0.0/24
      vlan: 23
  - name: tenant
    dnsDomain: tenant.example.com
    subnets:
    - name: subnet1
      allocationRanges:
      - end: 172.19.0.250
        start: 172.19.0.100
      cidr: 172.19.0.0/24
      vlan: 22
EOF

If neutron-sriov-nic-agent is running on your Compute service nodes, ensure that the physical device mappings match the values that are defined in the OpenStackDataPlaneNodeSet custom resource (CR). For more information, see Pulling the configuration from a TripleO deployment.

You have defined the shell variables to run the script that runs the upgrade:

$ CEPH_FSID=$(oc get secret ceph-conf-files -o json | jq -r .data."ceph.conf" | base64 -d | grep fsid | sed -e s/fsid = //)

$ alias openstack="oc exec -t openstackclient -- openstack"

$ DEFAULT_CELL_NAME="cell3"
$ RENAMED_CELLS="cell1 cell2 $DEFAULT_CELL_NAME"

$ declare -A COMPUTES_CELL1
$ export COMPUTES_CELL1=(
> ["standalone.localdomain"]="192.168.122.100"
> # <compute1>
> # <compute2>
> # <compute3>
>)
$ declare -A COMPUTES_CELL2
$ export COMPUTES_CELL2=(
> # <compute1>
>)
$ declare -A COMPUTES_CELL3
$export COMPUTES_CELL3=(
> # <compute1>
> # <compute2>
>)

$ declare -A COMPUTES_API_CELL1
$export COMPUTES_API_CELL1=(
> ["standalone.localdomain"]="172.17.0.100"
> ["standalone2.localdomain"]="172.17.0.101"
>)

$ NODESETS=""
$ for CELL in $(echo $RENAMED_CELLS); do
> ref="COMPUTES_$(echo ${CELL}|tr [:lower:] [:upper:])"
> eval names=\${!${ref}[@]}
> [ -z "$names" ] && continue
> NODESETS="openstack-${CELL}, $NODESETS"
>done
$ NODESETS="[${NODESETS%,*}]"

DEFAULT_CELL_NAME="cell3" defines the source cloud default cell that acquires a new DEFAULT_CELL_NAME on the destination cloud after adoption. In a multi-cell adoption scenario, you can retain the original name, default, or create a new cell default name by providing the incremented index of the last cell in the source cloud. For example, if the incremented index of the last cell is cell5, the new cell default name is cell6.
export COMPUTES_CELL1= For each cell, update the <["standalone.localdomain"]="x.x.x.x"> value and the COMPUTES_CELL<X> value with the names and IP addresses of the Compute service nodes that are connected to the ctlplane and internalapi networks. Do not specify a real FQDN defined for each network. Always use the same hostname for each connected network of a Compute node. Provide the IP addresses and the names of the hosts on the remaining networks of the source cloud as needed, or you can manually adjust the files that you generate in step 9 of this procedure.
<compute1>, <compute2>, and <compute3> specifies the names of your Compute service nodes for each cell. Assign all Compute service nodes from the source cloud cell1 cell into COMPUTES_CELL1, and so on.
export COMPUTES_CELL<X>=( specifies all Compute service nodes that you assign from the source cloud default cell into COMPUTES_CELL<X> and COMPUTES_API_CELL<X>, where <X> is the DEFAULT_CELL_NAME environment variable value. In this example, the DEFAULT_CELL_NAME environment variable value equals cell3.
export COMPUTES_API_CELL1=( For each cell, update the <["standalone.localdomain"]="192.168.122.100"> value and the COMPUTES_API_CELL<X> value with the names and IP addresses of the Compute service nodes that are connected to the ctlplane and internalapi networks. ["standalone.localdomain"]="192.168.122.100" defines the custom DNS domain in the FQDN value of the nodes. This value is used in the data plane node set spec.nodes.<NODE NAME>.hostName. Do not specify a real FQDN defined for each network. Use the same hostname for each of its connected networks. Provide the IP addresses and the names of the hosts on the remaining networks of the source cloud as needed, or you can manually adjust the files that you generate in step 9 of this procedure.

NODESETS="'openstack-${CELL}', $NODESETS" specifies the cells that contain Compute nodes. Cells that do not contain Compute nodes are omitted from this template because no node sets are created for the cells.

If you deployed the source cloud with a default cell, and want to rename it during adoption, define the new name that you want to use, as shown in the following example:

$ DEFAULT_CELL_NAME="cell1"
$ RENAMED_CELLS="cell1"

Do not set a value for the CEPH_FSID parameter if the local storage back end is configured by the Compute service for libvirt. The storage back end must match the source cloud storage back end. You cannot change the storage back end during adoption.

Procedure

Create a ssh authentication secret for the data plane nodes:

$ oc apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
    name: dataplane-adoption-secret
data:
    ssh-privatekey: |
$(cat ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa | base64 | sed 's/^/        /')
EOF

Generate an ssh key-pair nova-migration-ssh-key secret:

$ cd "$(mktemp -d)"
$ ssh-keygen -f ./id -t ecdsa-sha2-nistp521 -N ''
$ oc get secret nova-migration-ssh-key || oc create secret generic nova-migration-ssh-key \
  --from-file=ssh-privatekey=id \
  --from-file=ssh-publickey=id.pub \
  --type kubernetes.io/ssh-auth
$ rm -f id*
$ cd -

If TLS Everywhere is enabled, set LIBVIRT_PASSWORD to match the existing OSP deployment password:

declare -A TRIPLEO_PASSWORDS
TRIPLEO_PASSWORDS[default]="$HOME/overcloud-passwords.yaml"
LIBVIRT_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' LibvirtTLSPassword:' | awk -F ': ' '{ print $2; }')
LIBVIRT_PASSWORD_BASE64=$(echo -n "$LIBVIRT_PASSWORD" | base64)

Create libvirt-secret when TLS-e is enabled:

$ oc apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: libvirt-secret
type: Opaque
data:
  LibvirtPassword: ${LIBVIRT_PASSWORD_BASE64}
EOF

Create a configuration map to use for all cells to configure a local storage back end for libvirt:

$ oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: nova-cells-global-config
data:
  99-nova-compute-cells-workarounds.conf: |
    [workarounds]
    disable_compute_service_check_for_ffu=true
EOF

data provides the configuration files for all the cells.

99-nova-compute-cells-workarounds.conf: | specifies the index of the <*.conf> files. There is a requirement to index the <*.conf> files from 03 to 99, based on precedence. A <99-*.conf> file takes the highest precedence, while indexes below 03 are reserved for internal use.

If you adopt a live cloud, you might be required to carry over additional configurations for the default nova data plane services that are stored in the cell1 default nova-extra-config configuration map. Do not delete or overwrite the existing configuration in the cell1 default nova-extra-config configuration map that is assigned to nova. Overwriting the configuration can break the data place services that rely on specific contents of the nova-extra-config configuration map.

Configure a Ceph back end for libvirt:

$ oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: nova-cells-global-config
data:
  99-nova-compute-cells-workarounds.conf: |
    [workarounds]
    disable_compute_service_check_for_ffu=true
  03-ceph-nova.conf: |
    [libvirt]
    images_type=rbd
    images_rbd_pool=vms
    images_rbd_ceph_conf=/etc/ceph/ceph.conf
    images_rbd_glance_store_name=default_backend
    images_rbd_glance_copy_poll_interval=15
    images_rbd_glance_copy_timeout=600
    rbd_user=openstack
    rbd_secret_uuid=$CEPH_FSID
EOF

For Ceph environments with multi-cell configurations, you must name configuration maps and OpenStack data plane services similar to the following examples: nova-custom-ceph-cellX and nova-compute-extraconfig-cellX.

Create the data plane services for Compute service cells to enable pre-upgrade workarounds, and to configure the Compute services for your chosen storage back end:

for CELL in $(echo $RENAMED_CELLS); do
oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
 name: nova-$CELL
spec:
 dataSources:
   - secretRef:
       name: nova-$CELL-compute-config
   - secretRef:
       name: nova-migration-ssh-key
   - configMapRef:
       name: nova-cells-global-config
 playbook: osp.edpm.nova
 caCerts: combined-ca-bundle
 edpmServiceType: nova
 containerImageFields:
 - NovaComputeImage
 - EdpmIscsidImage
EOF
 done

spec.dataSources.secretRef specifies an additional auto-generated nova-cell<X>-metadata-neutron-config secret to enable a local metadata service for cell<X>. You should also set spec.nova.template.cellTemplates.cell<X>.metadataServiceTemplate.enable in the OpenStackControlPlane/openstack CR, as described in Adopting the Compute service. You can configure a single top-level metadata, or define the metadata per cell.
nova-$CELL-compute-config specifies the secret that auto-generates for each cell<X>. You must append the nova-cell<X>-compute-config for each custom OpenStackDataPlaneService CR that is related to the Compute service.

nova-migration-ssh-key specifies the secret that you must append for each custom OpenStackDataPlaneService CR that is related to the Compute service.

When creating your data plane services for Compute service cells, review the following considerations:

In this example, the same nova-migration-ssh-key key is shared across cells. However, you should use different keys for different cells.
For simple configuration overrides, you do not need a custom data plane service. However, to reconfigure the cell, cell1, the safest option is to create a custom service and a dedicated configuration map for it.
The cell, cell1, is already managed with the default OpenStackDataPlaneService CR called nova and its nova-extra-config configuration map. Do not change the default data plane service nova definition. The changes are lost when the RHOSO operator is updated with OLM.
When a cell spans multiple node sets, give the custom OpenStackDataPlaneService resources a name that relates to the node set, for example, nova-cell1-nfv and nova-cell1-enterprise. The auto-generated configuration maps are then named nova-cell1-nfv-extra-config and nova-cell1-enterprise-extra-config.
Different configurations for nodes in multiple node sets of the same cell are also supported, but are not covered in this guide.

If TLS Everywhere is enabled, append the following content to the OpenStackDataPlaneService CR:

  tlsCerts:
    contents:
      - dnsnames
      - ips
    networks:
      - ctlplane
    issuer: osp-rootca-issuer-internal
    edpmRoleServiceName: nova
  caCerts: combined-ca-bundle
  edpmServiceType: nova

You do not need to reference the subscription-manager secret in the dataSources field of the OpenStackDataPlaneService CR. The secret is already passed in with a node-specific OpenStackDataPlaneNodeSet CR in the ansibleVarsFrom property in the nodeTemplate field.

Create the data plane node set definitions for each cell:

$ declare -A names
$ for CELL in $(echo $RENAMED_CELLS); do
  ref="COMPUTES_$(echo ${CELL}|tr [:lower:] [:upper:])"
  eval names=\${!${ref}[@]}
  ref_api="COMPUTES_API_$(echo ${CELL}|tr [:lower:] [:upper:])"
  [ -z "$names" ] && continue
  ind=0
  rm -f computes-$CELL
  for compute in $names; do
    ip="${ref}[$compute]"
    ip_api="${ref_api}[$compute]"
    cat >> computes-$CELL << EOF
    ${compute}:
     hostName: $compute
      ansible:
        ansibleHost: $compute
     networks:
      - defaultRoute: true
        fixedIP: ${!ip}
        name: ctlplane
        subnetName: subnet1
      - name: internalapi
        subnetName: subnet1
        fixedIP: ${!ip_api}
      - name: storage
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1
EOF
    ind=$(( ind + 1 ))
  done

  test -f computes-$CELL || continue
  cat > nodeset-${CELL}.yaml <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
 name: openstack-$CELL
spec:
  tlsEnabled: false
  networkAttachments:
      - ctlplane
  preProvisioned: true
  services:
    - bootstrap
    - download-cache
    - configure-network
    - validate-network
    - install-os
    - configure-os
    - ssh-known-hosts
    - run-os
    - reboot-os
    - install-certs
    - ovn
    - neutron-metadata
    - libvirt
    - nova-$CELL
    - telemetry
  env:
    - name: ANSIBLE_CALLBACKS_ENABLED
      value: "profile_tasks"
    - name: ANSIBLE_FORCE_COLOR
      value: "True"
    - name: ANSIBLE_VERBOSITY
      value: 3
  nodeTemplate:
    ansibleSSHPrivateKeySecret: dataplane-adoption-secret
    ansible:
      ansibleUser: root
      ansibleVars:
        edpm_bootstrap_release_version_package: []
        # edpm_network_config
        # Default nic config template for a EDPM node
        # These vars are edpm_network_config role vars
        edpm_network_config_template: |
           ---
           {% set mtu_list = [ctlplane_mtu] %}
           {% for network in nodeset_networks %}
           {% set _ = mtu_list.append(lookup(vars, networks_lower[network] ~ _mtu)) %}
           {%- endfor %}
           {% set min_viable_mtu = mtu_list | max %}
           network_config:
           - type: ovs_bridge
             name: {{ neutron_physical_bridge_name }}
             mtu: {{ min_viable_mtu }}
             use_dhcp: false
             dns_servers: {{ ctlplane_dns_nameservers }}
             domain: {{ dns_search_domains }}
             addresses:
             - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
             routes: {{ ctlplane_host_routes }}
             members:
             - type: interface
               name: nic1
               mtu: {{ min_viable_mtu }}
               # force the MAC address of the bridge to this interface
               primary: true
           {% for network in nodeset_networks %}
             - type: vlan
               mtu: {{ lookup(vars, networks_lower[network] ~ _mtu) }}
               vlan_id: {{ lookup(vars, networks_lower[network] ~ _vlan_id) }}
               addresses:
               - ip_netmask:
                   {{ lookup(vars, networks_lower[network] ~ _ip) }}/{{ lookup(vars, networks_lower[network] ~ _cidr) }}
               routes: {{ lookup(vars, networks_lower[network] ~ _host_routes) }}
           {% endfor %}

        edpm_network_config_nmstate: false
        # Control resolv.conf management by NetworkManager
        # false = disable NetworkManager resolv.conf update (default)
        # true = enable NetworkManager resolv.conf update
        edpm_bootstrap_network_resolvconf_update: false
        edpm_network_config_hide_sensitive_logs: false
        #
        # These vars are for the network config templates themselves and are
        # considered EDPM network defaults.
        neutron_physical_bridge_name: br-ctlplane
        neutron_public_interface_name: eth0

        # edpm_nodes_validation
        edpm_nodes_validation_validate_controllers_icmp: false
        edpm_nodes_validation_validate_gateway_icmp: false

        # edpm ovn-controller configuration
        edpm_ovn_bridge_mappings: <bridge_mappings>
        edpm_ovn_bridge: br-int
        edpm_ovn_encap_type: geneve
        ovn_monitor_all: true
        edpm_ovn_remote_probe_interval: 60000
        edpm_ovn_ofctrl_wait_before_clear: 8000

        timesync_ntp_servers:
        - hostname: pool.ntp.org

        edpm_bootstrap_command: |
          # This is a hack to deploy RDO Delorean repos to RHEL as if it were Centos 9 Stream
          set -euxo pipefail
          curl -sL https://github.com/openstack-k8s-operators/repo-setup/archive/refs/heads/main.tar.gz | tar -xz
          python3 -m venv ./venv
          PBR_VERSION=0.0.0 ./venv/bin/pip install ./repo-setup-main
          # This is required for FIPS enabled until trunk.rdoproject.org
          # is not being served from a centos7 host, tracked by
          # https://issues.redhat.com/browse/RHOSZUUL-1517
          dnf -y install crypto-policies
          update-crypto-policies --set FIPS:NO-ENFORCE-EMS
          ./venv/bin/repo-setup current-podified -b antelope -d centos9 --stream
          dnf -y upgrade openstack-selinux
          rm -f /run/virtlogd.pid
          rm -rf repo-setup-main

        gather_facts: false
        # edpm firewall, change the allowed CIDR if needed
        edpm_sshd_configure_firewall: true
        edpm_sshd_allowed_ranges: [192.168.122.0/24]

        # Do not attempt OVS major upgrades here
        edpm_ovs_packages:
        - openvswitch3.3
        edpm_default_mounts:
          - path: /dev/hugepages<size>
            opts: pagesize=<size>
            fstype: hugetlbfs
            group: hugetlbfs
  nodes:
EOF
  cat computes-$CELL >> nodeset-${CELL}.yaml
done

${compute}.hostName specifies the FQDN for the node if your deployment has a custom DNS Domain.
${compute}.networks specifies the network composition. The network composition must match the source cloud configuration to avoid data plane connectivity downtime. The ctlplane network must come first. The commands only retain IP addresses for the hosts on the ctlplane and internalapi networks. Repeat this step for other isolated networks, or update the resulting files manually.
metadata.name: specifies the node set names for each cell, for example, openstack-cell1, openstack-cell2. Only create node sets for cells that contain Compute nodes.
spec.tlsEnabled specifies whether TLS Everywhere is enabled. If it is enabled, change tlsEnabled to true.
spec.services specifies the services to be adopted. If you are not adopting telemetry services, omit it from the services list.
neutron_physical_bridge_name: br-ctlplane specifies the bridge name. The bridge name and other OVN and Networking service-specific values must match the source cloud configuration to avoid data plane connectivity downtime. *edpm_ovn_bridge_mappings: <bridge_mappings> specifies the value of the bridge mappings in your configuration, for example, "datacentre:br-ctlplane".

path: /dev/hugepages<size> and opts: pagesize=<size> configures huge pages. Replace <size> with the size of the page. To configure multi-sized huge pages, create more items in the list. Note that the mount points must match the source cloud configuration.

Ensure that you use the same ovn-controller settings in the OpenStackDataPlaneNodeSet CR that you used in the Compute service nodes before adoption. This configuration is stored in the external_ids column in the Open_vSwitch table in the Open vSwitch database:

$ ovs-vsctl list Open .
...
external_ids        : {hostname=standalone.localdomain, ovn-bridge=br-int, ovn-bridge-mappings=<bridge_mappings>, ovn-chassis-mac-mappings="datacentre:1e:0a:bb:e6:7c:ad", ovn-encap-ip="172.19.0.100", ovn-encap-tos="0", ovn-encap-type=geneve, ovn-match-northd-version=False, ovn-monitor-all=True, ovn-ofctrl-wait-before-clear="8000", ovn-openflow-probe-interval="60", ovn-remote="tcp:ovsdbserver-sb.openstack.svc:6642", ovn-remote-probe-interval="60000", rundir="/var/run/openvswitch", system-id="2eec68e6-aa21-4c95-a868-31aeafc11736"}
...

Deploy the OpenStackDataPlaneNodeSet CRs for each Compute cell:

for CELL in $(echo $RENAMED_CELLS); do
test -f nodeset-${CELL}.yaml || continue
oc apply -f nodeset-${CELL}.yaml
done

If you use a Ceph back end for Block Storage service (cinder), prepare the adopted data plane workloads:

for CELL in $(echo $RENAMED_CELLS); do
test -f nodeset-${CELL}.yaml || continue
oc patch osdpns/openstack-$CELL --type=merge --patch "
spec:
  services:
    - bootstrap
    - download-cache
    - configure-network
    - validate-network
    - install-os
    - configure-os
    - ssh-known-hosts
    - run-os
    - reboot-os
    - ceph-client
    - install-certs
    - ovn
    - neutron-metadata
    - libvirt
    - nova-$CELL
    - telemetry
  nodeTemplate:
    extraMounts:
    - extraVolType: Ceph
      volumes:
      - name: ceph
        secret:
          secretName: ceph-conf-files
      mounts:
      - name: ceph
        mountPath: "/etc/ceph"
        readOnly: true
  "
done

Ensure that you use the same list of services from the original OpenStackDataPlaneNodeSet CR, except for the ceph-client and ceph-hci-pre services.

For environments that are enabled with border gateway protocol (BGP), you must add the following services to the list in the order shown:

After configure-network and before validate-network: Add frr service for FRRouting BGP support
After neutron-metadata and before libvirt: Add ovn-bgp-agent service for OVN BGP agent

You must also configure the following additional Ansible variables in the nodeTemplate.ansible.ansibleVars section:

edpm_frr_image: The FRRouting container image
edpm_ovn_bgp_agent_image: The OVN BGP agent container image
edpm_frr_bgp_ipv4_src_network: The network name for BGP IPv4 source (for example, bgpmainnet)
edpm_frr_bgp_ipv6_src_network: The network name for BGP IPv6 source (for example, bgpmainnetv6)
edpm_frr_bgp_neighbor_password: The BGP neighbor password
edpm_ovn_encap_ip: Set to the BGP main network IP (for example, {{ lookup("vars", "bgpmainnet_ip") }})

Optional: Enable neutron-sriov-nic-agent in the OpenStackDataPlaneNodeSet CR:

for CELL in $(echo $RENAMED_CELLS); do
test -f nodeset-${CELL}.yaml || continue
oc patch openstackdataplanenodeset openstack-$CELL --type='json' --patch='[
{
  "op": "add",
  "path": "/spec/services/-",
  "value": "neutron-sriov"
}, {
  "op": "add",
  "path": "/spec/nodeTemplate/ansible/ansibleVars/edpm_neutron_sriov_agent_SRIOV_NIC_physical_device_mappings",
  "value": "dummy_sriov_net:dummy-dev"
}, {
  "op": "add",
  "path": "/spec/nodeTemplate/ansible/ansibleVars/edpm_neutron_sriov_agent_SRIOV_NIC_resource_provider_bandwidths",
  "value": "dummy-dev:40000000:40000000"
}, {
  "op": "add",
  "path": "/spec/nodeTemplate/ansible/ansibleVars/edpm_neutron_sriov_agent_SRIOV_NIC_resource_provider_hypervisors",
  "value": "dummy-dev:standalone.localdomain"
}]'
done

Optional: Enable neutron-dhcp in the OpenStackDataPlaneNodeSet CR:

for CELL in $(echo $RENAMED_CELLS); do
test -f nodeset-${CELL}.yaml || continue
oc patch openstackdataplanenodeset openstack-$CELL --type='json' --patch='[
{
  "op": "add",
  "path": "/spec/services/-",
  "value": "neutron-dhcp"
}]'
done

To use neutron-dhcp with OVN for the Bare Metal Provisioning service (ironic), you must set the disable_ovn_dhcp_for_baremetal_ports configuration option for the Networking service (neutron) to true. You can set this configuration in the NeutronAPI spec:

..
spec:
  serviceUser: neutron
   ...
      customServiceConfig: |
          [DEFAULT]
          dhcp_agent_notification = True
          [ovn]
          disable_ovn_dhcp_for_baremetal_ports = true

Run the pre-adoption validation:

Create the validation service:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: pre-adoption-validation
spec:
  playbook: osp.edpm.pre_adoption_validation
EOF

Create a OpenStackDataPlaneDeployment CR that runs only the validation:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-pre-adoption
spec:
  nodeSets: $NODESETS
  servicesOverride:
  - pre-adoption-validation
EOF

If you created different migration SSH keys for different OpenStackDataPlaneService CRs, you should also define a separate OpenStackDataPlaneDeployment CR for each node set or node sets that represent a cell.

When the validation is finished, confirm that the status of the Ansible EE pods is Completed:

$ watch oc get pod -l app=openstackansibleee

$ oc logs -l app=openstackansibleee -f --max-log-requests 20

Wait for the deployment to reach the Ready status:

$ oc wait --for condition=Ready openstackdataplanedeployment/openstack-pre-adoption --timeout=10m

If any openstack-pre-adoption validations fail, you must reference the Ansible logs to determine which ones were unsuccessful, and then try the following troubleshooting options:

If the hostname validation failed, check that the hostname of the data plane node is correctly listed in the OpenStackDataPlaneNodeSet CR.
If the kernel argument check failed, ensure that the kernel argument configuration in the edpm_kernel_args and edpm_kernel_hugepages variables in the OpenStackDataPlaneNodeSet CR is the same as the kernel argument configuration that you used in the OpenStack (OSP) node.
If the tuned profile check failed, ensure that the edpm_tuned_profile variable in the OpenStackDataPlaneNodeSet CR is configured to use the same profile as the one set on the OSP node.

For environments that are enabled with border gateway protocol (BGP), preserve the default routes on the data plane nodes.

When adopting OpenStack environments with BGP, default routes can be lost when the data plane adoption procedure stops the OpenStack services, specifically when FRRouting (FRR) is stopped. This causes connectivity issues during the RHOSO data plane deployment.

To prevent this, configure the required routes by using os-net-config on the data plane nodes (Compute nodes and Networker nodes) affected by this issue.

Modify your os-net-config configuration file by adding the required routes, and then apply it:

$ sudo os-net-config -c /etc/os-net-config/modified_config_with_routes.yaml --provider ifcfg

This temporary default route is needed during the installation of the first services (such as download-cache) and is removed when the configure-network service applies the new network configuration.

Remove the remaining TripleO services:

Create an OpenStackDataPlaneService CR to clean up the data plane services you are adopting:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: tripleo-cleanup
spec:
  playbook: osp.edpm.tripleo_cleanup
EOF

Create the OpenStackDataPlaneDeployment CR to run the clean-up:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: tripleo-cleanup
spec:
  nodeSets: $NODESETS
  servicesOverride:
  - tripleo-cleanup
EOF

When the clean-up is finished, deploy the OpenStackDataPlaneDeployment CR:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack
spec:
  nodeSets: $NODESETS
EOF

If you have other node sets to deploy, such as Networker nodes, you can add them in the nodeSets list in this step, or create separate OpenStackDataPlaneDeployment CRs later. You cannot add new node sets to an OpenStackDataPlaneDeployment CR after deployment.

Verification

Confirm that all the Ansible EE pods reach a Completed status:

$ watch oc get pod -l app=openstackansibleee

$ oc logs -l app=openstackansibleee -f --max-log-requests 20

Wait for the data plane node sets to reach the Ready status:

for CELL in $(echo $RENAMED_CELLS); do
oc wait --for condition=Ready osdpns/openstack-$CELL --timeout=30m
done

Verify that the Networking service (neutron) agents are running:

$ oc exec openstackclient -- openstack network agent list
+--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
| ID                                   | Agent Type                   | Host                   | Availability Zone | Alive | State | Binary                     |
+--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
| 174fc099-5cc9-4348-b8fc-59ed44fcfb0e | DHCP agent                   | standalone.localdomain | nova              | :-)   | UP    | neutron-dhcp-agent         |
| 10482583-2130-5b0d-958f-3430da21b929 | OVN Metadata agent           | standalone.localdomain |                   | :-)   | UP    | neutron-ovn-metadata-agent |
| a4f1b584-16f1-4937-b2b0-28102a3f6eaa | OVN Controller agent         | standalone.localdomain |                   | :-)   | UP    | ovn-controller             |
+--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+

After you remove all the services from the TripleO cell controllers, you can decomission the cell controllers. To create new cell Compute nodes, you re-provision the decomissioned controllers as new data plane hosts and add them to the node sets of corresponding or new cells.

Next steps

You must perform a fast-forward upgrade on your Compute services. For more information, see Performing a fast-forward upgrade on Compute services.

5.3. Configuring data plane node sets for DCN sites

If you are adopting a Distributed Compute Node (DCN) deployment, you must create separate OpenStackDataPlaneNodeSet custom resources (CRs) for each site. Each node set requires site-specific configuration for network subnets, OVN bridge mappings, and inter-site routes.

Prerequisites

You have adopted the OpenStack (OSP) control plane to Red Hat OpenStack Services on OpenShift (RHOSO).
You have configured control plane networking for your spine-leaf topology, including multi-subnet NetConfig and NetworkAttachmentDefinition CRs with routes to remote sites. For more information, see Configuring control plane networking for spine-leaf topologies.
You have the network configuration information for each DCN site:
- IP addresses and hostnames for all Compute nodes
- VLAN IDs for each service network
- Gateway addresses for inter-site routing
You have identified the OVN bridge mappings (physnets) for each site.

Procedure

Define the OVN bridge mappings for each site. Each site requires a unique physnet that maps to the local provider network bridge:

Table 5. Example OVN bridge mappings

Site OVN bridge mapping

Central

leaf0:br-ex

DCN1

leaf1:br-ex

DCN2

leaf2:br-ex

Configure OVN for DCN sites. The default OVN controller configuration uses the Kubernetes ClusterIP (ovsdbserver-sb.openstack.svc), which is not routable from remote DCN sites. You must create a DCN-specific configuration that uses direct internalapi IP addresses.

Get the OVN Southbound database internalapi IP addresses:

$ oc get pod -l service=ovsdbserver-sb -o jsonpath='{range .items[*]}{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}{"\n"}{end}' | jq -r '.[] | select(.name=="openstack/internalapi") | .ips[0]'

Example output:

172.17.0.34
172.17.0.35
172.17.0.36

Create a ConfigMap with the OVN SB direct IPs for DCN sites:

$ oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: ovncontroller-config-dcn
  namespace: openstack
data:
  ovsdb-config: |
    ovn-remote: tcp:172.17.0.34:6642,tcp:172.17.0.35:6642,tcp:172.17.0.36:6642
EOF

Replace the IP addresses with the actual internalapi IPs from the previous step.

Create an OpenStackDataPlaneService CR for DCN OVN configuration:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: ovn-dcn
  namespace: openstack
spec:
  addCertMounts: false
  caCerts: combined-ca-bundle
  containerImageFields:
  - OvnControllerImage
  dataSources:
  - configMapRef:
      name: ovncontroller-config-dcn
  edpmServiceType: ovn
  playbook: osp.edpm.ovn
  tlsCerts:
    default:
      contents:
      - dnsnames
      - ips
      issuer: osp-rootca-issuer-ovn
      keyUsages:
      - digital signature
      - key encipherment
      - server auth
      - client auth
      networks:
      - ctlplane
EOF

The ovn-dcn service uses the ovncontroller-config-dcn ConfigMap (through dataSources), which contains the direct internalapi IPs instead of the ClusterIP. DCN node sets must use this service instead of the default ovn service.

Create an OpenStackDataPlaneNodeSet CR for the central site Compute nodes:

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm
spec:
  tlsEnabled: false
  networkAttachments:
    - ctlplane
  preProvisioned: true
  services:
    - bootstrap
    - download-cache
    - configure-network
    - validate-network
    - install-os
    - configure-os
    - ssh-known-hosts
    - run-os
    - reboot-os
    - install-certs
    - ovn
    - neutron-metadata
    - libvirt
    - nova-cell1
    - telemetry
  nodeTemplate:
    ansibleSSHPrivateKeySecret: dataplane-adoption-secret
    ansible:
      ansibleUser: root
      ansibleVars:
        edpm_ovn_bridge_mappings: ["leaf0:br-ex"]
        edpm_ovn_bridge: br-int
        edpm_ovn_encap_type: geneve
        # Network configuration template for central site
        edpm_network_config_template: |
          ---
          network_config:
          - type: ovs_bridge
            name: {{ neutron_physical_bridge_name }}
            use_dhcp: false
            dns_servers: {{ ctlplane_dns_nameservers }}
            addresses:
            - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
            routes: {{ ctlplane_host_routes }}
            members:
            - type: interface
              name: nic1
              primary: true
          {% for network in nodeset_networks %}
            - type: vlan
              vlan_id: {{ lookup(vars, networks_lower[network] ~ _vlan_id) }}
              addresses:
              - ip_netmask:
                  {{ lookup(vars, networks_lower[network] ~ _ip) }}/{{ lookup(vars, networks_lower[network] ~ _cidr) }}
              routes: {{ lookup(vars, networks_lower[network] ~ _host_routes) }}
          {% endfor %}
  nodes:
    compute-0:
      hostName: compute-0.example.com
      ansible:
        ansibleHost: compute-0.example.com
      networks:
      - defaultRoute: true
        fixedIP: 192.168.122.100
        name: ctlplane
        subnetName: subnet1
      - name: internalapi
        subnetName: subnet1
      - name: storage
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1

The OVN bridge mapping uses the central site physnet leaf0.
Central site nodes reference subnet1 for all networks.

Create an OpenStackDataPlaneNodeSet CR for DCN1 edge site compute nodes. You must add inter-site routes to the network configuration template and use the ovn-dcn service:

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-edpm-dcn1
spec:
  tlsEnabled: false
  networkAttachments:
    - ctlplane
  preProvisioned: true
  services:
    - redhat
    - bootstrap
    - download-cache
    - configure-network
    - validate-network
    - install-os
    - configure-os
    - ssh-known-hosts
    - run-os
    - reboot-os
    - install-certs
    - ovn-dcn
    - neutron-metadata
    - libvirt
    - nova-cell1
    - telemetry
  nodeTemplate:
    ansibleSSHPrivateKeySecret: dataplane-adoption-secret
    ansible:
      ansibleUser: root
      ansibleVars:
        edpm_ovn_bridge_mappings: ["leaf1:br-ex"]
        edpm_ovn_bridge: br-int
        edpm_ovn_encap_type: geneve
        # Network configuration template for DCN1 site with inter-site routes
        edpm_network_config_template: |
          ---
          network_config:
          - type: ovs_bridge
            name: {{ neutron_physical_bridge_name }}
            use_dhcp: false
            dns_servers: {{ ctlplane_dns_nameservers }}
            addresses:
            - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
            routes:  (3)
              {{ ctlplane_host_routes }}
              - ip_netmask: 192.168.122.0/24
                next_hop: 192.168.133.1
              - ip_netmask: 192.168.144.0/24
                next_hop: 192.168.133.1
            members:
            - type: interface
              name: nic1
              primary: true
          {% for network in nodeset_networks %}
            - type: vlan
              vlan_id: {{ lookup(vars, networks_lower[network] ~ _vlan_id) }}
              addresses:
              - ip_netmask:
                  {{ lookup(vars, networks_lower[network] ~ _ip) }}/{{ lookup(vars, networks_lower[network] ~ _cidr) }}
              routes:
                {{ lookup(vars, networks_lower[network] ~ _host_routes) }}
                {% if network == internalapi %}
                - ip_netmask: 172.17.0.0/24
                  next_hop: 172.17.10.1
                - ip_netmask: 172.17.20.0/24
                  next_hop: 172.17.10.1
                {% endif %}
                {% if network == storage %}
                - ip_netmask: 172.18.0.0/24
                  next_hop: 172.18.10.1
                - ip_netmask: 172.18.20.0/24
                  next_hop: 172.18.10.1
                {% endif %}
                {% if network == tenant %}
                - ip_netmask: 172.19.0.0/24
                  next_hop: 172.19.10.1
                - ip_netmask: 172.19.20.0/24
                  next_hop: 172.19.10.1
                {% endif %}
          {% endfor %}
  nodes:
    dcn1-compute-0:
      hostName: dcn1-compute-0.example.com
      ansible:
        ansibleHost: dcn1-compute-0.example.com
      networks:
      - defaultRoute: true
        fixedIP: 192.168.133.100
        name: ctlplane
        subnetName: ctlplanedcn1
      - name: internalapi
        subnetName: internalapidcn1
      - name: storage
        subnetName: storagedcn1
      - name: tenant
        subnetName: tenantdcn1

Replace ovn with ovn-dcn under spec:services. This ensures OVN controller connects to the OVN Southbound database using direct internalapi IPs instead of the unreachable ClusterIP.
DCN1 uses the leaf1 physnet, for its OVN bridge mapping under spec:nodeTemplate:ansible:ansibleVars:edpm_ovn_bridge_mappings.
Inter-site routes must be added to the network configuration template. These routes enable DCN1 compute nodes to reach the central site (192.168.122.0/24) and other DCN sites (192.168.144.0/24 for DCN2). Similar routes are added for each service network (internalapi, storage, tenant).
DCN1 nodes reference site-specific subnet names like ctlplanedcn1 and internalapidcn1. These subnet names must match those defined in the NetConfig CR.

Repeat step 3 for all other DCN sites. Adjust site specific parameters:
- The nodeset name, for example: openstack-edpm-dcn2
- The OVN bridge mapping, for example: leaf2:br-ex
- The subnet names, for example: ctlplanedcn2, and internalapidcn2
- The inter-site routes. The routes from DCN2 should point to the central site subnets and the DCN1 site subnets.
- The compute node definitions with site-appropriate IP addresses.

Deploy all nodesets by creating an OpenStackDataPlaneDeployment CR:

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-edpm-deployment
spec:
  nodeSets:
    - openstack-edpm
    - openstack-edpm-dcn1
    - openstack-edpm-dcn2

All nodesets can be deployed in parallel once the control plane adoption is complete.

Wait for the deployment to complete:

$ oc wait --for condition=Ready openstackdataplanedeployment/openstack-edpm-deployment --timeout=40m

Verification

Verify that all node sets reach the Ready status:

$ oc get openstackdataplanenodeset
NAME                  STATUS   MESSAGE
openstack-edpm        True     Ready
openstack-edpm-dcn1   True     Ready
openstack-edpm-dcn2   True     Ready

Verify that Compute services are running across all sites. Ensure that all nova-compute services show State=up for nodes in all availability zones:
```
$ oc exec openstackclient -- openstack compute service list
```

Verify inter-site connectivity by checking routes on a DCN Compute node:

$ ssh dcn1-compute-0 ip route show | grep 172.17.0
172.17.0.0/24 via 172.17.10.1 dev internalapi

Test that DCN Compute nodes can reach the control plane:
```
$ ssh dcn1-compute-0 ping -c 3 172.17.0.30
```
Replace 172.17.0.30 with an IP address of a control plane service on the internalapi network.

5.4. Performing a fast-forward upgrade on Compute services

You must upgrade the Compute services from OpenStack to Red Hat OpenStack Services on OpenShift (RHOSO) Antelope on the control plane and data plane by completing the following tasks:

Update the cell1 Compute data plane services version.
Remove pre-fast-forward upgrade workarounds from the Compute control plane services and Compute data plane services.
Run Compute database online migrations to update live data.

Prerequisites

Define the shell variables necessary to apply the fast-forward upgrade commands for each Compute service cell.

DEFAULT_CELL_NAME="cell1"
RENAMED_CELLS="$DEFAULT_CELL_NAME"

declare -A PODIFIED_DB_ROOT_PASSWORD
for CELL in $(echo "super $RENAMED_CELLS"); do
  PODIFIED_DB_ROOT_PASSWORD[$CELL]=$(oc get -o json secret/osp-secret | jq -r .data.DbRootPassword | base64 -d)
done

Complete the steps in Adopting Compute services to the RHOSO data plane.

Procedure

Wait for the Compute service data plane services version to update for all the cells:

for CELL in $(echo $RENAMED_CELLS); do
oc exec openstack-$CELL-galera-0 -c galera -- mysql -rs -uroot -p"{PODIFIED_DB_ROOT_PASSWORD[$CELL]}" \
  -e "select a.version from nova_${CELL}.services a join nova_${CELL}.services b where a.version!=b.version and a.binary='nova-compute' and a.deleted=0;"
done

The query returns an empty result when the update is completed. No downtime is expected for virtual machine (VM) workloads.

Review any errors in the nova Compute agent logs on the data plane, and the nova-conductor journal records on the control plane.

Patch the OpenStackControlPlane CR to remove the pre-fast-forward upgrade workarounds from the Compute control plane services:

$ rm -f celltemplates
$ for CELL in $(echo $RENAMED_CELLS); do
$ cat >> celltemplates << EOF
        ${CELL}:
          metadataServiceTemplate:
            customServiceConfig: |
              [workarounds]
              disable_compute_service_check_for_ffu=false
          conductorServiceTemplate:
            customServiceConfig: |
              [workarounds]
              disable_compute_service_check_for_ffu=false
EOF
done

$ cat > oscp-patch.yaml << EOF
spec:
  nova:
    template:
      apiServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
      metadataServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
      schedulerServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
      cellTemplates:
        cell0:
          conductorServiceTemplate:
            customServiceConfig: |
              [workarounds]
              disable_compute_service_check_for_ffu=false
EOF
$ cat celltemplates >> oscp-patch.yaml

If you are adopting the Compute service with the Bare Metal Provisioning service (ironic), append the following novaComputeTemplates in the cell<X> section of the Compute service CR patch:

        cell<X>:
          novaComputeTemplates:
            <hostname>:
              customServiceConfig: |
                [DEFAULT]
                host = <hostname>
                [workarounds]
                disable_compute_service_check_for_ffu=true
              computeDriver: ironic.IronicDriver
        ...

where:

<hostname>: Specifies the hostname of the node that is running the ironic Compute driver in the source cloud of cell<X>.

Apply the patch file:

$ oc patch openstackcontrolplane openstack --type=merge --patch-file=oscp-patch.yaml

Wait until the Compute control plane services CRs are ready:
```
$ oc wait --for condition=Ready --timeout=300s Nova/nova
```

Remove the pre-fast-forward upgrade workarounds from the Compute data plane services:

$ oc patch cm nova-cells-global-config --type=json -p='[{"op": "replace", "path": "/data/99-nova-compute-cells-workarounds.conf", "value": "[workarounds]\n"}]'
$ for CELL in $(echo $RENAMED_CELLS); do
$ oc get Openstackdataplanenodeset openstack-${CELL} || continue
$ oc apply -f - <<EOF
---
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-nova-compute-ffu-$CELL
spec:
  nodeSets:
    - openstack-${CELL}
  servicesOverride:
    - nova-${CELL}
backoffLimit: 3
EOF
done

Wait for the Compute data plane services to be ready for all the cells:

$ oc wait --for condition=Ready openstackdataplanedeployments --all --timeout=5m

Run Compute database online migrations to complete the upgrade:

$ oc exec -it nova-cell0-conductor-0 -- nova-manage db online_data_migrations
$ for CELL in $(echo $RENAMED_CELLS); do
$ oc exec -it nova-${CELL}-conductor-0 -- nova-manage db online_data_migrations
done

Discover the Compute hosts in the cells:

$ oc rsh nova-cell0-conductor-0 nova-manage cell_v2 discover_hosts --verbose

If you have a test VM that is not a production workload, complete the following verification steps:

Verify if the existing test VM instance is running:

${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test 2>&1 || echo FAIL

Verify if the Compute services can stop the existing test VM instance:

${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test ACTIVE" && ${BASH_ALIASES[openstack]} server stop test || echo PASS
${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test SHUTOFF" || echo FAIL
${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test 2>&1 || echo PASS

Verify if the Compute services can start the existing test VM instance:

${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test SHUTOFF" && ${BASH_ALIASES[openstack]} server start test || echo PASS
${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test ACTIVE" && \
  ${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test --fit-width -f json | jq -r '.state' | grep running || echo FAIL

Next steps

After the data plane adoption, the Compute hosts continue to run Red Hat Enterprise Linux (RHEL) 9.2. To take advantage of RHEL 9.4, perform a minor update procedure after finishing the adoption procedure.

5.5. Adopting Networker services to the RHOSO data plane

Adopt the Networker services in your existing OpenStack deployment to the Red Hat OpenStack Services on OpenShift (RHOSO) data plane. The Networker services could be running on Conroller nodes or dedicated Networker nodes. You decide which services you want to run on the Networker nodes, and create a separate OpenStackDataPlaneNodeSet custom resource (CR) for the Networker nodes. You might also decide to implement the following options if they apply to your environment:

Depending on your topology, you might need to run the neutron-metadata service on the nodes, specifically when you want to serve metadata to SR-IOV ports that are hosted on Compute nodes.
If you want to continue running OVN gateway services on Networker nodes, keep ovn service in the list to deploy.
Optional: You can run the neutron-dhcp service on your Networker nodes instead of your Compute nodes. You might not need to use neutron-dhcp with OVN, unless your deployment uses DHCP relays, or advanced DHCP options that are supported by dnsmasq but not by the OVN DHCP implementation.

Adopt each Controller or Networker node in your existing OpenStack deployment to the Red Hat OpenStack Services on OpenShift (RHOSO) when your node is set as an OVN chassis gateway. Any node with parameter set to enable-chassis-as-gw is considered OVN gateway chassis. In this case, such nodes will become edpm networker nodes after adoption.

Check for the nodes where OVN Controller Gateway agent agents are running. The list of agents varies depending on the services you enabled:

$ oc exec openstackclient -- openstack network agent list
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+
| ID                                   | Agent Type                   | Host                     | Availability Zone | Alive | State | Binary                     |
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+
| e5075ee0-9dd9-4f0a-a42a-6bbdf1a6111c | OVN Controller Gateway agent | controller-0.localdomain |                   | XXX   | UP    | ovn-controller             |
| f3112349-054c-403a-b00a-e219238192b8 | OVN Controller agent         | compute-0.localdomain    |                   | XXX   | UP    | ovn-controller             |
| af9dae2d-1c1c-55a8-a743-f84719f6406d | OVN Metadata agent           | compute-0.localdomain    |                   | XXX   | UP    | neutron-ovn-metadata-agent |
| 51a11df8-a66e-47a2-aec0-52eb8589626c | OVN Controller Gateway agent | controller-1.localdomain |                   | XXX   | UP    | ovn-controller             |
| bb817e5e-7832-410a-9e67-934dac8c602f | OVN Controller Gateway agent | controller-2.localdomain |                   | XXX   | UP    | ovn-controller             |
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+

Prerequisites

Define the shell variable. Based on above agent list output, controller-0, controller-1, controller-2 are our target hosts. If you have both Controller and Networker nodes running networker services then add all those hosts below.
```
declare -A networkers
networkers+=(
  ["controller-0.localdomain"]="192.168.122.100"
  ["controller-1.localdomain"]="192.168.122.101"
  ["controller-2.localdomain"]="192.168.122.102"
  # ...
)
```
- Replace ["<node-name>"]="192.168.122.100" with the name and IP address of the corresponding Networker or Controller node as per your environment.

Procedure

Deploy the OpenStackDataPlaneNodeSet CR for your nodes:

You can reuse most of the nodeTemplate section from the OpenStackDataPlaneNodeSet CR that is designated for your Compute nodes. You can omit some of the variables because of the limited set of services that are running on the Networker nodes.

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-networker
spec:
  tlsEnabled: false
  networkAttachments:
      - ctlplane
  preProvisioned: true
  services:
    - bootstrap
    - download-cache
    - configure-network
    - validate-network
    - install-os
    - configure-os
    - ssh-known-hosts
    - run-os
    - install-certs
    - ovn
  env:
    - name: ANSIBLE_CALLBACKS_ENABLED
      value: "profile_tasks"
    - name: ANSIBLE_FORCE_COLOR
      value: "True"
  nodes:
    controller-0:
      hostName: controller-0
      ansible:
        ansibleHost: ${networkers[controller-0.localdomain]}
      networks:
      - defaultRoute: true
        fixedIP: ${networkers[controller-0.localdomain]}
        name: ctlplane
        subnetName: subnet1
      - name: internalapi
        subnetName: subnet1
      - name: storage
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1
    controller-1:
      hostName: controller-1
      ansible:
        ansibleHost: ${networkers[controller-1.localdomain]}
      networks:
      - defaultRoute: true
        fixedIP: ${networkers[controller-1.localdomain]}
        name: ctlplane
        subnetName: subnet1
      - name: internalapi
        subnetName: subnet1
      - name: storage
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1
    controller-2:
      hostName: controller-2
      ansible:
        ansibleHost: ${networkers[controller-2.localdomain]}
      networks:
      - defaultRoute: true
        fixedIP: ${networkers[controller-2.localdomain]}
        name: ctlplane
        subnetName: subnet1
      - name: internalapi
        subnetName: subnet1
      - name: storage
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1
  nodeTemplate:
    ansibleSSHPrivateKeySecret: dataplane-adoption-secret
    ansible:
      ansibleUser: root
      ansibleVars:
        edpm_bootstrap_release_version_package: []
        # edpm_network_config
        # Default nic config template for a EDPM node
        # These vars are edpm_network_config role vars
        edpm_network_config_template: |
           ---
           {% set mtu_list = [ctlplane_mtu] %}
           {% for network in nodeset_networks %}
           {% set _ = mtu_list.append(lookup(vars, networks_lower[network] ~ _mtu)) %}
           {%- endfor %}
           {% set min_viable_mtu = mtu_list | max %}
           network_config:
           - type: ovs_bridge
             name: {{ neutron_physical_bridge_name }}
             mtu: {{ min_viable_mtu }}
             use_dhcp: false
             dns_servers: {{ ctlplane_dns_nameservers }}
             domain: {{ dns_search_domains }}
             addresses:
             - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_cidr }}
             routes: {{ ctlplane_host_routes }}
             members:
             - type: interface
               name: nic1
               mtu: {{ min_viable_mtu }}
               # force the MAC address of the bridge to this interface
               primary: true
           {% for network in nodeset_networks %}
             - type: vlan
               mtu: {{ lookup(vars, networks_lower[network] ~ _mtu) }}
               vlan_id: {{ lookup(vars, networks_lower[network] ~ _vlan_id) }}
               addresses:
               - ip_netmask:
                   {{ lookup(vars, networks_lower[network] ~ _ip) }}/{{ lookup(vars, networks_lower[network] ~ _cidr) }}
               routes: {{ lookup(vars, networks_lower[network] ~ _host_routes) }}
           {% endfor %}
        edpm_network_config_nmstate: false
        edpm_network_config_hide_sensitive_logs: false
        #
        # These vars are for the network config templates themselves and are
        # considered EDPM network defaults.
        neutron_physical_bridge_name: br-ctlplane
        neutron_public_interface_name: eth0

        # edpm_nodes_validation
        edpm_nodes_validation_validate_controllers_icmp: false
        edpm_nodes_validation_validate_gateway_icmp: false

        # edpm ovn-controller configuration
        edpm_ovn_bridge_mappings: <bridge_mappings>
        edpm_ovn_bridge: br-int
        edpm_ovn_encap_type: geneve
        ovn_monitor_all: true
        edpm_ovn_remote_probe_interval: 60000
        edpm_ovn_ofctrl_wait_before_clear: 8000

        # serve as a OVN gateway
        edpm_enable_chassis_gw: true

        timesync_ntp_servers:
        - hostname: pool.ntp.org

        edpm_bootstrap_command: |
          # This is a hack to deploy RDO Delorean repos to RHEL as if it were Centos 9 Stream
          set -euxo pipefail
          curl -sL https://github.com/openstack-k8s-operators/repo-setup/archive/refs/heads/main.tar.gz | tar -xz
          python3 -m venv ./venv
          PBR_VERSION=0.0.0 ./venv/bin/pip install ./repo-setup-main
          # This is required for FIPS enabled until trunk.rdoproject.org
          # is not being served from a centos7 host, tracked by
          # https://issues.redhat.com/browse/RHOSZUUL-1517
          dnf -y install crypto-policies
          update-crypto-policies --set FIPS:NO-ENFORCE-EMS
          ./venv/bin/repo-setup current-podified -b antelope -d centos9 --stream
          rm -rf repo-setup-main

        gather_facts: false
        enable_debug: false
        # edpm firewall, change the allowed CIDR if needed
        edpm_sshd_configure_firewall: true
        edpm_sshd_allowed_ranges: [192.168.122.0/24]
        # SELinux module
        edpm_selinux_mode: enforcing

        # Do not attempt OVS major upgrades here
        edpm_ovs_packages:
        - openvswitch3.3
EOF

spec.tlsEnabled specifies whether TLS Everywhere is enabled. If TLS is enabled, change spec:tlsEnabled to true.
edpm_ovn_bridge_mappings: <bridge_mappings> specifies the bridge mapping values that you used in your OpenStack deployment.

edpm_enable_chassis_gw specifies whether to run ovn-controller in gateway mode.

For environments that are enabled with border gateway protocol (BGP), preserve the default routes on the data plane nodes.

To prevent this, configure the required routes by using os-net-config on the data plane nodes (Compute nodes and Networker nodes) affected by this issue.

Modify your os-net-config configuration file by adding the required routes, and then apply it:

$ sudo os-net-config -c /etc/os-net-config/modified_config_with_routes.yaml --provider ifcfg

For environments that are enabled with border gateway protocol (BGP), you must add the following services to the services list in the order shown:

After configure-network and before validate-network: Add frr service for FRRouting BGP support
After ovn and neutron-metadata services: Add ovn-bgp-agent service

You must also configure the following additional Ansible variables in the nodeTemplate.ansible.ansibleVars section:

edpm_frr_image: The FRRouting container image
edpm_ovn_bgp_agent_image: The OVN BGP agent container image
edpm_frr_bgp_ipv4_src_network: The network name for BGP IPv4 source (for example, bgpmainnet)
edpm_frr_bgp_ipv6_src_network: The network name for BGP IPv6 source (for example, bgpmainnetv6)
edpm_frr_bgp_neighbor_password: The BGP neighbor password
edpm_ovn_encap_ip: Set to the BGP main network IP (for example, {{ lookup("vars", "bgpmainnet_ip") }})

Ensure that you use the same ovn-controller settings in the OpenStackDataPlaneNodeSet CR that you used in the Networker nodes before adoption. This configuration is stored in the external_ids column in the Open_vSwitch table in the Open vSwitch database:

ovs-vsctl list Open .
...
external_ids        : {hostname=controller-0.localdomain, ovn-bridge=br-int, ovn-bridge-mappings=<bridge_mappings>, ovn-chassis-mac-mappings="datacentre:1e:0a:bb:e6:7c:ad", ovn-cms-options=enable-chassis-as-gw, ovn-encap-ip="172.19.0.100", ovn-encap-tos="0", ovn-encap-type=geneve, ovn-match-northd-version=False, ovn-monitor-all=True, ovn-ofctrl-wait-before-clear="8000", ovn-openflow-probe-interval="60", ovn-remote="tcp:ovsdbserver-sb.openstack.svc:6642", ovn-remote-probe-interval="60000", rundir="/var/run/openvswitch", system-id="2eec68e6-aa21-4c95-a868-31aeafc11736"}
...

Replace <bridge_mappings> with the value of the bridge mappings in your configuration, for example, "datacentre:br-ctlplane".

Optional: Enable neutron-metadata in the OpenStackDataPlaneNodeSet CR:
```
$ oc patch openstackdataplanenodeset <networker_CR_name> --type='json' --patch='[
  {
    "op": "add",
    "path": "/spec/services/-",
    "value": "neutron-metadata"
  }]'
```
- Replace <networker_CR_name> with the name of the CR that you deployed for your Networker nodes, for example, openstack-networker.

Optional: Enable neutron-dhcp in the OpenStackDataPlaneNodeSet CR:

$ oc patch openstackdataplanenodeset <networker_CR_name> --type='json' --patch='[
  {
    "op": "add",
    "path": "/spec/services/-",
    "value": "neutron-dhcp"
  }]'

Run the pre-adoption-validation service for Networker nodes:

Create a OpenStackDataPlaneDeployment CR that runs only the validation:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-pre-adoption-networker
spec:
  nodeSets:
  - openstack-networker
  servicesOverride:
  - pre-adoption-validation
EOF

When the validation is finished, confirm that the status of the Ansible EE pods is Completed:

$ watch oc get pod -l app=openstackansibleee

$ oc logs -l app=openstackansibleee -f --max-log-requests 20

Wait for the deployment to reach the Ready status:

$ oc wait --for condition=Ready openstackdataplanedeployment/openstack-pre-adoption-networker --timeout=10m

Deploy the OpenStackDataPlaneDeployment CR for Networker nodes:

$ oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-networker
spec:
  nodeSets:
  - openstack-networker
EOF

Alternatively, you can include the Networker node set in the nodeSets list before you deploy the main OpenStackDataPlaneDeployment CR. You cannot add new node sets to the OpenStackDataPlaneDeployment CR after deployment.

Clean up any Networking service (neutron) agents that are no longer running.

In some cases, agents from the old data plane that are replaced or retired remain in RHOSO. The function these agents provided might be provided by a new agent that is running in RHOSO, or the function might be replaced by other components. For example, DHCP agents might no longer be needed, since OVN DHCP in RHOSO can provide this function.

List the agents:

$ oc exec openstackclient -- openstack network agent list
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+
| ID                                   | Agent Type                   | Host                     | Availability Zone | Alive | State | Binary                     |
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+
| e5075ee0-9dd9-4f0a-a42a-6bbdf1a6111c | OVN Controller Gateway agent | controller-0.localdomain |                   | :-)   | UP    | ovn-controller             |
| 856960f0-5530-46c7-a331-6eadcba362da | DHCP agent                   | controller-1.localdomain | nova              | XXX   | UP    | neutron-dhcp-agent         |
| 8bd22720-789f-45b8-8d7d-006dee862bf9 | DHCP agent                   | controller-2.localdomain | nova              | XXX   | UP    | neutron-dhcp-agent         |
| e584e00d-be4c-4e98-a11a-4ecd87d21be7 | DHCP agent                   | controller-0.localdomain | nova              | XXX   | UP    | neutron-dhcp-agent         |
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+

If any agent in the list shows XXX in the Alive field, verify the Host and Agent Type, if the functions of this agent is no longer required, and the agent has been permanently stopped on the OpenStack host. Then, delete the agent:
```
$ oc exec openstackclient -- openstack network agent delete <agent_id>
```
- Replace <agent_id> with the ID of the agent to delete, for example, 856960f0-5530-46c7-a331-6eadcba362da.

Verification

Confirm that all the Ansible EE pods reach a Completed status:

$ watch oc get pod -l app=openstackansibleee

$ oc logs -l app=openstackansibleee -f --max-log-requests 20

Wait for the data plane node set to reach the Ready status:
```
$ oc wait --for condition=Ready osdpns/<networker_CR_name> --timeout=30m
```
- Replace <networker_CR_name> with the name of the CR that you deployed for your Networker nodes, for example, openstack-networker.

Verify that the Networking service (neutron) agents are running. The list of agents varies depending on the services you enabled:

$ oc exec openstackclient -- openstack network agent list
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+
| ID                                   | Agent Type                   | Host                     | Availability Zone | Alive | State | Binary                     |
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+
| e5075ee0-9dd9-4f0a-a42a-6bbdf1a6111c | OVN Controller Gateway agent | controller-0.localdomain |                   | :-)   | UP    | ovn-controller             |
| f3112349-054c-403a-b00a-e219238192b8 | OVN Controller agent         | compute-0.localdomain    |                   | :-)   | UP    | ovn-controller             |
| af9dae2d-1c1c-55a8-a743-f84719f6406d | OVN Metadata agent           | compute-0.localdomain    |                   | :-)   | UP    | neutron-ovn-metadata-agent |
| 51a11df8-a66e-47a2-aec0-52eb8589626c | OVN Controller Gateway agent | controller-1.localdomain |                   | :-)   | UP    | ovn-controller             |
| bb817e5e-7832-410a-9e67-934dac8c602f | OVN Controller Gateway agent | controller-2.localdomain |                   | :-)   | UP    | ovn-controller             |
+--------------------------------------+------------------------------+--------------------------+-------------------+-------+-------+----------------------------+

5.6. Enabling the high availability for Compute instances service

To enable the high availability for Compute instances (Instance HA) service, you create the following resources:

Fencing secret.
Configuration map. You can create the configuration map manually, or the configuration map is created automatically when you deploy the Instance HA resource. However, you must create the configuration map manually if you want to disable the Instance HA service.
Instance HA resource.

Prerequisites

You have created the fencing-secret.yaml configuration file. For more information, see Maintaining the Instance HA functionality after adoption.
You have disabled Pacemaker on your Compute nodes. For more information, see Preventing Pacemaker from monitoring Compute nodes.

Procedure

Create the secret:

$ oc apply -f fencing-secret.yaml -n openstack

Optional: Create the Instance HA configuration map and set the DISABLED parameter to false. For example:

$ cat << EOF > iha-cm.yaml
kind: ConfigMap
metadata:
  name: instanceha-0-config
  namespace: openstack
apiVersion: v1
data:
  config.yaml: |
    config:
      EVACUABLE_TAG: "evacuable"
      TAGGED_IMAGES: "true"
      TAGGED_FLAVORS: "true"
      TAGGED_AGGREGATES: "true"
      SMART_EVACUATION: "false"
      DELTA: "30"
      DELAY: "0"
      POLL: "45"
      THRESHOLD: "50"
      WORKERS: "4"
      RESERVED_HOSTS: "false"
      LEAVE_DISABLED: "false"
      CHECK_KDUMP: "false"
      LOGLEVEL: "info"
      DISABLED: "false"
EOF

Apply the configuration:

$ oc apply -f iha-cm.yaml -n openstack

If you want to restrict which Compute nodes are evacuated, create host aggregates and set them by using the EVACUABLE_TAG parameter. Alternatively, you can set the TAGGED_AGGREGATES parameter to false to enable monitoring and evacuation of all your Compute nodes. For more information about Instance HA service parameters, see Editing the Instance HA service parameters in Configuring high availability for instances.

Create an Instance HA resource and reference the fencing secret and configuration map. For example:
```
$ cat << EOF > iha.yaml
apiVersion: instanceha.openstack.org/v1beta1
kind: InstanceHa
metadata:
  name: instanceha-0
  namespace: openstack
spec:
  caBundleSecretName: combined-ca-bundle
  instanceHaConfigMap:
  fencingSecret: fencing-secret
EOF
```
- spec.instanceHaConfigMap defines the name of the YAML file containing the Instance HA configuration map that you created. If you do not create this file, then a YAML file called instanceha-config is created automatically when the Instance HA service is installed, providing the default values of the Instance HA service parameters. You can then edit the values as needed.
Deploy the Instance HA resource:
```
$ oc apply -f iha.yaml -n openstack
```

Next steps

After you complete the Red Hat OpenStack Services on OpenShift adoption, remove the Pacemaker components from the Compute nodes. You must run the following commands on each Compute node:

$ sudo systemctl stop pacemaker_remote
$ sudo systemctl stop pcsd
$ sudo systemctl stop pcsd-ruby.service
$ sudo systemctl disable pacemaker_remote
$ sudo systemctl disable pcsd
$ sudo systemctl disable pcsd-ruby.service
$ sudo dnf remove pacemaker pacemaker-remote pcs pcsd -y

5.7. Post-adoption tasks for the Load-balancing service

If you adopted the Load-balancing service (octavia), after you complete the data plane adoption, you must perform the following tasks:

Upgrade the amphorae virtual machines to the new images.
Remove obsolete resources from your existing load balancers.

Prerequisites

You have adopted the Load-balancing service. For more information, see Adopting the Load-balancing service.

Procedure

Ensure that the connectivity between the new control plane and the adopted Compute nodes is functional by creating a new load balancer and checking that its provisioning_status becomes ACTIVE:
```
$ alias openstack="oc exec -t openstackclient -- openstack"
$ openstack loadbalancer create --vip-subnet-id public-subnet --name lb-post-adoption --wait
```
Trigger a failover for all existing load balancers to upgrade the amphorae virtual machines to use the new image and to establish connectivity with the new control plane:
```
$ openstack loadbalancer list -f value -c id | \
      xargs -r -n1 -P4 ${BASH_ALIASES[openstack]} loadbalancer failover --wait
```

Delete old flavors that were migrated to the new control plane:

$ openstack flavor delete octavia_65
# The following flavors might not exist in OSP 17.1 deployments
$ openstack flavor show octavia_amphora-mvcpu-ha && \
  openstack flavor delete octavia_amphora-mvcpu-ha
$ openstack loadbalancer flavor show octavia_amphora-mvcpu-ha && \
  openstack loadbalancer flavor delete octavia_amphora-mvcpu-ha
$ openstack loadbalancer flavorprofile show octavia_amphora-mvcpu-ha_profile && \
  openstack loadbalancer flavorprofile delete octavia_amphora-mvcpu-ha_profile

Some flavors might still be used by load balancers and cannot be deleted.

Delete the old management network and its ports:

$ for net_id in $(openstack network list -f value -c ID --name lb-mgmt-net); do \
    desc=$(openstack network show "$net_id" -f value -c description); \
    [ -z "$desc" ] && WALLABY_LB_MGMT_NET_ID="$net_id" ; \
  done
$ for id in $(openstack port list --network "$WALLABY_LB_MGMT_NET_ID" -f value -c ID); do \
    openstack port delete "$id" ; \
  done
$ openstack network delete "$WALLABY_LB_MGMT_NET_ID"

Verify that only one lb-mgmt-net and one lb-mgmt-subnet exists:

$ openstack network list | grep lb-mgmt-net
| fe470c29-0482-4809-9996-6d636e3feea3 | lb-mgmt-net          | 6a881091-097d-441c-937b-5a23f4f243b7 |
$ openstack subnet list | grep lb-mgmt-subnet
| 6a881091-097d-441c-937b-5a23f4f243b7 | lb-mgmt-subnet          | fe470c29-0482-4809-9996-6d636e3feea3 | 172.24.0.0/16   |

6. Migrating the Object Storage service to Red Hat OpenStack Services on OpenShift nodes

If you are using the OpenStack Object Storage service (swift) as an Object Storage service, you must migrate your Object Storage service to Red Hat OpenStack Services on OpenShift nodes.

If you are using the Object Storage API of the Ceph Object Gateway (RGW), you can skip this chapter and migrate your Red Hat Ceph Storage cluster. For more information, see Migrate the Ceph cluster. If you are not planning to migrate Ceph daemons from Controller nodes, you must perform the steps that are described in Deploying a Ceph ingress daemon and Create or update the Object Storage service endpoints.

The data migration happens replica by replica. For example, if you have 3 replicas, move them one at a time to ensure that the other 2 replicas are still operational, which enables you to continue to use the Object Storage service during the migration.

Data migration to the new deployment is a long-running process that executes mostly in the background. The Object Storage service replicators move data from old to new nodes, which might take a long time depending on the amount of storage used. To reduce downtime, you can use the old nodes if they are running and continue with adopting other services while waiting for the migration to complete. Performance might be degraded due to the amount of replication traffic in the network.

6.1. Migrating the Object Storage service data from OSP to RHOSO nodes

The Object Storage service (swift) migration involves the following steps:

Add new nodes to the Object Storage service rings.
Set weights of existing nodes to 0.
Rebalance rings by moving one replica.
Copy rings to old nodes and restart services.
Check replication status and repeat the previous two steps until the old nodes are drained.
Remove the old nodes from the rings.

Prerequisites

Adopt the Object Storage service. For more information, see Adopting the Object Storage service.
For DNS servers, ensure that all existing nodes are able to resolve the hostnames of the OpenShift pods, for example, by using the external IP of the DNSMasq service as the nameserver in /etc/resolv.conf:
```
$ oc get service dnsmasq-dns -o jsonpath="{.status.loadBalancer.ingress[0].ip}" | $CONTROLLER1_SSH sudo tee /etc/resolv.conf
```

Track the current status of the replication by using the swift-dispersion tool:

$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-dispersion-populate'

The command might need a few minutes to complete. It creates 0-byte objects that are distributed across the Object Storage service deployment, and you can use the swift-dispersion-report afterward to show the current replication status:

$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-dispersion-report'

The output of the swift-dispersion-report command looks similar to the following:

Queried 1024 containers for dispersion reporting, 5s, 0 retries
100.00% of container copies found (3072 of 3072)
Sample represents 100.00% of the container partition space
Queried 1024 objects for dispersion reporting, 4s, 0 retries
There were 1024 partitions missing 0 copies.
100.00% of object copies found (3072 of 3072)
Sample represents 100.00% of the object partition space

Procedure

Add new nodes by scaling up the SwiftStorage resource from 0 to 3:
```
$ oc patch openstackcontrolplane openstack --type=merge -p='{"spec":{"swift":{"template":{"swiftStorage":{"replicas": 3}}}}}'
```
This command creates three storage instances on the OpenShift cluster that use Persistent Volume Claims.

Wait until all three pods are running and the rings include the new devices:

$ oc wait pods --for condition=Ready -l component=swift-storage
$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-ring-builder object.builder search --device pv'

From the current rings, get the storage management IP addresses of the nodes to drain:

$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-ring-builder object.builder search _' | tail -n +2 | awk '{print $4}' | sort -u

The output looks similar to the following:

172.20.0.100
swift-storage-0.swift-storage.openstack.svc
swift-storage-1.swift-storage.openstack.svc
swift-storage-2.swift-storage.openstack.svc

Drain the old nodes. In the following example, the old node 172.20.0.100 is drained:

$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c '
> swift-ring-tool get
> swift-ring-tool drain 172.20.0.100
> swift-ring-tool rebalance
> swift-ring-tool push'

Depending on your deployment, you might have more nodes to include in the command.

Copy and apply the updated rings to the original nodes. Run the ssh commands for your existing nodes that store the Object Storage service data:

$ oc extract --confirm cm/swift-ring-files
$ $CONTROLLER1_SSH "tar -C /var/lib/config-data/puppet-generated/swift/etc/swift/ -xzf -" < swiftrings.tar.gz
$ $CONTROLLER1_SSH "systemctl restart tripleo_swift_*"

Track the replication progress by using the swift-dispersion-report tool:

$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c "swift-ring-tool get && swift-dispersion-report"

The output shows less than 100% of copies found. Repeat the command until all container and object copies are found:

Queried 1024 containers for dispersion reporting, 6s, 0 retries
There were 5 partitions missing 1 copy.
99.84% of container copies found (3067 of 3072)
Sample represents 100.00% of the container partition space
Queried 1024 objects for dispersion reporting, 7s, 0 retries
There were 739 partitions missing 1 copy.
There were 285 partitions missing 0 copies.
75.94% of object copies found (2333 of 3072)
Sample represents 100.00% of the object partition space

The rebalance command moves only one replica at a time to ensure that data is available continuously. This requires running the rebalance command multiple times to complete the full rebalance operation. Additionally, a minimum wait time of one hour between consecutive rebalance commands is enforced to prevent moving multiple replicas at the same time. Running the rebalance again before this period elapses has no effect.

Move the next replica to the new nodes by rebalancing and distributing the rings:

$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c '
> swift-ring-tool get
> swift-ring-tool rebalance
> swift-ring-tool push'

$ oc extract --confirm cm/swift-ring-files
$ $CONTROLLER1_SSH "tar -C /var/lib/config-data/puppet-generated/swift/etc/swift/ -xzf -" < swiftrings.tar.gz
$ $CONTROLLER1_SSH "systemctl restart tripleo_swift_*"

Monitor the swift-dispersion-report output again, wait until all copies are found, and then repeat this step until all your replicas are moved to the new nodes.

Remove the nodes from the rings:

$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c '
> swift-ring-tool get
> swift-ring-tool remove 172.20.0.100
> swift-ring-tool rebalance
> swift-ring-tool push'

Even if all replicas are on the new nodes and the swift-dispersion-report command reports 100% of the copies found, there might still be data on the old nodes. The replicators remove this data, but it might take more time.

Verification

Check the disk usage of all disks in the cluster:

$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-recon -d'

Confirm that there are no more \*.db or *.data files in the /srv/node directory on the nodes:
```
$CONTROLLER1_SSH "find /srv/node/ -type f -name '*.db' -o -name '*.data' | wc -l"
```

6.2. Troubleshooting the Object Storage service migration

You can troubleshoot issues with the Object Storage service (swift) migration.

If the replication is not working and the swift-dispersion-report is not back to 100% availability, check the replicator progress to help you debug:

$ CONTROLLER1_SSH tail /var/log/containers/swift/swift.log | grep object-server

The following shows an example of the output:

Mar 14 06:05:30 standalone object-server[652216]: <f+++++++++ 4e2/9cbea55c47e243994b0b10d8957184e2/1710395823.58025.data
Mar 14 06:05:30 standalone object-server[652216]: Successful rsync of /srv/node/vdd/objects/626/4e2 to swift-storage-1.swift-storage.openstack.svc::object/d1/objects/626 (0.094)
Mar 14 06:05:30 standalone object-server[652216]: Removing partition: /srv/node/vdd/objects/626
Mar 14 06:05:31 standalone object-server[652216]: <f+++++++++ 85f/cf53b5a048e5b19049e05a548cde185f/1710395796.70868.data
Mar 14 06:05:31 standalone object-server[652216]: Successful rsync of /srv/node/vdb/objects/829/85f to swift-storage-2.swift-storage.openstack.svc::object/d1/objects/829 (0.095)
Mar 14 06:05:31 standalone object-server[652216]: Removing partition: /srv/node/vdb/objects/829

You can also check the ring consistency and replicator status:

$ oc debug --keep-labels=true job/swift-ring-rebalance -- /bin/sh -c 'swift-ring-tool get && swift-recon -r --md5'

The output might show a md5 mismatch until approximately 2 minutes after pushing the new rings. After the 2 minutes, the output looks similar to the following example:

Oldest completion was 2024-03-14 16:53:27 (3 minutes ago) by 172.20.0.100:6000.
Most recent completion was 2024-03-14 16:56:38 (12 seconds ago) by swift-storage-0.swift-storage.openstack.svc:6200.
===============================================================================
[2024-03-14 16:56:50] Checking ring md5sums
4/4 hosts matched, 0 error[s] while checking hosts.

7. Migrating the Ceph cluster

In the context of data plane adoption, where the OpenStack (OSP) services are redeployed in OpenShift, you migrate a TripleO-deployed Ceph Storage cluster by using a process called “externalizing” the Ceph Storage cluster.

There are two deployment topologies that include an internal Ceph Storage cluster:

OSP includes dedicated Ceph Storage nodes to host object storage daemons (OSDs)
Hyperconverged Infrastructure (HCI), where Compute and Storage services are colocated on hyperconverged nodes

In either scenario, there are some Ceph processes that are deployed on OSP Controller nodes: Ceph monitors, Ceph Object Gateway (RGW), Rados Block Device (RBD), Ceph Metadata Server (MDS), Ceph Dashboard, and NFS Ganesha. To migrate your Ceph Storage cluster, you must decommission the Controller nodes and move the Ceph daemons to a set of target nodes that are already part of the Ceph Storage cluster.

Before you begin the migration, complete the tasks in your OpenStack environment. For more information, see Ceph prerequisites.

7.1. Ceph daemon cardinality

Ceph Reef and later applies strict constraints in the way daemons can be colocated within the same node. Your topology depends on the available hardware and the amount of Ceph services in the Controller nodes that you retire. The amount of services that you can migrate depends on the amount of available nodes in the cluster. The following diagrams show the distribution of Ceph daemons on Ceph nodes where at least 3 nodes are required.

The following scenario includes only RGW and RBD, without the Ceph dashboard:

|    |                     |             |
|----|---------------------|-------------|
| osd | mon/mgr/crash      | rgw/ingress |
| osd | mon/mgr/crash      | rgw/ingress |
| osd | mon/mgr/crash      | rgw/ingress |

With the Ceph dashboard, but without Shared File Systems service (manila), at least 4 nodes are required. The Ceph dashboard has no failover:

|     |                     |             |
|-----|---------------------|-------------|
| osd | mon/mgr/crash | rgw/ingress       |
| osd | mon/mgr/crash | rgw/ingress       |
| osd | mon/mgr/crash | dashboard/grafana |
| osd | rgw/ingress   | (free)            |

With the Ceph dashboard and the Shared File Systems service, a minimum of 5 nodes are required, and the Ceph dashboard has no failover:

|     |                     |                         |
|-----|---------------------|-------------------------|
| osd | mon/mgr/crash       | rgw/ingress             |
| osd | mon/mgr/crash       | rgw/ingress             |
| osd | mon/mgr/crash       | mds/ganesha/ingress     |
| osd | rgw/ingress         | mds/ganesha/ingress     |
| osd | mds/ganesha/ingress | dashboard/grafana       |

7.2. Migrating the monitoring stack component to new nodes within an existing Ceph cluster

The Ceph Dashboard module adds web-based monitoring and administration to the Ceph Manager. With TripleO-deployed Ceph, the Ceph Dashboard is enabled as part of the overcloud deploy and is composed of the following components:

Ceph Manager module
Grafana
Prometheus
Alertmanager
Node exporter

The Ceph Dashboard containers are included through tripleo-container-image-prepare parameters, and high availability (HA) relies on HAProxy and Pacemaker to be deployed on the OpenStack (OSP) environment. For an external Ceph Storage cluster, HA is not supported.

You migrate and relocate the Ceph Monitoring components to free Controller nodes. Before you begin the migration, complete the tasks in your OpenStack environment. For more information, see Ceph prerequisites.

7.2.1. Migrating the monitoring stack to the target nodes

To migrate the monitoring stack to the target nodes, you add the monitoring label to your existing nodes and update the configuration of each daemon. You do not need to migrate node exporters. These daemons are deployed across the nodes that are part of the Ceph cluster (the placement is ‘*’).

Depending on the target nodes and the number of deployed or active daemons, you can either relocate the existing containers to the target nodes, or select a subset of nodes that host the monitoring stack daemons. High availability (HA) is not supported. Reducing the placement with count: 1 allows you to migrate the existing daemons in a Hyperconverged Infrastructure, or hardware-limited, scenario without impacting other services.

Migrating the existing daemons to the target nodes

The following procedure is an example of an environment with 3 Ceph nodes or ComputeHCI nodes. This scenario extends the monitoring labels to all the Ceph or ComputeHCI nodes that are part of the cluster. This means that you keep 3 placements for the target nodes.

Prerequisites

Confirm that the firewall rules are in place and the ports are open for a given monitoring stack service.

Procedure

Add the monitoring label to all the Ceph Storage or ComputeHCI nodes in the cluster:

for item in $(sudo cephadm shell --  ceph orch host ls --format json | jq -r '.[].hostname'); do
    sudo cephadm shell -- ceph orch host label add  $item monitoring;
done

Verify that all the hosts on the target nodes have the monitoring label:

[tripleo-admin@controller-0 ~]$ sudo cephadm shell -- ceph orch host ls

HOST                        ADDR           LABELS
cephstorage-0.redhat.local  192.168.24.11  osd monitoring
cephstorage-1.redhat.local  192.168.24.12  osd monitoring
cephstorage-2.redhat.local  192.168.24.47  osd monitoring
controller-0.redhat.local   192.168.24.35  _admin mon mgr monitoring
controller-1.redhat.local   192.168.24.53  mon _admin mgr monitoring
controller-2.redhat.local   192.168.24.10  mon _admin mgr monitoring

Remove the labels from the Controller nodes:

$ for i in 0 1 2; do sudo cephadm shell -- ceph orch host label rm "controller-$i.redhat.local" monitoring; done

Removed label monitoring from host controller-0.redhat.local
Removed label monitoring from host controller-1.redhat.local
Removed label monitoring from host controller-2.redhat.local

Dump the current monitoring stack spec:

function export_spec {
    local component="$1"
    local target_dir="$2"
    sudo cephadm shell -- ceph orch ls --export "$component" > "$target_dir/$component"
}

SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
mkdir -p ${SPEC_DIR}
for m in grafana prometheus alertmanager; do
    export_spec "$m" "$SPEC_DIR"
done

For each daemon, edit the current spec and replace the placement.hosts: section with the placement.label: section, for example:
```
service_type: grafana
service_name: grafana
placement:
  label: monitoring
networks:
- 172.17.3.0/24
spec:
  port: 3100
```
This step also applies to Prometheus and Alertmanager specs.

Apply the new monitoring spec to relocate the monitoring stack daemons:

SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
function migrate_daemon {
    local component="$1"
    local target_dir="$2"
    sudo cephadm shell -m "$target_dir" -- ceph orch apply -i /mnt/ceph_specs/$component
}
for m in grafana prometheus alertmanager; do
    migrate_daemon  "$m" "$SPEC_DIR"
done

Verify that the daemons are deployed on the expected nodes:

[ceph: root@controller-0 /]# ceph orch ps | grep -iE "(prome|alert|grafa)"
alertmanager.cephstorage-2  cephstorage-2.redhat.local  172.17.3.144:9093,9094
grafana.cephstorage-0       cephstorage-0.redhat.local  172.17.3.83:3100
prometheus.cephstorage-1    cephstorage-1.redhat.local  172.17.3.53:9092

After you migrate the monitoring stack, you lose high availability. The monitoring stack daemons no longer have a Virtual IP address and HAProxy anymore. Node exporters are still running on all the nodes.

Review the Ceph configuration to ensure that it aligns with the configuration on the target nodes. In particular, focus on the following configuration entries:

[ceph: root@controller-0 /]# ceph config dump | grep -i dashboard
...
mgr  advanced  mgr/dashboard/ALERTMANAGER_API_HOST  http://172.17.3.83:9093
mgr  advanced  mgr/dashboard/GRAFANA_API_URL        https://172.17.3.144:3100
mgr  advanced  mgr/dashboard/PROMETHEUS_API_HOST    http://172.17.3.83:9092
mgr  advanced  mgr/dashboard/controller-0.ycokob/server_addr  172.17.3.33
mgr  advanced  mgr/dashboard/controller-1.lmzpuc/server_addr  172.17.3.147
mgr  advanced  mgr/dashboard/controller-2.xpdgfl/server_addr  172.17.3.138

Verify that the API_HOST/URL of the grafana, alertmanager and prometheus services points to the IP addresses on the storage network of the node where each daemon is relocated:

[ceph: root@controller-0 /]# ceph orch ps | grep -iE "(prome|alert|grafa)"
alertmanager.cephstorage-0  cephstorage-0.redhat.local  172.17.3.83:9093,9094
alertmanager.cephstorage-1  cephstorage-1.redhat.local  172.17.3.53:9093,9094
alertmanager.cephstorage-2  cephstorage-2.redhat.local  172.17.3.144:9093,9094
grafana.cephstorage-0       cephstorage-0.redhat.local  172.17.3.83:3100
grafana.cephstorage-1       cephstorage-1.redhat.local  172.17.3.53:3100
grafana.cephstorage-2       cephstorage-2.redhat.local  172.17.3.144:3100
prometheus.cephstorage-0    cephstorage-0.redhat.local  172.17.3.83:9092
prometheus.cephstorage-1    cephstorage-1.redhat.local  172.17.3.53:9092
prometheus.cephstorage-2    cephstorage-2.redhat.local  172.17.3.144:9092

[ceph: root@controller-0 /]# ceph config dump
...
...
mgr  advanced  mgr/dashboard/ALERTMANAGER_API_HOST   http://172.17.3.83:9093
mgr  advanced  mgr/dashboard/PROMETHEUS_API_HOST     http://172.17.3.83:9092
mgr  advanced  mgr/dashboard/GRAFANA_API_URL         https://172.17.3.144:3100

The Ceph Dashboard, as the service provided by the Ceph mgr, is not impacted by the relocation. You might experience an impact when the active mgr daemon is migrated or is force-failed. However, you can define 3 replicas in the Ceph Manager configuration to redirect requests to a different instance.

Scenario 2: Relocating one instance of a monitoring stack to migrate daemons to target nodes

Instead of adding a single monitoring label to all the target nodes, it is possible to relocate one instance of each monitoring stack daemon on a particular node.

Prerequisites

Confirm that the firewall rules are in place and the ports are open for a given monitoring stack service.

Procedure

Set each of your nodes to host a particular daemon instance, for example, if you have three target nodes:

[tripleo-admin@controller-0 ~]$ sudo cephadm shell -- ceph orch host ls | grep -i cephstorage

HOST                        ADDR           LABELS
cephstorage-0.redhat.local  192.168.24.11  osd ---> grafana
cephstorage-1.redhat.local  192.168.24.12  osd ---> prometheus
cephstorage-2.redhat.local  192.168.24.47  osd ---> alertmanager

Add the appropriate labels to the target nodes:

declare -A target_nodes

target_nodes[grafana]=cephstorage-0
target_nodes[prometheus]=cephstorage-1
target_nodes[alertmanager]=cephstorage-2

for label in "${!target_nodes[@]}"; do
    ceph orch host label add ${target_nodes[$label]} $label
done

Verify that the labels are properly applied to the target nodes:

[tripleo-admin@controller-0 ~]$ sudo cephadm shell -- ceph orch host ls | grep -i cephstorage

HOST                    	ADDR       	LABELS          	STATUS
cephstorage-0.redhat.local  192.168.24.11  osd grafana
cephstorage-1.redhat.local  192.168.24.12  osd prometheus
cephstorage-2.redhat.local  192.168.24.47  osd alertmanager

Dump the current monitoring stack spec:

function export_spec {
    local component="$1"
    local target_dir="$2"
    sudo cephadm shell -- ceph orch ls --export "$component" > "$target_dir/$component"
}

SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
for m in grafana prometheus alertmanager; do
    export_spec "$m" "$SPEC_DIR"
done

For each daemon, edit the current spec and replace the placement/hosts section with the placement/label section, for example:
```
service_type: grafana
service_name: grafana
placement:
  label: grafana
networks:
- 172.17.3.0/24
spec:
  port: 3100
```
The same procedure applies to Prometheus and Alertmanager specs.

Apply the new monitoring spec to relocate the monitoring stack daemons:

SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
function migrate_daemon {
    local component="$1"
    local target_dir="$2"
    sudo cephadm shell -m "$target_dir" -- ceph orch apply -i /mnt/ceph_specs/$component
}
for m in grafana prometheus alertmanager; do
    migrate_daemon  "$m" "$SPEC_DIR"
done

Verify that the daemons are deployed on the expected nodes:

[ceph: root@controller-0 /]# ceph orch ps | grep -iE "(prome|alert|grafa)"
alertmanager.cephstorage-2  cephstorage-2.redhat.local  172.17.3.144:9093,9094
grafana.cephstorage-0       cephstorage-0.redhat.local  172.17.3.83:3100
prometheus.cephstorage-1    cephstorage-1.redhat.local  172.17.3.53:9092

With the procedure described above we lose High Availability: the monitoring stack daemons have no VIP and haproxy anymore; Node exporters are still running on all the nodes: instead of using labels we keep the current approach as we want to not reduce the monitoring space covered.

Update the Ceph Dashboard Manager configuration. An important aspect that should be considered at this point is to replace and verify that the Ceph configuration is aligned with the relocation you just made. Run the ceph config dump command and review the current config. In particular, focus on the following configuration entries:

[ceph: root@controller-0 /]# ceph config dump
...
mgr  advanced  mgr/dashboard/ALERTMANAGER_API_HOST  http://172.17.3.83:9093
mgr  advanced  mgr/dashboard/GRAFANA_API_URL        https://172.17.3.144:3100
mgr  advanced  mgr/dashboard/PROMETHEUS_API_HOST    http://172.17.3.83:9092
mgr  advanced  mgr/dashboard/controller-0.ycokob/server_addr  172.17.3.33
mgr  advanced  mgr/dashboard/controller-1.lmzpuc/server_addr  172.17.3.147
mgr  advanced  mgr/dashboard/controller-2.xpdgfl/server_addr  172.17.3.138

Verify that grafana, alertmanager and prometheus API_HOST/URL point to the IP addresses (on the storage network) of the node where each daemon has been relocated. This should be automatically addressed by cephadm and it shouldn’t require any manual action.

[ceph: root@controller-0 /]# ceph orch ps | grep -iE "(prome|alert|grafa)"
alertmanager.cephstorage-0  cephstorage-0.redhat.local  172.17.3.83:9093,9094
alertmanager.cephstorage-1  cephstorage-1.redhat.local  172.17.3.53:9093,9094
alertmanager.cephstorage-2  cephstorage-2.redhat.local  172.17.3.144:9093,9094
grafana.cephstorage-0       cephstorage-0.redhat.local  172.17.3.83:3100
grafana.cephstorage-1       cephstorage-1.redhat.local  172.17.3.53:3100
grafana.cephstorage-2       cephstorage-2.redhat.local  172.17.3.144:3100
prometheus.cephstorage-0    cephstorage-0.redhat.local  172.17.3.83:9092
prometheus.cephstorage-1    cephstorage-1.redhat.local  172.17.3.53:9092
prometheus.cephstorage-2    cephstorage-2.redhat.local  172.17.3.144:9092

[ceph: root@controller-0 /]# ceph config dump
...
...
mgr  advanced  mgr/dashboard/ALERTMANAGER_API_HOST   http://172.17.3.83:9093
mgr  advanced  mgr/dashboard/PROMETHEUS_API_HOST     http://172.17.3.83:9092
mgr  advanced  mgr/dashboard/GRAFANA_API_URL         https://172.17.3.144:3100

The Ceph Dashboard (mgr module plugin) has not been impacted at all by this relocation. The service is provided by the Ceph Manager daemon, hence we might experience an impact when the active mgr is migrated or is force-failed. However, having three replicas definition allows to redirect requests to a different instance (it’s still an A/P model), hence the impact should be limited.
1. When the RBD migration is over, the following Ceph config keys must be regenerated to point to the right mgr container:
  mgr advanced mgr/dashboard/controller-0.ycokob/server_addr 172.17.3.33 mgr advanced mgr/dashboard/controller-1.lmzpuc/server_addr 172.17.3.147 mgr advanced mgr/dashboard/controller-2.xpdgfl/server_addr 172.17.3.138
  $ sudo cephadm shell $ ceph orch ps | awk '/mgr./ {print $1}'
2. For each retrieved mgr, update the entry in the Ceph configuration:
  $ ceph config set mgr mgr/dashboard/<>/server_addr/<ip addr>

7.3. Migrating Ceph MDS to new nodes within the existing cluster

You can migrate the MDS daemon when Shared File Systems service (manila), deployed with either a cephfs-native or ceph-nfs back end, is part of the overcloud deployment. The MDS migration is performed by cephadm, and you move the daemons placement from a hosts-based approach to a label-based approach. This ensures that you can visualize the status of the cluster and where daemons are placed by using the ceph orch host command, and have a general view of how the daemons are co-located within a given host.

Prerequisites

Complete the tasks in your OpenStack environment. For more information, see Ceph prerequisites.

Procedure

Verify that the Ceph Storage cluster is healthy and check the MDS status:

$ sudo cephadm shell -- ceph fs ls
name: cephfs, metadata pool: manila_metadata, data pools: [manila_data ]

$ sudo cephadm shell -- ceph mds stat
cephfs:1 {0=mds.controller-2.oebubl=up:active} 2 up:standby

$ sudo cephadm shell -- ceph fs status cephfs

cephfs - 0 clients
======
RANK  STATE         	MDS           	ACTIVITY 	DNS	INOS   DIRS   CAPS
 0	active  mds.controller-2.oebubl  Reqs:	0 /s   696	196	173  	0
  	POOL     	TYPE 	USED  AVAIL
manila_metadata  metadata   152M   141G
  manila_data  	data	3072M   141G
  	STANDBY MDS
mds.controller-0.anwiwd
mds.controller-1.cwzhog

Retrieve more detailed information on the Ceph File System (CephFS) MDS status:

$ sudo cephadm shell -- ceph fs dump

e8
enable_multiple, ever_enabled_multiple: 1,1
default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
legacy client fscid: 1

Filesystem 'cephfs' (1)
fs_name cephfs
epoch   5
flags   12 joinable allow_snaps allow_multimds_snaps
created 2024-01-18T19:04:01.633820+0000
modified    	2024-01-18T19:04:05.393046+0000
tableserver 	0
root	0
session_timeout 60
session_autoclose   	300
max_file_size   1099511627776
required_client_features    	{}
last_failure	0
last_failure_osd_epoch  0
compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 1
in  	0
up  	{0=24553}
failed
damaged
stopped
data_pools  	[7]
metadata_pool   9
inline_data 	disabled
balancer
standby_count_wanted	1
[mds.mds.controller-2.oebubl{0:24553} state up:active seq 2 addr [v2:172.17.3.114:6800/680266012,v1:172.17.3.114:6801/680266012] compat {c=[1],r=[1],i=[7ff]}]


Standby daemons:

[mds.mds.controller-0.anwiwd{-1:14715} state up:standby seq 1 addr [v2:172.17.3.20:6802/3969145800,v1:172.17.3.20:6803/3969145800] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.controller-1.cwzhog{-1:24566} state up:standby seq 1 addr [v2:172.17.3.43:6800/2227381308,v1:172.17.3.43:6801/2227381308] compat {c=[1],r=[1],i=[7ff]}]
dumped fsmap epoch 8

Check the OSD blocklist and clean up the client list:

$ sudo cephadm shell -- ceph osd blocklist ls
$ for item in $(sudo cephadm shell -- ceph osd blocklist ls | awk '{print $1}'); do
>     sudo cephadm shell -- ceph osd blocklist rm $item;
> done

When a file system client is unresponsive or misbehaving, the access to the file system might be forcibly terminated. This process is called eviction. Evicting a CephFS client prevents it from communicating further with MDS daemons and OSD daemons.

Ordinarily, a blocklisted client cannot reconnect to the servers; you must unmount and then remount the client. However, permitting a client that was evicted to attempt to reconnect can be useful. Because CephFS uses the RADOS OSD blocklist to control client eviction, you can permit CephFS clients to reconnect by removing them from the blocklist.

Get the hosts that are currently part of the Ceph cluster:

[ceph: root@controller-0 /]# ceph orch host ls
HOST                        ADDR           LABELS          STATUS
cephstorage-0.redhat.local  192.168.24.25  osd
cephstorage-1.redhat.local  192.168.24.50  osd
cephstorage-2.redhat.local  192.168.24.47  osd
controller-0.redhat.local   192.168.24.24  _admin mgr mon
controller-1.redhat.local   192.168.24.42  mgr _admin mon
controller-2.redhat.local   192.168.24.37  mgr _admin mon
6 hosts in cluster

Apply the MDS labels to the target nodes:

for item in $(sudo cephadm shell --  ceph orch host ls --format json | jq -r '.[].hostname'); do
    sudo cephadm shell -- ceph orch host label add  $item mds;
done

Verify that all the hosts have the MDS label:

$ sudo cephadm shell -- ceph orch host ls

HOST                    	ADDR       	   LABELS
cephstorage-0.redhat.local  192.168.24.11  osd mds
cephstorage-1.redhat.local  192.168.24.12  osd mds
cephstorage-2.redhat.local  192.168.24.47  osd mds
controller-0.redhat.local   192.168.24.35  _admin mon mgr mds
controller-1.redhat.local   192.168.24.53  mon _admin mgr mds
controller-2.redhat.local   192.168.24.10  mon _admin mgr mds

Dump the current MDS spec:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ mkdir -p ${SPEC_DIR}
$ sudo cephadm shell -- ceph orch ls --export mds > ${SPEC_DIR}/mds

Edit the retrieved spec and replace the placement.hosts section with placement.label:
```
service_type: mds
service_id: mds
service_name: mds.mds
placement:
  label: mds
```

Use the ceph orchestrator to apply the new MDS spec:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/mds -- ceph orch apply -i /mnt/mds

Scheduling new mds deployment ...

This results in an increased number of MDS daemons.

Check the new standby daemons that are temporarily added to the CephFS:

$ sudo cephadm shell -- ceph fs dump

Active

standby_count_wanted    1
[mds.mds.controller-0.awzplm{0:463158} state up:active seq 307 join_fscid=1 addr [v2:172.17.3.20:6802/51565420,v1:172.17.3.20:6803/51565420] compat {c=[1],r=[1],i=[7ff]}]


Standby daemons:

[mds.mds.cephstorage-1.jkvomp{-1:463800} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.135:6820/2075903648,v1:172.17.3.135:6821/2075903648] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.controller-2.gfrqvc{-1:475945} state up:standby seq 1 addr [v2:172.17.3.114:6800/2452517189,v1:172.17.3.114:6801/2452517189] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.cephstorage-0.fqcshx{-1:476503} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.92:6820/4120523799,v1:172.17.3.92:6821/4120523799] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.cephstorage-2.gnfhfe{-1:499067} state up:standby seq 1 addr [v2:172.17.3.79:6820/2448613348,v1:172.17.3.79:6821/2448613348] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.controller-1.tyiziq{-1:499136} state up:standby seq 1 addr [v2:172.17.3.43:6800/3615018301,v1:172.17.3.43:6801/3615018301] compat {c=[1],r=[1],i=[7ff]}]

To migrate MDS to the target nodes, set the MDS affinity that manages the MDS failover:

It is possible to elect a dedicated MDS as "active" for a particular file system. To configure this preference, CephFS provides a configuration option for MDS called mds_join_fs, which enforces this affinity. When failing over MDS daemons, cluster monitors prefer standby daemons with mds_join_fs equal to the file system name with the failed rank. If no standby exists with mds_join_fs equal to the file system name, it chooses an unqualified standby as a replacement.

$ sudo cephadm shell -- ceph config set mds.mds.cephstorage-0.fqcshx mds_join_fs cephfs

Replace mds.mds.cephstorage-0.fqcshx with the daemon deployed on cephstorage-0 that was retrieved from the previous step.

Remove the labels from the Controller nodes and force the MDS failover to the target node:

$ for i in 0 1 2; do sudo cephadm shell -- ceph orch host label rm "controller-$i.redhat.local" mds; done

Removed label mds from host controller-0.redhat.local
Removed label mds from host controller-1.redhat.local
Removed label mds from host controller-2.redhat.local

The switch to the target node happens in the background. The new active MDS is the one that you set by using the mds_join_fs command.

Check the result of the failover and the new deployed daemons:

$ sudo cephadm shell -- ceph fs dump
…
…
standby_count_wanted    1
[mds.mds.cephstorage-0.fqcshx{0:476503} state up:active seq 168 join_fscid=1 addr [v2:172.17.3.92:6820/4120523799,v1:172.17.3.92:6821/4120523799] compat {c=[1],r=[1],i=[7ff]}]


Standby daemons:

[mds.mds.cephstorage-2.gnfhfe{-1:499067} state up:standby seq 1 addr [v2:172.17.3.79:6820/2448613348,v1:172.17.3.79:6821/2448613348] compat {c=[1],r=[1],i=[7ff]}]
[mds.mds.cephstorage-1.jkvomp{-1:499760} state up:standby seq 1 join_fscid=1 addr [v2:172.17.3.135:6820/452139733,v1:172.17.3.135:6821/452139733] compat {c=[1],r=[1],i=[7ff]}]


$ sudo cephadm shell -- ceph orch ls

NAME                     PORTS   RUNNING  REFRESHED  AGE  PLACEMENT
crash                                6/6  10m ago    10d  *
mds.mds                          3/3  10m ago    32m  label:mds


$ sudo cephadm shell -- ceph orch ps | grep mds


mds.mds.cephstorage-0.fqcshx  cephstorage-0.redhat.local                     running (79m)     3m ago  79m    27.2M        -  17.2.6-100.el9cp  1af7b794f353  2a2dc5ba6d57
mds.mds.cephstorage-1.jkvomp  cephstorage-1.redhat.local                     running (79m)     3m ago  79m    21.5M        -  17.2.6-100.el9cp  1af7b794f353  7198b87104c8
mds.mds.cephstorage-2.gnfhfe  cephstorage-2.redhat.local                     running (79m)     3m ago  79m    24.2M        -  17.2.6-100.el9cp  1af7b794f353  f3cb859e2a15

Additional resources

7.4. Migrating Ceph RGW to external RHEL nodes

For Hyperconverged Infrastructure (HCI) or dedicated Storage nodes, you must migrate the Ceph Object Gateway (RGW) daemons that are included in the OpenStack Controller nodes into the existing external Red Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include the Compute nodes for an HCI environment or Ceph nodes. Your environment must have Ceph Reef or later and be managed by cephadm or Ceph Orchestrator.

Before you begin the migration, complete the tasks in your OpenStack environment. For more information, see Ceph prerequisites.

7.4.1. Migrating the Ceph RGW back ends

You must migrate your Ceph Object Gateway (RGW) back ends from your Controller nodes to your Ceph nodes. To ensure that you distribute the correct amount of services to your available nodes, you use cephadm labels to refer to a group of nodes where a given daemon type is deployed. For more information about the cardinality diagram, see Ceph daemon cardinality. The following procedure assumes that you have three target nodes, cephstorage-0, cephstorage-1, cephstorage-2.

Procedure

Add the RGW label to the Ceph nodes that you want to migrate your RGW back ends to:

$ sudo cephadm shell -- ceph orch host label add cephstorage-0 rgw;
$ sudo cephadm shell -- ceph orch host label add cephstorage-1 rgw;
$ sudo cephadm shell -- ceph orch host label add cephstorage-2 rgw;

Added label rgw to host cephstorage-0
Added label rgw to host cephstorage-1
Added label rgw to host cephstorage-2

$ sudo cephadm shell -- ceph orch host ls

HOST       	ADDR       	LABELS      	STATUS
cephstorage-0  192.168.24.54  osd rgw
cephstorage-1  192.168.24.44  osd rgw
cephstorage-2  192.168.24.30  osd rgw
controller-0   192.168.24.45  _admin mon mgr
controller-1   192.168.24.11  _admin mon mgr
controller-2   192.168.24.38  _admin mon mgr

6 hosts in cluster

During the overcloud deployment, a cephadm-compatible spec is generated in /home/ceph-admin/specs/rgw. Find and patch the RGW spec, specify the right placement by using labels, and change the RGW back-end port to 8090 to avoid conflicts with the Ceph ingress daemon front-end port.

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ mkdir -p ${SPEC_DIR}
$ sudo cephadm shell -- ceph orch ls --export rgw > ${SPEC_DIR}/rgw
$ cat ${SPEC_DIR}/rgw

networks:
- 172.17.3.0/24
placement:
  hosts:
  - controller-0
  - controller-1
  - controller-2
service_id: rgw
service_name: rgw.rgw
service_type: rgw
spec:
  rgw_frontend_port: 8080
  rgw_realm: default
  rgw_zone: default

This example assumes that 172.17.3.0/24 is the storage network.

In the placement section, ensure that the label and rgw_frontend_port values are set:
```
---
networks:
- 172.17.3.0/24
placement:
  label: rgw
service_id: rgw
service_name: rgw.rgw
service_type: rgw
spec:
  rgw_frontend_port: 8090
  rgw_realm: default
  rgw_zone: default
  rgw_frontend_ssl_certificate: ...
  ssl: true
```
- networks defines the storage network where the RGW back ends are deployed.
- placement.label: rgw replaces the Controller nodes with the rgw label.
- spec.rgw_frontend_port specifies the value as 8090 to avoid conflicts with the Ceph ingress daemon.
- spec.rgw_frontend_ssl_certificate defines the SSL certificate and key concatenation if TLS is enabled as described in Configuring RGW with TLS for an external Red Hat Ceph Storage cluster in Configuring persistent storage.

Apply the new RGW spec by using the orchestrator CLI:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/rgw -- ceph orch apply -i /mnt/rgw

This command triggers the redeploy, for example:

...
osd.9                     	cephstorage-2
rgw.rgw.cephstorage-0.wsjlgx  cephstorage-0  172.17.3.23:8090   starting
rgw.rgw.cephstorage-1.qynkan  cephstorage-1  172.17.3.26:8090   starting
rgw.rgw.cephstorage-2.krycit  cephstorage-2  172.17.3.81:8090   starting
rgw.rgw.controller-1.eyvrzw   controller-1   172.17.3.146:8080  running (5h)
rgw.rgw.controller-2.navbxa   controller-2   172.17.3.66:8080   running (5h)

...
osd.9                     	cephstorage-2
rgw.rgw.cephstorage-0.wsjlgx  cephstorage-0  172.17.3.23:8090  running (19s)
rgw.rgw.cephstorage-1.qynkan  cephstorage-1  172.17.3.26:8090  running (16s)
rgw.rgw.cephstorage-2.krycit  cephstorage-2  172.17.3.81:8090  running (13s)

Ensure that the new RGW back ends are reachable on the new ports, so you can enable an ingress daemon on port 8080 later. Log in to each Ceph Storage node that includes RGW and add the iptables rule to allow connections to both 8080 and 8090 ports in the Ceph Storage nodes:

$ iptables -I INPUT -p tcp -m tcp --dport 8080 -m conntrack --ctstate NEW -m comment --comment "ceph rgw ingress" -j ACCEPT
$ iptables -I INPUT -p tcp -m tcp --dport 8090 -m conntrack --ctstate NEW -m comment --comment "ceph rgw backends" -j ACCEPT
$ sudo iptables-save
$ sudo systemctl restart iptables

If nftables is used in the existing deployment, edit /etc/nftables/tripleo-rules.nft and add the following content:

# 100 ceph_rgw {'dport': ['8080','8090']}
add rule inet filter TRIPLEO_INPUT tcp dport { 8080,8090 } ct state new counter accept comment "100 ceph_rgw"

Save the file.
Restart the nftables service:
```
$ sudo systemctl restart nftables
```
Verify that the rules are applied:
```
$ sudo nft list ruleset | grep ceph_rgw
```

From a Controller node, such as controller-0, try to reach the RGW back ends:

$ curl http://cephstorage-0.storage:8090;

You should observe the following output:

<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>

Repeat the verification for each node where a RGW daemon is deployed.

If you migrated RGW back ends to the Ceph nodes, there is no internalAPI network, except in the case of HCI nodes. You must reconfigure the RGW keystone endpoint to point to the external network that you propagated:
```
[ceph: root@controller-0 /]# ceph config dump | grep keystone
global   basic rgw_keystone_url  http://172.16.1.111:5000

[ceph: root@controller-0 /]# ceph config set global rgw_keystone_url http://<keystone_endpoint>:5000
```
- Replace <keystone_endpoint> with the Identity service (keystone) internal endpoint of the service that is deployed in the OpenStackControlPlane CR when you adopt the Identity service. For more information, see Adopting the Identity service.

7.4.2. Deploying a Ceph ingress daemon

To deploy the Ceph ingress daemon, you perform the following actions:

Remove the existing ceph_rgw configuration.
Clean up the configuration created by TripleO.
Redeploy the Object Storage service (swift).

When you deploy the ingress daemon, two new containers are created:

HAProxy, which you use to reach the back ends.
Keepalived, which you use to own the virtual IP address.

You use the rgw label to distribute the ingress daemon to only the number of nodes that host Ceph Object Gateway (RGW) daemons. For more information about distributing daemons among your nodes, see Ceph daemon cardinality.

After you complete this procedure, you can reach the RGW back end from the ingress daemon and use RGW through the Object Storage service CLI.

Procedure

Log in to each Controller node and remove the following configuration from the /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg file:

listen ceph_rgw
  bind 10.0.0.103:8080 transparent
  mode http
  balance leastconn
  http-request set-header X-Forwarded-Proto https if { ssl_fc }
  http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
  http-request set-header X-Forwarded-Port %[dst_port]
  option httpchk GET /swift/healthcheck
  option httplog
  option forwardfor
   server controller-0.storage.redhat.local 172.17.3.73:8080 check fall 5 inter 2000 rise 2
  server controller-1.storage.redhat.local 172.17.3.146:8080 check fall 5 inter 2000 rise 2
  server controller-2.storage.redhat.local 172.17.3.156:8080 check fall 5 inter 2000 rise 2

Restart haproxy-bundle and confirm that it is started:

[root@controller-0 ~]# sudo pcs resource restart haproxy-bundle
haproxy-bundle successfully restarted


[root@controller-0 ~]# sudo pcs status | grep haproxy

  * Container bundle set: haproxy-bundle [undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp17-openstack-haproxy:pcmklatest]:
    * haproxy-bundle-podman-0   (ocf:heartbeat:podman):  Started controller-0
    * haproxy-bundle-podman-1   (ocf:heartbeat:podman):  Started controller-1
    * haproxy-bundle-podman-2   (ocf:heartbeat:podman):  Started controller-2

Confirm that no process is connected to port 8080:

[root@controller-0 ~]# ss -antop | grep 8080
[root@controller-0 ~]#

You can expect the Object Storage service (swift) CLI to fail to establish the connection:

(overcloud) [root@cephstorage-0 ~]# swift list

HTTPConnectionPool(host='10.0.0.103', port=8080): Max retries exceeded with url: /swift/v1/AUTH_852f24425bb54fa896476af48cbe35d3?format=json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc41beb0430>: Failed to establish a new connection: [Errno 111] Connection refused'))

Set the required images for both HAProxy and Keepalived:

[ceph: root@controller-0 /]# ceph config set mgr mgr/cephadm/container_image_haproxy quay.io/ceph/haproxy:2.3
[ceph: root@controller-0 /]# ceph config set mgr mgr/cephadm/container_image_keepalived quay.io/ceph/keepalived:2.1.5

Create a file called rgw_ingress in controller-0:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ vim ${SPEC_DIR}/rgw_ingress

Paste the following content into the rgw_ingress file:
```
---
service_type: ingress
service_id: rgw.rgw
placement:
  label: rgw
spec:
  backend_service: rgw.rgw
  virtual_ip: 10.0.0.89/24
  frontend_port: 8080
  monitor_port: 8898
  virtual_interface_networks:
    - <external_network>
  ssl_cert: ...
```
- Replace <external_network> with your external network, for example, 10.0.0.0/24. For more information, see Completing prerequisites for migrating Ceph RGW.
- If TLS is enabled, add the SSL certificate and key concatenation as described in Configuring RGW with TLS for an external Red Hat Ceph Storage cluster in Configuring persistent storage.

Apply the rgw_ingress spec by using the Ceph orchestrator CLI:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ cephadm shell -m ${SPEC_DIR}/rgw_ingress -- ceph orch apply -i /mnt/rgw_ingress

Wait until the ingress is deployed and query the resulting endpoint:

$ sudo cephadm shell -- ceph orch ls

NAME                 	PORTS            	RUNNING  REFRESHED  AGE  PLACEMENT
crash                                         	6/6  6m ago 	3d   *
ingress.rgw.rgw      	10.0.0.89:8080,8898  	6/6  37s ago	60s  label:rgw
mds.mds                   3/3  6m ago 	3d   controller-0;controller-1;controller-2
mgr                       3/3  6m ago 	3d   controller-0;controller-1;controller-2
mon                       3/3  6m ago 	3d   controller-0;controller-1;controller-2
osd.default_drive_group   15  37s ago	3d   cephstorage-0;cephstorage-1;cephstorage-2
rgw.rgw   ?:8090          3/3  37s ago	4m   label:rgw

$ curl 10.0.0.89:8080

---
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>[ceph: root@controller-0 /]#
—

7.4.3. Create or update the Object Storage service endpoints

You must create or update the Object Storage service (swift) endpoints to configure the new virtual IP address (VIP) that you reserved on the same network that you used to deploy RGW ingress.

Procedure

List the current swift endpoints and service:

$ oc rsh openstackclient openstack endpoint list | grep "swift.*object"

$ oc rsh openstackclient openstack service list | grep "swift.*object"

If the service and endpoints do not exist, create the missing swift resources:

$ oc rsh openstackclient openstack service create --name swift --description 'OpenStack Object Storage' object-store
$ oc rsh openstackclient openstack role add --user swift --project service member
$ oc rsh openstackclient openstack role add --user swift --project service admin
> for i in public internal; do
>     oc rsh openstackclient endpoint create --region regionOne  object-store $i http://<RGW_VIP>:8080/swift/v1/AUTH_%\(tenant_id\)s
> done
$ oc rsh openstackclient openstack role add --project admin --user admin swiftoperator

Replace <RGW_VIP> with the Ceph RGW ingress VIP.

If the endpoints exist, update the endpoints to point to the right RGW ingress VIP:

$ oc rsh openstackclient openstack endpoint set --url http://<RGW_VIP>:8080/swift/v1/AUTH_%\(tenant_id\)s <swift_public_endpoint_uuid>
$ oc rsh openstackclient openstack endpoint set --url http://<RGW_VIP>:8080/swift/v1/AUTH_%\(tenant_id\)s <swift_internal_endpoint_uuid>
$ oc rsh openstackclient openstack endpoint list | grep object
| 0d682ad71b564cf386f974f90f80de0d | regionOne | swift        | object-store | True    | public    | http://172.18.0.100:8080/swift/v1/AUTH_%(tenant_id)s    |
| b311349c305346f39d005feefe464fb1 | regionOne | swift        | object-store | True    | internal  | http://172.18.0.100:8080/swift/v1/AUTH_%(tenant_id)s    |

Replace <swift_public_endpoint_uuid> with the UUID of the swift public endpoint.
Replace <swift_internal_endpoint_uuid> with the UUID of the swift internal endpoint.

Test the migrated service:

$ oc rsh openstackclient openstack container list --debug

...
...
...
REQ: curl -g -i -X GET http://keystone-public-openstack.apps.ocp.openstack.lab -H "Accept: application/json" -H "User-Agent: openstacksdk/1.0.2 keystoneauth1/5.1.3 python-requests/2.25.1 CPython/3.9.23"
Starting new HTTP connection (1): keystone-public-openstack.apps.ocp.openstack.lab:80
http://keystone-public-openstack.apps.ocp.openstack.lab:80 "GET / HTTP/1.1" 300 298
RESP: [300] content-length: 298 content-type: application/json date: Mon, 14 Jul 2025 17:41:29 GMT location: http://keystone-public-openstack.apps.ocp.openstack.lab/v3/ server: Apache set-cookie: b5697f82cf3c19ece8be533395142512=d5c6a9ee2
267c4b63e9f656ad7565270; path=/; HttpOnly vary: X-Auth-Token x-openstack-request-id: req-452e42c5-e60f-440f-a6e8-fe1b9ea89055
RESP BODY: {"versions": {"values": [{"id": "v3.14", "status": "stable", "updated": "2020-04-07T00:00:00Z", "links": [{"rel": "self", "href": "http://keystone-public-openstack.apps.ocp.openstack.lab/v3/"}], "media-types": [{"base": "applic
ation/json", "type": "application/vnd.openstack.identity-v3+json"}]}]}}
GET call to http://keystone-public-openstack.apps.ocp.openstack.lab/ used request id req-452e42c5-e60f-440f-a6e8-fe1b9ea89055

...

REQ: curl -g -i -X GET "http://172.18.0.100:8080/swift/v1/AUTH_44477474b0dc4b5b8911ceec23a22246?format=json" -H "User-Agent: openstacksdk/1.0.2 keystoneauth1/5.1.3 python-requests/2.25.1 CPython/3.9.23" -H "X-Auth-Token: {SHA256}ec5deca0be37bd8bfe659f132b9cdf396b8f409db5dc16972d50cbf3f28474d4"
Starting new HTTP connection (1): 172.18.0.100:8080
http://172.18.0.100:8080 "GET /swift/v1/AUTH_44477474b0dc4b5b8911ceec23a22246?format=json HTTP/1.1" 200 2
RESP: [200] accept-ranges: bytes content-length: 2 content-type: application/json; charset=utf-8 date: Mon, 14 Jul 2025 17:41:31 GMT x-account-bytes-used: 0 x-account-bytes-used-actual: 0 x-account-container-count: 0 x-account-object-count: 0 x-account-storage-policy-default-placement-bytes-used: 0 x-account-storage-policy-default-placement-bytes-used-actual: 0 x-account-storage-policy-default-placement-container-count: 0 x-account-storage-policy-default-placement-object-count: 0 x-openstack-request-id: tx000001e95361131ccf694-006875414a-7753-default x-timestamp: 1752514891.25991 x-trans-id: tx000001e95361131ccf694-006875414a-7753-default
RESP BODY: []
GET call to http://172.18.0.100:8080/swift/v1/AUTH_44477474b0dc4b5b8911ceec23a22246?format=json used request id tx000001e95361131ccf694-006875414a-7753-default

clean_up ListContainer:
END return value: 0

Run tempest tests against Object Storage service:

$ tempest run --regex tempest.api.object_storage
...
...
...
======
Totals
======
Ran: 141 tests in 606.5579 sec.
 - Passed: 128
 - Skipped: 13
 - Expected Fail: 0
 - Unexpected Success: 0
 - Failed: 0
Sum of execute time for each test: 657.5183 sec.

==============
Worker Balance
==============
 - Worker 0 (1 tests) => 0:10:03.400561
 - Worker 1 (2 tests) => 0:00:24.531916
 - Worker 2 (4 tests) => 0:00:10.249889
 - Worker 3 (30 tests) => 0:00:32.730095
 - Worker 4 (51 tests) => 0:00:26.246044
 - Worker 5 (6 tests) => 0:00:20.114803
 - Worker 6 (20 tests) => 0:00:16.290323
 - Worker 7 (27 tests) => 0:00:17.103827

7.5. Migrating Red Hat Ceph Storage RBD to external RHEL nodes

For Hyperconverged Infrastructure (HCI) or dedicated Storage nodes that are running Ceph Reef or later, you must migrate the daemons that are included in the OpenStack control plane into the existing external Red Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include the Compute nodes for an HCI environment or dedicated storage nodes.

Before you begin the migration, complete the tasks in your OpenStack environment. For more information, see Ceph prerequisites.

7.5.1. Migrating Ceph Manager daemons to Ceph nodes

You must migrate your Ceph Manager daemons from the OpenStack (OSP) Controller nodes to a set of target nodes. Target nodes are either existing Ceph nodes, or OSP Compute nodes if Ceph is deployed by TripleO with a Hyperconverged Infrastructure (HCI) topology.

The following procedure uses cephadm and the Ceph Orchestrator to drive the Ceph Manager migration, and the Ceph spec to modify the placement and reschedule the Ceph Manager daemons. Ceph Manager is run in an active/passive state. It also provides many modules, including the Ceph Orchestrator. Every potential module, such as the Ceph Dashboard, that is provided by ceph-mgr is implicitly migrated with Ceph Manager.

Procedure

SSH into the target node and enable the firewall rules that are required to reach a Ceph Manager service:
```
dports="6800:7300"
ssh heat-admin@<target_node> sudo iptables -I INPUT \
    -p tcp --match multiport --dports $dports -j ACCEPT;
```
- Replace <target_node> with the hostname of the hosts that are listed in the Ceph environment. Run ceph orch host ls to see the list of the hosts.
  
  Repeat this step for each target node.

Check that the rules are properly applied to the target node and persist them:

$ sudo iptables-save
$ sudo systemctl restart iptables

The default dashboard port for ceph-mgr in a greenfield deployment is 8443. With director-deployed Ceph, the default port is 8444 because the service ran on the Controller node, and it was necessary to use this port to avoid a conflict. For adoption, update the dashboard port to 8443 in the ceph-mgr configuration and firewall rules.

$ sudo cephadm shell
$ ceph config set mgr mgr/dashboard/server_port 8443
$ ceph config set mgr mgr/dashboard/ssl_server_port 8443
$ ceph mgr module disable dashboard
$ ceph mgr module enable dashboard

If nftables is used in the existing deployment, edit /etc/nftables/tripleo-rules.nft and add the following content:

# 113 ceph_mgr {'dport': ['6800-7300', 8443]}
add rule inet filter TRIPLEO_INPUT tcp dport { 6800-7300,8443 } ct state new counter accept comment "113 ceph_mgr"

Save the file.
Restart the nftables service:
```
$ sudo systemctl restart nftables
```
Verify that the rules are applied:
```
$ sudo nft list ruleset | grep ceph_mgr
```
Prepare the target node to host the new Ceph Manager daemon, and add the mgr label to the target node:
```
$ sudo cephadm shell -- ceph orch host label add <target_node> mgr
```
Repeat steps 1-7 for each target node that hosts a Ceph Manager daemon.

Get the Ceph Manager spec:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ mkdir -p ${SPEC_DIR}
$ sudo cephadm shell -- ceph orch ls --export mgr > ${SPEC_DIR}/mgr

Edit the retrieved spec and add the label: mgr section to the placement section:
```
service_type: mgr
service_id: mgr
placement:
  label: mgr
```
Save the spec.

Apply the spec with cephadm by using the Ceph Orchestrator:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/mgr -- ceph orch apply -i /mnt/mgr

Verification

Verify that the new Ceph Manager daemons are created in the target nodes:

$ sudo cephadm shell -- ceph orch ps | grep -i mgr
$ sudo cephadm shell -- ceph -s

The Ceph Manager daemon count should match the number of hosts where the mgr label is added.

The migration does not shrink the Ceph Manager daemons. The count grows by the number of target nodes, and migrating Ceph Monitor daemons to Ceph nodes decommissions the stand-by Ceph Manager instances. For more information, see Migrating Ceph Monitor daemons to Ceph nodes.

7.5.2. Migrating Ceph Monitor daemons to Ceph nodes

You must move Ceph Monitor daemons from the OpenStack (OSP) Controller nodes to a set of target nodes. Target nodes are either existing Ceph nodes, or OSP Compute nodes if Ceph is deployed by TripleO with a Hyperconverged Infrastructure (HCI) topology. Additional Ceph Monitors are deployed to the target nodes, and they are promoted as _admin nodes that you can use to manage the Ceph Storage cluster and perform day 2 operations.

To migrate the Ceph Monitor daemons, you must perform the following high-level steps:

Configure the target nodes for Ceph Monitor migration.
Drain the source node.
Migrate your Ceph Monitor IP addresses to the target nodes.
Redeploy the Ceph Monitor on the target node.
Verify that the Ceph Storage cluster is healthy.

Repeat these steps for any additional Controller node that hosts a Ceph Monitor until you migrate all the Ceph Monitor daemons to the target nodes.

Configuring target nodes for Ceph Monitor migration

Prepare the target Ceph nodes for the Ceph Monitor migration by performing the following actions:

Enable firewall rules in a target node and persist them.
Create a spec that is based on labels and apply it by using cephadm.
Ensure that the Ceph Monitor quorum is maintained during the migration process.

Procedure

SSH into the target node and enable the firewall rules that are required to reach a Ceph Monitor service:
```
$ for port in 3300 6789; {
    ssh heat-admin@<target_node> sudo iptables -I INPUT \
    -p tcp -m tcp --dport $port -m conntrack --ctstate NEW \
    -j ACCEPT;
}
```
- Replace <target_node> with the hostname of the node that hosts the new Ceph Monitor.
Check that the rules are properly applied to the target node and persist them:
```
$ sudo iptables-save
$ sudo systemctl restart iptables
```

If nftables is used in the existing deployment, edit /etc/nftables/tripleo-rules.nft and add the following content:

# 110 ceph_mon {'dport': [6789, 3300, '9100']}
add rule inet filter TRIPLEO_INPUT tcp dport { 6789,3300,9100 } ct state new counter accept comment "110 ceph_mon"

Save the file.
Restart the nftables service:
```
$ sudo systemctl restart nftables
```
Verify that the rules are applied:
```
$ sudo nft list ruleset | grep ceph_mon
```
To migrate the existing Ceph Monitors to the target Ceph nodes, retrieve the Ceph mon spec from the first Ceph Monitor, or the first Controller node:
```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ mkdir -p ${SPEC_DIR}
$ sudo cephadm shell -- ceph orch ls --export mon > ${SPEC_DIR}/mon
```

Add the label:mon section to the placement section:

service_type: mon
service_id: mon
placement:
  label: mon

Save the spec.

Apply the spec with cephadm by using the Ceph Orchestrator:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/mon -- ceph orch apply -i /mnt/mon

Extend the mon label to the remaining Ceph target nodes to ensure that quorum is maintained during the migration process:

for item in $(sudo cephadm shell --  ceph orch host ls --format json | jq -r '.[].hostname'); do
    sudo cephadm shell -- ceph orch host label add  $item mon;
    sudo cephadm shell -- ceph orch host label add  $item _admin;
done

Applying the mon spec allows the existing strategy to use labels instead of hosts. As a result, any node with the mon label can host a Ceph Monitor daemon. Perform this step only once to avoid multiple iterations when multiple Ceph Monitors are migrated.

Check the status of the Ceph Storage and the Ceph Orchestrator daemons list. Ensure that Ceph Monitors are in a quorum and listed by the ceph orch command:

$ sudo cephadm shell -- ceph -s
  cluster:
    id:     f6ec3ebe-26f7-56c8-985d-eb974e8e08e3
    health: HEALTH_OK

  services:
    mon: 6 daemons, quorum controller-0,controller-1,controller-2,ceph-0,ceph-1,ceph-2 (age 19m)
    mgr: controller-0.xzgtvo(active, since 32m), standbys: controller-1.mtxohd, controller-2.ahrgsk
    osd: 8 osds: 8 up (since 12m), 8 in (since 18m); 1 remapped pgs

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   43 MiB used, 400 GiB / 400 GiB avail
    pgs:     1 active+clean

$ sudo cephadm shell -- ceph orch host ls
HOST              ADDR           LABELS          STATUS
ceph-0        192.168.24.14  osd mon mgr _admin
ceph-1        192.168.24.7   osd mon mgr _admin
ceph-2        192.168.24.8   osd mon mgr _admin
controller-0  192.168.24.15  _admin mgr mon
controller-1  192.168.24.23  _admin mgr mon
controller-2  192.168.24.13  _admin mgr mon

Set up a Ceph client on the first Controller node that is used during the rest of the procedure to interact with Ceph. Set up an additional IP address on the storage network that is used to interact with Ceph when the first Controller node is decommissioned:
1. Back up the content of /etc/ceph in the ceph_client_backup directory.
  $ mkdir -p $HOME/ceph_client_backup $ sudo cp -R /etc/ceph/* $HOME/ceph_client_backup
2. Edit /etc/os-net-config/config.yaml and add - ip_netmask: 172.17.3.200 after the IP address on the VLAN that belongs to the storage network. Replace 172.17.3.200 with any other available IP address on the storage network that can be statically assigned to controller-0.
3. Save the file and refresh the controller-0 network configuration:
  $ sudo os-net-config -c /etc/os-net-config/config.yaml
4. Verify that the IP address is present in the Controller node:
  $ ip -o a | grep 172.17.3.200
5. Ping the IP address and confirm that it is reachable:
  $ ping -c 3 172.17.3.200
6. Verify that you can interact with the Ceph cluster:
  $ sudo cephadm shell -c $HOME/ceph_client_backup/ceph.conf -k $HOME/ceph_client_backup/ceph.client.admin.keyring -- ceph -s

Next steps

Proceed to the next step Draining the source node.

Draining the source node

Drain the source node and remove the source node host from the Ceph Storage cluster.

Procedure

On the source node, back up the /etc/ceph/ directory to run cephadm and get a shell for the Ceph cluster from the source node:
```
$ mkdir -p $HOME/ceph_client_backup
$ sudo cp -R /etc/ceph $HOME/ceph_client_backup
```
Identify the active ceph-mgr instance:
```
$ sudo cephadm shell -- ceph mgr stat
```
Fail the ceph-mgr if it is active on the source node:
```
$ sudo cephadm shell -- ceph mgr fail <mgr_instance>
```
- Replace <mgr_instance> with the Ceph Manager daemon to fail.
From the cephadm shell, remove the labels on the source node:
```
$ for label in mon mgr _admin; do
    sudo cephadm shell -- ceph orch host label rm <source_node> $label;
done
```
- Replace <source_node> with the hostname of the source node.
Optional: Ensure that you remove the Ceph Monitor daemon from the source node if it is still running:
```
$ sudo cephadm shell -- ceph orch daemon rm mon.<source_node> --force
```

Drain the source node to remove any leftover daemons:

$ sudo cephadm shell -- ceph orch host drain <source_node>

Remove the source node host from the Ceph Storage cluster:

$ sudo cephadm shell -- ceph orch host rm <source_node> --force

The source node is not part of the cluster anymore, and should not appear in the Ceph host list when you run sudo cephadm shell -- ceph orch host ls. However, if you run sudo podman ps in the source node, the list might show that both Ceph Monitors and Ceph Managers are still running.

[root@controller-1 ~]# sudo podman ps
CONTAINER ID  IMAGE                                                                                        COMMAND               CREATED         STATUS             PORTS       NAMES
5c1ad36472bc  quay.io/ceph/daemon@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4  -n mon.contro...  35 minutes ago  Up 35 minutes ago              ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mon-controller-1
3b14cc7bf4dd  quay.io/ceph/daemon@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4  -n mgr.contro...  35 minutes ago  Up 35 minutes ago              ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mgr-controller-1-mtxohd

Confirm that mons are still in quorum:

$ sudo cephadm shell -- ceph -s
$ sudo cephadm shell -- ceph orch ps | grep -i mon

Next steps

Proceed to the next step Migrating the Ceph Monitor IP address.

Migrating the Ceph Monitor IP address

You must migrate your Ceph Monitor IP addresses to the target Ceph nodes. The IP address migration assumes that the target nodes are originally deployed by TripleO and that the network configuration is managed by os-net-config.

Procedure

Get the original Ceph Monitor IP addresses from $HOME/ceph_client_backup/ceph.conf file on the mon_host line, for example:

mon_host = [v2:172.17.3.60:3300/0,v1:172.17.3.60:6789/0] [v2:172.17.3.29:3300/0,v1:172.17.3.29:6789/0] [v2:172.17.3.53:3300/0,v1:172.17.3.53:6789/0]

Match the IP address retrieved in the previous step with the storage network IP addresses on the source node, and find the Ceph Monitor IP address:

[tripleo-admin@controller-0 ~]$ ip -o -4 a | grep 172.17.3
9: vlan30    inet 172.17.3.60/24 brd 172.17.3.255 scope global vlan30\       valid_lft forever preferred_lft forever
9: vlan30    inet 172.17.3.13/32 brd 172.17.3.255 scope global vlan30\       valid_lft forever preferred_lft forever

Confirm that the Ceph Monitor IP address is present in the os-net-config configuration that is located in the /etc/os-net-config directory on the source node:
```
[tripleo-admin@controller-0 ~]$ grep "172.17.3.60" /etc/os-net-config/config.yaml
    - ip_netmask: 172.17.3.60/24
```
Edit the /etc/os-net-config/config.yaml file and remove the ip_netmask line.
Save the file and refresh the node network configuration:
```
$ sudo os-net-config -c /etc/os-net-config/config.yaml
```
Verify that the IP address is not present in the source node anymore, for example:
```
[controller-0]$ ip -o a | grep 172.17.3.60
```
SSH into the target node, for example cephstorage-0, and add the IP address for the new Ceph Monitor.
On the target node, edit /etc/os-net-config/config.yaml and add the - ip_netmask: 172.17.3.60 line that you removed in the source node.
Save the file and refresh the node network configuration:
```
$ sudo os-net-config -c /etc/os-net-config/config.yaml
```
Verify that the IP address is present in the target node.
```
$ ip -o a | grep 172.17.3.60
```
From the Ceph client node, controller-0, ping the IP address that is migrated to the target node and confirm that it is still reachable:
```
[controller-0]$ ping -c 3 172.17.3.60
```

Next steps

Proceed to the next step Redeploying the Ceph Monitor on the target node.

Redeploying a Ceph Monitor on the target node

You use the IP address that you migrated to the target node to redeploy the Ceph Monitor on the target node.

Procedure

From the Ceph client node, for example controller-0, get the Ceph mon spec:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -- ceph orch ls --export mon > ${SPEC_DIR}/mon

Edit the retrieved spec and add the unmanaged: true keyword:

service_type: mon
service_id: mon
placement:
  label: mon
unmanaged: true

Save the spec.
Apply the spec with cephadm by using the Ceph Orchestrator:
```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/mon -- ceph orch apply -i /mnt/mon
```
The Ceph Monitor daemons are marked as unmanaged, and you can now redeploy the existing daemon and bind it to the migrated IP address.
Delete the existing Ceph Monitor on the target node:
```
$ sudo cephadm shell -- ceph orch daemon rm mon.<target_node> --force
```
- Replace <target_node> with the hostname of the target node that is included in the Ceph cluster.
Redeploy the new Ceph Monitor on the target node by using the migrated IP address:
```
$ sudo cephadm shell -- ceph orch daemon add mon <target_node>:<ip_address>
```
- Replace <ip_address> with the IP address of the migrated IP address.

Get the Ceph Monitor spec:

$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -- ceph orch ls --export mon > ${SPEC_DIR}/mon

Edit the retrieved spec and set the unmanaged keyword to false:

service_type: mon
service_id: mon
placement:
  label: mon
unmanaged: false

Save the spec.
Apply the spec with cephadm by using the Ceph Orchestrator:
```
$ SPEC_DIR=${SPEC_DIR:-"$PWD/ceph_specs"}
$ sudo cephadm shell -m ${SPEC_DIR}/mon -- ceph orch apply -i /mnt/mon
```
The new Ceph Monitor runs on the target node with the original IP address.
Identify the running mgr:
```
$ sudo cephadm shell -- ceph mgr stat
```
Refresh the Ceph Manager information by force-failing it:
```
$ sudo cephadm shell -- ceph mgr fail
```

Refresh the OSD information:

$ sudo cephadm shell -- ceph orch reconfig osd.default_drive_group

Next steps

Repeat the procedure starting from step Draining the source node for each node that you want to decommission. Proceed to the next step Verifying the Ceph Storage cluster after Ceph Monitor migration.

Verifying the Ceph Storage cluster after Ceph Monitor migration

After you finish migrating your Ceph Monitor daemons to the target nodes, verify that the the Ceph Storage cluster is healthy.

Procedure

Verify that the Ceph Storage cluster is healthy:

$ ceph -s
  cluster:
    id:     f6ec3ebe-26f7-56c8-985d-eb974e8e08e3
    health: HEALTH_OK
...
...

Verify that the Ceph Storage mons are running with the old IP addresses. SSH into the target nodes and verify that the Ceph Monitor daemons are bound to the expected IP and port:
```
$ netstat -tulpn | grep 3300
```

7.6. Updating the Ceph Storage cluster Ceph Dashboard configuration

If the Ceph Dashboard is part of the enabled Ceph Manager modules, you need to reconfigure the failover settings.

Procedure

Regenerate the following Ceph configuration keys to point to the right mgr container:

mgr    advanced  mgr/dashboard/controller-0.ycokob/server_addr  172.17.3.33
mgr    advanced  mgr/dashboard/controller-1.lmzpuc/server_addr  172.17.3.147
mgr    advanced  mgr/dashboard/controller-2.xpdgfl/server_addr  172.17.3.138

$ sudo cephadm shell
$ ceph orch ps | awk '/mgr./ {print $1}'

For each retrieved mgr daemon, update the corresponding entry in the Ceph configuration:
```
$ ceph config set mgr mgr/dashboard/<>/server_addr/<ip addr>
```

Adopting a Red Hat OpenStack Platform 17.1 deployment

1. Red Hat OpenStack Services on OpenShift Antelope adoption overview

1.1. Adoption limitations

1.2. Adoption prerequisites

1.3. Guidelines for planning the adoption

1.4. Adoption process overview

1.5. Overview of Distributed Compute Node adoption

1.6. Installing the systemd-container package on Compute hosts

1.7. Identity service authentication

1.8. Configuring the network for the Red Hat OpenStack Services on OpenShift deployment

1.8.1. Retrieving the network configuration from your existing deployment

1.8.2. Planning your IPAM configuration

Configuring new subnet ranges

Reusing existing subnet ranges

1.8.3. Configuring isolated networks

Configuring isolated networks on OCP worker nodes

Configuring isolated networks on control plane services

Configuring isolated networks on data plane nodes

1.9. Configuring spine-leaf networks for the Red Hat OpenStack Services on OpenShift deployment

1.10. Storage requirements

1.10.1. Storage driver certification

1.10.2. Block Storage service guidelines

Preparing the Block Storage service by using an agnostic configuration file

About the Block Storage service configuration generation helper tool

1.10.3. Limitations for adopting the Block Storage service

1.10.4. OCP preparation for Block Storage service adoption

1.10.5. Preparing the Block Storage service by customizing the configuration

1.10.6. Changes to CephFS through NFS

1.11. Ceph prerequisites

1.11.1. Completing prerequisites for a Ceph cluster with monitoring stack components

1.11.2. Completing prerequisites for Ceph RGW migration

1.11.3. Completing prerequisites for a Ceph RBD migration

1.11.4. Creating an NFS Ganesha cluster

1.12. Preparing an Instance HA deployment for adoption

1.12.1. Maintaining the Instance HA functionality after adoption

1.12.2. Preventing Pacemaker from monitoring Compute nodes

1.13. Comparing configuration files between deployments

2. Migrating TLS-e to the RHOSO deployment

3. Migrating databases to the control plane

3.1. Retrieving topology-specific service configuration

3.2. Deploying back-end services

3.3. Configuring a Ceph back end

3.4. Stopping OpenStack services

3.5. Migrating databases to MariaDB instances

3.6. Migrating OVN data

4. Adopting OpenStack control plane services

4.1. Adopting the Identity service

4.2. Configuring LDAP with domain-specific drivers

4.3. Adopting the Key Manager service

4.4. Adopting the Key Manager service with HSM integration

4.4.1. Key Manager service HSM adoption approaches

4.4.2. Adopting the Key Manager service with Proteccio HSM integration

4.4.3. Adopting the Key Manager service with HSM integration

4.4.4. Troubleshooting Key Manager HSM adoption

Resolving configuration validation failures

Resolving missing HSM file prerequisites

Resolving source environment connectivity issues

Resolving HSM secret creation failures

Resolving custom image registry issues

Resolving HSM back-end detection failures

Resolving database migration issues

Resolving service startup failures

Resolving performance and connectivity issues

4.4.5. Troubleshooting Key Manager service Proteccio HSM adoption

Resolving prerequisite validation failures

Resolving SSH connection failures to the source environment

Resolving database import failures

Resolving custom image pull failures

Resolving HSM certificate mounting issues

Resolving service startup failures

Resolving adoption verification failures

4.4.6. Rolling back the HSM adoption

4.5. Adopting the Networking service

4.6. Configuring control plane networking for spine-leaf topologies

4.7. Adopting the Object Storage service

4.8. Adopting the Image service

4.8.1. Adopting the Image service that is deployed with a Object Storage service back end

4.8.2. Adopting the Image service that is deployed with a Block Storage service back end

4.8.3. Adopting the Image service that is deployed with an NFS back end

4.8.4. Adopting the Image service that is deployed with a Ceph back end

1.6. Installing the `systemd-container` package on Compute hosts