Developer Documentation

The watcher-operator is an OpenShift Operator built using the Operator Framework for Go. The Operator provides a way to install and manage the OpenStack Watcher service on OpenShift. This Operator is developed using RDO containers for OpenStack.

Description

This operator is built using the operator-sdk framework to provide day one and day two lifecycle management of the OpenStack Watcher service on an OpenShift cluster.

Getting Started

Prerequisites

  • go version v1.21.0+

  • docker version 17.03+

  • kubectl version v1.11.3+

  • Access to a Kubernetes v1.11.3+ cluster

To Deploy on the cluster

Build and push your image to the location specified by IMG:

make docker-build docker-push IMG=<some-registry>/watcher-operator:tag

NOTE: The image must be published in the personal registry you specified, and your working environment must have access to pull it from there. If the above commands do not work, make sure you have the proper permissions for the registry.
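
If the push fails because you are not authenticated, log in to the registry first (a minimal example; substitute your own registry and credentials):

docker login <some-registry>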

Install the CRDs into the cluster:

make install

Deploy the Manager to the cluster with the image specified by IMG:

make deploy IMG=<some-registry>/watcher-operator:tag

NOTE: If you encounter RBAC errors, you may need to grant yourself cluster-admin privileges or be logged in as admin.
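
For example, a user who already has cluster-admin can grant the role with the following command (a sketch; substitute the target username):

oc adm policy add-cluster-role-to-user cluster-admin <user>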

Create instances of your solution. You can apply the samples (examples) from config/samples:

kubectl apply -k config/samples/

NOTE: Ensure that the samples have default values so that you can test them out.
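
To verify that the sample resources were created (a sketch, assuming the samples define Watcher custom resources):

kubectl get watcher --all-namespaces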

To Uninstall

Delete the instances (CRs) from the cluster:

kubectl delete -k config/samples/

Delete the APIs (CRDs) from the cluster:

make uninstall

Undeploy the controller from the cluster:

make undeploy

To Deploy via OLM

Deploy watcher-operator via OLM:

make watcher

Deploy watcher-operator via OLM with a different catalog image:

make watcher CATALOG_IMAGE=<catalog image url with tag>

To Deploy watcher service

Deploy the watcher service:

make watcher_deploy

To Uninstall OLM deployed watcher-operator

Undeploy the watcher service:

make watcher_deploy_cleanup

Uninstall watcher-operator:

make watcher_cleanup

Project Distribution

Following are the steps to build the installer and distribute this project to users.

  1. Build the installer for the image built and published in the registry:

make build-installer IMG=<some-registry>/watcher-operator:tag

The makefile target mentioned above generates an install.yaml file in the dist directory. This file contains all the resources built with Kustomize, which are necessary to install this project without its dependencies.

  2. Using the installer

Users can just run kubectl apply -f to install the project, i.e.:

kubectl apply -f https://raw.githubusercontent.com/<org>/watcher-operator/<tag or branch>/dist/install.yaml

Contributing

NOTE: Run make help for more information on all potential make targets

More information can be found via the Kubebuilder Documentation

License

Copyright 2024.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

User Installation Guide

Getting Started

Before installing the Watcher operator, you first need a functional OpenShift installation with the required OpenStack operators, including the Telemetry operator. The following links point to documents detailing how to create this required starting environment:

A CRC (CodeReady Containers) installation is adequate for a developer environment.

To verify that the environment set up is ready, do the following:

  1. Log in to the Kubernetes/OpenShift environment:

    $ oc login -u <username> -p <password> https://api.crc.testing:6443 --insecure-skip-tls-verify=true
  2. Access the OpenStack client and verify that the service endpoints are available:

    $ oc rsh openstackclient openstack endpoint list -c 'ID' -c 'Service Name' -c 'Enabled'
    +----------------------------------+--------------+---------+
    | ID                               | Service Name | Enabled |
    +----------------------------------+--------------+---------+
    | 0bada656064a4d409bc5fed610654edd | neutron      | True    |
    | 17453066f8dc40bfa0f8584007cffc9a | cinderv3     | True    |
    | 22768bf3e9a34fefa57b96c20d405cfe | keystone     | True    |
    | 284fd13676ed4bb095b602a004b4a0f2 | watcher      | True    |
    | 48602a17288f442a98c300931f82244a | watcher      | True    |
    | 54e3d48cdda84263b7f1c65c924f3e3a | glance       | True    |
    | 74345a18262740eb952d2b6b7220ceeb | keystone     | True    |
    | 789a2d6048174b849a7c7243421675b4 | placement    | True    |
    | 9b7d8f26834343a59108a4225e0e574a | nova         | True    |
    | a836d134394846ff88f2f3dd8d96de34 | nova         | True    |
    | af1bf23e62c148d3b7f6c47f8f071739 | placement    | True    |
    | ce0489dfeff64afb859338e480397f90 | glance       | True    |
    | db69cc22117344b796f97e8dd3dc67e5 | neutron      | True    |
    | fa48dc132b524915b4d1ca963c50a653 | cinderv3     | True    |
    +----------------------------------+--------------+---------+
  3. Verify that the Telemetry operator with Prometheus metric storage is ready:

    $ oc get telemetry
    NAME        STATUS   MESSAGE
    telemetry   True     Setup complete
    
    $ oc get metricstorage
    NAME             STATUS   MESSAGE
    metric-storage   True     Setup complete
    
    $ oc get route metric-storage-prometheus
    NAME                        HOST/PORT                                              PATH   SERVICES                    PORT   TERMINATION     WILDCARD
    metric-storage-prometheus   metric-storage-prometheus-openstack.apps-crc.testing          metric-storage-prometheus   web    edge/Redirect   None
  4. You can view the Prometheus metrics in a web browser at the HOST/PORT address, for example, https://metric-storage-prometheus-openstack.apps-crc.testing.
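
    If you prefer the command line, you can also query the Prometheus HTTP API through the same route. This is a minimal sketch that assumes the route is reachable from your workstation without additional authentication:

    $ curl -kG https://metric-storage-prometheus-openstack.apps-crc.testing/api/v1/query \
      --data-urlencode 'query=up'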

Installing the Operator

Procedure

Now that you have a working environment ready, you can install the Watcher Operator.

NOTE: The steps below require you to log in to your OpenShift cluster as a user with cluster-admin privileges.

  1. Create a watcher-operator.yaml file:

    ---
    apiVersion: operators.coreos.com/v1alpha1
    kind: CatalogSource
    metadata:
      name: watcher-operator-index
      namespace: openstack-operators
    spec:
      image: quay.io/openstack-k8s-operators/watcher-operator-index:latest
      sourceType: grpc
    ---
    apiVersion: operators.coreos.com/v1
    kind: OperatorGroup
    metadata:
      name: openstack
      namespace: openstack-operators
    ---
    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: watcher-operator
      namespace: openstack-operators
    spec:
      name: watcher-operator
      channel: alpha
      source: watcher-operator-index
      sourceNamespace: openstack-operators
  2. Apply the file with oc apply to create the resources:

    $ oc apply -f watcher-operator.yaml
    catalogsource.operators.coreos.com/watcher-operator-index created
    operatorgroup.operators.coreos.com/openstack unchanged
    subscription.operators.coreos.com/watcher-operator created
  3. Check that the operator is installed:

    $ oc get subscription.operators.coreos.com/watcher-operator -n openstack-operators
    NAME               PACKAGE            SOURCE                   CHANNEL
    watcher-operator   watcher-operator   watcher-operator-index   alpha
    
    $ oc get pod -l openstack.org/operator-name=watcher -n openstack-operators
    NAME                                                  READY   STATUS    RESTARTS   AGE
    watcher-operator-controller-manager-dd95db756-kslw9   2/2     Running   0          44s
    
    $ oc get csv watcher-operator.v0.0.1
    NAME                      DISPLAY            VERSION   REPLACES   PHASE
    watcher-operator.v0.0.1   Watcher Operator   0.0.1                Succeeded

Deploying the Watcher Service

Now you need to create a Watcher Custom Resource based on the Watcher CRD in the same project where your OpenStackControlPlane CR is created. Typically, this is the openstack project, but you can check it with:

$ oc get OpenStackControlPlane --all-namespaces
NAMESPACE   NAME                     STATUS   MESSAGE
openstack   openstack-controlplane   True     Setup complete

Procedure
  1. Use the following commands to view the Watcher CRD definition and specification schema:

    $ oc describe crd watcher
    
    $ oc explain watcher.spec
  2. Add a WatcherPassword field to the Secret created as part of the control plane deployment.
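
    For example, the relevant part of the Secret manifest might look like the following sketch (assuming the Secret is named osp-secret, as in the next step; keep your existing keys and add the new field):

    apiVersion: v1
    kind: Secret
    metadata:
      name: osp-secret
      namespace: openstack
    stringData:
      WatcherPassword: <password for the watcher service user>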

  3. Update the Secret, and verify that the WatcherPassword field is present:

    $ oc apply -f <secret file> -n openstack
    
    $ oc describe secret osp-secret -n openstack | grep Watcher
    WatcherPassword:                  9 bytes
  4. Create a file on your workstation named watcher.yaml to define the Watcher CR. Although the exact parameters of your file may depend on your specific environment customization, a Watcher CR similar to the example below would work in a typical deployment:

    apiVersion: watcher.openstack.org/v1beta1
    kind: Watcher
    metadata:
      name: watcher
    spec:
      databaseInstance: "openstack"
      secret: <name of the secret with the credentials of the ControlPlane deploy>
      apiServiceTemplate:
        tls:
          caBundleSecretName: "combined-ca-bundle"

    There are certain fields of the Watcher CR spec that need to match the values used in the existing OpenStackControlPlane:

    • databaseInstance parameter value must match the name of the galera database created in the existing Control Plane. By default, this value is openstack, but you can find it by running (ignore any galera having cell in its name):

      $ oc get galeras -n openstack
      NAME              READY   MESSAGE
      openstack         True    Setup complete
    • rabbitMqClusterName parameter value should be the name of the existing RabbitMQ cluster, which can be found with the command below (ignore any rabbitmq having cell in its name). By default, it is rabbitmq.

      $ oc get rabbitmq -n openstack
      NAME             ALLREPLICASREADY   RECONCILESUCCESS   AGE
      rabbitmq         True               True               6d15h
    • memcachedInstance must contain the name of the existing memcached CR in the same project (memcached by default). You can find it with:

      $ oc get memcached -n openstack
      NAME        READY   MESSAGE
      memcached   True    Setup complete
    • caBundleSecretName under the apiServiceTemplate.tls section must match the value returned by the following command:

      $ oc get OpenStackControlPlane openstack-controlplane -n openstack \
        -o jsonpath='{.status.tls.caBundleSecretName}'
      combined-ca-bundle

      For more information about how to define an OpenStackControlPlane custom resource (CR), see Creating the control plane.

  5. Apply the file to configure Watcher:

    $ oc apply -f watcher.yaml -n openstack
    watcher.watcher.openstack.org/watcher configured
  6. To check the service status, run:

    $ oc wait -n openstack --for condition=Ready --timeout=300s Watcher watcher
    watcher.watcher.openstack.org/watcher condition met

    where Watcher refers to the kind and watcher refers to the name of the CR.

  7. Check that the watcher service has been registered in the list of Keystone services with the following command:

    $ oc rsh openstackclient openstack service list
    +----------------------------------+------------+-------------+
    | ID                               | Name       | Type        |
    +----------------------------------+------------+-------------+
    | 1470e8d6019446a1bcdfdb6dc55f3f6a | nova       | compute     |
    | 41d60e1c678142cf8e5daf7a82af1864 | neutron    | network     |
    | 5b0d95d1c08e4deb832815addd859924 | ceilometer | Ceilometer  |
    | 7e081cb4928945d7aa41d1622f7b8586 | cinderv3   | volumev3    |
    | 8d7ee56ca2bb4dba999d67580909dd90 | glance     | image       |
    | c3348e10fb414780988fbbceac9c4b5f | watcher    | infra-optim |
    | db60453eca65409bbb0b61f4295c66ec | placement  | placement   |
    | fa717124fbcb4d708ba4c41c9109df81 | keystone   | identity    |
    +----------------------------------+------------+-------------+

Example Workflow

This section takes you through two example workflows that use Watcher for optimization scenarios. The first workflow, instance consolidation onto a minimal number of Compute nodes, uses the CLI. The second workflow, workload stabilization, uses the Horizon UI and Prometheus metrics dashboards.

Requirements

The example requires that the following setup is in place:

  • An OpenStack operators-based deployment with two or more Compute nodes

  • Nova Live migration is functional on your environment

  • Watcher has been deployed following the instructions in the User Installation Guide

  • Instances (Virtual Machines) have been created on the Compute nodes

Test instances can be created using the deploy-instance-demo.sh script:

#!/bin/bash
#
# Copyright 2025 Red Hat Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
#

#
# Instructions to delete what is created in demo project
#
#  oc rsh openstackclient
#  unset OS_CLOUD
#  . ./demorc
#  openstack server delete test_0
#  If more than one were created, delete them
#  openstack router delete priv_router
#  openstack subnet delete priv_sub_demo
#  openstack network delete private_demo
#  openstack security group delete basic
#  openstack image delete cirros
#  export OS_CLOUD=default
#  openstack project delete demo
#


set -ex

# Create Image
IMG=cirros-0.5.2-x86_64-disk.img
URL=http://download.cirros-cloud.net/0.5.2/$IMG
DISK_FORMAT=qcow2
RAW=$IMG
NUMBER_OF_INSTANCES=${1:-1}

openstack project show demo || \
    openstack project create demo
openstack role add --user admin --project demo member

openstack network show public || openstack network create public --external --provider-network-type flat --provider-physical-network datacentre
openstack subnet create public_subnet --subnet-range <PUBLIC SUBNET CIDR> --allocation-pool start=<START PUBLIC SUBNET ADDRESSES RANGE>,end=<END PUBLIC SUBNET ADDRESSES RANGE> --gateway <PUBLIC NETWORK GATEWAY IP> --dhcp --network public

# Create flavor
openstack flavor show m1.small || \
    openstack flavor create --ram 512 --vcpus 1 --disk 1 --ephemeral 1 m1.small

# Use the demo project from now on
unset OS_CLOUD
cp cloudrc demorc
sed -i 's/OS_PROJECT_NAME=admin/OS_PROJECT_NAME=demo/' demorc
. ./demorc

curl -L -# $URL > /tmp/$IMG
if type qemu-img >/dev/null 2>&1; then
    RAW=$(echo $IMG | sed s/img/raw/g)
    qemu-img convert -f qcow2 -O raw /tmp/$IMG /tmp/$RAW
    DISK_FORMAT=raw
fi

openstack image show cirros || \
    openstack image create --container-format bare --disk-format $DISK_FORMAT cirros < /tmp/$RAW

# Create networks
openstack network show private_demo || openstack network create private_demo
openstack subnet show priv_sub_demo || openstack subnet create priv_sub_demo --subnet-range 192.168.0.0/24 --network private_demo
openstack router show priv_router || {
    openstack router create priv_router
    openstack router add subnet priv_router priv_sub_demo
    openstack router set priv_router --external-gateway public
}

# Create security group and icmp/ssh rules
openstack security group show basic || {
    openstack security group create basic
    openstack security group rule create basic --protocol icmp --ingress --icmp-type -1
    openstack security group rule create basic --protocol tcp --ingress --dst-port 22
}

# Create an instance
for (( i=0; i<${NUMBER_OF_INSTANCES}; i++ )); do
    NAME=test_${i}
    openstack server show ${NAME} || {
        openstack server create --flavor m1.small --image cirros --nic net-id=private_demo ${NAME} --security-group basic --wait
        fip=$(openstack floating ip create public -f value -c floating_ip_address)
        openstack server add floating ip ${NAME} $fip
    }
    openstack server list --long

done

Modify the script for your environment and deploy the test instances as shown in the example commands below:

rm -f temp/deploy-instance-demo.sh
cp deploy-instance-demo.sh temp/deploy-instance-demo.sh

# Modify the openstack subnet create public_subnet command options
# to your particular environment.

# This example uses 8 instances.
# Adjust the number per the number of instances to be used
# in the test environment
oc cp temp/deploy-instance-demo.sh openstackclient:/home/cloud-admin
oc rsh openstackclient bash /home/cloud-admin/deploy-instance-demo.sh 8

You can check which hosts the VMs have been deployed on using the --long option on the openstack server list command:

$ oc rsh openstackclient openstack server list --project demo \
--long -c 'Name' -c 'Status' -c 'Host'
+--------+--------+-------------------------------+
| Name   | Status | Host                          |
+--------+--------+-------------------------------+
| test_7 | ACTIVE | compute1.ctlplane.localdomain |
| test_6 | ACTIVE | compute2.ctlplane.localdomain |
| test_5 | ACTIVE | compute1.ctlplane.localdomain |
| test_4 | ACTIVE | compute2.ctlplane.localdomain |
| test_3 | ACTIVE | compute1.ctlplane.localdomain |
| test_2 | ACTIVE | compute2.ctlplane.localdomain |
| test_1 | ACTIVE | compute1.ctlplane.localdomain |
| test_0 | ACTIVE | compute1.ctlplane.localdomain |
+--------+--------+-------------------------------+

Watcher Workflow (CLI)

Optimization Scenario I: Instances Consolidation in Minimal Compute Nodes

Information on the Watcher strategies is available in the OpenStack documentation. The server consolidation strategy is explained in the strategies documentation.

Procedure

The steps below are all executed from the openstackclient pod. Run oc rsh openstackclient to access the openstackclient pod before beginning the workflow steps.
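
For example:

$ oc rsh openstackclient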

  1. Create the Audit Template

    $ openstack optimize audittemplate create -s node_resource_consolidation \
    AuditTemplateNodeConsolidation server_consolidation
    +-------------+--------------------------------------+
    | Field       | Value                                |
    +-------------+--------------------------------------+
    | UUID        | 4b80a46d-d6e3-401a-a615-d8d1d5c3ec1b |
    | Created At  | 2025-02-11T16:16:54.797663+00:00     |
    | Updated At  | None                                 |
    | Deleted At  | None                                 |
    | Description | None                                 |
    | Name        | AuditTemplateNodeConsolidation       |
    | Goal        | server_consolidation                 |
    | Strategy    | node_resource_consolidation          |
    | Audit Scope | []                                   |
    +-------------+--------------------------------------+

    Check that Audit Template has been created with:

    $ openstack optimize audittemplate list -c 'UUID' -c 'Goal' -c 'Strategy'
    +--------------------------------------+----------------------+-----------------------------+
    | UUID                                 | Goal                 | Strategy                    |
    +--------------------------------------+----------------------+-----------------------------+
    | 4b80a46d-d6e3-401a-a615-d8d1d5c3ec1b | server_consolidation | node_resource_consolidation |
    +--------------------------------------+----------------------+-----------------------------+
  2. Now create the Audit

    $ openstack optimize audit create -a AuditTemplateNodeConsolidation -t ONESHOT \
    --name node_server_consolidation-PoC
    +---------------+--------------------------------------+
    | Field         | Value                                |
    +---------------+--------------------------------------+
    | UUID          | 19ae8e21-3185-4366-96e3-ed04184234e8 |
    | Name          | node_server_consolidation-PoC        |
    | Created At    | 2025-02-11T16:19:41.454788+00:00     |
    | Updated At    | None                                 |
    | Deleted At    | None                                 |
    | State         | PENDING                              |
    | Audit Type    | ONESHOT                              |
    | Parameters    | {'host_choice': 'auto'}              |
    | Interval      | None                                 |
    | Goal          | server_consolidation                 |
    | Strategy      | node_resource_consolidation          |
    | Audit Scope   | []                                   |
    | Auto Trigger  | False                                |
    | Next Run Time | None                                 |
    | Hostname      | None                                 |
    | Start Time    | None                                 |
    | End Time      | None                                 |
    | Force         | False                                |
    +---------------+--------------------------------------+

    Verify the Audit was created with the command below:

    $ openstack optimize audit list -c 'UUID' -c 'Name' -c 'Audit Type' -c 'State'
    +--------------------------------------+-------------------------------+------------+-----------+
    | UUID                                 | Name                          | Audit Type | State     |
    +--------------------------------------+-------------------------------+------------+-----------+
    | 19ae8e21-3185-4366-96e3-ed04184234e8 | node_server_consolidation-PoC | ONESHOT    | SUCCEEDED |
    +--------------------------------------+-------------------------------+------------+-----------+

    Note that you are looking for the State to show SUCCEEDED.

  3. Now check the Action Plan

    $ openstack optimize actionplan list  -c 'UUID' -c 'State' -c 'Global efficacy'
    +--------------------------------------+-------------+-------------------------------+
    | UUID                                 | State       | Global efficacy               |
    +--------------------------------------+-------------+-------------------------------+
    | dfdcb491-89c5-4c07-a5ed-65d2085c488c | RECOMMENDED | Released_nodes_ratio: 50.00 % |
    |                                      |             |                               |
    +--------------------------------------+-------------+-------------------------------+

    Note that the State is RECOMMENDED and the Global efficacy shows Released_nodes_ratio: 50.00 %. This indicates that implementing this Action Plan will empty 50% of the Compute nodes.

  4. List the actions inside this Action Plan

    # dfdcb491-89c5-4c07-a5ed-65d2085c488c is the UUID of the Action Plan
    $ openstack optimize action list --action-plan dfdcb491-89c5-4c07-a5ed-65d2085c488c \
    -c 'UUID' -c 'State' -c 'Action'
    +--------------------------------------+-----------+---------------------------+
    | UUID                                 | State     | Action                    |
    +--------------------------------------+-----------+---------------------------+
    | 01774d02-00a9-4f34-a6f7-6b2264b8970c | PENDING   | change_nova_service_state |
    | 8573aa4e-6fac-4002-8de6-569e72d4dca3 | PENDING   | migrate                   |
    | 6d88ea4a-012b-4cb8-af86-8e699c6c2738 | PENDING   | migrate                   |
    | fa2827c6-78f8-48b8-8f8a-734b9f170841 | PENDING   | migrate                   |
    | 4009c44d-9af6-4a6e-91dd-96bd4f17abd5 | PENDING   | migrate                   |
    | e3dc2dec-74fc-4f16-b76d-c4b99acb1b01 | PENDING   | change_nova_service_state |
    +--------------------------------------+-----------+---------------------------+

    Listed above, you can see that the Action Plan has six actions. You can see the details of each Action with: $ openstack optimize action show <Action UUID>.

    $ openstack optimize action show 8573aa4e-6fac-4002-8de6-569e72d4dca3 \
    --max-width=72
    +-------------+--------------------------------------------------------+
    | Field       | Value                                                  |
    +-------------+--------------------------------------------------------+
    | UUID        | 8573aa4e-6fac-4002-8de6-569e72d4dca3                   |
    | Created At  | 2025-02-11T16:19:44+00:00                              |
    | Updated At  | None                                                   |
    | Deleted At  | None                                                   |
    | Parents     | ['01774d02-00a9-4f34-a6f7-6b2264b8970c']               |
    | State       | PENDING                                                |
    | Action Plan | dfdcb491-89c5-4c07-a5ed-65d2085c488c                   |
    | Action      | migrate                                                |
    | Parameters  | {'migration_type': 'live', 'source_node':              |
    |             | 'compute1.ctlplane.localdomain', 'resource_name':      |
    |             | 'test_7', 'resource_id':                               |
    |             | '0cbda264-b496-4649-ab55-9405984092e9'}                |
    | Description | Moving a VM instance from source_node to               |
    |             | destination_node                                       |
    +-------------+--------------------------------------------------------+

    In this example, the Action Plan first disables the Compute node that is going to be freed, then migrates the instances running on it, and finally enables the Compute node again to make sure it is available for new workloads if needed.

  5. Now you are ready to execute the Action Plan using the command: $ openstack optimize actionplan start <Action Plan UUID>.

    $ openstack optimize actionplan start dfdcb491-89c5-4c07-a5ed-65d2085c488c \
    --max-width=72
    +---------------------+------------------------------------------------+
    | Field               | Value                                          |
    +---------------------+------------------------------------------------+
    | UUID                | dfdcb491-89c5-4c07-a5ed-65d2085c488c           |
    | Created At          | 2025-02-11T16:19:44+00:00                      |
    | Updated At          | 2025-02-11T16:38:36+00:00                      |
    | Deleted At          | None                                           |
    | Audit               | 19ae8e21-3185-4366-96e3-ed04184234e8           |
    | Strategy            | node_resource_consolidation                    |
    | State               | PENDING                                        |
    | Efficacy indicators | [{'name': 'compute_nodes_count',               |
    |                     | 'description': 'The total number of enabled    |
    |                     | compute nodes.', 'unit': None, 'value': 2.0},  |
    |                     | {'name': 'released_compute_nodes_count',       |
    |                     | 'description': 'The number of compute nodes    |
    |                     | to be released.', 'unit': None, 'value': 1.0}, |
    |                     | {'name': 'instance_migrations_count',          |
    |                     | 'description': 'The number of VM migrations    |
    |                     |to be performed.', 'unit': None, 'value': 4.0}] |
    | Global efficacy     | [{'name': 'released_nodes_ratio',              |
    |                     | 'description': 'Ratio of released compute      |
    |                     | nodes divided by the total number of enabled   |
    |                     | compute nodes.', 'unit': '%', 'value': 50.0}]  |
    | Hostname            | None                                           |
    +---------------------+------------------------------------------------+
  6. Finally, you can monitor the Action Plan progress and check the results. You can track the status of each action in the plan with: $ openstack optimize action list --action-plan <Action Plan UUID>. After some time, all the actions should report the SUCCEEDED state, as shown in the example below:

    $ openstack optimize action list --action-plan dfdcb491-89c5-4c07-a5ed-65d2085c488c \
    -c 'UUID' -c 'State' -c 'Action'
    +--------------------------------------+-----------+---------------------------+
    | UUID                                 | State     | Action                    |
    +--------------------------------------+-----------+---------------------------+
    | 01774d02-00a9-4f34-a6f7-6b2264b8970c | SUCCEEDED | change_nova_service_state |
    | 8573aa4e-6fac-4002-8de6-569e72d4dca3 | SUCCEEDED | migrate                   |
    | 6d88ea4a-012b-4cb8-af86-8e699c6c2738 | SUCCEEDED | migrate                   |
    | fa2827c6-78f8-48b8-8f8a-734b9f170841 | SUCCEEDED | migrate                   |
    | 4009c44d-9af6-4a6e-91dd-96bd4f17abd5 | SUCCEEDED | migrate                   |
    | e3dc2dec-74fc-4f16-b76d-c4b99acb1b01 | SUCCEEDED | change_nova_service_state |
    +--------------------------------------+-----------+---------------------------+

    You can check that the instances have actually been consolidated onto one of your hosts by listing the instances (VMs) on each of your hypervisors.

    # List the hypervisors:
    $ openstack hypervisor list -c 'Hypervisor Hostname' -c 'State'
    +-------------------------------+-------+
    | Hypervisor Hostname           | State |
    +-------------------------------+-------+
    | compute2.ctlplane.localdomain | up    |
    | compute1.ctlplane.localdomain | up    |
    +-------------------------------+-------+
    # Note that the output below lists all instances on one host:
    $ openstack server list --long -c Name -c 'Host' --project demo
    +--------+-------------------------------+
    | Name   | Host                          |
    +--------+-------------------------------+
    | test_7 | compute2.ctlplane.localdomain |
    | test_6 | compute2.ctlplane.localdomain |
    | test_5 | compute2.ctlplane.localdomain |
    | test_4 | compute2.ctlplane.localdomain |
    | test_3 | compute2.ctlplane.localdomain |
    | test_2 | compute2.ctlplane.localdomain |
    | test_1 | compute2.ctlplane.localdomain |
    | test_0 | compute2.ctlplane.localdomain |
    +--------+-------------------------------+

Watcher Workflow (Horizon Dashboard UI)

Optimization Scenario II: Workload balancing/stabilization

This workflow is demonstrated through actions in the Horizon UI. Example screenshots are added where necessary to further explain the user flow.

Procedure
  1. Log in to the Horizon dashboard with credentials that enable the Admin role in one or more projects.

  2. Make sure you have the Administration menu enabled by selecting the project where you have the Admin role assigned in the Projects tab:

    admin role
  3. In the Admin menu, a new Optimization panel should be available:

    optimize menu
  4. In the Audit Templates panel, click on the Create Template button. This will open a Create Audit Template window. Add a new Audit Template called AuditTemplateWorkloadStabilization with the goal Workload Balancing and the strategy Workload stabilization. Further information on the Workload stabilization strategy is available in the Workload stabilization strategy reference.

    audit template
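
    For reference, an equivalent Audit Template can be created from the openstackclient pod with the CLI, using the goal and strategy identifiers rather than their display names (a sketch based on the command format used in the first scenario):

    $ openstack optimize audittemplate create -s workload_stabilization \
      AuditTemplateWorkloadStabilization workload_balancing
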
  5. In the Audit panel, click on the Create Audit button. This will bring up the Create Audit window. Select the AuditTemplateWorkloadStabilization Audit Template and the CONTINUOUS Audit Type. In the Interval field, set the value to 180.

    audit

    Click on the Create Audit button and a new Audit will be shown.
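
    The equivalent CLI command would look similar to the following sketch (the interval value is in seconds for a CONTINUOUS audit):

    $ openstack optimize audit create -a AuditTemplateWorkloadStabilization \
      -t CONTINUOUS --interval 180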

  6. Click on the UUID of the Audit listed and you will find the Action Plans created for the new Audit. Given the low usage of resources in the instances created for the example workflows, the initial Action Plan will not have real actions.

  7. Increase CPU consumption in one of the created instances (VMs). You can find the CPU usage of each instance in the Prometheus metrics UI, for example at https://metric-storage-prometheus-openstack.apps-crc.testing.

    prometheus metrics

    View the list of instances in Horizon by selecting the Instances panel from the menus: Admin → Compute → Instances. Click on the name of one of the instances, and go to the Console tab for that instance. Log in with the cirros user and the gocubsgo password, and run the following command:

    $ dd if=/dev/random of=/dev/null

    After a few minutes, the CPU usage of the edited instance should increase to close to 100%. This increase will be visible in the instance metrics shown at the Prometheus metrics URL.
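
    You can also query the instance CPU usage through the Prometheus HTTP API. The metric and label names below (ceilometer_cpu, resource_name) are assumptions for a default Telemetry deployment and may differ in your environment:

    $ curl -kG https://metric-storage-prometheus-openstack.apps-crc.testing/api/v1/query \
      --data-urlencode 'query=rate(ceilometer_cpu{resource_name="test_0"}[5m])'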

  8. Go back to the Audit panel through the menu options: Admin → Optimization → Audit. Click on the UUID of the continuous Audit. The next execution of the Audit should generate a non-empty Action Plan with a RECOMMENDED status. Depending on the specific resources, it may take one or two executions of the Audit to create this non-empty plan.

    recommended plan
  9. Click on the RECOMMENDED Action Plan; there should be a Migrate Action listed. Click on the Action to see the related details. The resource_name field should match the name of the instance where you logged in and ran the dd command.

    action migrate
  10. Go back to the list of Action Plans, and click the Start Action Plan button for the RECOMMENDED plan. Click on the Action Plan UUID to track the status until it goes to SUCCEEDED.

    plan succeeded
  11. Check the distribution of the test instances over the hosts using the openstack server list --long command. You should see that the instance where the load was increased has moved.

    $ openstack server list --long -c Name -c 'Host' --project demo
    +--------+-------------------------------+
    | Name   | Host                          |
    +--------+-------------------------------+
    | test_7 | compute1.ctlplane.localdomain |
    | test_6 | compute2.ctlplane.localdomain |
    | test_5 | compute2.ctlplane.localdomain |
    | test_4 | compute2.ctlplane.localdomain |
    | test_3 | compute2.ctlplane.localdomain |
    | test_2 | compute2.ctlplane.localdomain |
    | test_1 | compute2.ctlplane.localdomain |
    | test_0 | compute2.ctlplane.localdomain |
    +--------+-------------------------------+
  12. Stop the CONTINUOUS Audit from the Audits panel by selecting the Cancel Action option. If this option is not available in Horizon in your environment, you can do it using the CLI:

    $ openstack optimize audit update <audit uuid> replace state=CANCELLED