Developer Documentation
The watcher-operator is an OpenShift Operator built using the Operator Framework for Go. The Operator provides a way to install and manage the OpenStack Watcher service on OpenShift. This Operator is developed using RDO containers for OpenStack.
Description
This operator is built using the operator-sdk framework to provide day one and day two lifecycle management of the OpenStack Watcher service on an OpenShift cluster.
Getting Started
Prerequisites
- go version v1.21.0+
- docker version 17.03+
- kubectl version v1.11.3+
- Access to a Kubernetes v1.11.3+ cluster
To Deploy on the cluster
Build and push your image to the location specified by IMG:
make docker-build docker-push IMG=<some-registry>/watcher-operator:tag
NOTE: This image must be published in the registry you specified, and your working environment must be able to pull it from there. If the above commands do not work, make sure you have the proper permissions for the registry.
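If the push fails with an authentication error, log in to the registry first. A minimal sketch, reusing the same <some-registry> placeholder as above:
# Authenticate to the image registry before pushing (podman login works the same way).
docker login <some-registry>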
Install the CRDs into the cluster:
make install
Deploy the Manager to the cluster with the image specified by IMG:
make deploy IMG=<some-registry>/watcher-operator:tag
NOTE: If you encounter RBAC errors, you may need to grant yourself cluster-admin privileges or be logged in as admin.
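If you hit such RBAC errors, a cluster administrator can grant the needed role. A minimal example; the user name is a placeholder for the account that runs make deploy:
# Grant cluster-admin to the deploying user (replace <user> with your account).
oc adm policy add-cluster-role-to-user cluster-admin <user>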
Create instances of your solution. You can apply the samples (examples) from config/samples:
kubectl apply -k config/samples/
NOTE: Ensure that the samples have default values to test them out.
To Uninstall
Delete the instances (CRs) from the cluster:
kubectl delete -k config/samples/
Delete the APIs(CRDs) from the cluster:
make uninstall
UnDeploy the controller from the cluster:
make undeploy
To Deploy via OLM
Deploy watcher-operator via OLM:
make watcher
Deploy watcher-operator via OLM with a different catalog image:
make watcher CATALOG_IMAGE=<catalog image url with tag>
To Deploy watcher service
Deploy the Watcher service:
make watcher_deploy
To Uninstall OLM-deployed watcher-operator
Undeploy the Watcher service:
make watcher_deploy_cleanup
Uninstall watcher-operator:
make watcher_cleanup
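After make watcher completes, it can be useful to confirm that OLM actually created the catalog, subscription, and operator pod. A quick check, assuming the default openstack-operators namespace used by these Makefile targets:
# Verify the OLM resources and the operator pod created by `make watcher`.
oc get catalogsource,subscription,csv -n openstack-operators
oc get pod -l openstack.org/operator-name=watcher -n openstack-operators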
Project Distribution
Following are the steps to build the installer and distribute this project to users.
- Build the installer for the image built and published in the registry:
make build-installer IMG=<some-registry>/watcher-operator:tag
The makefile target mentioned above generates an install.yaml file in the dist directory. This file contains all the resources built with Kustomize, which are necessary to install this project without its dependencies.
- Using the installer: users can simply run kubectl apply -f against the published install.yaml:
kubectl apply -f https://raw.githubusercontent.com/<org>/watcher-operator/<tag or branch>/dist/install.yaml
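If you want to see what the installer would create before applying it for real, a client-side dry run is a reasonable sanity check. A sketch; the URL is the same placeholder as above:
# Preview the resources in install.yaml without creating anything.
kubectl apply -f https://raw.githubusercontent.com/<org>/watcher-operator/<tag or branch>/dist/install.yaml --dry-run=client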
Contributing
NOTE: Run make help for more information on all potential make targets.
More information can be found via the Kubebuilder Documentation
License
Copyright 2024.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
User Installation Guide
Getting Started
Before installing the Watcher operator, you first need a functional OpenShift installation with the required OpenStack operators, including the Telemetry operator. The following links point to documents detailing how to create this required starting environment:
A CRC (Code Ready Containers) installation is adequate for a developer environment.
To verify that the environment set up is ready, do the following:
- Log in to the Kubernetes/OpenShift environment:
$ oc login -u <username> -p <password> https://api.crc.testing:6443 --insecure-skip-tls-verify=true
- Access the OpenStack client and verify that the service endpoints are available:
$ oc rsh openstackclient openstack endpoint list -c 'ID' -c 'Service Name' -c 'Enabled'
+----------------------------------+--------------+---------+
| ID                               | Service Name | Enabled |
+----------------------------------+--------------+---------+
| 0bada656064a4d409bc5fed610654edd | neutron      | True    |
| 17453066f8dc40bfa0f8584007cffc9a | cinderv3     | True    |
| 22768bf3e9a34fefa57b96c20d405cfe | keystone     | True    |
| 284fd13676ed4bb095b602a004b4a0f2 | watcher      | True    |
| 48602a17288f442a98c300931f82244a | watcher      | True    |
| 54e3d48cdda84263b7f1c65c924f3e3a | glance       | True    |
| 74345a18262740eb952d2b6b7220ceeb | keystone     | True    |
| 789a2d6048174b849a7c7243421675b4 | placement    | True    |
| 9b7d8f26834343a59108a4225e0e574a | nova         | True    |
| a836d134394846ff88f2f3dd8d96de34 | nova         | True    |
| af1bf23e62c148d3b7f6c47f8f071739 | placement    | True    |
| ce0489dfeff64afb859338e480397f90 | glance       | True    |
| db69cc22117344b796f97e8dd3dc67e5 | neutron      | True    |
| fa48dc132b524915b4d1ca963c50a653 | cinderv3     | True    |
+----------------------------------+--------------+---------+
- Verify that the Telemetry operator with Prometheus metric storage is ready:
$ oc get telemetry
NAME        STATUS   MESSAGE
telemetry   True     Setup complete
$ oc get metricstorage
NAME             STATUS   MESSAGE
metric-storage   True     Setup complete
$ oc get route metric-storage-prometheus
NAME                        HOST/PORT                                               PATH   SERVICES                    PORT   TERMINATION     WILDCARD
metric-storage-prometheus   metric-storage-prometheus-openstack.apps-crc.testing          metric-storage-prometheus   web    edge/Redirect   None
- You can view the Prometheus metrics in a web browser at the HOST/PORT address, for example, https://metric-storage-prometheus-openstack.apps-crc.testing.
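You can also query the same Prometheus instance from the command line through its HTTP API. A minimal check, assuming the example route host above (the -k flag is needed because the route certificate is usually not in your local trust store):
# The built-in 'up' metric returns one sample per scrape target.
$ curl -k -s "https://metric-storage-prometheus-openstack.apps-crc.testing/api/v1/query?query=up"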
Installing the Operator
Now that you have a working environment, you can install the Watcher Operator. NOTE: The steps below require you to log in to your OpenShift cluster as a user with cluster-admin privileges.
- Create a watcher-operator.yaml file:
---
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: watcher-operator-index
  namespace: openstack-operators
spec:
  image: quay.io/openstack-k8s-operators/watcher-operator-index:latest
  sourceType: grpc
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openstack
  namespace: openstack-operators
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: watcher-operator
  namespace: openstack-operators
spec:
  name: watcher-operator
  channel: alpha
  source: watcher-operator-index
  sourceNamespace: openstack-operators
- oc apply the file to create the resources:
$ oc apply -f watcher-operator.yaml
catalogsource.operators.coreos.com/watcher-operator-index created
operatorgroup.operators.coreos.com/openstack unchanged
subscription.operators.coreos.com/watcher-operator created
- Check that the operator is installed:
$ oc get subscription.operators.coreos.com/watcher-operator -n openstack-operators
NAME               PACKAGE            SOURCE                   CHANNEL
watcher-operator   watcher-operator   watcher-operator-index   alpha
$ oc get pod -l openstack.org/operator-name=watcher -n openstack-operators
NAME                                                  READY   STATUS    RESTARTS   AGE
watcher-operator-controller-manager-dd95db756-kslw9   2/2     Running   0          44s
$ oc get csv watcher-operator.v0.0.1
NAME                      DISPLAY            VERSION   REPLACES   PHASE
watcher-operator.v0.0.1   Watcher Operator   0.0.1                Succeeded
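If you are scripting the installation, you can block until the CSV reports success instead of re-running the commands above. A sketch, assuming a recent oc/kubectl that supports jsonpath-based waits and the CSV name shown above:
# Wait for the operator's ClusterServiceVersion to reach the Succeeded phase.
$ oc wait csv/watcher-operator.v0.0.1 -n openstack-operators \
    --for=jsonpath='{.status.phase}'=Succeeded --timeout=300s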
Deploying the Watcher Service
Now you will need to create a Watcher Custom Resource based on the Watcher CRD in the same project where your OpenStackControlPlane CR is created. Typically, this is the openstack project, but you can check it with:
$ oc get OpenStackControlPlane --all-namespaces
NAMESPACE NAME STATUS MESSAGE
openstack openstack-controlplane True Setup complete
- Use the following commands to view the Watcher CRD definition and specification schema:
$ oc describe crd watcher
$ oc explain watcher.spec
- Add a WatcherPassword field to the Secret created as part of the control plane deployment. For more information, see Providing secure access to the Red Hat OpenStack Services on OpenShift services.
- Update the Secret, and verify that the WatcherPassword field is present:
$ oc apply -f <secret file> -n openstack
$ oc describe secret osp-secret -n openstack | grep Watcher
WatcherPassword:  9 bytes
- Create a file on your workstation named watcher.yaml to define the Watcher CR. Although the exact parameters of your file may depend on your specific environment customization, a Watcher CR similar to the example below would work in a typical deployment:
apiVersion: watcher.openstack.org/v1beta1
kind: Watcher
metadata:
  name: watcher
spec:
  databaseInstance: "openstack"
  secret: <name of the secret with the credentials of the ControlPlane deploy>
  apiServiceTemplate:
    tls:
      caBundleSecretName: "combined-ca-bundle"
There are certain fields of the Watcher CR spec that need to match the values used in the existing OpenStackControlPlane:
- The databaseInstance parameter value must match the name of the Galera database created in the existing control plane. By default, this value is openstack, but you can find it by running (ignore any Galera instance with cell in its name):
$ oc get galeras -n openstack
NAME        READY   MESSAGE
openstack   True    Setup complete
- The rabbitMqClusterName parameter value should be the name of the existing RabbitMQ cluster, which can be found with the following command (ignore any RabbitMQ cluster with cell in its name). By default, it is rabbitmq.
$ oc get rabbitmq -n openstack
NAME       ALLREPLICASREADY   RECONCILESUCCESS   AGE
rabbitmq   True               True               6d15h
- memcachedInstance must contain the name of the existing memcached CR in the same project (memcached by default). You can find it with:
$ oc get memcached -n openstack
NAME        READY   MESSAGE
memcached   True    Setup complete
- caBundleSecretName under the apiServiceTemplate.tls section must match the value found with:
$ oc get OpenStackControlPlane openstack-controlplane -n openstack \
    -o jsonpath='{.status.tls.caBundleSecretName}'
combined-ca-bundle
For more information about how to define an OpenStackControlPlane custom resource (CR), see Creating the control plane.
- oc apply the file to configure Watcher:
$ oc apply -f watcher.yaml -n openstack
watcher.watcher.openstack.org/watcher configured
- To check the service status, run:
$ oc wait -n openstack --for condition=Ready --timeout=300s Watcher watcher
watcher.watcher.openstack.org/watcher condition met
where Watcher refers to the kind and watcher refers to the name of the CR.
- Check that the watcher service has been registered in the list of Keystone services:
$ oc rsh openstackclient openstack service list
+----------------------------------+------------+-------------+
| ID                               | Name       | Type        |
+----------------------------------+------------+-------------+
| 1470e8d6019446a1bcdfdb6dc55f3f6a | nova       | compute     |
| 41d60e1c678142cf8e5daf7a82af1864 | neutron    | network     |
| 5b0d95d1c08e4deb832815addd859924 | ceilometer | Ceilometer  |
| 7e081cb4928945d7aa41d1622f7b8586 | cinderv3   | volumev3    |
| 8d7ee56ca2bb4dba999d67580909dd90 | glance     | image       |
| c3348e10fb414780988fbbceac9c4b5f | watcher    | infra-optim |
| db60453eca65409bbb0b61f4295c66ec | placement  | placement   |
| fa717124fbcb4d708ba4c41c9109df81 | keystone   | identity    |
+----------------------------------+------------+-------------+
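As an extra verification, you can also list the internal Watcher services registered by the decision engine and applier. A hedged example using the watcher plugin of the OpenStack client; the exact columns may vary between releases:
# Both watcher-decision-engine and watcher-applier should report an active status.
$ oc rsh openstackclient openstack optimize service list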
Example Workflow
This section takes you through two example optimization workflows using Watcher. The first workflow, instance consolidation onto a minimal number of Compute nodes, uses the CLI. The second workflow, workload stabilization, uses the Horizon UI and Prometheus metrics dashboards.
Requirements
The example requires that the following setup is in place:
- An OpenStack operators-based deployment with two or more Compute nodes
- Nova live migration is functional in your environment
- Watcher has been deployed following the instructions in the User Installation Guide
- Instances (Virtual Machines) have been created on the Compute nodes
Test instances can be created using the deploy-instance-demo.sh script:
#!/bin/bash
#
# Copyright 2025 Red Hat Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
#
#
# Instructions to delete what is created in demo project
#
# oc rsh openstackclient
# unset OS_CLOUD
# . ./demorc
# openstack server delete test_0
# If more than one were created, delete them
# openstack router delete priv_router
# openstack subnet delete priv_sub_demo
# openstack network delete private_demo
# openstack security group delete basic
# openstack image delete cirros
# export OS_CLOUD=default
# openstack project delete demo
#
set -ex
# Create Image
IMG=cirros-0.5.2-x86_64-disk.img
URL=http://download.cirros-cloud.net/0.5.2/$IMG
DISK_FORMAT=qcow2
RAW=$IMG
NUMBER_OF_INSTANCES=${1:-1}
openstack project show demo || \
openstack project create demo
openstack role add --user admin --project demo member
openstack network show public || openstack network create public --external --provider-network-type flat --provider-physical-network datacentre
openstack subnet create public_subnet --subnet-range <PUBLIC SUBNET CIDR> --allocation-pool start=<START PUBLIC SUBNET ADDRESSES RANGE>,end=<END PUBLIC SUBNET ADDRESSES RANGE> --gateway <PUBLIC NETWORK GATEWAY IP> --dhcp --network public
# Create flavor
openstack flavor show m1.small || \
openstack flavor create --ram 512 --vcpus 1 --disk 1 --ephemeral 1 m1.small
# Use the demo project from now on
unset OS_CLOUD
cp cloudrc demorc
sed -i 's/OS_PROJECT_NAME=admin/OS_PROJECT_NAME=demo/' demorc
. ./demorc
curl -L -# $URL > /tmp/$IMG
if type qemu-img >/dev/null 2>&1; then
RAW=$(echo $IMG | sed s/img/raw/g)
qemu-img convert -f qcow2 -O raw /tmp/$IMG /tmp/$RAW
DISK_FORMAT=raw
fi
openstack image show cirros || \
openstack image create --container-format bare --disk-format $DISK_FORMAT cirros < /tmp/$RAW
# Create networks
openstack network show private_demo || openstack network create private_demo
openstack subnet show priv_sub_demo || openstack subnet create priv_sub_demo --subnet-range 192.168.0.0/24 --network private_demo
openstack router show priv_router || {
openstack router create priv_router
openstack router add subnet priv_router priv_sub_demo
openstack router set priv_router --external-gateway public
}
# Create security group and icmp/ssh rules
openstack security group show basic || {
openstack security group create basic
openstack security group rule create basic --protocol icmp --ingress --icmp-type -1
openstack security group rule create basic --protocol tcp --ingress --dst-port 22
}
# Create an instance
for (( i=0; i<${NUMBER_OF_INSTANCES}; i++ )); do
NAME=test_${i}
openstack server show ${NAME} || {
openstack server create --flavor m1.small --image cirros --nic net-id=private_demo ${NAME} --security-group basic --wait
fip=$(openstack floating ip create public -f value -c floating_ip_address)
openstack server add floating ip ${NAME} $fip
}
openstack server list --long
done
Modify the script for your environment and deploy the test instances as shown in the example commands below:
rm -f temp/deploy-instance-demo.sh
cp deploy-instance-demo.sh temp/deploy-instance-demo.sh
# Modify the openstack subnet create public_subnet command options
# to your particular environment.
# This example uses 8 instances.
# Adjust the number per the number of instances to be used
# in the test environment
oc cp temp/deploy-instance-demo.sh openstackclient:/home/cloud-admin
oc rsh openstackclient bash /home/cloud-admin/deploy-instance-demo.sh 8
You can check which hosts the VMs have been deployed on using the --long option of the openstack server list command:
$ oc rsh openstackclient openstack server list --project demo \
--long -c 'Name' -c 'Status' -c 'Host'
+--------+--------+-------------------------------+
| Name | Status | Host |
+--------+--------+-------------------------------+
| test_7 | ACTIVE | compute1.ctlplane.localdomain |
| test_6 | ACTIVE | compute2.ctlplane.localdomain |
| test_5 | ACTIVE | compute1.ctlplane.localdomain |
| test_4 | ACTIVE | compute2.ctlplane.localdomain |
| test_3 | ACTIVE | compute1.ctlplane.localdomain |
| test_2 | ACTIVE | compute2.ctlplane.localdomain |
| test_1 | ACTIVE | compute1.ctlplane.localdomain |
| test_0 | ACTIVE | compute1.ctlplane.localdomain |
+--------+--------+-------------------------------+
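Because the consolidation workflow below relies on Nova live migration and on enabling/disabling Compute services, a quick pre-check of the nova-compute services can save debugging time later. A small sketch using standard OpenStack client commands:
# Both nova-compute services should be enabled and up before running an Audit.
$ oc rsh openstackclient openstack compute service list --service nova-compute \
    -c Host -c Status -c State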
Watcher Workflow (CLI)
Optimization Scenario I: Instances Consolidation in Minimal Compute Nodes
Information on the Watcher strategies is available in the OpenStack documentation. The server consolidation strategy is explained in the strategies documentation.
The steps below are all executed from the openstackclient pod. Run oc rsh openstackclient to access the openstackclient pod before beginning the workflow steps.
- Create the Audit Template:
$ openstack optimize audittemplate create -s node_resource_consolidation \
  AuditTemplateNodeConsolidation server_consolidation
+-------------+--------------------------------------+
| Field       | Value                                |
+-------------+--------------------------------------+
| UUID        | 4b80a46d-d6e3-401a-a615-d8d1d5c3ec1b |
| Created At  | 2025-02-11T16:16:54.797663+00:00     |
| Updated At  | None                                 |
| Deleted At  | None                                 |
| Description | None                                 |
| Name        | AuditTemplateNodeConsolidation       |
| Goal        | server_consolidation                 |
| Strategy    | node_resource_consolidation          |
| Audit Scope | []                                   |
+-------------+--------------------------------------+
Check that the Audit Template has been created with:
$ openstack optimize audittemplate list -c 'UUID' -c 'Goal' -c 'Strategy'
+--------------------------------------+----------------------+-----------------------------+
| UUID                                 | Goal                 | Strategy                    |
+--------------------------------------+----------------------+-----------------------------+
| 4b80a46d-d6e3-401a-a615-d8d1d5c3ec1b | server_consolidation | node_resource_consolidation |
+--------------------------------------+----------------------+-----------------------------+
- Now create the Audit:
$ openstack optimize audit create -a AuditTemplateNodeConsolidation -t ONESHOT \
  --name node_server_consolidation-PoC
+---------------+--------------------------------------+
| Field         | Value                                |
+---------------+--------------------------------------+
| UUID          | 19ae8e21-3185-4366-96e3-ed04184234e8 |
| Name          | node_server_consolidation-PoC        |
| Created At    | 2025-02-11T16:19:41.454788+00:00     |
| Updated At    | None                                 |
| Deleted At    | None                                 |
| State         | PENDING                              |
| Audit Type    | ONESHOT                              |
| Parameters    | {'host_choice': 'auto'}              |
| Interval      | None                                 |
| Goal          | server_consolidation                 |
| Strategy      | node_resource_consolidation          |
| Audit Scope   | []                                   |
| Auto Trigger  | False                                |
| Next Run Time | None                                 |
| Hostname      | None                                 |
| Start Time    | None                                 |
| End Time      | None                                 |
| Force         | False                                |
+---------------+--------------------------------------+
Verify the Audit was created with the command below:
$ openstack optimize audit list -c 'UUID' -c 'Name' -c 'Audit Type' -c 'State'
+--------------------------------------+-------------------------------+------------+-----------+
| UUID                                 | Name                          | Audit Type | State     |
+--------------------------------------+-------------------------------+------------+-----------+
| 19ae8e21-3185-4366-96e3-ed04184234e8 | node_server_consolidation-PoC | ONESHOT    | SUCCEEDED |
+--------------------------------------+-------------------------------+------------+-----------+
Note that you are looking for the State to show SUCCEEDED.
- Now check the Action Plan:
$ openstack optimize actionplan list -c 'UUID' -c 'State' -c 'Global efficacy'
+--------------------------------------+-------------+-------------------------------+
| UUID                                 | State       | Global efficacy               |
+--------------------------------------+-------------+-------------------------------+
| dfdcb491-89c5-4c07-a5ed-65d2085c488c | RECOMMENDED | Released_nodes_ratio: 50.00 % |
|                                      |             |                               |
+--------------------------------------+-------------+-------------------------------+
Note that the State is RECOMMENDED and the Global efficacy shows Released_nodes_ratio: 50.00 %. This indicates that implementing this Action Plan will empty 50% of the Compute nodes.
- List the actions inside this Action Plan:
# dfdcb491-89c5-4c07-a5ed-65d2085c488c is the UUID of the Action Plan
$ openstack optimize action list --action-plan dfdcb491-89c5-4c07-a5ed-65d2085c488c \
  -c 'UUID' -c 'State' -c 'Action'
+--------------------------------------+-----------+---------------------------+
| UUID                                 | State     | Action                    |
+--------------------------------------+-----------+---------------------------+
| 01774d02-00a9-4f34-a6f7-6b2264b8970c | PENDING   | change_nova_service_state |
| 8573aa4e-6fac-4002-8de6-569e72d4dca3 | PENDING   | migrate                   |
| 6d88ea4a-012b-4cb8-af86-8e699c6c2738 | PENDING   | migrate                   |
| fa2827c6-78f8-48b8-8f8a-734b9f170841 | PENDING   | migrate                   |
| 4009c44d-9af6-4a6e-91dd-96bd4f17abd5 | PENDING   | migrate                   |
| e3dc2dec-74fc-4f16-b76d-c4b99acb1b01 | PENDING   | change_nova_service_state |
+--------------------------------------+-----------+---------------------------+
Listed above you will see that the Action Plan has six actions. You can see the details of each Action with: $ openstack optimize action show <Action UUID>.
$ openstack optimize action show 8573aa4e-6fac-4002-8de6-569e72d4dca3 \
  --max-width=72
+-------------+--------------------------------------------------------+
| Field       | Value                                                  |
+-------------+--------------------------------------------------------+
| UUID        | 8573aa4e-6fac-4002-8de6-569e72d4dca3                   |
| Created At  | 2025-02-11T16:19:44+00:00                              |
| Updated At  | None                                                   |
| Deleted At  | None                                                   |
| Parents     | ['01774d02-00a9-4f34-a6f7-6b2264b8970c']               |
| State       | PENDING                                                |
| Action Plan | dfdcb491-89c5-4c07-a5ed-65d2085c488c                   |
| Action      | migrate                                                |
| Parameters  | {'migration_type': 'live', 'source_node':              |
|             | 'compute1.ctlplane.localdomain', 'resource_name':      |
|             | 'test_7', 'resource_id':                               |
|             | '0cbda264-b496-4649-ab55-9405984092e9'}                |
| Description | Moving a VM instance from source_node to               |
|             | destination_node                                       |
+-------------+--------------------------------------------------------+
In this example, the Action Plan first disables the Compute node which is going to be freed, then migrates the instances running on it, and finally enables the Compute node again to make sure it is available for new workloads if needed.
- Now you are ready to execute the Action Plan using the command: $ openstack optimize actionplan start <Action Plan UUID>.
$ openstack optimize actionplan start dfdcb491-89c5-4c07-a5ed-65d2085c488c \
  --max-width=72
+---------------------+------------------------------------------------+
| Field               | Value                                          |
+---------------------+------------------------------------------------+
| UUID                | dfdcb491-89c5-4c07-a5ed-65d2085c488c           |
| Created At          | 2025-02-11T16:19:44+00:00                      |
| Updated At          | 2025-02-11T16:38:36+00:00                      |
| Deleted At          | None                                           |
| Audit               | 19ae8e21-3185-4366-96e3-ed04184234e8           |
| Strategy            | node_resource_consolidation                    |
| State               | PENDING                                        |
| Efficacy indicators | [{'name': 'compute_nodes_count',               |
|                     | 'description': 'The total number of enabled   |
|                     | compute nodes.', 'unit': None, 'value': 2.0},  |
|                     | {'name': 'released_compute_nodes_count',       |
|                     | 'description': 'The number of compute nodes   |
|                     | to be released.', 'unit': None, 'value': 1.0}, |
|                     | {'name': 'instance_migrations_count',          |
|                     | 'description': 'The number of VM migrations   |
|                     | to be performed.', 'unit': None, 'value': 4.0}]|
| Global efficacy     | [{'name': 'released_nodes_ratio',              |
|                     | 'description': 'Ratio of released compute     |
|                     | nodes divided by the total number of enabled   |
|                     | compute nodes.', 'unit': '%', 'value': 50.0}]  |
| Hostname            | None                                           |
+---------------------+------------------------------------------------+
- Finally, you can monitor the Action Plan progress and check the results. You can track the status of each action in the plan with: $ openstack optimize action list --action-plan <Action Plan UUID>. After some time, all the actions should report the SUCCEEDED state, as shown in the example below:
$ openstack optimize action list --action-plan dfdcb491-89c5-4c07-a5ed-65d2085c488c \
  -c 'UUID' -c 'State' -c 'Action'
+--------------------------------------+-----------+---------------------------+
| UUID                                 | State     | Action                    |
+--------------------------------------+-----------+---------------------------+
| 01774d02-00a9-4f34-a6f7-6b2264b8970c | SUCCEEDED | change_nova_service_state |
| 8573aa4e-6fac-4002-8de6-569e72d4dca3 | SUCCEEDED | migrate                   |
| 6d88ea4a-012b-4cb8-af86-8e699c6c2738 | SUCCEEDED | migrate                   |
| fa2827c6-78f8-48b8-8f8a-734b9f170841 | SUCCEEDED | migrate                   |
| 4009c44d-9af6-4a6e-91dd-96bd4f17abd5 | SUCCEEDED | migrate                   |
| e3dc2dec-74fc-4f16-b76d-c4b99acb1b01 | SUCCEEDED | change_nova_service_state |
+--------------------------------------+-----------+---------------------------+
You can check that the instances have actually been consolidated onto one of your hosts by listing the instances (VMs) on each of your hypervisors.
# List the hypervisors:
$ openstack hypervisor list -c 'Hypervisor Hostname' -c 'State'
+-------------------------------+-------+
| Hypervisor Hostname           | State |
+-------------------------------+-------+
| compute2.ctlplane.localdomain | up    |
| compute1.ctlplane.localdomain | up    |
+-------------------------------+-------+
# Note that the output below lists all instances on one host:
$ openstack server list --long -c Name -c 'Host' --project demo
+--------+-------------------------------+
| Name   | Host                          |
+--------+-------------------------------+
| test_7 | compute2.ctlplane.localdomain |
| test_6 | compute2.ctlplane.localdomain |
| test_5 | compute2.ctlplane.localdomain |
| test_4 | compute2.ctlplane.localdomain |
| test_3 | compute2.ctlplane.localdomain |
| test_2 | compute2.ctlplane.localdomain |
| test_1 | compute2.ctlplane.localdomain |
| test_0 | compute2.ctlplane.localdomain |
+--------+-------------------------------+
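You can also confirm the final state of the plan itself and that the temporarily disabled nova-compute service was re-enabled. A short check, reusing the Action Plan UUID from the example above:
# The Action Plan should report SUCCEEDED and both compute services should be enabled again.
$ openstack optimize actionplan show dfdcb491-89c5-4c07-a5ed-65d2085c488c -c State
$ openstack compute service list --service nova-compute -c Host -c Status -c State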
Watcher Workflow (Horizon Dashboard UI)
Optimization Scenario II: Workload balancing/stabilization
This workflow is demonstrated through actions in the Horizon UI.
Example screenshots are added where necessary to further explain the user flow.
- Log in to the Horizon dashboard with credentials that enable the Admin role in one or more projects.
- Make sure you have the Administration menu enabled by selecting the project where you have the Admin role assigned in the Projects tab.
- In the Admin menu, a new Optimization panel should be available.
- In the Audit Templates panel, click on the Create Template button. This will open a Create Audit Template window. Add a new Audit Template called AuditTemplateWorkloadStabilization with the goal Workload Balancing and the Strategy Workload stabilization. Further information on the Workload stabilization strategy is available in the Workload stabilization strategy reference.
- In the Audit panel, click on the Create Audit button. This will bring up the Create Audit window. Select the AuditTemplateWorkloadStabilization Audit Template and the CONTINUOUS Audit Type. In the Interval field, set the value to 180. Click on the Create Audit button and a new Audit will be shown.
- Click on the UUID of the listed Audit and you will find the Action Plans created for the new Audit. Given the low usage of resources in the instances created for the example workflows, the initial Action Plan will not have real actions.
- Increase CPU consumption in one of the created instances (VMs). You can find the CPU usage of each instance at the example URL: Prometheus Metrics
View the list of instances in Horizon by selecting the Instances panel from the menus: Admin → Compute → Instances. Click on the name of one of the instances, and go to the Console tab for that instance. Log in with the cirros user and the gocubsgo password, and run the following command:
$ dd if=/dev/random of=/dev/null
After a few minutes, the CPU usage of the edited instance should increase to close to 100%. This increase will be seen in the instance metrics shown at the Prometheus metrics URL.
- Go back to the Audit panel through the menu options: Admin → Optimization → Audit. Click on the UUID of the continuous Audit. The next execution of the Audit should generate a non-empty Action Plan with a RECOMMENDED status. Depending on specific resources, it may take one or two executions of the Audit to create this non-empty plan.
- Click on the RECOMMENDED Action Plan; there should be a Migrate Action listed. Click on the Action to see the related details. The resource_name field should match the name of the instance where you logged in and ran the dd command.
- Go back to the list of Action Plans, and click the Start Action Plan button for the RECOMMENDED plan. Click on the Action Plan UUID to track the status until it goes to SUCCEEDED.
- Check the distribution of the test instances over the hosts using the openstack server list --long command. You should see that the instance where the load was increased has moved.
$ openstack server list --long -c Name -c 'Host' --project demo
+--------+-------------------------------+
| Name   | Host                          |
+--------+-------------------------------+
| test_7 | compute1.ctlplane.localdomain |
| test_6 | compute2.ctlplane.localdomain |
| test_5 | compute2.ctlplane.localdomain |
| test_4 | compute2.ctlplane.localdomain |
| test_3 | compute2.ctlplane.localdomain |
| test_2 | compute2.ctlplane.localdomain |
| test_1 | compute2.ctlplane.localdomain |
| test_0 | compute2.ctlplane.localdomain |
+--------+-------------------------------+
- Stop the CONTINUOUS Audit from the Audits panel and select Cancel Action. If this option is not available from Horizon in your environment, you can execute it using the CLI:
$ openstack optimize audit update <audit uuid> replace state=CANCELLED
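To confirm the cancellation took effect, you can check the audit state from the CLI as well; a hedged example using the same client:
# The State field should show CANCELLED once the update is processed.
$ openstack optimize audit show <audit uuid> -c State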