Nova Documentation Overview 23.1.1
Release 23.1.1.dev14
OpenStack Foundation
1 What is nova?
3 For Operators
  3.1 Architecture Overview
    3.1.1 Nova System Architecture
      3.1.1.1 Components
  3.2 Installation
    3.2.1 Compute service
      3.2.1.1 Overview
      3.2.1.2 Compute service overview
      3.2.1.3 Install and configure controller node
      3.2.1.4 Install and configure a compute node
      3.2.1.5 Verify operation
  3.3 Deployment Considerations
    3.3.1 Feature Classification
      3.3.1.1 Aims
      3.3.1.2 General Purpose Cloud Features
      3.3.1.3 NFV Cloud Features
      3.3.1.4 HPC Cloud Features
      3.3.1.5 Notes on Concepts
    3.3.2 Feature Support Matrix
    3.3.3 Cells Layout (v2)
      3.3.3.1 Concepts
      3.3.3.2 Service Layout
    3.3.4 Using WSGI with Nova
  3.4 Maintenance
    3.4.1 Compute
      3.4.1.1 Overview
      3.4.1.2 Advanced configuration
      3.4.1.3 Additional guides
    3.4.2 Flavors
      3.4.2.1 Overview
    3.4.3 Upgrades
      3.4.3.1 Minimal Downtime Upgrade Process
      3.4.3.2 Current Database Upgrade Types
      3.4.3.3 Concepts
      3.4.3.4 Testing
    3.4.4 Quotas
      3.4.4.1 Types of quota
      3.4.4.2 Usage
    3.4.5 Filter Scheduler
      3.4.5.1 Filtering
      3.4.5.2 Configuring Filters
      3.4.5.3 Writing Your Own Filter
      3.4.5.4 Weights
  3.5 Reference Material
    3.5.1 Command-line Utilities
      3.5.1.1 Nova Management Commands
      3.5.1.2 Service Daemons
      3.5.1.3 WSGI Services
      3.5.1.4 Additional Tools
    3.5.2 Configuration Guide
      3.5.2.1 Configuration
      3.5.2.2 Policy
      3.5.2.3 Extra Specs
      4.1.4.2 Code Review Guide for Nova
      4.1.4.3 Internationalization
      4.1.4.4 Documentation Guidelines
    4.1.5 Testing
      4.1.5.1 Test Strategy
      4.1.5.2 Testing NUMA related hardware setup with libvirt
      4.1.5.3 Testing Serial Console
      4.1.5.4 Testing Zero Downtime Upgrade Process
      4.1.5.5 Testing Down Cells
      4.1.5.6 Profiling With Eventlet
    4.1.6 The Nova API
      4.1.6.1 Extending the API
      4.1.6.2 Adding a Method to the OpenStack API
      4.1.6.3 API Microversions
      4.1.6.4 API reference guideline
    4.1.7 Nova Major Subsystems
      4.1.7.1 Evacuate vs Rebuild
      4.1.7.2 Resize and cold migrate
  4.2 Technical Reference Deep Dives
    4.2.1 Internals
      4.2.1.1 AMQP and Nova
      4.2.1.2 Scheduling
      4.2.1.3 Scheduler hints versus flavor extra specs
      4.2.1.4 Live Migration
      4.2.1.5 Services, Managers and Drivers
      4.2.1.6 Virtual Machine States and Transitions
      4.2.1.7 Threading model
      4.2.1.8 Notifications in Nova
      4.2.1.9 ComputeDriver.update_provider_tree
      4.2.1.10 Upgrade checks
      4.2.1.11 Conductor as a place for orchestrating tasks
      4.2.1.12 Filtering hosts by isolating aggregates
      4.2.1.13 Attaching Volumes
      4.2.1.14 Driver BDM Data Structures
      4.2.1.15 Libvirt virt driver OS distribution support matrix
    4.2.2 Debugging
      4.2.2.1 Guru Meditation Reports
    4.2.3 Forward Looking Plans
      4.2.3.1 Cells
      4.2.3.2 REST API Policy Enforcement
      4.2.3.3 Nova Stable REST API
      4.2.3.4 Scheduler Evolution
    4.2.4 Additional Information
      4.2.4.1 Glossary
Index
CHAPTER ONE: WHAT IS NOVA?
Nova is the OpenStack project that provides a way to provision compute instances (aka virtual servers).
Nova supports creating virtual machines, baremetal servers (through the use of ironic), and has limited
support for system containers. Nova runs as a set of daemons on top of existing Linux servers to provide
that service.
It requires the following additional OpenStack services for basic function:
• Keystone: This provides identity and authentication for all OpenStack services.
• Glance: This provides the compute image repository. All compute instances launch from glance
images.
• Neutron: This is responsible for provisioning the virtual or physical networks that compute
instances connect to on boot.
• Placement: This is responsible for tracking inventory of resources available in a cloud and
assisting in choosing which provider of those resources will be used when creating a virtual machine.
It can also integrate with other services to include: persistent block storage, encrypted disks, and
baremetal compute instances.
CHAPTER TWO
As an end user of nova, you'll use nova to create and manage servers with either tools or the API directly.
Availability Zones are an end-user visible logical abstraction for partitioning a cloud without knowing
the physical infrastructure. Availability zones can be used to partition a cloud on arbitrary factors, such
as location (country, datacenter, rack), network layout and/or power source. Because of the flexibility,
the names and purposes of availability zones can vary massively between clouds.
In addition, other services, such as the networking service and the block storage service, also provide
an availability zone feature. However, the implementation of these features differs vastly between these
different services. Consult the documentation for these other services for more information on their
implementation of this feature.
Usage
Availability zones can only be created and configured by an admin but they can be used by an end-user
when creating an instance. For example:
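A representative invocation, where ZONE, IMAGE, FLAVOR and SERVER are placeholders:
$ openstack server create --availability-zone ZONE --image IMAGE --flavor FLAVOR SERVER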
It is also possible to specify a destination host and/or node using this command; however, this is an
admin-only operation by default. For more information, see Using availability zones to select hosts.
Note: Instances that use the default security group cannot, by default, be accessed from any IP address
outside of the cloud. If you want those IP addresses to access the instances, you must modify the rules
for the default security group.
After you gather the parameters that you need to launch an instance, you can launch it from an image
or a volume. You can launch an instance directly from one of the available OpenStack images or from
an image that you have copied to a persistent volume. The OpenStack Image service provides a pool of
images that are accessible to members of different projects.
Note the ID of the flavor that you want to use for your instance:
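The listing below is typically produced with:
$ openstack flavor list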
+-----+-----------+-------+------+-----------+-------+-----------+
| ID | Name | RAM | Disk | Ephemeral | VCPUs | Is_Public |
+-----+-----------+-------+------+-----------+-------+-----------+
| 1 | m1.tiny | 512 | 1 | 0 | 1 | True |
| 2 | m1.small | 2048 | 20 | 0 | 1 | True |
| 3 | m1.medium | 4096 | 40 | 0 | 2 | True |
| 4 | m1.large | 8192 | 80 | 0 | 4 | True |
| 5 | m1.xlarge | 16384 | 160 | 0 | 8 | True |
+-----+-----------+-------+------+-----------+-------+-----------+
Note the ID of the image from which you want to boot your instance:
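The listing below is typically produced with:
$ openstack image list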
+--------------------------------------+---------------------------------+--------+
| ID                                   | Name                            | Status |
+--------------------------------------+---------------------------------+--------+
| 397e713c-b95b-4186-ad46-6126863ea0a9 | cirros-0.3.5-x86_64-uec         | active |
| df430cc2-3406-4061-b635-a51c16e488ac | cirros-0.3.5-x86_64-uec-kernel  | active |
| 3cf852bd-2332-48f4-9ae4-7d926d50945e | cirros-0.3.5-x86_64-uec-ramdisk | active |
+--------------------------------------+---------------------------------+--------+
You can also filter the image list by using grep to find a specific image, as follows:
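For instance, a filter such as the following (the pattern is only illustrative) returns the kernel image shown below:
$ openstack image list | grep 'kernel'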
| df430cc2-3406-4061-b635-a51c16e488ac | cirros-0.3.5-x86_64-uec-kernel  | active |
Note: If you are an admin user, this command will list groups for all tenants.
Note the ID of the security group that you want to use for your instance:
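The listing below is typically produced with:
$ openstack security group list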
+--------------------------------------+---------+------------------------+----------------------------------+
| ID                                   | Name    | Description            | Project                          |
+--------------------------------------+---------+------------------------+----------------------------------+
| b0d78827-0981-45ef-8561-93aee39bbd9f | default | Default security group | 5669caad86a04256994cdf755df4d3c1 |
+--------------------------------------+---------+------------------------+----------------------------------+
If you have not created any security groups, you can assign the instance to only the default security
group.
You can view rules for a specified security group:
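For example, using the default group shown above as an illustrative name:
$ openstack security group rule list default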
5. List the available key pairs, and note the key pair name that you use for SSH access.
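Key pairs are typically listed with:
$ openstack keypair list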
Launch an instance
Optionally, you can provide a key name for access control and a security group for security. You
can also include metadata key and value pairs. For example, you can add a description for your
server by providing the --property description="My Server" parameter.
You can pass user data in a local file at instance launch by using the --user-data
USER-DATA-FILE parameter.
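A representative launch command; the image, flavor, key pair, security group and server names below are placeholders:
$ openstack server create --image cirros-0.3.5-x86_64-uec --flavor m1.tiny \
  --key-name KEY_NAME --security-group default \
  --property description="My Server" myFirstServer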
Depending on the parameters that you provide, the command returns a list of server properties.
+--------------------------------------+--------------------------------------+
| Field                                | Value                                |
+--------------------------------------+--------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                               |
| OS-EXT-AZ:availability_zone          |                                      |
| OS-EXT-SRV-ATTR:host                 | None                                 |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | None                                 |
| OS-EXT-SRV-ATTR:instance_name        |                                      |
| OS-EXT-STS:power_state               | NOSTATE                              |
| OS-EXT-STS:task_state                | scheduling                           |
| OS-EXT-STS:vm_state                  | building                             |
| OS-SRV-USG:launched_at               | None                                 |
| OS-SRV-USG:terminated_at             | None                                 |
| accessIPv4                           |                                      |
| accessIPv6                           |                                      |
| addresses                            |                                      |
| adminPass                            | E4Ksozt4Efi8                         |
| config_drive                         |                                      |
| created                              | 2016-11-30T14:48:05Z                 |
| flavor                               | m1.tiny                              |
| hostId                               |                                      |
| id                                   | 89015cc9-bdf1-458a-8518-fdca2b4a5785 |
| ...                                  | ...                                  |
+--------------------------------------+--------------------------------------+
A status of BUILD indicates that the instance has started, but is not yet online.
A status of ACTIVE indicates that the instance is active.
2. Copy the server ID value from the id field in the output. Use the ID to get server details or to
delete your server.
3. Copy the administrative password value from the adminPass field. Use the password to log in
to your server.
4. Check if the instance is online.
The list shows the ID, name, status, and private (and if assigned, public) IP addresses for all
instances in the project to which you belong:
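The listing is typically obtained with:
$ openstack server list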
+-------------+----------------------+--------+------------+-------------+------------------+------------+
| ID          | Name                 | Status | Task State | Power State | Networks         | Image Name |
+-------------+----------------------+--------+------------+-------------+------------------+------------+
Note: If you did not provide a key pair, security groups, or rules, you can access the instance
only from inside the cloud through VNC. Even pinging the instance is not possible.
Note: The maximum limit on the number of disk devices allowed to attach to a single server is config-
urable with the option compute.max_disk_devices_to_attach.
Create a non-bootable volume and attach that volume to an instance that you boot from an image.
To create a non-bootable volume, do not create it from an image. The volume must be entirely empty
with no partition table and no file system.
1. Create a non-bootable volume.
2. List volumes.
3. Boot an instance from an image and attach the empty volume to the instance.
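A sketch of these steps; the volume name, size, image, flavor and device name are assumptions, and the output excerpt that follows corresponds to the boot step:
$ openstack volume create --size 8 my-empty-volume
$ openstack volume list
$ openstack server create --image cirros-0.3.5-x86_64-uec --flavor m1.tiny \
  --block-device-mapping vdb=my-empty-volume myInstanceWithVolume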
+--------------------------------------+--------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                               |
| OS-EXT-AZ:availability_zone          | nova                                 |
| ...                                  | ...                                  |
+--------------------------------------+--------------------------------------+
You can create a volume from an existing image, volume, or snapshot. This procedure shows you how
to create a volume from an image, and use the volume to boot an instance.
1. List the available images.
$ openstack image list
+-----------------+---------------------------------+--------+
| ID | Name | Status |
+-----------------+---------------------------------+--------+
| 484e05af-a14... | Fedora-x86_64-20-20131211.1-sda | active |
| 98901246-af9... | cirros-0.3.5-x86_64-uec | active |
| b6e95589-7eb... | cirros-0.3.5-x86_64-uec-kernel | active |
| c90893ea-e73... | cirros-0.3.5-x86_64-uec-ramdisk | active |
+-----------------+---------------------------------+--------+
Note the ID of the image that you want to use to create a volume.
If you want to create a volume on a specific storage backend, you need to use an image that has the
cinder_img_volume_type property set. In that case, the new volume is created with the volume type
named by that property (for example, storage_backend1).
$ openstack image show 98901246-af9d-4b61-bea8-09cc6dc41829
+------------------+------------------------------------------------------+
| Field            | Value                                                |
+------------------+------------------------------------------------------+
| checksum         | ee1eca47dc88f4879d8a229cc70a07c6                     |
| container_format | bare                                                 |
| created_at       | 2016-10-08T14:59:05Z                                 |
| disk_format      | qcow2                                                |
| file             | /v2/images/9fef3b2d-c35d-4b61-bea8-09cc6dc41829/file |
| id               | 98901246-af9d-4b61-bea8-09cc6dc41829                 |
| min_disk         | 0                                                    |
| min_ram          | 0                                                    |
| name             | cirros-0.3.5-x86_64-uec                              |
| owner            | 8d8ef3cdf2b54c25831cbb409ad9ae86                     |
| protected        | False                                                |
| schema           | /v2/schemas/image                                    |
| size             | 13287936                                             |
| status           | active                                               |
| ...              | ...                                                  |
+------------------+------------------------------------------------------+
Note the ID of the flavor that you want to use when you create the instance.
3. To create a bootable volume from an image and launch an instance from this volume, use the
--block-device parameter with the nova boot command.
For example:
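A sketch of such a command; the image ID, flavor, volume size and server name are assumptions:
$ nova boot --flavor 2 \
    --block-device source=image,id=IMAGE_ID,dest=volume,size=10,shutdown=preserve,bootindex=0 \
    myInstanceFromVolume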
See the nova boot command documentation and Block Device Mapping in Nova for more details
on these parameters.
Note: As of the Stein release, the openstack server create command does not support
creating a volume-backed server from a source image like the nova boot command. The next
steps will show how to create a bootable volume from an image and then create a server from that
boot volume using the openstack server create command.
4. Create a bootable volume from an image. Cinder makes a volume bootable when --image
parameter is passed.
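For example (the image name, size and volume name are assumptions):
$ openstack volume create --image cirros-0.3.5-x86_64-uec --size 1 myBootableVolume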
Note: A bootable encrypted volume can also be created by adding the --type ENCRYPTED_VOLUME_TYPE
parameter to the volume create command. This requires an encrypted volume type, which must be
created ahead of time by an admin. Refer to Create an encrypted volume type in the OpenStack
Horizon Administration Guide.
5. Create a VM from the previously created bootable volume. The volume is not deleted when the
instance is terminated. For example:
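A sketch, assuming the volume name used above and an illustrative flavor and server name:
$ openstack server create --volume myBootableVolume --flavor m1.tiny myInstanceFromVolume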
Note: The example here uses the --volume option for simplicity. The --block-device-mapping option
could also be used for more granular control over the block device attachment.
6. List volumes to see the bootable volume and its attached myInstanceFromVolume instance.
Use the nova boot --swap parameter to attach a swap disk on boot or the nova boot
--ephemeral parameter to attach an ephemeral disk on boot. When you terminate the instance,
both disks are deleted.
Boot an instance with a 512 MB swap disk and 2 GB ephemeral disk.
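A sketch using the legacy nova client; the flavor, image and server name are placeholders:
$ nova boot --flavor FLAVOR --image IMAGE_ID --swap 512 --ephemeral size=2 NAME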
Note: The flavor defines the maximum swap and ephemeral disk size. You cannot exceed these maxi-
mum values.
OpenStack supports booting instances using ISO images. But before you make such instances func-
tional, use the openstack server create command with the following parameters to boot an
instance:
$ openstack server create --image ubuntu-14.04.2-server-amd64.iso \
  --nic net-id=NETWORK_UUID \
  --flavor 2 INSTANCE_NAME
+--------------------------------------+--------------------------------------+
| Field                                | Value                                |
+--------------------------------------+--------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                               |
| OS-EXT-AZ:availability_zone          | nova                                 |
| OS-EXT-SRV-ATTR:host                 | -                                    |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | -                                    |
| OS-EXT-SRV-ATTR:instance_name        | instance-00000004                    |
| OS-EXT-STS:power_state               | 0                                    |
| OS-EXT-STS:task_state                | scheduling                           |
| OS-EXT-STS:vm_state                  | building                             |
| OS-SRV-USG:launched_at               | -                                    |
| OS-SRV-USG:terminated_at             | -                                    |
| accessIPv4                           |                                      |
| accessIPv6                           |                                      |
| adminPass                            | ZaiYeC8iucgU                         |
| config_drive                         |                                      |
| created                              | 2015-06-01T16:34:50Z                 |
| flavor                               | m1.small (2)                         |
| hostId                               |                                      |
| id                                   | 1e1797f3-1662-49ff-ae8c-a77e82ee1571 |
| image                                | ubuntu-14.04.2-server-amd64.iso      |
| key_name                             | -                                    |
| ...                                  | ...                                  |
+--------------------------------------+--------------------------------------+
Note: You need the Block Storage service to preserve the instance after shutdown. The
--block-device argument, used with the legacy nova boot, will not work with the Open-
Stack openstack server create command. Instead, the openstack volume create and
openstack server add volume commands create persistent storage.
After the instance is successfully launched, connect to the instance using a remote console and follow
the instructions to install the system just as you would when using ISO images on regular computers.
When the installation is finished and the system is rebooted, the instance asks you again to install the
operating system, which means your instance is not yet usable. If you have problems with image creation,
please check the Virtual Machine Image Guide for reference.
Now complete the following steps to make the instances created from the ISO image actually functional.
1. Delete the instance using the following command.
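For example, assuming the instance is named INSTANCE_NAME:
$ openstack server delete INSTANCE_NAME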
2. After you delete the instance, the system you have just installed using your ISO image remains,
because the parameter shutdown=preserve was set, so run the following command.
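The preserved system disk shows up in the volume listing, typically obtained with:
$ openstack volume list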
+--------------------------+--------------------------+-----------+------+-------------+
| ID                       | Name                     | Status    | Size | Attached to |
+--------------------------+--------------------------+-----------+------+-------------+
| 8edd7c97-1276-47a5-9563- | dc01d873-d0f1-40b6-bfcc- | available | 10   |             |
| 1025f4264e4f             | 26a8d955a1d9-blank-vol   |           |      |             |
+--------------------------+--------------------------+-----------+------+-------------+
You get a list of all the volumes in your system. In this list, you can find the volume that is
attached to your ISO-created instance, with its bootable property set to false.
3. Upload the volume to glance.
The SOURCE_VOLUME is the UUID or name of the volume that is attached to your ISO-created
instance, and the IMAGE_NAME is the name that you give to your new image.
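A sketch of the upload command (both values are placeholders):
$ openstack image create --volume SOURCE_VOLUME IMAGE_NAME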
4. After the image is successfully uploaded, you can use the new image to boot instances.
The instances launched using this image contain the system that you have just installed using the
ISO image.
2.1.1.3 Metadata
Nova presents configuration information to instances it starts via a mechanism called metadata. This
mechanism is widely used via helpers such as cloud-init to specify things like the root password the
instance should use.
This metadata is made available via either a config drive or the metadata service and can be somewhat
customised by the user using the user data feature. This guide provides an overview of these features
along with a summary of the types of metadata available.
Types of metadata
There are three separate groups of users who need to be able to specify metadata for an instance.
The user who booted the instance can pass metadata to the instance in several ways. For authentication
keypairs, the keypairs functionality of the nova API can be used to upload a key and then specify that key
during the nova boot API request. For less structured data, a small opaque blob of data may be passed
via the user data feature of the nova API. Examples of such unstructured data would be the puppet role
that the instance should use, or the HTTP address of a server from which to fetch post-boot configuration
information.
Nova itself needs to pass information to the instance via its internal implementation of the metadata
system. Such information includes the requested hostname for the instance and the availability zone the
instance is in. This happens by default and requires no configuration by the user or deployer.
Nova provides both an OpenStack metadata API and an EC2-compatible API. Both the OpenStack
metadata and EC2-compatible APIs are versioned by date. These are described later.
A deployer of OpenStack may need to pass data to an instance. It is also possible that this data is
not known to the user starting the instance. An example might be a cryptographic token to be used to
register the instance with Active Directory post boot: the user starting the instance should not have access
to Active Directory to create this token, but the nova deployment might have permissions to generate the
token on the user's behalf. This is possible using the vendordata feature, which must be configured by
your cloud operator.
Note: This section provides end user information about the metadata service. For deployment informa-
tion about the metadata service, refer to the admin guide.
The metadata service provides a way for instances to retrieve instance-specific data via a REST API. In-
stances access this service at 169.254.169.254 or at fe80::a9fe:a9fe. All types of metadata,
be it user-, nova- or vendor-provided, can be accessed via this service.
Changed in version 22.0.0: Starting with the Victoria release the metadata service is accessible over
IPv6 at the link-local address fe80::a9fe:a9fe.
Note: As with all IPv6 link-local addresses, the metadata IPv6 address is not complete without a zone
identifier (in a Linux guest that is usually the interface name concatenated after a percent sign). Please
also note that in URLs you should URL-encode the percent sign itself. For example, assuming that the
primary network interface in the guest is ens2, substitute http://[fe80::a9fe:a9fe%25ens2]:80/... for
http://169.254.169.254/....
To retrieve a list of supported versions for the OpenStack metadata API, make a GET request to
http://169.254.169.254/openstack, which will return a list of directories:
$ curl http://169.254.169.254/openstack
2012-08-10
2013-04-04
2013-10-17
2015-10-15
2016-06-30
2016-10-06
2017-02-22
2018-08-27
latest
Refer to OpenStack format metadata for information on the contents and structure of these directories.
To list supported versions for the EC2-compatible metadata API, make a GET request to
http://169.254.169.254, which will, once again, return a list of directories:
$ curl http://169.254.169.254
1.0
2007-01-19
2007-03-01
2007-08-29
2007-10-10
2007-12-15
2008-02-01
2008-09-01
2009-04-04
latest
Refer to EC2-compatible metadata for information on the contents and structure of these directories.
Config drives
Note: This section provides end user information about config drives. For deployment information
about the config drive feature, refer to the admin guide.
Config drives are special drives that are attached to an instance when it boots. The instance can mount
this drive and read files from it to get information that is normally available through the metadata service.
One use case for using the config drive is to pass a networking configuration when you do not use DHCP
to assign IP addresses to instances. For example, you might pass the IP address configuration for the
instance through the config drive, which the instance can mount and access before you configure the
network settings for the instance.
To enable the config drive for an instance, pass the --config-drive true parameter to the
openstack server create command.
The following example enables the config drive and passes a user data file and two key/value metadata
pairs, all of which are accessible from the config drive:
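A representative command; the image, flavor, key pair, user data file, metadata keys and server name are placeholders:
$ openstack server create --config-drive true --image my-image --flavor m1.tiny \
  --key-name mykey --user-data ./my-user-data.txt \
  --property role=webservers --property essential=false MY_INSTANCE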
Note: The Compute service can be configured to always create a config drive. For more information,
refer to the admin guide.
If your guest operating system supports accessing disk by label, you can mount the config drive as the
/dev/disk/by-label/configurationDriveVolumeLabel device. In the following example, the config drive
has the config-2 volume label:
# mkdir -p /mnt/config
# mount /dev/disk/by-label/config-2 /mnt/config
If your guest operating system does not use udev, the /dev/disk/by-label directory is not present.
You can use the blkid command to identify the block device that corresponds to the config drive.
For example:
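Assuming the config-2 label used above, something like the following identifies the device before it is mounted:
# blkid -t LABEL="config-2" -odevice
/dev/vdb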
# mkdir -p /mnt/config
# mount /dev/vdb /mnt/config
Once mounted, you can examine the contents of the config drive:
$ cd /mnt/config
$ find . -maxdepth 2
.
./ec2
./ec2/2009-04-04
./ec2/latest
./openstack
./openstack/2012-08-10
./openstack/2013-04-04
./openstack/2013-10-17
./openstack/2015-10-15
./openstack/2016-06-30
./openstack/2016-10-06
./openstack/2017-02-22
./openstack/latest
The files that appear on the config drive depend on the arguments that you pass to the openstack
server create command. The format of this directory is the same as that provided by the metadata
service, with the exception that the EC2-compatible metadata is now located in the ec2 directory instead
of the root (/) directory. Refer to the OpenStack format metadata and EC2-compatible metadata sections
for information about the format of the files and subdirectories within these directories.
Nova metadata
As noted previously, nova provides its metadata in two formats: OpenStack format and EC2-compatible
format.
Changed in version 12.0.0: Support for network metadata was added in the Liberty release.
Metadata from the OpenStack API is distributed in JSON format. There are two files provided for each
version: meta_data.json and network_data.json. The meta_data.json file contains
nova-specific information, while the network_data.json file contains information retrieved from
neutron. For example:
$ curl http://169.254.169.254/openstack/2018-08-27/meta_data.json
{
    "random_seed": "yu5ZnkqF2CqnDZVAfZgarG...",
    "availability_zone": "nova",
    "keys": [
        {
            "data": "ssh-rsa AAAAB3NzaC1y...== Generated by Nova\n",
            "type": "ssh",
            "name": "mykey"
        }
    ],
    "hostname": "test.novalocal",
    "launch_index": 0,
    "meta": {
        "priority": "low",
        "role": "webserver"
    },
    "devices": [
        {
            "type": "nic",
            "bus": "pci",
            "address": "0000:00:02.0",
            "mac": "00:11:22:33:44:55",
            "tags": ["trusted"]
        },
        {
            "type": "disk",
            "bus": "ide",
            "address": "0:0",
            "serial": "disk-vol-2352423",
            "path": "/dev/sda",
            "tags": ["baz"]
        }
    ],
    ...
$ curl http://169.254.169.254/openstack/2018-08-27/network_data.json
{
    "links": [
        {
            "ethernet_mac_address": "fa:16:3e:9c:bf:3d",
            "id": "tapcd9f6d46-4a",
            "mtu": null,
            "type": "bridge",
            "vif_id": "cd9f6d46-4a3a-43ab-a466-994af9db96fc"
        }
    ],
    "networks": [
        {
            "id": "network0",
            "link": "tapcd9f6d46-4a",
            "network_id": "99e88329-f20d-4741-9593-25bf07847b16",
            "type": "ipv4_dhcp"
        }
    ],
    "services": [
        {
            "address": "8.8.8.8",
            "type": "dns"
        }
    ]
}
EC2-compatible metadata
The EC2-compatible API is compatible with version 2009-04-04 of the Amazon EC2 metadata service.
This means that virtual machine images designed for EC2 will work properly with OpenStack.
The EC2 API exposes a separate URL for each metadata element. Retrieve a listing of these elements
by making a GET query to http://169.254.169.254/2009-04-04/meta-data/. For ex-
ample:
$ curl http://169.254.169.254/2009-04-04/meta-data/
ami-id
ami-launch-index
ami-manifest-path
block-device-mapping/
hostname
instance-action
...
$ curl http://169.254.169.254/2009-04-04/meta-data/block-device-mapping/
ami
$ curl http://169.254.169.254/2009-04-04/meta-data/placement/
availability-zone
$ curl http://169.254.169.254/2009-04-04/meta-data/public-keys/
0=mykey
Instances can retrieve the public SSH key (identified by keypair name when a user requests a new
instance) by making a GET request to http://169.254.169.254/2009-04-04/meta-data/public-keys/0/openssh-key:
$ curl http://169.254.169.254/2009-04-04/meta-data/public-keys/0/openssh-
,→key
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQDYVEprvtYJXVOBN0XNKVVRNCRX6BlnNbI+US\
LGais1sUWPwtSg7z9K9vhbYAPUZcq8c/s5S9dg5vTHbsiyPCIDOKyeHba4MUJq8Oh5b2i71/3B\
ISpyxTBH/uZDHdslW2a+SrPDCeuMMoss9NFhBdKtDkdG9zyi0ibmCP6yMdEX8Q== Generated\
by Nova
User data
User data is a blob of data that the user can specify when they launch an instance. The instance can
access this data through the metadata service or config drive. It is commonly used to pass a shell script
that the instance runs on boot.
For example, one application that uses user data is the cloud-init system, which is an open-source pack-
age from Ubuntu that is available on various Linux distributions and which handles early initialization
of a cloud instance.
You can place user data in a local file and pass it through the --user-data <user-data-file>
parameter at instance creation.
$ openstack server create --image ubuntu-cloudimage --flavor 1 \
--user-data mydata.file VM_INSTANCE
Note: The provided user data should not be base64-encoded, as it will be automatically encoded in
order to pass valid input to the REST API, which has a limit of 65535 bytes after encoding.
Once booted, you can access this data from the instance using either the metadata service or
the config drive. To access it via the metadata service, make a GET request to either
http://169.254.169.254/openstack/{version}/user_data (OpenStack API) or
http://169.254.169.254/{version}/user-data (EC2-compatible API). For example:
$ curl http://169.254.169.254/openstack/2018-08-27/user_data
#!/bin/bash
echo 'Extra user data here'
Vendordata
Note: This section provides end user information about the vendordata feature. For deployment infor-
mation about this feature, refer to the admin guide.
Changed in version 14.0.0: Support for dynamic vendor data was added in the Newton release.
Where configured, instances can retrieve vendor-specific data from the metadata service or
config drive. To access it via the metadata service, make a GET request to either
http://169.254.169.254/openstack/{version}/vendor_data.json or
http://169.254.169.254/openstack/{version}/vendor_data2.json, depending on the deployment.
For example:
$ curl http://169.254.169.254/openstack/2018-08-27/vendor_data2.json
{
"testing": {
"value1": 1,
"value2": 2,
"value3": "three"
}
}
Note: The presence and contents of this file will vary from deployment to deployment.
General guidelines
• Do not rely on the presence of the EC2 metadata in the metadata API or config drive, because
this content might be removed in a future release. For example, do not rely on files in the ec2
directory.
• When you create images that access metadata service or config drive data and multiple di-
rectories are under the openstack directory, always select the highest API version by date
that your consumer supports. For example, if your guest image supports the 2012-03-05,
2012-08-05, and 2013-04-13 versions, try 2013-04-13 first and fall back to a previous
version if 2013-04-13 is not present.
Each instance has a private, fixed IP address and can also have a public, or floating, IP address. Private
IP addresses are used for communication between instances, and public addresses are used for
communication with networks outside the cloud, including the Internet.
When you launch an instance, it is automatically assigned a private IP address that stays the same until
you explicitly terminate the instance. Rebooting an instance has no effect on the private IP address.
A pool of floating IP addresses, configured by the cloud administrator, is available in OpenStack Com-
pute. The project quota defines the maximum number of floating IP addresses that you can allocate to
the project. After you allocate a floating IP address to a project, you can:
• Associate the floating IP address with an instance of the project.
• Disassociate a floating IP address from an instance in the project.
• Delete a floating IP from the project, which automatically deletes that IP's associations.
Use the openstack commands to manage floating IP addresses.
Note: If this list is empty, the cloud administrator must configure a pool of floating IP addresses.
To list all floating IP addresses that are allocated to the current project, run:
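The listing is typically obtained with:
$ openstack floating ip list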
+--------------------------------------+---------------------+------------------+------+
| ID                                   | Floating IP Address | Fixed IP Address | Port |
+--------------------------------------+---------------------+------------------+------+
| 760963b2-779c-4a49-a50d-f073c1ca5b9e | 172.24.4.228        | None             | None |
| 89532684-13e1-4af3-bd79-f434c9920cc3 | 172.24.4.235        | None             | None |
| ea3ebc6d-a146-47cd-aaa8-35f06e1e8c3d | 172.24.4.229        | None             | None |
+--------------------------------------+---------------------+------------------+------+
For each floating IP address that is allocated to the current project, the command outputs the floating
IP address, the ID for the instance to which the floating IP address is assigned, the associated fixed IP
address, and the pool from which the floating IP address was allocated.
2. List all project instances with which a floating IP address could be associated.
$ openstack server list
Associate the floating IP address with an instance in the project. For example:
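A sketch, where INSTANCE_NAME_OR_ID and FLOATING_IP_ADDRESS are placeholders:
$ openstack server add floating ip INSTANCE_NAME_OR_ID FLOATING_IP_ADDRESS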
After you associate the IP address and configure security group rules for the instance, the instance
is publicly available at the floating IP address.
When you release (delete) a floating IP address, it is returned to the pool of IP addresses that is available
for all projects. If the IP address is still associated with a running instance, it is automatically
disassociated from that instance.
Nova can determine if the certificate used to generate and verify the signature of a signed image (see
Glance Image Signature Verification documentation) is trusted by the user. This feature is called
certificate validation and can be applied to the creation or rebuild of an instance.
Certificate validation is meant to be performed jointly with image signature verification, but each
feature has its own Nova configuration option, to be specified in the [glance] section of the nova.conf
configuration file. To enable certificate validation, set glance.enable_certificate_validation to True.
To enable signature validation, set glance.verify_glance_signatures to True. Conversely, to disable
either of these features, set their option to False or do not include the option in the Nova
configuration at all.
Certificate validation operates in concert with signature validation in Cursive. It takes in a list of trusted
certificate IDs and verifies that the certificate used to sign the image being booted is cryptographically
linked to at least one of the provided trusted certificates. This provides the user with confidence in the
identity and integrity of the image being booted.
Certificate validation will only be performed if image signature validation is enabled. However,
the presence of trusted certificate IDs overrides the enable_certificate_validation and
verify_glance_signatures settings. In other words, if a list of trusted certificate IDs is pro-
vided to the instance create or rebuild commands, signature verification and certificate validation will be
performed, regardless of their settings in the Nova configurations. See Using Signature Verification for
details.
Note: Certificate validation configuration options must be specified in the Nova configuration file
that controls the nova-osapi_compute and nova-compute services, as opposed to other Nova
services (conductor, scheduler, etc.).
Requirements
Key manager that is a backend to the Castellan Interface. Possible key managers are:
• Barbican
• Vault
Limitations
• As of the 18.0.0 Rocky release, only the libvirt compute driver supports trusted image certification
validation. The feature is not, however, driver specific so other drivers should be able to support
this feature over time. See the feature support matrix for information on which drivers support the
feature at any given release.
• As of the 18.0.0 Rocky release, image signature and trusted image certification validation is not
supported with the Libvirt compute driver when using the rbd image backend ([libvirt]/
images_type=rbd) and RAW formatted images. This is due to the images being cloned directly
in the RBD backend avoiding calls to download and verify on the compute.
• As of the 18.0.0 Rocky release, trusted image certification validation is not supported with volume-
backed (boot from volume) instances. The block storage service support may be available in a
future release:
https://blueprints.launchpad.net/cinder/+spec/certificate-validate
• Trusted image certification support can be controlled via policy configuration if it needs
to be disabled. See the os_compute_api:servers:create:trusted_certs and
os_compute_api:servers:rebuild:trusted_certs policy rules.
Configuration
Nova will use the key manager defined by the Castellan key manager interface, which is the Bar-
bican key manager by default. To use a different key manager, update the backend value in the
[key_manager] group of the nova configuration file. For example:
[key_manager]
backend = barbican
Note: If these lines do not exist, then simply add them to the end of the file.
[glance]
verify_glance_signatures = True
enable_certificate_validation = True
$ export OS_TRUSTED_IMAGE_CERTIFICATE_IDS=79a6ad17-3298-4e55-8b3a-
,→1672dd93c40f,b20f5600-3c9d-4af5-8f37-3110df3533a0
Command-Line Flag: If booting or rebuilding an instance using the nova commands, use the
--trusted-image-certificate-id flag to define a single trusted certificate ID.
The flag may be used multiple times to specify multiple trusted certificate IDs. For example:
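A sketch with the legacy nova client; the server and image names are illustrative:
$ nova boot myInstanceName \
    --flavor 1 \
    --image myImageId \
    --trusted-image-certificate-id 79a6ad17-3298-4e55-8b3a-1672dd93c40f \
    --trusted-image-certificate-id b20f5600-3c9d-4af5-8f37-3110df3533a0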
If booting or rebuilding an instance using the openstack server commands, use the
--trusted-image-certificate-id flag to define a single trusted certificate ID.
The flag may be used multiple times to specify multiple trusted certificate IDs. For example:
$ openstack server create myInstanceName \
    --flavor 1 \
    --image myImageId \
    --nic net-id=fd25c0b2-b36b-45a8-82e4-ab52516289e5 \
    --trusted-image-certificate-id 79a6ad17-3298-4e55-8b3a-1672dd93c40f \
    --trusted-image-certificate-id b20f5600-3c9d-4af5-8f37-3110df3533a0
[glance]
default_trusted_certificate_ids=79a6ad17-3298-4e55-8b3a-1672dd93c40f,b20f5600-3c9d-4af5-8f37-3110df3533a0
Example Usage
For these instructions, we will construct a 4-certificate chain to illustrate that it is possible to have a
single trusted root certificate. We will upload all four certificates to Barbican. Then, we will sign an
image and upload it to Glance, which will illustrate image signature verification. Finally, we will boot
the signed image from Glance to show that certificate validation is enforced.
Enable image signature verification and certificate validation by setting both of their Nova configuration
options to True:
[glance]
verify_glance_signatures = True
enable_certificate_validation = True
As mentioned above, we will construct a 4-certificate chain to illustrate that it is possible to have a
single trusted root certificate. Before we begin to build our certificate chain, we must first create files for
OpenSSL to use for indexing and serial number tracking:
$ touch index.txt
$ echo '01' > serial.txt
For these instructions, we will create a single configuration file called ca.conf, which contains various
sections that we can specify for use on the command-line during certificate requests and generation.
Note that this certificate will be able to sign other certificates because it is a certificate authority. Also
note the root CA's unique common name (root). The intermediate certificates' common names will be
specified on the command-line when generating the corresponding certificate requests.
ca.conf:
[ req ]
prompt = no
distinguished_name = dn-param
x509_extensions = ca_cert_extensions
[ ca ]
default_ca = ca_default
[ dn-param ]
C = US
CN = Root CA
[ ca_cert_extensions ]
keyUsage = keyCertSign, digitalSignature
basicConstraints = CA:TRUE, pathlen:2
[ ca_default ]
new_certs_dir = . # Location for new certs after signing
database = ./index.txt # Database index file
serial = ./serial.txt # The current serial number
default_days = 1000
default_md = sha256
policy = signing_policy
email_in_dn = no
[ intermediate_cert_extensions ]
keyUsage = keyCertSign, digitalSignature
basicConstraints = CA:TRUE, pathlen:1
[client_cert_extensions]
keyUsage = keyCertSign, digitalSignature
basicConstraints = CA:FALSE
[ signing_policy ]
countryName = optional
stateOrProvinceName = optional
localityName = optional
organizationName = optional
organizationalUnitName = optional
commonName = supplied
emailAddress = optional
For these instructions, we will save the certificate as cert_ca.pem and the private key as key_ca.pem.
This certificate will be a self-signed root certificate authority (CA) that can sign other CAs and
non-CA certificates.
$ openssl req \
-x509 \
-nodes \
-newkey rsa:1024 \
-config ca.conf \
-keyout key_ca.pem \
-out cert_ca.pem
Create a certificate request for the first intermediate certificate. For these instructions, we
will save the certificate request as cert_intermediate_a.csr and the private key as
key_intermediate_a.pem.
$ openssl req \
-nodes \
-newkey rsa:2048 \
-subj '/CN=First Intermediate Certificate' \
-keyout key_intermediate_a.pem \
-out cert_intermediate_a.csr
Generate the first intermediate certificate by signing its certificate request with the CA. For these instruc-
tions we will save the certificate as cert_intermediate_a.pem.
$ openssl ca \
-config ca.conf \
-extensions intermediate_cert_extensions \
-cert cert_ca.pem \
-keyfile key_ca.pem \
-out cert_intermediate_a.pem \
-infiles cert_intermediate_a.csr
Create a certificate request for the second intermediate certificate. For these instructions, we
will save the certificate request as cert_intermediate_b.csr and the private key as
key_intermediate_b.pem.
$ openssl req \
-nodes \
-newkey rsa:2048 \
-subj '/CN=Second Intermediate Certificate' \
-keyout key_intermediate_b.pem \
-out cert_intermediate_b.csr
Generate the second intermediate certificate by signing its certificate request with the first intermediate
certificate. For these instructions we will save the certificate as cert_intermediate_b.pem.
$ openssl ca \
-config ca.conf \
-extensions intermediate_cert_extensions \
-cert cert_intermediate_a.pem \
-keyfile key_intermediate_a.pem \
-out cert_intermediate_b.pem \
-infiles cert_intermediate_b.csr
Create a certificate request for the client certificate. For these instructions, we will save the certificate
request as cert_client.csr and the private key as key_client.pem.
$ openssl req \
-nodes \
-newkey rsa:2048 \
-subj '/CN=Client Certificate' \
-keyout key_client.pem \
-out cert_client.csr
Generate the client certificate by signing its certificate request with the second intermediate certificate.
For these instructions we will save the certificate as cert_client.pem.
$ openssl ca \
-config ca.conf \
-extensions client_cert_extensions \
-cert cert_intermediate_b.pem \
-keyfile key_intermediate_b.pem \
-out cert_client.pem \
-infiles cert_client.csr
In order to interact with the key manager, the user needs to have a creator role.
To list all users with a creator role, run the following command as an admin:
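A typical form of that command, filtering on the creator role:
$ openstack role assignment list --role creator --names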
+---------+-----------------------------+-------+-------------------+--------+-----------+
| Role    | User                        | Group | Project           | Domain | Inherited |
+---------+-----------------------------+-------+-------------------+--------+-----------+
| creator | project_a_creator_2@Default |       | project_a@Default |        | False     |
| creator | project_b_creator@Default   |       | project_b@Default |        | False     |
| creator | project_a_creator@Default   |       | project_a@Default |        | False     |
+---------+-----------------------------+-------+-------------------+--------+-----------+
To give the demo user a creator role in the demo project, run the following command as an admin:
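A typical form of that command:
$ openstack role add --user demo --project demo creator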
Note: This command provides no output. If the command fails, the user will see a 4xx Client error
indicating that Secret creation attempt not allowed and to please review your user/project privileges.
Note: The following openstack secret commands require that the python-barbicanclient package is
installed.
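The certificates are stored in the key manager before their UUIDs are recorded below; a sketch for the root CA, using common python-barbicanclient options (the other three certificates are stored the same way, and the UUID of each returned secret is what gets assigned to the variables that follow):
$ openstack secret store --name CA \
    --secret-type certificate \
    --payload-content-type "application/octet-stream" \
    --payload-content-encoding base64 \
    --payload "$(base64 cert_ca.pem)"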
$ cert_ca_uuid=8fbcce5d-d646-4295-ba8a-269fc9451eeb
$ cert_intermediate_a_uuid=0b5d2c72-12cc-4ba6-a8d7-3ff5cc1d8cb8
$ cert_intermediate_b_uuid=674736e3-f25c-405c-8362-bbf991e0ce0a
$ cert_client_uuid=125e6199-2de4-46e3-b091-8e2401ef0d63
$ openssl dgst \
-sha256 \
-sign key_client.pem \
-sigopt rsa_padding_mode:pss \
-out cirros.self_signed.signature \
cirros.tar.gz
$ base64_signature=$(base64 -w 0 cirros.self_signed.signature)
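A sketch of uploading the signed image with its signature properties attached; the file name and property values mirror those produced above, and the format flags are assumptions:
$ openstack image create \
    --container-format bare \
    --disk-format qcow2 \
    --property img_signature="$base64_signature" \
    --property img_signature_certificate_uuid="$cert_client_uuid" \
    --property img_signature_hash_method='SHA-256' \
    --property img_signature_key_type='RSA-PSS' \
    --file cirros.tar.gz \
    cirros_client_signedImage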
+------------------+------------------------------------------------------------------------+
| Field            | Value                                                                  |
+------------------+------------------------------------------------------------------------+
| checksum         | d41d8cd98f00b204e9800998ecf8427e                                       |
| container_format | bare                                                                   |
| created_at       | 2019-02-06T06:29:56Z                                                   |
| disk_format      | qcow2                                                                  |
| file             | /v2/images/17f48a6c-e592-446e-9c91-00fbc436d47e/file                   |
| id               | 17f48a6c-e592-446e-9c91-00fbc436d47e                                   |
| min_disk         | 0                                                                      |
| min_ram          | 0                                                                      |
| name             | cirros_client_signedImage                                              |
| owner            | 45e13e63606f40d6b23275c3cd91aec2                                       |
| properties       | img_signature='swA/hZi3WaNh35VMGlnfGnBWuXMlUbdO8h306uG7W3nwOyZP6dGRJ3  |
|                  | Xoi/07Bo2dMUB9saFowqVhdlW5EywQAK6vgDsi9O5aItHM4u7zUPw+2e8eeaIoHlGhTks  |
|                  | kmW9isLy0mYA9nAfs3coChOIPXW4V8VgVXEfb6VYGHWm0nShiAP1e0do9WwitsE/TVKoS  |
|                  | QnWjhggIYij5hmUZ628KAygPnXklxVhqPpY/dFzL+tTzNRD0nWAtsc5wrl6/8HcNzZsaP  |
|                  | oexAysXJtcFzDrf6UQu66D3UvFBVucRYL8S3W56It3Xqu0+InLGaXJJpNagVQBb476zB2  |
|                  | ZzZ5RJ/4Zyxw==',                                                       |
|                  | img_signature_certificate_uuid='125e6199-2de4-46e3-b091-8e2401ef0d63', |
|                  | img_signature_hash_method='SHA-256',                                   |
|                  | img_signature_key_type='RSA-PSS',                                      |
|                  | os_hash_algo='sha512',                                                 |
|                  | ...                                                                    |
+------------------+------------------------------------------------------------------------+
Note: Creating the image can fail if validation does not succeed. This will cause the image to be deleted
and the Glance log to report that Signature verification failed for the given image ID.
Note: The instance should fail to boot because certificate validation fails when the feature is enabled
but no trusted image certificates are provided. The Nova log output should indicate that Image signature
certificate validation failed because Certificate chain building failed.
Note: The instance should successfully boot and certificate validation should succeed. The Nova log
output should indicate that Image signature certificate validation succeeded.
Other Links
• https://etherpad.openstack.org/p/mitaka-glance-image-signing-instructions
• https://etherpad.openstack.org/p/queens-nova-certificate-validation
• https://wiki.openstack.org/wiki/OpsGuide/User-Facing_Operations
• http://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/nova-validate-certificates.html
You can change the size of an instance by changing its flavor. This rebuilds the instance and therefore
results in a restart.
To list the VMs you want to resize, run:
$ openstack server list
Once you have the name or UUID of the server you wish to resize, resize it using the openstack
server resize command:
$ openstack server resize --flavor FLAVOR SERVER
Note: By default, the openstack server resize command gives the guest operating system a
chance to perform a controlled shutdown before the instance is powered off and the instance is resized.
This behavior can be configured by the administrator but it can also be overridden on a per image basis
using the os_shutdown_timeout image metadata setting. This allows different types of operating
systems to specify how much time they need to shut down cleanly. See Useful image properties for
details.
Resizing can take some time. During this time, the instance status will be RESIZE:
$ openstack server list
+----------------------+----------------+--------+-------------------------------------+
| ID                   | Name           | Status | Networks                            |
+----------------------+----------------+--------+-------------------------------------+
| 67bc9a9a-5928-47c... | myCirrosServer | RESIZE | admin_internal_net=192.168.111.139  |
+----------------------+----------------+--------+-------------------------------------+
When the resize completes, the instance status will be VERIFY_RESIZE. You can now confirm the
resize to change the status to ACTIVE:
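Depending on the client version, the confirmation is typically done with:
$ openstack server resize confirm SERVER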
Note: The resized server may be automatically confirmed based on the administrator's configuration of
the deployment.
If the resize does not work as expected, you can revert the resize. This will revert the instance to the old
flavor and change the status to ACTIVE:
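Depending on the client version, the revert is typically done with:
$ openstack server resize revert SERVER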
You can soft or hard reboot a running instance. A soft reboot attempts a graceful shut down and restart
of the instance. A hard reboot power cycles the instance.
To reboot a server, use the openstack server reboot command:
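For example:
$ openstack server reboot SERVER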
By default, when you reboot an instance it is a soft reboot. To perform a hard reboot, pass the --hard
parameter as follows:
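$ openstack server reboot --hard SERVER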
It is also possible to reboot a running instance into rescue mode. For example, this operation may be
required if a filesystem of an instance becomes corrupted with prolonged use. See Rescue an instance
for more details.
Instance rescue provides a mechanism for access, even if an image renders the instance inaccessible.
Two rescue modes are currently provided.
Instance rescue
By default the instance is booted from the provided rescue image or a fresh copy of the original instance
image if a rescue image is not provided. The root disk and optional regenerated config drive are also
attached to the instance for data recovery.
As of 21.0.0 (Ussuri) an additional stable device rescue mode is available. This mode now supports the
rescue of volume-backed instances.
This mode keeps all devices both local and remote attached in their original order to the instance during
the rescue while booting from the provided rescue image. This mode is enabled and controlled by
the presence of hw_rescue_device or hw_rescue_bus image properties on the provided rescue
image.
As their names suggest these properties control the rescue device type (cdrom, disk or floppy) and
bus type (scsi, virtio, ide, or usb) used when attaching the rescue image to the instance.
Support for each combination of the hw_rescue_device and hw_rescue_bus image properties
is dependent on the underlying hypervisor and platform being used. For example the IDE bus is not
available on POWER KVM based compute hosts.
Note: This mode is only supported when using the Libvirt virt driver.
This mode is not supported when using the LXC hypervisor as enabled by the libvirt.virt_type
configuration option on the compute hosts.
Usage
Note: Pause, suspend, and stop operations are not allowed when an instance is running in rescue
mode, as triggering these actions causes the loss of the original instance state and makes it impossible
to unrescue the instance.
Note: On running the openstack server rescue command, an instance performs a soft shutdown
first. This means that the guest operating system has a chance to perform a controlled shutdown
before the instance is powered off. The shutdown behavior is configured by the shutdown_timeout
parameter that can be set in the nova.conf file. Its value stands for the overall period (in seconds) a
guest operating system is allowed to complete the shutdown.
The timeout value can be overridden on a per-image basis by means of os_shutdown_timeout, an
image metadata setting that allows different types of operating systems to specify how much time
they need to shut down cleanly.
To rescue an instance that boots from a volume you need to use the 2.87 microversion or later.
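For example, the microversion can be requested explicitly on the command line:
$ openstack --os-compute-api-version 2.87 server rescue SERVER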
If you want to rescue an instance with a specific image, rather than the default one, use the --image
parameter:
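For example, with a recent python-openstackclient:
$ openstack server rescue --image IMAGE_ID SERVER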
To restart the instance from the normal boot disk, run the following command:
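$ openstack server unrescue SERVER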
Nova has a concept of block devices that can be exposed to cloud instances. There are several types of
block devices an instance can have (we will go into more details about this later in this document), and
which ones are available depends on a particular deployment and the usage limitations set for tenants
and users. Block device mapping is a way to organize and keep data about all of the block devices an
instance has.
When we talk about block device mapping, we usually refer to one of two things:
1. API/CLI structure and syntax for specifying block devices for an instance boot request
2. The data structure internal to Nova that is used for recording and keeping this data, which is ultimately
persisted in the block_device_mapping table. However, Nova internally has several slightly different
formats for representing the same data. All of them are documented in the code and/or presented
by a distinct set of classes, but not knowing that they exist might trip up people reading the code.
So in addition to BlockDeviceMapping1 objects that mirror the database schema, we have:
2.1 The API format - this is the set of raw key-value pairs received from the API client, and is
almost immediately transformed into the object; however, some validations are done using this
format. We will refer to this format as the API BDMs from now on.
2.2 The virt driver format - this is the format defined by the classes in the
nova.virt.block_device module. This format is used and expected by the code in the various virt drivers.
These classes, in addition to exposing a different format (mimicking the Python dict interface),
also provide a place to bundle some functionality common to certain types of block devices (for
example attaching volumes which has to interact with both Cinder and the virt driver code). We
will refer to this format as Driver BDMs from now on.
For more details on this please refer to the Driver BDM Data Structures reference document.
Note: The maximum limit on the number of disk devices allowed to attach to a single server is config-
urable with the option compute.max_disk_devices_to_attach.
1
In addition to the BlockDeviceMapping Nova object, we also have the BlockDeviceDict class in the nova.block_device
module. This class handles transforming and validating the API BDM format.
In the early days of Nova, the general structure of block device mapping closely mirrored that of the EC2 API.
During the Havana release of Nova, the block device handling code, and in turn the block device mapping
structure, was reworked to improve its generality and usefulness. These improvements included
exposing additional details and features in the API. In order to facilitate this, a new extension was added
to the v2 API called BlockDeviceMappingV2Boot2 , that added an additional block_device_mapping_v2
field to the instance boot API request.
This was the original format that supported only cinder volumes (similar to how EC2 block devices
support only EBS volumes). Every entry was keyed by device name (we will discuss why this was
problematic in its own section later on this page), and would accept only:
• UUID of the Cinder volume or snapshot
• Type field - used only to distinguish between volumes and Cinder volume snapshots
• Optional size field
• Optional delete_on_termination flag
While all of Nova internal code only uses and stores the new data structure, we still need to handle API
requests that use the legacy format. This is handled by the Nova API service on every request. As we will
see later, since block device mapping information can also be stored in the image metadata in Glance,
this is another place where we need to handle the v1 format. The code to handle legacy conversions is
part of the nova.block_device module.
Using device names as the primary per-instance identifier, and exposing them in the API, is problematic
for Nova mostly because several hypervisors Nova supports with its drivers can't guarantee that the
device names the guest OS assigns are the ones the user requested from Nova. Exposing such a detail
in the public API of Nova is obviously not ideal, but it needed to stay for backwards compatibility. It is
also required for some (slightly obscure) features around overloading a block device in a Glance image
when booting an instance3 .
The plan for fixing this was to allow users to not specify the device name of a block device, and Nova
will determine it (with the help of the virt driver), so that it can still be discovered through the API and
used when necessary, like for the features mentioned above (and preferably only then).
Another use for specifying the device name was to allow the boot from volume functionality, by speci-
fying a device name that matches the root device name for the instance (usually /dev/vda).
Currently (mid Liberty) users are discouraged from specifying device names for all calls requiring or
allowing block device mapping, except when trying to override the image block device mapping on
instance boot, and it will likely remain like that in the future. The libvirt driver will outright override
any device names passed with its own values.
2
This work predates API microversions and thus the only way to add it was by means of an API extension.
3
This is a feature that the EC2 API offers as well and has been in Nova for a long time, although it has been broken in
several releases. More info can be found in this bug report: https://launchpad.net/bugs/1370250
The new format was introduced in an attempt to solve issues with the original block device mapping format
discussed above, and also to allow for more flexibility and the addition of features that were not possible
with the simple format we had.
New block device mapping is a list of dictionaries containing the following fields (in addition to the ones
that were already there):
• source_type - this can have one of the following values:
– image
– volume
– snapshot
– blank
• destination_type - this can have one of the following values:
– local
– volume
• guest_format - Tells Nova how/if to format the device prior to attaching, should be only used with
blank local images. Denotes a swap disk if the value is swap.
• device_name - See the previous section for a more in depth explanation of this - currently best
left empty (not specified that is), unless the user wants to override the existing device specified in
the image metadata. In case of Libvirt, even when passed in with the purpose of overriding the
existing image metadata, final set of device names for the instance may still get changed by the
driver.
• disk_bus and device_type - low level details that some hypervisors (currently only libvirt) may
support. Some example disk_bus values can be: ide, usb, virtio, scsi, while device_type may be
disk, cdrom, floppy, lun. This is not an exhaustive list as it depends on the virtualization driver,
and may change as more support is added. Leaving these empty is the most common thing to do.
• boot_index - Defines the order in which a hypervisor will try devices when attempting to boot the
guest from storage. Each device which is capable of being used as boot device should be given a
unique boot index, starting from 0 in ascending order. Some hypervisors may not support booting
from multiple devices, so will only consider the device with boot index of 0. Some hypervisors
will support booting from multiple devices, but only if they are of different types - eg a disk
and CD-ROM. Setting a negative value or None indicates that the device should not be used for
booting. The simplest usage is to set it to 0 for the boot device and leave it as None for any other
devices.
• volume_type - Added in microversion 2.67 to the servers create API to support specify-
ing volume type when booting instances. When we snapshot a volume-backed server, the
block_device_mapping_v2 image metadata will include the volume_type from the BDM record
so if the user then creates another server from that snapshot, the volume that nova creates from
that snapshot will use the same volume_type. If a user wishes to change that volume type in the
image metadata, they can do so via the image API.
Combination of the source_type and destination_type will define the kind of block device
the entry is referring to. The following combinations are supported:
• image -> local - this is only currently reserved for the entry referring to the Glance image that the
instance is being booted with (it should also be marked as a boot device). It is also worth noting
that an API request that specifies this, also has to provide the same Glance uuid as the image_ref
parameter to the boot request (this is done for backwards compatibility and may be changed in the
future). This functionality might be extended to specify additional Glance images to be attached
to an instance after boot (similar to kernel/ramdisk images) but this functionality is not supported
by any of the current drivers.
• volume -> volume - this is just a Cinder volume to be attached to the instance. It can be marked as
a boot device.
• snapshot -> volume - this works exactly as passing type=snap does. It would create a volume
from a Cinder volume snapshot and attach that volume to the instance. Can be marked bootable.
• image -> volume - As one would imagine, this would download a Glance image to a cinder volume
and attach it to an instance. Can also be marked as bootable. This is really only a shortcut for
creating a volume out of an image before booting an instance with the newly created volume.
• blank -> volume - Creates a blank Cinder volume and attaches it. This will also require the volume
size to be set.
• blank -> local - Depending on the guest_format field (see below), this will either mean an
ephemeral blank disk on hypervisor local storage, or a swap disk (instances can have only one
of those).
Nova will not allow mixing of BDMv1 and BDMv2 in a single request, and will do basic validation to
make sure that the requested block device mapping is valid before accepting a boot request.
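To make the v2 format concrete, a minimal boot-from-volume request body might look like the following
sketch (the name, flavor and image identifiers are placeholders, the volume size is arbitrary, and the
"networks": "auto" form assumes microversion 2.37 or later):
POST /servers
{
    "server": {
        "name": "bfv-server",
        "flavorRef": "FLAVOR_ID",
        "networks": "auto",
        "block_device_mapping_v2": [
            {
                "boot_index": 0,
                "uuid": "IMAGE_UUID",
                "source_type": "image",
                "destination_type": "volume",
                "volume_size": 10,
                "delete_on_termination": true
            }
        ]
    }
}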
FAQs
1. Is it possible to configure nova to automatically use cinder to back all root disks with volumes?
No, there is nothing automatic within nova that converts a non-boot-from-volume request into a
boot-from-volume request by converting the image to a root volume. Several ideas have been discussed
over time which are captured in the spec for volume-backed flavors. However, if you wish to force users
to always create volume-backed servers, you can configure the API service by setting
max_local_block_devices to 0. This will result in any non-boot-from-volume server create request
failing with a 400 response.
This documents the changes made to the REST API with every microversion change. The description
for each version should be a verbose one which has enough information to be suitable for use in user
documentation.
2.1
This is the initial version of the v2.1 API which supports microversions. The v2.1 API is, from the REST
API user's point of view, exactly the same as v2.0 except with strong input validation.
A user can specify a header in the API request:
X-OpenStack-Nova-API-Version: <version>
2.2
2.4
Show the reserved status on a FixedIP object in the os-fixed-ips API extension. The extension
allows one to reserve and unreserve a fixed IP but the show method does not report the current status.
2.5
Before version 2.5, the command nova list --ip6 xxx returns all servers for non-admins, as the
filter option is silently discarded. There is no reason to treat ip6 different from ip, though, so we just add
this option to the allowed list.
2.6
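A new API for getting a remote console for a server is added: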
POST /servers/<uuid>/remote-consoles
{
"remote_console": {
"protocol": ["vnc"|"rdp"|"serial"|"spice"],
"type": ["novnc"|"xpvnc"|"rdp-html5"|"spice-html5"|"serial"]
}
}
Example response:
{
"remote_console": {
"protocol": "vnc",
"type": "novnc",
"url": "http://example.com:6080/vnc_auto.html?path=%3Ftoken%3DXYZ"
}
}
2.7
Check the is_public attribute of a flavor before adding tenant access to it. Reject the request with
an HTTPConflict error if the flavor is public.
2.8
2.9
Add a new locked attribute to the detailed view, update, and rebuild action. locked will be true if
anyone is currently holding a lock on the server, false otherwise.
2.10
Added user_id parameter to os-keypairs plugin, as well as a new property in the request body,
for the create operation.
Administrators will be able to list, get details and delete keypairs owned by users other than themselves
and to create new keypairs on behalf of their users.
2.11
Exposed attribute forced_down for os-services. Added ability to change the forced_down
attribute by calling an update.
2.12
Exposes VIF net_id attribute in os-virtual-interfaces. Users will be able to get the Virtual
Interface's net_id in the Virtual Interfaces list and can determine which network a Virtual Interface is
plugged into.
2.13
2.14
Remove onSharedStorage parameter from servers evacuate action. Nova will automatically detect
if the instance is on shared storage.
adminPass is removed from the response body. The user can get the password with the server's
os-server-password action.
2.15
From this version of the API users can also choose soft-affinity and soft-anti-affinity rules for server
groups.
2.16
Exposes new host_status attribute for servers/detail and servers/{server_id}. Ability to get nova-
compute status when querying servers. By default, this is only exposed to cloud administrators.
2.17
Add a new API for triggering a crash dump in an instance. Different operating systems in the instance may
need different configurations to trigger a crash dump.
2.18
2.19
Allow the user to set and get the server description. The user will be able to set the description when
creating, rebuilding, or updating a server, and get the description as part of the server details.
2.20
From this version of the API user can call detach and attach volumes for instances which are in
shelved and shelved_offloaded state.
2.21
2.22
A new resource, servers:migrations, is added. A new API to force live migration to complete
added:
POST /servers/<uuid>/migrations/<id>/action
{
"force_complete": null
}
2.23
From this version of the API users can get the migration summary list by index API or the information
of a specific migration by get API. Add migration_type for the old /os-migrations API, and also
add a ref link to /servers/{uuid}/migrations/{id} for it when the migration is an in-progress
live-migration.
2.24
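A new API to cancel (abort) a running live migration is added: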
DELETE /servers/<uuid>/migrations/<id>
2.25
Modify the input parameter for os-migrateLive. The block_migration field now supports an
auto value and the disk_over_commit flag is removed.
2.26
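Support for server tags is added. Tags are exposed as part of the server representation, for example: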
{
'id': {server_id},
...
'tags': ['foo', 'bar', 'baz']
}
A user can get only the set of server tags by making a GET request to /servers/<server_id>/tags.
Response
{
'tags': ['foo', 'bar', 'baz']
}
A user can check if a tag exists or not on a server by making a GET request to /servers/
{server_id}/tags/{tag}.
The request returns 204 No Content if the tag exists on a server or 404 Not Found if the tag doesn't
exist on a server.
A user can filter servers in GET /servers request by new filters:
• tags
• tags-any
• not-tags
• not-tags-any
These filters can be combined. Also, a user can specify more than one tag for each filter; in this case
the tags for each filter must be separated by commas. For example:
GET /servers?tags=red&tags-any=green,orange
2.27
Added support for the new form of microversion headers described in the Microversion Specification.
Both the original form of the header and the new form are supported.
2.28
2.29
Updates the POST request body for the evacuate action to include the optional force boolean field
defaulted to False. Also changes the evacuate action behaviour when providing a host string field by
calling the nova scheduler to verify the provided host unless the force attribute is set.
2.30
Updates the POST request body for the live-migrate action to include the optional force boolean
field defaulted to False. Also changes the live-migrate action behaviour when providing a host string
field by calling the nova scheduler to verify the provided host unless the force attribute is set.
2.31
Fix os-console-auth-tokens to return connection info for all types of tokens, not just RDP.
2.32
Adds an optional, arbitrary tag item to the networks item in the server boot request body. In addition,
every item in the block_device_mapping_v2 array can also have an optional, arbitrary tag item. These
tags are used to identify virtual device metadata, as exposed in the metadata API and on the config drive.
For example, a network interface on the virtual PCI bus tagged with nic1 will appear in the metadata
along with its bus (PCI), bus address (ex: 0000:00:02.0), MAC address, and tag (nic1).
Note: A bug has caused the tag attribute to no longer be accepted for networks starting with version 2.37
and for block_device_mapping_v2 starting with version 2.33. In other words, networks could only be
tagged between versions 2.32 and 2.36 inclusively and block devices only in version 2.32. As of version
2.42 the tag attribute has been restored and both networks and block devices can be tagged again.
2.33
Support pagination for hypervisor by accepting limit and marker from the GET API request:
GET /v2.1/{tenant_id}/os-hypervisors?marker={hypervisor_id}&limit={limit}
In the context of device tagging at server create time, 2.33 also removes the tag attribute from
block_device_mapping_v2. This is a bug that is fixed in 2.42, in which the tag attribute is reintroduced.
2.34
Checks in os-migrateLive before a live-migration actually starts are now made in the background.
os-migrateLive no longer throws a 400 Bad Request if pre-live-migration checks fail.
2.35
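Added pagination support for keypairs: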
GET /os-keypairs?limit={limit}&marker={kp_name}
2.36
All the APIs which proxy to another service were deprecated in this version, as was the fping API. Those
APIs will return 404 with microversion 2.36. The network-related quotas and limits are also removed from
the API. The deprecated API endpoints are listed below:
'/images'
'/os-networks'
'/os-tenant-networks'
'/os-fixed-ips'
'/os-floating-ips'
'/os-floating-ips-bulk'
'/os-floating-ip-pools'
'/os-floating-ip-dns'
'/os-security-groups'
'/os-security-group-rules'
'/os-security-group-default-rules'
'/os-volumes'
'/os-snapshots'
'/os-baremetal-nodes'
'/os-fping'
Note: A regression was introduced in this microversion which broke the force parameter in the PUT
/os-quota-sets API. The fix will have to be applied to restore this functionality.
Changed in version 18.0.0: The os-fping API was completely removed in the 18.0.0 (Rocky) release.
On deployments newer than this, the API will return HTTP 410 (Gone) regardless of the requested
microversion.
Changed in version 21.0.0: The os-security-group-default-rules API was completely removed
in the 21.0.0 (Ussuri) release. On deployments newer than this, the APIs will return HTTP 410
(Gone) regardless of the requested microversion.
Changed in version 21.0.0: The os-networks API was partially removed in the 21.0.0 (Ussuri) release.
On deployments newer than this, some endpoints of the API will return HTTP 410 (Gone) regardless
of the requested microversion.
Changed in version 21.0.0: The os-tenant-networks API was partially removed in the 21.0.0
(Ussuri) release. On deployments newer than this, some endpoints of the API will return HTTP 410
(Gone) regardless of the requested microversion.
2.37
Added support for automatic allocation of networking, also known as Get Me a Network. With this
microversion, when requesting the creation of a new server (or servers) the networks entry in the
server portion of the request body is required. The networks object in the request can either be a
list or an enum with values:
1. none which means no networking will be allocated for the created server(s).
2. auto which means either a network that is already available to the project will be used, or if one
does not exist, will be automatically created for the project. Automatic network allocation for a
project only happens once for a project. Subsequent requests using auto for the same project will
reuse the network that was previously allocated.
Also, the uuid field in the networks object in the server create request is now strictly enforced to be
in UUID format.
In the context of device tagging at server create time, 2.37 also removes the tag attribute from networks.
This is a bug that is fixed in 2.42, in which the tag attribute is reintroduced.
2.38
Before version 2.38, the command nova list --status invalid_status returned an
empty list for non-admin users and a 500 InternalServerError for admin users. As there are sufficient
statuses defined already, any invalid status should not be accepted. From this version of the API, both
admin and non-admin users will get a 400 HTTPBadRequest if an invalid status is passed to the nova
list command.
2.39
Deprecates the image-metadata proxy API that is just a proxy for the Glance API to operate on image
metadata. Also removes the extra quota enforcement with the Nova metadata quota (quota checks for
createImage and createBackup actions in Nova were removed). After this version the Glance configuration
option image_property_quota should be used to control the quota of image metadata. Also removes the
maxImageMeta field from the os-limits API response.
2.40
Optional query parameters limit and marker were added to the os-simple-tenant-usage
endpoints for pagination. If a limit isn't provided, the configurable max_limit will be used, which
currently defaults to 1000.
GET /os-simple-tenant-usage?limit={limit}&marker={instance_uuid}
GET /os-simple-tenant-usage/{tenant_id}?limit={limit}&marker={instance_uuid}
A tenant's usage statistics may span multiple pages when the number of instances exceeds limit, and API
consumers will need to stitch together the aggregate results if they still want totals for all instances in a
specific time window, grouped by tenant.
Older versions of the os-simple-tenant-usage endpoints will not accept these new paging query
parameters, but they will start to silently limit by max_limit to encourage the adoption of this new
microversion, and circumvent the existing possibility of DoS-like usage requests when there are thousands
of instances.
2.41
The uuid attribute of an aggregate is now returned from calls to the /os-aggregates endpoint. This
attribute is auto-generated upon creation of an aggregate. The os-aggregates API resource endpoint
remains an administrator-only API.
2.42
In the context of device tagging at server create time, a bug has caused the tag attribute to no longer be
accepted for networks starting with version 2.37 and for block_device_mapping_v2 starting with version
2.33. Microversion 2.42 restores the tag parameter to both networks and block_device_mapping_v2,
allowing networks and block devices to be tagged again.
2.43
The os-hosts API is deprecated as of the 2.43 microversion. Requests made with microversion >=
2.43 will result in a 404 error. To list and show host details, use the os-hypervisors API. To enable
or disable a service, use the os-services API. There is no replacement for the shutdown, startup,
reboot, or maintenance_mode actions as those are system-level operations which should be outside of
the control of the compute service.
2.44
The following APIs, which are considered proxies of the Neutron networking API, are deprecated and
will result in a 404 error response in this new microversion:
POST /servers/{server_uuid}/action
{
"addFixedIp": {...}
}
POST /servers/{server_uuid}/action
{
"removeFixedIp": {...}
}
POST /servers/{server_uuid}/action
{
"addFloatingIp": {...}
}
POST /servers/{server_uuid}/action
{
"removeFloatingIp": {...}
}
Those server actions can be replaced by calling the Neutron API directly.
The nova-network specific API to query the server's interfaces is deprecated:
GET /servers/{server_uuid}/os-virtual-interfaces
To query attached neutron interfaces for a specific server, the API GET
/servers/{server_uuid}/os-interface can be used.
2.45
The createImage and createBackup server action APIs no longer return a Location header in
the response for the snapshot image, they now return a json dict in the response body with an image_id
key and uuid value.
2.46
The request_id created for every inbound request is now returned in X-OpenStack-Request-ID in
addition to X-Compute-Request-ID to be consistent with the rest of OpenStack. This is a signaling
only microversion, as these header settings happen well before microversion processing.
2.47
Replace the flavor name/ref with the actual flavor details from the embedded flavor object when
displaying server details. Requests made with microversion >= 2.47 will no longer return the flavor
ID/link but instead will return a subset of the flavor details. If the user is prevented by policy from
indexing extra-specs, then the extra_specs field will not be included in the flavor information.
2.48
Before version 2.48, VM diagnostics response was just a blob of data returned by each hypervisor. From
this version VM diagnostics response is standardized. It has a set of fields which each hypervisor will
try to fill. If a hypervisor driver is unable to provide a specific field then this field will be reported as
None.
2.49
Continuing from device role tagging at server create time introduced in version 2.32 and later fixed in
2.42, microversion 2.49 allows the attachment of network interfaces and volumes with an optional tag
parameter. This tag is used to identify the virtual devices in the guest and is exposed in the metadata
API. Because the config drive cannot be updated while the guest is running, it will only contain metadata
of devices that were tagged at boot time. Any changes made to devices while the instance is running -
be it detaching a tagged device or performing a tagged device attachment - will not be reflected in the
config drive.
Tagged volume attachment is not supported for shelved-offloaded instances.
2.50
The server_groups and server_group_members keys are exposed in the GET & PUT
os-quota-class-sets APIs response body. Network-related quotas have been filtered out from
os-quota-class. The quotas below are filtered out and not available in the os-quota-class-sets
APIs from this microversion onwards.
• fixed_ips
• floating_ips
• networks
• security_group_rules
• security_groups
2.51
2.52
Adds support for applying tags when creating a server. The tag schema is the same as in the 2.26
microversion.
2.53
os-services
Services are now identified by uuid instead of database id to ensure uniqueness across cells. This
microversion brings the following changes:
• GET /os-services returns a uuid in the id field of the response
• DELETE /os-services/{service_uuid} requires a service uuid in the path
• The following APIs have been superseded by PUT /os-services/{service_uuid}/:
– PUT /os-services/disable
– PUT /os-services/disable-log-reason
– PUT /os-services/enable
– PUT /os-services/force-down
2.54
Allow the user to set the server key pair while rebuilding.
2.55
2.56
Updates the POST request body for the migrate action to include the optional host string field
defaulted to null. If host is set the migrate action verifies the provided host with the nova scheduler
and uses it as the destination for the migration.
2.57
2.58
Add pagination support and changes-since filter for os-instance-actions API. Users can now use
limit and marker to perform paginated query when listing instance actions. Users can also use
changes-since filter to filter the results based on the last time the instance action was updated.
2.59
2.60
From this version of the API users can attach a multiattach capable volume to multiple instances.
The API request for creating the additional attachments is the same. The chosen virt driver and the
volume back end have to support the functionality as well.
2.61
Exposes flavor extra_specs in the flavor representation. Now users can see the flavor extra-specs in
flavor APIs response and do not need to call GET /flavors/{flavor_id}/os-extra_specs
API. If the user is prevented by policy from indexing extra-specs, then the extra_specs field will
not be included in the flavor information. Flavor extra_specs will be included in Response body of the
following APIs:
• GET /flavors/detail
• GET /flavors/{flavor_id}
• POST /flavors
• PUT /flavors/{flavor_id}
2.62
Adds host (hostname) and hostId (an obfuscated hashed host id string) fields to the instance
action GET /servers/{server_id}/os-instance-actions/{req_id}
API. The display of the newly added host field will be controlled via policy rule
os_compute_api:os-instance-actions:events, which is the same policy used for
the events.traceback field. If the user is prevented by policy, only hostId will be displayed.
2.63
Adds support for the trusted_image_certificates parameter, which is used to define a list
of trusted certificate IDs that can be used during image signature verification and certificate validation.
The list is restricted to a maximum of 50 IDs. Note that trusted_image_certificates is not
supported with volume-backed servers.
The trusted_image_certificates request parameter can be passed to the server create and
rebuild APIs:
• POST /servers
• POST /servers/{server_id}/action (rebuild)
2.64
Enable users to define the policy rules on server group policy to meet more advanced policy requirements.
This microversion brings the following changes in server group APIs:
• Add policy and rules fields in the request of POST /os-server-groups. The policy
represents the name of the policy. The rules field, which is a dict, can be applied to the policy,
and currently only supports max_server_per_host for the anti-affinity policy.
• The policy and rules fields will be returned in response body of POST, GET /
os-server-groups API and GET /os-server-groups/{server_group_id} API.
• The policies and metadata fields have been removed from the response body of POST, GET
/os-server-groups API and GET /os-server-groups/{server_group_id}
API.
2.65
Add support for aborting live migrations in queued and preparing status via the API DELETE /
servers/{server_id}/migrations/{migration_id}.
2.66
The changes-before filter can be included as a request parameter of the following APIs to filter by
changes before or equal to the resource updated_at time:
• GET /servers
• GET /servers/detail
• GET /servers/{server_id}/os-instance-actions
• GET /os-migrations
2.67
2.68
Remove support for forced live migration and evacuate server actions.
2.69
Add support for returning minimal constructs for GET /servers, GET /servers/detail, GET
/servers/{server_id} and GET /os-services when there is a transient unavailability
condition in the deployment like an infrastructure failure. Starting from this microversion, the responses
from the down part of the infrastructure for the above four requests will have missing key values to
make it more resilient. The response body will only have a minimal set of information obtained from
the available information in the API database for the down cells. See handling down cells for more
information.
2.70
Exposes virtual device tags for volume attachments and virtual interfaces (ports). A tag parameter is
added to the response body for the following APIs:
Volumes
• GET /servers/{server_id}/os-volume_attachments (list)
• GET /servers/{server_id}/os-volume_attachments/{volume_id} (show)
• POST /servers/{server_id}/os-volume_attachments (attach)
Ports
• GET /servers/{server_id}/os-interface (list)
• GET /servers/{server_id}/os-interface/{port_id} (show)
• POST /servers/{server_id}/os-interface (attach)
2.71
The server_groups parameter will be in the response body of the following APIs to list the server
groups to which the server belongs:
• GET /servers/{server_id}
• PUT /servers/{server_id}
• POST /servers/{server_id}/action (rebuild)
2.72
API microversion 2.72 adds support for creating servers with neutron ports that have a resource request,
e.g. neutron ports with a QoS minimum bandwidth rule. Deleting servers with such ports, as well as
detaching these types of ports, has already been handled properly.
API limitations:
• Creating servers with Neutron networks having QoS minimum bandwidth rule is not supported.
• Attaching Neutron ports and networks having QoS minimum bandwidth rule is not supported.
• Moving (resizing, migrating, live-migrating, evacuating, unshelving after shelve offload) servers
with ports having resource request is not yet supported.
2.73
API microversion 2.73 adds support for specifying a reason when locking the server and exposes this
information via GET /servers/detail, GET /servers/{server_id}, PUT /servers/
{server_id} and POST /servers/{server_id}/action where the action is rebuild. It also
supports locked as a filter/sort parameter for GET /servers/detail and GET /servers.
2.74
API microversion 2.74 adds support for specifying optional host and/or hypervisor_hostname
parameters in the request body of POST /servers. These request a specific destination
host/node to boot the requested server. These parameters are mutually exclusive with the special
availability_zone format of zone:host:node. Unlike zone:host:node, the host
and/or hypervisor_hostname parameters still allow scheduler filters to be run. If the requested
host/node is unavailable or otherwise unsuitable, an earlier failure will be raised. There will also be a new
policy named compute:servers:create:requested_destination. By default, it can be
specified by administrators only.
2.75
2.76
2.77
API microversion 2.77 adds support for specifying an availability zone when unshelving a shelved
offloaded server.
2.78
2.79
API microversion 2.79 adds support for specifying the delete_on_termination field in the request
body when attaching a volume to a server, to support configuring whether to delete the data volume
when the server is destroyed. Also, delete_on_termination is added to the GET responses when
showing attached volumes, and the delete_on_termination field is contained in the POST API
response body when attaching a volume.
The affected APIs are as follows:
• POST /servers/{server_id}/os-volume_attachments
• GET /servers/{server_id}/os-volume_attachments
• GET /servers/{server_id}/os-volume_attachments/{volume_id}
2.80
Microversion 2.80 changes the list migrations APIs and the os-migrations API.
Expose the user_id and project_id fields in the following APIs:
• GET /os-migrations
• GET /servers/{server_id}/migrations
• GET /servers/{server_id}/migrations/{migration_id}
The GET /os-migrations API will also have optional user_id and project_id query parameters
for filtering migrations by user and/or project, for example:
• GET /os-migrations?user_id=ef9d34b4-45d0-4530-871b-3fb535988394
• GET /os-migrations?project_id=011ee9f4-8f16-4c38-8633-a254d420fd54
• GET /os-migrations?user_id=ef9d34b4-45d0-4530-871b-3fb535988394&project_id=0
2.81
Adds support for image cache management by aggregate by adding POST /os-aggregates/
{aggregate_id}/images.
2.82
2.83
Allow the following filter parameters for GET /servers/detail and GET /servers for non-admin
users:
• availability_zone
• config_drive
• key_name
• created_at
• launched_at
• terminated_at
• power_state
• task_state
• vm_state
• progress
• user_id
2.84
2.85
2.86
Add support for validation of known extra specs. This is enabled by default for the following APIs:
• POST /flavors/{flavor_id}/os-extra_specs
• PUT /flavors/{flavor_id}/os-extra_specs/{id}
Validation is only used for recognized extra spec namespaces, currently: accel,
aggregate_instance_extra_specs, capabilities, hw, hw_rng, hw_video, os,
pci_passthrough, powervm, quota, resources, trait, and vmware.
2.87
Adds support for rescuing boot-from-volume instances when the compute host reports the
COMPUTE_BFV_RESCUE capability trait.
2.88
The following fields are no longer included in responses for the GET /os-hypervisors/detail
and GET /os-hypervisors/{hypervisor_id} APIs:
• current_workload
• cpu_info
• vcpus
• vcpus_used
• free_disk_gb
• local_gb
• local_gb_used
• disk_available_least
• free_ram_mb
• memory_mb
• memory_mb_used
• running_vms
These fields were removed as the information they provided was frequently misleading or outright
wrong, and many of them can be better queried from placement.
In addition, the GET /os-hypervisors/statistics API, which provided a summary view with
just the fields listed above, has been removed entirely and will now raise an HTTP 404 with microversion
2.88 or greater.
Finally, the GET /os-hypervisors/{hypervisor}/uptime API, which provided a
similar response to the GET /os-hypervisors/detail and GET /os-hypervisors/
{hypervisor_id} APIs but with an additional uptime field, has been removed in favour of including
this field in the primary GET /os-hypervisors/detail and GET /os-hypervisors/
{hypervisor_id} APIs.
Todo: The rest of this document should probably move to the admin guide.
There is information you might want to consider before doing your deployment, especially if it is going
to be a larger deployment. For smaller deployments the defaults from the install guide will be sufficient.
• Compute Driver Features Supported: While the majority of nova deployments use libvirt/kvm,
you can use nova with other compute drivers. Nova attempts to provide a unified feature set across
these, however, not all features are implemented on all backends, and not all features are equally
well tested.
– Feature Support by Use Case: A view of what features each driver supports based on what's
important to some large use cases (General Purpose Cloud, NFV Cloud, HPC Cloud).
– Feature Support full list: A detailed dive through features in each compute driver backend.
• Cells v2 Planning: For large deployments, Cells v2 allows sharding of your compute environment.
Upfront planning is key to a successful Cells v2 layout.
• Placement service: Overview of the placement service, including how it fits in with the rest of
nova.
• Running nova-api on wsgi: Considerations for using a real WSGI container instead of the baked-in
eventlet web server.
2.1.4 Maintenance
Once you are running nova, the following information is extremely useful.
• Admin Guide: A collection of guides for administrating nova.
• Upgrades: How nova is designed to be upgraded for minimal service impact, and the order you
should do them in.
• Quotas: Managing project quotas in nova.
• Availability Zones: Availability Zones are an end-user visible logical abstraction for partitioning
a cloud without knowing the physical infrastructure. They can be used to partition a cloud on
arbitrary factors, such as location (country, datacenter, rack), network layout and/or power source.
• Filter Scheduler: How the filter scheduler is configured, and how that will impact where compute
instances land in your environment. If you are seeing unexpected distribution of compute instances
in your hosts, you'll want to dive into this configuration.
• Exposing custom metadata to compute instances: How and when you might want to extend the
basic metadata exposed to compute instances (either via metadata server or config drive) for your
specific purposes.
All end user (and some administrative) features of nova are exposed via a REST API, which can be used
to build more complicated logic or automation with nova. This can be consumed directly, or via various
SDKs. The following resources will help you get started with consuming the API directly.
• Compute API Guide: The concept guide for the API. This helps lay out the concepts behind the
API to make consuming the API reference easier.
• Compute API Reference: The complete reference for the compute API, including all methods and
request / response parameters and their meaning.
• Compute API Microversion History: The compute API evolves over time through Microversions.
This provides the history of all those changes. Consider it a "what's new" in the compute API.
• Block Device Mapping: One of the trickier parts to understand is the Block Device Mapping
parameters used to connect specific block devices to computes. This deserves its own deep dive.
• Metadata: Provide information to the guest instance when it is created.
Nova can be configured to emit notifications over RPC.
• Versioned Notifications: This provides the list of existing versioned notifications with sample
payloads.
Other end-user guides can be found under User Documentation.
THREE
FOR OPERATORS
Nova comprises multiple server processes, each performing different functions. The user-facing
interface is a REST API, while internally Nova components communicate via an RPC message passing
mechanism.
The API servers process REST requests, which typically involve database reads/writes, optionally
sending RPC messages to other Nova services, and generating responses to the REST calls. RPC messaging
is done via the oslo.messaging library, an abstraction on top of message queues. Most of the major nova
components can be run on multiple servers, and have a manager that is listening for RPC messages. The
one major exception is nova-compute, where a single process runs on the hypervisor it is managing
(except when using the VMware or Ironic drivers). The manager also, optionally, has periodic tasks. For
more details on our RPC system, please see: AMQP and Nova
Nova also uses a central database that is (logically) shared between all components. However, to aid
upgrade, the DB is accessed through an object layer that ensures an upgraded control plane can still
communicate with a nova-compute running the previous release. To make this possible nova-compute
proxies DB requests over RPC to a central manager called nova-conductor.
To horizontally expand Nova deployments, we have a deployment sharding concept called cells. For
more information please see: Cells
3.1.1.1 Components
Below you will find a helpful explanation of the key components of a typical Nova deployment.
[Figure: Nova architecture diagram. The API, Conductor, Scheduler, Compute, and Placement services
communicate over oslo.messaging and access the API and cell databases, while HTTP is used to talk to
external services such as Keystone, Glance, Neutron, and Cinder; nova-compute drives the hypervisor.]
3.2 Installation
The detailed install guide for nova. A functioning nova will also require having installed keystone,
glance, neutron, and placement. Ensure that you follow their install guides first.
3.2.1.1 Overview
The OpenStack project is an open source cloud computing platform that supports all types of cloud
environments. The project aims for simple implementation, massive scalability, and a rich set of features.
Cloud computing experts from around the world contribute to the project.
OpenStack provides an Infrastructure-as-a-Service (IaaS) solution through a variety of complementary
services. Each service offers an Application Programming Interface (API) that facilitates this integration.
This guide covers step-by-step deployment of the major OpenStack services using a functional example
architecture suitable for new users of OpenStack with sufficient Linux experience. This guide is not
intended to be used for production system installations, but to create a minimum proof-of-concept for
the purpose of learning about OpenStack.
After becoming familiar with basic installation, configuration, operation, and troubleshooting of these
OpenStack services, you should consider the following steps toward deployment using a production
architecture:
• Determine and implement the necessary core and optional services to meet performance and
redundancy requirements.
• Increase security using methods such as firewalls, encryption, and service policies.
• Implement a deployment tool such as Ansible, Chef, Puppet, or Salt to automate deployment and
management of the production environment.
Example architecture
The example architecture requires at least two nodes (hosts) to launch a basic virtual machine (VM) or
instance. Optional services such as Block Storage and Object Storage require additional nodes.
Important: The example architecture used in this guide is a minimum configuration, and is not
intended for production system installations. It is designed to provide a minimum proof-of-concept for the
purpose of learning about OpenStack. For information on creating architectures for specific use cases,
or how to determine which architecture is required, see the Architecture Design Guide.
Controller
The controller node runs the Identity service, Image service, management portions of Compute,
management portion of Networking, various Networking agents, and the Dashboard. It also includes supporting
services such as an SQL database, message queue, and Network Time Protocol (NTP).
Optionally, the controller node runs portions of the Block Storage, Object Storage, Orchestration, and
Telemetry services.
The controller node requires a minimum of two network interfaces.
Compute
The compute node runs the hypervisor portion of Compute that operates instances. By default, Compute
uses the kernel-based VM (KVM) hypervisor. The compute node also runs a Networking service agent
that connects instances to virtual networks and provides firewalling services to instances via security
groups.
You can deploy more than one compute node. Each node requires a minimum of two network interfaces.
Block Storage
The optional Block Storage node contains the disks that the Block Storage and Shared File System
services provision for instances.
For simplicity, service traffic between compute nodes and this node uses the management network.
Production environments should implement a separate storage network to increase performance and
security.
You can deploy more than one block storage node. Each node requires a minimum of one network
interface.
Object Storage
The optional Object Storage node contains the disks that the Object Storage service uses for storing
accounts, containers, and objects.
For simplicity, service traffic between compute nodes and this node uses the management network.
Production environments should implement a separate storage network to increase performance and
security.
This service requires two nodes. Each node requires a minimum of one network interface. You can
deploy more than two object storage nodes.
Networking
The provider networks option deploys the OpenStack Networking service in the simplest way possible
with primarily layer-2 (bridging/switching) services and VLAN segmentation of networks. Essentially,
it bridges virtual networks to physical networks and relies on physical network infrastructure for
layer-3 (routing) services. Additionally, a Dynamic Host Configuration Protocol (DHCP) service
provides IP address information to instances.
The OpenStack user requires more information about the underlying network infrastructure to create a
virtual network to exactly match the infrastructure.
Warning: This option lacks support for self-service (private) networks, layer-3 (routing) services,
and advanced services such as Load-Balancer-as-a-Service (LBaaS) and FireWall-as-a-Service
(FWaaS). Consider the self-service networks option below if you desire these features.
The self-service networks option augments the provider networks option with layer-3 (routing) services
that enable self-service networks using overlay segmentation methods such as Virtual Extensible LAN
(VXLAN). Essentially, it routes virtual networks to physical networks using Network Address Translation
(NAT). Additionally, this option provides the foundation for advanced services such as LBaaS and
FWaaS.
The OpenStack user can create virtual networks without the knowledge of underlying infrastructure on
the data network. This can also include VLAN networks if the layer-2 plug-in is configured accordingly.
Use OpenStack Compute to host and manage cloud computing systems. OpenStack Compute is a major
part of an Infrastructure-as-a-Service (IaaS) system. The main modules are implemented in Python.
OpenStack Compute interacts with OpenStack Identity for authentication, OpenStack Placement for
resource inventory tracking and selection, OpenStack Image service for disk and server images, and
OpenStack Dashboard for the user and administrative interface. Image access is limited by projects, and
by users; quotas are limited per project (the number of instances, for example). OpenStack Compute
can scale horizontally on standard hardware, and download images to launch instances.
OpenStack Compute consists of the following areas and their components:
nova-api service Accepts and responds to end user compute API calls. The service supports the
OpenStack Compute API. It enforces some policies and initiates most orchestration activities,
such as running an instance.
nova-api-metadata service Accepts metadata requests from instances. For more information,
refer to Metadata service.
nova-compute service A worker daemon that creates and terminates virtual machine instances
through hypervisor APIs. For example:
• libvirt for KVM or QEMU
• VMwareAPI for VMware
Processing is fairly complex. Basically, the daemon accepts actions from the queue and performs
a series of system commands such as launching a KVM instance and updating its state in the
database.
nova-scheduler service Takes a virtual machine instance request from the queue and determines
on which compute server host it runs.
nova-conductor module Mediates interactions between the nova-compute service and the
database. It eliminates direct accesses to the cloud database made by the nova-compute
service. The nova-conductor module scales horizontally. However, do not deploy it on nodes
where the nova-compute service runs. For more information, see the conductor section in
the Configuration Options.
nova-novncproxy daemon Provides a proxy for accessing running instances through a VNC
connection. Supports browser-based novnc clients.
nova-spicehtml5proxy daemon Provides a proxy for accessing running instances through a
SPICE connection. Supports browser-based HTML5 client.
The queue A central hub for passing messages between daemons. Usually implemented with RabbitMQ
but other options are available.
SQL database Stores most build-time and run-time states for a cloud infrastructure, including:
• Available instance types
• Instances in use
• Available networks
• Projects
Theoretically, OpenStack Compute can support any database that SQLAlchemy supports. Common
databases are SQLite3 for test and development work, MySQL, MariaDB, and PostgreSQL.
This section describes how to install and configure the Compute service on the controller node for
Ubuntu, openSUSE and SUSE Linux Enterprise, and Red Hat Enterprise Linux and CentOS.
This section describes how to install and configure the Compute service, code-named nova, on the
controller node.
Prerequisites
Before you install and configure the Compute service, you must create databases, service credentials,
and API endpoints.
1. To create the databases, complete these steps:
• Use the database access client to connect to the database server as the root user:
# mysql
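The exact statements depend on your deployment; a typical sketch, assuming the nova_api, nova, and
nova_cell0 databases and NOVA_DBPASS as a placeholder password, looks like:
MariaDB [(none)]> CREATE DATABASE nova_api;
MariaDB [(none)]> CREATE DATABASE nova;
MariaDB [(none)]> CREATE DATABASE nova_cell0;
MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova_api.* TO 'nova'@'localhost' IDENTIFIED BY 'NOVA_DBPASS';
MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova_api.* TO 'nova'@'%' IDENTIFIED BY 'NOVA_DBPASS';
MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova.* TO 'nova'@'localhost' IDENTIFIED BY 'NOVA_DBPASS';
MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova.* TO 'nova'@'%' IDENTIFIED BY 'NOVA_DBPASS';
MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova_cell0.* TO 'nova'@'localhost' IDENTIFIED BY 'NOVA_DBPASS';
MariaDB [(none)]> GRANT ALL PRIVILEGES ON nova_cell0.* TO 'nova'@'%' IDENTIFIED BY 'NOVA_DBPASS';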
$ . admin-openrc
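For example, the nova service user is created with the openstack client (the output below follows
from this command):
$ openstack user create --domain default --password-prompt nova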
User Password:
Repeat User Password:
+---------------------+----------------------------------+
| Field | Value |
+---------------------+----------------------------------+
| domain_id | default |
| enabled | True |
| id | 8a7dbf5279404537b1c7b86c033620fe |
| name | nova |
| options | {} |
| password_expires_at | None |
+---------------------+----------------------------------+
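The nova user is then typically given the admin role on the service project, and the compute service
entity is created; for example:
$ openstack role add --project service --user nova admin
$ openstack service create --name nova --description "OpenStack Compute" compute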
+-------------+----------------------------------+
| Field | Value |
+-------------+----------------------------------+
| description | OpenStack Compute |
| enabled | True |
| id | 060d59eac51b4594815603d75a00aba2 |
| name | nova |
| type | compute |
+-------------+----------------------------------+
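The Compute API endpoints are then registered; for example, the public endpoint (the internal and
admin endpoints shown in the following output are created the same way):
$ openstack endpoint create --region RegionOne compute public http://controller:8774/v2.1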
+--------------+-------------------------------------------+
| Field | Value |
+--------------+-------------------------------------------+
| enabled | True |
| id | 3c1caa473bfe4390a11e7177894bcc7b |
| interface | public |
| region | RegionOne |
+--------------+-------------------------------------------+
| Field | Value |
+--------------+-------------------------------------------+
| enabled | True |
| id | e3c918de680746a586eac1f2d9bc10ab |
| interface | internal |
| region | RegionOne |
| region_id | RegionOne |
| service_id | 060d59eac51b4594815603d75a00aba2 |
| service_name | nova |
| service_type | compute |
| url          | http://controller:8774/v2.1               |
+--------------+-------------------------------------------+
+--------------+-------------------------------------------+
| Field | Value |
+--------------+-------------------------------------------+
| enabled | True |
| id | 38f7af91666a47cfb97b4dc790b94424 |
| interface | admin |
| region | RegionOne |
| region_id | RegionOne |
| service_id | 060d59eac51b4594815603d75a00aba2 |
| service_name | nova |
| service_type | compute |
| url          | http://controller:8774/v2.1               |
+--------------+-------------------------------------------+
Note: Default configuration files vary by distribution. You might need to add these sections and options
rather than modifying existing sections and options. Also, an ellipsis (...) in the configuration snippets
indicates potential default configuration options that you should retain.
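2. Edit the /etc/nova/nova.conf file and complete the following actions:
• In the [api_database] and [database] sections, configure database access: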
[api_database]
# ...
connection = mysql+pymysql://nova:NOVA_DBPASS@controller/nova_api
[database]
# ...
connection = mysql+pymysql://nova:NOVA_DBPASS@controller/nova
Replace NOVA_DBPASS with the password you chose for the Compute databases.
• In the [DEFAULT] section, configure RabbitMQ message queue access:
[DEFAULT]
# ...
transport_url = rabbit://openstack:RABBIT_PASS@controller:5672/
Replace RABBIT_PASS with the password you chose for the openstack account in
RabbitMQ.
• In the [api] and [keystone_authtoken] sections, configure Identity service access:
[api]
# ...
auth_strategy = keystone
[keystone_authtoken]
# ...
www_authenticate_uri = http://controller:5000/
auth_url = http://controller:5000/
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = nova
password = NOVA_PASS
Replace NOVA_PASS with the password you chose for the nova user in the Identity service.
Note: Comment out or remove any other options in the [keystone_authtoken] sec-
tion.
• In the [DEFAULT] section, configure the my_ip option to use the management interface
IP address of the controller node:
[DEFAULT]
# ...
my_ip = 10.0.0.11
[vnc]
enabled = true
# ...
server_listen = $my_ip
server_proxyclient_address = $my_ip
• In the [glance] section, configure the location of the Image service API:
[glance]
# ...
api_servers = http://controller:9292
[oslo_concurrency]
# ...
lock_path = /var/lib/nova/tmp
• Due to a packaging bug, remove the log_dir option from the [DEFAULT] section.
• In the [placement] section, configure access to the Placement service:
[placement]
# ...
region_name = RegionOne
project_domain_name = Default
project_name = service
auth_type = password
user_domain_name = Default
auth_url = http://controller:5000/v3
username = placement
password = PLACEMENT_PASS
Replace PLACEMENT_PASS with the password you choose for the placement service
user created when installing Placement. Comment out or remove any other options in the
[placement] section.
3. Populate the nova-api database:
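The commands for this and the subsequent database steps (registering the cell0 database, creating the cell1 cell and populating the nova database) follow the standard nova-manage sequence; as a sketch:
# su -s /bin/sh -c "nova-manage api_db sync" nova
# su -s /bin/sh -c "nova-manage cell_v2 map_cell0" nova
# su -s /bin/sh -c "nova-manage cell_v2 create_cell --name=cell1 --verbose" nova
# su -s /bin/sh -c "nova-manage db sync" nova
# su -s /bin/sh -c "nova-manage cell_v2 list_cells" nova
The last command verifies that cell0 and cell1 are registered correctly.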
Finalize installation
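Restart the Compute services; a sketch assuming the standard Ubuntu service names:
# service nova-api restart
# service nova-scheduler restart
# service nova-conductor restart
# service nova-novncproxy restart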
Install and configure controller node for openSUSE and SUSE Linux Enterprise
This section describes how to install and configure the Compute service, code-named nova, on the
controller node.
Prerequisites
Before you install and configure the Compute service, you must create databases, service credentials,
and API endpoints.
1. To create the databases, complete these steps:
• Use the database access client to connect to the database server as the root user:
$ mysql -u root -p
Create the nova_api, nova and nova_cell0 databases and grant the nova user access to them (GRANT ... IDENTIFIED BY 'NOVA_DBPASS';), as shown in the Ubuntu section above.
$ . admin-openrc
User Password:
Repeat User Password:
+---------------------+----------------------------------+
| Field | Value |
+---------------------+----------------------------------+
| domain_id | default |
| enabled | True |
| id | 8a7dbf5279404537b1c7b86c033620fe |
| name | nova |
| options | {} |
| password_expires_at | None |
+---------------------+----------------------------------+
+-------------+----------------------------------+
| Field | Value |
+-------------+----------------------------------+
| description | OpenStack Compute |
| enabled | True |
| id | 060d59eac51b4594815603d75a00aba2 |
| name | nova |
| type | compute |
+-------------+----------------------------------+
+--------------+-------------------------------------------+
| Field | Value |
+--------------+-------------------------------------------+
| enabled | True |
| id | 3c1caa473bfe4390a11e7177894bcc7b |
| interface | public |
| region | RegionOne |
| region_id | RegionOne |
| service_id | 060d59eac51b4594815603d75a00aba2 |
| service_name | nova |
| service_type | compute |
| url | http://controller:8774/v2.1 |
+--------------+-------------------------------------------+
+--------------+-------------------------------------------+
| Field | Value |
+--------------+-------------------------------------------+
| enabled | True |
| id | e3c918de680746a586eac1f2d9bc10ab |
| interface | internal |
| region | RegionOne |
| region_id | RegionOne |
| service_id | 060d59eac51b4594815603d75a00aba2 |
| service_name | nova |
| service_type | compute |
| url | http://controller:8774/v2.1 |
+--------------+-------------------------------------------+
+--------------+-------------------------------------------+
| Field | Value |
+--------------+-------------------------------------------+
| enabled | True |
| id | 38f7af91666a47cfb97b4dc790b94424 |
| interface | admin |
| region | RegionOne |
| region_id | RegionOne |
| service_id | 060d59eac51b4594815603d75a00aba2 |
| service_name | nova |
| service_type | compute |
| url | http://controller:8774/v2.1 |
+--------------+-------------------------------------------+
Note: Default configuration files vary by distribution. You might need to add these sections and options
rather than modifying existing sections and options. Also, an ellipsis (...) in the configuration snippets
indicates potential default configuration options that you should retain.
Note: As of the Newton release, SUSE OpenStack packages are shipped with the upstream default con-
figuration files. For example, /etc/nova/nova.conf has customizations in /etc/nova/nova.
conf.d/010-nova.conf. While the following instructions modify the default configuration file,
adding a new file in /etc/nova/nova.conf.d achieves the same result.
# zypper install \
openstack-nova-api \
openstack-nova-scheduler \
openstack-nova-conductor \
openstack-nova-novncproxy \
iptables
[DEFAULT]
# ...
enabled_apis = osapi_compute,metadata
[api_database]
# ...
connection = mysql+pymysql://nova:NOVA_DBPASS@controller/nova_api
[database]
# ...
connection = mysql+pymysql://nova:NOVA_DBPASS@controller/nova
Replace NOVA_DBPASS with the password you chose for the Compute databases.
• In the [DEFAULT] section, configure RabbitMQ message queue access:
[DEFAULT]
# ...
transport_url = rabbit://openstack:RABBIT_PASS@controller:5672/
Replace RABBIT_PASS with the password you chose for the openstack account in
RabbitMQ.
• In the [api] and [keystone_authtoken] sections, configure Identity service access:
[api]
# ...
auth_strategy = keystone
[keystone_authtoken]
# ...
www_authenticate_uri = http://controller:5000/
auth_url = http://controller:5000/
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = nova
password = NOVA_PASS
Replace NOVA_PASS with the password you chose for the nova user in the Identity service.
Note: Comment out or remove any other options in the [keystone_authtoken] section.
• In the [DEFAULT] section, configure the my_ip option to use the management interface
IP address of the controller node:
[DEFAULT]
# ...
my_ip = 10.0.0.11
[vnc]
enabled = true
# ...
server_listen = $my_ip
server_proxyclient_address = $my_ip
• In the [glance] section, configure the location of the Image service API:
[glance]
# ...
api_servers = http://controller:9292
[oslo_concurrency]
# ...
lock_path = /var/run/nova
[placement]
# ...
region_name = RegionOne
project_domain_name = Default
project_name = service
auth_type = password
user_domain_name = Default
auth_url = http://controller:5000/v3
username = placement
password = PLACEMENT_PASS
Replace PLACEMENT_PASS with the password you choose for the placement service
user created when installing Placement. Comment out or remove any other options in the
[placement] section.
3. Populate the nova-api database:
Finalize installation
• Start the Compute services and configure them to start when the system boots:
# systemctl enable \
openstack-nova-api.service \
openstack-nova-scheduler.service \
openstack-nova-conductor.service \
openstack-nova-novncproxy.service
# systemctl start \
openstack-nova-api.service \
openstack-nova-scheduler.service \
openstack-nova-conductor.service \
openstack-nova-novncproxy.service
Install and configure controller node for Red Hat Enterprise Linux and CentOS
This section describes how to install and configure the Compute service, code-named nova, on the
controller node.
Prerequisites
Before you install and configure the Compute service, you must create databases, service credentials,
and API endpoints.
1. To create the databases, complete these steps:
• Use the database access client to connect to the database server as the root user:
$ mysql -u root -p
Create the nova_api, nova and nova_cell0 databases and grant the nova user access to them (GRANT ... IDENTIFIED BY 'NOVA_DBPASS';), as shown in the Ubuntu section above.
$ . admin-openrc
User Password:
Repeat User Password:
+---------------------+----------------------------------+
| Field | Value |
+---------------------+----------------------------------+
| domain_id | default |
| enabled | True |
| id | 8a7dbf5279404537b1c7b86c033620fe |
| name | nova |
| options | {} |
| password_expires_at | None |
+---------------------+----------------------------------+
+-------------+----------------------------------+
| Field | Value |
+-------------+----------------------------------+
| description | OpenStack Compute |
| enabled | True |
| id | 060d59eac51b4594815603d75a00aba2 |
| name | nova |
| type | compute |
+-------------+----------------------------------+
+--------------+-------------------------------------------+
| Field | Value |
+--------------+-------------------------------------------+
| enabled | True |
| id | 3c1caa473bfe4390a11e7177894bcc7b |
| interface | public |
| region | RegionOne |
| region_id | RegionOne |
| service_id | 060d59eac51b4594815603d75a00aba2 |
| service_name | nova |
| service_type | compute |
| url | http://controller:8774/v2.1 |
+--------------+-------------------------------------------+
+--------------+-------------------------------------------+
| Field | Value |
+--------------+-------------------------------------------+
| enabled | True |
| id | e3c918de680746a586eac1f2d9bc10ab |
| interface | internal |
| region | RegionOne |
| region_id | RegionOne |
| service_id | 060d59eac51b4594815603d75a00aba2 |
| service_name | nova |
| service_type | compute |
| url | http://controller:8774/v2.1 |
+--------------+-------------------------------------------+
+--------------+-------------------------------------------+
| Field | Value |
+--------------+-------------------------------------------+
| enabled | True |
| id | 38f7af91666a47cfb97b4dc790b94424 |
| interface | admin |
| region | RegionOne |
| region_id | RegionOne |
| service_id | 060d59eac51b4594815603d75a00aba2 |
| service_name | nova |
| service_type | compute |
| url | http://controller:8774/v2.1 |
+--------------+-------------------------------------------+
Note: Default configuration files vary by distribution. You might need to add these sections and options
rather than modifying existing sections and options. Also, an ellipsis (...) in the configuration snippets
indicates potential default configuration options that you should retain.
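Before editing the configuration, install the packages; a sketch assuming the standard Red Hat package names:
# yum install openstack-nova-api openstack-nova-conductor \
  openstack-nova-novncproxy openstack-nova-scheduler
Then edit the /etc/nova/nova.conf file and complete the following actions. In the [DEFAULT] section, enable only the compute and metadata APIs: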
[DEFAULT]
# ...
enabled_apis = osapi_compute,metadata
[api_database]
# ...
connection = mysql+pymysql://nova:NOVA_DBPASS@controller/nova_api
[database]
# ...
connection = mysql+pymysql://nova:NOVA_DBPASS@controller/nova
Replace NOVA_DBPASS with the password you chose for the Compute databases.
• In the [DEFAULT] section, configure RabbitMQ message queue access:
[DEFAULT]
# ...
transport_url = rabbit://openstack:RABBIT_PASS@controller:5672/
Replace RABBIT_PASS with the password you chose for the openstack account in
RabbitMQ.
• In the [api] and [keystone_authtoken] sections, configure Identity service access:
[api]
# ...
auth_strategy = keystone
[keystone_authtoken]
# ...
www_authenticate_uri = http://controller:5000/
auth_url = http://controller:5000/
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = nova
password = NOVA_PASS
Replace NOVA_PASS with the password you chose for the nova user in the Identity service.
Note: Comment out or remove any other options in the [keystone_authtoken] sec-
tion.
• In the [DEFAULT] section, configure the my_ip option to use the management interface
IP address of the controller node:
[DEFAULT]
# ...
my_ip = 10.0.0.11
[vnc]
enabled = true
# ...
server_listen = $my_ip
server_proxyclient_address = $my_ip
• In the [glance] section, configure the location of the Image service API:
[glance]
# ...
api_servers = http://controller:9292
[oslo_concurrency]
# ...
lock_path = /var/lib/nova/tmp
[placement]
# ...
region_name = RegionOne
project_domain_name = Default
project_name = service
auth_type = password
user_domain_name = Default
auth_url = http://controller:5000/v3
username = placement
password = PLACEMENT_PASS
Replace PLACEMENT_PASS with the password you choose for the placement service
user created when installing Placement. Comment out or remove any other options in the
[placement] section.
3. Populate the nova-api database:
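The database-population commands mirror the sequence shown earlier for Ubuntu. The table below is the output of verifying that cell0 and cell1 are registered correctly, for example:
# su -s /bin/sh -c "nova-manage cell_v2 list_cells" nova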
+-------+--------------------------------------+-----------------------------------------------------+---------------------------------------------------------------+----------+
| Name  | UUID                                 | Transport URL                                       | Database Connection                                           | Disabled |
+-------+--------------------------------------+-----------------------------------------------------+---------------------------------------------------------------+----------+
| cell0 | 00000000-0000-0000-0000-000000000000 | none:/                                              | mysql+pymysql://nova:****@controller/nova_cell0?charset=utf8  | False    |
| cell1 | f690f4fd-2bc5-4f15-8145-db561a7b9d3d | rabbit://openstack:****@controller:5672/nova_cell1  | mysql+pymysql://nova:****@controller/nova_cell1?charset=utf8  | False    |
+-------+--------------------------------------+-----------------------------------------------------+---------------------------------------------------------------+----------+
Finalize installation
• Start the Compute services and configure them to start when the system boots:
# systemctl enable \
openstack-nova-api.service \
openstack-nova-scheduler.service \
openstack-nova-conductor.service \
openstack-nova-novncproxy.service
# systemctl start \
openstack-nova-api.service \
openstack-nova-scheduler.service \
openstack-nova-conductor.service \
openstack-nova-novncproxy.service
This section describes how to install and configure the Compute service on a compute node for Ubuntu,
openSUSE and SUSE Linux Enterprise, and Red Hat Enterprise Linux and CentOS.
The service supports several hypervisors to deploy instances or virtual machines (VMs). For simplicity,
this configuration uses the Quick EMUlator (QEMU) hypervisor with the kernel-based VM (KVM) ex-
tension on compute nodes that support hardware acceleration for virtual machines. On legacy hardware,
this configuration uses the generic QEMU hypervisor. You can follow these instructions with minor
modifications to horizontally scale your environment with additional compute nodes.
Note: This section assumes that you are following the instructions in this guide step-by-step to configure
the first compute node. If you want to configure additional compute nodes, prepare them in a similar
fashion to the first compute node in the example architectures section. Each additional compute node
requires a unique IP address.
Install and configure a compute node for Ubuntu
This section describes how to install and configure the Compute service on a compute node. The ser-
vice supports several hypervisors to deploy instances or virtual machines (VMs). For simplicity, this
configuration uses the Quick EMUlator (QEMU) hypervisor with the kernel-based VM (KVM) exten-
sion on compute nodes that support hardware acceleration for virtual machines. On legacy hardware,
this configuration uses the generic QEMU hypervisor. You can follow these instructions with minor
modifications to horizontally scale your environment with additional compute nodes.
Note: This section assumes that you are following the instructions in this guide step-by-step to configure
the first compute node. If you want to configure additional compute nodes, prepare them in a similar
fashion to the first compute node in the example architectures section. Each additional compute node
requires a unique IP address.
Note: Default configuration files vary by distribution. You might need to add these sections and options
rather than modifying existing sections and options. Also, an ellipsis (...) in the configuration snippets
indicates potential default configuration options that you should retain.
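Install the package and configure the message queue first; a sketch for Ubuntu, assuming the standard package name:
# apt install nova-compute
Then, in the [DEFAULT] section of /etc/nova/nova.conf, configure RabbitMQ message queue access:
[DEFAULT]
# ...
transport_url = rabbit://openstack:RABBIT_PASS@controller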
Replace RABBIT_PASS with the password you chose for the openstack account in
RabbitMQ.
• In the [api] and [keystone_authtoken] sections, configure Identity service access:
[api]
# ...
auth_strategy = keystone
[keystone_authtoken]
# ...
www_authenticate_uri = http://controller:5000/
auth_url = http://controller:5000/
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = nova
password = NOVA_PASS
Replace NOVA_PASS with the password you chose for the nova user in the Identity service.
Note: Comment out or remove any other options in the [keystone_authtoken] sec-
tion.
[DEFAULT]
# ...
my_ip = MANAGEMENT_INTERFACE_IP_ADDRESS
[vnc]
# ...
enabled = true
server_listen = 0.0.0.0
server_proxyclient_address = $my_ip
novncproxy_base_url = http://controller:6080/vnc_auto.html
The server component listens on all IP addresses and the proxy component only listens on
the management interface IP address of the compute node. The base URL indicates the
location where you can use a web browser to access remote consoles of instances on this
compute node.
Note: If the web browser to access remote consoles resides on a host that cannot resolve the
controller hostname, you must replace controller with the management interface
IP address of the controller node.
• In the [glance] section, configure the location of the Image service API:
[glance]
# ...
api_servers = http://controller:9292
[oslo_concurrency]
# ...
lock_path = /var/lib/nova/tmp
[placement]
# ...
region_name = RegionOne
project_domain_name = Default
project_name = service
auth_type = password
user_domain_name = Default
auth_url = http://controller:5000/v3
username = placement
password = PLACEMENT_PASS
Replace PLACEMENT_PASS with the password you choose for the placement user in
the Identity service. Comment out any other options in the [placement] section.
Finalize installation
1. Determine whether your compute node supports hardware acceleration for virtual machines:
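A simple processor-flag count works for this check, for example:
$ egrep -c '(vmx|svm)' /proc/cpuinfo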
If this command returns a value of one or greater, your compute node supports hardware
acceleration which typically requires no additional configuration.
If this command returns a value of zero, your compute node does not support hardware acceler-
ation and you must configure libvirt to use QEMU instead of KVM.
• Edit the [libvirt] section in the /etc/nova/nova-compute.conf file as follows:
[libvirt]
# ...
virt_type = qemu
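Restart the Compute service after changing its configuration; for Ubuntu this is typically:
# service nova-compute restart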
1. Source the admin credentials to enable admin-only CLI commands, then confirm there are com-
pute hosts in the database:
$ . admin-openrc
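List the compute services and discover the new compute host; a sketch of the standard commands:
$ openstack compute service list --service nova-compute
# su -s /bin/sh -c "nova-manage cell_v2 discover_hosts --verbose" nova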
Note: When you add new compute nodes, you must run nova-manage cell_v2
discover_hosts on the controller node to register those new compute nodes. Alternatively,
you can set an appropriate interval in /etc/nova/nova.conf:
[scheduler]
discover_hosts_in_cells_interval = 300
Install and configure a compute node for Red Hat Enterprise Linux and CentOS
This section describes how to install and configure the Compute service on a compute node. The ser-
vice supports several hypervisors to deploy instances or virtual machines (VMs). For simplicity, this
configuration uses the Quick EMUlator (QEMU) hypervisor with the kernel-based VM (KVM) exten-
sion on compute nodes that support hardware acceleration for virtual machines. On legacy hardware,
this configuration uses the generic QEMU hypervisor. You can follow these instructions with minor
modifications to horizontally scale your environment with additional compute nodes.
Note: This section assumes that you are following the instructions in this guide step-by-step to configure
the first compute node. If you want to configure additional compute nodes, prepare them in a similar
fashion to the first compute node in the example architectures section. Each additional compute node
requires a unique IP address.
Note: Default configuration files vary by distribution. You might need to add these sections and options
rather than modifying existing sections and options. Also, an ellipsis (...) in the configuration snippets
indicates potential default configuration options that you should retain.
[DEFAULT]
# ...
enabled_apis = osapi_compute,metadata
[DEFAULT]
# ...
transport_url = rabbit://openstack:RABBIT_PASS@controller
Replace RABBIT_PASS with the password you chose for the openstack account in
RabbitMQ.
• In the [api] and [keystone_authtoken] sections, configure Identity service access:
[api]
# ...
auth_strategy = keystone
[keystone_authtoken]
# ...
www_authenticate_uri = http://controller:5000/
auth_url = http://controller:5000/
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = nova
password = NOVA_PASS
Replace NOVA_PASS with the password you chose for the nova user in the Identity service.
Note: Comment out or remove any other options in the [keystone_authtoken] sec-
tion.
[DEFAULT]
# ...
my_ip = MANAGEMENT_INTERFACE_IP_ADDRESS
[vnc]
# ...
enabled = true
server_listen = 0.0.0.0
server_proxyclient_address = $my_ip
novncproxy_base_url = http://controller:6080/vnc_auto.html
The server component listens on all IP addresses and the proxy component only listens on
the management interface IP address of the compute node. The base URL indicates the
location where you can use a web browser to access remote consoles of instances on this
compute node.
Note: If the web browser to access remote consoles resides on a host that cannot resolve the
controller hostname, you must replace controller with the management interface
IP address of the controller node.
• In the [glance] section, configure the location of the Image service API:
[glance]
# ...
api_servers = http://controller:9292
[oslo_concurrency]
# ...
lock_path = /var/lib/nova/tmp
[placement]
# ...
region_name = RegionOne
project_domain_name = Default
project_name = service
auth_type = password
user_domain_name = Default
auth_url = http://controller:5000/v3
username = placement
password = PLACEMENT_PASS
Replace PLACEMENT_PASS with the password you choose for the placement user in
the Identity service. Comment out any other options in the [placement] section.
Finalize installation
1. Determine whether your compute node supports hardware acceleration for virtual machines:
If this command returns a value of one or greater, your compute node supports hardware
acceleration which typically requires no additional configuration.
If this command returns a value of zero, your compute node does not support hardware acceler-
ation and you must configure libvirt to use QEMU instead of KVM.
• Edit the [libvirt] section in the /etc/nova/nova.conf file as follows:
[libvirt]
# ...
virt_type = qemu
2. Start the Compute service including its dependencies and configure them to start automatically
when the system boots:
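A sketch, assuming the standard Red Hat service names:
# systemctl enable libvirtd.service openstack-nova-compute.service
# systemctl start libvirtd.service openstack-nova-compute.service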
1. Source the admin credentials to enable admin-only CLI commands, then confirm there are com-
pute hosts in the database:
$ . admin-openrc
Note: When you add new compute nodes, you must run nova-manage cell_v2
discover_hosts on the controller node to register those new compute nodes. Alternatively,
you can set an appropriate interval in /etc/nova/nova.conf:
[scheduler]
discover_hosts_in_cells_interval = 300
Install and configure a compute node for openSUSE and SUSE Linux Enterprise
This section describes how to install and configure the Compute service on a compute node. The ser-
vice supports several hypervisors to deploy instances or virtual machines (VMs). For simplicity, this
configuration uses the Quick EMUlator (QEMU) hypervisor with the kernel-based VM (KVM) exten-
sion on compute nodes that support hardware acceleration for virtual machines. On legacy hardware,
this configuration uses the generic QEMU hypervisor. You can follow these instructions with minor
modifications to horizontally scale your environment with additional compute nodes.
Note: This section assumes that you are following the instructions in this guide step-by-step to configure
the first compute node. If you want to configure additional compute nodes, prepare them in a similar
fashion to the first compute node in the example architectures section. Each additional compute node
requires a unique IP address.
Note: Default configuration files vary by distribution. You might need to add these sections and options
rather than modifying existing sections and options. Also, an ellipsis (...) in the configuration snippets
indicates potential default configuration options that you should retain.
[DEFAULT]
# ...
enabled_apis = osapi_compute,metadata
[DEFAULT]
# ...
compute_driver = libvirt.LibvirtDriver
[DEFAULT]
# ...
transport_url = rabbit://openstack:RABBIT_PASS@controller
Replace RABBIT_PASS with the password you chose for the openstack account in
RabbitMQ.
• In the [api] and [keystone_authtoken] sections, configure Identity service access:
[api]
# ...
auth_strategy = keystone
[keystone_authtoken]
# ...
www_authenticate_uri = http://controller:5000/
auth_url = http://controller:5000/
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = nova
password = NOVA_PASS
Replace NOVA_PASS with the password you chose for the nova user in the Identity service.
Note: Comment out or remove any other options in the [keystone_authtoken] sec-
tion.
[DEFAULT]
# ...
my_ip = MANAGEMENT_INTERFACE_IP_ADDRESS
[vnc]
# ...
enabled = true
server_listen = 0.0.0.0
server_proxyclient_address = $my_ip
novncproxy_base_url = http://controller:6080/vnc_auto.html
The server component listens on all IP addresses and the proxy component only listens on
the management interface IP address of the compute node. The base URL indicates the
location where you can use a web browser to access remote consoles of instances on this
compute node.
Note: If the web browser to access remote consoles resides on a host that cannot resolve the
controller hostname, you must replace controller with the management interface
IP address of the controller node.
• In the [glance] section, configure the location of the Image service API:
[glance]
# ...
api_servers = http://controller:9292
[oslo_concurrency]
# ...
lock_path = /var/run/nova
[placement]
# ...
region_name = RegionOne
project_domain_name = Default
project_name = service
auth_type = password
user_domain_name = Default
auth_url = http://controller:5000/v3
username = placement
password = PLACEMENT_PASS
Replace PLACEMENT_PASS with the password you choose for the placement user in
the Identity service. Comment out any other options in the [placement] section.
3. Ensure the kernel module nbd is loaded.
# modprobe nbd
4. Ensure the module loads on every boot by adding nbd to the /etc/modules-load.d/nbd.
conf file.
Finalize installation
1. Determine whether your compute node supports hardware acceleration for virtual machines:
If this command returns a value of one or greater, your compute node supports hardware
acceleration which typically requires no additional configuration.
If this command returns a value of zero, your compute node does not support hardware acceler-
ation and you must configure libvirt to use QEMU instead of KVM.
• Edit the [libvirt] section in the /etc/nova/nova.conf file as follows:
[libvirt]
# ...
virt_type = qemu
2. Start the Compute service including its dependencies and configure them to start automatically
when the system boots:
1. Source the admin credentials to enable admin-only CLI commands, then confirm there are com-
pute hosts in the database:
$ . admin-openrc
Note: When you add new compute nodes, you must run nova-manage cell_v2
discover_hosts on the controller node to register those new compute nodes. Alternatively,
you can set an appropriate interval in /etc/nova/nova.conf:
[scheduler]
discover_hosts_in_cells_interval = 300
$ . admin-openrc
2. List service components to verify successful launch and registration of each process:
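For example:
$ openstack compute service list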
+----+----------------+------------+----------+---------+-------+----------------------------+
| Id | Binary         | Host       | Zone     | Status  | State | Updated At                 |
+----+----------------+------------+----------+---------+-------+----------------------------+
| 1  | nova-scheduler | controller | internal | enabled | up    | 2016-02-09T23:11:15.000000 |
| 2  | nova-conductor | controller | internal | enabled | up    | 2016-02-09T23:11:16.000000 |
| 3  | nova-compute   | compute1   | nova     | enabled | up    | 2016-02-09T23:11:20.000000 |
+----+----------------+------------+----------+---------+-------+----------------------------+
Note: This output should indicate two service components enabled on the controller node and
one service component enabled on the compute node.
3. List API endpoints in the Identity service to verify connectivity with the Identity service:
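For example:
$ openstack catalog list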
Note: The endpoint list below may differ depending on which OpenStack components are installed.
+-----------+-----------+-----------------------------------------+
| Name | Type | Endpoints |
+-----------+-----------+-----------------------------------------+
| keystone | identity | RegionOne |
| | | public: http://controller:5000/v3/ |
| | | RegionOne |
| | | internal: http://controller:5000/v3/ |
| | | RegionOne |
| | | admin: http://controller:5000/v3/ |
| | | |
+-----------+-----------+-----------------------------------------+
4. List images in the Image service to verify connectivity with the Image service:
$ openstack image list
+--------------------------------------+-------------+-------------+
| ID | Name | Status |
+--------------------------------------+-------------+-------------+
| 9a76d9f9-9620-4f2e-8c69-6c5691fae163 | cirros | active |
+--------------------------------------+-------------+-------------+
5. Check the cells and placement API are working successfully and that other necessary prerequisites
are in place:
# nova-status upgrade check
+--------------------------------------------------------------------+
| Upgrade Check Results |
+--------------------------------------------------------------------+
| Check: Cells v2 |
| Result: Success |
| Details: None |
+--------------------------------------------------------------------+
| Check: Placement API |
| Result: Success |
| Details: None |
+--------------------------------------------------------------------+
| Check: Cinder API |
| Result: Success |
| Details: None |
+--------------------------------------------------------------------+
There is information you might want to consider before doing your deployment, especially if it is going
to be a larger deployment. For smaller deployments the defaults from the install guide will be sufficient.
• Compute Driver Features Supported: While the majority of nova deployments use libvirt/kvm,
you can use nova with other compute drivers. Nova attempts to provide a unified feature set across these; however, not all features are implemented on all backends, and not all features are equally well tested.
– Feature Support by Use Case: A view of what features each driver supports based on what's important to some large use cases (General Purpose Cloud, NFV Cloud, HPC Cloud).
– Feature Support full list: A detailed dive through features in each compute driver backend.
• Cells v2 Planning: For large deployments, Cells v2 allows sharding of your compute environment.
Upfront planning is key to a successful Cells v2 layout.
• Running nova-api on wsgi: Considerations for using a real WSGI container instead of the baked-in
eventlet web server.
This document presents a matrix that describes which features are ready to be used and which features
are works in progress. It includes links to relevant documentation and functional tests.
3.3.1.1 Aims
Users want reliable, long-term solutions for their use cases. The feature classification matrix identifies
which features are complete and ready to use, and which should be used with caution.
The matrix also benefits developers by providing a list of features that require further work to be con-
sidered complete.
Below is a matrix for a selection of important verticals:
• General Purpose Cloud Features
• NFV Cloud Features
• HPC Cloud Features
For more details on the concepts in each matrix, please see Notes on Concepts.
This is a summary of the key features that dev/test clouds and other similar general purpose clouds need, and it describes their current state.
Below are sections on NFV and HPC specific features; these look at features and scenarios that are important to those more specific sets of use cases.
Summary
Details
• Create Server and Delete Server This includes creating a server, and deleting a server. Specif-
ically this is about booting a server from a glance image using the default disk and network con-
figuration.
info:
– Maturity: complete
– API Docs: https://docs.openstack.org/api-ref/compute/#servers-servers
– Admin Docs: https://docs.openstack.org/nova/latest/user/launch-instances.html
– Tempest tests: 9a438d88-10c6-4bcd-8b5b-5b6e25e1346f, 585e934c-448e-43c4-acbf-
d06a9b899997
drivers:
– libvirt+kvm (x86 & ppc64): complete
– libvirt+kvm (s390x): unknown
– libvirt+virtuozzo CT: partial
– libvirt+virtuozzo VM: partial
– VMware CI: complete
– Hyper-V CI: complete
– Ironic CI: unknown
– IBM PowerVM CI: complete
– IBM zVM CI: complete
• Snapshot Server This is creating a glance image from the currently running server.
info:
– Maturity: complete
– API Docs: https://docs.openstack.org/api-ref/compute/?expanded=
#servers-run-an-action-servers-action
– Admin Docs: https://docs.openstack.org/glance/latest/admin/troubleshooting.html
– Tempest tests: aaacd1d0-55a2-4ce8-818a-b5439df8adc9
drivers:
– libvirt+kvm (x86 & ppc64): complete
– libvirt+kvm (s390x): unknown
– libvirt+virtuozzo CT: partial
– libvirt+virtuozzo VM: partial
– VMware CI: unknown
– Hyper-V CI: unknown
– Ironic CI: unknown
– IBM PowerVM CI: complete
– IBM zVM CI: complete
• Server power ops This includes reboot, shutdown and start.
info:
– Maturity: complete
– API Docs: https://docs.openstack.org/api-ref/compute/?expanded=
#servers-run-an-action-servers-action
– Admin Docs:
– Tempest tests: 2cb1baf6-ac8d-4429-bf0d-ba8a0ba53e32, af8eafd4-38a7-4a4b-bdbc-
75145a580560
drivers:
– libvirt+kvm (x86 & ppc64): complete
– libvirt+kvm (s390x): unknown
– libvirt+virtuozzo CT: partial
– libvirt+virtuozzo VM: partial
– VMware CI: complete
– Hyper-V CI: complete
– Ironic CI: unknown
– IBM PowerVM CI: complete
– IBM zVM CI: complete
• Rebuild Server You can rebuild a server, optionally specifying the glance image to use.
info:
– Maturity: complete
– API Docs: https://docs.openstack.org/api-ref/compute/?expanded=
#servers-run-an-action-servers-action
– Admin Docs:
– Tempest tests: aaa6cdf3-55a7-461a-add9-1c8596b9a07c
drivers:
– libvirt+kvm (x86 & ppc64): complete
– libvirt+kvm (s390x): unknown
– libvirt+virtuozzo CT: partial
– libvirt+virtuozzo VM: partial
– VMware CI: complete
– Hyper-V CI: complete
– Ironic CI: unknown
– IBM PowerVM CI: missing
– IBM zVM CI: missing
• Resize Server You resize a server to a new flavor, then confirm or revert that operation.
info:
– Maturity: complete
– API Docs: https://docs.openstack.org/api-ref/compute/?expanded=
#servers-run-an-action-servers-action
– Admin Docs:
– Tempest tests: 1499262a-9328-4eda-9068-db1ac57498d2
drivers:
– libvirt+kvm (x86 & ppc64): complete
– libvirt+kvm (s390x): unknown
– libvirt+virtuozzo CT: complete
– libvirt+virtuozzo VM: partial
– VMware CI: complete
– Hyper-V CI: complete
– Ironic CI: unknown
– IBM PowerVM CI: missing
– IBM zVM CI: missing
• Volume Operations This is about attaching and detaching volumes.
info:
– Maturity: complete
– API Docs: https://docs.openstack.org/api-ref/compute/
#servers-with-volume-attachments-servers-os-volume-attachments
– Admin Docs: https://docs.openstack.org/cinder/latest/admin/
blockstorage-manage-volumes.html
– Tempest tests: fff42874-7db5-4487-a8e1-ddda5fb5288d
drivers:
– libvirt+kvm (x86 & ppc64): complete
– libvirt+kvm (s390x): unknown
– libvirt+virtuozzo CT: complete
– libvirt+virtuozzo VM: complete
– VMware CI: complete
– Hyper-V CI: complete
– Ironic CI: missing
– IBM PowerVM CI: complete
– IBM zVM CI: missing
• Custom disk configurations on boot This is about supporting all the features of BDMv2. This
includes booting from a volume, in various ways, and specifying a custom set of ephemeral disks.
Note that some drivers only support part of what the API allows.
info:
– Maturity: complete
– API Docs: https://docs.openstack.org/api-ref/compute/?expanded=
create-image-createimage-action-detail#create-server
– Admin Docs: https://docs.openstack.org/nova/latest/user/block-device-mapping.html
– Tempest tests: 557cd2c2-4eb8-4dce-98be-f86765ff311b, 36c34c67-7b54-4b59-b188-
02a2f458a63b
drivers:
– libvirt+kvm (x86 & ppc64): complete
– libvirt+kvm (s390x): unknown
– libvirt+virtuozzo CT: missing
– libvirt+virtuozzo VM: complete
– VMware CI: partial
– Hyper-V CI: complete (updated in N release)
– Ironic CI: missing
– IBM PowerVM CI: missing
– IBM zVM CI: missing
• Custom neutron configurations on boot This is about supporting booting from one or more neutron ports, and all the related shortcuts such as booting on a specified network. This does not include SR-IOV or similar, just simple neutron ports.
info:
– Maturity: complete
– API Docs: https://docs.openstack.org/api-ref/compute/?&expanded=create-server-detail
– Admin Docs:
– Tempest tests: 2f3a0127-95c7-4977-92d2-bc5aec602fb4
drivers:
– libvirt+kvm (x86 & ppc64): complete
– libvirt+kvm (s390x): unknown
– libvirt+virtuozzo CT: unknown
– libvirt+virtuozzo VM: unknown
– VMware CI: partial
– Hyper-V CI: partial
– Ironic CI: missing
– IBM PowerVM CI: complete
– IBM zVM CI: partial
• Pause a Server This pauses and unpauses a server, where the state is held in memory.
info:
– Maturity: complete
– API Docs: https://docs.openstack.org/api-ref/compute/?#pause-server-pause-action
– Admin Docs:
– Tempest tests: bd61a9fd-062f-4670-972b-2d6c3e3b9e73
drivers:
– libvirt+kvm (x86 & ppc64): complete
– libvirt+kvm (s390x): unknown
– libvirt+virtuozzo CT: missing
– libvirt+virtuozzo VM: partial
– VMware CI: partial
– Hyper-V CI: complete
– Ironic CI: missing
– IBM PowerVM CI: missing
– IBM zVM CI: complete
• Suspend a Server This suspends and resumes a server, where the state is held on disk.
info:
– Maturity: complete
– API Docs: https://docs.openstack.org/api-ref/compute/?expanded=
suspend-server-suspend-action-detail
– Admin Docs:
– Tempest tests: 0d8ee21e-b749-462d-83da-b85b41c86c7f
drivers:
– libvirt+kvm (x86 & ppc64): complete
– libvirt+kvm (s390x): unknown
– libvirt+virtuozzo CT: partial
– libvirt+virtuozzo VM: partial
– VMware CI: complete
– Hyper-V CI: complete
– Ironic CI: missing
– IBM PowerVM CI: missing
– IBM zVM CI: missing
• Server console output This gets the current server console output.
info:
– Maturity: complete
– API Docs: https://docs.openstack.org/api-ref/compute/
#show-console-output-os-getconsoleoutput-action
– Admin Docs:
– Tempest tests: 4b8867e6-fffa-4d54-b1d1-6fdda57be2f3
drivers:
– libvirt+kvm (x86 & ppc64): complete
– libvirt+kvm (s390x): unknown
– libvirt+virtuozzo CT: unknown
– libvirt+virtuozzo VM: unknown
– VMware CI: partial
– Hyper-V CI: partial
– Ironic CI: missing
– IBM PowerVM CI: complete
– IBM zVM CI: complete
• Server Rescue This boots a server with a new root disk from the specified glance image to allow
a user to fix a boot partition configuration, or similar.
info:
– Maturity: complete
– API Docs: https://docs.openstack.org/api-ref/compute/#rescue-server-rescue-action
– Admin Docs:
– Tempest tests: fd032140-714c-42e4-a8fd-adcd8df06be6, 70cdb8a1-89f8-437d-9448-
8844fd82bf46
drivers:
– libvirt+kvm (x86 & ppc64): complete
– libvirt+kvm (s390x): unknown
– libvirt+virtuozzo CT: partial
– libvirt+virtuozzo VM: complete
– VMware CI: complete
– Hyper-V CI: partial
– Ironic CI: missing
– IBM PowerVM CI: missing
– IBM zVM CI: missing
• Server Config Drive This ensures the user data provided by the user when booting a server is
available in one of the expected config drive locations.
info:
– Maturity: complete
– API Docs: https://docs.openstack.org/api-ref/compute/#create-server
– Admin Docs: https://docs.openstack.org/nova/latest/admin/config-drive.html
– Tempest tests: 7fff3fb3-91d8-4fd0-bd7d-0204f1f180ba
drivers:
– libvirt+kvm (x86 & ppc64): complete
– libvirt+kvm (s390x): unknown
– libvirt+virtuozzo CT: missing
– libvirt+virtuozzo VM: partial
– VMware CI: complete
– Hyper-V CI: complete
– Ironic CI: partial
– IBM PowerVM CI: complete
– IBM zVM CI: complete
• Server Change Password The ability to reset the password of a user within the server.
info:
– Maturity: experimental
– API Docs: https://docs.openstack.org/api-ref/compute/
#change-administrative-password-changepassword-action
– Admin Docs:
– Tempest tests: 6158df09-4b82-4ab3-af6d-29cf36af858d
drivers:
– libvirt+kvm (x86 & ppc64): partial
– libvirt+kvm (s390x): unknown
– libvirt+virtuozzo CT: missing
– libvirt+virtuozzo VM: missing
– VMware CI: missing
– Hyper-V CI: partial
– Ironic CI: missing
– IBM PowerVM CI: missing
– IBM zVM CI: missing
• Server Shelve and Unshelve The ability to keep a server logically alive, but not using any cloud
resources. For local disk based instances, this involves taking a snapshot, called offloading.
info:
– Maturity: complete
Network Function Virtualization (NFV) is about virtualizing network node functions into building blocks that may connect, or chain together, to create a particular service. It is common for these workloads to need bare-metal-like performance, i.e. low latency and close to line-speed performance.
Important: In deployments older than Train, or in mixed Stein/Train deployments with a rolling
upgrade in progress, unless specifically enabled, live migration is not possible for instances
with a NUMA topology when using the libvirt driver. A NUMA topology may be specified explicitly or
can be added implicitly due to the use of CPU pinning or huge pages. Refer to bug #1289064 for more
information. As of Train, live migration of instances with a NUMA topology when using the libvirt
driver is fully supported.
Summary
Details
• NUMA Placement Configure placement of instance vCPUs and memory across host NUMA
nodes
info:
– Maturity: experimental
– API Docs: https://docs.openstack.org/api-ref/compute/#create-server
– Admin Docs: https://docs.openstack.org/nova/latest/admin/cpu-topologies.html#
customizing-instance-cpu-pinning-policies
– Tempest tests: 9a438d88-10c6-4bcd-8b5b-5b6e25e1346f, 585e934c-448e-43c4-acbf-
d06a9b899997
drivers:
– libvirt+kvm (x86 & ppc64): partial
– libvirt+kvm (s390x): unknown
• CPU Pinning Policy Enable/disable binding of instance vCPUs to host CPUs
info:
– Maturity: experimental
– API Docs: https://docs.openstack.org/api-ref/compute/#create-server
– Admin Docs: https://docs.openstack.org/nova/latest/admin/cpu-topologies.html#
customizing-instance-cpu-pinning-policies
– Tempest tests:
drivers:
– libvirt+kvm (x86 & ppc64): partial
– libvirt+kvm (s390x): unknown
• CPU Pinning Thread Policy Configure usage of host hardware threads when pinning is used
info:
– Maturity: experimental
– API Docs: https://docs.openstack.org/api-ref/compute/#create-server
– Admin Docs: https://docs.openstack.org/nova/latest/admin/cpu-topologies.html#
customizing-instance-cpu-pinning-policies
– Tempest tests:
drivers:
– libvirt+kvm (x86 & ppc64): partial
– libvirt+kvm (s390x): unknown
High Performance Compute (HPC) clouds have some specific needs that are covered in this set of features.
Summary
Details
• GPU Passthrough The PCI passthrough feature in OpenStack allows full access and direct control
of a physical PCI device in guests. This mechanism is generic for any devices that can be attached
to a PCI bus. Correct driver installation is the only requirement for the guest to properly use the
devices.
info:
– Maturity: experimental
– API Docs: https://docs.openstack.org/api-ref/compute/#create-server
– Admin Docs: https://docs.openstack.org/nova/latest/admin/pci-passthrough.html
– Tempest tests: 9a438d88-10c6-4bcd-8b5b-5b6e25e1346f, 585e934c-448e-43c4-acbf-
d06a9b899997
drivers:
– libvirt+kvm (x86 & ppc64): complete (updated in L release)
– libvirt+kvm (s390x): unknown
– libvirt+virtuozzo CT: partial
– libvirt+virtuozzo VM: partial
– VMware CI: missing
– Hyper-V CI: missing
– Ironic: unknown
– PowerVM CI: missing
• Virtual GPUs Attach a virtual GPU to an instance at server creation time
info:
– Maturity: experimental
Users
Note: This is not an exhaustive list of personas, but rather an indicative set of users.
Feature Group
To reduce the size of the matrix, we organize the features into groups. Each group maps to a set of user
stories that can be validated by a set of scenarios and tests. Typically, this means a set of tempest tests.
This list focuses on API concepts like attach and detach volumes, rather than deployment specific con-
cepts like attach an iSCSI volume to a KVM based VM.
Deployment
A deployment maps to a specific test environment. We provide a full description of the environment, so
it is possible to reproduce the reported test results for each of the Feature Groups.
This description includes all aspects of the deployment, for example the hypervisor, number of nova-
compute services, storage, network driver, and types of images being tested.
The Feature Group Maturity rating is specific to the API concepts, rather than specific to a particular
deployment. That detail is covered in the deployment rating for each feature group.
Note: Although having some similarities, this list is not directly related to the DefCore effort.
The deployment rating refers to the state of the tests for each Feature Group on a particular deployment.
Deployment ratings:
Unknown No data is available.
Not Implemented No tests exist.
Implemented Self declared that the tempest tests pass.
Regularly Tested Tested by third party CI.
Checked Tested as part of the check or gate queue.
The eventual goal is to automate this list from a third party CI reporting system, but currently we docu-
ment manual inspections in an ini file. Ideally, we will review the list at every milestone.
When considering which capabilities should be marked as mandatory, the following general guiding principles were applied:
• Inclusivity - people have shown the ability to make effective use of a wide range of virtualization technologies with broadly varying feature sets. Aiming to keep the requirements as inclusive as possible avoids second-guessing what a user may wish to use the cloud compute service for.
• Bootstrapping - a practical use case test is to consider that the starting point for the compute deployment is an empty data center with new machines and network connectivity. Then look at the minimum features required of a compute service in order to get user instances running and processing work over the network.
• Competition - an early leader in the cloud compute service space was Amazon EC2. A sanity
check for whether a feature should be mandatory is to consider whether it was available in the first
public release of EC2. This had quite a narrow feature set, but none the less found very high usage
in many use cases. So it serves to illustrate that many features need not be considered mandatory
in order to get useful work done.
• Reality - there are many virt drivers currently shipped with Nova, each with their own supported
feature set. Any feature which is missing in at least one virt driver that is already in-tree, must
by inference be considered optional until all in-tree drivers support it. This does not rule out
the possibility of a currently optional feature becoming mandatory at a later date, based on other
principles above.
Summary
Details
• Attach block volume to instance Status: optional.
CLI commands:
– nova volume-attach <server> <volume>
Notes: The attach volume operation provides a means to hotplug additional block storage to a
running instance. This allows storage capabilities to be expanded without interruption of service.
In a cloud model it would be more typical to just spin up a new instance with large storage, so the
ability to hotplug extra storage is for those cases where the instance is considered to be more of a
pet than cattle. Therefore this operation is not considered to be mandatory to support.
Driver Support:
– Hyper-V: complete
– Ironic: missing
– Libvirt KVM (aarch64): complete
– Libvirt KVM (ppc64): complete
– Libvirt KVM (s390x): complete
– Libvirt KVM (x86): complete
– Libvirt LXC: missing
– Libvirt QEMU (x86): complete
– Libvirt Virtuozzo CT: missing
– Libvirt Virtuozzo VM: complete
– PowerVM: complete Notes: This is not tested for every CI run. Add a powervm:volume-
check comment to trigger a CI job running volume tests.
– VMware vCenter: complete
– zVM: missing
• Attach tagged block device to instance Status: optional.
CLI commands:
– nova volume-attach <server> <volume> [--tag <tag>]
Notes: Attach a block device with a tag to an existing server instance. See Device tags for more
information.
Driver Support:
– Hyper-V: missing
– Ironic: missing
– Libvirt KVM (aarch64): complete
– Libvirt KVM (ppc64): complete
– Libvirt KVM (s390x): complete
– Libvirt KVM (x86): complete
– Libvirt LXC: missing
– Libvirt QEMU (x86): complete
– Libvirt Virtuozzo CT: missing
– Libvirt Virtuozzo VM: complete
– PowerVM: missing
– VMware vCenter: missing
– zVM: missing
• Detach block volume from instance Status: optional.
CLI commands:
– PowerVM: complete Notes: This is not tested for every CI run. Add a powervm:volume-
check comment to trigger a CI job running volume tests.
– VMware vCenter: missing
– zVM: missing
• Attach virtual network interface to instance Status: optional.
CLI commands:
– nova interface-attach <server>
Notes: The attach interface operation provides a means to hotplug additional interfaces to a run-
ning instance. Hotplug support varies between guest OSes and some guests require a reboot for
new interfaces to be detected. This operation allows interface capabilities to be expanded without
interruption of service. In a cloud model it would be more typical to just spin up a new instance
with more interfaces.
Driver Support:
– Hyper-V: partial Notes: Works without issue if instance is off. When hotplugging, only
works if using Windows/Hyper-V Server 2016 and the instance is a Generation 2 VM.
– Ironic: complete
– Libvirt KVM (aarch64): complete
– Libvirt KVM (ppc64): complete
– Libvirt KVM (s390x): complete
– Libvirt KVM (x86): complete
– Libvirt LXC: missing
– Libvirt QEMU (x86): complete
– Libvirt Virtuozzo CT: complete
– Libvirt Virtuozzo VM: complete
– PowerVM: complete
– VMware vCenter: complete
– zVM: missing
• Attach tagged virtual network interface to instance Status: optional.
CLI commands:
– nova interface-attach <server> [--tag <tag>]
Notes: Attach a virtual network interface with a tag to an existing server instance. See Device
tags for more information.
Driver Support:
– Hyper-V: complete
– Ironic: missing
– Libvirt KVM (aarch64): unknown
– Libvirt KVM (ppc64): complete
– Hyper-V: missing
– Ironic: missing
– Libvirt KVM (aarch64): missing
– Libvirt KVM (ppc64): missing
– Libvirt KVM (s390x): missing
– Libvirt KVM (x86): missing
– Libvirt LXC: missing
– Libvirt QEMU (x86): missing
– Libvirt Virtuozzo CT: missing
– Libvirt Virtuozzo VM: missing
– PowerVM: missing
– VMware vCenter: missing
– zVM: missing
• Evacuate instances from a host Status: optional.
CLI commands:
– nova evacuate <server>
– nova host-evacuate <host>
Notes: A possible failure scenario in a cloud environment is the outage of one of the compute
nodes. In such a case the instances of the down host can be evacuated to another host. It is assumed
that the old host is unlikely ever to be powered back on, otherwise the evacuation attempt will be
rejected. When the instances get moved to the new host, their volumes get re-attached and the
locally stored data is dropped. That happens in the same way as a rebuild. This is not considered
to be a mandatory operation to support.
Driver Support:
– Hyper-V: unknown
– Ironic: unknown
– Libvirt KVM (aarch64): complete
– Libvirt KVM (ppc64): unknown
– Libvirt KVM (s390x): complete
– Libvirt KVM (x86): complete
– Libvirt LXC: unknown
– Libvirt QEMU (x86): unknown
– Libvirt Virtuozzo CT: missing
– Libvirt Virtuozzo VM: missing
– PowerVM: missing
– VMware vCenter: unknown
– zVM: unknown
• Rebuild instance Status: optional.
CLI commands:
– nova rebuild <server> <image>
Notes: A possible use case is when additional attributes need to be set on the instance; nova purges
all existing data from the system and remakes the VM with the given information, such as metadata
and personalities. This is not considered to be a mandatory operation to support.
Driver Support:
– Hyper-V: complete
– Ironic: complete
– Libvirt KVM (aarch64): complete
– Libvirt KVM (ppc64): complete
– Libvirt KVM (s390x): complete
– Libvirt KVM (x86): complete
– Libvirt LXC: complete
– Libvirt QEMU (x86): complete
– Libvirt Virtuozzo CT: complete
– Libvirt Virtuozzo VM: complete
– PowerVM: missing
– VMware vCenter: complete
– zVM: unknown
• Guest instance status Status: mandatory.
Notes: Provides realtime information about the power state of the guest instance. Since the power
state is used by the compute manager for tracking changes in guests, this operation is considered
mandatory to support.
Driver Support:
– Hyper-V: complete
– Ironic: complete
– Libvirt KVM (aarch64): complete
– Libvirt KVM (ppc64): complete
– Libvirt KVM (s390x): complete
– Libvirt KVM (x86): complete
– Libvirt LXC: complete
– Libvirt QEMU (x86): complete
– Libvirt Virtuozzo CT: complete
– Libvirt Virtuozzo VM: complete
– PowerVM: complete
– VMware vCenter: complete
– zVM: complete
• Guest host uptime Status: optional.
Notes: Returns the host uptime since power on; it is used to report hypervisor status.
Driver Support:
– Hyper-V: complete
– Ironic: missing
– Libvirt KVM (aarch64): complete
– Libvirt KVM (ppc64): complete
– Libvirt KVM (s390x): complete
– Libvirt KVM (x86): complete
– Libvirt LXC: complete
– Libvirt QEMU (x86): complete
– Libvirt Virtuozzo CT: complete
– Libvirt Virtuozzo VM: complete
– PowerVM: complete
– VMware vCenter: missing
– zVM: complete
• Guest host ip Status: optional.
Notes: Returns the IP address of this host; it is used when doing resize and migration.
Driver Support:
– Hyper-V: complete
– Ironic: missing
– Libvirt KVM (aarch64): complete
– Libvirt KVM (ppc64): complete
– Libvirt KVM (s390x): complete
– Libvirt KVM (x86): complete
– Libvirt LXC: complete
– Libvirt QEMU (x86): complete
– Libvirt Virtuozzo CT: complete
– Libvirt Virtuozzo VM: complete
– PowerVM: complete
– VMware vCenter: complete
– zVM: complete
• Live migrate instance across hosts Status: optional.
CLI commands:
– nova live-migration <server>
– nova host-evacuate-live <host>
Notes: Live migration provides a way to move an instance off one compute host, to another
compute host. Administrators may use this to evacuate instances from a host that needs to undergo
maintenance tasks, though of course this may not help if the host is already suffering a failure. In
general instances are considered cattle rather than pets, so it is expected that an instance is liable
to be killed if host maintenance is required. It is technically challenging for some hypervisors to
provide support for the live migration operation, particularly those built on the container based
virtualization. Therefore this operation is not considered mandatory to support.
Driver Support:
– Hyper-V: complete
– Ironic: missing
– Libvirt KVM (aarch64): missing
– Libvirt KVM (ppc64): complete
– Libvirt KVM (s390x): complete
– Libvirt KVM (x86): complete
– Libvirt LXC: missing
– Libvirt QEMU (x86): complete
– Libvirt Virtuozzo CT: complete
– Libvirt Virtuozzo VM: complete
– PowerVM: missing
– VMware vCenter: complete
– zVM: missing
• Force live migration to complete Status: optional.
CLI commands:
– nova live-migration-force-complete <server> <migration>
Notes: Live migration provides a way to move a running instance to another compute host. But
it can sometimes fail to complete if an instance has a high rate of memory or disk page access.
This operation provides the user with an option to assist the progress of the live migration. The
mechanism used to complete the live migration depends on the underlying virtualization subsys-
tem capabilities. If libvirt/qemu is used and the post-copy feature is available and enabled then
the force complete operation will cause a switch to post-copy mode. Otherwise the instance will
be suspended until the migration is completed or aborted.
Driver Support:
– Hyper-V: missing
– Ironic: missing
– Libvirt KVM (aarch64): missing
– Libvirt KVM (ppc64): complete Notes: Requires libvirt>=1.3.3, qemu>=2.5.0
– Libvirt KVM (s390x): complete Notes: Requires libvirt>=1.3.3, qemu>=2.5.0
– Libvirt KVM (x86): complete Notes: Requires libvirt>=1.3.3, qemu>=2.5.0
– Libvirt LXC: missing
– Libvirt QEMU (x86): complete Notes: Requires libvirt>=1.3.3, qemu>=2.5.0
– Libvirt Virtuozzo CT: missing
– Libvirt Virtuozzo VM: missing
– PowerVM: missing
– VMware vCenter: missing
– zVM: missing
• Abort an in-progress or queued live migration Status: optional.
CLI commands:
– nova live-migration-abort <server> <migration>
Notes: Live migration provides a way to move a running instance to another compute host. However,
it can sometimes take a long time to complete if an instance has a high rate of memory or disk page
access, or it can remain stuck in queued status if there are too many in-progress live migration jobs
in the queue. This operation provides the user with an option to abort in-progress live migrations.
While the live migration job is still in queued or preparing status, it can be aborted regardless of
the underlying hypervisor, but once the job status changes to running, only some of the
hypervisors support this feature.
Driver Support:
– Hyper-V: missing
– Ironic: missing
– Libvirt KVM (aarch64): missing
– Libvirt KVM (ppc64): complete
– Libvirt KVM (s390x): complete
– Libvirt KVM (x86): complete
– Libvirt LXC: missing
– Libvirt QEMU (x86): complete
– Libvirt Virtuozzo CT: unknown
– Libvirt Virtuozzo VM: unknown
– PowerVM: missing
– VMware vCenter: missing
– zVM: missing
with a clock that is no longer telling correct time. For container based virtualization solutions, this
operation is particularly technically challenging to implement and is an area of active research.
This operation tends to make more sense when thinking of instances as pets, rather than cattle,
since with cattle it would be simpler to just terminate the instance instead of suspending. Therefore
this operation is considered optional to support.
Driver Support:
– Hyper-V: complete
– Ironic: missing
– Libvirt KVM (aarch64): complete
– Libvirt KVM (ppc64): complete
– Libvirt KVM (s390x): complete
– Libvirt KVM (x86): complete
– Libvirt LXC: missing
– Libvirt QEMU (x86): complete
– Libvirt Virtuozzo CT: complete
– Libvirt Virtuozzo VM: complete
– PowerVM: missing
– VMware vCenter: complete
– zVM: missing
• Swap block volumes Status: optional.
CLI commands:
– nova volume-update <server> <attachment> <volume>
Notes: The swap volume operation is a mechanism for changing a running instance so that its
attached volume(s) are backed by different storage in the host. An alternative to this would be to
simply terminate the existing instance and spawn a new instance with the new storage. In other
words this operation is primarily targeted towards the pet use case rather than cattle, however, it
is required for volume migration to work in the volume service. This is considered optional to
support.
Driver Support:
– Hyper-V: missing
– Ironic: missing
– Libvirt KVM (aarch64): unknown
– Libvirt KVM (ppc64): complete
– Libvirt KVM (s390x): complete
– Libvirt KVM (x86): complete
– Libvirt LXC: missing
– Libvirt QEMU (x86): complete
– Ironic: missing
– Libvirt KVM (aarch64): missing
– Libvirt KVM (ppc64): missing
– Libvirt KVM (s390x): missing
– Libvirt KVM (x86): complete
– Libvirt LXC: missing
– Libvirt QEMU (x86): complete
– Libvirt Virtuozzo CT: missing
– Libvirt Virtuozzo VM: missing
– PowerVM: missing
– VMware vCenter: missing
– zVM: missing
• Boot instance with secure encrypted memory Status: optional.
CLI commands:
– openstack server create <usual server create parameters>
Notes: The feature allows VMs to be booted with their memory hardware-encrypted with a key
specific to the VM, to help protect the data residing in the VM against access from anyone other
than the user of the VM. The Configuration and Security Guides specify usage of this feature.
Driver Support:
– Hyper-V: missing
– Ironic: missing
– Libvirt KVM (aarch64): missing
– Libvirt KVM (ppc64): missing
– Libvirt KVM (s390x): missing
– Libvirt KVM (x86): partial Notes: This feature is currently only available with hosts
which support the SEV (Secure Encrypted Virtualization) technology from AMD.
– Libvirt LXC: missing
– Libvirt QEMU (x86): missing
– Libvirt Virtuozzo CT: missing
– Libvirt Virtuozzo VM: missing
– PowerVM: missing
– VMware vCenter: missing
– zVM: missing
• Cache base images for faster instance boot Status: optional.
CLI commands:
3.3.3 Cells Layout (v2)
This document describes the layout of a deployment with Cells version 2, including deployment con-
siderations for security and scale. It is focused on code present in Pike and later, and while it is geared
towards people who want to have multiple cells for whatever reason, the nature of the cellsv2 support in
Nova means that it applies in some way to all deployments.
3.3.3.1 Concepts
API-level services need to be able to contact other services in all of the cells. Since they only have one
configured transport_url and [database]/connection they look up the information for the
other cells in the API database, with records called cell mappings.
Note: The API database must have cell mapping records that match the transport_url
and [database]/connection configuration elements of the lower-level services. See the
nova-manage Nova Cells v2 commands for more information about how to create and examine these
records.
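As a hedged illustration of how such records are typically created and examined with nova-manage (the
cell name, transport URL and database connection below are placeholders, not values from this guide):
$ nova-manage cell_v2 list_cells
$ nova-manage cell_v2 create_cell --name cell1 \
    --transport-url rabbit://user:pass@cell1-rabbit/ \
    --database_connection mysql+pymysql://user:pass@cell1-db/nova_cell1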
The services generally have a well-defined communication pattern that dictates their layout in a de-
ployment. In a small/simple scenario, the rules do not have much of an impact as all the services can
communicate with each other on a single message bus and in a single cell database. However, as the
deployment grows, scaling and security concerns may drive separation and isolation of the services.
Simple
This is a diagram of the basic services that a simple (single-cell) deployment would have, as well as the
relationships (i.e. communication paths) between them:
[Diagram: simple single-cell service layout - all services (including nova-conductor) share a single message queue (MQ) and one cell database]
All of the services are configured to talk to each other over the same message bus, and there is only one
cell database where live instance data resides. The cell0 database is present (and required) but as no
compute nodes are connected to it, this is still a single cell deployment.
Multiple Cells
In order to shard the services into multiple cells, a number of things must happen. First, the message bus
must be split into pieces along the same lines as the cell database. Second, a dedicated conductor must
be run for the API-level services, with access to the API database and a dedicated message queue. We
call this super conductor to distinguish its place and purpose from the per-cell conductor nodes.
[Diagram: multi-cell service layout - nova-api and the super conductor at the API level, with separate message queues and databases for Cell 0, Cell 1 and Cell 2]
It is important to note that services in the lower cell boxes only have the ability to call back to the
placement API but cannot access any other API-layer services via RPC, nor do they have access to the
API database for global visibility of resources across the cloud. This is intentional and provides security
and failure domain isolation benefits, but also has impacts on some things that would otherwise require
this any-to-any communication style. Check the release notes for the version of Nova you are using for
the most up-to-date information about any caveats that may be present due to this limitation.
Note: This information is correct as of the Pike release. Where improvements have been made or issues
fixed, they are noted per item.
Cross-cell instance migrations
Currently it is not possible to migrate an instance from a host in one cell to a host in another cell. This
may be possible in the future, but it is currently unsupported. This impacts cold migration, resizes, live
migrations, evacuate, and unshelve operations.
Quota-related quirks
Quotas are now calculated live at the point at which an operation would consume more resource, instead
of being kept statically in the database. This means that a multi-cell environment may incorrectly calcu-
late the usage of a tenant if one of the cells is unreachable, as those resources cannot be counted. In this
case, the tenant may be able to consume more resource from one of the available cells, putting them far
over quota when the unreachable cell returns.
Note: Starting in the Train (20.0.0) release, it is possible to configure counting of quota usage from
the placement service and API database to make quota usage calculations resilient to down or poor-
performing cells in a multi-cell environment. See the quotas documentation for more details.
Performance of listing instances
With multiple cells, the instance list operation may not sort and paginate results properly when crossing
multiple cell boundaries. Further, the performance of a sorted list operation will be considerably slower
than with a single cell.
Notifications
With a multi-cell environment with multiple message queues, it is likely that operators will want to
configure a separate connection to a unified queue for notifications. This can be done in the configuration
file of all nodes. Refer to the oslo.messaging configuration documentation for more details.
Nova Metadata API service
Starting from the Stein release, the nova metadata API service can be run either globally or per cell
using the api.local_metadata_per_cell configuration option.
Global
If you have networks that span cells, you might need to run Nova metadata API globally. When run-
ning globally, it should be configured as an API-level service with access to the api_database.
connection information. The nova metadata API service must not be run as a standalone service,
using the nova-api-metadata service, in this case.
Local per cell
Running Nova metadata API per cell can have better performance and data isolation in a multi-cell
deployment. If your networks are segmented along cell boundaries, then you can run Nova metadata
API service per cell. If you choose to run it per cell, you should also configure each neutron-metadata-
agent service to point to the corresponding nova-api-metadata. The nova metadata API service
must be run as a standalone service, using the nova-api-metadata service, in this case.
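As a minimal sketch of the per-cell setup, the option named above would be set in nova.conf on the
metadata API hosts of each cell (availability depends on your release, Stein or later):
[api]
local_metadata_per_cell = True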
Console proxies
Starting from the Rocky release, console proxies must be run per cell because console token authoriza-
tions are stored in cell databases. This means that each console proxy server must have access to the
database.connection information for the cell database containing the instances for which it is
proxying console access.
Operations requiring upcalls
If you deploy multiple cells with a superconductor as described above, computes and cell-based conduc-
tors will not have the ability to speak to the scheduler as they are not connected to the same MQ. This is
by design for isolation, but currently the processes are not in place to implement some features without
such connectivity. Thus, anything that requires a so-called upcall will not function. This impacts the
following:
1. Instance reschedules during boot and resize (part 1)
2. Instance affinity reporting from the computes to the scheduler
3. The late anti-affinity check during server create and evacuate
4. Querying host aggregates from the cell
5. Attaching a volume with [cinder]/cross_az_attach=False
6. Instance reschedules during boot and resize (part 2)
The first is simple: if you boot an instance, it gets scheduled to a compute node, fails, it would normally
be re-scheduled to another node. That requires scheduler intervention and thus it will not work in Pike
with a multi-cell layout. If you do not rely on reschedules for covering up transient compute-node
failures, then this will not affect you. To ensure you do not make futile attempts at rescheduling, you
should set [scheduler]/max_attempts=1 in nova.conf.
The second two are related. The summary is that some of the facilities that Nova
has for ensuring that affinity/anti-affinity is preserved between instances do not function
in Pike with a multi-cell layout. If you don't use affinity operations, then this will
not affect you. To make sure you don't make futile attempts at the affinity check,
you should set [workarounds]/disable_group_policy_check_upcall=True and
[filter_scheduler]/track_instance_changes=False in nova.conf.
The fourth was previously only a problem when performing live migrations using the since-removed
XenAPI driver and not specifying --block-migrate. The driver would attempt to figure out if
block migration should be performed based on source and destination hosts being in the same aggregate.
Since aggregates data had migrated to the API database, the cell conductor would not be able to access
the aggregate information and would fail.
The fifth is a problem because when a volume is attached to an instance in the nova-compute service,
and [cinder]/cross_az_attach=False in nova.conf, we attempt to look up the availability
zone that the instance is in, which includes getting any host aggregates that the instance.host is in.
Since the aggregates are in the API database and the cell conductor cannot access that information,
this will fail. In the future this check could be moved to the nova-api service such that the availability
zone between the instance and the volume is checked before we reach the cell, except in the case of boot
from volume where the nova-compute service itself creates the volume and must tell Cinder in which
availability zone to create the volume. Long-term, volume creation during boot from volume should be
moved to the top-level superconductor which would eliminate this AZ up-call check problem.
The sixth is detailed in bug 1781286 and similar to the first issue. The issue is that servers created without
a specific availability zone will have their AZ calculated during a reschedule based on the alternate host
selected. Determining the AZ for the alternate host requires an up call to the API DB.
3.3.4 Using WSGI with Nova
Though the compute and metadata APIs can be run using independent scripts that provide eventlet-based
HTTP servers, it is generally considered more performant and flexible to run them using a generic HTTP
server that supports WSGI (such as Apache or nginx).
The nova project provides two automatically generated entry points that support this:
nova-api-wsgi and nova-metadata-wsgi. These read nova.conf and api-paste.ini
and generate the required module-level application that most WSGI servers require. If nova
is installed using pip, these two scripts will be installed into whatever the expected bin directory is for
the environment.
The new scripts replace older experimental scripts that could be found in the nova/wsgi directory of
the code repository. The new scripts are not experimental.
When running the compute and metadata services with WSGI, sharing the compute and metadata service
in the same process is not supported (as it is in the eventlet-based scripts).
In devstack as of May 2017, the compute and metadata APIs are hosted by Apache communicating with
uwsgi via mod_proxy_uwsgi. Inspecting the configuration created there can provide some guidance on
one option for managing the WSGI scripts. It is important to remember, however, that one of the major
features of using WSGI is that there are many different ways to host a WSGI application. Different
servers make different choices about performance and configurability.
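As one illustrative (not authoritative) way to host the generated application, a minimal uwsgi
configuration might look like the following; the script path, port and process counts are assumptions
that depend on your installation:
[uwsgi]
wsgi-file = /usr/local/bin/nova-api-wsgi
master = true
processes = 2
threads = 1
http-socket = 127.0.0.1:8774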
3.4 Maintenance
Once you are running nova, the following information is extremely useful.
• Admin Guide: A collection of guides for administrating nova.
• Flavors: What flavors are and why they are used.
• Upgrades: How nova is designed to be upgraded for minimal service impact, and the order you
should do them in.
• Quotas: Managing project quotas in nova.
• Aggregates: Aggregates are a useful way of grouping hosts together for scheduling purposes.
• Filter Scheduler: How the filter scheduler is configured, and how that will impact where compute
instances land in your environment. If you are seeing unexpected distribution of compute instances
in your hosts, you'll want to dive into this configuration.
• Exposing custom metadata to compute instances: How and when you might want to extend the
basic metadata exposed to compute instances (either via metadata server or config drive) for your
specific purposes.
3.4.1 Compute
The OpenStack Compute service allows you to control an Infrastructure-as-a-Service (IaaS) cloud com-
puting platform. It gives you control over instances and networks, and allows you to manage access to
the cloud through users and projects.
Compute does not include virtualization software. Instead, it defines drivers that interact with underlying
virtualization mechanisms that run on your host operating system, and exposes functionality over a web-
based API.
3.4.1.1 Overview
To effectively administer compute, you must understand how the different installed nodes interact with
each other. Compute can be installed in many different ways using multiple servers, but generally
multiple compute nodes control the virtual servers and a cloud controller node contains the remaining
Compute services.
The Compute cloud works using a series of daemon processes named nova-* that exist persistently on
the host machine. These binaries can all run on the same machine or be spread out on multiple boxes in
a large deployment. The responsibilities of services and drivers are:
Services
nova-api Receives XML requests and sends them to the rest of the system. A WSGI app routes and
authenticates requests. Supports the OpenStack Compute APIs. A nova.conf configuration file
is created when Compute is installed.
nova-console, nova-dhcpbridge and nova-xvpvncproxy are all deprecated for removal, so they can be
ignored.
nova-compute Manages virtual machines. Loads a Service object, and exposes the public methods
on ComputeManager through a Remote Procedure Call (RPC).
nova-conductor Provides database-access support for compute nodes (thereby reducing security
risks).
nova-scheduler Dispatches requests for new virtual machines to the correct node.
nova-novncproxy Provides a VNC proxy for browsers, allowing VNC consoles to access virtual
machines.
Note: Some services have drivers that change how the service implements its core functionality. For
example, the nova-compute service supports drivers that let you choose which hypervisor type it can
use.
Manage volumes
Depending on the setup of your cloud provider, they may give you an endpoint to use to manage volumes.
You can use the openstack CLI to manage volumes.
For the purposes of the compute service, attaching, detaching and creating a server from a volume are
of primary interest.
Refer to the CLI documentation for more information.
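A hedged sketch of the typical CLI calls; the server, volume, flavor and network names below are
placeholders:
$ openstack server add volume myserver myvolume
$ openstack server remove volume myserver myvolume
$ openstack server create --flavor m1.small --volume myvolume --network mynet vol-backed-server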
Volume multi-attach
Nova added support for multiattach volumes in the 17.0.0 Queens release.
This document covers the nova-specific aspects of this feature. Refer to the block storage admin guide
for more details about creating multiattach-capable volumes.
Boot from volume and attaching a volume to a server that is not SHELVED_OFFLOADED is supported.
Ultimately the ability to perform these actions depends on the compute host and hypervisor driver that
is being used.
There is also a recorded overview and demo for volume multi-attach.
Requirements
• The minimum required compute API microversion for attaching a multiattach-capable volume to
more than one server is 2.60 (see the example after this list).
• Cinder 12.0.0 (Queens) or newer is required.
• The nova-compute service must be running at least Queens release level code (17.0.0) and the
hypervisor driver must support attaching block storage devices to more than one guest. Refer to
Feature Support Matrix for details on which compute drivers support volume multiattach.
• When using the libvirt compute driver, the following native package versions determine multiat-
tach support:
– libvirt must be greater than or equal to 3.10, or
– qemu must be less than 2.10
• Swapping an in-use multiattach volume is not supported (this is actually controlled via the block
storage volume retype API).
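A hedged example of attaching the same multiattach-capable volume to two servers, assuming
placeholder server and volume names and a client new enough to request microversion 2.60:
$ openstack --os-compute-api-version 2.60 server add volume server1 multiattach-vol
$ openstack --os-compute-api-version 2.60 server add volume server2 multiattach-vol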
Known issues
• Creating multiple servers in a single request with a multiattach-capable volume as the root disk is
not yet supported: https://bugs.launchpad.net/nova/+bug/1747985
• Subsequent attachments to the same volume are all attached in read/write mode by default in the
block storage service. A future change either in nova or cinder may address this so that subsequent
attachments are made in read-only mode, or such that the mode can be specified by the user when
attaching the volume to the server.
Testing
Continuous integration testing of the volume multiattach feature is done via the tempest-full and
tempest-slow jobs, which, along with the tests themselves, are defined in the tempest repository.
Manage Flavors
Admin users can use the openstack flavor command to customize and manage flavors. To see
information for this command, run:
Note: Flavor customization can be limited by the hypervisor in use. For example the libvirt driver
enables quotas on CPUs available to a VM, disk tuning, bandwidth I/O, watchdog behavior, random
number generator device control, and instance VIF traffic control.
For information on the flavors and flavor extra specs, refer to Flavors.
Create a flavor
1. List flavors to show the ID and name, the amount of memory, the amount of disk space for the
root partition and for the ephemeral partition, the swap, and the number of virtual CPUs for each
flavor:
2. To create a flavor, specify a name, ID, RAM size, disk size, and the number of vCPUs for the
flavor, as follows:
Note: Unique ID (integer or UUID) for the new flavor. If specifying auto, a UUID will be
automatically generated.
Here is an example that creates a public m1.extra_tiny flavor that automatically gets an ID
assigned, with 256 MB memory, no disk space, and one VCPU.
3. If an individual user or group of users needs a custom flavor that you do not want other projects
to have access to, you can create a private flavor.
After you create a flavor, assign it to a project by specifying the flavor name or ID and the project
ID:
4. In addition, you can set or unset properties, commonly referred to as extra specs, for the existing
flavor. The extra_specs metadata keys can influence the instance directly when it is launched.
If a flavor sets the quota:vif_outbound_peak=65536 extra spec, the instance's outbound
peak bandwidth I/O should be less than or equal to 512 Mbps. There are several aspects that can
work for an instance including CPU limits, disk tuning, bandwidth I/O, watchdog behavior, and
random-number generator. For information about available metadata keys, see Flavors.
For a list of optional parameters, run the flavor create command with --help (see the consolidated
examples after this list).
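A consolidated, hedged sketch of the commands behind the steps above; the flavor names, project ID
and values are placeholders modelled on the examples in the text:
$ openstack flavor list
$ openstack flavor create --public --id auto --ram 256 --disk 0 --vcpus 1 m1.extra_tiny
$ openstack flavor create --private --ram 2048 --disk 20 --vcpus 2 m1.custom
$ openstack flavor set --project PROJECT_ID m1.custom
$ openstack flavor set --property quota:vif_outbound_peak=65536 m1.extra_tiny
$ openstack flavor create --help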
Modify a flavor
Only the description of flavors can be modified (starting from microversion 2.55). To modify the de-
scription of a flavor, specify the flavor name or ID and a new description as follows:
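For example (a hedged sketch; the flavor name and description are placeholders, and a client new
enough to pass microversion 2.55 is assumed):
$ openstack --os-compute-api-version 2.55 flavor set --description "tiny test flavor" m1.extra_tiny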
Note: The only field that can be updated is the description field. Nova has historically intentionally not
included an API to update a flavor because that would be confusing for instances already created with
that flavor. Needing to change any other aspect of a flavor requires deleting and/or creating a new flavor.
Nova stores a serialized version of the flavor associated with an instance record in the
instance_extra table. While nova supports updating flavor extra_specs it does not update the
embedded flavor in existing instances. Nova does not update the embedded flavor as the extra_specs
change may invalidate the current placement of the instance or alter the compute context that has been
created for the instance by the virt driver. For this reason, admins should avoid updating extra_specs for
flavors used by existing instances. A resize can be used to update existing instances if required, but as a
resize performs a cold migration, it is not transparent to a tenant.
Delete a flavor
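A hedged example, reusing the flavor name from the creation example above:
$ openstack flavor delete m1.extra_tiny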
Default Flavors
Previous versions of nova typically deployed with default flavors. These were removed in Newton. The
following table lists the default flavors for Mitaka and earlier.
Console connections for virtual machines, whether direct or through a proxy, are received on ports 5900
to 5999. The firewall on each Compute service node must allow network traffic on these ports.
This procedure modifies the iptables firewall to allow incoming connections to the Compute services.
Configuring the service-node firewall
1. Log in to the server that hosts the Compute service, as root.
2. Edit the /etc/sysconfig/iptables file to add an INPUT rule that allows TCP traffic on
ports from 5900 to 5999 (a sketch of the rule is shown after these steps). Make sure the new rule
appears before any INPUT rules that REJECT traffic:
3. Save the changes to the /etc/sysconfig/iptables file, and restart the iptables service
to pick up the changes:
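A hedged sketch of the rule from step 2 and the restart from step 3; RHEL/CentOS-style file location
and service management are assumed:
-A INPUT -p tcp -m multiport --dports 5900:5999 -j ACCEPT

# service iptables restart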
Compute can generate a random administrator (root) password and inject that password into an instance.
If this feature is enabled, users can run ssh to an instance without an ssh keypair. The random password
appears in the output of the openstack server create command. You can also view and set the
admin password from the dashboard.
For password injection display in the dashboard, please refer to the setting of can_set_password in
Horizon doc
For hypervisors that use the libvirt back end (such as KVM, QEMU, and LXC), admin password injec-
tion is disabled by default. To enable it, set this option in /etc/nova/nova.conf:
[libvirt]
inject_password=true
When enabled, Compute will modify the password of the admin account by editing the /etc/shadow
file inside the virtual machine instance.
Note: Users can only use ssh to access the instance by using the admin password if the virtual machine
image is a Linux distribution, and it has been configured to allow users to use ssh as the root user with
password authorization. This is not the case for Ubuntu cloud images, which, by default, do not allow
users to use ssh to access the root account, or CentOS cloud images, which, by default, do not allow
ssh access to the instance with a password.
For Windows virtual machines, configure the Windows image to retrieve the admin password on boot
by installing an agent such as cloudbase-init.
You can show basic statistics on resource usage for hosts and instances.
Note: For more sophisticated monitoring, see the Ceilometer project. You can also use tools, such as
Ganglia or Graphite, to gather more detailed data.
The following examples show the host usage statistics for a host called devstack.
• List the hosts and the nova-related services that run on them (example commands are sketched
after this list):
• Get a summary of resource usage of all of the instances running on the host:
The CPU column shows the sum of the virtual CPUs for instances running on the host.
The MEMORY MB column shows the sum of the memory (in MB) allocated to the instances that
run on the host.
The DISK GB column shows the sum of the root and ephemeral disk sizes (in GB) of the instances
that run on the host.
The row that has the value used_now in the PROJECT column shows the sum of the resources
allocated to the instances that run on the host, plus the resources allocated to the host itself.
The row that has the value used_max in the PROJECT column shows the sum of the resources
allocated to the instances that run on the host.
Note: These values are computed by using information about the flavors of the instances that run
on the hosts. This command does not query the CPU usage, memory usage, or hard disk usage of
the physical host.
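A hedged sketch of the commands referenced in the list above, using the legacy openstack host
commands; these may be unavailable in newer clients, where openstack hypervisor list and
openstack hypervisor show are the closer equivalents:
$ openstack host list
$ openstack host show devstack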
Note: As of microversion v2.48, diagnostics information for all virt drivers will have a
standard format as below. Before microversion 2.48, each hypervisor had its own format.
For more details on diagnostics response message see server diagnostics api documentation.
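The diagnostics output below is the kind of result returned by the legacy nova diagnostics
command; the server name is a placeholder:
$ nova diagnostics myserver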
+----------------+--------------------------------------------------------------------+
| config_drive   | False                                                              |
| cpu_details    | []                                                                 |
| disk_details   | [{"read_requests": 887, "errors_count": -1,                        |
|                | "read_bytes": 20273152, "write_requests": 89,                      |
|                | "write_bytes": 303104}]                                            |
| driver         | libvirt                                                            |
| hypervisor     | qemu                                                               |
| hypervisor_os  | linux                                                              |
| memory_details | {"used": 0, "maximum": 0}                                          |
| nic_details    | [{"rx_packets": 9, "rx_drop": 0, "tx_octets": 1464,                |
|                | "tx_errors": 0, "mac_address": "fa:16:3e:fa:db:d3",                |
|                | "rx_octets": 958, "rx_rate": null, "rx_errors": 0,                 |
|                | "tx_drop": 0, "tx_packets": 9, "tx_rate": null}]                   |
| num_cpus       | 0                                                                  |
| num_disks      | 1                                                                  |
| num_nics       | 1                                                                  |
| state          | running                                                            |
| uptime         | 5528                                                               |
+----------------+--------------------------------------------------------------------+
Running openstack help returns a list of openstack commands and parameters. To get
help for a subcommand, run:
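For example (a hedged illustration; the subcommand shown is arbitrary):
$ openstack help server create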
For a complete list of openstack commands and parameters, refer to the OpenStack Command-
Line Reference.
3. Set the required parameters as environment variables to make running commands easier. For
example, you can add --os-username as an openstack option, or set it as an environment
variable. To set the user name, password, and project as environment variables, use:
$ export OS_USERNAME=joecool
$ export OS_PASSWORD=coolword
$ export OS_TENANT_NAME=coolu
4. The Identity service gives you an authentication endpoint, which Compute recognizes as
OS_AUTH_URL:
$ export OS_AUTH_URL=http://hostname:5000/v2.0
Logging
Logging module
Logging behavior can be changed by creating a configuration file. To specify the configuration file, add
this line to the /etc/nova/nova.conf file:
log_config_append=/etc/nova/logging.conf
To change the logging level, add DEBUG, INFO, WARNING, or ERROR as a parameter.
The logging configuration file is an INI-style configuration file, which must contain a section called
logger_nova. This controls the behavior of the logging facility in the nova-* services. For exam-
ple:
[logger_nova]
level = INFO
handlers = stderr
qualname = nova
This example sets the debugging level to INFO (which is less verbose than the default DEBUG setting).
For more about the logging configuration syntax, including the handlers and qualname variables,
see the Python documentation on logging configuration files.
For an example of the logging.conf file with various defined handlers, see the Example Configura-
tion File for nova.
Syslog
OpenStack Compute services can send logging information to syslog. This is useful if you want to
use rsyslog to forward logs to a remote machine. Separately configure the Compute service (nova), the
Identity service (keystone), the Image service (glance), and, if you are using it, the Block Storage service
(cinder) to send log messages to syslog. Open these configuration files:
• /etc/nova/nova.conf
• /etc/keystone/keystone.conf
• /etc/glance/glance-api.conf
• /etc/glance/glance-registry.conf
• /etc/cinder/cinder.conf
In each configuration file, add these lines:
debug = False
use_syslog = True
syslog_log_facility = LOG_LOCAL0
In addition to enabling syslog, these settings also turn off debugging output from the log.
Note: Although this example uses the same local facility for each service (LOG_LOCAL0, which
corresponds to syslog facility LOCAL0), we recommend that you configure a separate local facility
for each service, as this provides better isolation and more flexibility. For example, you can capture
logging information at different severity levels for different services. syslog allows you to define up
to eight local facilities, LOCAL0, LOCAL1, ..., LOCAL7. For more information, see the syslog
documentation.
Rsyslog
rsyslog is useful for setting up a centralized log server across multiple machines. This section briefly
describes the configuration to set up an rsyslog server. A full treatment of rsyslog is beyond the scope
of this book. This section assumes rsyslog has already been installed on your hosts (it is installed by
default on most Linux distributions).
This example provides a minimal configuration for /etc/rsyslog.conf on the log server host,
which receives the log files.
Add a filter rule to /etc/rsyslog.conf which looks for a host name. This example uses
COMPUTE_01 as the compute host name:
On each compute host, create a file named /etc/rsyslog.d/60-nova.conf, with the following
content:
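A minimal, hedged sketch of both files, assuming legacy rsyslog directive syntax, TCP port 1024 and
the COMPUTE_01 host name from above; adjust the paths and the log server address for your environment:
# /etc/rsyslog.conf on the log server: receive syslog messages over TCP
$ModLoad imtcp
$InputTCPServerRun 1024

# filter rule on the log server, keyed on the compute host name
:hostname, isequal, "COMPUTE_01" /var/log/rsyslog/compute-01.log

# /etc/rsyslog.d/60-nova.conf on each compute host: forward ERROR-level
# local0 messages to the log server over TCP
local0.error @@LOG_SERVER_IP:1024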
Once you have created the file, restart the rsyslog service. Error-level log messages on the compute
hosts should now be sent to the log server.
Serial console
The serial console provides a way to examine kernel output and other system messages during trou-
bleshooting if the instance lacks network connectivity.
Read-only access from server serial console is possible using the os-GetSerialOutput server ac-
tion. Most cloud images enable this feature by default. For more information, see Common errors and
fixes for Compute.
OpenStack Juno and later supports read-write access using the serial console using the
os-GetSerialConsole server action. This feature also requires a websocket client to access the
serial console.
1. In the [serial_console] section of /etc/nova/nova.conf on each compute host, enable
the serial console:
[serial_console]
# ...
enabled = true
2. In the [serial_console] section, configure the serial console proxy similar to graphical
console proxies:
[serial_console]
# ...
base_url = ws://controller:6083/
listen = 0.0.0.0
proxyclient_address = MANAGEMENT_INTERFACE_IP_ADDRESS
The base_url option specifies the base URL that clients receive from the API upon requesting
a serial console. Typically, this refers to the host name of the controller node.
The listen option specifies the network interface nova-compute should listen on for virtual
console connections. Typically, 0.0.0.0 will enable listening on all interfaces.
The proxyclient_address option specifies which network interface the proxy should con-
nect to. Typically, this refers to the IP address of the management interface.
When you enable read-write serial console access, Compute will add serial console information
to the Libvirt XML file for the instance. For example:
<console type='tcp'>
<source mode='bind' host='127.0.0.1' service='10000'/>
<protocol type='raw'/>
<target type='serial' port='0'/>
<alias name='serial0'/>
</console>
1. Use the nova get-serial-proxy command to retrieve the websocket URL for the serial
console on the instance:
+--------+-----------------------------------------------------------------+
| Type   | Url                                                             |
+--------+-----------------------------------------------------------------+
| serial | ws://127.0.0.1:6083/?token=18510769-71ad-4e5a-8348-4218b5613b3d |
+--------+-----------------------------------------------------------------+
Alternatively, use the API directly:
$ curl -i 'http://<controller>:8774/v2.1/<tenant_uuid>/servers/
,→<instance_uuid>/action' \
-X POST \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
-H "X-Auth-Project-Id: <project_id>" \
-H "X-Auth-Token: <auth_token>" \
-d '{"os-getSerialConsole": {"type": "serial"}}'
2. Use Python websocket with the URL to generate .send, .recv, and .fileno methods for
serial console access. For example:
import websocket
ws = websocket.create_connection(
'ws://127.0.0.1:6083/?token=18510769-71ad-4e5a-8348-4218b5613b3d',
subprotocols=['binary', 'base64'])
Note: When you enable the serial console, typical instance logging using the nova console-log
command is disabled. Kernel output and other system messages will not be visible unless you are
actively viewing the serial console.
Rootwrap allows unprivileged users to safely run Compute actions as the root user. Compute previously
used sudo for this purpose, but this was difficult to maintain, and did not allow advanced filters. The
rootwrap command replaces sudo for Compute.
To use rootwrap, prefix the Compute command with nova-rootwrap. For example:
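A hedged example, assuming the default configuration file location:
$ sudo nova-rootwrap /etc/nova/rootwrap.conf <command>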
A generic sudoers entry lets the Compute user run nova-rootwrap as root. The
nova-rootwrap code looks for filter definition directories in its configuration file, and loads com-
mand filters from them. It then checks if the command requested by Compute matches one of those
filters and, if so, executes the command (as root). If no filter matches, it denies the request.
Note: Be aware of issues with using NFS and root-owned files. The NFS share must be configured
with the no_root_squash option enabled, in order for rootwrap to work correctly.
Rootwrap is fully controlled by the root user. The root user owns the sudoers entry which allows Com-
pute to run a specific rootwrap executable as root, and only with a specific configuration file (which
should also be owned by root). The nova-rootwrap command imports the Python modules it needs
from a cleaned, system-default PYTHONPATH. The root-owned configuration file points to root-owned
filter definition directories, which contain root-owned filters definition files. This chain ensures that the
Compute user itself is not in control of the configuration or modules used by the nova-rootwrap
executable.
Configure rootwrap
Configure rootwrap in the rootwrap.conf file. Because it is in the trusted security path, it must be
owned and writable by only the root user. The rootwrap_config=entry parameter specifies the
file's location in the sudoers entry and in the nova.conf configuration file.
The rootwrap.conf file uses an INI file format with these sections and parameters:
If the root wrapper is not performing correctly, you can add a workaround option into the nova.conf
configuration file. This workaround re-configures the root wrapper configuration to fall back to running
commands as sudo, and is a Kilo release feature.
Including this workaround in your configuration file safeguards your environment from issues that can
impair root wrapper performance. Tool changes that have impacted Python Build Reasonableness (PBR)
for example, are a known issue that affects root wrapper performance.
To set up this workaround, configure the disable_rootwrap option in the [workaround] section
of the nova.conf configuration file.
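A minimal sketch of that setting in nova.conf:
[workarounds]
disable_rootwrap = True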
The filters definition files contain lists of filters that rootwrap will use to allow or deny a specific
command. They are generally suffixed by .filters. Since they are in the trusted security path, they need
to be owned and writable only by the root user. Their location is specified in the rootwrap.conf file.
Filter definition files use an INI file format with a [Filters] section and several lines, each with a
unique parameter name, which should be different for each filter you define:
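A hedged example of a filter definition file; the kpartx entry is illustrative only and uses the
standard CommandFilter form (filter name, filter class, executable, run-as user):
[Filters]
# allow Compute to run kpartx as the root user
kpartx: CommandFilter, kpartx, root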
Administrators can use rootwrap daemon support instead of running rootwrap with sudo. The rootwrap
daemon reduces the overhead and performance loss that results from running oslo.rootwrap with
sudo. Each call that needs rootwrap privileges requires a new instance of rootwrap. The daemon pre-
vents overhead from the repeated calls. The daemon does not support long running processes, however.
To enable the rootwrap daemon, set use_rootwrap_daemon to True in the Compute service con-
figuration file.
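For example, in nova.conf:
[DEFAULT]
use_rootwrap_daemon = True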
Migration enables an administrator to move a virtual machine instance from one compute host to an-
other. A typical scenario is planned maintenance on the source host, but migration can also be useful to
redistribute the load when many VM instances are running on a specific physical machine.
This document covers live migrations using the Libvirt and VMware hypervisors.
Note: Not all Compute service hypervisor drivers support live-migration, or support all live-migration
features. Similarly not all compute service features are supported.
Consult Feature Support Matrix to determine which hypervisors support live-migration.
See the Configuration Guide for details on hypervisor configuration settings.
• Cold migration
The instance is shut down, then moved to another hypervisor and restarted. The instance
recognizes that it was rebooted, and the application running on the instance is disrupted.
This section does not cover cold migration.
• Live migration
The instance keeps running throughout the migration. This is useful when it is not possible or
desirable to stop the application running on the instance.
Live migrations can be classified further by the way they treat instance storage:
– Shared storage-based live migration. The instance has ephemeral disks that are located on
storage shared between the source and destination hosts.
– Block live migration, or simply block migration. The instance has ephemeral disks that are
not shared between the source and destination hosts. Block migration is incompatible with
read-only devices such as CD-ROMs and Configuration Drive (config_drive).
– Volume-backed live migration. Instances use volumes rather than ephemeral disks.
Block live migration requires copying disks from the source to the destination host. It takes more
time and puts more load on the network. Shared-storage and volume-backed live migration do
not copy disks.
Note: In a multi-cell cloud, instances can be live migrated to a different host in the same cell, but not
across cells.
The following sections describe how to configure your hosts for live migrations using the libvirt virt
driver and KVM hypervisor.
Libvirt
General configuration
To enable any type of live migration, configure the compute hosts according to the instructions below:
1. Set the following parameters in nova.conf on all compute hosts:
• server_listen=0.0.0.0
You must not make the VNC server listen to the IP address of its compute host, since that
address changes when the instance is migrated.
Important: Since this setting allows VNC clients from any IP address to connect to instance
consoles, you must take additional measures like secure networks or firewalls to prevent
potential attackers from gaining access to instances.
• instances_path must have the same value for all compute hosts. In this guide, the value
/var/lib/nova/instances is assumed.
2. Ensure that name resolution on all compute hosts is identical, so that they can connect to each
other through their hostnames.
If you use /etc/hosts for name resolution and enable SELinux, ensure that /etc/hosts
has the correct SELinux context:
# restorecon /etc/hosts
3. Enable password-less SSH so that root on one compute host can log on to any other compute
host without providing a password. The libvirtd daemon, which runs as root, uses the SSH
protocol to copy the instance to the destination and cannot know the passwords of all compute hosts.
You may, for example, compile root's public SSH keys on all compute hosts into an
authorized_keys file and deploy that file to the compute hosts.
4. Configure the firewalls to allow libvirt to communicate between compute hosts.
By default, libvirt uses the TCP port range from 49152 to 49261 for copying memory and disk
contents. Compute hosts must accept connections in this range.
For information about ports used by libvirt, see the libvirt documentation.
Securing live migration streams
If your compute nodes have at least libvirt 4.4.0 and QEMU 2.11.0, it is strongly recommended to secure
all your live migration streams by taking advantage of the QEMU-native TLS feature. This requires a
pre-existing PKI (Public Key Infrastructure) setup. For further details on how to set this all up, refer to
the Secure live migration with QEMU-native TLS document.
If your environment satisfies the requirements for QEMU-native TLS, then block migration requires
some setup; refer to the above section, Securing live migration streams, for details. Otherwise, no
additional configuration is required for block migration and volume-backed live migration.
Be aware that block migration adds load to the network and storage subsystems.
Shared storage
Compute hosts have many options for sharing storage, for example NFS, shared disk array LUNs, Ceph
or GlusterFS.
The next steps show how a regular Linux system might be configured as an NFS v4 server for live
migration. For detailed information and alternative ways to configure NFS on Linux, see instructions
for Ubuntu, RHEL and derivatives or SLES and OpenSUSE.
1. Ensure that UID and GID of the nova user are identical on the compute hosts and the NFS server.
2. Create a directory with enough disk space for all instances in the cloud, owned by user nova. In
this guide, we assume /var/lib/nova/instances.
3. Set the execute/search bit on the instances directory.
4. Export /var/lib/nova/instances to the compute hosts. For example, add the following
line to /etc/exports:
/var/lib/nova/instances *(rw,sync,fsid=0,no_root_squash)
The asterisk permits access to any NFS client. The option fsid=0 exports the instances directory
as the NFS root. (A consolidated command sketch follows these steps.)
After setting up the NFS server, mount the remote filesystem on all compute hosts.
1. Assuming the NFS server's hostname is nfs-server, add this line to /etc/fstab to mount
the NFS root:
2. Test NFS by mounting the instances directory and check access permissions for the nova user:
$ sudo mount -a -v
$ ls -ld /var/lib/nova/instances/
drwxr-xr-x. 2 nova nova 6 Mar 14 21:30 /var/lib/nova/instances/
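A consolidated, hedged command sketch for the NFS steps above; the directory, export options and
fstab line reuse the values shown in this guide, while nfs-server remains a placeholder host name:
# on the NFS server
# chmod o+x /var/lib/nova/instances
# echo '/var/lib/nova/instances *(rw,sync,fsid=0,no_root_squash)' >> /etc/exports
# exportfs -ra

# on each compute host, /etc/fstab contains:
nfs-server:/ /var/lib/nova/instances nfs4 defaults 0 0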
Live migration copies the instance's memory from the source to the destination compute host. After
a memory page has been copied, the instance may write to it again, so that it has to be copied again.
Instances that frequently write to different memory pages can overwhelm the memory copy process and
prevent the live migration from completing.
This section covers configuration settings that can help live migration of memory-intensive instances
succeed.
1. Live migration completion timeout
The Compute service will either abort or force complete a migration when it
has been running too long. This behavior is configurable using the libvirt.
live_migration_timeout_action config option. The timeout is calculated based
on the instance size, which is the instances memory size in GiB. In the case of block migration,
the size of ephemeral storage in GiB is added.
The timeout in seconds is the instance size multiplied by the configurable parameter libvirt.
live_migration_completion_timeout, whose default is 800. For example, shared-
storage live migration of an instance with 8GiB memory will time out after 6400 seconds.
2. Instance downtime
Near the end of the memory copy, the instance is paused for a short time so that the remaining
few pages can be copied without interference from instance memory writes. The Compute service
initializes this time to a small value that depends on the instance size, typically around 50 mil-
liseconds. When it notices that the memory copy does not make sufficient progress, it increases
the time gradually.
You can influence the instance downtime algorithm with the help of three configuration variables
on the compute hosts:
live_migration_downtime = 500
live_migration_downtime_steps = 10
live_migration_downtime_delay = 75
3. Auto-convergence
Caution: Before enabling auto-convergence, make sure that the instance's application tolerates
a slow-down.
Be aware that auto-convergence does not guarantee live migration success.
4. Post-copy
Live migration of a memory-intensive instance is certain to succeed when you enable post-copy.
This feature, implemented by libvirt and QEMU, activates the virtual machine on the destination
host before all of its memory has been copied. When the virtual machine accesses a page that is
missing on the destination host, the resulting page fault is resolved by copying the page from the
source host.
Post-copy is disabled by default. You can enable it by setting
live_migration_permit_post_copy=true.
When you enable both auto-convergence and post-copy, auto-convergence remains disabled.
Caution: The page faults introduced by post-copy can slow the instance down.
When the network connection between source and destination host is interrupted, page faults
cannot be resolved anymore and the instance is rebooted.
The full list of live migration configuration parameters is documented in the Nova Configuration
Options. A consolidated sketch of the options discussed above follows.
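A hedged nova.conf sketch of the options discussed in this section; the values shown are illustrative
and the availability of each option depends on your Nova release:
[libvirt]
live_migration_completion_timeout = 800
live_migration_timeout_action = abort
live_migration_downtime = 500
live_migration_downtime_steps = 10
live_migration_downtime_delay = 75
live_migration_permit_auto_converge = true
live_migration_permit_post_copy = true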
VMware
vSphere configuration
Enable vMotion on all ESX hosts which are managed by Nova by following the instructions in this KB
article.
Live-migrate instances
Live-migrating an instance means moving its virtual machine to a different OpenStack Compute server
while the instance continues running. Before starting a live-migration, review the chapter Configure live
migrations. It covers the configuration settings required to enable live-migration, but also reasons for
migrations and non-live-migration options.
The instructions below cover shared-storage and volume-backed migration. To block-migrate instances,
add the command-line option --block-migrate to the nova live-migration command, and
--block-migration to the openstack server migrate command.
Manual selection of the destination host
1. Obtain the ID of the instance you want to migrate:
+--------------------------------------+------+--------+-----------------+------------+
| ID                                   | Name | Status | Networks        | Image Name |
+--------------------------------------+------+--------+-----------------+------------+
| d1df1b5a-70c4-4fed-98b7-423362f2c47c | vm1  | ACTIVE | private=a.b.c.d | ...        |
| d693db9e-a7cf-45ef-a7c9-b3ecb5f22645 | vm2  | ACTIVE | private=e.f.g.h | ...        |
+--------------------------------------+------+--------+-----------------+------------+
2. Determine on which host the instance is currently running. In this example, vm1 is running on
HostB:
+----------------------+--------------------------------------+
| Field | Value |
+----------------------+--------------------------------------+
| ... | ... |
| OS-EXT-SRV-ATTR:host | HostB |
| ... | ... |
| addresses | a.b.c.d |
| flavor | m1.tiny |
| id | d1df1b5a-70c4-4fed-98b7-423362f2c47c |
| name | vm1 |
+----------------------+--------------------------------------+
3. Select the compute node the instance will be migrated to. In this example, we will migrate the
instance to HostC, because nova-compute is running on it:
+----+------------------+-------+----------+---------+-------+----------------------------+
| ID | Binary           | Host  | Zone     | Status  | State | Updated At                 |
+----+------------------+-------+----------+---------+-------+----------------------------+
+-------+------------+-----+-----------+---------+
| Host | Project | CPU | Memory MB | Disk GB |
+-------+------------+-----+-----------+---------+
| HostC | (total) | 16 | 32232 | 878 |
| HostC | (used_now) | 22 | 21284 | 422 |
| HostC | (used_max) | 22 | 21284 | 422 |
| HostC | p1 | 22 | 21284 | 422 |
| HostC | p2 | 22 | 21284 | 422 |
+-------+------------+-----+-----------+---------+
+----------------------+--------------------------------------+
| Field | Value |
+----------------------+--------------------------------------+
| ... | ... |
| OS-EXT-SRV-ATTR:host | HostC |
| ... | ... |
+----------------------+--------------------------------------+
If the instance is still running on HostB, the migration failed. The nova-scheduler and
nova-conductor log files on the controller and the nova-compute log file on the source
compute host can help pinpoint the problem.
To leave the selection of the destination host to the Compute service, use the nova command-line client.
1. Obtain the instance ID as shown in step 1 of the section Manual selection of the destination host.
2. Leave out the host selection steps 2, 3, and 4.
3. Migrate the instance:
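A hedged example using the nova client, letting the scheduler pick the destination host (vm1 is the example instance; the output that follows shows the server status while the migration runs):

$ nova live-migration vm1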
+----------------------+--------------------------------------+
| Field | Value |
+----------------------+--------------------------------------+
| ... | ... |
| status | MIGRATING |
| ... | ... |
+----------------------+--------------------------------------+
2. Check progress
Use the nova command-line client for nova's migration monitoring feature. First, obtain the migration ID:
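For instance (a sketch; assuming the nova client's server-migration commands are available):

$ nova server-migration-list vm1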
For readability, most output columns were removed. Only the first column, Id, is relevant. In this
example, the migration ID is 2. Use this to get the migration status.
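For example (a sketch, using the migration ID obtained above):

$ nova server-migration-show vm1 2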
The output shows that the migration is running. Progress is measured by the number of memory
bytes that remain to be copied. If this number is not decreasing over time, the migration may be
unable to complete, and it may be aborted by the Compute service.
Note: The command reports that no disk bytes are processed, even in the event of block migra-
tion.
During the migration process, the instance may write to a memory page after that page has been copied
to the destination. When that happens, the same page has to be copied again. The instance may write
to memory pages faster than they can be copied, so that the migration cannot complete. There are two
optional actions, controlled by libvirt.live_migration_timeout_action, which can be
taken against a VM after libvirt.live_migration_completion_timeout is reached:
1. abort (default): The live migration operation will be cancelled after the completion timeout is
reached. This is similar to using API DELETE /servers/{server_id}/migrations/
{migration_id}.
2. force_complete: The compute service will either pause the VM or trigger post-copy,
depending on whether post-copy is enabled and available (libvirt.
live_migration_permit_post_copy is set to True). This is similar to using
API POST /servers/{server_id}/migrations/{migration_id}/action
(force_complete).
You can also read the libvirt.live_migration_timeout_action configuration option help
for more details.
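A hedged nova.conf sketch for the compute hosts combining these options (the timeout value is illustrative):

[libvirt]
live_migration_completion_timeout = 800
live_migration_timeout_action = force_complete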
The following remarks assume the KVM/Libvirt hypervisor.
To determine that the migration timed out, inspect the nova-compute log file on the source host. The
following log entry shows that the migration timed out:
• Abort the migration

To stop the migration from putting load on infrastructure resources like network and disks, you may
opt to cancel it manually.
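For example (a sketch; assuming the migration ID obtained earlier):

$ nova live-migration-abort vm1 2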
• Force the migration to complete

If post-copy is not enabled, forcing completion pauses the instance until the remaining memory has
been copied to the destination.

Caution: Since the pause impacts time keeping on the instance and not all applications
tolerate incorrect time settings, use this approach with caution.
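For example (a sketch; assuming the migration ID obtained earlier):

$ nova live-migration-force-complete vm1 2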
• Enable auto-convergence
Auto-convergence is a Libvirt feature. Libvirt detects that the migration is unlikely to complete
and slows down the instance's CPU until the memory copy process is faster than the instance's memory writes.
To enable auto-convergence, set live_migration_permit_auto_converge=true in
nova.conf and restart nova-compute. Do this on all compute hosts.
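A minimal nova.conf sketch; the option lives in the libvirt group:

[libvirt]
live_migration_permit_auto_converge = true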
Caution: One possible downside of auto-convergence is the slowing down of the instance.
• Enable post-copy
This is a Libvirt feature. Libvirt detects that the migration does not progress and responds by
activating the virtual machine on the destination host before all its memory has been copied.
Access to missing memory pages results in page faults that are satisfied from the source host.
To enable post-copy, set live_migration_permit_post_copy=true in nova.conf
and restart nova-compute. Do this on all compute hosts.
When post-copy is enabled, manual force-completion does not pause the instance but switches to
the post-copy process.
If live migrations routinely timeout or fail during cleanup operations due to the user token timing out,
consider configuring nova to use service user tokens.
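A hedged nova.conf sketch of the service_user group (credential values are placeholders, not recommendations):

[service_user]
send_service_user_token = true
auth_type = password
auth_url = https://keystone.example.com/identity
username = nova
password = SERVICE_PASSWORD
project_name = service
user_domain_name = Default
project_domain_name = Default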
OpenStack provides a number of different methods to interact with your guests: VNC, SPICE, Serial,
RDP or MKS. If configured, these can be accessed by users through the OpenStack dashboard or the
command line. This document outlines how these different technologies can be configured.
Overview
It is considered best practice to deploy only one of the console types, and not all console types are
supported by all compute drivers. Regardless of what option is chosen, a console proxy service is
required. These proxy services are responsible for the following:
• Provide a bridge between the public network where the clients live and the private network where
the servers with consoles live.
• Mediate token authentication.
• Transparently handle hypervisor-specific connection details to provide a uniform client experi-
ence.
For some combinations of compute driver and console driver, these proxy services are provided by the
hypervisor or another service. For all others, nova provides services to handle this proxying. Consider a
noVNC-based VNC console connection for example:
1. A user connects to the API and gets an access_url such as, http://ip:port/?
path=%3Ftoken%3Dxyz.
2. The user pastes the URL in a browser or uses it as a client parameter.
3. The browser or client connects to the proxy.
4. The proxy authorizes the token for the user, and maps the token to the private host and port of the
VNC server for an instance.
The compute host specifies the address that the proxy should use to connect through the vnc.
server_proxyclient_address option. In this way, the VNC proxy works as a bridge
between the public network and private host network.
5. The proxy initiates the connection to VNC server and continues to proxy until the session ends.
This means a typical deployment with noVNC-based VNC consoles will have the following components:
• One or more nova-novncproxy services. Supports browser-based noVNC clients. For simple
deployments, this service typically runs on the same machine as nova-api because it operates
as a proxy between the public network and the private compute host network.
• One or more nova-compute services. Hosts the instances for which consoles are provided.
VNC is a graphical console with wide support among many hypervisors and clients. noVNC provides
VNC support through a web browser.
Note: It has been reported that versions of noVNC older than 0.6 do not work with the
nova-novncproxy service.
If using non-US key mappings, you need at least noVNC 1.0.0 for a fix.
If using VMware ESX/ESXi hypervisors, you need at least noVNC 1.1.0 for a fix.
Configuration
To enable the noVNC VNC console service, you must configure both the nova-novncproxy service
and the nova-compute service. Most options are defined in the vnc group.
The nova-novncproxy service accepts the following options:
• daemon
• ssl_only
• source_is_ipv6
• cert
• key
• web
• console.ssl_ciphers
• console.ssl_minimum_version
• vnc.novncproxy_host
• vnc.novncproxy_port
If using the libvirt compute driver and enabling VNC proxy security, the following additional options are
supported:
• vnc.auth_schemes
• vnc.vencrypt_client_key
• vnc.vencrypt_client_cert
• vnc.vencrypt_ca_certs
For example, to configure this via a nova-novncproxy.conf file:
[vnc]
novncproxy_host = 0.0.0.0
novncproxy_port = 6082
Note: This doesn't show configuration with security. For information on how to configure this, refer to
VNC proxy security below.
The nova-compute service requires the following options to configure noVNC-based VNC console
support:
• vnc.enabled
• vnc.novncproxy_base_url
• vnc.server_listen
• vnc.server_proxyclient_address
If using the VMware compute driver, the following additional options are supported:
• vmware.vnc_port
• vmware.vnc_port_total
For example, to configure this via a nova.conf file:
[vnc]
enabled = True
novncproxy_base_url = http://IP_ADDRESS:6082/vnc_auto.html
server_listen = 127.0.0.1
server_proxyclient_address = 127.0.0.1
Replace IP_ADDRESS with the IP address from which the proxy is accessible by the outside world.
For example, this may be the management interface IP address of the controller or the VIP.
Deploy the public-facing interface of the VNC proxy with HTTPS to prevent attacks from malicious
parties on the network between the tenant user and proxy server. When using HTTPS, the TLS encryp-
tion only applies to data between the tenant user and proxy server. The data between the proxy server
and Compute node instance will still be unencrypted. To provide protection for the latter, it is necessary
to enable the VeNCrypt authentication scheme for VNC in both the Compute nodes and noVNC proxy
server hosts.
Ensure each Compute node running QEMU/KVM with libvirt has a set of certificates issued to it. The
following is a list of the required certificates:
• /etc/pki/libvirt-vnc/server-cert.pem
An x509 certificate to be presented by the VNC server. The CommonName should match the
primary hostname of the compute node. Use of subjectAltName is also permitted if there
is a need to use multiple hostnames or IP addresses to access the same Compute node.
• /etc/pki/libvirt-vnc/server-key.pem
The private key used to generate the server-cert.pem file.
• /etc/pki/libvirt-vnc/ca-cert.pem
The authority certificate used to sign server-cert.pem and sign the VNC proxy server cer-
tificates.
The certificates must have v3 basic constraints [2] present to indicate the permitted key use and purpose
data.
We recommend using a dedicated certificate authority solely for the VNC service. This authority may
be a child of the master certificate authority used for the OpenStack deployment. This is because libvirt
does not currently have a mechanism to restrict what certificates can be presented by the proxy server.
For further details on certificate creation, consult the QEMU manual page documentation on VNC server
certificate setup [1].
Configure libvirt to enable the VeNCrypt authentication scheme for the VNC server. In /etc/
libvirt/qemu.conf, uncomment the following settings:
• vnc_tls=1
This instructs libvirt to enable the VeNCrypt authentication scheme when launching QEMU, pass-
ing it the certificates shown above.
• vnc_tls_x509_verify=1
This instructs QEMU to require that all VNC clients present a valid x509 certificate. Assuming a
dedicated certificate authority is used for the VNC service, this ensures that only approved VNC
proxy servers can connect to the Compute nodes.
After editing qemu.conf, the libvirtd service must be restarted:
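For example, on systemd-based distributions (a sketch):

# systemctl restart libvirtd.service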
[1] https://qemu.weilnetz.de/doc/qemu-doc.html#vnc_005fsec_005fcertificate_005fverify
[2] https://tools.ietf.org/html/rfc3280#section-4.2.1.10
Changes will not apply to any existing running guests on the Compute node, so this configuration should
be done before launching any instances.
The noVNC proxy server initially only supports the none authentication scheme, which does no check-
ing. Therefore, it is necessary to enable the vencrypt authentication scheme by editing the nova.
conf file:

[vnc]
auth_schemes=vencrypt,none

The noVNC proxy also needs a client key and certificate, along with the CA certificate used to sign the
VNC server certificates, so that it can authenticate itself to the Compute nodes:

[vnc]
vencrypt_client_key=/etc/pki/nova-novncproxy/client-key.pem
vencrypt_client_cert=/etc/pki/nova-novncproxy/client-cert.pem
vencrypt_ca_certs=/etc/pki/nova-novncproxy/ca-cert.pem
SPICE console
The VNC protocol is fairly limited, lacking support for multiple monitors, bi-directional audio, reliable
cut-and-paste, video streaming and more. SPICE is a new protocol that aims to address the limitations
in VNC and provide good remote desktop support.
SPICE support in OpenStack Compute shares a similar architecture to the VNC implementa-
tion. The OpenStack dashboard uses a SPICE-HTML5 widget in its console tab that com-
municates with the nova-spicehtml5proxy service by using SPICE-over-websockets. The
nova-spicehtml5proxy service communicates directly with the hypervisor process by using
SPICE.
Configuration
Important: VNC must be explicitly disabled to get access to the SPICE console. Set the vnc.
enabled option to False to disable the VNC console.
To enable the SPICE console service, you must configure both the nova-spicehtml5proxy service
and the nova-compute service. Most options are defined in the spice group.
The nova-spicehtml5proxy service accepts the following options.
• daemon
• ssl_only
• source_is_ipv6
• cert
• key
• web
• console.ssl_ciphers
• console.ssl_minimum_version
• spice.html5proxy_host
• spice.html5proxy_port
For example, to configure this via a nova-spicehtml5proxy.conf file:
[spice]
html5proxy_host = 0.0.0.0
html5proxy_port = 6082
The nova-compute service requires the following options to configure SPICE console support.
• spice.enabled
• spice.agent_enabled
• spice.html5proxy_base_url
• spice.server_listen
• spice.server_proxyclient_address
For example, to configure this via a nova.conf file:
[spice]
agent_enabled = False
enabled = True
html5proxy_base_url = http://IP_ADDRESS:6082/spice_auto.html
server_listen = 127.0.0.1
server_proxyclient_address = 127.0.0.1
Replace IP_ADDRESS with the IP address from which the proxy is accessible by the outside world.
For example, this may be the management interface IP address of the controller or the VIP.
Serial
Serial consoles provide an alternative to graphical consoles like VNC or SPICE. They work a little
differently from graphical consoles, so an example is beneficial. The example below uses these nodes:
• controller node with IP 192.168.50.100
• compute node 1 with IP 192.168.50.104
• compute node 2 with IP 192.168.50.105
Here is the general flow of actions:

[Figure: the browser/CLI/client talks to nova-api (1) and to nova-serialproxy (3) on the controller
node; nova-api asks the nova-compute service that manages the instance for a connection string (2);
nova-serialproxy then proxies the console traffic to a port in the range 10000:20000 on the compute
node hosting the instance (4).]

The nova.conf settings used by the nodes in this example are:

# controller node (192.168.50.100)
[DEFAULT]
my_ip = 192.168.50.100

[serial_console]
enabled = true
serialproxy_host = 192.168.50.100
serialproxy_port = 6083

# compute node 1 (192.168.50.104)
[DEFAULT]
my_ip = 192.168.50.104

[serial_console]
enabled = true
port_range = 10000:20000
base_url = ws://192.168.50.100:6083
proxyclient_address = 192.168.50.104

# compute node 2 (192.168.50.105)
[DEFAULT]
my_ip = 192.168.50.105

[serial_console]
enabled = true
port_range = 10000:20000
base_url = ws://192.168.50.100:6083
proxyclient_address = 192.168.50.105
1. The user requests a serial console connection string for an instance from the REST API.
2. The nova-api service asks the nova-compute service, which manages that instance, to fulfill
that request.
3. That connection string gets used by the user to connect to the nova-serialproxy service.
4. The nova-serialproxy service then proxies the console interaction to the port of the com-
pute node where the instance is running. That port gets forwarded by the hypervisor (or ironic
conductor, for ironic) to the guest.
Configuration
To enable the serial console service, you must configure both the nova-serialproxy service and
the nova-compute service. Most options are defined in the serial_console group.
The nova-serialproxy service accepts the following options.
• daemon
• ssl_only
• source_is_ipv6
• cert
• key
• web
• console.ssl_ciphers
• console.ssl_minimum_version
• serial_console.serialproxy_host
• serial_console.serialproxy_port
For example, to configure this via a nova-serialproxy.conf file:
[serial_console]
serialproxy_host = 0.0.0.0
serialproxy_port = 6083
The nova-compute service requires the following options to configure serial console support.
• serial_console.enabled
• serial_console.base_url
• serial_console.proxyclient_address
• serial_console.port_range
For example, to configure this via a nova.conf file:
[serial_console]
enabled = True
base_url = ws://IP_ADDRESS:6083/
proxyclient_address = 127.0.0.1
port_range = 10000:20000
Replace IP_ADDRESS with the IP address from which the proxy is accessible by the outside world.
For example, this may be the management interface IP address of the controller or the VIP.
There are some things to keep in mind when configuring these options:
RDP
RDP is a graphical console primarily used with Hyper-V. Nova does not provide a console proxy service
for RDP - instead, an external proxy service, such as the wsgate application provided by FreeRDP-
WebConnect, should be used.
Configuration
To enable the RDP console service, you must configure both a console proxy service like wsgate and
the nova-compute service. All options for the latter service are defined in the rdp group.
Information on configuring an RDP console proxy service, such as wsgate, is not provided here.
However, more information can be found at cloudbase.it.
The nova-compute service requires the following options to configure RDP console support.
• rdp.enabled
• rdp.html5_proxy_base_url
For example, to configure this via a nova.conf file:
[rdp]
enabled = True
html5_proxy_base_url = https://IP_ADDRESS:6083/
Replace IP_ADDRESS with the IP address from which the proxy is accessible by the outside world.
For example, this may be the management interface IP address of the controller or the VIP.
MKS
MKS is the protocol used for accessing the console of a virtual machine running on VMware vSphere.
It is very similar to VNC. Due to the architecture of the VMware vSphere hypervisor, it is not necessary
to run a console proxy service.
Configuration
To enable the MKS console service, only the nova-compute service must be configured. All options
are defined in the mks group.
The nova-compute service requires the following options to configure MKS console support.
• mks.enabled
• mks.mksproxy_base_url
For example, to configure this via a nova.conf file:
[mks]
enabled = True
mksproxy_base_url = https://127.0.0.1:6090/
About nova-consoleauth
The now-removed nova-consoleauth service was previously used to provide a shared service to
manage token authentication that the client proxies outlined below could leverage. Token authentication
was moved to the database in 18.0.0 (Rocky) and the service was removed in 20.0.0 (Train).
# This is the address where the underlying vncserver (not the proxy)
# will listen for connections.
server_listen=192.168.1.2
Note: novncproxy_base_url uses a public IP; this is the URL that is ultimately returned to
clients, which generally do not have access to your private network. Your PROXYSERVER must
be able to reach server_proxyclient_address, because that is the address over which
the VNC connection is proxied.
• Q: My noVNC does not work with recent versions of web browsers. Why?
A: Make sure you have installed python-numpy, which is required to support a newer version
of the WebSocket protocol (HyBi-07+).
• Q: How do I adjust the dimensions of the VNC window image in the OpenStack dashboard?
A: These values are hard-coded in a Django HTML template. To alter them, edit the
_detail_vnc.html template file. The location of this file varies based on Linux distribution.
On Ubuntu 14.04, the file is at /usr/share/pyshared/horizon/dashboards/nova/
instances/templates/instances/_detail_vnc.html.
Modify the width and height options, as follows:
• Q: My noVNC connections failed with ValidationError: Origin header protocol does not
match. Why?
A: Make sure the base_url matches your TLS setting. If you are using https console con-
nections, make sure that the value of novncproxy_base_url is set explicitly where the
nova-novncproxy service is running.
References
The Compute service must know the status of each compute node to effectively manage and use them.
This can include events like a user launching a new VM, the scheduler sending a request to a live node,
or a query to the ServiceGroup API to determine if a node is live.
When a compute worker running the nova-compute daemon starts, it calls the join API to join the
compute group. Any service (such as the scheduler) can query the group's membership and the status of
its nodes. Internally, the ServiceGroup client driver automatically updates the compute worker status.
By default, Compute uses the database driver to track if a node is live. In a compute worker, this driver
periodically sends a db update command to the database, saying "I'm OK" with a timestamp. Compute
uses a pre-defined timeout (service_down_time) to determine if a node is dead.
The driver has limitations, which can be problematic depending on your environment. If a lot of compute
worker nodes need to be checked, the database can be put under heavy load, which can cause the timeout
to trigger, and a live node could incorrectly be considered dead. By default, the timeout is 60 seconds.
Reducing the timeout value can help in this situation, but you must also make the database update more
frequently, which again increases the database workload.
The database contains data that is both transient (such as whether the node is alive) and persistent (such
as entries for VM owners). With the ServiceGroup abstraction, Compute can treat each type separately.
The memcache ServiceGroup driver uses memcached, a distributed memory object caching system that
is used to increase site performance. For more details, see memcached.org.
To use the memcache driver, you must install memcached. You might already have it installed, as the
same driver is also used for the OpenStack Object Storage and OpenStack dashboard. To install mem-
cached, see the Environment -> Memcached section in the Installation Tutorials and Guides depending
on your distribution.
These values in the /etc/nova/nova.conf file are required on every node for the memcache driver:
# Timeout; maximum time since last check-in for up service (integer value).
# Helps to define whether a node is dead
service_down_time = 60
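In addition, a hedged sketch of the other values typically set for the memcache driver (the server address is a placeholder):

# ServiceGroup driver to use: "db" (default) or "mc" for memcached
servicegroup_driver = "mc"

# Memcached servers used by the driver
memcached_servers = controller:11211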
If you deploy Compute with a shared file system, you can use several methods to quickly recover from
a node failure. This section discusses manual recovery.
Evacuate instances
If a hardware malfunction or other error causes the cloud compute node to fail, you can use the nova
evacuate command to evacuate instances. See evacuate instances for more information on using the
command.
Manual recovery
3. Decide to which compute host to move the affected VM. Run this database command to move the
VM to that host:
4. If you use a hypervisor that relies on libvirt, such as KVM, update the libvirt.xml file in
/var/lib/nova/instances/[instance ID] with these changes:
• Change the DHCPSERVER value to the host IP address of the new compute host.
• Update the VNC IP to 0.0.0.0.
5. Reboot the VM:
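For example (a sketch; replace the placeholder with the affected instance's ID):

$ openstack server reboot <server-uuid>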
Typically, the database update and openstack server reboot command recover a VM from a
failed host. However, if problems persist, try one of these actions:
• Use virsh to recreate the network filter configuration.
• Restart Compute services.
• Update the vm_state and power_state fields in the Compute database.
Sometimes when you run Compute with a shared file system or an automated configuration tool, files
on your compute node might use the wrong UID or GID. This UID or GID mismatch can prevent you
from running live migrations or starting virtual machines.
This procedure runs on nova-compute hosts, based on the KVM hypervisor:
1. Set the nova UID to the same number in /etc/passwd on all hosts. For example, set the UID
to 112.
Note: Choose UIDs or GIDs that are not in use for other users or groups.
2. Set the libvirt-qemu UID to the same number in the /etc/passwd file on all hosts. For
example, set the UID to 119.
3. Set the nova group to the same number in the /etc/group file on all hosts. For example, set
the group to 120.
4. Set the libvirtd group to the same number in the /etc/group file on all hosts. For example,
set the group to 119.
5. Stop the services on the compute node.
6. Change all files that the nova user or group owns. For example:
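A minimal sketch (the paths and ownership shown are illustrative; adjust them to your layout):

# chown -R nova:nova /var/lib/nova
# chown -R nova:nova /var/log/nova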
This section describes how to manage your cloud after a disaster and back up persistent storage volumes.
Backups are mandatory, even outside of disaster scenarios.
For a definition of a disaster recovery plan (DRP), see https://en.wikipedia.org/wiki/Disaster_Recovery_Plan.
A disk crash, network loss, or power failure can affect several components in your cloud architecture.
The worst disaster for a cloud is a power loss. A power loss affects these components:
• A cloud controller (nova-api, nova-conductor, nova-scheduler)
• A compute node (nova-compute)
• A storage area network (SAN) used by OpenStack Block Storage (cinder-volumes)
Before a power loss:
• Create an active iSCSI session from the SAN to the cloud controller (used for the
cinder-volumes LVMs VG).
• Create an active iSCSI session from the cloud controller to the compute node (managed by
cinder-volume).
• Create an iSCSI session for every volume (so 14 EBS volumes require 14 iSCSI sessions).
• Create iptables or ebtables rules from the cloud controller to the compute node. This
allows access from the cloud controller to the running instance.
• Save the current state of the database, the current state of the running instances, and the attached
volumes (mount point, volume ID, volume status, etc), at least from the cloud controller to the
compute node.
Begin recovery
Warning: Do not add any steps or change the order of steps in this procedure.
1. Check the current relationship between the volume and its instance, so that you can recreate the
attachment.
Use the openstack volume list command to get this information. Note that the
openstack client can get volume information from OpenStack Block Storage.
2. Update the database to clean the stalled state. Do this for every volume by using these queries:
3. Restart the instances using the openstack server reboot command.

Important: Some instances completely reboot and become reachable, while some might stop at
the plymouth stage. This is expected behavior. DO NOT reboot a second time.
Instance state at this stage depends on whether you added an /etc/fstab entry for that volume.
Images built with the cloud-init package remain in a pending state, while others skip the missing
volume and start. You perform this step to ask Compute to reboot every instance so that the
stored state is preserved. It does not matter if not all instances come up successfully. For more
information about cloud-init, see help.ubuntu.com/community/CloudInit/.
4. If required, run the openstack server add volume command to reattach the volumes to
their respective instances. This example uses a file of listed volumes to reattach them:
#!/bin/bash
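Continuing from the #!/bin/bash line above, a minimal sketch of such a script, assuming a file
volumes.txt with one "<volume-id> <server-id> <mount-point>" entry per line (the file name and
format are assumptions):

# Reattach each volume listed in volumes.txt to its instance
while read -r volume server mount_point; do
    echo "Reattaching volume ${volume} to server ${server}"
    openstack server add volume "${server}" "${volume}" --device "${mount_point}"
    sleep 2
done < volumes.txt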
Instances that were stopped at the plymouth stage now automatically continue booting and start
normally. Instances that previously started successfully can now see the volume.
5. Log in to the instances with SSH and reboot them.
If some services depend on the volume or if a volume has an entry in fstab, you can now restart
the instance. Restart directly from the instance itself and not through nova:
# shutdown -r now
When you plan for and complete a disaster recovery, follow these tips:
• Use the errors=remount option in the fstab file to prevent data corruption.
In the event of an I/O error, this option prevents writes to the disk. Add this configu-
ration option into the cinder-volume server that performs the iSCSI connection to the
SAN and into the instances' fstab files.
• Do not add the entry for the SAN's disks to the cinder-volumes fstab file.
Some systems hang on that step, which means you could lose access to your cloud
controller. To re-run the session manually, run this command before performing the
mount:
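A hedged sketch (the IQN and portal address are placeholders):

# iscsiadm -m discovery -t sendtargets -p <SAN_IP>
# iscsiadm -m node --targetname <IQN> -p <SAN_IP> --login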
• On your instances, if you have the whole /home/ directory on the disk, leave a user's directory
with the user's bash files and the authorized_keys file instead of emptying the /home/
directory and mapping the disk on it.
This action enables you to connect to the instance without the volume attached, if you allow only
connections through public keys.
To reproduce the power loss, connect to the compute node that runs that instance and close the iSCSI
session. Do not detach the volume by using the openstack server remove volume command.
You must manually close the iSCSI session. This example closes an iSCSI session with the number 15:
# iscsiadm -m session -u -r 15
Warning: There is potential for data loss while running instances during this procedure. If you are
using Liberty or earlier, ensure you have the correct patch and set the options appropriately.
OpenStack clouds run on platforms that differ greatly in the capabilities that they provide. By default,
the Compute service seeks to abstract the underlying hardware that it runs on, rather than exposing
specifics about the underlying host platforms. This abstraction manifests itself in many ways. For ex-
ample, rather than exposing the types and topologies of CPUs running on hosts, the service exposes a
number of generic CPUs (virtual CPUs, or vCPUs) and allows for overcommitting of these. In a sim-
ilar manner, rather than exposing the individual types of network devices available on hosts, generic
software-powered network ports are provided. These features are designed to allow high resource uti-
lization and allows the service to provide a generic cost-effective and highly scalable cloud upon which
to build applications.
This abstraction is beneficial for most workloads. However, there are some workloads where determin-
ism and per-instance performance are important, if not vital. In these cases, instances can be expected to
deliver near-native performance. The Compute service provides features to improve individual instance
performance for these kinds of workloads.
Important: In deployments older than Train, or in mixed Stein/Train deployments with a rolling
upgrade in progress, unless specifically enabled, live migration is not possible for instances
with a NUMA topology when using the libvirt driver. A NUMA topology may be specified explicitly or
can be added implicitly due to the use of CPU pinning or huge pages. Refer to bug #1289064 for more
information. As of Train, live migration of instances with a NUMA topology when using the libvirt
driver is fully supported.
The PCI passthrough feature in OpenStack allows full access and direct control of a physical PCI device
in guests. This mechanism is generic for any kind of PCI device, and runs with a Network Interface
Card (NIC), Graphics Processing Unit (GPU), or any other devices that can be attached to a PCI bus.
Correct driver installation is the only requirement for the guest to properly use the devices.
Some PCI devices provide Single Root I/O Virtualization and Sharing (SR-IOV) capabilities. When SR-
IOV is used, a physical device is virtualized and appears as multiple PCI devices. Virtual PCI devices
are assigned to the same or different guests. In the case of PCI passthrough, the full physical device is
assigned to only one guest and cannot be shared.
PCI devices are requested through flavor extra specs, specifically via the pci_passthrough:alias
flavor extra spec. This guide demonstrates how to enable PCI passthrough for a type of PCI device with
a vendor ID of 8086 and a product ID of 154d - an Intel X520 Network Adapter - by mapping them to
the alias a1. You should adjust the instructions for other devices with potentially different capabilities.
Note: For information on creating servers with SR-IOV network interfaces, refer to the Networking
Guide.
Limitations
• Attaching SR-IOV ports to existing servers was not supported until the 22.0.0 Victoria release.
Due to various bugs in libvirt and qemu we recommend to use at least libvirt version 6.0.0 and at
least qemu version 4.2.
• Cold migration (resize) of servers with SR-IOV devices attached was not supported until the 14.0.0
Newton release, see bug 1512800 for details.
Note: Nova only supports PCI addresses where the fields are restricted to the following maximum
value:
• domain - 0xFFFF
• bus - 0xFF
• slot - 0x1F
• function - 0x7
Nova will ignore PCI devices reported by the hypervisor if the address is outside of these ranges.
To enable PCI passthrough on an x86, Linux-based compute node, the following are required:
• VT-d enabled in the BIOS
• IOMMU enabled on the host OS, e.g. by adding the intel_iommu=on or amd_iommu=on
parameter to the kernel parameters
• Assignable PCIe devices
To enable PCI passthrough on a Hyper-V compute node, the following are required:
• Windows 10 or Windows / Hyper-V Server 2016 or newer
• VT-d enabled on the host
• Assignable PCI devices
In order to check the requirements above and if there are any assignable PCI devices, run the following
Powershell commands:
Start-BitsTransfer https://raw.githubusercontent.com/Microsoft/Virtualization-Documentation/master/hyperv-samples/benarm-powershell/DDA/survey-dda.ps1
.\survey-dda.ps1
If the compute node passes all the requirements, the desired assignable PCI devices must be disabled and
unmounted from the host in order to be assignable by Hyper-V. The following can be read for more
details: Hyper-V PCI passthrough.
Configure nova-compute
Once PCI passthrough has been configured for the host, nova-compute must be configured to allow
the PCI device to pass through to VMs. This is done using the pci.passthrough_whitelist
option. For example, assuming our sample PCI device has a PCI address of 41:00.0 on each host:
[pci]
passthrough_whitelist = { "address": "0000:41:00.0" }

Alternatively, devices can be identified by vendor and product ID:

[pci]
passthrough_whitelist = { "vendor_id": "8086", "product_id": "154d" }
If using vendor and product IDs, all PCI devices matching the vendor_id and product_id are
added to the pool of PCI devices available for passthrough to VMs.
In addition, it is necessary to configure the pci.alias option, which is a JSON-style configuration
option that allows you to map a given device type, identified by the standard PCI vendor_id and
(optional) product_id fields, to an arbitrary name or alias. This alias can then be used to request a
PCI device using the pci_passthrough:alias flavor extra spec, as discussed previously. For our
sample device with a vendor ID of 0x8086 and a product ID of 0x154d, this would be:
[pci]
alias = { "vendor_id": "8086", "product_id": "154d", "device_type": "type-PF", "name": "a1" }
It's important to note the addition of the device_type field. This is necessary because this PCI device
supports SR-IOV. The nova-compute service categorizes devices into one of three types, depending
on the capabilities the devices report:
type-PF The device supports SR-IOV and is the parent or root device.
type-VF The device is a child device of a device that supports SR-IOV.
type-PCI The device does not support SR-IOV.
By default, it is only possible to attach type-PCI devices using PCI passthrough. If you wish to attach
type-PF or type-VF devices, you must specify the device_type field in the config option. If the
device was a device that did not support SR-IOV, the device_type field could be omitted.
Refer to pci.alias for syntax information.
Important: This option must also be configured on controller nodes. This is discussed later in this
document.
Configure nova-scheduler

The nova-scheduler service must be configured to enable the PciPassthroughFilter. To do this,
add the filter to the list of enabled filters and make all filters available, for example in nova.conf:

[filter_scheduler]
enabled_filters = ...,PciPassthroughFilter
available_filters = nova.scheduler.filters.all_filters
Configure nova-api
It is necessary to also configure the pci.alias config option on the controller. This configuration
should match the configuration found on the compute nodes. For example:
[pci]
alias = { "vendor_id": "8086", "product_id": "154d", "device_type": "type-PF", "name": "a1", "numa_policy": "preferred" }
Refer to pci.alias for syntax information. Refer to Affinity for numa_policy information.
Once configured, restart the nova-api service.
Once the alias has been configured, it can be used in a flavor extra spec. For example, to request two
of the PCI devices referenced by alias a1, run:
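A sketch (assuming $FLAVOR names an existing flavor):

$ openstack flavor set $FLAVOR --property "pci_passthrough:alias"="a1:2"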
For more information about the syntax for pci_passthrough:alias, refer to the documentation.
By default, the libvirt driver enforces strict NUMA affinity for PCI devices, be they PCI passthrough
devices or neutron SR-IOV interfaces. This means that by default a PCI device must be allocated from
the same host NUMA node as at least one of the instance's CPUs. This isn't always necessary, however,
and you can configure this policy using the hw:pci_numa_affinity_policy flavor extra spec or
equivalent image metadata property. There are three possible values allowed:
required This policy means that nova will boot instances with PCI devices only if at least one of the
NUMA nodes of the instance is associated with these PCI devices. It means that if NUMA node
info for some PCI devices could not be determined, those PCI devices wouldn't be consumable by
the instance. This provides maximum performance.
socket This policy means that the PCI device must be affined to the same host socket as at least one of
the guest NUMA nodes. For example, consider a system with two sockets, each with two NUMA
nodes, numbered node 0 and node 1 on socket 0, and node 2 and node 3 on socket 1. There is
a PCI device affined to node 0. An instance with two guest NUMA nodes and the socket
policy can be affined to either:
• node 0 and node 1
• node 0 and node 2
• node 0 and node 3
• node 1 and node 2
• node 1 and node 3
The instance cannot be affined to node 2 and node 3, as neither of those are on the same socket
as the PCI device. If the other nodes are consumed by other instances and only nodes 2 and 3 are
available, the instance will not boot.
preferred This policy means that nova-scheduler will choose a compute host with minimal con-
sideration for the NUMA affinity of PCI devices. nova-compute will attempt a best ef-
fort selection of PCI devices based on NUMA affinity, however, if this is not possible then
nova-compute will fall back to scheduling on a NUMA node that is not associated with the
PCI device.
legacy This is the default policy and it describes the current nova behavior. Usually we have information
about association of PCI devices with NUMA nodes. However, some PCI devices do not provide
such information. The legacy value will mean that nova will boot instances with a PCI device if
either:
• The PCI device is associated with at least one of the NUMA nodes on which the instance will be
booted
• There is no information about PCI-NUMA affinity available
For example, to configure a flavor to use the preferred PCI NUMA affinity policy for any neutron
SR-IOV interfaces attached by the user:
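A sketch of such a flavor configuration:

$ openstack flavor set $FLAVOR --property hw:pci_numa_affinity_policy=preferred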
You can also configure this for PCI passthrough devices by specifying the policy in the alias configura-
tion via pci.alias. For more information, refer to the documentation.
CPU topologies
The NUMA topology and CPU pinning features in OpenStack provide high-level control over how
instances run on hypervisor CPUs and the topology of virtual CPUs available to instances. These features
help minimize latency and maximize performance.
Important: In deployments older than Train, or in mixed Stein/Train deployments with a rolling
upgrade in progress, unless specifically enabled, live migration is not possible for instances
with a NUMA topology when using the libvirt driver. A NUMA topology may be specified explicitly or
can be added implicitly due to the use of CPU pinning or huge pages. Refer to bug #1289064 for more
information. As of Train, live migration of instances with a NUMA topology when using the libvirt
driver is fully supported.
Symmetric multiprocessing (SMP) SMP is a design found in many modern multi-core systems. In an
SMP system, there are two or more CPUs and these CPUs are connected by some interconnect.
This provides CPUs with equal access to system resources like memory and input/output ports.
Non-uniform memory access (NUMA) NUMA is a derivative of the SMP design that is found in many
multi-socket systems. In a NUMA system, system memory is divided into cells or nodes that are
associated with particular CPUs. Requests for memory on other nodes are possible through an
interconnect bus. However, bandwidth across this shared bus is limited. As a result, competition
for this resource can incur performance penalties.
Simultaneous Multi-Threading (SMT) SMT is a design complementary to SMP. Whereas CPUs in
SMP systems share a bus and some memory, CPUs in SMT systems share many more components.
CPUs that share components are known as thread siblings. All CPUs appear as usable CPUs on
the system and can execute workloads in parallel. However, as with NUMA, threads compete for
shared resources.
Non-Uniform I/O Access (NUMA I/O) In a NUMA system, I/O to a device mapped to a local memory
region is more efficient than I/O to a remote device. A device connected to the same socket provid-
ing the CPU and memory offers lower latencies for I/O operations due to its physical proximity.
This generally manifests itself in devices connected to the PCIe bus, such as NICs or vGPUs, but
applies to any device supporting memory-mapped I/O.
In OpenStack, SMP CPUs are known as cores, NUMA cells or nodes are known as sockets, and SMT
CPUs are known as threads. For example, a quad-socket, eight core system with Hyper-Threading
would have four sockets, eight cores per socket and two threads per core, for a total of 64 CPUs.
PCPU Resource class representing an amount of dedicated CPUs for a single guest.
VCPU Resource class representing a unit of CPU resources for a single guest approximating the pro-
cessing power of a single physical processor.
Important: The functionality described below is currently only supported by the libvirt/KVM and
Hyper-V driver. The Hyper-V driver may require some host configuration for this to work.
When running workloads on NUMA hosts, it is important that the vCPUs executing processes are on the
same NUMA node as the memory used by these processes. This ensures all memory accesses are local to
the node and thus do not consume the limited cross-node memory bandwidth, adding latency to memory
accesses. Similarly, large pages are assigned from memory and benefit from the same performance
improvements as memory allocated using standard pages. Thus, they also should be local. Finally, PCI
devices are directly associated with specific NUMA nodes for the purposes of DMA. Instances that use
PCI or SR-IOV devices should be placed on the NUMA node associated with these devices.
NUMA topology can exist on both the physical hardware of the host and the virtual hardware of the
instance. In OpenStack, when booting a process, the hypervisor driver looks at the NUMA topology
field of both the instance and the host it is being booted on, and uses that information to generate an
appropriate configuration.
By default, an instance floats across all NUMA nodes on a host. NUMA awareness can be enabled
implicitly through the use of huge pages or pinned CPUs or explicitly through the use of flavor extra
specs or image metadata. If the instance has requested a specific NUMA topology, compute will try to
pin the vCPUs of different NUMA cells on the instance to the corresponding NUMA cells on the host.
It will also expose the NUMA topology of the instance to the guest OS.
In all cases where NUMA awareness is used, the NUMATopologyFilter filter must be enabled.
Details on this filter are provided in Compute schedulers.
Caution: The NUMA node(s) used are normally chosen at random. However, if a PCI passthrough
or SR-IOV device is attached to the instance, then the NUMA node that the device is associated
with will be used. This can provide important performance improvements. However, booting a large
number of similar instances can result in unbalanced NUMA node usage. Care should be taken to
mitigate this issue. See this discussion for more details.
Caution: Inadequate per-node resources will result in scheduling failures. Resources that are
specific to a node include not only CPUs and memory, but also PCI and SR-IOV resources. It is not
possible to use multiple resources from different nodes without requesting a multi-node layout. As
such, it may be necessary to ensure PCI or SR-IOV resources are associated with the same NUMA
node or force a multi-node layout.
When used, NUMA awareness allows the operating system of the instance to intelligently schedule the
workloads that it runs and minimize cross-node memory bandwidth. To configure guest NUMA nodes,
you can use the hw:numa_nodes flavor extra spec. For example, to restrict an instance's vCPUs to a
single host NUMA node, run:
$ openstack flavor set $FLAVOR --property hw:numa_nodes=1
Some workloads have very demanding requirements for memory access latency or bandwidth that ex-
ceed the memory bandwidth available from a single NUMA node. For such workloads, it is beneficial
to spread the instance across multiple host NUMA nodes, even if the instance's RAM/vCPUs could the-
oretically fit on a single NUMA node. To force an instances vCPUs to spread across two host NUMA
nodes, run:
$ openstack flavor set $FLAVOR --property hw:numa_nodes=2
The allocation of instance vCPUs and memory from different host NUMA nodes can be configured.
This allows for asymmetric allocation of vCPUs and memory, which can be important for some work-
loads. You can configure the allocation of instance vCPUs and memory across each guest NUMA node
using the hw:numa_cpus.{num} and hw:numa_mem.{num} extra specs respectively. For example,
to spread the 6 vCPUs and 6 GB of memory of an instance across two NUMA nodes and create an
asymmetric 1:2 vCPU and memory mapping between the two nodes, run:
$ openstack flavor set $FLAVOR --property hw:numa_nodes=2
# configure guest node 0
$ openstack flavor set $FLAVOR \
--property hw:numa_cpus.0=0,1 \
--property hw:numa_mem.0=2048

# configure guest node 1
$ openstack flavor set $FLAVOR \
--property hw:numa_cpus.1=2,3,4,5 \
--property hw:numa_mem.1=4096
Note: The {num} parameter is an index of guest NUMA nodes and may not correspond to host
NUMA nodes. For example, on a platform with two NUMA nodes, the scheduler may opt to place guest
NUMA node 0, as referenced in hw:numa_mem.0 on host NUMA node 1 and vice versa. Similarly,
the CPUs specified in the value for hw:numa_cpus.{num} refer to guest vCPUs and may
not correspond to host CPUs. As such, this feature cannot be used to constrain instances to specific host
CPUs or NUMA nodes.
Note: Hyper-V does not support asymmetric NUMA topologies, and the Hyper-V driver will not spawn
instances with such topologies.
For more information about the syntax for hw:numa_nodes, hw:numa_cpus.N and
hw:numa_mem.N, refer to Extra Specs.
Important: The functionality described below is currently only supported by the libvirt/KVM driver
and requires some host configuration for this to work. Hyper-V does not support CPU pinning.
Note: There is no correlation required between the NUMA topology exposed in the instance and how
the instance is actually pinned on the host. This is by design. See this invalid bug for more information.
By default, instance vCPU processes are not assigned to any particular host CPU, instead, they float
across host CPUs like any other process. This allows for features like overcommitting of CPUs. In
heavily contended systems, this provides optimal system performance at the expense of performance
and latency for individual instances.
Some workloads require real-time or near real-time behavior, which is not possible with the latency
introduced by the default CPU policy. For such workloads, it is beneficial to control which host CPUs
are bound to an instance's vCPUs. This process is known as pinning. No instance with pinned CPUs can
use the CPUs of another pinned instance, thus preventing resource contention between instances.
CPU pinning policies can be used to determine whether an instance should be pinned or not. They can
be configured using the hw:cpu_policy extra spec and equivalent image metadata property. There
are three policies: dedicated, mixed and shared (the default). The dedicated CPU policy is
used to specify that all CPUs of an instance should use pinned CPUs. To configure a flavor to use the
dedicated CPU policy, run:
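For instance, something like:

$ openstack flavor set $FLAVOR --property hw:cpu_policy=dedicated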
This works by ensuring PCPU allocations are used instead of VCPU allocations. As such, it is also
possible to request this resource type explicitly. To configure this, run:
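A sketch, assuming a flavor with two vCPUs:

$ openstack flavor set $FLAVOR --property resources:PCPU=2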
Note: It is not currently possible to request PCPU and VCPU resources in the same instance.
The shared CPU policy is used to specify that an instance should not use pinned CPUs. To configure
a flavor to use the shared CPU policy, run:
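For instance, something like:

$ openstack flavor set $FLAVOR --property hw:cpu_policy=shared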
The mixed CPU policy is used to specify that an instance use pinned CPUs along with unpinned CPUs.
The instance's pinned CPUs can be specified in the hw:cpu_dedicated_mask or, if real-time is
enabled, in the hw:cpu_realtime_mask extra spec. For example, to configure a flavor to use the
mixed CPU policy with 4 vCPUs in total and the first 2 vCPUs as pinned CPUs, run:
To configure a flavor to use the mixed CPU policy with 4 vCPUs in total and the first 2 vCPUs as
pinned real-time CPUs, run:
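Sketches of the two variants above (the flavor is assumed to already have 4 vCPUs; the exact mask syntax may vary by release):

$ openstack flavor set $FLAVOR \
--property hw:cpu_policy=mixed \
--property hw:cpu_dedicated_mask=0-1

$ openstack flavor set $FLAVOR \
--property hw:cpu_policy=mixed \
--property hw:cpu_realtime=yes \
--property hw:cpu_realtime_mask=0-1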
Note: For more information about the syntax for hw:cpu_policy, hw:cpu_dedicated_mask,
hw:cpu_realtime and hw:cpu_realtime_mask, refer to Extra Specs.
Note: For more information about real-time functionality, refer to the documentation.
It is also possible to configure the CPU policy via image metadata. This can be useful when packaging
applications that require real-time or near real-time behavior by ensuring instances created with a given
image are always pinned regardless of flavor. To configure an image to use the dedicated CPU policy,
run:
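For instance, something like (image property names follow the hw_* convention):

$ openstack image set $IMAGE --property hw_cpu_policy=dedicated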
Note: For more information about image metadata, refer to the Image metadata guide.
Important: Flavor-based policies take precedence over image-based policies. For example, if a flavor
specifies a CPU policy of dedicated then that policy will be used. If the flavor specifies a CPU
policy of shared and the image specifies no policy or a policy of shared then the shared policy
will be used. However, if the flavor specifies a CPU policy of shared and the image specifies a policy
of dedicated, or vice versa, an exception will be raised. This is by design. Image metadata is often
configurable by non-admin users, while flavors are only configurable by admins. By setting a shared
policy through flavor extra-specs, administrators can prevent users configuring CPU policies in images
and impacting resource utilization.
Important: The functionality described below requires the use of pinned instances and is therefore
currently only supported by the libvirt/KVM driver and requires some host configuration for this to
work. Hyper-V does not support CPU pinning.
When running pinned instances on SMT hosts, it may also be necessary to consider the impact that
thread siblings can have on the instance workload. The presence of an SMT implementation like Intel
Hyper-Threading can boost performance by up to 30% for some workloads. However, thread siblings
share a number of components and contention on these components can diminish performance for other
workloads. For this reason, it is also possible to explicitly request hosts with or without SMT.
To configure whether an instance should be placed on a host with SMT or not, a CPU thread policy may
be specified. For workloads where sharing benefits performance, you can request hosts with SMT. To
configure this, run:
This will ensure the instance gets scheduled to a host with SMT by requesting hosts that report the
HW_CPU_HYPERTHREADING trait. It is also possible to request this trait explicitly. To configure this,
run:
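Sketches of the two approaches above:

$ openstack flavor set $FLAVOR --property hw:cpu_thread_policy=require
$ openstack flavor set $FLAVOR --property trait:HW_CPU_HYPERTHREADING=required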
For other workloads where performance is impacted by contention for resources, you can request hosts
without SMT. To configure this, run:
This will ensure the instance gets scheduled to a host without SMT by requesting hosts that do not report
the HW_CPU_HYPERTHREADING trait. It is also possible to request this trait explicitly. To configure
this, run:
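Sketches of the two approaches above:

$ openstack flavor set $FLAVOR --property hw:cpu_thread_policy=isolate
$ openstack flavor set $FLAVOR --property trait:HW_CPU_HYPERTHREADING=forbidden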
Finally, for workloads where performance is minimally impacted, you may use thread siblings if avail-
able and fall back to not using them if necessary. This is the default, but it can be set explicitly:
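For instance, something like:

$ openstack flavor set $FLAVOR --property hw:cpu_thread_policy=prefer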
This does not utilize traits and, as such, there is no trait-based equivalent.
Note: For more information about the syntax for hw:cpu_thread_policy, refer to Extra Specs.
As with CPU policies, it is also possible to configure the CPU thread policy via image metadata. This
can be useful when packaging applications that require real-time or near real-time behavior by ensuring
instances created with a given image are always pinned regardless of flavor. To configure an image to
use the require CPU thread policy, run:
Likewise, to configure an image to use the isolate CPU thread policy, run:
Finally, to configure an image to use the prefer CPU thread policy, run:
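Sketches of the three image-metadata variants above (property names follow the hw_* convention):

$ openstack image set $IMAGE --property hw_cpu_thread_policy=require
$ openstack image set $IMAGE --property hw_cpu_thread_policy=isolate
$ openstack image set $IMAGE --property hw_cpu_thread_policy=prefer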
If the flavor does not specify a CPU thread policy then the CPU thread policy specified by the image (if
any) will be used. If both the flavor and image specify a CPU thread policy then they must specify the
same policy, otherwise an exception will be raised.
Note: For more information about image metadata, refer to the Image metadata guide.
Important: The functionality described below requires the use of pinned instances and is therefore
currently only supported by the libvirt/KVM driver and requires some host configuration for this to
work. Hyper-V does not support CPU pinning.
In addition to the work of the guest OS and applications running in an instance, there is a small amount
of overhead associated with the underlying hypervisor. By default, these overhead tasks - known collec-
tively as emulator threads - run on the same host CPUs as the instance itself and will result in a minor
performance penalty for the instance. This is not usually an issue; however, for things like real-time
instances, it may not be acceptable for emulator threads to steal time from instance CPUs.
Emulator thread policies can be used to ensure emulator threads are run on cores separate from those
used by the instance. There are two policies: isolate and share. The default is to run the emulator
threads on the same core. The isolate emulator thread policy is used to specify that emulator threads
for a given instance should be run on their own unique core, chosen from one of the host cores listed
in compute.cpu_dedicated_set. To configure a flavor to use the isolate emulator thread
policy, run:
The share policy is used to specify that emulator threads from a given instance should be run on the
pool of host cores listed in compute.cpu_shared_set if configured, else across all pCPUs of the
instance. To configure a flavor to use the share emulator thread policy, run:
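Sketches of the two policies above:

$ openstack flavor set $FLAVOR --property hw:emulator_threads_policy=isolate
$ openstack flavor set $FLAVOR --property hw:emulator_threads_policy=share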
+------------------------------------+--------------------------------------+--------------------------------------+
|                                    | compute.cpu_shared_set set           | compute.cpu_shared_set unset         |
+------------------------------------+--------------------------------------+--------------------------------------+
| hw:emulator_threads_policy         | Pinned to all of the instance's      | Pinned to all of the instance's      |
| unset (default)                    | pCPUs                                | pCPUs                                |
+------------------------------------+--------------------------------------+--------------------------------------+
| hw:emulator_threads_policy         | Pinned to compute.cpu_shared_set     | Pinned to all of the instance's      |
| = share                            |                                      | pCPUs                                |
+------------------------------------+--------------------------------------+--------------------------------------+
| hw:emulator_threads_policy         | Pinned to a single pCPU distinct     | Pinned to a single pCPU distinct     |
| = isolate                          | from the instance's pCPUs            | from the instance's pCPUs            |
+------------------------------------+--------------------------------------+--------------------------------------+
Note: For more information about the syntax for hw:emulator_threads_policy, refer to
hw:emulator_threads_policy.
Important: The functionality described below is currently only supported by the libvirt/KVM driver.
Note: Currently it also works with the libvirt/QEMU driver, but we don't recommend it for production use
cases. This is because vCPUs are actually running in one thread on the host in QEMU TCG (Tiny Code
Generator), which is the backend for the libvirt/QEMU driver. Work to enable full multi-threading support
for TCG (a.k.a. MTTCG) is ongoing in the QEMU community. Please see this MTTCG project page for
details.
In addition to configuring how an instance is scheduled on host CPUs, it is possible to configure how
CPUs are represented in the instance itself. By default, when instance NUMA placement is not specified,
a topology of N sockets, each with one core and one thread, is used for an instance, where N corresponds
to the number of instance vCPUs requested. When instance NUMA placement is specified, the number
of sockets is fixed to the number of host NUMA nodes to use and the total number of instance CPUs is
split over these sockets.
Some workloads benefit from a custom topology. For example, in some operating systems, a different
license may be needed depending on the number of CPU sockets. To configure a flavor to use two
sockets, run:
Similarly, to configure a flavor to use one core and one thread, run:
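Sketches of the two examples above:

$ openstack flavor set $FLAVOR --property hw:cpu_sockets=2
$ openstack flavor set $FLAVOR --property hw:cpu_cores=1 --property hw:cpu_threads=1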
Caution: If specifying all values, the product of sockets multiplied by cores multiplied by threads
must equal the number of instance vCPUs. If specifying any one of these values or the multiple of
two values, the values must be a factor of the number of instance vCPUs to prevent an exception. For
example, specifying hw:cpu_sockets=2 for an instance with an odd number of vCPUs fails. Similarly,
specifying hw:cpu_cores=2 and hw:cpu_threads=4 for an instance with ten vCPUs fails.
For more information about the syntax for hw:cpu_sockets, hw:cpu_cores and
hw:cpu_threads, refer to Extra Specs.
It is also possible to set upper limits on the number of sockets, cores, and threads used. Unlike the hard
values above, it is not necessary for this exact number to be used because it only provides a limit. This
can be used to provide some flexibility in scheduling, while ensuring certain limits are not exceeded.
For example, to ensure no more than two sockets, eight cores and one thread are defined in the instance
topology, run:
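Again with a placeholder flavor:
$ openstack flavor set $FLAVOR \
  --property hw:cpu_max_sockets=2 \
  --property hw:cpu_max_cores=8 \
  --property hw:cpu_max_threads=1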
For more information about the syntax for hw:cpu_max_sockets, hw:cpu_max_cores, and
hw:cpu_max_threads, refer to Extra Specs.
Applications are frequently packaged as images. For applications that prefer certain CPU topologies,
configure image metadata to hint that created instances should have a given topology regardless of flavor.
To configure an image to request a two-socket, four-core per socket topology, run:
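With $IMAGE as a placeholder for the image name or UUID:
$ openstack image set $IMAGE \
  --property hw_cpu_sockets=2 \
  --property hw_cpu_cores=4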
To constrain instances to a given limit of sockets, cores or threads, use the max_ variants. To configure
an image to have a maximum of two sockets and a maximum of one thread, run:
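Again with a placeholder image:
$ openstack image set $IMAGE \
  --property hw_cpu_max_sockets=2 \
  --property hw_cpu_max_threads=1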
The value specified in the flavor is treated as the absolute limit. The image limits are not permitted to
exceed the flavor limits; they can only be equal to or lower than what the flavor defines. By setting a
max value for sockets, cores, or threads, administrators can prevent users configuring topologies that
might, for example, incur additional licensing fees.
For more information about image metadata, refer to the Image metadata guide.
Changed in version 20.0.0: Prior to 20.0.0 (Train), it was not necessary to explicitly configure hosts for
pinned instances. However, it was not possible to place pinned instances on the same host as unpinned
instances, which typically meant hosts had to be grouped into host aggregates. If this was not done, un-
pinned instances would continue floating across all enabled host CPUs, even those that some instance
CPUs were pinned to. Starting in 20.0.0, it is necessary to explicitly identify the host cores that should
be used for pinned instances.
Nova treats host CPUs used for unpinned instances differently from those used by pinned instances. The
former are tracked in placement using the VCPU resource type and can be overallocated, while the latter
are tracked using the PCPU resource type. By default, nova will report all host CPUs as VCPU inventory,
however, this can be configured using the compute.cpu_shared_set config option, to specify
which host CPUs should be used for VCPU inventory, and the compute.cpu_dedicated_set
config option, to specify which host CPUs should be used for PCPU inventory.
Consider a compute node with a total of 24 host physical CPU cores with hyperthreading enabled.
The operator wishes to reserve 1 physical CPU core and its thread sibling for host processing (not
for guest instance use). Furthermore, the operator wishes to use 8 host physical CPU cores and their
thread siblings for dedicated guest CPU resources. The remaining 15 host physical CPU cores and their
thread siblings will be used for shared guest vCPU usage, with an 8:1 allocation ratio for those physical
processors used for shared guest CPU resources.
The operator could configure nova.conf like so:
[DEFAULT]
cpu_allocation_ratio=8.0
[compute]
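# The CPU ranges below are illustrative only; they assume logical CPUs 0 and
# 24 are the host-reserved core and its thread sibling, with the remaining
# sibling pairs numbered (1,25), (2,26) and so on. Adjust them to match the
# actual core/sibling layout of the host.
cpu_dedicated_set = 1-8,25-32
cpu_shared_set = 9-23,33-47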
The virt driver will construct a provider tree containing a single resource provider representing the
compute node and report inventory of PCPU and VCPU for this single provider accordingly:
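Given the example configuration above, the reported inventory would look roughly like the following sketch (exact totals depend on the configured CPU sets):
COMPUTE NODE resource provider:
    VCPU:
        total: 30
        allocation_ratio: 8.0
    PCPU:
        total: 16
        allocation_ratio: 1.0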
For instances using the dedicated CPU policy or an explicit PCPU resource request, PCPU inventory will
be consumed. Instances using the shared CPU policy, meanwhile, will consume VCPU inventory.
Note: PCPU and VCPU allocations are currently combined to calculate the value for the cores quota
class.
Hyper-V is configured by default to allow instances to span multiple NUMA nodes, regardless of whether
the instances have been configured to span only N NUMA nodes. This behaviour allows Hyper-V instances
to have up to 64 vCPUs and 1 TB of memory.
You can check whether NUMA spanning is enabled by running the following PowerShell command:
(Get-VMHost).NumaSpanningEnabled
To disable this behaviour, the host must be configured to disable NUMA spanning. This can be done by
running the following PowerShell commands:
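A sketch of those commands; the vmms service restart is needed for the change to take effect, as noted below:
Set-VMHost -NumaSpanningEnabled $false
Restart-Service vmms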
The Virtual Machine Management Service (vmms) is responsible for managing the Hyper-V VMs. The
VMs will still run while the service is down or restarting, but they will not be manageable by the
nova-compute service. In order for the effects of the host NUMA spanning configuration to take
effect, the VMs will have to be restarted.
Hyper-V does not allow instances with a NUMA topology to have dynamic memory allocation turned
on. The Hyper-V driver will ignore the configured dynamic_memory_ratio from the given nova.
conf file when spawning instances with a NUMA topology.
Real Time
Enabling Real-Time
Currently the creation of real-time instances is only supported when using the libvirt compute driver
with a libvirt.virt_type of kvm or qemu. It requires extensive configuration of the host and this
document provides but a rough overview of the changes required. Configuration will vary depending on
your hardware, BIOS configuration, host and guest OS and application.
BIOS configuration
Configure your host BIOS as recommended in the rt-wiki page. The most important steps are:
• Disable power management, including CPU sleep states
• Disable SMT (hyper-threading) or any option related to logical processors
These are standard steps used in benchmarking as both sets of features can result in non-deterministic
behavior.
OS configuration
This is inherently specific to the distro used, however, there are some common steps:
• Install the real-time (preemptible) kernel (PREEMPT_RT_FULL) and real-time KVM modules
• Configure hugepages
• Isolate host cores to be used for instances from the kernel
• Disable features like CPU frequency scaling (e.g. P-States on Intel processors)
RHEL and RHEL-derived distros like CentOS provide packages in their repositories to accomplish this. The
kernel-rt and kernel-rt-kvm packages will provide the real-time kernel and real-time KVM
module, respectively, while the tuned-profiles-realtime package will provide tuned profiles
to configure the host for real-time workloads. You should refer to your distro documentation for more
information.
Validation
Once your BIOS and the host OS have been configured, you can validate real-time readiness using the
hwlatdetect and rteval utilities. On RHEL and RHEL-derived hosts, you can install these using
the rt-tests package. More information about the rteval tool can be found here.
In this configuration, any non-real-time cores configured will have an implicit dedicated CPU pin-
ning policy applied. It is possible to apply a shared policy for these non-real-time cores by specifying
the mixed CPU pinning policy via the hw:cpu_policy extra spec. This can be useful to increase
resource utilization of the host. For example:
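A sketch of such a flavor, with $FLAVOR as a placeholder and an illustrative real-time mask:
$ openstack flavor set $FLAVOR \
  --property hw:cpu_policy=mixed \
  --property hw:cpu_realtime=yes \
  --property hw:cpu_realtime_mask=^0-1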
Finally, you can explicitly offload guest overhead processes to another host core using the
hw:emulator_threads_policy extra spec. For example:
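For instance, to place emulator threads on the shared host CPU pool (flavor name is a placeholder):
$ openstack flavor set $FLAVOR --property hw:emulator_threads_policy=share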
Note: Emulator thread pinning requires additional host configuration. Refer to the documentation for
more information.
In addition to configuring the instance CPUs, it is also likely that you will need to configure guest huge
pages. For information on how to configure these, refer to the huge pages documentation.
References
Huge pages
The huge page feature in OpenStack provides important performance improvements for applications that
are highly memory IO-bound.
Note: Huge pages may also be referred to as hugepages or large pages, depending on the source. These
terms are synonyms.
Pages Physical memory is segmented into a series of contiguous regions called pages. Each page
contains a number of bytes, referred to as the page size. The system retrieves memory by accessing
entire pages, rather than byte by byte.
Translation Lookaside Buffer (TLB) A TLB is used to map the virtual addresses of pages to the phys-
ical addresses in actual memory. The TLB is a cache and is not limitless, storing only the most
recent or frequently accessed pages. During normal operation, processes will sometimes attempt
to retrieve pages that are not stored in the cache. This is known as a TLB miss and results in a
delay as the processor iterates through the pages themselves to find the missing address mapping.
Huge Pages The standard page size in x86 systems is 4 kB. This is optimal for general purpose comput-
ing but larger page sizes - 2 MB and 1 GB - are also available. These larger page sizes are known
as huge pages. Huge pages result in less efficient memory usage as a process will not generally
use all memory available in each page. However, use of huge pages will result in fewer overall
pages and a reduced risk of TLB misses. For processes that have significant memory requirements
or are memory intensive, the benefits of huge pages frequently outweigh the drawbacks.
Persistent Huge Pages On Linux hosts, persistent huge pages are huge pages that are reserved upfront.
HugeTLB provides the mechanism for this upfront configuration of huge pages. It allows for the
allocation of varying quantities of different huge page sizes. Allocation
can be made at boot time or run time. Refer to the Linux hugetlbfs guide for more information.
Transparent Huge Pages (THP) On Linux hosts, transparent huge pages are huge pages that are au-
tomatically provisioned based on process requests. Transparent huge pages are provisioned on a
best effort basis, attempting to provision 2 MB huge pages if available but falling back to 4 kB
small pages if not. However, no upfront configuration is necessary. Refer to the Linux THP guide
for more information.
Important: Huge pages may not be used on a host configured for file-backed memory. See File-backed
memory for details
Persistent huge pages are required owing to their guaranteed availability. However, persistent huge
pages are not enabled by default in most environments. The steps for enabling huge pages differ from
platform to platform and only the steps for Linux hosts are described here. On Linux hosts, the number
of persistent huge pages on the host can be queried by checking /proc/meminfo:
$ grep Huge /proc/meminfo
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
In this instance, there are 0 persistent huge pages (HugePages_Total) and 0 transparent huge pages
(AnonHugePages) allocated. Huge pages can be allocated at boot time or run time. Huge pages
require a contiguous area of memory - memory that gets increasingly fragmented the longer a host is
running. Identifying contiguous areas of memory is an issue for all huge page sizes, but it is particularly
problematic for larger huge page sizes such as 1 GB huge pages. Allocating huge pages at boot time
will ensure the correct number of huge pages is always available, while allocating them at run time can
fail if memory has become too fragmented.
To allocate huge pages at boot time, the kernel boot parameters must be extended to include some
huge page-specific parameters. This can be achieved by modifying /etc/default/grub and ap-
pending the hugepagesz, hugepages, and transparent_hugepage=never arguments to
GRUB_CMDLINE_LINUX. To allocate, for example, 2048 persistent 2 MB huge pages at boot time,
run:
# echo 'GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX hugepagesz=2M hugepages=2048 transparent_hugepage=never"' >> /etc/default/grub
$ grep GRUB_CMDLINE_LINUX /etc/default/grub
GRUB_CMDLINE_LINUX="..."
GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX hugepagesz=2M hugepages=2048 transparent_hugepage=never"
Important: Persistent huge pages are not usable by standard host OS processes. Ensure enough free,
non-huge page memory is reserved for these processes.
Reboot the host, then validate that huge pages are now available:
$ grep "Huge" /proc/meminfo
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
HugePages_Total: 2048
HugePages_Free: 2048
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
There are now 2048 2 MB huge pages totalling 4 GB of huge pages. These huge pages must be mounted.
On most platforms, this happens automatically. To verify that the huge pages are mounted, run:
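One way to check is to look for a hugetlbfs entry in the mount table; exact output varies by platform:
$ mount | grep huge
hugetlbfs on /dev/hugepages type hugetlbfs (rw)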
In this instance, the huge pages are mounted at /dev/hugepages. This mount point varies from
platform to platform. If the above command did not return anything, the hugepages must be mounted
manually. To mount the huge pages at /dev/hugepages, run:
# mkdir -p /dev/hugepages
# mount -t hugetlbfs hugetlbfs /dev/hugepages
There are many more ways to configure huge pages, including allocating huge pages at run time, speci-
fying varying allocations for different huge page sizes, or allocating huge pages from memory affinitized
to different NUMA nodes. For more information on configuring huge pages on Linux hosts, refer to the
Linux hugetlbfs guide.
Important: The functionality described below is currently only supported by the libvirt/KVM driver.
Important: For performance reasons, configuring huge pages for an instance will implicitly result in
a NUMA topology being configured for the instance. Configuring a NUMA topology for an instance
requires enablement of NUMATopologyFilter. Refer to CPU topologies for more information.
By default, an instance does not use huge pages for its underlying memory. However, huge pages
can bring important or required performance improvements for some workloads. Huge pages must be
requested explicitly through the use of flavor extra specs or image metadata. To request an instance use
huge pages, you can use the hw:mem_page_size flavor extra spec:
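For example, with $FLAVOR as a placeholder for the flavor name:
$ openstack flavor set $FLAVOR --property hw:mem_page_size=large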
Different platforms offer different huge page sizes. For example: x86-based platforms offer 2 MB and
1 GB huge page sizes. Specific huge page sizes can also be requested, with or without a unit suffix.
The unit suffix must be one of: Kb(it), Kib(it), Mb(it), Mib(it), Gb(it), Gib(it), Tb(it), Tib(it), KB, KiB,
MB, MiB, GB, GiB, TB, TiB. Where a unit suffix is not provided, kilobytes are assumed. To request an
instance to use 2 MB huge pages, run one of:
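Either of the following forms should work (placeholder flavor name):
$ openstack flavor set $FLAVOR --property hw:mem_page_size=2MB
$ openstack flavor set $FLAVOR --property hw:mem_page_size=2048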
Enabling huge pages for an instance can have negative consequences for other instances by consuming
limited huge pages resources. To explicitly request an instance use small pages, run:
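Again with a placeholder flavor:
$ openstack flavor set $FLAVOR --property hw:mem_page_size=small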
Note: Explicitly requesting any page size will still result in a NUMA topology being applied to the
instance, as described earlier in this document.
Finally, to leave the decision of huge or small pages to the compute driver, run:
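For example:
$ openstack flavor set $FLAVOR --property hw:mem_page_size=any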
For more information about the syntax for hw:mem_page_size, refer to hw:mem_page_size.
Applications are frequently packaged as images. For applications that require the IO performance im-
provements that huge pages provide, configure image metadata to ensure instances always request the
specific page size regardless of flavor. To configure an image to use 1 GB huge pages, run:
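With $IMAGE as a placeholder for the image:
$ openstack image set $IMAGE --property hw_mem_page_size=1GB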
If the flavor specifies a numerical page size or a page size of small, the image is not allowed to specify a
page size; if it does, an exception will be raised. If the flavor specifies a page size of any or large,
then any page size specified in the image will be used. By setting a small page size in the flavor,
administrators can prevent users requesting huge pages in flavors and impacting resource utilization. To
configure this page size, run:
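As before, with a placeholder flavor:
$ openstack flavor set $FLAVOR --property hw:mem_page_size=small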
Note: Explicitly requesting any page size will still result in a NUMA topology being applied to the
instance, as described earlier in this document.
For more information about image metadata, refer to the Image metadata guide.
Important: The functionality described below is only supported by the libvirt/KVM driver.
The virtual GPU feature in Nova allows a deployment to provide specific GPU types for instances using
physical GPUs that can provide virtual devices.
For example, a single Intel GVT-g or an NVIDIA GRID vGPU physical Graphics Processing Unit (pGPU)
can be virtualized as multiple virtual Graphics Processing Units (vGPUs) if the hypervisor supports the
hardware driver and has the capability to create guests using those virtual devices.
This feature is highly dependent on the version of libvirt and the physical devices present on the host.
In addition, the vendor's vGPU driver software must be installed and configured on the host at the same
time.
Caveats are mentioned in the Caveats section.
To enable virtual GPUs, follow the steps below:
1. Enable GPU types on the compute hosts by editing nova.conf and setting the devices.enabled_vgpu_types option:
[devices]
enabled_vgpu_types = nvidia-35
If you want to support more than a single GPU type, you need to provide a separate configuration
section for each device. For example:
[devices]
enabled_vgpu_types = nvidia-35, nvidia-36
[vgpu_nvidia-35]
device_addresses = 0000:84:00.0,0000:85:00.0
[vgpu_nvidia-36]
device_addresses = 0000:86:00.0
where you have to define which physical GPUs are supported per GPU type.
If the same PCI address is provided for two different types, nova-compute will refuse to start and
issue a specific error in the logs.
To know which specific type(s) to mention, please refer to How to discover a GPU type.
Changed in version 21.0.0: Supporting multiple GPU types is only supported by the Ussuri release
and later versions.
2. Restart the nova-compute service.
Warning: Changing the type is possible but, since existing physical GPUs can't address mul-
tiple guests having different types, that will make Nova return a NoValidHost error if
instances with the original type still exist. Accordingly, it's highly recommended to instead
deploy the new type to new compute nodes that don't already have workloads and rebuild
instances on the nodes that need to change types.
Note: As of the Queens release, all hypervisors that support virtual GPUs only accept a single virtual
GPU per instance.
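A flavor requesting a single vGPU can be configured with the resources:VGPU extra spec; the flavor name vgpu_1, used here and referenced later in this section, is illustrative:
$ openstack flavor set vgpu_1 --property resources:VGPU=1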
The enabled vGPU types on the compute hosts are not exposed to API users. Flavors configured for
vGPU support can be tied to host aggregates as a means to properly schedule those flavors onto the
compute hosts that support them. See Host aggregates for more information.
The nova-scheduler selects a destination host that has vGPU devices available by calling the Place-
ment API for a specific VGPU resource class provided by compute nodes.
Note: As of the Queens release, only the FilterScheduler scheduler driver uses the Placement API.
Virtual GPUs are seen as mediated devices. Physical PCI devices (the graphics card here) supporting
virtual GPUs propose mediated device (mdev) types. Since mediated devices are supported by the
Linux kernel through sysfs files after installing the vendor's vGPU driver software, you can see
the required properties as follows:
$ ls /sys/class/mdev_bus/*/mdev_supported_types
/sys/class/mdev_bus/0000:84:00.0/mdev_supported_types:
nvidia-35 nvidia-36 nvidia-37 nvidia-38 nvidia-39 nvidia-40 nvidia-41 nvidia-42 nvidia-43 nvidia-44 nvidia-45
/sys/class/mdev_bus/0000:85:00.0/mdev_supported_types:
nvidia-35 nvidia-36 nvidia-37 nvidia-38 nvidia-39 nvidia-40 nvidia-41 nvidia-42 nvidia-43 nvidia-44 nvidia-45
/sys/class/mdev_bus/0000:86:00.0/mdev_supported_types:
nvidia-35 nvidia-36 nvidia-37 nvidia-38 nvidia-39 nvidia-40 nvidia-41 nvidia-42 nvidia-43 nvidia-44 nvidia-45
/sys/class/mdev_bus/0000:87:00.0/mdev_supported_types:
nvidia-35 nvidia-36 nvidia-37 nvidia-38 nvidia-39 nvidia-40 nvidia-41 nvidia-42 nvidia-43 nvidia-44 nvidia-45
Note: The information below is only valid from the 19.0.0 Stein release. Before this release, inventories
and allocations related to the VGPU resource class were on the root resource provider related to the
compute node. If upgrading from Rocky and using the libvirt driver, VGPU inventory and allocations are
moved to child resource providers that represent actual physical GPUs.
The examples you will see are using the osc-placement plugin for OpenStackClient. For details on
specific commands, see its documentation.
Here you can see a VGPU inventory on the child resource provider, while other resource class
inventories remain on the root resource provider related to the compute node.
+--------------------------------------+-------+--------+---------------------------------------------------------+--------------------------+--------+
| ID                                   | Name  | Status | Networks                                                | Image                    | Flavor |
+--------------------------------------+-------+--------+---------------------------------------------------------+--------------------------+--------+
| 5294f726-33d5-472a-bef1-9e19bb41626d | vgpu2 | ACTIVE | private=10.0.0.14, fd45:cdad:c431:0:f816:3eff:fe78:a748 | cirros-0.4.0-x86_64-disk | vgpu   |
| a6811fc2-cec8-4f1d-baea-e2c6339a9697 | vgpu1 | ACTIVE | private=10.0.0.34, fd45:cdad:c431:0:f816:3eff:fe54:cc8f | cirros-0.4.0-x86_64-disk | vgpu   |
+--------------------------------------+-------+--------+---------------------------------------------------------+--------------------------+--------+
In this example, two servers were created using a flavor asking for 1 VGPU, so when
looking at the allocations for each consumer UUID (which is the server UUID), you
can see that the VGPU allocation is against the child resource provider while other alloca-
tions are for the root resource provider. This means that the virtual GPU used by
each server is provided by the physical GPU that the corresponding child resource provider represents.
Since operators may want to support different GPU types per compute node, flavors can ask for a
specific GPU type. This is now possible using custom traits, by decorating the child Resource Providers
that correspond to physical GPUs.
Note: Possible improvements in a future release could consist of providing automatic tagging of Re-
source Providers with standard traits corresponding to versioned mapping of public GPU types. For the
moment, this has to be done manually.
In this case, the trait CUSTOM_NVIDIA_11 will be added to the Resource Provider with the
UUID e2f8607b-0683-4141-a8af-f5e20682e28c that corresponds to the PCI address
0000:85:00.0 as shown above.
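For reference, a trait of this kind can be created and applied with osc-placement commands along these lines; note that resource provider trait set replaces the full set of traits on the provider, so include any existing traits as well:
$ openstack --os-placement-api-version 1.6 trait create CUSTOM_NVIDIA_11
$ openstack --os-placement-api-version 1.6 resource provider trait set \
    --trait CUSTOM_NVIDIA_11 e2f8607b-0683-4141-a8af-f5e20682e28c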
4. Amend the flavor to add a requested trait
In this example, we add the CUSTOM_NVIDIA_11 trait as a required trait for the
vgpu_1 flavor we created earlier.
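A command of this form adds the required trait:
$ openstack flavor set vgpu_1 --property trait:CUSTOM_NVIDIA_11=required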
This will allow the Placement service to only return the Resource Providers matching this trait, so
only the GPUs that were decorated with the trait will be checked for this flavor.
Caveats
Note: This information is correct as of the 17.0.0 Queens release. Where improvements have been
made or issues fixed, they are noted per item.
• Suspending a guest that has vGPUs doesn't yet work because of a libvirt limitation (it can't hot-
unplug mediated devices from a guest). Workarounds using other instance actions (like snapshot-
ting the instance or shelving it) are recommended until libvirt gains mdev hot-unplug support. If a
user attempts to suspend the instance, the libvirt driver will raise an exception that will cause the
instance to be set back to ACTIVE. The suspend action in the os-instance-actions API
will have an Error state.
• Resizing an instance with a new flavor that has vGPU resources doesn't allocate those vGPUs to
the instance (the instance is created without vGPU resources). The proposed workaround is to
rebuild the instance after resizing it. The rebuild operation allocates vGPUs to the instance.
Changed in version 21.0.0: This has been resolved in the Ussuri release. See bug 1778563.
• Cold migrating an instance to another host will have the same problem as resize. If you want to
migrate an instance, make sure to rebuild it after the migration.
Changed in version 21.0.0: This has been resolved in the Ussuri release. See bug 1778563.
• Rescue images do not use vGPUs. An instance being rescued does not keep its vGPUs during
rescue. During that time, another instance can receive those vGPUs. This is a known issue. The
recommended workaround is to rebuild an instance immediately after rescue. However, rebuilding
the rescued instance only helps if there are other free vGPUs on the host.
Changed in version 18.0.0: This has been resolved in the Rocky release. See bug 1762688.
For nested vGPUs:
Note: This information is correct as of the 21.0.0 Ussuri release. Where improvements have been made
or issues fixed, they are noted per item.
• If creating servers with a flavor asking for vGPUs and the user wants multi-create (i.e. say max
2), then the scheduler may return a NoValidHost exception if the total requested capacity cannot
be satisfied by a single physical GPU, even though each physical GPU could support at least one
of the instances. (See bug 1874664.)
For example, when creating servers with a flavor asking for vGPUs, if two child resource providers have
4 vGPU inventories each:
– You can ask for a flavor with 2 vGPUs with max 2.
– But you can't ask for a flavor with 4 vGPUs and max 2.
File-backed memory
Important: As of the 18.0.0 Rocky release, the functionality described below is only supported by the
libvirt/KVM driver.
The file-backed memory feature in OpenStack allows a Nova node to serve guest memory from a file
backing store. This mechanism uses the libvirt file memory source, causing guest instance memory to
be allocated as files within the libvirt memory backing directory.
Since instance performance will be related to the speed of the backing store, this feature works best
when used with very fast block devices or virtual file systems - such as flash or RAM devices.
When configured, nova-compute will report the capacity configured for file-backed memory to place-
ment in place of the total system memory capacity. This allows the node to run more instances than
would normally fit within system memory.
When available in libvirt and qemu, instance memory will be discarded by qemu at shutdown by calling
madvise(MADV_REMOVE), to avoid flushing any dirty memory to the backing store on exit.
To enable file-backed memory, follow the steps below:
1. Configure the backing store
2. Configure Nova Compute for file-backed memory
Important: It is not possible to live migrate from a node running a version of OpenStack that does
not support file-backed memory to a node with file backed memory enabled. It is recommended that all
Nova compute nodes are upgraded to Rocky before enabling file-backed memory.
Libvirt File-backed memory requires libvirt version 4.0.0 or newer. Discard capability requires libvirt
version 4.4.0 or newer.
Qemu File-backed memory requires qemu version 2.6.0 or newer. Discard capability requires qemu
version 2.10.0 or newer.
Memory overcommit File-backed memory is not compatible with memory overcommit.
ram_allocation_ratio must be set to 1.0 in nova.conf, and the host must not
be added to a host aggregate with ram_allocation_ratio set to anything but 1.0.
Reserved memory When configured, file-backed memory is reported as total system memory to place-
ment, with RAM used as cache. Reserved memory corresponds to disk space not set aside for
file-backed memory. reserved_host_memory_mb should be set to 0 in nova.conf.
Huge pages File-backed memory is not compatible with huge pages. Instances with huge pages con-
figured will not start on a host with file-backed memory enabled. It is recommended to use host
aggregates to ensure instances configured for huge pages are not placed on hosts with file-backed
memory configured.
Handling these limitations could be optimized with a scheduler filter in the future.
Note: /dev/sdb and the ext4 filesystem are used here as an example. This will differ between
environments.
Note: /var/lib/libvirt/qemu/ram is the default location. The value can be set via
memory_backing_dir in /etc/libvirt/qemu.conf, and the mountpoint must match the
value configured there.
# mkfs.ext4 /dev/sdb
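Mount the filesystem at the libvirt memory backing directory; the device and the default directory noted above are examples only:
# mount /dev/sdb /var/lib/libvirt/qemu/ram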
[libvirt]
file_backed_memory=1048576
Starting from microversion 2.72, nova supports creating servers with neutron ports that have a resource
request, visible as the admin-only port attribute resource_request. For example, a neutron port has
a resource request if it has a QoS minimum bandwidth rule attached.
The Quality of Service (QoS): Guaranteed Bandwidth document describes how to configure neutron to
use this feature.
Resource allocation
Nova collects and combines the resource request from each port in a boot request and sends one allo-
cation candidate request to placement during scheduling so placement will make sure that the resource
request of the ports are fulfilled. At the end of the scheduling nova allocates one candidate in placement.
Therefore, the requested resources for each port from a single boot request will be allocated under the
server's allocation in placement.
Nova represents the resource request of each neutron port as a separate Granular Resource Request
group when querying placement for allocation candidates. When a server create request includes more
than one port with resource requests, then more than one group will be used in the allocation candidate
query. In this case, placement requires the group_policy to be defined. Today this is only possible via the
group_policy key of the flavor extra_spec. The possible values are isolate and none.
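For example, with $FLAVOR as a placeholder for the flavor name:
$ openstack flavor set $FLAVOR --property group_policy=isolate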
When the policy is set to isolate then each request group and therefore the resource request of
each neutron port will be fulfilled from separate resource providers. In case of neutron ports with
vnic_type=direct or vnic_type=macvtap this means that each port will use a virtual function
from different physical functions.
When the policy is set to none then the resource request of the neutron ports can be fulfilled
from overlapping resource providers. In case of neutron ports with vnic_type=direct or
vnic_type=macvtap this means the ports may use virtual functions from the same physical func-
tion.
For neutron ports with vnic_type=normal the group policy defines the collocation policy on OVS
bridge level so group_policy=none is a reasonable default value in this case.
If the group_policy is missing from the flavor then the server create request will fail with No valid
host was found and a warning describing the missing policy will be logged.
Dependencies
Note: NVDIMM support is present in the Linux kernel v4.0 or newer. It is recommended to use kernel
version 4.2 or later, since NVDIMM support is enabled by default there. Some bugs were encountered in
older versions; all verification work with OpenStack was done on kernel 4.18, so version 4.18 and newer
is most likely to work as expected.
$ ndctl list -X
[
{
...
"size":6440353792,
...
"name":"ns0",
...
},
{
...
"size":6440353792,
...
"name":"ns1",
...
},
{
...
"size":6440353792,
...
"name":"ns2",
...
},
{
...
"size":32210157568,
...
"name":"ns3",
...
}
]
[libvirt]
# pmem_namespaces=$LABEL:$NSNAME[|$NSNAME][,$LABEL:$NSNAME[|$NSNAME]]
pmem_namespaces = 6GB:ns0|ns1|ns2,LARGE:ns3
Configured PMEM namespaces must have already been created on the host as described above.
The conf syntax allows the admin to associate one or more namespace $NSNAMEs with an arbi-
trary $LABEL that can subsequently be used in a flavor to request one of those namespaces. It is
recommended, but not required, for namespaces under a single $LABEL to be the same size.
3. Restart the nova-compute service.
Nova will invoke ndctl to identify the configured PMEM namespaces, and report vPMEM re-
sources to placement.
Configure a flavor
Note: If a NUMA topology is specified, all vPMEM devices will be put on guest NUMA node 0;
otherwise nova will generate one NUMA node automatically for the guest.
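A flavor requests vPMEM namespaces through the corresponding resource classes; for example, a sketch of the my_flavor_large flavor described below:
$ openstack flavor set my_flavor_large \
  --property resources:CUSTOM_PMEM_NAMESPACE_6GB=1 \
  --property resources:CUSTOM_PMEM_NAMESPACE_LARGE=1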
Based on the above examples, an openstack server create request with my_flavor_large
will spawn an instance with two vPMEMs. One, corresponding to the LARGE label, will be ns3; the
other, corresponding to the 6GB label, will be arbitrarily chosen from ns0, ns1, or ns2.
Note: When resizing an instance with vPMEMs, the vPMEM data won't be migrated.
Note: Inventories and allocations related to vPMEM resource classes are on the root resource provider
related to the compute node.
Here you can see the vPMEM resource classes prefixed with CUSTOM_PMEM_NAMESPACE_.
The LARGE label was configured with one namespace (ns3), so it has an inventory of
1. Since the 6GB label was configured with three namespaces (ns0, ns1, and ns2), the
CUSTOM_PMEM_NAMESPACE_6GB inventory has a total and max_unit of 3.
3. Check allocations for each server that is using vPMEMs
+--------------------------------------+----------------------+--------+-------------------+---------------+-----------------+
| ID                                   | Name                 | Status | Networks          | Image         | Flavor          |
+--------------------------------------+----------------------+--------+-------------------+---------------+-----------------+
| 41d3e139-de5c-40fd-9d82-016b72f2ba1d | server-with-2-vpmems | ACTIVE | private=10.0.0.24 | ubuntu-bionic | my_flavor_large |
| a616a7f6-b285-4adf-a885-dd8426dd9e6a | server-with-1-vpmem  | ACTIVE | private=10.0.0.13 | ubuntu-bionic | my_flavor       |
+--------------------------------------+----------------------+--------+-------------------+---------------+-----------------+
The allocations shown for each server list the vPMEM resource classes it consumes; for server-with-2-vpmems this includes u'CUSTOM_PMEM_NAMESPACE_LARGE': 1.
Enabling vTPM
The following are required on each compute host wishing to support the vTPM feature:
• Currently vTPM is only supported when using the libvirt compute driver with a libvirt.
virt_type of kvm or qemu.
• A key manager service, such as barbican, must be configured to store secrets used to encrypt the
virtual device files at rest.
• The swtpm binary and associated libraries.
• Set the libvirt.swtpm_enabled config option to True. This will enable support for both
TPM version 1.2 and 2.0.
With the above requirements satisfied, verify vTPM support by inspecting the traits on the compute
node's resource provider:
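For example, using osc-placement with a placeholder provider UUID; the traits listed indicate which TPM versions are supported:
$ openstack --os-placement-api-version 1.6 resource provider trait list $RP_UUID | grep TPM
| COMPUTE_SECURITY_TPM_1_2 |
| COMPUTE_SECURITY_TPM_2_0 |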
A vTPM can be requested on a server via flavor extra specs or image metadata properties. There are two
versions supported - 1.2 and 2.0 - and two models - TPM Interface Specification (TIS) and Command-
Response Buffer (CRB). The CRB model is only supported with version 2.0.
For example, to configure a flavor to use the TPM 2.0 with the CRB model:
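With $FLAVOR as a placeholder:
$ openstack flavor set $FLAVOR \
  --property hw:tpm_version=2.0 \
  --property hw:tpm_model=tpm-crb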
Scheduling will fail if flavor and image supply conflicting values, or if model tpm-crb is requested
with version 1.2.
Upon successful boot, the server should see a TPM device such as /dev/tpm0 which can be used in
the same manner as a hardware TPM.
Limitations
• Only server operations performed by the server owner are supported, as the user's credentials are
required to unlock the virtual device files on the host. Thus the admin may need to decide whether
to grant the user additional policy roles; if not, those operations are effectively disabled.
• Live migration, evacuation, shelving and rescuing of servers with vTPMs is not currently sup-
ported.
Security
With a hardware TPM, the root of trust is a secret known only to the TPM user. In contrast, an emulated
TPM comprises a file on disk which the libvirt daemon must be able to present to the guest. At rest,
this file is encrypted using a passphrase stored in a key manager service. The passphrase in the key
manager is associated with the credentials of the owner of the server (the user who initially created it).
The passphrase is retrieved and used by libvirt to unlock the emulated TPM data any time the server is
booted.
Although the above mechanism uses a libvirt secret that is both private (it can't be displayed via the
libvirt API or virsh) and ephemeral (it exists only in memory, never on disk), it is theoretically
possible for a sufficiently privileged user to retrieve the secret and/or vTPM data from memory.
A full analysis and discussion of security issues related to emulated TPM is beyond the scope of this
document.
References
UEFI
Enabling UEFI
Currently the configuration of UEFI guest bootloaders is only supported when using the libvirt compute
driver with a libvirt.virt_type of kvm or qemu or when using the Hyper-V compute driver
with certain machine types. When using the libvirt compute driver with AArch64-based guests, UEFI is
automatically enabled as AArch64 does not support BIOS.
Todo: Update this once compute drivers start reporting a trait indicating UEFI bootloader support.
Libvirt
UEFI support is enabled by default on AArch64-based guests. For other guest architectures, you can
request UEFI support with libvirt by setting the hw_firmware_type image property to uefi. For
example:
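With $IMAGE as a placeholder for the image:
$ openstack image set $IMAGE --property hw_firmware_type=uefi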
Hyper-V
It is not possible to explicitly request UEFI support with Hyper-V. Rather, it is enabled implicitly when
using Generation 2 guests. You can request a Generation 2 guest by setting the hw_machine_type
image metadata property to hyperv-gen2. For example:
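Again with a placeholder image:
$ openstack image set $IMAGE --property hw_machine_type=hyperv-gen2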
References
Secure Boot
Currently the configuration of UEFI guest bootloaders is only supported when using the libvirt compute
driver with a libvirt.virt_type of kvm or qemu or when using the Hyper-V compute driver with
certain machine types. In both cases, it requires the guests also be configured with a UEFI bootloader.
With these requirements satisfied, you can verify UEFI Secure Boot support by inspecting the traits on
the compute node's resource provider:
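For example, using osc-placement with a placeholder provider UUID; the trait on the next line indicates support:
$ openstack --os-placement-api-version 1.6 resource provider trait list $RP_UUID | grep SECURE_BOOT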
| COMPUTE_SECURITY_UEFI_SECURE_BOOT |
Configuring UEFI Secure Boot for guests varies depending on the compute driver in use. In all cases,
a UEFI guest bootloader must be configured for the guest but there are also additional requirements
depending on the compute driver in use.
Libvirt
As the name would suggest, UEFI Secure Boot requires that a UEFI bootloader be configured for guests.
When this is done, UEFI Secure Boot support can be configured using the os:secure_boot extra
spec or equivalent image metadata property. For example, to configure an image that meets both of these
requirements:
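With $IMAGE as a placeholder:
$ openstack image set $IMAGE \
  --property hw_firmware_type=uefi \
  --property os_secure_boot=required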
Note: On x86_64 hosts, enabling secure boot also requires configuring use of the Q35 machine type.
This can be configured on a per-guest basis using the hw_machine_type image metadata property or
automatically for all guests created on a host using the libvirt.hw_machine_type config option.
It is also possible to explicitly request that secure boot be disabled. This is the default behavior, so this
request is typically useful when an admin wishes to explicitly prevent a user requesting secure boot by
uploading their own image with relevant image properties. For example, to disable secure boot via the
flavor:
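For example, with a placeholder flavor:
$ openstack flavor set $FLAVOR --property os:secure_boot=disabled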
Finally, it is possible to request that secure boot be enabled if the host supports it. This is only possible
via the image metadata property. When this is requested, secure boot will only be enabled if the host
supports this feature and the other constraints, namely that a UEFI guest bootloader is configured, are
met. For example:
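Using a placeholder image:
$ openstack image set $IMAGE --property os_secure_boot=optional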
Note: If both the image metadata property and flavor extra spec are provided, they must match. If they
do not, an error will be raised.
Hyper-V
Like libvirt, configuring a guest for UEFI Secure Boot support also requires that it be configured with
a UEFI bootloader: As noted in UEFI, it is not possible to do this explicitly in Hyper-V. Rather, you
should configure the guest to use the Generation 2 machine type. In addition to this, the Hyper-V
compute driver also requires that the OS type be configured.
When both of these constraints are met, you can configure UEFI Secure Boot support using the
os:secure_boot extra spec or equivalent image metadata property. For example, to configure an
image that meets all the above requirements:
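A sketch of such an image; the image name and the os_type value are illustrative:
$ openstack image set $IMAGE \
  --property hw_machine_type=hyperv-gen2 \
  --property os_type=windows \
  --property os_secure_boot=required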
As with the libvirt driver, it is also possible to request that secure boot be disabled. This is the default
behavior, so this is typically useful when an admin wishes to explicitly prevent a user requesting secure
boot. For example, to disable secure boot via the flavor:
However, unlike the libvirt driver, the Hyper-V driver does not respect the optional value for the
image metadata property. If this is configured, it will be silently ignored.
References
• Allow Secure Boot (SB) for QEMU- and KVM-based guests (spec)
• Securing Secure Boot with System Management Mode
• Generation 2 virtual machine security settings for Hyper-V
Enabling SEV
First the operator will need to ensure the following prerequisites are met:
• Currently SEV is only supported when using the libvirt compute driver with a libvirt.
virt_type of kvm or qemu.
• At least one of the Nova compute hosts must be AMD hardware capable of supporting SEV. It
is entirely possible for the compute plane to be a mix of hardware which can and cannot support
SEV, although as per the section on Permanent limitations below, the maximum number of simul-
taneously running guests with SEV will be limited by the quantity and quality of SEV-capable
hardware available.
In order for users to be able to use SEV, the operator will need to perform the following steps:
• Ensure that sufficient memory is reserved on the SEV compute hosts for host-level services to
function correctly at all times. This is particularly important when hosting SEV-enabled guests,
since they pin pages in RAM, preventing any memory overcommit which may be in normal oper-
ation on other compute hosts.
It is recommended to achieve this by configuring an rlimit at the /machine.slice top-level
cgroup on the host, with all VMs placed inside that. (For extreme detail, see this discussion on
the spec.)
An alternative approach is to configure the reserved_host_memory_mb option in the
[DEFAULT] section of nova.conf, based on the expected maximum number of SEV guests
simultaneously running on the host, and the details provided in an earlier version of the AMD
SEV spec regarding memory region sizes, which cover how to calculate it correctly.
See the Memory Locking and Accounting section of the AMD SEV spec and previous discussion
for further details.
• A cloud administrator will need to define one or more SEV-enabled flavors as described below,
unless it is sufficient for users to define SEV-enabled images.
Additionally the cloud operator should consider the following optional steps:
• Configure the libvirt.num_memory_encrypted_guests option in nova.conf to rep-
resent the number of guests an SEV compute node can host concurrently with memory encrypted
at the hardware level. For example:
[libvirt]
num_memory_encrypted_guests = 15
This option exists because on AMD SEV-capable hardware, the memory controller has a fixed
number of slots for holding encryption keys, one per guest. For example, at the time of writing,
earlier generations of hardware only have 15 slots, thereby limiting the number of SEV guests
which can be run concurrently to 15. Nova needs to track how many slots are available and used
in order to avoid attempting to exceed that limit in the hardware.
At the time of writing (September 2019), work is in progress to allow QEMU and libvirt to expose
the number of slots available on SEV hardware; however until this is finished and released, it will
not be possible for Nova to programmatically detect the correct value.
So this configuration option serves as a stop-gap, allowing the cloud operator the option of pro-
viding this value manually. It may later be demoted to a fallback value for cases where the limit
cannot be detected programmatically, or even removed altogether when Nova's minimum QEMU
version guarantees that it can always be detected.
Note: When deciding whether to use the default of None or manually impose a limit, operators
should carefully weigh the benefits vs. the risk. The benefits of using the default are a) immediate
convenience since nothing needs to be done now, and b) convenience later when upgrading com-
pute hosts to future versions of Nova, since again nothing will need to be done for the correct limit
to be automatically imposed. However the risk is that until auto-detection is implemented, users
may be able to attempt to launch guests with encrypted memory on hosts which have already
reached the maximum number of guests simultaneously running with encrypted memory. This
risk may be mitigated by other limitations which operators can impose, for example if the small-
est RAM footprint of any flavor imposes a maximum number of simultaneously running guests
which is less than or equal to the SEV limit.
• Configure the libvirt.hw_machine_type option on all SEV-capable compute hosts to include x86_64=q35, so that all x86_64 images use the q35 machine type by default.
Caution: Consider carefully whether to set this option. It is particularly important since
a limitation of the implementation prevents the user from receiving an error message with a
helpful explanation if they try to boot an SEV guest when neither this configuration option nor
the image property are set to select a q35 machine type.
On the other hand, setting it to q35 may have other undesirable side-effects on other images
which were expecting to be booted with pc, so it is suggested to set it on a single compute
node or aggregate, and perform careful testing of typical images before rolling out the setting
to all SEV-capable compute hosts.
Once an operator has covered the above steps, users can launch SEV instances either by requesting a
flavor for which the operator set the hw:mem_encryption extra spec to True, or by using an image
with the hw_mem_encryption property set to True. For example, to enable SEV for a flavor:
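With $FLAVOR as a placeholder:
$ openstack flavor set $FLAVOR --property hw:mem_encryption=True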
These do not inherently cause a preference for SEV-capable hardware, but for now SEV is the only way
of fulfilling the requirement for memory encryption. However in the future, support for other hardware-
level guest memory encryption technology such as Intel MKTME may be added. If a guest specifically
needs to be booted using SEV rather than any other memory encryption technology, it is possible to
ensure this by setting the trait{group}:HW_CPU_X86_AMD_SEV extra spec or equivalent image
metadata property to required.
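For example, a flavor could require the trait like this (flavor name is a placeholder):
$ openstack flavor set $FLAVOR --property trait:HW_CPU_X86_AMD_SEV=required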
In all cases, SEV instances can only be booted from images which have the hw_firmware_type
property set to uefi, and only when the machine type is set to q35. This can be set per image by setting
the image property hw_machine_type=q35, or per compute node by the operator via libvirt.
hw_machine_type as explained above.
Limitations
Impermanent limitations
The following limitations may be removed in the future as the hardware, firmware, and various layers of
software receive new features:
• SEV-encrypted VMs cannot yet be live-migrated or suspended, therefore they will need to be fully
shut down before migrating off an SEV host, e.g. if maintenance is required on the host.
• SEV-encrypted VMs cannot contain directly accessible host devices (PCI passthrough). So for
example mdev vGPU support will not currently work. However technologies based on vhost-user
should work fine.
• The boot disk of SEV-encrypted VMs can only be virtio. (virtio-blk is typically the
default for libvirt disks on x86, but can also be explicitly set e.g. via the image property
hw_disk_bus=virtio). Valid alternatives for the disk include using hw_disk_bus=scsi
with hw_scsi_model=virtio-scsi, or hw_disk_bus=sata.
• QEMU and libvirt cannot yet expose the number of slots available for encrypted guests in the
memory controller on SEV hardware. Until this is implemented, it is not possible for Nova to
programmatically detect the correct value. As a short-term workaround, operators can optionally
manually specify the upper limit of SEV guests for each compute host, via the new libvirt.
num_memory_encrypted_guests configuration option described above.
Permanent limitations
Non-limitations
For the sake of eliminating any doubt, the following actions are not expected to be limited when SEV
encryption is used:
• Cold migration or shelve, since they power off the VM before the operation at which point there
is no encrypted memory (although this could change since there is work underway to add support
for PMEM)
• Snapshot, since it only snapshots the disk
• nova evacuate (despite the name, more akin to resurrection than evacuation), since this is
only initiated when the VM is no longer running
• Attaching any volumes, as long as they do not require attaching via an IDE bus
• Use of spice / VNC / serial / RDP consoles
• VM guest virtual NUMA (a.k.a. vNUMA)
References
In order to facilitate management of resource provider information in the Placement API, Nova provides
a method for admins to add custom inventory and traits to resource providers using YAML files.
Note: Only CUSTOM_* resource classes and traits may be managed this way.
Placing Files
Nova-compute will search for *.yaml files in the path specified in compute.
provider_config_location. These files will be loaded and validated for errors on nova-compute
startup. If there are any errors in the files, nova-compute will fail to start up.
Administrators should ensure that provider config files have appropriate permissions and ownership. See
the specification and admin guide for more details.
Note: The files are loaded once at nova-compute startup and any changes or new files will not be
recognized until the next nova-compute startup.
Examples
Resource providers to target can be identified by either UUID or name. In addition, the value
$COMPUTE_NODE can be used in the UUID field to identify all nodes managed by the service.
If an entry does not include any additional inventory or traits, it will be logged at load time but otherwise
ignored. In the case of a resource provider being identified by both $COMPUTE_NODE and individual
UUID/name, the values in the $COMPUTE_NODE entry will be ignored for that provider only if the
explicit entry includes inventory or traits.
Note: In the case that a resource provider is identified more than once by explicit UUID/name, the nova-
compute service will fail to start. This is a global requirement across all supplied provider.yaml
files.
meta:
schema_version: '1.0'
providers:
- identification:
name: 'EXAMPLE_RESOURCE_PROVIDER'
# Additional valid identification examples:
# uuid: '$COMPUTE_NODE'
# uuid: '5213b75d-9260-42a6-b236-f39b0fd10561'
inventories:
additional:
- CUSTOM_EXAMPLE_RESOURCE_CLASS:
total: 100
reserved: 0
min_unit: 1
max_unit: 10
step_size: 1
allocation_ratio: 1.0
traits:
additional:
- 'CUSTOM_EXAMPLE_TRAIT'
Schema Example
type: object
properties:
# This property is used to track where the provider.yaml file originated.
# It is reserved for internal use and should never be set in a provider.yaml
# file supplied by an end user.
__source_file:
not: {}
meta:
type: object
properties:
# Version ($Major, $minor) of the schema must successfully parse
# documents conforming to ($Major, 0..N). Any breaking schema change
# (e.g. removing fields, adding new required fields, imposing a stricter
# pattern on a value, etc.) must bump $Major.
schema_version:
type: string
pattern: '^1\.([0-9]|[1-9][0-9]+)$'
required:
- schema_version
additionalProperties: true
providers:
type: array
items:
type: object
properties:
identification:
$ref: '#/provider_definitions/provider_identification'
inventories:
$ref: '#/provider_definitions/provider_inventories'
traits:
$ref: '#/provider_definitions/provider_traits'
provider_definitions:
provider_identification:
# Identify a single provider to configure. Exactly one identification
# method should be used. Currently `uuid` or `name` are supported, but
# future versions may support others.
# The uuid can be set to the sentinel value `$COMPUTE_NODE` which will
# cause the consuming compute service to apply the configuration to
# to all compute node root providers it manages that are not otherwise
# specified using a uuid or name.
type: object
properties:
uuid:
oneOf:
# TODO(sean-k-mooney): replace this with type uuid when we can depend
# on a version of the jsonschema lib that implements draft 8 or later
# of the jsonschema spec.
- type: string
pattern: '^[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}$'
- type: string
const: '$COMPUTE_NODE'
name:
type: string
minLength: 1
# This introduces the possibility of an unsupported key name being used to
# get by schema validation, but is necessary to support forward
# compatibility with new identification methods. This should be checked
# after schema validation.
minProperties: 1
maxProperties: 1
additionalProperties: false
provider_inventories:
# Allows the admin to specify various adjectives to create and manage
# providers' inventories. This list of adjectives can be extended in the
# future as the schema evolves to meet new use cases. As of v1.0, only one
# adjective, `additional`, is supported.
type: object
properties:
additional:
type: array
items:
patternProperties:
additionalProperties: false
# This ensures only keys matching the pattern above are allowed
additionalProperties: false
additionalProperties: true
provider_traits:
# Allows the admin to specify various adjectives to create and manage
# providers' traits. This list of adjectives can be extended in the
# future as the schema evolves to meet new use cases. As of v1.0, only one
# adjective, `additional`, is supported.
type: object
properties:
additional:
type: array
items:
# Allows any value matching the trait pattern here, additional
# validation will be done after schema validation.
type: string
pattern: '^[A-Z0-9_]{1,255}$'
additionalProperties: true
Note: When creating a provider.yaml config file it is recommended to use the schema provided
by nova to validate the config using a simple jsonschema validator rather than starting the nova compute
agent to enable faster iteration.
Resource Limits
Nova supports configuring limits on individual resources including CPU, memory, disk and network.
These limits can be used to enforce basic Quality-of-Service (QoS) policies on such resources.
Note: Hypervisor-enforced resource limits are distinct from API-enforced user and project quotas. For
information on the latter, refer to Manage quotas.
Warning: This feature is poorly tested and poorly maintained. It may no longer work as expected.
Where possible, consider using the QoS policies provided by other services, such as Cinder and
Neutron.
Resource quota enforcement support is specific to the virt driver in use on compute hosts.
libvirt
The libvirt driver supports CPU, disk and VIF limits. Unfortunately all of these work quite differently,
as discussed below.
CPU limits
Libvirt enforces CPU limits in terms of shares and quotas, configured via quota:cpu_shares and
quota:cpu_period / quota:cpu_quota, respectively. Both are implemented using the cgroups
v1 cpu controller.
CPU shares are a proportional weighted share of total CPU resources relative to other instances. It does
not limit CPU usage if CPUs are not busy. There is no unit and the value is purely relative to other
instances, so an instance configured with a value of 2048 will get twice as much CPU time as a VM
configured with the value 1024. For example, to configure a CPU share of 1024 for a flavor:
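With $FLAVOR as a placeholder:
$ openstack flavor set $FLAVOR --property quota:cpu_shares=1024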
The CPU quotas require both a period and quota. The CPU period specifies the enforcement interval in
microseconds, while the CPU quota specifies the maximum allowed bandwidth in microseconds that
each vCPU of the instance can consume. The CPU period must be in the range 1000 (1 ms) to 1,000,000
(1 s) or 0 (disabled). The CPU quota must be in the range 1000 (1 ms) to 2^64 or 0 (disabled). Where
the CPU quota exceeds the CPU period, the guest vCPU process is able to consume multiple
pCPUs worth of bandwidth. For example, to limit each guest vCPU to 1 pCPU worth of runtime per
period:
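A sketch using an illustrative period of 100,000 microseconds; with quota equal to period, each vCPU can consume at most one pCPU worth of time:
$ openstack flavor set $FLAVOR \
  --property quota:cpu_period=100000 \
  --property quota:cpu_quota=100000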
Finally, to limit each guest vCPU to 0.5 pCPUs worth of runtime per period:
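Using the same illustrative period:
$ openstack flavor set $FLAVOR \
  --property quota:cpu_period=100000 \
  --property quota:cpu_quota=50000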
Note: Smaller periods will ensure a consistent latency response at the expense of burst capacity.
CPU shares and CPU quotas can work hand-in-hand. For example, if two instances were configured with
quota:cpu_shares=1024 and quota:cpu_period=100000 (100mS) for both, then configuring
both with a quota:cpu_quota=75000 (75mS) will result in them sharing a host CPU equally, with
both getting exactly 50mS of CPU time. If instead only one instance gets quota:cpu_quota=75000
(75mS) while the other gets quota:cpu_quota=25000 (25mS), then the first will get 3/4 of the time
per period.
Memory limits
The libvirt driver does not support memory limits.
Disk I/O limits
Libvirt enforces disk limits through maximum disk read, write and total bytes per sec-
ond, using the quota:disk_read_bytes_sec, quota:disk_write_bytes_sec
and quota:disk_total_bytes_sec extra specs, respectively. It can also en-
force disk limits through maximum disk read, write and total I/O operations per sec-
ond, using the quota:disk_read_iops_sec, quota:disk_write_iops_sec and
quota:disk_total_iops_sec extra specs, respectively. For example, to set a maximum
disk write of 10 MB/sec for a flavor:
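For example, with $FLAVOR as a placeholder (10 MB/sec expressed in bytes):
$ openstack flavor set $FLAVOR \
  --property quota:disk_write_bytes_sec=10485760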
Network bandwidth limits
Warning: These limits are enforced via libvirt and will only work where the network is connected
to the instance using a tap interface. It will not work for things like SR-IOV VFs. Neutron's QoS
policies should be preferred wherever possible.
Libvirt enforces network bandwidth limits through inbound and outbound average rates, using the quota:vif_inbound_average and quota:vif_outbound_average extra specs, respectively. In addition, optional peak values, which specify the maximum rate (kB/s) at which a bridge can send data, and burst values, which specify the amount of data (kilobytes) that can be burst at peak speed, can be specified for both inbound and outbound traffic, using the quota:vif_inbound_peak / quota:vif_outbound_peak and quota:vif_inbound_burst / quota:vif_outbound_burst extra specs, respectively. For example, to configure outbound traffic to an average of 262 Mbit/s (32768 kB/s), a peak of 524 Mbit/s, and a burst of 65536 kilobytes:
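A sketch of the corresponding command; 524 Mbit/s corresponds to 65536 kB/s and $FLAVOR is a placeholder for the flavor name:
$ openstack flavor set $FLAVOR \
    --property quota:vif_outbound_average=32768 \
    --property quota:vif_outbound_peak=65536 \
    --property quota:vif_outbound_burst=65536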
Note: The speed limit values in the above example are specified in kilobytes/second, while the burst value is in kilobytes.
VMWare
In contrast to libvirt, the VMWare virt driver enforces resource limits using consistent terminol-
ogy, specifically through relative allocation levels, hard upper limits and minimum reservations
configured via, for example, the quota:cpu_shares_level / quota:cpu_shares_share,
quota:cpu_limit, and quota:cpu_reservation extra specs, respectively.
Allocation levels can be specified using one of high, normal, low, or custom. When custom is
specified, the number of shares must be specified using e.g. quota:cpu_shares_share. There
is no unit and the values are relative to other instances on the host. The upper limits and reservations, by comparison, are measured in resource-specific units, such as MHz for CPUs, and ensure that the instance never uses more than, or receives less than, the specified amount of the resource.
CPU limits
To configure a minimum CPU allocation of 1024 MHz and a maximum of 2048 MHz:
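A sketch using the openstack client, with $FLAVOR as a placeholder flavor name:
$ openstack flavor set $FLAVOR \
    --property quota:cpu_reservation=1024 \
    --property quota:cpu_limit=2048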
Memory limits
To configure a minimum memory allocation of 1024 MB and a maximum of 2048 MB:
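A sketch in the same style, assuming the quota:memory_reservation and quota:memory_limit extra specs (values in MB; $FLAVOR is a placeholder):
$ openstack flavor set $FLAVOR \
    --property quota:memory_reservation=1024 \
    --property quota:memory_limit=2048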
Disk I/O limits
To configure a minimum disk I/O allocation of 1024 and a maximum of 2048:
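A sketch assuming the quota:disk_io_reservation and quota:disk_io_limit extra specs ($FLAVOR is a placeholder):
$ openstack flavor set $FLAVOR \
    --property quota:disk_io_reservation=1024 \
    --property quota:disk_io_limit=2048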
Network bandwidth limits
To configure a minimum bandwidth allocation of 1024 Mbits/sec and a maximum of 2048 Mbits/sec:
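A sketch assuming the quota:vif_reservation and quota:vif_limit extra specs ($FLAVOR is a placeholder):
$ openstack flavor set $FLAVOR \
    --property quota:vif_reservation=1024 \
    --property quota:vif_limit=2048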
Hyper-V
CPU limits
Memory limits
Disk limits
Hyper-V enforces disk limits through maximum total bytes and total I/O operations per second, using the quota:disk_total_bytes_sec and quota:disk_total_iops_sec extra specs, respectively. For example, to set a maximum disk read/write of 10 MB/sec for a flavor:
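A sketch of the corresponding command; 10 MB/sec is 10485760 bytes per second and $FLAVOR is a placeholder for the flavor name:
$ openstack flavor set $FLAVOR \
    --property quota:disk_total_bytes_sec=10485760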
Host aggregates
Host aggregates are a mechanism for partitioning hosts in an OpenStack cloud, or a region of an OpenStack cloud, based on arbitrary characteristics. An administrator may want to do this, for example, where a group of hosts has additional hardware or particular performance characteristics.
Host aggregates started out as a way to use Xen hypervisor resource pools, but have been generalized to
provide a mechanism to allow administrators to assign key-value pairs to groups of machines. Each node
can have multiple aggregates, each aggregate can have multiple key-value pairs, and the same key-value
pair can be assigned to multiple aggregates. This information can be used in the scheduler to enable
advanced scheduling, to set up Xen hypervisor resource pools or to define logical groups for migration.
Host aggregates are not explicitly exposed to users. Instead administrators map flavors to host ag-
gregates. Administrators do this by setting metadata on a host aggregate, and matching flavor extra
specifications. The scheduler then endeavors to match user requests for instances of the given flavor
to a host aggregate with the same key-value pair in its metadata. Compute nodes can be in more than
one host aggregate. Weight multipliers can be controlled on a per-aggregate basis by setting the desired
xxx_weight_multiplier aggregate metadata.
Administrators are able to optionally expose a host aggregate as an Availability Zone. Availability zones
are different from host aggregates in that they are explicitly exposed to the user, and hosts can only be
in a single availability zone. Administrators can configure a default availability zone where instances
will be scheduled when the user fails to specify one. For more information on how to do this, refer to
Availability Zones.
One common use case for host aggregates is when you want to support scheduling instances to a subset
of compute hosts because they have a specific capability. For example, you may want to allow users to
request compute hosts that have SSD drives if they need access to faster disk I/O, or access to compute
hosts that have GPU cards to take advantage of GPU-accelerated code.
To configure the scheduler to support host aggregates, the
filter_scheduler.enabled_filters configuration option must contain the
AggregateInstanceExtraSpecsFilter in addition to the other filters used by the scheduler.
Add the following line to nova.conf on the host that runs the nova-scheduler service to enable
host aggregates filtering, as well as the other filters that are typically enabled:
[filter_scheduler]
enabled_filters=...,AggregateInstanceExtraSpecsFilter
This example configures the Compute service to enable users to request nodes that have solid-state
drives (SSDs). You create a fast-io host aggregate in the nova availability zone and you add the
ssd=true key-value pair to the aggregate. Then, you add the node1 and node2 compute nodes to it.
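A sketch of these steps with the openstack client; the aggregate, zone, and host names follow the example above:
$ openstack aggregate create --zone nova fast-io
$ openstack aggregate set --property ssd=true fast-io
$ openstack aggregate add host fast-io node1
$ openstack aggregate add host fast-io node2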
Use the openstack flavor create command to create a flavor called ssd.large with an ID of 6, 8 GB of RAM, an 80 GB root disk, and 4 vCPUs.
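A sketch of the corresponding command (RAM is given in MB and disk in GB):
$ openstack flavor create --id 6 --ram 8192 --disk 80 --vcpus 4 ssd.large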
Once the flavor is created, specify one or more key-value pairs that match the key-value pairs on
the host aggregates with scope aggregate_instance_extra_specs. In this case, that is the
aggregate_instance_extra_specs:ssd=true key-value pair. Setting a key-value pair on a
flavor is done using the openstack flavor set command.
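A sketch of the corresponding command:
$ openstack flavor set \
    --property aggregate_instance_extra_specs:ssd=true ssd.large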
Once it is set, you should see the extra_specs property of the ssd.large flavor populated with the aggregate_instance_extra_specs:ssd key and a corresponding value of true.
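This can be checked with the openstack flavor show command, for example:
$ openstack flavor show ssd.large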
Now, when a user requests an instance with the ssd.large flavor, the scheduler only considers hosts
with the ssd=true key-value pair. In this example, these are node1 and node2.
Aggregates in Placement
Aggregates also exist in placement and are not the same thing as host aggregates in nova. These aggre-
gates are defined (purely) as groupings of related resource providers. Since compute nodes in nova are
represented in placement as resource providers, they can be added to a placement aggregate as well. For
example, get the UUID of the compute node using openstack hypervisor list and add it to
an aggregate in placement using openstack resource provider aggregate set.
$ openstack --os-compute-api-version=2.53 hypervisor list
+--------------------------------------+---------------------+-----------------+---------------+-------+
| ID                                   | Hypervisor Hostname | Hypervisor Type | Host IP       | State |
+--------------------------------------+---------------------+-----------------+---------------+-------+
| 815a5634-86fb-4e1e-8824-8a631fee3e06 | node1               | QEMU            | 192.168.1.123 | up    |
+--------------------------------------+---------------------+-----------------+---------------+-------+
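A sketch of adding that resource provider to a placement aggregate, where $AGGREGATE_UUID is a placeholder for the placement aggregate UUID:
$ openstack --os-placement-api-version=1.2 resource provider aggregate set \
    --aggregate $AGGREGATE_UUID 815a5634-86fb-4e1e-8824-8a631fee3e06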
Some scheduling filter operations can be performed by placement for increased speed and efficiency.
Note: The nova-api service attempts (as of nova 18.0.0) to automatically mirror the association of
a compute host with an aggregate when an administrator adds or removes a host to/from a nova host
aggregate. This should alleviate the need to manually create those association records in the placement
API using the openstack resource provider aggregate set CLI invocation.
In order to use placement to isolate tenants, there must be placement aggregates that match the mem-
bership and UUID of nova host aggregates that you want to use for isolation. The same key pattern
in aggregate metadata used by the AggregateMultiTenancyIsolation filter controls this function, and is
enabled by setting scheduler.limit_tenants_to_placement_aggregate to True.
$ openstack --os-compute-api-version=2.53 aggregate create myagg
+-------------------+--------------------------------------+
| Field             | Value                                |
+-------------------+--------------------------------------+
| availability_zone | None                                 |
| created_at        | 2018-03-29T16:22:23.175884           |
| deleted           | False                                |
| deleted_at        | None                                 |
| id                | 4                                    |
| ...               | ...                                  |
+-------------------+--------------------------------------+
Note that the filter_tenant_id metadata key can be optionally suffixed with any string for multi-
ple tenants, such as filter_tenant_id3=$tenantid.
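A sketch of setting the isolation key on the aggregate created above, where $TENANT_ID is a placeholder for the tenant (project) ID:
$ openstack aggregate set --property filter_tenant_id=$TENANT_ID myagg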
Usage
Much of the configuration of host aggregates is driven from the API or command-line clients. For
example, to create a new aggregate and add hosts to it using the openstack client, run:
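A sketch, using placeholder aggregate and host names:
$ openstack aggregate create my-aggregate
$ openstack aggregate add host my-aggregate my-host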
To list all aggregates and show information about a specific aggregate, run:
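A sketch of the corresponding commands, again with a placeholder aggregate name:
$ openstack aggregate list
$ openstack aggregate show my-aggregate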
Configuration
In addition to CRUD operations enabled by the API and clients, the following configuration options can
be used to configure how host aggregates and the related availability zones feature operate under the
hood:
• default_schedule_zone
• scheduler.limit_tenants_to_placement_aggregate
• cinder.cross_az_attach
Finally, as discussed previously, there are a number of host aggregate-specific scheduler filters. These
are:
• AggregateImagePropertiesIsolation
• AggregateInstanceExtraSpecsFilter
• AggregateIoOpsFilter
• AggregateMultiTenancyIsolation
• AggregateNumInstancesFilter
• AggregateTypeAffinityFilter
The following configuration options are applicable to the scheduler configuration:
• cpu_allocation_ratio
• ram_allocation_ratio
• filter_scheduler.max_instances_per_host
• filter_scheduler.aggregate_image_properties_isolation_separator
• filter_scheduler.aggregate_image_properties_isolation_namespace
Image Caching
Aggregates can be used as a way to target multiple compute nodes for the purpose of requesting that
images be pre-cached for performance reasons.
Note: Some of the virt drivers provide image caching support, which improves performance of second-
and-later boots of the same image by keeping the base image in an on-disk cache. This avoids the need to
re-download the image from Glance, which reduces network utilization and time-to-boot latency. Image
pre-caching is the act of priming that cache with images ahead of time to improve performance of the
first boot.
Assuming an aggregate called my-aggregate where two images should be pre-cached, running the
following command will initiate the request:
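A sketch of the request using the nova client, with placeholder image names:
$ nova aggregate-cache-images my-aggregate image1 image2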
Note that image pre-caching happens asynchronously in a best-effort manner. The images and aggre-
gate provided are checked by the server when the command is run, but the compute nodes are not
checked to see if they support image caching until the process runs. Progress and results are logged
by each compute, and the process sends aggregate.cache_images.start, aggregate.cache_images.progress, and aggregate.cache_images.end notifications, which may be useful for monitoring the operation externally.
References
System architecture