Pick guest architecture based on host arch in libvirt driver support
This is split 2 of 3 for the architecture emulation feature.
This implements emulated multi-architecture support through qemu
within OpenStack Nova.
Adds a config variable check that pulls the host architecture into
the hw_architecture field so emulation checks can be made.
Adds a custom function that simply checks whether the
hw_emulation_architecture field is set, allowing core code to
function as normal while enabling emulated architectures to follow
the same path as the multi-arch support already established for
physical nodes, but leveraging qemu, which provides the overall
emulation.
Added a check in the domain xml unit test to strip arch from the os
tag, as it is not required for UEFI checks and is only leveraged for
emulation checks.
Added additional test cases in test_driver validating emulation
functionality by checking hw_emulation_architecture against the
os_arch/hw_architecture field. Added the required os-traits and
settings for the scheduler request_filter.
Added RISCV64 to architecture enum for better support in driver.
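As a purely illustrative sketch (helper and field names mirror this
description, not the actual Nova code), the emulation check described
above amounts to:

```python
def get_guest_arch(image_props, host_arch):
    """Pick the architecture the guest should run as.

    If the hw_emulation_architecture image property is set, the guest
    follows the same multi-arch path already established for physical
    nodes but is emulated via qemu; otherwise the host architecture is
    used as before. (Hypothetical helper for illustration only.)
    """
    emu_arch = image_props.get("hw_emulation_architecture")
    if emu_arch:
        return emu_arch
    return host_arch
```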
Implements: blueprint pick-guest-arch-based-on-host-arch-in-libvirt-driver
Closes-Bug: #1863728
Change-Id: Ia070a29186c6123cf51e1b17373c2dc69676ae7c
Signed-off-by: Jonathan Race <jrace@augusta.edu>
This adds image property show and image property set commands to
nova-manage to allow users to update image properties stored for an
instance in system metadata without having to rebuild the instance.
This is intended to ease migration to new machine types, as updating
the machine type could potentially invalidate the existing image
properties of an instance.
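A rough sketch of the storage scheme being manipulated, assuming image
properties live in the instance's system metadata under an "image_"
key prefix (the helpers are hypothetical, not the nova-manage
implementation):

```python
def set_image_property(system_metadata, name, value):
    # Instance image properties are kept in system metadata; an
    # "image_" key prefix is assumed here for illustration.
    system_metadata["image_" + name] = value

def show_image_property(system_metadata, name):
    # Return the stored property value, or None if unset.
    return system_metadata.get("image_" + name)
```

Updating, say, hw_machine_type this way avoids rebuilding the instance
just to change its machine type.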
Co-Authored-By: melanie witt <melwittt@gmail.com>
Blueprint: libvirt-device-bus-model-update
Change-Id: Ic8783053778cf4614742186e94059d5675121db1
Previously, the definition of live_migration_downtime did not explain
whether any exception or timeout occurs when a migration exceeds the
value. The value is only used as a reference by nova: if any problem
happens while the VM is paused, there will be no abort or
force-complete.
Closes-Bug: #1960345
Signed-off-by: Pedro Almeida <pedro.monteiroazevedodemouraalmeida@windriver.com>
Change-Id: I336481d1801a367b5628fedcd2aa5f5cf763355a
While most of the SR-IOV related documentation resides in the Neutron
repository which is going to have a separate section on the topic of
supporting remote-managed ports and off-path networking backends, there
are still some things specific to Nova which are worth documenting in
Nova docs.
https://docs.openstack.org/neutron/latest/admin/config-sriov.html
Implements: blueprint integration-with-off-path-network-backends
Change-Id: I3c5fe8ec0539e10d07b1b4888e9833bc7ede1d04
This was eventually added in Yoga, not Xena.
Change-Id: I8afe755732c95d023b7c4bd99964507f54d324f1
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
This was actually three documents in one:
- An admin doc detailing how to configure and use notifications
- A contributor doc describing how to extend the versioned notifications
- A reference doc listing available versioned notifications
Split the doc up to reflect this
Change-Id: I880f1c77387efcc3c1e147323b224e10156e0a52
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Mostly copy-paste from the spec, but at least this is in-tree and
updatable.
Change-Id: I4cad2111065fbc1840d44fc9f4bf6ac585e18db6
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
'Cells' below refers to NUMA cells.
By default, an instance's first cell is placed on the host's cell with
id 0, so that cell is exhausted first. Then the host's cell with id 1
is used and exhausted, and so on. This leads to an error when placing
an instance whose NUMA topology has as many cells as the host, if
instances with a one-cell topology were previously placed on cell 0.
The fix performs several sorts to put the least-used cells at the
beginning of the host_cells list, based on PCI device, memory, and
CPU usage, when packing_host_numa_cells_allocation_strategy is set to
False (the so-called 'spread' strategy), or tries to place all of a
VM's cells on the same host cell until it is completely exhausted and
only then starts to use the next available host cell (the so-called
'pack' strategy), when packing_host_numa_cells_allocation_strategy is
set to True.
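The two orderings can be sketched as follows (a simplified model, not
the actual fix; the cell fields and option handling are illustrative):

```python
def order_host_cells(host_cells, pack):
    """Order candidate host NUMA cells per the allocation strategy.

    Sort keys follow the commit description: PCI device usage first,
    then memory, then CPU usage. With pack=False ('spread'), the
    least-used cells come first; with pack=True ('pack'), the most-used
    cells come first so one cell is exhausted before the next is used.
    """
    def usage(cell):
        return (cell["pci_used"], cell["mem_used_mb"], cell["cpus_used"])
    return sorted(host_cells, key=usage, reverse=pack)
```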
Partial-Bug: #1940668
Change-Id: I03c4db3c36a780aac19841b750ff59acd3572ec6
Based on review feedback on [1] and [2].
[1] If39db50fd8b109a5a13dec70f8030f3663555065
[2] I518bb5d586b159b4796fb6139351ba423bc19639
Change-Id: I44920f20213462a3abe743ccd38b356d6490a7b4
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
When suspending a VM in OpenStack, Nova detaches all the mediated
devices from the guest machine, but does not reattach them on the resume
operation. This patch makes Nova reattach the mdevs that were detached
when the guest was suspended.
This behavior is due to libvirt not supporting the hot-unplug of
mediated devices at the time the feature was being developed. The
limitation has been lifted since then, and now we have to amend the
resume function so it will reattach the mediated devices that were
detached on suspension.
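A minimal model of the suspend/resume bookkeeping described above
(purely illustrative; the real driver manipulates libvirt guest XML,
not Python lists):

```python
class GuestSketch:
    """Toy guest tracking which mediated devices are attached."""

    def __init__(self, mdevs):
        self.attached_mdevs = list(mdevs)
        self._suspended_mdevs = []

    def suspend(self):
        # Detach all mdevs before saving the guest, remembering them
        # so resume can restore the original device set.
        self._suspended_mdevs = self.attached_mdevs
        self.attached_mdevs = []

    def resume(self):
        # Reattach the mdevs that were detached on suspension; this is
        # the step the fix adds to the resume path.
        self.attached_mdevs = self._suspended_mdevs
        self._suspended_mdevs = []
```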
Closes-Bug: #1948705
Signed-off-by: Gustavo Santos <gustavofaganello.santos@windriver.com>
Change-Id: I083929f36d9e78bf7713a87cae6d581e0d946867
As with the cells v2 docs before this, we have a number of architecture
focused documents in tree. The 'user/architecture' guide is relatively
up-to-date but is quite shallow, while the 'admin/arch' guide is
in-depth but almost a decade out-of-date, with references to things
like nova's in-built block storage service. Replace most of the latter
with more up-to-date information, and then merge the former into it,
before renaming the file to 'admin/architecture'.
Change-Id: I518bb5d586b159b4796fb6139351ba423bc19639
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
We currently have three cells v2 documents in-tree:
- A 'user/cellsv2-layout' document that details the structure or
architecture of a cells v2 deployment (which is to say, any modern
nova deployment)
- A 'user/cells' document, which is written from a pre-cells v2
viewpoint and details the changes that cells v2 *will* require and the
benefits it *would* bring. It also includes steps for upgrading from
pre-cells v2 (that is, pre-Pike) deployment or a deployment with cells
v1 (which we removed in Train and probably broke long before)
- An 'admin/cells' document, which doesn't contain much other than some
advice for handling down cells
Clearly there's a lot of cruft to be cleared out as well as some
centralization of information that's possible. As such, we combine all
of these documents into one document, 'admin/cells'. This is chosen over
'user/cells' since cells are not an end-user-facing feature. References
to cells v1 and details on upgrading from pre-cells v2 deployments are
mostly dropped, as are some duplicated installation/configuration steps.
Formatting is fixed and Sphinx-isms are used to cross-reference config
options where possible. Finally, redirects are added so that people can
continue to find the relevant resources. The result is (hopefully) a
one stop shop for all things cells v2-related that operators can use to
configure and understand their deployments.
Change-Id: If39db50fd8b109a5a13dec70f8030f3663555065
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
A recent customer call highlighted some misunderstandings about the two
weighers in the nova tree. Firstly, the basis for the metrics used by
the 'IoOpsWeigher' was not well explained and required some spelunking
through the code to understand. Secondly, the 'BuildFailureWeigher'
multiplier, configured by '[scheduler] build_failure_weight_multiplier',
defaults to a very large value for reasons that are not apparent unless
you read the commit logs for that weigher (hint: it's because we wanted
to preserve the behavior of the older filter-based approach to handling
nodes with build failures). Expand the documentation to fill both gaps.
In the process, we also correct some small nits with this doc, mostly
centered around whitespace.
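For intuition, the BuildFailureWeigher's effect can be sketched as
below (simplified; the real weigher also normalizes weights across
hosts, and 1000000.0 is the documented default multiplier):

```python
def build_failure_weight(num_failed_builds, multiplier=1000000.0):
    # A very large default multiplier effectively pushes any host with
    # recent build failures to the bottom of the weighed list, which
    # preserves the behavior of the older filter-based approach
    # without hard-excluding the host.
    return -multiplier * num_failed_builds
```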
Change-Id: If2d329b86808bdc70619fbe057dd25a938eb79da
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
The 'nova-manage placement audit' tool has functionality that can
delete orphaned allocations in placement. Add a section for it in the
doc for troubleshooting orphaned allocations.
Change-Id: I697de57cf7eb43c0993af2b1f5b3f5c4395ef097
This adds some basic documentation for the above command and also
includes some very generic osc commands to use when checking volume
attachments.
Blueprint: nova-manage-refresh-connection-info
Change-Id: Ib3d680654fe0809c9e8341dffd3a63ab02945a38
This patch adjusts the nova documentation about the extended port
resource request support in nova as the neutron API extension did not
land in Xena.
Change-Id: I3b961426745084bdb4a6d04468f5a3c762be4cfa
blueprint: qos-minimum-guaranteed-packet-rate
The interface attach and detach logic is now fully adapted to the new
extended resource request format, and supports more than one request
group in a single port.
blueprint: qos-minimum-guaranteed-packet-rate
Change-Id: I73e6acf5adfffa9203efa3374671ec18f4ea79eb
Nova re-generates the resource request of an instance for each server
move operation (migrate, resize, evacuate, live-migrate, unshelve) to
find (or validate) a target host for the instance move. This patch
extends this logic to support the extended resource request from
neutron.
As the changed neutron interface code is called from the nova-compute
service during port binding, the compute service version is bumped,
and a check is added to the compute API to reject move operations
with ports having an extended resource request if there are old
computes in the cluster.
blueprint: qos-minimum-guaranteed-packet-rate
Change-Id: Ibcf703e254e720b9a6de17527325758676628d48
This adds the final missing pieces to support creating servers with
ports having an extended resource request. As the changed neutron
interface code is called from the nova-compute service during port
binding, the compute service version is bumped, and a check is added
to the compute API to reject such server create requests if there are
old computes in the cluster.
Note that some of the negative and SRIOV-related interface attach
tests also start to pass, as they do not depend on any
interface-attach-specific implementation. Interface attach itself is
still broken here, as the failing positive tests show.
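The old-compute guard described here can be sketched as follows (the
exception, field names, and version numbers are assumptions for
illustration, not Nova's actual service-version constants):

```python
class OldComputesError(Exception):
    """Raised when the cluster cannot honor an extended request."""

def validate_extended_resource_request(ports, oldest_service_version,
                                       required_version):
    # A port with more than one resource request group needs the new
    # extended format; reject if any compute is too old to handle it.
    needs_support = any(
        len(p.get("resource_request_groups", [])) > 1 for p in ports)
    if needs_support and oldest_service_version < required_version:
        raise OldComputesError(
            "ports with extended resource request require all computes "
            "to be upgraded")
```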
blueprint: qos-minimum-guaranteed-packet-rate
Change-Id: I9060cc9cb9e0d5de641ade78c5fd7e1cc77ade46
Take the opportunity to clean up the docs quite a bit, ultimately
combining two disparate guides on the scheduler into one.
Change-Id: Ia72d39b4774d93793b381359b554c717dc9a6994
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
To prepare for the unlikely event that Neutron merges, and an operator
enables, the port-resource-request-groups neutron API extension before
nova adds support for it, this patch rejects server creation if such
an extension is enabled in Neutron. Enabling that extension has zero
benefits without nova support, hence the harsh but simple rejection.
A subsequent patch will reject server lifecycle operations in a more
sophisticated way and as soon as we support some operations, like
boot, the deployer might rightfully choose to enable the Neutron
extension.
Change-Id: I2c55d9da13a570efbc1c862116cea31aaa6aa02e
blueprint: qos-minimum-guaranteed-packet-rate
Alembic does lots of things differently. Provide docs for how to use
it. We also improve the upgrade docs slightly, removing references to
ancient reviews that are no longer really helpful, as well as calling
out our N -> N+1 constraint.
Change-Id: I3760b82ce3bd71aa0a760d7137d69dfa3f29dc1d
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Nested allocations are only partially supported in nova-manage placement
heal_allocations CLI. This patch documents the missing support and
blocks healing instances with VGPU or Cyborg device profile request in
the embedded flavor. Blocking is needed because if --force is used
with such instances, the tool could recreate an allocation ignoring
some of these resources.
Change-Id: I89ac90d2ea8bc268940869dbbc90352bfad5c0de
Related-Bug: #1939020
As a prerequisite for blueprint generic-mdevs we need to rename the
existing enabled_vgpu_types options and dynamically generated groups
into enabled_mdev_types.
There is no upgrade impact for existing users, as the original
options are still accepted.
NOTE(sbauza): As we have a lot of methods and objects named gpu-ish
let's just change what we need here and provide followups for
fixing internal tech debt later.
Change-Id: Idba094f6366a24965804b88da0bc1b9754549c99
Partially-Implements: blueprint generic-mdevs
Correct a variety of gaps and other issues seen while improving the
flavor docs.
Change-Id: I8d68016cecb0269a5f9af88b0a08578f85403e23
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
The link of `TLS everywhere` should be 'https://docs.openstack.org/
project-deploy-guide/tripleo-docs/latest/features/tls-everywhere.html'.
Closes-Bug: #1933062
Change-Id: I468b82edeb899b0a780f8b545ad23ee0428a93ea
This change deprecates the AZ filter, which is no longer required.
It also enables the use of placement for AZ enforcement by default
and deprecates the config option for removal.
Change-Id: I92b0386432444fc8bdf852de4bdb6cebb370a8ca