Nova's use of libvirt's compareCPU() API served its purpose
over the years, but its design limitations break live migration in
subtle ways. For example, the compareCPU() API compares against the
host physical CPUID. Some of the features from this CPUID are not
exposed by KVM, and then there are some features that KVM emulates that
are not in the host CPUID. The latter can cause bogus live migration
failures.
With QEMU >= 2.9 and libvirt >= 4.4.0, libvirt will do the right thing in
terms of CPU compatibility checks on the destination host during live
migration. Nova satisfies these minimum version requirements by a good
margin. So, provide a workaround to skip the CPU comparison check on
the destination host before migrating a guest, and let libvirt handle it
correctly. This workaround will be removed once Nova replaces the older
libvirt APIs with their newer and improved counterparts[1][2].
- - -
Note that Nova's libvirt driver calls compareCPU() in another method,
_check_cpu_compatibility(); I did not remove its usage yet, as that needs
more careful combing of the code, and then either:
- where possible, remove the usage of compareCPU() altogether, and
rely on libvirt doing the right thing under the hood; or
- where Nova _must_ do the CPU comparison checks, switch to the better
libvirt CPU APIs -- baselineHypervisorCPU() and
compareHypervisorCPU() -- that are described here[1]. This is work
in progress[2].
[1] https://opendev.org/openstack/nova-specs/commit/70811da221035044e27
[2] https://review.opendev.org/q/topic:bp%252Fcpu-selection-with-hypervisor-consideration
Change-Id: I444991584118a969e9ea04d352821b07ec0ba88d
Closes-Bug: #1913716
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Signed-off-by: Balazs Gibizer <bgibizer@redhat.com>
Spotted this in a review recently. We don't want people using six
anymore.
Change-Id: Ie107a95bc06390ab519d3b3af9b07103a9a14316
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
The webob.Request class does not have the remote_address attribute but
the remote_addr attribute. This change fixes usage of the non-existing
attribute accordingly.
Closes-Bug: #1967683
Change-Id: I874e97ac6ad84daa20997345082cb4d1135699c4
During the pre live migration process, Nova performs most of the
tasks related to the creation and operation of the VM in the destination
host. That is done without interrupting any of the hardware in the source
host. If the pre_live_migration fails, those same operations should be
rolled back.
Currently, Nova shares _rollback_live_migration between live migration
and pre_live_migration rollbacks, which causes the source host to try to
re-attach network interfaces that were never actually detached from it.
This patch fixes that by adding a conditional so that Nova takes
different paths for migration and pre_live_migration rollbacks.
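The split described above can be pictured with a minimal sketch. It is not Nova's code: FakeSourceHost, the free function rollback_live_migration() and its pre_live_migration flag are hypothetical names chosen for illustration, and only the idea of branching on where the failure happened comes from this change.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSourceHost:
    """Hypothetical stand-in for the source compute host."""
    reattach_calls: list = field(default_factory=list)

    def reattach(self, vif):
        # Record every re-attach attempt so the behaviour can be inspected.
        self.reattach_calls.append(vif)


def rollback_live_migration(source, vifs, pre_live_migration=False):
    """Roll back a failed migration (illustrative only).

    pre_live_migration=True means the failure happened before the source
    host was touched, so its interfaces are still attached and are left
    alone.
    """
    if pre_live_migration:
        return
    for vif in vifs:
        source.reattach(vif)
```

With the flag set, no re-attach is attempted on the source; without it, every interface in the list is re-attached.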
Closes-bug: #1944619
Change-Id: I784190ac356695dd508e0ad8ec31d8eaa3ebee56
There is a NOTE in the CellDatabases code about an unlikely but
possible race that can occur between taking the writer lock to set
the last DB context manager and taking the reader lock to call
target_cell(). When the race is detected, a RuntimeError is raised.
We can handle the race by retrying setting the last DB context manager
when the race is detected, as described in the NOTE.
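The retry idea can be sketched as below, assuming the race surfaces as a RuntimeError from the call made under the reader lock. The helper name call_with_race_retry(), its two callable arguments and the max_attempts bound are hypothetical; the actual fixture code may be structured differently.

```python
def call_with_race_retry(set_last_ctxt_mgr, target_cell, max_attempts=5):
    """Set the last DB context manager, then target the cell.

    If target_cell() reports the race by raising RuntimeError, set the
    context manager again and retry, up to max_attempts times.
    """
    for attempt in range(1, max_attempts + 1):
        set_last_ctxt_mgr()
        try:
            return target_cell()
        except RuntimeError:
            # Give up only once the retry budget is exhausted.
            if attempt == max_attempts:
                raise
```

Bounding the number of attempts is a design choice of this sketch, so that a persistent failure is still reported instead of looping forever.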
Closes-Bug: #1959677
Change-Id: I5c0607ce5910dce581ab9360cc7fc69ba9673f35
Addresses a long-standing TODO. We remove the
'InstancePCIRequests.from_request_spec_instance_props' helper since it's
entirely unnecessary: the built-in 'obj_from_primitive' wrapper will do
what we want here (creating an o.vo from a serialized representation of
the object).
Change-Id: I5208b7dff996828137dfddfdd2db8737126884e3
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
There's nothing of use in here. A section on creating extensions for the
API is removed since this is no longer a thing.
Change-Id: I18a6f642c046051cd6084ab920d78f27887ca13d
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
This patch fixes the test_default_logging test.
The test validates that we have two logging handlers:
1 x to display default messages (info, error, warnings...)
1 x to redirect debug messages to null so they are not displayed.
However, if OS_DEBUG=True is set in the shell session where the test is
run, the test fails, because in debug mode we should have only one
handler to display all messages (see the comments and the
test_debug_logging test for more details).
To fix the test, we explicitly set OS_DEBUG=0 when running
test_default_logging, which ensures we have two handlers whatever the
OS_DEBUG value in the environment.
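A minimal sketch of why pinning the variable makes the test deterministic. build_handlers() is a hypothetical helper, not the Nova logging fixture, and the set of values treated as "true" is an assumption.

```python
import logging
import os
from unittest import mock


def build_handlers(environ=None):
    """Return the handlers a fixture might install (illustrative only)."""
    environ = os.environ if environ is None else environ
    # Assumption: these spellings count as "debug enabled".
    debug = environ.get("OS_DEBUG", "0").lower() in ("1", "true", "yes")
    if debug:
        # Debug mode: one handler that displays every message.
        return [logging.StreamHandler()]
    visible = logging.StreamHandler()
    visible.setLevel(logging.INFO)
    # Default mode: one visible handler, one that swallows debug output.
    return [visible, logging.NullHandler()]


# Pin OS_DEBUG for the duration of the check, whatever the shell exported.
with mock.patch.dict(os.environ, {"OS_DEBUG": "0"}):
    assert len(build_handlers()) == 2
```

mock.patch.dict restores the previous environment on exit, so other tests still see the value the developer set.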
Co-authored-by: Rene Ribaud <rribaud@redhat.com>
Closes-Bug: #1964497
Change-Id: I7c0151d988c538dd2d083aab4b3e18ddb8151045
This job is pretty heavy and has been triggering OOMs that take out
mysqld lately. This disables swift (and c-bak as a result) to try to
reduce the runtime footprint. Losing coverage of these services
should not be a problem for the goal of this job.
Change-Id: Icc18ddd847465069aea34b226851afaeb94594fc
Add file to the reno documentation build to show release notes for
stable/yoga.
Use pbr instruction to increment the minor version number
automatically so that master versions are higher than the versions on
stable/yoga.
Sem-Ver: feature
Change-Id: I596e4e49e4982b6c47457d565f389f749783b23f
Due to the initial release of the emulated architecture feature, the CI
has been changed to run less frequently to save CI hours. This
will be revisited in later releases as the feature gains greater
support and capabilities.
Implements: blueprint pick-guest-arch-based-on-host-arch-in-libvirt-driver
Signed-off-by: Jonathan Race <jrace@augusta.edu>
Change-Id: I7b085c2086a720a049c9b04a6ff10a0e5cc9d650
This patch solves bug #1949808 and bug #1960412 by tuning the
live_migration_abort() function and adding calls to:
- remove placement allocations for live migration;
- remove INACTIVE port bindings against destination compute node;
- restore instance's state.
The related unit test was adjusted and the related functional tests were
fixed.
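The three cleanup steps listed above can be sketched with plain dictionaries. This is an illustration under stated assumptions, not the compute manager code: the function name, the data shapes, and the choice to model "restoring state" as clearing task_state are all hypothetical.

```python
def abort_queued_live_migration(migration, allocations, port_bindings,
                                instance):
    """Undo what was set up for a queued live migration (illustrative)."""
    # 1. Remove the placement allocations held for the live migration.
    allocations.pop(migration["uuid"], None)
    # 2. Remove INACTIVE port bindings against the destination node.
    port_bindings[:] = [
        b for b in port_bindings
        if not (b["host"] == migration["dest_host"]
                and b["status"] == "INACTIVE")
    ]
    # 3. Restore the instance's state (assumption: clear task_state).
    instance["task_state"] = None
```

Active bindings on the source host are deliberately left untouched, since the instance keeps running there.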
Closes-bug: #1949808
Closes-bug: #1960412
Change-Id: Ic97eff86f580bff67b1f02c8eeb60c4cf4181e6a
An instance is affected by the problems described in bug #1949808
and bug #1960412 when a queued live migration is aborted.
This change adds a functional test to reproduce the problems with
placement allocations (record for aborted live migration is not
removed when queued live migration is aborted) and with Neutron port
bindings (INACTIVE port binding records for destination host are not
removed when queued live migration is aborted).
It looks like there are no other modifications introduced by the Nova
control plane that should be reverted when a queued live migration is
aborted.
This patch also changes the libvirt and neutron fixtures:
- the libvirt fixture was changed to support live migrations of
instances with regular ports: without this change
_update_vif_xml() complains about the lack of an address element in
the VIF's XML.
- the neutron fixture was changed to improve tracking of the active
port binding during live migration: without this change the port's
binding:host_id is not updated when activate_port_binding() is
called. As a result, list_ports() returns an empty list
when constants.BINDING_HOST_ID is used in search_opts, which is
the case for setup_networks_on_host() called with teardown=True.
Related-bug: #1960412
Related-bug: #1949808
Change-Id: I152581deb6e659c551f78eed66e4b0b958b20c53
Back in the days of CentOS 6 and Python 2.6, eventlet
greendns monkey patching broke IPv6. As a result, nova
has run without greendns monkey patching ever since.
This removes that old workaround, allowing modern
eventlet to use greendns for non-blocking DNS lookups.
Closes-Bug: #1964149
Change-Id: Ia511879d2f5f50a3f63d180258abccf046a7264e
`binding:profile` updates are handled differently for migration than for
instance creation, which was not taken into account previously. Relevant
fields (card_serial_number, pf_mac_address, vf_num) are now added to the
`binding:profile` after a new remote-managed PCI device is determined at
the destination node.
Likewise, there is special handling for the unshelve operation which is
fixed too.
Func testing:
* Allow the generated device XML to contain the PCI VPD capability;
* Add test cases for basic operations on instances with remote-managed
ports (tunnel or physical);
* Add a live migration test case similar to how it is done for
non-remote-managed SR-IOV ports but taking remote-managed port related
specifics into account;
* Add evacuate, shelve/unshelve, cold migration test cases.
Change-Id: I9a1532e9a98f89db69b9ae3b41b06318a43519b3
This addresses remaining comments from the unified limits series to add
type hints to new code and add a docstring to the is_qfd_populated()
method in nova/quota.py.
Related to blueprint unified-limits-nova
Change-Id: I948647b04b260e888a4c71c1fa3c2a7be5d140c5