When reverting a resize, we need to wait for the migration status to
change to 'reverted', but we also need to wait for the relevant
versioned notification to be emitted. The reason for this is noted in a
couple of places, including the '_revert_resize' helper in the
'nova.tests.functional.integrated_helpers.InstanceHelperMixin' class:
    [T]he migration status is changed to "reverted" in the dest host
    revert_resize method but the allocations are cleaned up in the source
    host finish_revert_resize method so we need to wait for the
    finish_revert_resize method to complete.
Two tests in the 'test_cross_cell_migrate' test module were not doing
this wait, resulting in intermittent failures in CI due to the races.
Resolve this now.
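The two-step wait can be sketched as a minimal runnable example. FakeCloud,
wait_for_revert and the step() scheduling below are illustrative stand-ins
for the real test infrastructure, though 'instance.resize_revert.end' is the
actual versioned notification name:

```python
# Illustrative sketch: the migration status flips to 'reverted' on the
# dest host *before* the source host cleans up allocations, so a test
# must wait for the finish_revert_resize notification too.
class FakeCloud:
    def __init__(self):
        self.migration_status = 'reverting'
        self.notifications = []

    def step(self):
        # dest host revert_resize: status changes first ...
        if self.migration_status == 'reverting':
            self.migration_status = 'reverted'
        # ... source host finish_revert_resize: notification comes later.
        elif 'instance.resize_revert.end' not in self.notifications:
            self.notifications.append('instance.resize_revert.end')

def wait_for_revert(cloud, max_steps=10):
    """Wait for BOTH the status change and the end notification."""
    for _ in range(max_steps):
        if (cloud.migration_status == 'reverted'
                and 'instance.resize_revert.end' in cloud.notifications):
            return True
        cloud.step()
    return False

assert wait_for_revert(FakeCloud())
```

Waiting only on the status would return one step too early, before the
allocation cleanup has happened.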
Change-Id: I3ec6cae19b362ac9cc311a979f680cf64db4f458
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Closes-Bug: #1904051
The service level check introduced in
Ie15ec8299ae52ae8f5334d591ed3944e9585cf71 should handle the case where a
compute service is wrongly configured with DB credentials. The previous
patch did not handle this, which caused a misleading error at compute
service startup. This patch makes sure that a user-friendly warning is
logged in this case, and the service level check is then done ignoring
the DB configuration, checking only the local cell.
A subsequent patch will add a separate check that fails the compute
service startup in such an invalid configuration.
Change-Id: I89cdf3852266ed93a2ac7cd6261fe269932026ac
Related-Bug: #1871482
Rework Ie8bb5e5622bd37dfe8073cca12f77174e8e7d98c so we only log failures
to import the rbd or rados modules when the RbdDriver is used.
This should reduce noise in the logs at runtime as well as during unit
and functional test runs where these modules are not present.
Change-Id: I150e70629f6ae579ccfe0bf585c8a27df14fb51d
Closes-Bug: #1903316
This was incorrectly removed by
Ib342e2d3c395830b4667a60de7e492d3b9de2f0a while still being used by the
nova-grenade-multinode job. This was missed as the check queue appears
to default to silently skipping jobs where it can't find the parent
instead of failing.
Change-Id: I3ece71ab75a28a0ba662c56fb140525e8ce4aa6c
This change removes the original nova-live-migration job and replaces it
directly with the new Focal based zuulv3 native job.
The nova-dsvm-multinode-base base job is no longer used and so is also
removed as part of this change.
Note that this new nova-live-migration job does not yet contain any
ceph coverage like the original; this is still pending and will be
completed early in the W cycle.
This change is being merged ahead of that work to resolve bug #1901739, a
known QEMU -drive issue caused by the previous jobs' use of libvirt 5.4.0
as provided by Ubuntu Bionic. The fix here is the migration to Ubuntu
Focal based jobs and libvirt 6.0.0, which now defaults to using QEMU
-blockdev.
Closes-Bug: #1901739
Change-Id: Ib342e2d3c395830b4667a60de7e492d3b9de2f0a
Both jobs deploy a multinode env before running their tests so just
append the new nova-evacuate tests to the end of the job.
Change-Id: If64cdf1002eec1504fa76eb4df39b6b2e4ff3728
In I147bf4d95e6d86ff1f967a8ce37260730f21d236 we added a new argument for
the rebuild_instance() RPC method. Unfortunately, we missed that it
needs to be optional for older versions.
Add a default None value for it so rolling upgrades will work.
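A minimal sketch of the shape of the fix; rebuild_instance() below is a
simplified stand-in, not nova's real compute RPC signature:

```python
# Give the new argument a default so callers pinned to an older RPC
# version, which never send it, still work during a rolling upgrade.
def rebuild_instance(instance, accel_uuids=None):
    # Treat a missing value the same as "no accelerators requested".
    return {'instance': instance, 'accel_uuids': accel_uuids or []}

# An old-version caller omits the argument entirely:
assert rebuild_instance('uuid-1')['accel_uuids'] == []
# A new-version caller passes it explicitly:
assert rebuild_instance('uuid-1', ['arq-1'])['accel_uuids'] == ['arq-1']
```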
Change-Id: I59c5e56b00114fea5ec19fa63ec73f032dc3bd5c
Closes-Bug: #1902925
In I147bf4d95e6d86ff1f967a8ce37260730f21d236 we introduced a breaking RPC
change in version 5.12, as the accel_uuids parameter is not optional.
Add a regression test to check the issue.
Change-Id: I1f3914e16294c99a625b3984ca0098d835cd9b92
Related-Bug: #1902925
Report a warning during upgrade checks if there are computes older than
the previous major nova release in the system.
For example, if the code is upgraded to Wallaby and the upgrade check is
run before the services are restarted with Wallaby code, then the check
warns about Ussuri computes in the system.
Change-Id: I873b0c1e6e695ae88241bbf75ac9f80ecc6f5664
Nova services only support computes that are not older than
the previous major release. This patch introduces a check in the
service startup that prevents starting the service if too old computes
are detected.
Change-Id: Ie15ec8299ae52ae8f5334d591ed3944e9585cf71
The nova-ceph-multistore job setup needs non-admin users to copy the
image. To allow that, Glance's policy was overridden to allow public
images to be copied. This restriction can again cause issues if a
new image-copy Tempest test tries to copy a private image with
admin credentials.
- https://review.opendev.org/#/c/742546/
Let's allow everyone to copy every image so it works
for all types of test credentials.
Change-Id: Ia65afdfb8989909441dba55faeed2d78cc7f1ee7
I668643c836d46a25df46d4c99a973af5e50a39db attempted to fix service-wide
pauses by providing a more complete list of classes to tpool.Proxy.
While this excluded libvirtError, it can include internal libvirt-python
classes pointed to by private globals that were introduced with the use
of type checking within the module.
Any attempt to wrap these internal classes will result in the failure
seen in bug #1901383. As a result, this change simply ignores any class
found during inspection whose name doesn't start with the `vir` string,
which libvirt uses to denote its public methods and classes.
Closes-Bug: #1901383
Co-Authored-By: Daniel Berrange <berrange@redhat.com>
Change-Id: I568b0c4fd6069b9118ff116532f14abb46cc42ab
Pygments 2.7.x is stricter in how it validates JSON escapes, aligning it
closer with the spec [1]. Turns out we have some invalid JSON in our
docs, meaning builds are now failing with the following error:
doc/source/user/metadata.rst:262: WARNING: Could not lex literal_block
as "json". Highlighting skipped.
Resolve this.
[1] https://github.com/pygments/pygments/commit/9514e794e0c2a5c7c048df97fcfef4a099e05ac3
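For illustration, the class of failure is an unescaped backslash sequence
such as \d inside a JSON string, which strict lexers reject. The snippet
below demonstrates this with Python's json module; the example strings are
made up, not the exact ones from metadata.rst:

```python
import json

# "\d" is not a valid JSON escape, so a strict parser rejects it ...
try:
    json.loads('{"pattern": "\\d+"}')  # raw text contains \d -> invalid
    rejected = False
except json.JSONDecodeError:
    rejected = True
assert rejected

# ... while a doubled backslash ("\\d") is valid JSON.
assert json.loads('{"pattern": "\\\\d+"}') == {"pattern": "\\d+"}
```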
Change-Id: Ic50e29e9c7817744ad0b4f9de309aa3e96a09505
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
This should have been removed in change
I18a01032a89bff84d71e879c5207157393849b7e. Remove it now.
Change-Id: I29d1cd2f043bd2244c6fb0410c94f6612de14dbc
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Currently, when we "get" a single instance from the database and we
load metadata and system_metadata, we do so using a joinedload() which
does JOINs with the respective tables. Because of the one-to-many
relationship between an instance and (system_)metadata records, doing
the database query this way can result in a large number of additional
rows being returned unnecessarily and cause a large data transfer.
This is similar to the problem addressed by change
I0610fb16ccce2ee95c318589c8abcc30613a3fe9 which added separate queries
for (system_)metadata when we "get" multiple instances. We don't,
however, reuse the same code for this change because
_instances_fill_metadata converts the instance database object to a
dict, and some callers of _instance_get_by_uuid need to be able to
access an instance database object attached to the session (example:
instance_update_and_get_original).
By using subqueryload() [1], we can perform the additional queries for
(system_)metadata to solve the problem with a similar approach.
Closes-Bug: #1799298
[1] https://docs.sqlalchemy.org/en/13/orm/loading_relationships.html#subquery-eager-loading
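The row-count arithmetic behind the change can be illustrated in plain
Python (the counts are hypothetical): a JOIN across two one-to-many
relationships multiplies rows, while separate per-relationship queries,
which is what subqueryload() issues, grow additively:

```python
# Rows transferred when loading ONE instance plus its metadata and
# system_metadata, under the two loading strategies.
def joined_rows(n_metadata, n_system_metadata):
    # joinedload: one row per (metadata, system_metadata) combination
    return max(n_metadata, 1) * max(n_system_metadata, 1)

def subquery_rows(n_metadata, n_system_metadata):
    # subqueryload: 1 instance row + one row per related record
    return 1 + n_metadata + n_system_metadata

assert joined_rows(50, 40) == 2000
assert subquery_rows(50, 40) == 91
```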
Change-Id: I5c071f70f669966e9807b38e99077c1cae5b4606