55753 Commits

Author SHA1 Message Date
Zuul f88abe3fed Merge "Switch to devstack-plugin-ceph-tempest-py3 for ceph" 2019-11-01 20:56:49 +00:00
Matt Riedemann e619354f7c Document CD mentality policy for nova contributors
The mentality of being able to continuously deliver nova
has been around since the beginning with Rackspace public
cloud trying to CD openstack as close to master as possible.
This has implications for how code series are structured,
reviewed and merged. For the most part this seems to be tribal
knowledge and we don't have anything very obvious in the nova
docs about it, and not all projects in openstack necessarily
subscribe to this mentality anymore, or do so grudgingly, but
it's worth documenting it in nova while still applied here.

Change-Id: Ieff87dbd748318f1b7f879a136ff25081dac321e
2019-11-01 11:52:07 -04:00
Matt Riedemann c5557f03da doc: link to nova code review guide from dev policies
The development policies section on code review was linking to the
generic openstack review guidelines but we have nova-specific
guidelines as well so this changes the policies page to link to the
nova code review guidelines, links the general guidelines into the
nova page, and also fixes a formatting issue in the nova code review
guidelines page.

Change-Id: I725570d0d737f18fe8b105dc8382c4abcfdef295
2019-11-01 11:34:13 -04:00
Zuul 46a02d5eb5 Merge "Default AZ for instance if cross_az_attach=False and checking from API" 2019-11-01 13:04:58 +00:00
Zuul 14e974a5f7 Merge "Refactor rebuild_instance" 2019-11-01 07:16:00 +00:00
Zuul 38c5f2cc96 Merge "Add functional test for two-cell scheduler behaviors" 2019-11-01 01:51:08 +00:00
Matt Riedemann 07a24dcef7 Default AZ for instance if cross_az_attach=False and checking from API
If we're booting from an existing volume but the instance is not being
created in a requested availability zone, and cross_az_attach=False,
we'll fail with a 400 since by default the volume is in the 'nova'
AZ and the instance does not have an AZ set - because one wasn't requested
and because it's not in a host aggregate yet.

This refactors that AZ validation during server create in the API to
do it before calling _validate_bdm so we get the pre-existing volumes
early and if cross_az_attach=False, we validate the volume zone(s) against
the instance AZ. If the [DEFAULT]/default_schedule_zone (for instances) is
not set and the volume AZ does not match the
[DEFAULT]/default_availability_zone then we put the volume AZ in the request
spec as if the user requested that AZ when creating the server.
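
The rule in the paragraph above can be sketched roughly as follows. This is an illustration under stated assumptions, not nova's API code: the function name and its parameters are hypothetical, only a single pre-existing volume is considered, and the zone validation itself is left out.

```python
def az_for_request_spec(requested_az, volume_az, cross_az_attach,
                        default_schedule_zone=None,
                        default_availability_zone="nova"):
    """Return the AZ to record in the request spec (hypothetical helper).

    None means "no AZ was requested and none is inferred".
    """
    # An explicit request is kept as-is, and nothing is inferred when
    # cross-AZ attach is allowed.
    if requested_az is not None or cross_az_attach:
        return requested_az
    # No requested AZ and cross_az_attach=False: adopt the volume's AZ,
    # but only if no default schedule zone is configured and the volume
    # is outside the default availability zone.
    if default_schedule_zone is None and volume_az != default_availability_zone:
        return volume_az
    return None


print(az_for_request_spec(None, "az1", cross_az_attach=False))
print(az_for_request_spec(None, "nova", cross_az_attach=False))
```

The first call infers the volume's zone; the second leaves the request spec without an AZ because the volume already sits in the default zone.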

Since this is a change in how cross_az_attach is used and how the instance
default AZ works when using BDMs for pre-existing volumes, the docs are
updated and a release note is added.

Note that not all of the API code paths are unit tested because the
functional test coverage does most of the heavy lifting for coverage.
Given the amount of unit tests that are impacted by this change, it is
pretty obvious that (1) many unit tests are mocking at too low a level and
(2) functional tests are better for validating these flows.

Closes-Bug: #1694844

Change-Id: Ib31ba2cbff0ebb22503172d8801b6e0c3d2aa68a
2019-10-31 10:08:46 -04:00
Dan Smith 888dd7d475 Add functional test for two-cell scheduler behaviors
This adds a functional test to validate migrate-related behaviors
with multiple cells. We can't migrate across cells yet and this
validates that such an operation is properly blocked.

Co-Authored-By: Matt Riedemann <mriedem.os@gmail.com>

Related to blueprint cells-aware-api
Change-Id: I79736624dbebd7085fd8a5fe810a80312ebb367f
2019-10-31 13:45:47 +00:00
Zuul 7fb4e3001a Merge "Add missing parameter" 2019-10-31 11:09:12 +00:00
Zuul a9ff7e9982 Merge "Only allow one scheduler service in tests" 2019-10-31 05:08:10 +00:00
Zuul 22ea7bde2a Merge "Update help for image_cache_manager_interval option" 2019-10-31 05:08:05 +00:00
Zuul b0e97fd703 Merge "nova-net: Use deepcopy on value returned by NeutronFixture" 2019-10-31 01:25:55 +00:00
Zuul 81929c3c52 Merge "nova-net: Migrate 'test_floating_ips' functional tests" 2019-10-30 23:47:55 +00:00
Zuul 8075bd7bc8 Merge "fixtures: Add support for security groups" 2019-10-30 23:43:43 +00:00
Zuul 50223cae9e Merge "Remove 'test_cold_migrate_with_physnet_fails' test" 2019-10-30 21:53:23 +00:00
Zuul 631d685d67 Merge "Move Destination object tests to their own test class" 2019-10-30 21:53:16 +00:00
Zuul 5f1b235bc8 Merge "Mark "block_migration" arg deprecation on pre_live_migration method" 2019-10-30 21:53:10 +00:00
Zuul ac0acfb809 Merge "Move pre-3.44 Cinder post live migration test to test_compute_mgr" 2019-10-30 19:55:03 +00:00
Zuul 7a402a8126 Merge "Refactor volume connection cleanup out of _post_live_migration" 2019-10-30 19:54:58 +00:00
Eric Fried fe05d004b5 Only allow one scheduler service in tests
There have been two recent issues [1][2] caused by starting multiple
instances of the same service in tests. This can cause races when the
services' (shared) state conflicts.

With this patch, the nexus of nova service starting,
nova.test.TestCase.start_service, is instrumented to keep track of how
many of each service we are running. If we try to run the scheduler
service more than once, we fail.
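
A minimal sketch of the counting idea, not the code in nova.test.TestCase: the class name ServiceTracker, its attributes, and the exception type are hypothetical, and it assumes the scheduler is the only service restricted to one instance.

```python
import collections


class ServiceTracker:
    """Hypothetical stand-in for the bookkeeping done at service start."""

    # Assumption: only the scheduler is limited to a single instance.
    SINGLETON_SERVICES = frozenset({"scheduler"})

    def __init__(self):
        # Number of times each service name has been started.
        self.started = collections.Counter()

    def start_service(self, name):
        if name in self.SINGLETON_SERVICES and self.started[name] >= 1:
            raise RuntimeError(
                "service %r was already started in this test" % name)
        self.started[name] += 1
        return name


tracker = ServiceTracker()
tracker.start_service("scheduler")
tracker.start_service("compute")
tracker.start_service("compute")  # multiple computes are still allowed
```

Failing at start time points at the offending test directly, instead of surfacing later as a race between two schedulers sharing state.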

We could probably do the same thing for conductor, though that's less
important (for now) because conductor is stateless (for now).

[1] https://bugs.launchpad.net/nova/+bug/1844174
[2] https://review.opendev.org/#/c/681059/ (not a nova service, but same
class of problem)

Change-Id: I56d3cb17260dad8b88f03c0a7b9688efb3258d6f
2019-10-30 19:44:17 +00:00
Zuul f2a99e480d Merge "api-ref: remove mention about os-migrations no longer being extended" 2019-10-30 19:28:25 +00:00
Zuul f33f8c2ba2 Merge "Log some stats for image pre-cache" 2019-10-30 14:26:29 +00:00
Zuul 6f121c1653 Merge "Make nova-next multinode and drop tempest-slow-py3" 2019-10-30 03:14:24 +00:00
Zuul 534f570c8c Merge "Fix race in test_vcpu_to_pcpu_reshape" 2019-10-29 23:18:52 +00:00
Zuul 61873f34b7 Merge "Switch to opensuse-15 nodeset" 2019-10-29 22:06:50 +00:00
Matt Riedemann 56a391aafc Fix race in test_vcpu_to_pcpu_reshape
This test uses the ServersTestBase._wait_for_state_change method
which waits for the status to change *from* what is provided, so
creating a server and waiting for the status to change from
ACTIVE makes _wait_for_state_change return immediately since the
status starts as BUILD. This can lead to a failure when the test
tries to migrate a server that is in BUILD status rather than
ACTIVE status.

This fixes the test by using this version of _wait_for_state_change
correctly, not to be confused with the same method in
InstanceHelperMixin which is more accurate (it waits for the
terminal status of the server operation).
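
The difference between the two waiting styles can be sketched as below. These helpers are simplified, hypothetical stand-ins rather than the ServersTestBase or InstanceHelperMixin methods: they poll a callable instead of the API and do not sleep between attempts.

```python
import itertools


def wait_for_change_from(get_status, from_status, attempts=10):
    """Return as soon as the status differs from from_status."""
    for _ in range(attempts):
        status = get_status()
        if status != from_status:
            return status
    raise TimeoutError("status never changed from %s" % from_status)


def wait_for_status(get_status, expected, attempts=10):
    """Return once the status equals the expected terminal status."""
    for _ in range(attempts):
        status = get_status()
        if status == expected:
            return status
    raise TimeoutError("status never reached %s" % expected)


def fake_server():
    """Status getter for a server that reports BUILD twice, then ACTIVE."""
    statuses = itertools.chain(["BUILD", "BUILD"], itertools.repeat("ACTIVE"))
    return lambda: next(statuses)


# Racy usage: the server starts in BUILD, so this returns immediately.
print(wait_for_change_from(fake_server(), "ACTIVE"))
# Correct usage of the "from" style: wait for the status to leave BUILD.
print(wait_for_change_from(fake_server(), "BUILD"))
# Waiting for the terminal status is harder to misuse.
print(wait_for_status(fake_server(), "ACTIVE"))
```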

Change-Id: I56ff050194d0eb465b8c41795fdea2a8b0d764d6
Closes-Bug: #1850514
2019-10-29 14:29:52 -04:00
Dan Smith 7ecd502f6d Log some stats for image pre-cache
This attempts to log some statistics about a precache operation so that,
barring something more complicated on the operator side, there is some
useful information in the logs about what is going on.

Related to blueprint image-precache-support

Change-Id: I550afc344eca30c366ba0e5342966bcbaac96bfe
2019-10-29 07:05:51 -07:00
melanie witt 23871ad4ad Switch to devstack-plugin-ceph-tempest-py3 for ceph
We dropped the tempest-full python2 job from our zuul config a while
back with change I93e938277454a1fc203b3d930ec1bc1eceac0a1e. Since we
track ceph job health by how its failure rate compares with our basic
tempest full job, this updates our config to run the python3 version
of the ceph job by default and drops the python2 version of the job.

Change-Id: I92e01b896751f7f29a0b2b826c33cb2c74b8ced4
2019-10-28 22:31:41 +00:00
Zuul 9742a64403 Merge "Add notification sample test for aggregate.cache_images.start|end" 2019-10-28 20:07:37 +00:00
Zuul 853eaa7f38 Merge "Added openssh-client into bindep" 2019-10-28 15:38:34 +00:00
Zuul 888c52a928 Merge "Fix policy doc for host_status and extended servers attribute" 2019-10-26 05:35:08 +00:00
Zuul c537704cd7 Merge "Remove redundant call to get/create default security group" 2019-10-26 01:57:52 +00:00
Ghanshyam Mann 4722fe5ba5 Fix policy doc for host_status and extended servers attribute
In microversion 2.75, host_status and extended-server-attributes
were added in PUT /servers/{server-id} and POST /servers/action {rebuild }
API response with respective policy enforcement[1].

But the PUT and rebuild APIs were not mentioned in the policy doc
for 'os_compute_api:servers:show:host_status' and
'os_compute_api:os-extended-server-attributes'.

- https://docs.openstack.org/nova/latest/configuration/policy.html

Closes-Bug: #1849164

[1]
https://github.com/openstack/nova/blob/964d7dc87989b5765fcc60d34f734963ab8e03e7/nova/api/openstack/compute/servers.py#L854

https://github.com/openstack/nova/blob/964d7dc87989b5765fcc60d34f734963ab8e03e7/nova/api/openstack/compute/servers.py#L1161

Change-Id: Ifac1e60f5c8d9c5e3a0a9dacc398c339c2216689
2019-10-26 00:50:24 +00:00
Matt Riedemann d50efc337c Add notification sample test for aggregate.cache_images.start|end
This adds the functional notification sample test for the
aggregate.cache_images.start and aggregate.cache_images.end
versioned notifications.

I also added a comment to the docs builder code since it took
me a bit to figure out how to get the notification sample
linked into the docs, and for whatever reason figured that out
by looking through code rather than our nicely detailed docs
that already explain it.

Part of blueprint image-precache-support

Change-Id: I0869979a1b8a0966f0e7b49e5a5984f76d7d67cd
2019-10-25 10:38:19 -07:00
Matt Riedemann b2122f7702 Stop building docs with (test-)requirements.txt
Change Iba797243d2a137b551223165a1af1a8676bcea02 was a bit
overzealous in using {[testenv]deps} and changed the docs
tox target to also install requirements.txt and
test-requirements.txt which means for docs builds we're
installing things like psycopg2 which we shouldn't be doing
if we don't have the correct native packages installed to
build those types of dependencies.

Change-Id: Ib718911596b93ec6ec7e899210300d2f0d9572ed
Closes-Bug: #1849870
2019-10-25 12:48:31 -04:00
Zuul ba48c2369f Merge "[Trivial] Add missing ws between words" 2019-10-25 13:49:14 +00:00
Zuul c067962ad7 Merge "libvirt: Ignore volume exceptions during post_live_migration" 2019-10-25 10:28:31 +00:00
Zuul ec9125bdc9 Merge "Add regression test for bug 1824435" 2019-10-25 05:36:08 +00:00
Matt Riedemann 6f8c2f0df5 Make nova-next multinode and drop tempest-slow-py3
The tempest-slow-py3 job is, well, very slow. It takes
over 2.5 hours sometimes to complete. There are a few
reasons beyond it just running slow tests but it runs
all slow tests serially and for nova it's testing things
we don't care about like network scenario tests like
test_slaac_from_os. The one benefit we get from running
tempest-slow-py3 is that it's multinode and there are
certain slow test scenarios for multinode that are
important for test coverage.

This change drops the tempest-slow-py3 job from our
job list and changes nova-next to be multinode. The
nova-next job runs a select set of tempest compute API
and scenario tests only and runs them concurrently, which
in the gate is 4 workers at a time. The nova-next job
will take a bit longer since we have to set up the subnode
now but overall it should still be faster than the
tempest-slow-py3 job and we'll save on one more node
required from nodepool to run jobs against nova changes.

The USE_PYTHON3 variable can be dropped from the nova-next
job definition now that it extends tempest-multinode-full-py3.

Depends-On: https://review.opendev.org/690469/

Change-Id: I1b7d71e833bf0743f22d7fa0241c9d1bbcd0faac
2019-10-24 09:53:07 -04:00
Lee Yarwood ac68cffd43 libvirt: Ignore volume exceptions during post_live_migration
Previously errors while disconnecting volumes from the source host
during post_live_migration within LibvirtDriver would result in the
overall failure of the migration. This would also mean that while the
instance would be running on the destination it would still be listed as
running on the source within the db.

This change simply ignores any exceptions raised while attempting to
disconnect volumes on the source. These errors can be safely ignored as
they will have no impact on the running instance on the destination.
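
A rough sketch of the "log and continue" pattern described above. The helper name and its arguments are hypothetical; this is not the LibvirtDriver code and it does not call os-brick.

```python
import logging

LOG = logging.getLogger(__name__)


def disconnect_source_volumes(volume_ids, disconnect):
    """Try to disconnect every volume; never let one failure abort the rest.

    Returns the ids that failed so the caller can report them.
    """
    failed = []
    for volume_id in volume_ids:
        try:
            disconnect(volume_id)
        except Exception:
            # The instance already runs on the destination, so a failed
            # cleanup on the source is logged rather than re-raised.
            LOG.exception("Ignoring error disconnecting volume %s", volume_id)
            failed.append(volume_id)
    return failed


def flaky_disconnect(volume_id):
    """Illustrative callback that fails for one particular volume."""
    if volume_id == "vol-2":
        raise IOError("target unreachable")


print(disconnect_source_volumes(["vol-1", "vol-2", "vol-3"], flaky_disconnect))
```

Catching the broad Exception is deliberate in this sketch: the point is that post-migration bookkeeping continues regardless of which error the cleanup raised.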

In the future Nova could wire up the force and ignore_errors kwargs when
calling down into the associated os-brick connectors to help avoid this.

Closes-Bug: #1843639
Change-Id: Ieff5243854321ec40f642845e87a0faecaca8721
2019-10-24 09:30:12 +01:00
Zuul 1bfa4626d1 Merge "Fix listing deleted servers with a marker" 2019-10-24 07:06:06 +00:00
Zuul ee974a5272 Merge "Add image precaching docs for aggregates" 2019-10-23 21:03:23 +00:00
Zuul 6cb4194902 Merge "Add functional regression test for bug 1849409" 2019-10-23 18:29:50 +00:00
Dan Smith 829ccbe2bb Add image precaching docs for aggregates
Related to blueprint image-precache-support

Partial-Bug: #1847302
Change-Id: I7a57e5e09b2a1760a9c5aeac402911895dfce07d
2019-10-23 11:12:03 -07:00
Zuul fd470598dc Merge "Adds view builders for keypairs controller" 2019-10-23 17:29:03 +00:00
Matt Riedemann df03499843 Fix listing deleted servers with a marker
Change I1aa3ca6cc70cef65d24dec1e7db9491c9b73f7ab in Queens,
which was backported through to Newton, introduced a regression
when listing deleted servers with a marker because it assumes
that if BuildRequestList.get_by_filters does not raise
MarkerNotFound that the marker was found among the build requests
and does not account for the get_by_filters method short-circuiting
if filtering servers with deleted/cleaned/limit=0. The API code
then nulls out the marker which means you'll continue to get the
marker instance back in the results even though you shouldn't,
and that can cause an infinite loop in some client-side tooling like
nova's CLI:

  nova list --deleted --limit -1

This fixes the bug by raising MarkerNotFound from
BuildRequestList.get_by_filters if we have a marker but are
short-circuiting and returning early from the method based on
limit or filters.
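
A toy illustration of the fix, assuming build requests are plain dicts with a 'uuid' key. The exception class is a local stand-in for nova's MarkerNotFound, and the function is not BuildRequestList.get_by_filters itself.

```python
class MarkerNotFound(Exception):
    """Local stand-in for nova's exception of the same name."""


def get_build_requests(requests, filters, limit=None, marker=None):
    """Return build requests after the marker, honoring limit and filters."""
    short_circuit = (
        limit == 0 or filters.get("deleted") or filters.get("cleaned"))
    if short_circuit:
        # Returning early means the marker was never looked up, so say so
        # instead of letting the caller assume it was found here.
        if marker is not None:
            raise MarkerNotFound(marker)
        return []

    uuids = [req["uuid"] for req in requests]
    if marker is not None:
        if marker not in uuids:
            raise MarkerNotFound(marker)
        requests = requests[uuids.index(marker) + 1:]
    return requests if limit is None else requests[:limit]


build_requests = [{"uuid": "a"}, {"uuid": "b"}, {"uuid": "c"}]
print(get_build_requests(build_requests, {}, marker="a"))
try:
    get_build_requests(build_requests, {"deleted": True}, marker="a")
except MarkerNotFound:
    print("marker not found among build requests; caller keeps the marker")
```

Because the caller still holds the marker after the exception, it can apply it to the next data source rather than returning the marker instance again.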

Change-Id: Ic2b19c2aa06b3059ab0344b6ac56ffd62b3f755d
Closes-Bug: #1849409
2019-10-23 10:32:28 -04:00
Matt Riedemann 45c2752f2c Add functional regression test for bug 1849409
Change I1aa3ca6cc70cef65d24dec1e7db9491c9b73f7ab in Queens,
which was backported through to Newton, introduced a regression
when listing deleted servers with a marker because it assumes
that if BuildRequestList.get_by_filters does not raise
MarkerNotFound that the marker was found among the build requests
and does not account for the get_by_filters method short-circuiting
if filtering servers with deleted/cleaned/limit=0. The API code
then nulls out the marker which means you'll continue to get the
marker instance back in the results even though you shouldn't,
and that can cause an infinite loop in some client-side tooling like
nova's CLI:

  nova list --deleted --limit -1

This adds a functional recreate test for the regression which will
be updated when the bug is fixed.

Change-Id: I324193129acb6ac739133c7e76920762a8987a84
Related-Bug: #1849409
2019-10-23 09:41:54 -04:00
Zuul 2718de6ed7 Merge "Revert "Log CellTimeout traceback in scatter_gather_cells"" 2019-10-23 08:27:52 +00:00
Zuul 5351e5d1d8 Merge "Revert "vif: Resolve a TODO and update another"" 2019-10-23 08:08:50 +00:00
Zuul 98b521034b Merge "Remove compute compat checks for aborting queued live migrations" 2019-10-23 08:08:36 +00:00