Commit Graph

55864 Commits

Author SHA1 Message Date
Matt Riedemann 2d91a8463f docs: update SUSPENDED server status wrt supported drivers
Two things here:

1. The API guide was missing the hyper-v driver which supports
   the suspend operation. Rather than hard-code a list of supported
   drivers in the doc, this change just links to the entry in the
   feature support matrix.

2. The supported hypervisors mention in the API reference is removed
   because the end user using the API should not need to know or care
   which backend hypervisor their server is running on. They can
   either suspend or not, but we don't need to mention the supporting
   drivers for that in the API reference.

Change-Id: Ib76779a8e34b2c68b0f4af190f71576180360d0f
Related-Bug: #1815403
2019-11-14 10:25:06 -05:00
Zuul 8f341eb4a4 Merge "doc: add troubleshooting guide for cleaning up orphaned allocations" 2019-11-14 10:28:58 +00:00
Zuul 22d7f95e03 Merge "Remove dead set_admin_password code to generate password" 2019-11-14 10:28:46 +00:00
Zuul 54ac837865 Merge "Make API always RPC cast to conductor for resize/migrate" 2019-11-14 08:49:50 +00:00
Zuul 0634d73195 Merge "Stop converting Migration objects to dicts for migrate_instance_start" 2019-11-14 08:29:58 +00:00
Zuul 59d4cfef7d Merge "Add image caching to the support matrix" 2019-11-14 08:29:51 +00:00
Zuul 0f37acb3a3 Merge "Consolidate [image_cache] conf options" 2019-11-14 08:29:45 +00:00
Zuul 543c874cfe Merge "Require Migration object arg to migrate_instance_finish method" 2019-11-14 08:29:39 +00:00
Zuul bd0eab8ff5 Merge ""SUSPENDED" description changed in server_concepts guide and API REF" 2019-11-14 03:08:11 +00:00
Zuul 9ff7969fbe Merge "Use named kwargs in compute.API.resize" 2019-11-14 03:08:06 +00:00
Zuul ea55a53880 Merge "Fix review link." 2019-11-14 01:46:15 +00:00
Zuul 691db5b99b Merge "Restrict RequestSpec to cell when evacuating" 2019-11-14 01:10:33 +00:00
Zuul 3d5115a761 Merge "Remove unused CannotMigrateWithTargetHost" 2019-11-14 00:59:25 +00:00
Zuul aa21fe9c9c Merge "Delete _normalize_inventory_from_cn_obj" 2019-11-14 00:59:20 +00:00
Zuul 2dbe174278 Merge "Drop compat for non-update_provider_tree code paths" 2019-11-14 00:54:44 +00:00
Zuul 0b5adc8554 Merge "Implement update_provider_tree for mocked driver in test_resource_tracker" 2019-11-13 23:44:01 +00:00
Zuul 9fcb0d5def Merge "Add support matrix for Delete (Abort) on-going live migration" 2019-11-13 23:43:52 +00:00
Zuul 292a6787fe Merge "Remove dead HostAPI.service_delete code" 2019-11-13 23:43:46 +00:00
Zuul 207bb4157c Merge "Remove PlacementAPIConnectFailure handling from AggregateAPI" 2019-11-13 23:27:52 +00:00
Zuul 5ca532ad00 Merge "cond: rename 'recreate' var to 'evacuate'" 2019-11-13 23:22:29 +00:00
Zuul 1c7a3d5908 Merge "Clear instance.launched_on when build fails" 2019-11-13 21:45:04 +00:00
Zuul 839b3322ff Merge "Rename Claims resources to compute_node" 2019-11-13 21:44:57 +00:00
Matt Riedemann dcd3f516d2 doc: add troubleshooting guide for cleaning up orphaned allocations
While we do not have an automated fix for bug 1829479 this provides
a troubleshooting document for working around that issue where
allocations from a server that was evacuated from a down host need
to be cleaned up manually in order to delete the resource provider
and associated compute node/service.

In general this is also a useful guide for linking up the various
resources and terms in nova and how they are reflected in placement
with the relevant commands which is probably something we should
do more of in our docs.

Change-Id: I120e1ddd7946a371888bfc890b5979f2e19288cd
Related-Bug: #1829479
2019-11-13 15:31:32 -05:00
Sharat Sharma 3badb674f6 "SUSPENDED" description changed in server_concepts guide and API REF
The description of the "SUSPENDED" server status was misleading. Reword
it to make it more accurate.

Change-Id: Ie93b3b38c2000f7e9caa3ca89dea4ec04ed15067
Closes-Bug: #1815403
2019-11-13 17:11:27 +00:00
Eric Fried f7c027db9a Add image caching to the support matrix
Add a section to the support matrix for image caching
(``has_imagecache`` virt driver capability).

Change-Id: I9147c5ea6b276b4fe18a981f4360844009bd3d95
Partial-Bug: #1847302
2019-11-13 11:09:03 -06:00
Eric Fried 828e8047e5 Consolidate [image_cache] conf options
Blueprint image-precache-support added a conf section called
[image_cache], so it makes sense to move all the existing image
cache-related conf options into it.

Old:
[DEFAULT]image_cache_manager_interval
[DEFAULT]image_cache_subdirectory_name
[DEFAULT]remove_unused_base_images
[DEFAULT]remove_unused_original_minimum_age_seconds
[libvirt]remove_unused_resized_minimum_age_seconds

New:
[image_cache]manager_interval
[image_cache]subdirectory_name
[image_cache]remove_unused_base_images
[image_cache]remove_unused_original_minimum_age_seconds
[image_cache]remove_unused_resized_minimum_age_seconds

Change-Id: I3c49825ac0d70152b6c8ee4c8ca01546265f4b80
Partial-Bug: #1847302
2019-11-13 11:09:03 -06:00
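
The option moves listed in the "Consolidate [image_cache] conf options" change above can be written down as a lookup table. The following is a minimal sketch, not nova code: the RENAMED_OPTIONS table is copied from the Old/New lists in that commit message, while the migrate_options helper and the flat (section, option) dictionary layout are assumptions made for illustration.

```python
# Old (section, option) -> new (section, option), as listed in the commit message.
RENAMED_OPTIONS = {
    ("DEFAULT", "image_cache_manager_interval"):
        ("image_cache", "manager_interval"),
    ("DEFAULT", "image_cache_subdirectory_name"):
        ("image_cache", "subdirectory_name"),
    ("DEFAULT", "remove_unused_base_images"):
        ("image_cache", "remove_unused_base_images"),
    ("DEFAULT", "remove_unused_original_minimum_age_seconds"):
        ("image_cache", "remove_unused_original_minimum_age_seconds"),
    ("libvirt", "remove_unused_resized_minimum_age_seconds"):
        ("image_cache", "remove_unused_resized_minimum_age_seconds"),
}


def migrate_options(options):
    """Return a copy of ``options`` with old image cache keys renamed.

    ``options`` is a hypothetical flat dict keyed by (section, option).
    Keys that are not in RENAMED_OPTIONS are copied unchanged.
    """
    return {RENAMED_OPTIONS.get(key, key): value
            for key, value in options.items()}


# Example input: two old-style options plus one unrelated option.
old = {
    ("DEFAULT", "image_cache_manager_interval"): "2400",
    ("libvirt", "remove_unused_resized_minimum_age_seconds"): "3600",
    ("DEFAULT", "debug"): "true",
}
new = migrate_options(old)
```

Both renamed options end up under the image_cache section, and the unrelated DEFAULT option is left where it was.
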
Zuul dcfd74fb37 Merge "Fix ItemMatcher to avoid false positives" 2019-11-13 16:15:23 +00:00
Zuul ec69150112 Merge "ItemsMatcher: mock call list arg in any order" 2019-11-13 16:04:06 +00:00
wangfaxin 200a050182 Fix review link.
Change-Id: Ibdc333f155835f27c95d8e50d0a5fab92bbb0780
2019-11-13 15:54:53 +00:00
Matt Riedemann 891b8f9e98 Use named kwargs in compute.API.resize
The only kwarg passed to resize() is the auto_disk_config
kwarg during resize (not cold migrate). This expands it out
to a named kwarg to resolve the TODO. The same is done with
_check_auto_disk_config and as a result there is a hit to
rebuild as well, but the same TODO exists on rebuild() to use
named kwargs but that can be dealt with separately.

Change-Id: Ide8eb9e09d22f20165474d499ef0524aefc67854
2019-11-13 10:30:06 -05:00
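
The motivation for the "Use named kwargs in compute.API.resize" change above can be illustrated with a small before/after comparison. This is a sketch under stated assumptions: resize_with_kwargs and resize_named are hypothetical, heavily simplified functions and do not reproduce the real signature or behavior of nova's resize(). Only the auto_disk_config name comes from the commit message.

```python
def resize_with_kwargs(flavor_id, **extra_instance_updates):
    # Catch-all style: any keyword is accepted, so a misspelled
    # option name passes through silently.
    return {"flavor_id": flavor_id, **extra_instance_updates}


def resize_named(flavor_id, auto_disk_config=None):
    # Named style: the one supported option is spelled out in the signature.
    updates = {"flavor_id": flavor_id}
    if auto_disk_config is not None:
        updates["auto_disk_config"] = auto_disk_config
    return updates


# The misspelled keyword is accepted by the catch-all version...
loose = resize_with_kwargs("m1.small", auto_disk_confg=True)

# ...but rejected with a TypeError by the named version.
try:
    resize_named("m1.small", auto_disk_confg=True)
    typo_rejected = False
except TypeError:
    typo_rejected = True
```

With a named keyword argument, a misspelled option fails at call time instead of being carried along unnoticed.
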
Zuul e3b3ebed2b Merge "Provide a better error when _verify_response hits a TypeError" 2019-11-13 15:27:54 +00:00
Zuul 1e348cfcd1 Merge "api-ref: re-work resize action post-conditions" 2019-11-13 15:27:43 +00:00
Zuul fd40b58f85 Merge "Add known limitation about resize not resizing ephemeral disks" 2019-11-13 15:27:33 +00:00
Zuul e7e56a0c9b Merge "Use ListOfUUIDField from oslo.versionedobjects" 2019-11-13 15:27:17 +00:00
Zuul 27951fd13d Merge "Remove now invalid TODO from ComputeManager._confirm_resize" 2019-11-13 15:27:07 +00:00
Matt Riedemann a05ef30fb9 Make API always RPC cast to conductor for resize/migrate
This is a follow up to [1] to make the API behave consistently
by always asynchronously casting to conductor during resize
and cold migration regardless of same-cell or cross-cell
migration.

From the end user's point of view, not much changes, except that
some exceptions raised during scheduling, which previously resulted
in a 400 BadRequest error, are no longer returned by the API.
The user still gets a 202 response, must poll the server status
until the server goes to VERIFY_RESIZE status or times out, and
can check the instance actions if the resize/migrate fails.

The specific errors that can occur are not really an API contract
and as such end user applications should not be building logic
around, for example, getting a NoValidHost error. It should be
noted, however, that by default non-admin users cannot see
the instance action event traceback that would contain the
error, e.g. NoValidHost.

The only exception types removed from handling in the API are
(1) AllocationMoveFailed which can be raised when the conductor
MigrationTask runs replace_allocation_with_migration and
(2) NoValidHost when the scheduler is called to select destinations.

Because of this, quite a few functional negative tests have to be
adjusted since the API no longer returns a 400 for NoValidHost and
other errors that can happen during scheduling.

Finally, the do_cast kwarg is left on the conductor API method since
the compute service calls it during same-cell reschedule as a
synchronous RPC call and has error handling if rescheduling in
conductor fails.

[1] I098f91d8c498e5a85266e193ad37c08aca4792b2

Change-Id: I711e56bcb4b72605253fa63be230a68e03e45b84
2019-11-13 10:19:53 -05:00
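
The client-side flow described in the "Make API always RPC cast to conductor for resize/migrate" change above (receive a 202, then poll the server status until VERIFY_RESIZE or a timeout) could look roughly like this. It is a minimal sketch, not an official client: get_status is a hypothetical callable that returns the current server status string, and treating ERROR as a terminal status is an assumption. A failed resize does not necessarily leave the server in ERROR, which is why the timeout matters and why the instance actions are worth checking afterwards.

```python
import time


def wait_for_resize(get_status, timeout=300.0, interval=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until the server reaches VERIFY_RESIZE, fails, or times out.

    ``get_status`` is a hypothetical callable returning the current server
    status string; how it talks to the API is out of scope here.
    Returns the final status, or None if the timeout expires first.
    ``sleep`` and ``clock`` are injectable so the loop can be tested
    without real delays.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in ("VERIFY_RESIZE", "ERROR"):
            return status
        if clock() >= deadline:
            return None
        sleep(interval)


# Usage with a canned sequence of statuses instead of a real API call.
statuses = iter(["RESIZE", "RESIZE", "VERIFY_RESIZE"])
result = wait_for_resize(lambda: next(statuses), sleep=lambda _: None)
```

A None result only says that the expected status was not observed in time; the reason for a failure has to come from the instance actions, which by default do not expose the error traceback to non-admin users.
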
Zuul d215bffb31 Merge "Remove TODOs around claim_resources_on_destination" 2019-11-13 15:15:05 +00:00
Matt Riedemann fb283dab57 Remove unused CannotMigrateWithTargetHost
Before [1] this could be raised from the API resize()
method if getting a RequestSpec failed and a target host
was specified for cold migration. Since that change nothing
raises the exception, and only unit test code still uses it,
so we can remove it altogether.

[1] I34ffaf285718059b55f90e812b57f1e11d566c6f

Change-Id: I19db48bd03855d1a1edbeff5adf15a28abcb5d92
2019-11-12 09:47:54 -05:00
Zuul 7aa88029bb Merge "Resolve TODO in _remove_host_allocations" 2019-11-12 11:21:36 +00:00
Zuul bf415a8dc2 Merge "Add func test for 'required' PCI NUMA policy" 2019-11-12 11:17:30 +00:00
Zuul 63f84c17ff Merge "Pass RequestContext to oslo_policy" 2019-11-12 11:17:23 +00:00
Zuul ee16ae1b39 Merge "Plumb allow_cross_cell_resize into compute API resize()" 2019-11-12 02:42:44 +00:00
Matt Riedemann b6133f8183 Remove TODOs around claim_resources_on_destination
The TODOs were added back in the Queens/Pike timeframe [1][2]
but at this point there probably isn't much value in resolving
those TODOs by adding a skip_filters kwarg to the scheduler
especially since [3] changed the method to not support nested
resource provider allocations so the minimal duplication with
what we do in the scheduler in the non-force evacuate/live migrate
cases is sufficient.

[1] Ie63a4798d420c39815e294843e02ab6473cfded2
[2] I6590f0eda4ec4996543ad40d8c2640b83fc3dd9d
[3] I7cbd5d9fb875ebf72995362e0b6693492ce32051

Change-Id: I3e599147f95337477c9573b517feee67e0ae37e4
2019-11-10 19:29:23 +00:00
Matt Riedemann 29f22b3b51 Resolve TODO in _remove_host_allocations
The Selection object returned by select_destinations has a
compute node UUID in it so we don't have to look up the
compute node object by host/nodename when reverting allocations
during live migration.

Change-Id: I0156cda8543f847d50e16683e1eb29fbdd556d27
2019-11-10 09:42:50 -05:00
Zuul 17b5a1ab85 Merge "Use admin neutron client to see if instance has qos ports" 2019-11-08 21:16:12 +00:00
Zuul a5898ff4c0 Merge "Replace time.sleep(10) with service forced_down in tests" 2019-11-08 16:14:20 +00:00
melanie witt 1c93ca82b8 Replace time.sleep(10) with service forced_down in tests
The server group functional tests are doing time.sleep(10) in order
to make sure a stopped compute service is considered "down" by the nova
compute API.

Instead of sleeping, we can set the service as "forced_down" to get the
desired "down" compute service status and avoid unnecessary delays in
these tests.

Unnecessary service start() calls are also removed in this change. They
appear at the end of tests and services are started during each test
setUp() and killed during each test tearDown() via the ServiceFixture.

Closes-Bug: #1783565

Change-Id: I74f64b68e4b33ee0f8c45fdc5f570c7e12e05d3b
2019-11-08 15:09:32 +00:00
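
The idea behind the "Replace time.sleep(10) with service forced_down in tests" change above can be modelled in a few lines. This is a simplified sketch and not nova's service group code: FakeService, is_up, and the down_time default are hypothetical names and values chosen for illustration. It only shows why setting a forced-down flag gives an immediate "down" result, whereas a heartbeat-based check requires waiting.

```python
from dataclasses import dataclass


@dataclass
class FakeService:
    """Hypothetical stand-in for a compute service record."""
    last_heartbeat: float
    forced_down: bool = False


def is_up(service, now, down_time=60.0):
    # A forced-down service is reported as down regardless of its heartbeat.
    if service.forced_down:
        return False
    # Otherwise the service is up while its last heartbeat is recent enough.
    return (now - service.last_heartbeat) <= down_time


svc = FakeService(last_heartbeat=100.0)
up_before = is_up(svc, now=105.0)   # recent heartbeat, so still "up"
svc.forced_down = True              # no need to wait for the heartbeat to age out
up_after = is_up(svc, now=105.0)    # same point in time, now "down"
```

The second check happens at the same point in time as the first, so a test written this way needs no sleep at all.
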
Zuul d597ecde10 Merge "Remove the TODO about using OSC for BFV in test_evacuate.sh" 2019-11-08 08:18:07 +00:00
Matt Riedemann 112999e1dd Delete _normalize_inventory_from_cn_obj
With Ib62ac0b692eb92a2ed364ec9f486ded05def39ad and the
get_inventory method gone nothing uses this so we can
remove it now.

Change-Id: I3f55e09641465279b8b92551a2302219fe6fc5ca
2019-11-07 17:25:11 -05:00
Matt Riedemann c80912866f Drop compat for non-update_provider_tree code paths
In Train [1] we deprecated support for compute drivers
that did not implement the update_provider_tree method.
That compat code is now removed along with the get_inventory
method definition and (most) references to it.

As a result there are more things we can remove but those
will come in separate changes.

[1] I1eae47bce08f6292d38e893a2122289bcd6f4b58

Change-Id: Ib62ac0b692eb92a2ed364ec9f486ded05def39ad
2019-11-07 17:20:18 -05:00