There is an inconsistency in the return codes the nova API returns
for 'Feature not supported/implemented'. The current return
codes are 400, 409, and 403.
- 400 case: Example: Multiattach Swap Volume Not Supported
- 403 case: Cyborg integration
- 409 case: Examples: Operation Not Supported For SEV,
  Operation Not Supported For VTPM
In xena PTG, we agreed to fix this by returning 400 in all cases
- L446: https://etherpad.opendev.org/p/nova-xena-ptg
This commit converts all 'feature not supported' errors to
HTTPBadRequest (400).
To avoid converting every NotSupported-derived exception in each
API controller individually, a generic conversion is added in the
expected_errors() decorator.
Closes-Bug: #1938093
Change-Id: I410924668a73785f1bfe5c79827915d72e1d9e03
As a precaution, reject all the server lifecycle operations that
currently do not support the port-resource-request-groups API
extension. These are:
* resize
* migrate
* live migrate
* evacuate
* unshelve after shelve offload
* interface attach
This rejection will be removed in the patch that adds support for the
given operation.
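A minimal sketch of such a rejection check, assuming a hypothetical helper name and the Neutron port's 'resource_request' carrying a 'request_groups' key in the new extension format:

```python
class ExtendedResourceRequestNotSupported(Exception):
    """Surfaced as an HTTP 400 by the API layer (sketch)."""

def assert_no_extended_resource_request(ports):
    """Reject the lifecycle operation if any attached port carries the
    new grouped resource-request format (field names are assumptions)."""
    for port in ports:
        resource_request = port.get('resource_request') or {}
        if resource_request.get('request_groups'):
            raise ExtendedResourceRequestNotSupported(
                'The operation is not supported with extended resource '
                'request on port %s' % port['id'])
```

The check can then be removed per operation as support for that operation lands.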
blueprint: qos-minimum-guaranteed-packet-rate
Change-Id: I12c25550b08be6854b71ed3ad4c411a244a6c813
There are a number of operations that are known not to work with vDPA
interfaces and another few that may work but haven't been tested. Start
blocking these. In all cases where an operation is blocked, an HTTP 409
(Conflict) is returned. This will allow lifecycle operations to be
Change-Id: I7f3cbc57a374b2f271018a2f6ef33ef579798db8
Blueprint: libvirt-vdpa-support
Replace six.text_type with str.
This patch completes six removal.
Change-Id: I779bd1446dc1f070fa5100ccccda7881fa508d79
Implements: blueprint six-removal
Signed-off-by: Takashi Natsume <takanattie@gmail.com>
To support move operations with qos ports both the source and the
destination compute hosts need to be on Ussuri level. We have service
level checks implemented in Ussuri. In Victoria we could remove those
checks as nova only supports compatibility between N and N-1 computes.
But we kept them there just for extra safety. In the meantime we
codified [1] the rule that nova does not support N-2 computes any
more. So in Wallaby we can assume that the oldest compute is already
on Victoria (Ussuri would be enough too).
So this patch removes the unnecessary service level checks and related
test cases.
[1] Ie15ec8299ae52ae8f5334d591ed3944e9585cf71
Change-Id: I14177e35b9d6d27d49e092604bf0f288cd05f57e
When using emulated TPM, libvirt will store the persistent TPM data
under '/var/lib/libvirt/swtpm/<instance_uuid>' which is owned by the
"tss" or "root" user depending how libvirt is configured (the parent
directory, '/var/lib/libvirt/swtpm' is always owned by root). When doing
a resize or a cold migration between nodes, this data needs to be copied
to the other node to ensure that the TPM data is not lost. Libvirt
won't do this automatically for us since cold migrations, or offline
migrations in libvirt lingo, do not currently support "copying
non-shared storage or other file based storages", which includes the
vTPM device [1].
To complicate things further, even if migration/resize is supported,
only the user that nova-compute runs as is guaranteed to be able to have
SSH keys set up for passwordless access, and it's only guaranteed to be
able to copy files to the instance directory on the dest node.
The solution is to have nova (via privsep) copy the TPM files into the
local instance directory on the source and change their ownership. This
is handled through an additional call in 'migrate_disk_and_power_off'.
Nova, running as itself, then copies them into the instance directory
on the dest. Nova then (once again, via privsep) changes the ownership
back and moves the files to where libvirt expects to find them. This
second step is handled by 'finish_migration'. Confirming the resize
will result in the original TPM data at '/var/lib/libvirt/swtpm' being
deleted by libvirt and the copied TPM data in the instance directory
being cleaned up by nova (via 'confirm_migration'), while reverting it
will result in the same on the destination host.
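A simplified sketch of the source-side copy step (function name and paths are illustrative; nova performs this via privsep inside 'migrate_disk_and_power_off' and the chown requires root):

```python
import os
import shutil

def copy_vtpm_data(swtpm_base, instance_uuid, instance_dir,
                   owner_uid=None, owner_gid=None):
    """Copy an instance's persistent swtpm data into its local instance
    directory so it can be transferred to the destination host.

    'swtpm_base' would be '/var/lib/libvirt/swtpm' in a real deployment.
    """
    src = os.path.join(swtpm_base, instance_uuid)
    dst = os.path.join(instance_dir, 'swtpm', instance_uuid)
    shutil.copytree(src, dst)
    if owner_uid is not None and owner_gid is not None:
        # Chown to the user nova-compute runs as, so the files can be
        # copied to the destination without special privileges.
        os.chown(dst, owner_uid, owner_gid)
        for root, dirs, files in os.walk(dst):
            for name in dirs + files:
                os.chown(os.path.join(root, name), owner_uid, owner_gid)
    return dst
```

The mirror-image step in 'finish_migration' would chown the files back and move them under libvirt's swtpm directory on the destination.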
Part of blueprint add-emulated-virtual-tpm
[1] https://libvirt.org/migration.html#offline
Change-Id: I9b053919bb499c308912c8c9bff4c1fc396c1193
Signed-off-by: Chris Friesen <chris.friesen@windriver.com>
Co-authored-by: Stephen Finucane <stephenfin@redhat.com>
We're going to gradually introduce support for the various instance
operations when using vTPM due to the complications of having to worry
about the state of the vTPM device on the host. Add in API checks to
reject all manner of requests until we get to include support for each
one. With this change, the upcoming patch to turn everything on will
allow a user to create, delete and reboot an instance with vTPM, while
evacuate, rebuild, cold migration, live migration, resize, rescue and
shelve will not be supported immediately.
While we're here, we rename two unit test files so that their names
match the files they are testing and one doesn't have to spend time
finding them.
Change-Id: I3862a06ca28b383d525bcc9dcbc6fb1d4062f193
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Current tests do not have good coverage of the existing policies.
Either tests for policies do not exist, or, where they do exist, they
do not cover actual negative and positive testing.
For example, if a policy's default rule is admin-only, then tests
should verify:
- policy check passes with a context having admin or server owner
- policy check fails with a context having non-admin and not server owner
As discussed in policy-defaults-refresh, to change the policies
with new default roles and scope_type, we need enough test
coverage of the existing policy behavior.
When we add scope_type or the new default roles in policies,
this test coverage will be extended to adopt the new changes
and also to make sure we do not break the existing behavior.
This commit adds test coverage for the existing migrate server
policies.
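As an illustration of the positive/negative pattern described above, a toy admin-or-owner check (purely illustrative, not oslo.policy) with both directions asserted could look like:

```python
class Context:
    """Minimal stand-in for a request context."""
    def __init__(self, user_id, is_admin=False):
        self.user_id = user_id
        self.is_admin = is_admin

def check_admin_or_owner(context, target):
    """Toy stand-in for an 'admin_or_owner' policy rule."""
    return context.is_admin or context.user_id == target['user_id']

# Positive testing: admin and server owner both pass.
assert check_admin_or_owner(Context('alice', is_admin=True),
                            {'user_id': 'bob'})
assert check_admin_or_owner(Context('bob'), {'user_id': 'bob'})
# Negative testing: non-admin, non-owner fails.
assert not check_admin_or_owner(Context('eve'), {'user_id': 'bob'})
```

The real policy tests do the same with oslo.policy rules and real request contexts.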
Partial implement blueprint policy-defaults-refresh
Change-Id: I1de770bd17d3b8bd2e4f3381ae73a1f6cdf30c80
Previous patches in the blueprint implemented support for unshelve
with qos ports and added functional test coverage for the
various scenarios. So this patch changes the API check
that rejected such operations to instead check the service version and
therefore conditionally enable the feature.
Change-Id: Iaf70ee41f1bfb1a4964da3f59cd3a0b4b5e20d36
blueprint: support-move-ops-with-qos-ports-ussuri
At some point in the past, there was only nova-network and its code
could be found in 'nova.network'. Neutron was added and eventually found
itself (mostly!) in the 'nova.network.neutronv2' submodule. With
nova-network now gone, we can remove one layer of indirection and move
the code from 'nova.network.neutronv2' back up to 'nova.network',
mirroring what we did with the old nova-volume code way back in 2012
[1]. To ensure people don't get nova-network and 'nova.network'
confused, 'neutron' is retained in filenames.
[1] https://review.opendev.org/#/c/14731/
Change-Id: I329f0fd589a4b2e0426485f09f6782f94275cc07
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
If the source compute service is down when a resize or
cold migrate is initiated the prep_resize cast from the
selected destination compute service to the source will
fail/hang. The API can validate the source compute service
is up or fail the operation with a 409 response if the
source service is down. Note that a host status of
"MAINTENANCE" means the service is up but disabled by
an administrator which is OK for resize/cold migrate.
The solution here works the validation into the
check_instance_host decorator which surprisingly isn't
used in more places where the source host is involved
like reboot, rebuild, snapshot, etc. This change just
handles the resize method but is done in such a way that
the check_instance_host decorator could be applied to
those other methods and perform the is-up check as well.
The decorator is made backward compatible by default.
Note that Instance._save_services is added because during
resize the Instance is updated and the services field
is set but not actually changed; Instance.save()
handles object fields differently, so we need to implement
the no-op _save_services method to avoid a failure.
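A sketch of the decorator shape described above (simplified; the real check_instance_host lives in nova/compute/api.py and consults the servicegroup API):

```python
import functools

class ComputeHostNotReady(Exception):
    """Mapped to an HTTP 409 response by the API layer (sketch)."""

def check_instance_host(check_is_up=False):
    """Verify the instance has a host and, optionally, that the compute
    service on that host is up. 'check_is_up' defaults to False to keep
    the decorator backward compatible."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, context, instance, *args, **kwargs):
            if not instance.get('host'):
                raise ComputeHostNotReady('instance has no host')
            if check_is_up and not self.servicegroup_is_up(instance['host']):
                raise ComputeHostNotReady('source compute service is down')
            return func(self, context, instance, *args, **kwargs)
        return wrapper
    return decorator

class ComputeAPI:
    """Toy compute API; 'servicegroup_is_up' stands in for the real
    servicegroup check."""
    def __init__(self, up_hosts):
        self._up_hosts = up_hosts

    def servicegroup_is_up(self, host):
        return host in self._up_hosts

    @check_instance_host(check_is_up=True)
    def resize(self, context, instance):
        return 'resized'
```

Other methods (reboot, rebuild, snapshot, ...) could later opt in by applying the decorator with check_is_up=True.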
Change-Id: I85423c7bcacff3bc465c22686d0675529d211b59
Closes-Bug: #1856925
Before [1] this could be raised from the API resize()
method if getting a RequestSpec failed and a target host
was specified for cold migration. Since that change removed the
only real usage of the exception, we can remove it altogether, as
only unit test code still references it.
[1] I34ffaf285718059b55f90e812b57f1e11d566c6f
Change-Id: I19db48bd03855d1a1edbeff5adf15a28abcb5d92
The nova-api checks at each move* operation whether the instance has a
qos port attached, as not all move operations are supported for such
servers. Nova uses the request context to initialize the neutron client
for the port query. However, neutron does not return the value of the
port's resource_request if it is queried with a non-admin client.
As a result, if a move operation is initiated by a non-admin,
nova thinks that the ports have no resource request.
This patch creates an admin context for this neutron query.
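A sketch of the fix, assuming a hypothetical client factory; the key point is that the Neutron port query runs with an elevated (admin) context regardless of who initiated the move:

```python
def get_port_resource_requests(context, port_ids, make_neutron_client):
    """Fetch resource_request for ports using an admin client.

    Neutron only returns 'resource_request' to admins, so the query
    uses context.elevated(). 'make_neutron_client' is a hypothetical
    factory taking a context and returning a neutron client.
    """
    admin_context = context.elevated()
    client = make_neutron_client(admin_context)
    return {
        port_id: client.show_port(port_id)['port'].get('resource_request')
        for port_id in port_ids
    }
```

With the original code, the same query made with the user's own context would silently come back without the resource_request field.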
The new functional tests are not added before this patch in a
regression-test-like way, as existing functional tests are reused with
a different setup, and doing that without the fix would cause a lot of
different failure scenarios.
Note that the neutron fixture is changed to simulate the different
behavior when different request contexts are used to initialize the
client.
*: Note that Id5f2f4f22b856c989e2eef8ed56b9829d1bcefb6 removed the check
for evacuate in Ussuri, but the check still exists in Train and Stein.
Change-Id: I3cf6eb4654663865d9258c38f05cd05974ffcf9d
Closes-Bug: #1850280
During cold migrate the RequestSpec goes from the dest compute to the
source compute and then back to the dest. The previous patch [1] added
a service level check for the dest compute. However, the source compute
also needs to be new enough so that the RequestSpec is passed through it.
Please note that the functional coverage for this api change is in a
later patch [2].
[1] https://review.opendev.org/#/c/680394
[2] https://review.opendev.org/#/c/655113
blueprint: support-move-ops-with-qos-ports
Change-Id: I09cac780b9ee5b5726874d4e6f895fd0cd4bff8c
The instance.flavor is currently lazy-loaded in the resize method
in nova/compute/api.py.
Set expected_attrs=['flavor'] at the common.get_instance call
in the _migrate method
in nova/api/openstack/compute/migrate_server.py
to avoid lazy-loading instance.flavor.
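A sketch of the idea with fake objects (common.get_instance and the compute API are simplified stand-ins):

```python
class FakeInstance:
    def __init__(self, loaded_attrs):
        self.loaded_attrs = loaded_attrs

class FakeComputeAPI:
    """Returns an instance with only the requested attrs pre-loaded."""
    def get(self, context, instance_id, expected_attrs=None):
        return FakeInstance(loaded_attrs=list(expected_attrs or []))

def get_instance(compute_api, context, instance_id, expected_attrs=None):
    # Simplified stand-in for common.get_instance(): passing
    # expected_attrs=['flavor'] makes the initial DB fetch include the
    # flavor, so a later access to instance.flavor in resize() does not
    # trigger an extra lazy-load round trip.
    return compute_api.get(context, instance_id,
                           expected_attrs=expected_attrs)
```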
Change-Id: Iba3b7c3e027ec78395a102c1fed46fa7a2ffa7be
Closes-Bug: #1829877
This addresses review comments from the following changes:
I61a3e8902a891bac36911812e4e7c080570e3850
I48e6db9693e470b177bf4c75211d8b883c768433
Ic70d2bb781b6a844849a5cf2fe4d271b5a81093d
I5a956513f3485074023e027430cc52ee7a3f92e4
Ica6152ccb97dce805969d964d6ed032bfe22a33f
Part of blueprint bandwidth-resource-provider
Change-Id: Idffaa6d206cda3f507e6be095356537f22302ad7
Add a new microversion that removes support for the 'force'
argument, which cannot be adequately guaranteed in the new placement
world.
Change-Id: I2a395aa6eccad75a97fa49e993b0300bdcfc7258
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Implements: blueprint remove-force-flag-from-live-migrate-and-evacuate
APIImpact
Nova does not currently consider the resource request of a Neutron
port. So this patch makes sure that server migrate and live migrate
requests are rejected if they involve a port that has a resource
request. When the feature is ready on the nova side, this limitation
will be lifted.
blueprint: bandwidth-resource-provider
Change-Id: I48e6db9693e470b177bf4c75211d8b883c768433
Live migration is currently totally broken if a NUMA topology is
present. This affects everything that's been regrettably stuffed in with
NUMA topology including CPU pinning, hugepage support and emulator
thread support. Side effects can range from simple unexpected
performance hits (due to instances running on the same cores) to
complete failures (due to instance cores or huge pages being mapped to
CPUs/NUMA nodes that don't exist on the destination host).
Until such a time as we resolve these issues, we should alert users to
the fact that such issues exist. A workaround option is provided for
operators that _really_ need the broken behavior, but it's defaulted to
False to highlight the brokenness of this feature to unsuspecting
operators.
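A sketch of how an operator could opt back in, assuming the option lands in the '[workarounds]' group as 'enable_numa_live_migration' (the name introduced by this change) in nova.conf:

```ini
[workarounds]
# Opt back in to the known-broken pre-change behavior of live migrating
# instances with a NUMA topology. Defaults to False.
enable_numa_live_migration = True
```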
Change-Id: I217fba9138132b107e9d62895d699d238392e761
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Related-bug: #1289064
This patch sets the stage for modifying the behavior of nova show
which currently gives a 500 when the cell in which the instance
lives is down. The new behavior will return a partial construct
consisting of uuid, project_id, created_at from instance_mappings
table and user_id, flavor, image_ref and availability_zone info
from request_specs table. Note that the rest of the keys will be
missing. This behavior will be enabled by passing a new enough
microversion, handling for which is introduced later in this series.
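A sketch of assembling the partial construct from the two API-database tables (field names follow the commit message; the mapping/spec objects are simplified dicts, and surfacing status as 'UNKNOWN' is an assumption for illustration):

```python
def build_partial_server_view(instance_mapping, request_spec):
    """Assemble the partial server representation for a down cell.

    'instance_mapping' and 'request_spec' stand in for rows of the
    instance_mappings and request_specs tables.
    """
    return {
        'id': instance_mapping['instance_uuid'],
        'tenant_id': instance_mapping['project_id'],
        'created': instance_mapping['created_at'],
        'user_id': request_spec['user_id'],
        'flavor': request_spec['flavor'],
        'image': request_spec['image_ref'],
        'OS-EXT-AZ:availability_zone': request_spec['availability_zone'],
        # The rest of the keys are simply missing from the response.
        'status': 'UNKNOWN',
    }
```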
Related to blueprint handling-down-cell
Change-Id: Iaea1cb4ed93bb98f451de4f993106d7891ca3682
Replace mox with mock or stub_out in the following files
in nova/tests/unit/api/openstack/compute directory.
* admin_only_action_common.py
* test_admin_actions.py
* test_lock_server.py
* test_migrate_server.py
* test_pause_server.py
* test_suspend_server.py
Change-Id: I83b473e9ba557545b5c186f979e068e442de2424
Implements: blueprint mox-removal
Change Ibcb6bf912b3fb69c8631665fef2832906ba338aa dropped
the compute RPC API code for checking old computes from the
API and raising specific errors if the API is trying to perform
some action on an instance running on an old compute that can't
handle that action.
With that change, the API now expects that computes should be
able to handle at least Queens level operations, i.e. the
can_send_version checks for those operations in the compute RPC
API code would not be False.
Since those compute RPC API checks are removed, the handling in
the API is dead code now, so we can cleanup that handling.
Change-Id: Ibd05139c5f6a0548f17e24d3807746b93d76f446
Previously the user was getting a 500 error code for ComputeHostNotFound
when using the latest microversion that does live migration
asynchronously. This patch changes the response to 400, as a 500
internal server error should not be returned for failures due to
user error that can be fixed by changing the request on the client side.
Change-Id: I7a9de211ecfaa7f2816fbf8bcd73ebbdd990643c
closes-bug:1643623
This feature enables users to specify a target host
when cold migrating a VM instance.
This patch modifies the migration API.
APIImpact
Add an optional parameter 'host' in cold migration action.
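For illustration, the migrate action body with and without the new optional parameter could be built like this ('node1' in the usage below is a placeholder host name):

```python
def build_migrate_body(host=None):
    # Without 'host' the action body stays null and the scheduler picks
    # the destination; with 'host' that destination is requested.
    return {'migrate': None if host is None else {'host': host}}
```

Usage: build_migrate_body() keeps the pre-change behavior, while build_migrate_body('node1') requests a specific destination.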
Change-Id: Iee356c4dd097c846b6ca8617ead6a061300c83f8
Implements: blueprint cold-migration-with-target-queens
Commit c824982e6a did not update the
expected exceptions. Therefore we ended up with a 500 internal server
error when triggering a targeted live migration to a non-existing
compute node. This patch adds ComputeHostNotFound to the expected
exception list in both the conductor and the API.
Change-Id: If515a90217a8e329d932dcacb357b78081c505c1
Related-bug: 1538837
Pre-live-migration checks are now done in an async way. This patch
bumps the REST API version to keep this tracked.
bp: async-live-migration-rest-check
Change-Id: I9fed3079d0f1b7de3ad1b3ecd309c93785fd11fe
After modifying the evacuate action, we now add a new microversion
change modifying the live-migrate call so that the scheduler is
called when the admin user provides a hostname, unless the force
field is provided.
APIImpact
Implements: blueprint check-destination-on-migrations-newton
Change-Id: I212cbb44f46d7cb36b5d8c74a79065d38fc526d8
When a Cinder client exception is thrown in initialize_connection
(which is called during the prechecks for live migration),
the instance is moved to ERROR state. It is not sensible
to move the instance to ERROR state when a precheck fails.
Adding a new exception changes this behavior: the instance will be
reset to ACTIVE state when the precheck fails.
Closes bug: #1544744
Change-Id: I7a5fcc070ff53086f37417f12e2b9f383e220747
There are two implementations of similar APIs in the Nova repository.
One is newer, the v2.1 API; the other is legacy, the v2 API. The v2.1
API has been used as the default API since Liberty and the legacy v2
API has been marked as deprecated. We have used and tested the v2.1
API well, so now is a good time to remove the legacy API code, based
on the consensus of the Austin design summit. This patch removes unit
tests of the legacy v2 API [f-n].
Partially implements blueprint remove-legacy-v2-api-code
Change-Id: I543bc2a9c068aae2c755f8159c7d2a9fff2c67ee
It has been reported that the exception LiveMigrationWithOldNovaNotSafe
is not useful since the change
I5651fb7ba95f38e2e2f8a48a98ff04072c6bb885.
This patch cleans up the definition and the occurrences of
that exception.
Change-Id: I7a5b677904d83104c4f5367b0245eebd422e2338
Closes-Bug: #1550282
These are the os-migrateLive API changes:
* 2.25 - Make block_migration support the 'auto' value and remove
  disk_over_commit.
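For illustration, a 2.25-style os-migrateLive request body could be built like this (the host value in the test is a placeholder):

```python
def build_live_migrate_body(host=None, block_migration='auto'):
    # At microversion 2.25 block_migration additionally accepts 'auto'
    # and the disk_over_commit parameter is no longer accepted.
    return {'os-migrateLive': {'host': host,
                               'block_migration': block_migration}}
```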
Partially implements: blueprint making-live-migration-api-friendly
APIImpact
DocImpact
Change-Id: Ibb0d50f0f7444028ef9d0c294aea41edf0024b31
When we boot or resize an instance, if multiple requested
resources (cores, RAM and instances) exceed quota,
only the details of the core resource are output to the
user in the exception; omitting the other exceeded resource
counts leaves the end user with no idea which flavor can be
used to boot an instance successfully.
Fix this issue and update the related test cases.
Change-Id: I969d73e2f222278ea8a2bb4c21474c13ab213d81
Closes-Bug: #1469942
This patch moves the tests in contrib/ and plugins/v3/ into the
base directory.
Note that some of the tests have both v2 and v2.1 variants. The v2
tests will be deleted when the v2 API is removed.
Co-Authored-By: Ed Leafe <ed@leafe.com>
Change-Id: I6ff1d6594e7a44f2bcb6bbb04a4277b98d1cac74
Partial-Bug: #1462901