Cinder introduced "shared_targets" and "service_uuid" fields in volumes
to allow volume consumers to protect themselves from unintended leftover
devices when handling iSCSI connections with shared targets.
Nova avoids races caused by automatic rescans on iSCSI volumes when
detaching a volume while Cinder is mapping another volume to the same
host by locking and only allowing one attach or one detach operation for
each server to happen at a given time if "shared_targets" is set to
True.
When using an up-to-date Open iSCSI initiator we don't need to use
locks, as it is possible to disable automatic LUN scans (which are the
real cause of the leftover devices), and OS-Brick already supports this
feature.
Currently Nova is blindly locking whenever "shared_targets" is set to
True, even when the iSCSI initiator and OS-Brick are already preventing
such races, which introduces unnecessary locking and serialization on
the connection of volumes.
This patch uses the new context manager introduced in OS-Brick to allow
Nova to abstract its code from all these storage-internal details and to
only lock when it's really necessary.
Depends-On: I4970363301d5d1f4e7d0f07e09b34d15ee6884c3
Closes-Bug: #1800515
Change-Id: Ie9106d5832d6a728ea97a8dbb5ddb5dcc17a2ec4
The combined fixes for the two related bugs resolve the problem where
SIGHUP breaks the nova-compute service. Bump the minimum requirements
for oslo.privsep and oslo.service to make sure these fixes are in place,
and add a reno to advertise resolution of the issue.
This also bumps oslo.utils to match the lower constraint from
oslo.service.
Change-Id: I39ead744b21a4423352a88573f327273e4d09630
Related-Bug: #1794708
Related-Bug: #1715374
When instance_get_all_uuids_by_hosts was added [1] some follow up
cleanups were suggested. This change provides them:
* removal of redundancy in docstring
* moving docstring to the public method, rather than the private
implementation
* more clarity on the type of the default (defaultdict(list)) and
the implications thereof
* Using an sa.bindparam in the 'in_' call. This requires that the
SQLAlchemy requirement be raised to at least 1.2.0 where the feature
was added. 1.2.19, the latest bugfix release, is chosen.
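To illustrate the defaultdict(list) implications mentioned in the list
above, here is a small standalone sketch. The function uuids_by_host and
its (host, uuid) row format are hypothetical stand-ins, not the actual
DB API method:

```python
from collections import defaultdict


def uuids_by_host(rows):
    """Group (host, uuid) pairs into a defaultdict(list) keyed by host."""
    result = defaultdict(list)
    for host, uuid in rows:
        result[host].append(uuid)
    return result


grouped = uuids_by_host([("host1", "uuid-a"), ("host1", "uuid-b")])
print(grouped["host1"])    # ['uuid-a', 'uuid-b']
print("host2" in grouped)  # False
print(grouped["host2"])    # [] and "host2" is now a key
print("host2" in grouped)  # True
```

Looking up a host with no instances returns an empty list instead of
raising KeyError, but it also inserts that host as a key, which matters
to callers that iterate over the result afterwards.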
[1] If92fe8b75d20a738f37e2a74c52c59bfc699a74f
Change-Id: Ib538ab070d73b06ddeb9fea3af149304e40952ec
Add a new "hw:mem_encryption" extra spec parameter, and a new
"hw_mem_encryption" image property, which indicate that any guest
booted with that extra spec parameter or image property respectively
needs to be booted with its memory hardware-encrypted.
This is achieved by converting the requirement stated in the extra
spec parameter and/or image property into an additional extra spec
parameter which requests resources for one slot of the inventory of
the new MEM_ENCRYPTION_CONTEXT resource class (introduced in
os-resource-classes 0.4.0). The inventory will be provided by the
follow-up commit I659cb77f12a38a4d2fb118530ebb9de88d2ed30d.
Since future commits adding support for SEV to guest XML config will
also need to know at launch-time whether memory encryption has been
requested, add a reusable mem_encryption_requested() function to the
nova.virt.hardware library for detecting which of the extra spec /
image property (if either) have requested encrypted memory.
If both the extra spec parameter and the image property are explicitly
specified and they contradict each other, or if either request memory
encryption but the image does not have hw_firmware_type set to UEFI,
then log an error and raise a new generic FlavorImageConflict
exception. This exception can also be useful in the future for
handling other similar conflicts. In this particular use case,
FlavorImageConflict is raised by mem_encryption_requested(), and then
if caught during API call validation, it's re-raised as
HTTPBadRequest.
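The conflict rules above can be sketched roughly as follows. This is a
simplified stand-in that takes plain dicts rather than Flavor and
ImageMeta objects; the boolean parsing helper _as_bool and the error
messages are assumptions, not the code added by this change:

```python
class FlavorImageConflict(Exception):
    """Flavor extra specs and image properties cannot both be honoured."""


def _as_bool(value):
    # None means "not specified", which is distinct from an explicit False.
    if value is None:
        return None
    return str(value).strip().lower() in ("1", "true", "yes", "on")


def mem_encryption_requested(extra_specs, image_props):
    """Return True if encrypted memory is requested, raising on conflicts."""
    flavor_req = _as_bool(extra_specs.get("hw:mem_encryption"))
    image_req = _as_bool(image_props.get("hw_mem_encryption"))

    both_set = flavor_req is not None and image_req is not None
    if both_set and flavor_req != image_req:
        raise FlavorImageConflict("extra spec and image property disagree")

    requested = bool(flavor_req or image_req)
    if requested and image_props.get("hw_firmware_type") != "uefi":
        raise FlavorImageConflict("memory encryption requires UEFI firmware")
    return requested


print(mem_encryption_requested({"hw:mem_encryption": "true"},
                               {"hw_firmware_type": "uefi"}))  # True
```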
In order to test this code, we need to construct various ImageMeta
objects containing fake data and an ImageMetaProps instance for each.
This is a slightly fiddly task which future patches in the SEV series
will also need to perform, so add a helper to nova.tests.unit.image.fake
for this.
blueprint: amd-sev-libvirt-support
Change-Id: I8c63b5cc5ad97ce831adb2eb96a995ebc798ecb7
In version 0.35.0, openstacksdk added a strict_proxies kwarg to the
Connection constructor [1].
Without it, openstacksdk tries really hard to give us an Adapter, which
in the case of the service being down can mean we default to the catalog
endpoint without doing any discovery. This should usually work; but may
break in cases where the discovery document (at the catalog endpoint)
points to different URLs for versioned endpoints.
This commit adds a check_service bool kwarg to get_sdk_adapter which, if
True, uses strict_proxies to create the Connection, causing
get_sdk_adapter to raise a ServiceUnavailable exception if the service
is down.
This can be used for services like Ironic, where we're set up to
tolerate connect failures on startup. But it should not be used for
services like Placement, where we expect getting the adapter to succeed,
and are instead tolerant of failures making the actual API calls.
[1] https://review.opendev.org/#/c/676837/
This dependency bumps the openstacksdk u-c in the requirements project.
Depends-On: https://review.opendev.org/678207
Change-Id: I86e038af8a96e113a754b2fdb3698acd3783c1c8
A number of different efforts are going to need to make use of
openstacksdk 0.34.0 and keystoneauth1 3.16.0, so rather than bump the
minimum in all of those, bump it in one place.
Also, this gives us the opportunity to independently validate some of
the fixes (particularly in logging) we were expecting to affect nova.
Change-Id: I87d1dcd299f6547d5f3c3d77e219bf71aba1cff2
'AVX512-VNNI' is the instruction set for vector neural network
instructions, supported since Cascade Lake CPUs. Enable this
CPU feature in Nova.
Requires 'os-traits' to be greater than '0.16.0'.
Depends-On: Ia421ed500fbc15bf0088a8436ddeb5d8d1196256
Change-Id: I4ee821cba7cd23f0db9dc2c2c83c78ef5e70ad7b
Enables the use of the sdk instead of ksa adapter or python-*client.
It is provided by a get_sdk_adapter method which constructs an
authenticated SDK Connection object using provided service configuration.
This change should be transparent to operators of services which already
use ksa as get_sdk_adapter uses the same conf options from keystoneauth1.
Blueprint: openstacksdk-in-nova
Co-Authored-By: Dustin Cowles <dustin.cowles@intel.com>
Change-Id: I49f364e01e2a18de0c95674654fc72acea019e76
Release 3.15.0 of keystoneauth1 introduced the ability to pass
X-Openstack-Request-Id to request methods (get/put/etc) via a
global_request_id kwarg rather than having to put it in a headers dict.
This commit bumps the minimum ksa level to 3.15.0 and takes advantage of
the new kwarg to replace explicit header construction in
SchedulerReportClient (Placement) and neutronv2/api methods.
Also normalizes the way param lists were being passed from
SchedulerReportClient's REST primitives (get/put/post/delete) into the
Adapter equivalents. There was no reason for them to be different.
Change-Id: I2f6eb50f4cb428179ec788de8b7bd6ef9bbeeaf9
This adds code which hooks in the update_provider_tree flow in the
ResourceTracker, specifically when the RT is generically modifying
the compute-owned traits for the given compute node resource provider.
A future change will add the scheduler request pre-filter and
API code to sync the trait when enabling/disabling a compute service.
This is necessary for two cases specifically:
1. After upgrading an older disabled compute we will sync the trait.
2. If enabling/disabling a compute and the service is down, the API
will not call the compute service to sync the trait. When the
compute service is restarted we will sync the trait on startup
with this code.
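Both cases above reduce to recomputing one trait from the service
state. A minimal sketch, assuming traits are handled as a plain set of
strings and using a hypothetical helper name sync_disabled_trait:

```python
COMPUTE_STATUS_DISABLED = "COMPUTE_STATUS_DISABLED"


def sync_disabled_trait(traits, service_disabled):
    """Return a copy of the provider traits with the disabled trait synced."""
    updated = set(traits)
    if service_disabled:
        updated.add(COMPUTE_STATUS_DISABLED)
    else:
        updated.discard(COMPUTE_STATUS_DISABLED)
    return updated


print(sorted(sync_disabled_trait({"HW_CPU_X86_AVX2"}, service_disabled=True)))
```

Because the result depends only on the current service state, running
it on every startup or periodic update is idempotent.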
The COMPUTE_STATUS_DISABLED trait was added to os-traits in change
Ia8e4487bfb59f764a6817ec8650785ffa902eab5 which is in the 0.15.0
release of os-traits so the requirements are bumped here as well.
Part of blueprint pre-filter-disabled-computes
Change-Id: I3005b46221ac3c0e559e1072131a7e4846c9867c
Version 2.6 of the cryptography library [1] added support for ed25519
ssh keys. This works with OpenSSL >= 1.1.1b.
In nova, we can enable people to use ed25519 ssh keys by using the
necessary cryptography library version. Users must make sure they have
a new enough OpenSSL version, else they won't be able to generate
ed25519 ssh keys using ssh-keygen in the first place. I did a local
test using Ubuntu 18.04 and things "just worked" when I generated a
ed25519 ssh key and imported it into nova. I left a comment on the
launchpad bug accordingly.
This updates our minimum version to the latest available version 2.7.
Closes-Bug: #1555521
[1] https://cryptography.io/en/latest/changelog/#v2-6
Change-Id: Id4a4e1ae4c0acd40c1fc32c3b82a8d8a62d4624d
This release of the Cinder client broke support for the v3
volume-transfer APIs unless microversion 3.55 or higher was requested.
Depends-On: https://review.opendev.org/#/c/587877/
Change-Id: Ieb685a476d51d92ad3f153fb3d1fabfb6d5a4376
This is to pick up change If20663ecad19f18f22172ae489206b42489fd9f6
for our lower-constraints CI job to avoid blowing up the subunit
parser with too much log output.
Change-Id: I3c404bd650521b44faddf97ef0b41953f82c4bd2
Related-Bug: #1813147
This makes the base virt driver define capability flags for each of the
glance-defined image types. It also adds a capability-to-trait mapping
for each, causing any driver that supports a given image type to expose
the corresponding trait.
Related to blueprint request-filter-image-types
Change-Id: Id2912a46dddee3d63ce373e4d280fad79d0128a8
We have jsonschema capped at a fairly old version. Other than some
specific releases, it looks like keeping it below 3.0 was added in
I943fd68b9fab3bce1764305a5058df5339470757 without really any explanation
why.
In order to update to a 3.x release we need to:
1. Remove the cap from global-requirements.txt (see Depends-On), leaving
upper-constraints.txt at a 2.x release
2. Remove the cap from all consumers (this change)
3. Release a new version of consumers that are published to pypi
4. Update upper-constraints.txt with those new releases
5. Update jsonschema in upper-constraints.txt to a 3.X release
(See: https://review.openstack.org/649789)
6. Test consumers with the change from 5.
7. [Optional] fix issues in consumers that arise from 6.
8. Merge the change from 5.
Change-Id: I8ba739b97cb9673b34acb041524a2041c1489466
Co-Authored-by: Sean McGinnis <sean.mcginnis@gmail.com>
Depends-On: https://review.openstack.org/649669
In the review of a similar change in placement [1], it was realized that
the nova lower-constraints tox job probably had the same problems.
Testing revealed this to be the case. This change fixes the job and
updates the related requirements problems accordingly.
There are two main factors at play here:
* The default install_command in tox.ini uses the upper-constraints.txt
file. When there is more than one constraints.txt they are merged and
the higher constraints win. Using upper and lower at the same time
violates the point of lower (which is to indicate the bare minimum
we are capable of using).
* When usedevelop is true in tox, the command that is run to install the
current project's code is something like 'python setup.py develop',
which installs a project's requirements _after_ the install_command has
run, clobbering the constrained installs. When using pbr,
'python setup.py install' (used when usedevelop is False) does not do
this.
Fixing those then makes it possible to use the test to fix the
lower-constraints.txt and *requirements.txt files, changes include:
* Defining 'usedevelop = False' in the 'lower-constraints' target and
removing the otherwise superfluous 'skipsdist' global setting to
ensure requirements aren't clobbered.
* Removing packages which show up in lower-constraints.txt but not in
the created virtualenv. Note that the job only runs unit tests, so
this may be incomplete. In the placement version of this both unit and
functional are run. We may want to consider that here.
* Updating cryptography. This version is needed with more recent
pyopenssl.
* Updated keystonemiddleware. This is needed for some tests which
confirm passing configuration to the middleware.
* Update psycopg2 to a version that can talk with postgresql 10.
* Add PyJWT, used by zVMCloudConnector
* Update zVMCloudConnector to a version that works with Python 3.5 and
beyond.
* Update oslo.messaging to versions that work with the tests, under
Python 3.
* Adding missing transitive packages.
* Adjusting alpha-ordering to map to how pip freeze does it.
* setuptools is removed from requirements.txt because the created
virtualenv doesn't contain it
NOTE: The lower-constraints.txt file makes no commitment to expressing
minimum requirements for anything other than the current basepython.
So the fact that a different set of lower-constraints would be present
if we were using python2 is not relevant. See discussion at [1].
However, since requirements.txt _is_ used for python2, the
requirements-check gate job requires that enum34 be present in
lower-constraints.txt because it is in requirements.txt.
NOTE: A test is removed because it cannot work in the
lower-constraints context: 'test_policy_generator_from_command_line'
forks a call to 'oslopolicy-policy-generator --namespace nova' which
fails because stevedore fails to pick up nova-based entry points when
in a different process. This is because of the change to usedevelop.
After discussion with the original author of the test, removal was
considered an acceptable choice.
[1] http://eavesdrop.openstack.org/irclogs/%23openstack-dev/%23openstack-dev.2019-03-05.log.html#t2019-03-05T13:28:23
Closes-Bug: #1822575
Change-Id: Ic6466b0440a4fe012731a63715cf5d793b6ae4dd
We recently exposed the privsep opts for config generator use, so
projects that depend on oslo.privsep should include them in their
sample configs.
Change-Id: I7fab7002d51b2aaf1b0a6545d07b616120e26461
This builds on the ProviderTree work in the compute driver and
resource tracker to take the supported capabilities from a driver and
turn those into standard traits on the compute node resource provider.
This is a simple way to expose in a REST API (Placement in this case)
what a compute node, via its driver, supports.
This is also something easy that we can do in lieu of a full-blown
compute capabilities REST API in nova, which we've talked about for
years but never actually done anything about.
We can later build on this to add a request filter which will mark
certain types of boot-from-volume requests as requiring specific
capabilities, like for volume multiattach and tagged devices.
Any traits provided by the driver will be automatically added during
startup or a periodic update of a compute node:
https://pasteboard.co/I3iqqNm.jpg
Similarly any traits later retracted by the driver will be
automatically removed.
However any traits associated with capabilities which are
inappropriately added to or removed from the resource provider by the
admin via the Placement API will not be reverted until the compute
service's provider cache is reset.
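The add/remove behaviour for driver-provided traits can be sketched as
a pure set computation. The mapping contents and the helper name
update_traits are illustrative assumptions, and the sketch deliberately
ignores the provider cache mentioned above:

```python
# Hypothetical capability-to-trait mapping; entries are illustrative.
CAPABILITY_TRAITS = {
    "supports_multiattach": "COMPUTE_VOLUME_MULTI_ATTACH",
    "supports_device_tagging": "COMPUTE_DEVICE_TAGGING",
}


def update_traits(current_traits, capabilities):
    """Add traits for supported capabilities and drop retracted ones."""
    traits = set(current_traits)
    for capability, trait in CAPABILITY_TRAITS.items():
        if capabilities.get(capability):
            traits.add(trait)
        else:
            traits.discard(trait)
    return traits


current = {"CUSTOM_RACK_A", "COMPUTE_DEVICE_TAGGING"}
print(sorted(update_traits(current, {"supports_multiattach": True})))
```

Traits outside the mapping, such as the custom one here, are left
untouched, so only capability-derived traits are managed.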
The new call graph is shown in this sequence diagram:
https://pasteboard.co/I25qICd.png
Co-Authored-By: Adam Spiers <aspiers@suse.com>
Related to blueprint placement-req-filter
Related to blueprint expose-host-capabilities
Change-Id: I15364d37fb7426f4eec00ca4eaf99bec50e964b6
The warning should be gone with change
I192e84ce757d12d33085a209dd58d8ea46fb90fb in
oslo.db 4.44.0 so this changes the warnings
filter from ignore to error and bumps the minimum
required version of oslo.db to include that change.
Change-Id: If7b1a9613b58476fab8409211512613a8863cdde
Related-Bug: #1813147
Related-Bug: #1814199
This resolves the following deprecation warning:
b'/home/zuul/src/git.openstack.org/openstack/nova/.tox/functional-py35/
lib/python3.5/site-packages/oslo_service/threadgroup.py:193:
DeprecationWarning: Calling add_timer() with arguments to the callback
function is deprecated. Use add_timer_args() instead.'
The add_timer_args method was added in 1.34.0:
Ib2791342263e2b88c045bcc92adc8160f57a0ed6
So the required version of oslo.service is also updated.
Change-Id: Id54226dc926839686906d04ecf8d791c0881f82a
Partial-Bug: #1813147
With the extraction of placement we ended up with resource class names
being duplicated between nova and placement. To address that, the
os-resource-classes library [1] was created to provide a single
authority for standard resource classes and the format of custom
classes.
This patch changes nova to use it, removing the use of the rc_fields
module which used to have the information. A method left in it
(normalize_name) has been moved to utils.py, renamed as
normalize_rc_name, and callers and tests updated accordingly.
Because the placement code is being kept in nova for the time being,
that code's use of rc_fields is maintained, and the module too.
A note is added in the module explaining that. Backporting the changes
from extracted-placement to placement-in-nova was considered but
because we no longer have placement tests in nova, that didn't seem
like the right thing to do.
requirements and lower-constraints have been updated.
os-resource-classes is already in global requirements.
For reference the related placement change is at [2].
[1] https://docs.openstack.org/os-resource-classes
[2] https://review.openstack.org/#/c/623556/
Change-Id: I8e579920c0eaca81b563a87429c930b21b3d4dc5
Add plumbing for Contrail/Tungsten Fabric datapath offloads
* This change expands the VNIC type support for the vrouter VIF type by
adding 'direct' and 'virtio-forwarder' plugging support.
* After this change, the vrouter VIF type will support the following modes:
* 'normal': the 'classic' tap-style VNIC plugged into the instance,
* 'direct': a PCI Virtual Function is passed through to the instance,
* 'virtio-forwarder': a PCI Virtual Function is proxied to the
instance via a vhost-user virtio forwarder.
* The os-vif conversion function was extended to support the two new
plugging modes.
* Unit tests were added for the os-vif conversion functions.
* OpenContrail / Tungsten Fabric is planning to consume this
functionality for the 5.1 release.
* os-vif 1.14.0 is required to pass the metadata
Change-Id: I327894839a892a976cf314d4292b22ce247b0afa
Depends-On: I401ee6370dad68e62bc2d089e786a840d91d0267
Signed-off-by: Jan Gutter <jan.gutter@netronome.com>
blueprint: vrouter-hw-offloads
The rfc3986.is_valid_uri has been deprecated in 1.1.0.
It generates the following warnings.
DeprecationWarning: Please use rfc3986.validators.Validator instead.
This method will be eventually removed.
So this patch replaces rfc3986.is_valid_uri with
rfc3986.validators.Validator.
Even after applying this patch, the warnings are still output
because they are also caused by oslo.config.
The fix for oslo.config will be done in another patch.
Change-Id: I70aebad6c6bd384dbd11ef732226356922bf1913
Closes-Bug: #1809755
The CheatingSerializer fixture used in nova tests keeps
the RequestContext.db_connection set on the context object
which otherwise wouldn't normally happen. The context object
can show up in error notification payloads because of how
nova.exception_wrapper.wrap_exception works. That payload
is eventually serialized for notifications and since the
cheated RequestContext.db_connection is set and cannot be
serialized, it results in a UserWarning from the
jsonutils.to_primitive method (called via JsonPayloadSerializer).
This will eventually result in failures when that UserWarning
is made into an error.
To fix this, we can pass a fallback method to to_primitive()
which will serialize a RequestContext object the same way that
RequestContextSerializer serializes a context - by simply
converting it to dict form.
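The fallback idea can be shown with the standard library's json module,
whose "default" hook plays a similar role. This is an analogy only: the
RequestContext class below is a toy stand-in and the code does not use
oslo.serialization:

```python
import json


class RequestContext:
    """Toy context object holding one unserializable attribute."""

    def __init__(self, user_id, db_connection=None):
        self.user_id = user_id
        self.db_connection = db_connection

    def to_dict(self):
        # db_connection is deliberately left out of the dict form.
        return {"user_id": self.user_id}


def fallback(obj):
    """Serialize contexts via to_dict(); reject anything else unknown."""
    if isinstance(obj, RequestContext):
        return obj.to_dict()
    raise TypeError("cannot serialize %r" % type(obj))


payload = {"args": {"context": RequestContext("alice", object())}}
print(json.dumps(payload, default=fallback))
# {"args": {"context": {"user_id": "alice"}}}
```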
Since this only affects test runs, because of using the
CheatingSerializer fixture, it should have no impact on
runtime serializations.
Error logging is added to the FakeNotifier since it's hard
to know what is wrong in the payload unless it is logged.
Also, the WarningsFixture is updated to make sure we don't
introduce new UserWarnings for the serialization issue.
The jsonutils.to_primitive() fallback method was added to
oslo.serialization via commit cdb2f60d26e3b65b6370f87b2e9864045651c117
in 2.21.1 so we have to bump our minimum required version
of that library as well.
Change-Id: Id9f960a0c7c8897dbb9edf53b4723154341412d6
Closes-Bug: #1799249
The common upgrade check code has been moved to oslo.upgradecheck.
This change switches the Nova upgrade checks to use the common code
and removes the tests for functionality that is now the responsibility
of the library.
Change-Id: I0dc2044286dbe78314c650a92c4654f7f50642d2
Based on eventlet issue [1] ThreadPoolExecutor doesn't play nice with
eventlet in python 3.7. We saw deadlocks in the functional-py37
execution in live migration tests due to live migration using
ThreadPoolExecutor.
Issue [1] suggests replacing ThreadPoolExecutor with
futurist.GreenThreadPoolExecutor to avoid deadlocks. So this patch does
the replacement and adjusts the unit tests accordingly.
As the ThreadPoolExecutor was the last used class from the futures
module we remove that from the requirements and add the futurist module
instead.
[1] https://github.com/eventlet/eventlet/issues/508
Change-Id: Ia56ab43be739e677760bbad5c40caad924425fa5
Recently, the _ThreadingEvent class in oslo.service was removed [1] and
our unit test patching is preventing us from moving to a newer version
of oslo.service [2].
We have patching of the _ThreadingEvent.wait method to bypass the sleep
time in the looping call of RetryDecorator, which adds several seconds
to the run time of unit tests.
This changes things to use the new SleepFixture from oslo.service
instead.
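As a rough analogy for what a sleep fixture buys in tests, the sketch
below patches time.sleep with unittest.mock so a retry loop runs without
real delays. The retry helper is hypothetical and this is not the
SleepFixture from oslo.service:

```python
import time
from unittest import mock


def retry(func, attempts=3, delay=5):
    """Call func until it succeeds, sleeping between failed attempts."""
    for attempt in range(attempts):
        try:
            return func()
        except ValueError:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)


calls = []


def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise ValueError("not yet")
    return "ok"


# With sleep patched, two 5-second waits cost no wall-clock time.
with mock.patch("time.sleep") as fake_sleep:
    result = retry(flaky)
print(result, fake_sleep.call_count)  # ok 2
```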
Depends-On: https://review.openstack.org/616371
[1] I62e9f1a7cde8846be368fbec58b8e0825ce02079
[2] https://review.openstack.org/615676
Change-Id: I45dd7602068eb0ce1331cfefd5a0cf6418bc8e88
Later patches will introduce a field in RequestSpec using this type as
the field type to store the resource requests coming from outside of
Nova like the bandwidth request coming from the Neutron ports.
This patch refactors the usage of placement.lib.RequestGroup. Until now
this class was used both by placement and nova services and they used
it only as a util class. However after this series the nova services
would like to use such a class via RPC which requires an OVO. This
patch makes sure that the new OVO is used by nova and the old plain
object is used by placement. This way placement is not forced to use
an OVO where no OVO functionality is required.
The minimum required version of oslo.versionedobjects is updated to
1.33.3 to include the fix for bug 1771804.
Change-Id: I46c97d2641d9685ef59771314665a17a5236097d
blueprint: bandwidth-resource-provider
This reverts commit bd7d991309 and bumps
the minimum version of oslo.db to 4.40.0, as that is the first version
of the library to include the renamed attribute.
Change-Id: Ic9e7864be3af7ef362cad5648dfc7bdecd104465
Related-Bug: #1788833
This changes the max_concurrent_live_migrations handling
to use a ThreadPoolExecutor so that we can control a bounded
pool of Futures in order to cancel queued live migrations
later in this series.
There is a slight functional difference in the unlimited
case since starting in python 3.5, ThreadPoolExecutor will
default to ncpu * 5 concurrently running threads. However,
max_concurrent_live_migrations defaults to 1 and assuming
compute hosts run with 32 physical CPUs on average, you'd
be looking at a maximum of 160 concurrently running live
migrations, which is probably way above what anyone would
consider sane.
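A small sketch of why a bounded pool of Futures helps with cancelling
queued work. The live_migrate function is a placeholder, and the pool
size of 1 mirrors the max_concurrent_live_migrations default mentioned
above:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

release = threading.Event()


def live_migrate(name):
    # Placeholder for real work; blocks until the test releases it.
    release.wait(timeout=5)
    return "migrated-" + name


with ThreadPoolExecutor(max_workers=1) as pool:
    first = pool.submit(live_migrate, "a")   # occupies the only worker
    queued = pool.submit(live_migrate, "b")  # waits in the queue
    cancelled = queued.cancel()              # still pending, so cancellable
    release.set()
    result = first.result()

print(cancelled, result)  # True migrated-a
```

A Future can only be cancelled while it is still queued; once a worker
has started it, cancel() returns False.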
Co-Authored-By: Matt Riedemann <mriedem.os@gmail.com>
Part of blueprint abort-live-migration-in-queued-status
Change-Id: Ia9ea1e164fb3b4a386405538eed58d94ad115172
This is the first change that implements basic virt.driver methods
to allow the nova-compute process to start successfully.
A set of subsequent changes will implement spawn, snapshot, destroy
and instance power actions.
Change-Id: Ica6117c2c64f7518b78b7fb02487622250638e88
blueprint: add-zvm-driver-rocky
Bump the minimum version of oslo.config to 6.1.0, which adds proper
support for parsing Opt.help as rST [1]. This in turn allows us to
revert commit 75fc300901, which is a
temporary fix relying on deprecated features of Sphinx.
[1] https://review.openstack.org/#/c/553860/
Change-Id: I8f56bdce37cfc538348490052a24e463164c86a3
In Ie4d81fa178b3ed6b2a7b450b4978009486f07810 we started using a new WebOb API
for introspecting headers. Since this new API isn't supported by versions
older than 1.8, and 1.8.0 had a bug that was fixed in 1.8.1, we need to
only accept 1.8.1 or 1.8.2 for Nova.
Change-Id: I345f372815aef5ac0fb6fc607812ce81587734bf
Closes-Bug: #1773225
Due to change [1], the retrying package in requirements.txt
must match the lower bound found in lower-constraints.txt.
[1] https://review.openstack.org/#/c/574367/
Change-Id: I05600e8c606099aea74aa032f92c4f44f947cb4c
The new image handler creates an image proxy which
uses the VDI streaming function from os-xenapi to
remotely export a VHD from XenServer (image upload) or import
a VHD to XenServer (image download).
The existing GlanceStore uses custom functionality to directly
manipulate files on-disk, so it has the restriction that the SR's
type must be file system based, e.g. ext or nfs. The new
image handler invokes APIs formally supported by XenServer
to export/import VDIs remotely, so it can also support other SR
types, e.g. lvm, iscsi, etc.
Note:
VDI streaming is supported by XenServer 6.5 or above.
The image handler depends on os-xenapi 0.3.3 or above, so bump
os-xenapi's version to 0.3.3 and also declare a dependency on the
patch which bumps the version in openstack/requirements.
Blueprint: xenapi-image-handler-option-improvement
Change-Id: I0ad8e34808401ace9b85e1b937a542f4c4e61690
Depends-On: Ib8bc0f837c55839dc85df1d1f0c76b320b9d97b8
This change makes nova configure oslo.messaging's active call monitoring
feature if the operator increases the rpc_response_timeout configuration
option beyond the default of 60 seconds. If this happens, oslo.messaging will
heartbeat actively-running calls to indicate that they are still running,
avoiding a false timeout at the shorter interval, while still detecting
actual dead-service failures before the longer timeout value.
In addition, this adds a long_rpc_timeout configuration option that we
can use for known-to-run-long operations separately from the base
rpc_response_timeout value, and pre_live_migration() is changed to use
this, as it is known to suffer from early false timeouts.
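One way to picture how the two options interact is the pure-Python
sketch below. The helper rpc_timeouts, the HEARTBEAT_THRESHOLD constant
and the choice of heartbeat interval for long-running calls are
assumptions made for illustration; no oslo.messaging calls are shown:

```python
HEARTBEAT_THRESHOLD = 60  # the default rpc_response_timeout, in seconds


def rpc_timeouts(rpc_response_timeout, long_rpc_timeout, long_running=False):
    """Return (overall_timeout, heartbeat_interval); None disables heartbeats."""
    if long_running:
        # Assumed: heartbeat at the normal timeout, give up at the long one.
        return long_rpc_timeout, rpc_response_timeout
    if rpc_response_timeout > HEARTBEAT_THRESHOLD:
        # Operator raised the timeout: heartbeat at the shorter interval.
        return rpc_response_timeout, HEARTBEAT_THRESHOLD
    return rpc_response_timeout, None


print(rpc_timeouts(60, 1800))                     # (60, None)
print(rpc_timeouts(300, 1800))                    # (300, 60)
print(rpc_timeouts(60, 1800, long_running=True))  # (1800, 60)
```

In each case a dead service is noticed within the heartbeat interval,
while a call that is merely slow may run until the overall timeout.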
Depends-On: Iecb7bef61b3b8145126ead1f74dbaadd7d97b407
Change-Id: Icb0bdc6d4ce4524341e70e737eafcb25f346d197