Commit Graph

61899 Commits

Author SHA1 Message Date
Jay Faulkner 56cb5f52fb [ironic] Ensure unprovision happens for new states
States were added to the Ironic API to enable the node servicing
feature, which can be performed on nodes provisioned with Nova
instances. Current nova, if asked to delete these instances, will only
remove the instance metadata and not tear them down.

This change has two parts:
- I have added the new, relevant states to _UNPROVISION_STATES in
  driver.py, which now allows Nova to know that SERVIC* states and
  DEPLOYHOLD are safe to unprovision from.
- I have added all existing ironic states to ironic_states.py and the
  PROVISION_STATE_LIST constant and check the state against it -- in a
  case where a completely unknown state is returned, we should attempt
  an unprovision.
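
The state check described in the two points above can be sketched
roughly as follows. This is an illustration only, not the code in
driver.py: the state strings are an abbreviated, assumed subset, and the
helper name should_unprovision is hypothetical. Only
_UNPROVISION_STATES and PROVISION_STATE_LIST are named in this message.

```python
# Abbreviated, assumed state names for illustration; the real lists in
# ironic_states.py and driver.py are longer.
PROVISION_STATE_LIST = frozenset({
    "active", "deploy hold", "servicing", "service wait",
    "available", "enroll",
})
_UNPROVISION_STATES = frozenset({
    "active", "deploy hold", "servicing", "service wait",
})


def should_unprovision(provision_state):
    """Return True if a node in this state should be torn down on delete.

    Hypothetical helper, not part of Nova.
    """
    if provision_state not in PROVISION_STATE_LIST:
        # Completely unknown state: attempt the unprovision anyway.
        return True
    return provision_state in _UNPROVISION_STATES
```

With this sketch, a state that is missing from the known list is treated
the same as a state that is explicitly safe to unprovision from.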

This fix needs to be backported as far as possible, as this bug has
existed since Antelope / 2023.1 (DEPLOYHOLD) or Bobcat / 2023.2
(SERVIC*).

Assisted-by: Claude Code
Closes-bug: #2131960
Change-Id: I31c70d35b0e6e9f8d2252bfb2f0bdec477cc6cc7
Signed-off-by: Jay Faulkner <jay@jvf.cc>
2025-11-20 15:23:58 -08:00
René Ribaud f017e23b81 Use *_OR_ADMIN policy defaults for server shares
Update the server shares API policies to use
PROJECT_READER_OR_ADMIN and PROJECT_MEMBER_OR_ADMIN instead of
PROJECT_READER and PROJECT_MEMBER.

This aligns the server shares policies with other compute API
policies and ensures administrators can list, attach, show and
detach shares regardless of project policy overrides.

Signed-off-by: René Ribaud <rene.ribaud@gmail.com>
Change-Id: I2b237d56b08e3080475dc500e204298018af29c7
2025-11-20 15:15:00 +01:00
melanie witt c5c1b93d21 libvirt: add configuration option for volume AIO mode
With the NFS, FC, and iSCSI Cinder volume backends, Nova explicitly
sets AIO mode ``io=native`` in the Libvirt guest XML. Operators may set
this option to True in order to defer AIO mode selection to QEMU if
forcing ``io=native`` is not desired.

Closes-Bug: #2129788

Change-Id: I6e51706b5cb8be5becebbafe9108df1ba9e0f69f
Signed-off-by: melanie witt <melwittt@gmail.com>
2025-11-19 12:04:31 -08:00
Zuul 53aadaf967 Merge "Update comment about migrated mypy conf files" 2025-11-19 17:50:23 +00:00
Zuul 94788200db Merge "TPM: support instances with host secret security" 2025-11-19 17:45:06 +00:00
Zuul 32ad7a036b Merge "TPM: support instances with user secret security" 2025-11-19 17:37:39 +00:00
Zuul d6b0961862 Merge "TPM: add RequestContext checks to functional tests" 2025-11-19 15:51:15 +00:00
Rajesh Tailor cec81f76fb Update comment about migrated mypy conf files
The change Ife39b55eb40c9cb8e61f1b2295b6d42cefe3a680 migrated the mypy
configuration from setup.cfg to pyproject.toml, but a comment in
.pre-commit-config.yaml still says to keep it in sync with
setup.cfg, which is incorrect.

This change updates the comment in the .pre-commit-config.yaml file to
reflect the migration.

Signed-off-by: Rajesh Tailor <ratailor@redhat.com>
Change-Id: I4d35b989e8c90b629bcb15438ad82f60f7ca8957
2025-11-19 11:47:50 +05:30
Zuul e2eefc277c Merge "api: Add response body schemas for floating IP APIs" 2025-11-18 18:08:36 +00:00
Zuul df6f5c3fdc Merge "api: Add response body schemas for volume attachments APIs" 2025-11-18 17:54:28 +00:00
Zuul c5ebda4d84 Merge "api: Add response body schemas for snapshots APIs" 2025-11-18 17:54:16 +00:00
Zuul 0c33871c36 Merge "Add managed='no' flag to libvirt XML definition for VIF type TAP" 2025-11-18 14:57:17 +00:00
Artom Lifshitz 245a321e43 TPM: support instances with host secret security
Start supporting booting instances with the `host` TPM secret
security. This means setting the `ephemeral` and `private` attributes
on the Libvirt secret correctly, and not undefining the secret once
the instance has spawned. The Libvirt fixture's Secret support is
extended to be able to test all that in a functional test.

For functional testing, we need to:

* Extend our libvirt fixture's Secret object to properly set the usage
  id (which is just the instance UUID) when parsing vTPM secret XML.

Related to blueprint vtpm-live-migration

Change-Id: I5a38a0de76a78b28a205a8d19f2374830054e1ab
Signed-off-by: melanie witt <melwittt@gmail.com>
2025-11-17 17:26:38 -08:00
Artom Lifshitz ad1dd5e594 TPM: support instances with user secret security
The `user` secret security policy is just existing behavior. No
changes are necessary in the mechanics, so this patch just adds a
scheduler prefilter and tests. The functional tests add some
groundwork to make future tests easier as well by making the helper
methods more flexible.

For functional testing, we need to:

* Have our libvirt fixture keep track of undefined secrets. Secrets
  are undefined as soon as the VM that uses them successfully boots
  (as mentioned previously, VM creation follows this pattern), but our
  tests would still like to assert that the secret had been created on
  a host. Just add a _removed_secrets dict that _remove_secret()
  populates.
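
The bookkeeping described above might look something like the sketch
below. The class FakeSecretStore and the define_secret method are
hypothetical stand-ins, not the actual libvirt fixture; only the
_removed_secrets dict and the _remove_secret() method are named in this
message.

```python
class FakeSecretStore:
    """Hypothetical stand-in for the fixture's secret bookkeeping."""

    def __init__(self):
        self._secrets = {}
        self._removed_secrets = {}

    def define_secret(self, usage_id, value):
        # Assumed helper: register a secret under its usage id.
        self._secrets[usage_id] = value

    def _remove_secret(self, usage_id):
        # Record the secret before dropping it, so that a test can
        # still assert that it had been created on this host.
        self._removed_secrets[usage_id] = self._secrets.pop(usage_id)


store = FakeSecretStore()
store.define_secret("instance-uuid", b"passphrase")
store._remove_secret("instance-uuid")
```

After the removal, the secret is no longer defined but remains visible
to the test through _removed_secrets.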

Related to blueprint vtpm-live-migration

Change-Id: Ib449dc2f1c4a9af9d423252594261947e811452e
Signed-off-by: melanie witt <melwittt@gmail.com>
2025-11-17 17:26:38 -08:00
melanie witt 0f82c2953e TPM: add RequestContext checks to functional tests
Key manager service secret ownership can be a challenge when dealing
with vTPM instances. Some instance actions require access to the secret
and will fail if there is a mismatch.

In preparation for vTPM live migration changes which will involve
different users accessing secrets (user|admin|Nova service user), this
removes ADMIN_ONLY from the functional tests class and adds checking of
RequestContext user_id in the FakeKeyManager.
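
As a rough illustration of the kind of ownership check meant here, the
sketch below stores the creating user with each secret and refuses
access to anyone else. The class name OwnerCheckingKeyManager and its
methods are hypothetical, and it takes a plain user_id string rather
than a RequestContext. It is not the FakeKeyManager used in the tests.

```python
class OwnerCheckingKeyManager:
    """Hypothetical key manager that enforces per-user ownership."""

    def __init__(self):
        self._secrets = {}  # secret_id -> (owner_user_id, value)

    def store(self, user_id, secret_id, value):
        self._secrets[secret_id] = (user_id, value)

    def get(self, user_id, secret_id):
        owner, value = self._secrets[secret_id]
        if owner != user_id:
            # Mismatch between the requesting user and the owner.
            raise PermissionError(
                f"user {user_id} may not read secret {secret_id}")
        return value
```

A test can then exercise an instance action as a different user and
check that the mismatch is detected.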

Change-Id: I2790cd274a4776ab306b39df1e591e8304b63f96
Signed-off-by: melanie witt <melwittt@gmail.com>
2025-11-17 17:26:38 -08:00
Zuul cf930034f2 Merge "Reproduce bug/2130881" 2025-11-17 16:44:07 +00:00
Zuul 72dd372fc4 Merge "[hacking] Improve N373 to catch also other primitives" 2025-11-17 16:18:33 +00:00
Sean Mooney 22012360c4 ensure correct cleanup of multi-attach volumes
If a host has multiple instances with the same shared
multi-attach volume and you delete them in parallel,
Nova needs to correctly clean up the volume connection on
the host when the last instance is removed.

Currently we do not have a volume-level lock to guard the
critical section that determines whether the current disconnect is
removing the final usage of the volume.

This can lead to leaking the volume or other issues, as
noted in bug #2048837.

This change introduces a FairLockGuard to ensure we acquire
and release the locks in a fair and ordered manner.
The FairLockGuard is used to lock the server delete with
one lock per multi-attach volume.

This ensures that disconnects of different volumes can happen
in parallel, but if we are disconnecting the same volume in multiple
greenthreads concurrently, they will be serialised.

Assisted-By: Cursor Auto
Closes-Bug: #2048837
Change-Id: I67e10cace451259127a5d7da8fbdf7739afe3e51
Signed-off-by: Sean Mooney <work@seanmooney.info>
2025-11-17 13:26:08 +00:00
Zuul 8a993d583f Merge "add functional repoducer for bug 2048837" 2025-11-14 22:20:55 +00:00
Dan Smith 326b77d837 Test nova-next with >1 parallel migrations
Change-Id: Ic69872e6667664d1b3bd7a88d7ef018b67352f44
Signed-off-by: Dan Smith <dansmith@redhat.com>
2025-11-13 06:32:54 -08:00
Sean Mooney fac1a4d9de add functional repoducer for bug 2048837
Change-Id: I8ce3044cff198209416d2a458317f01d1177e9da
Signed-off-by: Sean Mooney <work@seanmooney.info>
2025-11-12 10:32:06 +00:00
Zuul b7d50570c7 Merge "api: Add response body schemas for volumes APIs" 2025-11-11 20:10:29 +00:00
Sean Mooney 61242f75da allow funtional test to run with released placment
As part of I0b5e13673cb4cc7c57aeae50914ace443dfc18fa,
a new dependency was created on a placement config
option and the workarounds config group.

This enabled the workaround added in
I13ab83a165c229ae57876df4570e8af25221a45e,
which is present on master but not in a release.

That works in CI because in CI we use placement
master, but locally and in the requirements repo
we do not.

Closes-Bug: #2131032
Change-Id: I744049b5cf0ef69624fc4b6db1e5f415ab89a5af
Signed-off-by: Sean Mooney <work@seanmooney.info>
2025-11-10 20:07:10 +00:00
Zuul 68a0a69c33 Merge "Allow to perform parallel live migrations" 2025-11-07 22:36:34 +00:00
Zuul 6625a7b0c0 Merge "Fix test_simple_tenant_usage test" 2025-11-07 19:27:55 +00:00
Zuul 44c9d18b08 Merge "Migrate setup configuration to pyproject.toml" 2025-11-07 17:18:30 +00:00
Zuul 3466cd0152 Merge "Migrate codespell configuration to pyproject.toml" 2025-11-07 17:18:18 +00:00
Zuul fa3f07994f Merge "Migrate mypy configuration to pyproject.toml" 2025-11-07 17:18:06 +00:00
Dmitriy Rabotyagov 25fbf32f22 Allow to perform parallel live migrations
This patch implements parallel live migrations for the libvirt driver.
This is achieved by introducing a new configuration parameter,
`live_migration_parallel_connections`.

This eliminates a bottleneck on live migration speed by
establishing multiple connections for memory transfer, thus
leveraging multi-threaded behavior in QEMU.

Implements-blueprint: libvirt-parallel-migrate
Change-Id: I98ff5f07f94d94f3aa0227591f425d532773adb0
Signed-off-by: Dmitriy Rabotyagov <dmitriy.rabotyagov@cleura.com>
2025-11-07 07:17:54 -08:00
Balazs Gibizer 16abde56ee Init virt driver before use
The virt driver interface assumes that init_host is called before any
other query to the virt driver. The libvirt virt driver cannot fully
function otherwise. If any connection is made to the hypervisor before
driver.init_host then the libvirt lifecycle events will not function and
libvirt returns the warning:

  URI qemu:///system does not support events: internal error: could not
  initialize domain event timer: libvirt.libvirtError: internal error:
  could not initialize domain event timer

During the first startup of the nova-compute service,
ComputeManager.init_host checks if the hypervisor has any instances to
detect if this is not really the first start of the compute service on
the host. But that code path runs before ComputeManager.init_host
initializes the virt driver via driver.init_host.

This patch reorders the calls to make sure the driver is initialized
before use.

Closes-Bug: #2130881
Change-Id: I814a2f3982d481a1f926fe13465a19955c4f48f2
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
2025-11-07 15:48:54 +01:00
Balazs Gibizer 38495b8ada Reproduce bug/2130881
Show that ComputeManager.init_host calls the driver before calling
driver.init_host.

Related-Bug: #2130881
Change-Id: I364ecd4277fe8d5e62629355105fa799d7dabf19
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
2025-11-07 15:34:09 +01:00
Zuul 6ce9c77553 Merge "Default native threading for sch, api and metadata" 2025-11-07 12:52:34 +00:00
Zuul a79fea0347 Merge "Move monkey_patch from init to the entrypoints" 2025-11-06 23:48:36 +00:00
Balazs Gibizer 35207ee8b5 Default native threading for sch, api and metadata
This patch switches the default concurrency mode to native threading
for the services that gained native threading support in Flamingo:
nova-scheduler, nova-api, and nova-metadata.

The OS_NOVA_DISABLE_EVENTLET_PATCHING env variable can still be used to
explicitly switch the concurrency mode to eventlet by setting

  OS_NOVA_DISABLE_EVENTLET_PATCHING=false

We also ensure that the cover, docs, py3xx and functional tox targets
still run with eventlet, while py312-threading keeps running
with native threading.

Change-Id: I86c7f31f19ca3345218171f0abfa8ddd4f8fc7ea
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
2025-11-06 19:42:24 +01:00
Zuul 4cfdf18a0b Merge "[CI]nova-alt-configurations tests eventlet" 2025-11-06 17:44:13 +00:00
Nell Jerram 6aba55a23f Add managed='no' flag to libvirt XML definition for VIF type TAP
libvirt 9.5.0 and later by default doesn't allow using a pre-created
TAP device; instead it expects to create and manage the TAP device
itself, which is incompatible with how Nova works.  To restore
compatibility with Nova we need to add the managed="no" flag to the
target device section in the XML domain file.
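
For illustration, the resulting interface definition could be generated
as in the sketch below. The helper name tap_interface_xml and the
device and MAC values are made up for the example, and the element is
reduced to the minimum; Nova's real config classes emit considerably
more.

```python
import xml.etree.ElementTree as ET


def tap_interface_xml(dev_name, mac):
    """Build a minimal interface element for a pre-created TAP device.

    Hypothetical helper, not Nova's config code.
    """
    iface = ET.Element("interface", type="ethernet")
    ET.SubElement(iface, "mac", address=mac)
    # managed="no" asks libvirt to use the existing device rather than
    # create and manage one itself.
    ET.SubElement(iface, "target", dev=dev_name, managed="no")
    return ET.tostring(iface, encoding="unicode")


print(tap_interface_xml("tap0", "52:54:00:12:34:56"))
```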

The libvirt change is here[1].  In particular it breaks Calico for
OpenStack, because the Calico plugin (out of tree[2]) uses VIF type TAP.

1. https://github.com/libvirt/libvirt/commit/a2ae3d299cf
2. https://github.com/projectcalico/calico/blob/master/networking-calico/networking_calico/plugins/ml2/drivers/calico/mech_calico.py#L217

Many thanks to Masahito Muroi <masahito.muroi@linecorp.com> for
proposing an earlier version of this fix.

Closes-Bug: #2033681
Change-Id: I4a7b4ecf69cfe04c5291e5ca2a76db8829d6e592
Signed-off-by: Nell Jerram <nell@tigera.io>
2025-11-06 12:29:11 +00:00
Stephen Finucane 6c1e8cda78 pre-commit: Bump versions
Change-Id: I5c333a486ca11246ce5aebb2fabc3624a7534267
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2025-11-06 11:02:25 +00:00
Ghanshyam Maan c4ddc76bd6 Fix test_simple_tenant_usage test
The API policy test_simple_tenant_usage test does
not send the start and end times in the request's query
string. In that case, the API sets both the start and end
times to the current time. So there is a chance that
the start and end times are the same, in which case Nova
raises an error:
- https://github.com/openstack/nova/blob/9e5ad07aeeb9f14eba37e2cdea9377e7af48ef88/nova/api/openstack/compute/simple_tenant_usage.py#L258

Closes-Bug: 2130703

Change-Id: Ib47890087110d460504df64aeed5206ded2e70b0
Signed-off-by: Ghanshyam Maan <gmaan@ghanshyammann.com>
2025-11-05 20:23:48 +00:00
Zuul 7cf0758f38 Merge "Add handling for vTPM secret permission error" 2025-11-05 20:23:18 +00:00
Zuul 66558107c7 Merge "Add hw:tpm_secret_security extra spec validation" 2025-11-05 20:23:05 +00:00
Balazs Gibizer 0afb72e883 Move monkey_patch from init to the entrypoints
This move is needed so that we can define a per-service default for
monkey patching.

And yes, the single line with both noqa and autopep8 decorators is
needed to convince autopep8 that this code is OK to be at the start of
the file.

After moving the monkey_patching earlier in the wsgi entrypoint, I needed
to move the functional test monkey_patching call earlier too, to keep it
early enough for the tests where the wsgi entry point is not directly
imported.

Change-Id: Idedd2a440adc1cde1e8ffe6636854d5a891e66d2
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
2025-11-05 15:56:59 +01:00
Balazs Gibizer 0bf6780fb8 [CI]nova-alt-configurations tests eventlet
This patch renames the nova-ovs-hybrid-plug job to
nova-alt-configurations and ensures that all nova services
run with eventlet even after some of the services switch to
native threading by default. This ensures we keep eventlet test
coverage in place.

Change-Id: Id2b70aa3870f2bf5a28c875a7564f84c012c9456
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
2025-11-05 14:14:04 +01:00
Zuul 9e5ad07aee Merge "setup: Remove pbr's wsgi_scripts" 2025-11-05 11:15:52 +00:00
Stephen Finucane 324af749bb Migrate setup configuration to pyproject.toml
Or as much of it as we can currently.

Change-Id: I25f8e0ae5f7a652576678829e574b5bf2d2441a2
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2025-11-04 16:11:53 +00:00
Stephen Finucane 0b5461e18b Migrate codespell configuration to pyproject.toml
Change-Id: I9554b74bfd732e0e8e792ba543f2c3a6908c4bd9
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2025-11-04 16:11:53 +00:00
Stephen Finucane 4b09ba2a6b Migrate mypy configuration to pyproject.toml
Change-Id: Ife39b55eb40c9cb8e61f1b2295b6d42cefe3a680
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2025-11-04 16:11:53 +00:00
Stephen Finucane 5da2dc2060 setup: Remove pbr's wsgi_scripts
This is a technical dead end and not something we're going to be able to
support long-term in pbr. We need to push users away from this. Doing so
highlights quite a few places where our docs need some work, particularly
in light of the recent removal of the eventlet servers.

Change-Id: I2ffaed710fac2612f5337aca5192af15eab46861
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2025-11-04 16:11:50 +00:00
Johannes Kulik 710ffbb0c5 api: Pre-query not deleted members in server groups
When retrieving multiple - or all - server groups, the code tries to
find not deleted members for each server group in every cell
individually. This is highly inefficient, which is especially noticeable
when the number of server groups rises.

We change this to query all members of all server-groups we will reply
with (i.e. from the already limited list) in advance and pass this set
of existing uuids into the function formatting the server group. This is
more efficient, because we only do one large query instead of up to 1000
times the number of cells.
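
The batching idea can be sketched as follows. The function name
existing_members_by_group and the query_existing_uuids callable are
hypothetical; the real change also has to deal with cells, which this
sketch leaves out.

```python
def existing_members_by_group(groups, query_existing_uuids):
    """Filter each group's members using a single up-front lookup.

    `groups` maps a group ID to its list of member UUIDs.
    `query_existing_uuids` is a hypothetical callable that returns the
    subset of the given UUIDs that are not deleted.
    """
    all_members = {uuid for members in groups.values() for uuid in members}
    # One large lookup instead of one lookup per server group.
    existing = set(query_existing_uuids(all_members))
    return {
        group_id: [uuid for uuid in members if uuid in existing]
        for group_id, members in groups.items()
    }
```

The formatting code for each group then only needs a set membership
test rather than its own database round trip.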

Change-Id: I3459ce7a8bec9a9e6f3a3b496a3e441078b86af0
Signed-off-by: Johannes Kulik <johannes.kulik@sap.com>
Partial-Bug: #2122109
2025-11-03 11:46:43 +01:00
Zuul 32f58e8ad6 Merge "[func]Test with optimize_for_wide_provider_trees" 2025-10-31 16:30:09 +00:00
Zuul 74c568b96b Merge "[CI][nova-next]test with placement ac optimizations" 2025-10-31 16:29:56 +00:00