61366 Commits

Author SHA1 Message Date
Elod Illes e383b46545 [tool] Fix backport validator for non-SLURP
Non-SLURP branches are EOL'd when they reach the end of their maintained
phase. This can produce a situation where a patch is merged in a
non-SLURP branch that is deleted in the meantime, and its further
backports fail on gate with the backport validator because the hash of
the non-SLURP version of the patch is not on any branch.

This patch fixes the above issue as follows: if a hash is not found on
any branch, the validator checks whether it can be found under any
*-eol tag and only fails if it is not found there either.
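
The lookup order described above can be sketched as follows. This is a
minimal illustration, not the validator itself: the function name, the
ref-to-hashes mappings, and the branch and tag names are all hypothetical.

```python
def hash_is_known(commit_hash, branch_commits, tag_commits):
    """Return True if the hash is on any branch or under any *-eol tag.

    Both mappings are hypothetical stand-ins for what git would report:
    ref name -> set of commit hashes reachable from that ref.
    """
    # First look on the branches that still exist.
    if any(commit_hash in commits for commits in branch_commits.values()):
        return True
    # Fall back to the tags left behind by EOL'd (deleted) branches.
    return any(
        commit_hash in commits
        for tag, commits in tag_commits.items()
        if tag.endswith("-eol")
    )


# Made-up refs and hashes, for illustration only.
branches = {"stable/example-a": {"aaa111"}}
tags = {"example-b-eol": {"bbb222"}, "unrelated-tag": {"ddd444"}}
```

With this data a hash that only lives under the *-eol tag is accepted,
while one found only under an unrelated tag, or nowhere, is rejected.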

Change-Id: I56705bce8ee4354cd5cb1577a520c2d1c525f57b
2025-05-13 15:06:27 +02:00
Stephen Finucane 023be4f561 wsgi: Don't create, use lock in same line
As noted on the mailing list some time back [1], pylint flags this as a
useless lock [2]. Make it non-useless.

[1] https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/message/CZVC6SEMUSEH7UT5LDHOWL7WBZ2OXUWZ/
[2] https://pylint.readthedocs.io/en/latest/user_guide/messages/warning/useless-with-lock.html

Change-Id: If8243cc62c3dd9cd5f5b0d664981975efc6300cc
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2025-05-08 12:38:28 +01:00
Zuul a106f25e8e Merge "[quota]Refactor group counting to scatter-gather" 2025-05-06 22:40:35 +00:00
Balazs Gibizer 7f4c47c642 [quota]Refactor group counting to scatter-gather
The legacy server group member counting logic is a good fit for the
existing scatter-gather logic, so use that instead of rolling its own
thread handling.

This replaces a direct eventlet dependency with an indirect, shared one
in the scatter-gather code, which makes the eventlet removal work easier.
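
A generic sketch of the scatter-gather pattern, assuming a plain thread
pool from the standard library rather than nova's own helper. The
function names, cell names, and counts are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter_gather(targets, fn):
    """Run fn(target) for every target concurrently.

    Returns a dict of target -> result once all calls have finished.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        # Scatter: submit one call per target.
        futures = {target: pool.submit(fn, target) for target in targets}
        # Gather: wait for each result and collect it by target.
        return {target: future.result() for target, future in futures.items()}


def count_members(cell):
    """Hypothetical per-cell counter backed by fake data."""
    fake_counts = {"cell1": 2, "cell2": 3}
    return fake_counts[cell]


results = scatter_gather(["cell1", "cell2"], count_members)
total = sum(results.values())
```

The caller only sees the gathered results, so the threading details stay
in one shared helper instead of being repeated at each call site.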

Change-Id: I6d1b5f9654df2a93bd3722a5813d5ad3a7d1c94a
2025-05-05 16:06:20 +02:00
Balazs Gibizer a5bcaf69b1 Remove python 3.9 support
OpenStack recently dropped python3.9 support for global requirements[1]
as it was removed from the Flamingo supported runtimes[2].

So this patch removes 3.9 support from nova too.

[1] https://review.opendev.org/c/openstack/requirements/+/948285
[2] https://governance.openstack.org/tc/reference/runtimes/2025.2.html#python

Change-Id: I8aea971d7972959c32d5175926cbaddb21839f8e
2025-04-29 09:32:24 +02:00
Zuul 3e7017eb29 Merge "Replace eventlet sleep with time.sleep" 2025-04-28 10:17:55 +00:00
Zuul 2b06c9c6c2 Merge "Remove workaround for ovn live migration" 2025-04-25 19:41:40 +00:00
Zuul 00b2a1c861 Merge "Remove superfluous monkey patching form func test" 2025-04-25 19:21:57 +00:00
Zuul 4085069c23 Merge "split monkey_patching form import" 2025-04-25 19:21:45 +00:00
Zuul f3e206c08a Merge "Remove nova debugger functionality" 2025-04-25 19:21:34 +00:00
Zuul dd784f7327 Merge "FUP improve and add integration tests for PCI SR-IOV servers" 2025-04-25 15:10:27 +00:00
Balazs Gibizer ac765008c9 Remove superfluous monkey patching form func test
We have 4 ways to enter nova code and get monkey patched:

* nova.cmd - used by all of our CLI commands and non WSGI services
* nova.api.openstack - used by our WSGI services
* nova.test - used by our unit test environment to run nova services in GreenThreads.
* nova.tests.functional - used by our functional test environment to run nova services in GreenThreads.

The latter is unnecessary as all our functional tests use the nova.test
module, so they automatically get monkey patched by that. So this patch
removes the monkey patching from nova.tests.functional. I don't see any
test runtime increase locally after the change, so I don't think the
tests start to run more serially due to some missed monkey patch.

Change-Id: I4731dab89e2c1f1707d322c575ab0780bff80535
2025-04-25 14:37:46 +02:00
Sean Mooney 659710a626 split monkey_patching form import
This change separates the eventlet monkey patching
from importing the module and adds a module level
constant to track if we have already monkey patched.

Change-Id: Ic4ab0ba7a8320a008d6e246641446446dcc9ccc0
2025-04-25 14:37:46 +02:00
Sean Mooney 02d72b9d56 Remove nova debugger functionality
The nova debugger functionality was intended
to help debugging running processes, however it has
never been reliable due to our use of eventlet and is generally
not required when not using eventlet. I.e. you can just
run the nova console-scripts from a debugger or add pdb
statements as required.

As part of the eventlet removal the debugger functionality is
removed given it's untested and undocumented.

Change-Id: I7bf88f06f3d1dbd2c7e342b27a21440a123c631d
2025-04-25 14:37:44 +02:00
Zuul 12b32198ca Merge "Remove WSGIServer related config options" 2025-04-25 11:14:19 +00:00
Zuul f1e843af0d Merge "[doc]Describe file based GMR triggering" 2025-04-25 10:57:52 +00:00
Zuul 2fa61a0ad2 Merge "[doc]Remove eventlet based API endpoints" 2025-04-25 10:57:41 +00:00
Zuul c5eac85b05 Merge "Support glance's new location API" 2025-04-24 19:56:33 +00:00
Kamil Sambor 3946a94538 Replace eventlet sleep with time.sleep
As part of the Eventlet removal, this patch replaces eventlet.sleep
with the equivalent time.sleep, which should work the same
with Python threads.

Change-Id: I31b1aa854d8c95e47ba476051a650937b739a52b
2025-04-24 14:09:19 +02:00
Zuul 2762a73c5b Merge "Use dict object for request_specs_dict in the _list_view" 2025-04-23 23:16:46 +00:00
Zuul 6dcc4cc279 Merge "Functional tests for one-time-use devices" 2025-04-23 16:01:07 +00:00
Zuul 09e8f6e47a Merge "Fix description of [pci] alias" 2025-04-22 22:28:43 +00:00
Zuul 6ee2ce48b8 Merge "Remove WSGIService and WSGIServer classes" 2025-04-22 17:38:04 +00:00
Zuul cd3b5371a6 Merge "Remove eventlet based WSGI server entry points" 2025-04-22 17:22:17 +00:00
Dan Smith eab0de2900 Support glance's new location API
This makes us use the new method if available, and if not, fall back
to the old method.

Change-Id: If52ac05a02b69476bd2cfa74a7ee800c3f6eeb20
2025-04-21 07:09:58 -07:00
Balazs Gibizer c12eebd4c6 Remove WSGIServer related config options
As [1] removed the possibility to use the Eventlet based API servers
this patch can clean up the configuration options from the [wsgi]
section that are only used by that code path.

The remaining two options [wsgi]api_paste_config and
[wsgi]secure_proxy_ssl_header are still in use by the WSGI application
code path.

[1] I79b725f3b3569e9c1460a93ac40ca92269e7d003

Change-Id: Ia113daabab399e8db8edb1a2402ccae6fca351d5
2025-04-17 16:45:06 +02:00
Balazs Gibizer 05bab98aba [doc]Describe file based GMR triggering
We learned during recent installer development that triggering
GMR with apache/mod_wsgi API services via signals is hard for
multiple reasons. We ended up using file based triggers instead of
signals. This patch documents this approach.
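
To illustrate the general idea of a file based trigger (not the actual
GMR implementation), a process can poll the modification time of an
agreed file and act when it changes. The helper name and the file path
are hypothetical.

```python
import os


def file_was_touched(path, last_mtime_ns):
    """Check whether the trigger file changed since the last poll.

    Returns (triggered, mtime_ns). A missing file never triggers and
    leaves the remembered timestamp unchanged.
    """
    try:
        mtime_ns = os.stat(path).st_mtime_ns
    except FileNotFoundError:
        return False, last_mtime_ns
    return mtime_ns != last_mtime_ns, mtime_ns
```

An operator would then touch the file to request a report, which needs
no signal delivery to the worker process at all.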

Change-Id: I1fdbe6314ce4a1b173d01d3ebd9db07a0beb25a2
2025-04-17 16:27:46 +02:00
Balazs Gibizer e25418c857 [doc]Remove eventlet based API endpoints
The previous patch[1] removed the entry points. Removing all the
references to them from the docs needs a sizable amount of doc change,
so this separate patch is created to do so.

[1] Ie758550c0b8fb02aeb398396961467d9f845fcc9

Change-Id: Ibe8e45e86912e747f07e5fabd5b1204341c1e606
2025-04-17 16:24:16 +02:00
Balazs Gibizer 51eb60063f Remove WSGIService and WSGIServer classes
The previous patch[1] removed the Eventlet based WSGI entry points, and
that code was the only real user of the in tree WSGIService and
WSGIServer classes, so we can remove those too. This removes a good chunk
of eventlet dependency from our tree.

There is a catch though. The functional test env used these to start the
nova-metadata-api service. We re-implemented the fixture to load the
wsgi app and use wsgi intercept instead. This also showed that while
the Eventlet based API service could be reset via the oslo.service
interface, the wsgi app based API service cannot. So the related cell
cache reset testing is removed.

[1] Ie758550c0b8fb02aeb398396961467d9f845fcc9

Change-Id: I79b725f3b3569e9c1460a93ac40ca92269e7d003
2025-04-17 15:29:59 +02:00
Zuul 33f859cab7 Merge "doc: Remove non-existent [service_user] auth_strategy" 2025-04-17 11:49:22 +00:00
Zuul 16a5923a55 Merge "doc: Drop deprecated [api] auth_strategy" 2025-04-17 11:49:14 +00:00
Balazs Gibizer 05b219746f Remove eventlet based WSGI server entry points
Nova deprecated[1] running the API services under Eventlet in the Rocky
release 6 years ago. Now that we are trying to transition away from
Eventlet it is time to rip out these entry points fully.

[1] b53d81b03c

Change-Id: Ie758550c0b8fb02aeb398396961467d9f845fcc9
2025-04-15 15:03:43 +02:00
Sean Mooney 691d47e936 Remove workaround for ovn live migration
This change removes the concept of plug time vs
bind time live migration events.

In past releases Id2d8d72d30075200d2b07b847c4e5568599b0d3b
and I51673e58fc8d5f051df911630f6d7a928d123a5b
added workarounds to nova to enable live migration with
the ovn backend. Over the past 5 years a lot of work has
been done in ovn and neutron to support multiple port
bindings and propagate that information to the ovn
db. As a result the workarounds in nova are no longer
required.

Related-Bug: #2073254
Change-Id: Ic3e9c93681d11d5ab988d6990e9b8d480da887d4
2025-04-14 15:09:47 +01:00
Zuul 1ad11b1388 Merge "Remove tags from README" 2025-04-10 13:44:07 +00:00
Zuul 6e37eeaeb3 Merge "Add one-time-use devices docs and reno" 2025-04-08 01:50:31 +00:00
Zuul 69b148e0c5 Merge "Support "one-time-use" PCI devices" 2025-04-08 00:21:25 +00:00
Zuul ba44202dcb Merge "Invalidate PCI-in-placement cached RPs during claim" 2025-04-07 19:58:10 +00:00
Zuul 46736446ab Merge "Extend invalidate_rp to only invalidate cache" 2025-04-07 17:22:59 +00:00
Masahito Muroi 509820f156 Use dict object for request_specs_dict in the _list_view
The request_specs_dict in the _list_view is initialized as a
defaultdict object in order to return an empty string as the default.
But the request_spec_dict was replaced with a normal dict object in
the v2.96 microversion, so if the server list and the RequestSpec list
mismatch for any reason, the List Server API and the List
Server Detail API hit a 500 Internal Server Error because of a KeyError.

This commit updates the req_spec_dict to use a normal dict object and
return a sentinel object if there is no appropriate
request_spec object.
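
The failure mode and the fix can be illustrated with plain Python. The
variable names, the helper, and the sentinel are hypothetical and are
not the names used in the nova view code.

```python
from collections import defaultdict

# Sentinel meaning "no request spec for this server".
MISSING = object()

# A defaultdict(str) silently yields "" for unknown keys.
legacy_specs = defaultdict(str, {"server-1": "spec-1"})
legacy_value = legacy_specs["server-2"]

# A plain dict raises KeyError on the same lookup.
plain_specs = {"server-1": "spec-1"}


def lookup(specs, server_uuid):
    """Return the spec for the server, or MISSING if there is none."""
    return specs.get(server_uuid, MISSING)
```

Callers can then test "is MISSING" explicitly instead of crashing on a
server that has no matching request spec.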

Closes-Bug: #2095364
Change-Id: If282b8709954f276cb5d48114437809d771a9958
2025-04-04 17:06:25 +09:00
Dan Smith ee67362728 Functional tests for one-time-use devices
Related to blueprint one-time-use-devices

Change-Id: I6f764666a44c74c5ac97dced568fc685dee013b6
2025-04-02 11:53:54 -07:00
Dan Smith 3dc42b8422 Add one-time-use devices docs and reno
This adds documentation to the PCI-passthrough doc in the admin guide,
explaining how to use one-time-use devices.

Keeping this separate so we can iterate on it separately from the code.

Related to blueprint one-time-use-devices

Change-Id: Iff91c0726bbb37c7a3ef885a73e3c3586feb6004
2025-04-02 11:53:54 -07:00
Dan Smith 28a266461a Support "one-time-use" PCI devices
This adds support for devices that will be allocated to an instance
once and left in a reserved=total state. An external workflow can
put them back into allocatable state by dropping reserved back to
zero. Note this requires PCI-in-placement tracking for the affected
devices and it is only valid for type-PCI and type-PF devices.
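
A simplified sketch of the reserved=total life cycle, using a plain dict
as a stand-in for a placement inventory record. The helper names are
hypothetical, allocations and allocation ratios are ignored, and this is
not nova code.

```python
def allocatable(inventory):
    """Units that can still be handed out: total minus reserved."""
    return inventory["total"] - inventory["reserved"]


def mark_used(inventory):
    """After the single allowed allocation, reserve the whole device."""
    inventory["reserved"] = inventory["total"]


def release(inventory):
    """External workflow step: drop reserved back to zero."""
    inventory["reserved"] = 0


device = {"total": 1, "reserved": 0}
before = allocatable(device)
mark_used(device)
after_use = allocatable(device)
release(device)
after_release = allocatable(device)
```

Here the device is allocatable before use, not allocatable once it has
been marked used, and allocatable again only after the external release.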

Related to blueprint one-time-use-devices

Depends-On: https://review.opendev.org/c/openstack/requirements/+/946181
Co-Authored-By: Balazs Gibizer <gibi@redhat.com>
Change-Id: Idfe8a746a97d68cd4eae30afb7d22f4e3af80327
2025-04-02 11:53:36 -07:00
Zuul 9d910ec4bf Merge "Imported Translations from Zanata" 2025-04-02 12:57:38 +00:00
Zuul adfd486810 Merge "ironic: fix logging of validation errors" 2025-04-02 01:11:09 +00:00
Dan Smith c5efabbd07 Invalidate PCI-in-placement cached RPs during claim
This makes us invalidate our cache of the PCI-in-placement resource
providers when we go to do instance_claim(). This is not technically
required right now, but is set up for the next patch where we will
update that inventory during claim and we need to make sure we are
working with the latest version. Without this, we may consider a
cached version of the inventory to be the same as the proposed one,
and thus not actually update placement when we need to. Since PCI-in-
placement was designed to tolerate external changes to the inventory
(especially/explicitly changing the reserved count), we need to be
careful not to allow our cache to prevent us from taking the action
we intend.
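
The stale cache problem can be reproduced with a toy model. The classes
below are hypothetical and far simpler than the real report client: the
client skips the write when the proposed inventory equals its cached
copy, which is wrong once something external has changed placement.

```python
class FakePlacement:
    """Toy stand-in for the placement service."""

    def __init__(self):
        self.inventory = {"total": 1, "reserved": 0}


class CachingClient:
    """Toy client that avoids writes it believes are no-ops."""

    def __init__(self, placement):
        self.placement = placement
        self.cache = dict(placement.inventory)

    def invalidate_cache(self):
        self.cache = None

    def set_inventory(self, proposed):
        """Write proposed inventory unless the cache says it is unchanged."""
        if self.cache == proposed:
            return False  # skipped: cache claims nothing changed
        self.placement.inventory = dict(proposed)
        self.cache = dict(proposed)
        return True


placement = FakePlacement()
client = CachingClient(placement)
placement.inventory["reserved"] = 1  # external change, cache is now stale

wanted = {"total": 1, "reserved": 0}
skipped_write = client.set_inventory(wanted)  # stale cache hides the change
client.invalidate_cache()
real_write = client.set_inventory(wanted)  # now the update goes through
```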

Related to blueprint one-time-use-devices

Change-Id: I89039328af7a2d2e6a4128dd08dbe8e97ecb16cd
2025-04-01 07:42:33 -07:00
Dan Smith ba00d60b95 Extend invalidate_rp to only invalidate cache
This gives invalidate_resource_provider() a cacheonly flag that, for
efficiency, only invalidates our cache and does not remove the provider
from the tree.
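
A toy illustration of what such a flag could look like. The class, its
attributes, and the method body are hypothetical and much simpler than
the real report client; only the cacheonly flag name comes from the
text above.

```python
class ProviderStore:
    """Toy holder of a provider tree plus a per-provider cache."""

    def __init__(self):
        self.tree = set()
        self.cached_inventory = {}

    def invalidate(self, uuid, cacheonly=False):
        """Drop cached data; also drop the provider unless cacheonly."""
        self.cached_inventory.pop(uuid, None)
        if not cacheonly:
            self.tree.discard(uuid)


store = ProviderStore()
store.tree.add("rp-1")
store.cached_inventory["rp-1"] = {"total": 1, "reserved": 0}
store.invalidate("rp-1", cacheonly=True)
```

After the cacheonly call the cached inventory is gone but the provider
stays in the tree, so the tree does not have to be rebuilt.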

Related to blueprint one-time-use-devices

Change-Id: I04dd5e984c5671d866804c258422e4230fce37b7
2025-03-27 08:54:11 -07:00
René Ribaud 9947dac7ae FUP improve and add integration tests for PCI SR-IOV servers
This patch improves the test definitions and configurations. It also
adds mix-mode, flavor-based, and port-based integration tests.

The goal of this patch series is to enable VFIO device migration with
kernel variant drivers.

Implements: blueprint migrate-vfio-devices-using-kernel-variant-drivers
Change-Id: I39f5f55bed5ddd940947b9a1e67086e85a9fe074
2025-03-27 15:23:55 +01:00
René Ribaud 98226b60f3 FUP: Improve libvirt fixture for hostdevs
This patch enhances the libvirt fixture to better align with the real
libvirt output when handling hostdevs.

It adds the alias tag, which libvirt provides to specify the hostdev
name, and the address tag, which indicates the address seen by
the guest.

These two fields will be used in a subsequent patch to improve the
comparison between source and destination XMLs during migration.

Example:

<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x82' slot='0x00' function='0x1'/>
  </source>
  <alias name='hostdev0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x05'
  function='0x0'/>
</hostdev>
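
As an illustration of how the two fields can be read back from such an
element, here is a small sketch using the standard library XML parser.
The helper name is hypothetical and this is not the comparison code from
the later patch.

```python
import xml.etree.ElementTree as ET

HOSTDEV_XML = """
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x82' slot='0x00' function='0x1'/>
  </source>
  <alias name='hostdev0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x05'
  function='0x0'/>
</hostdev>
"""


def describe_hostdev(xml_text):
    """Return (alias name, host address attrs, guest address attrs)."""
    hostdev = ET.fromstring(xml_text)
    alias = hostdev.find("alias").get("name")
    # The address nested under <source> is the device on the host.
    host_address = dict(hostdev.find("source/address").attrib)
    # The address directly under <hostdev> is what the guest sees.
    guest_address = dict(hostdev.find("address").attrib)
    return alias, host_address, guest_address


alias, host_address, guest_address = describe_hostdev(HOSTDEV_XML)
```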

The goal of this patch series is to enable VFIO device migration with
kernel variant drivers.

Implements: blueprint migrate-vfio-devices-using-kernel-variant-drivers
Change-Id: I3ee3923f990dd6522a11849551a9d49c9fad426c
2025-03-26 10:04:37 +01:00
René Ribaud c6a96a17db FUP Update pci-passthrough and virtual-gpu documentation
This patch adds the necessary documentation identified in:

- pci-passthrough: Explaining live migration and known issues.
- virtual-gpu: Updating the caveats section to clarify what to do
  when VF devices are available instead of `mdev`.

The goal of this patch series is to enable VFIO device migration with
kernel variant drivers.

Implements: blueprint migrate-vfio-devices-using-kernel-variant-drivers
Change-Id: I41271a8af5687fb1d18f9d0852492756e096720d
2025-03-26 10:02:41 +01:00
René Ribaud 28f82ba912 FUP Add a warning to make non-explicit live migration request debugging easier
Today, when a user does not request live-migratable devices, the
migration should fail.
However, this failure is hard to detect because the end result is a
NoValidHost error when Nova exhausts its reschedule attempts. As a
result, it is difficult to determine why scheduling failed.

This patch adds a warning to aid in debugging and identifying the
root cause more easily.

The goal of this patch series is to enable VFIO device migration with
kernel variant drivers.

Implements: blueprint migrate-vfio-devices-using-kernel-variant-drivers
Change-Id: I64448f30e5d692396c129d9239679e74051cde7f
2025-03-26 10:02:41 +01:00