61992 Commits

Author SHA1 Message Date
Nicolai Ruckel 35b1945522 Preserve UEFI NVRAM variable store
Preserve NVRAM variable store during stop/start, hard reboot, live
migration, and volume retype.

This does not affect cold migration or shelve.

Without this change, for UEFI guests (hw_firmware_type=uefi), every time
the instance is started, the UEFI variable storage for that instance
(/var/lib/libvirt/qemu/nvram/instance-xxxxxxxx_VARS.fd) is deleted
and reinitialized from the default template.

The changes are based on this patch by Jonas Schäfer to preserve the
vTPM state:
https://review.opendev.org/c/openstack/nova/+/955657

Closes-Bug: #1633447
Closes-Bug: #2131730
Change-Id: I444a9285c07a04bf08a73772235f8dd73d75e513
Signed-off-by: Nicolai Ruckel <nicolai.ruckel@cloudandheat.com>
2026-02-13 23:55:41 +01:00
lajoskatona 873aee5e95 Fix for bug 2140537
If a guest has pinned CPUs, the domain XML's
<iothreadpin> element should also have the iothread attribute.
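
For illustration only, the following is a minimal sketch of the kind of XML the message describes, built with the Python standard library. The function name, its parameters, and the example CPU numbers are assumptions made for this sketch and are not taken from the nova code.

```python
import xml.etree.ElementTree as ET


def build_cputune(vcpu_pins, iothread_id, iothread_cpuset):
    """Return a <cputune> element with vCPU and I/O thread pinning.

    Hypothetical helper: names and arguments are illustrative only.
    """
    cputune = ET.Element("cputune")
    for vcpu, cpuset in sorted(vcpu_pins.items()):
        ET.SubElement(cputune, "vcpupin", vcpu=str(vcpu), cpuset=cpuset)
    # The iothread attribute names the I/O thread the cpuset applies to.
    ET.SubElement(
        cputune,
        "iothreadpin",
        iothread=str(iothread_id),
        cpuset=iothread_cpuset,
    )
    return cputune


# Example values (assumed): two pinned vCPUs and one I/O thread with id 1.
cputune = build_cputune({0: "4", 1: "5"}, iothread_id=1, iothread_cpuset="4-5")
xml_text = ET.tostring(cputune, encoding="unicode")
print(xml_text)
```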

Closes-Bug: #2140537
Change-Id: I5c2df747a3fdfbd2ee31d50a3d716a0ccc787e15
Signed-off-by: lajoskatona <lajos.katona@est.tech>
2026-02-13 17:17:17 +00:00
Stephen Finucane 6bc431bc52 tests: Invert validation check
Now that all of our controllers have full schema coverage, we can
assume that all controllers are validated and raise if that's not the
case.

Change-Id: I3a58be8551e7cf13835ad565aae4fc9dc4214bbd
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2026-02-13 16:51:40 +00:00
Stephen Finucane dab02447e6 api: Add response body schemas for server shares APIs
We had missed one.

Change-Id: Icc63959d73b1881b7db19b93cf8fb80dcb77cad8
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2026-02-13 16:51:40 +00:00
Stephen Finucane f80e4935e8 api: Add response body schemas for servers APIs (6/6)
The last one: delete. Very simple, as always.

Change-Id: I08a2dbcd86cf652e9cda193f64edfa655f986506
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2026-02-13 16:51:40 +00:00
Stephen Finucane 9fd431315c api: Add response body schemas for servers APIs (5/6)
The penultimate API: the update view. This is very similar to the
rebuild API so we are able to reuse much of that schema here.

We also move some code outside a try-except as the code in question
can't raise an InstanceNotFound exception.

Change-Id: I0e42de5074dcf699886b20dfd43306683e381ee2
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2026-02-13 16:51:34 +00:00
Stephen Finucane c9be8b9aba tests: Fix bound
Ensure we do not allow negative values except for -1 (unlimited).

Change-Id: I9a0184ed54054c6466833df24dfbe9ca7d1b454b
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2026-02-13 16:17:31 +00:00
Eigil Obrestad bfaec08220 Make nova recognize amx-capabilities
Expands the CPU_TRAITS_MAPPING table to let nova report if a compute-node
supports AMX. This enables nova to pick the correct cpu_model when a
SapphireRapids (or newer) cpu is wanted by the flavor.

Implements: blueprint add-amx-traits
Change-Id: Ieaa2e1be9d3d3ae945ce28d778edc9729d2db9ba
Signed-off-by: Eigil Obrestad <eigil-git@obrestad.org>
Depends-On: https://review.opendev.org/c/openstack/requirements/+/976640
2026-02-12 20:51:02 +01:00
Zuul 4fec7fe09d Merge "Revert "Set openstacksdk-functional-devstack non voting"" 2026-02-11 20:09:06 +00:00
Zuul 420b02d6be Merge "Add regression test to reproduce bug 2140537" 2026-02-11 11:35:14 +00:00
lajoskatona 76d796193c Add regression test to reproduce bug 2140537
Related-Bug: #2140537
Change-Id: I8c7cf544d599d5a11a2ae898822c2bde36f1d52a
Signed-off-by: lajoskatona <lajos.katona@est.tech>
2026-02-09 13:11:53 +01:00
Balazs Gibizer 4227c9b14a Revert "Set openstacksdk-functional-devstack non voting"
This reverts commit 2f9f780a77.

Signed-off-by: Balazs Gibizer <gibi@redhat.com>
Change-Id: Ia3d01ba6da0ade10ad70de951cbcb72204fbce12
2026-02-09 10:19:56 +01:00
Balazs Gibizer 2f9f780a77 Set openstacksdk-functional-devstack non voting
There is a neutron issue in the job but its fix is being blocked by
multiple other issues in the sdk's gate. Let's keep our gate operational
until they fix the sdk gate.

[1] https://review.opendev.org/c/openstack/openstacksdk/+/976008
[2] https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/message/5HHEYPZA6VIORX2XLBZGNMM2EVX2LR65/

Signed-off-by: Balazs Gibizer <gibi@redhat.com>
Change-Id: Ie2fe2ec18a0fe7dbbfe4fbb9094d9542c729122a
2026-02-09 10:16:10 +01:00
Sean Mooney 264e868d49 Support os-vif TAP pre-creation for OVS/OVN ports
Add support for os-vif TAP device pre-creation when Neutron sets
the 'ovs_create_tap' flag in vif_details. This reduces live
migration downtime by ensuring the network is fully wired before
the VM starts.

Changes:
- Add VIF_DETAILS_OVS_CREATE_TAP constant to model.py
- Propagate create_tap from binding details to os-vif port profile
  in os_vif_util.py
- Set managed='no' in libvirt XML when create_tap is enabled so
  libvirt uses the pre-created TAP device
- Set multiqueue on port profile in _plug_os_vif based on instance
  flavor/image hw:vif_multiqueue_enabled property

When checking oslo.versionedobjects fields for backward compat:
- Use 'field in obj.fields' to check if field exists in schema
- Use 'field in obj' to check if field value is set
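
The difference between the two checks can be sketched as follows. This uses a small stand-in class rather than the real oslo.versionedobjects or os-vif classes; the class names, the field set, and the helper function are all assumptions made for illustration.

```python
class SketchPortProfile:
    """Stand-in for a versioned object; NOT the real oslo.versionedobjects API."""

    # Schema: every field this version of the object knows about.
    fields = {"datapath_type": str, "create_tap": bool}

    def __init__(self, **values):
        unknown = set(values) - set(self.fields)
        if unknown:
            raise AttributeError(f"unknown fields: {sorted(unknown)}")
        self._values = dict(values)

    def __contains__(self, name):
        # True only when the field has actually been given a value.
        return name in self._values


class OldSketchPortProfile(SketchPortProfile):
    """Stand-in for an older object version that lacks create_tap."""

    fields = {"datapath_type": str}


def create_tap_state(profile):
    """Classify a profile using the two checks from the list above."""
    if "create_tap" not in profile.fields:
        return "unsupported"  # the schema has no such field
    if "create_tap" not in profile:
        return "unset"  # the field exists but no value was set
    return "set"


print(create_tap_state(OldSketchPortProfile()))
print(create_tap_state(SketchPortProfile()))
print(create_tap_state(SketchPortProfile(create_tap=True)))
```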

Depends-On: https://review.opendev.org/c/openstack/os-vif/+/971231
Generated-By: Cursor claude-opus-4.5
Closes-Bug: #2069718
Change-Id:  I32343658b53e317696d1bd8b984793bfeeccd409
Signed-off-by: Sean Mooney <work@seanmooney.info>
2026-02-05 18:55:06 +00:00
Zuul a17b44f3eb Merge "Use an executor to delay STOPPED events" 2026-02-05 17:38:28 +00:00
Zuul 75aed9a19d Merge "Live migration with iothreads" 2026-02-05 10:56:23 +00:00
Zuul c94d2eaedb Merge "Enable mypy on nova/utils.py" 2026-02-05 03:43:47 +00:00
Zuul 6b0bb735a6 Merge "SubclassSignatureTestCase to use NoDBTestCase as base" 2026-02-05 03:14:21 +00:00
Zuul 6a6e05d4d3 Merge "Libvirt event handling without eventlet" 2026-02-05 03:14:05 +00:00
Artom Lifshitz 3eae9477d2 TPM: support live migration of host secret security
This enables live migration for TPM instances with the ``host`` secret
security mode. The ``host`` security mode uses key manager service
secrets owned by the instance owner. The secret is persisted in
Libvirt and is sent over RPC to the destination during a live
migration.

The service version will be bumped in a separate patch.

Related to blueprint vtpm-live-migration

Change-Id: I97e9dd454c793abcb1a20579b1ceaec627be4813
Signed-off-by: melanie witt <melwittt@gmail.com>
2026-02-04 16:52:06 -08:00
melanie witt 2bdf12535c TPM: prepare to bump service version for live migration
This prepares for a service version bump and adds a minimum service
version check in the API to reject live migration requests for vTPM
instances until the entire cloud is upgraded to the new version.

The actual service version bump will be included in a later patch that
implements vTPM live migration.

Related to blueprint vtpm-live-migration

Change-Id: I7daef8037385a4077dc0a78f03ae4b34a57560b7
Signed-off-by: melanie witt <melwittt@gmail.com>
2026-02-04 15:49:06 -08:00
Balazs Gibizer 8b14a16c57 Fix full executor warning on noname executor
The warning log assumed all executors have a name. Our centrally managed
executors do, but the ad hoc ones do not, causing a stack trace in the
compute manager power_sync periodics.

Change-Id: I04620364439a6c377f5b8f8f68cbdd3c62c44562
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
2026-02-04 12:15:19 +01:00
Balazs Gibizer 8017b721fd Cleanup libvirt driver at service stop
As the libvirt driver's Host object has a new headless thread, we need to
make sure that thread exits cleanly when nova-compute is being
stopped.

At the same time we make sure our unit tests are not leaking such
threads across test cases with a new fixture and fixes in the test code.

Change-Id: Ide274d6caa3314f9d25d51d1f72850cf77c9dee4
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
2026-02-04 12:15:19 +01:00
Balazs Gibizer 3216573655 Remove spawn_after
It was a naive implementation; it is replaced with
StaticallyDelayingCancellableTaskExecutorWrapper.

Signed-off-by: Balazs Gibizer <gibi@redhat.com>
Change-Id: I5e8d496473d4ec167d1655368a00cbfa78d2c074
2026-02-04 12:15:19 +01:00
Balazs Gibizer f16170695c Use an executor to delay STOPPED events
During the VM hard reboot there are 3 events coming from libvirt:
* STOPPED
* RESUMED
* STARTED

The libvirt driver implements automatic power sync of the VM based on
the STOPPED event. But it should not do a stop() compute api call if the
STOPPED event is followed right after by a STARTED event during hard
reboot. So the libvirt driver delays processing the STOPPED event by 15
seconds and cancels the event if another lifecycle event is received for
the same domain during that delay. In eventlet mode this is implemented
by scheduling a greenlet and cancelling it. With native threading we
cannot cancel a running task / thread so we need a bit smarter solution
than just adding a sleep to the event handler and putting it in a
threadpool.

So this patch introduces an Executor wrapper that allows delaying the
submission of a task into a real Executor by a predefined delay and checks
for cancellation before the real submission.

The wrapper uses a single thread and a queue of tasks. As the delay is
the same for every task, the order in which the tasks are executed is
the same as the order in which they were submitted to the wrapper. So the
thread can process the queue of tasks one by one: check the remaining
time until the deadline of the oldest task, wait for it, then submit the
task to the real executor, then take the next task from the queue.

Cancellation of a task is checked before any wait for a deadline and
before the submission to the real executor. So a task is never executed
if cancelled during its delay period.
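
The mechanism described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the wrapper added by this patch: the class names, method names, and the 0.2 second delay are hypothetical, and the sketch lets a cancellation end the wait early instead of sleeping until the deadline.

```python
import queue
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class DelayedTask:
    """Handle returned to the caller so the pending task can be cancelled."""

    def __init__(self, deadline, fn, args):
        self.deadline = deadline
        self.fn = fn
        self.args = args
        self._cancelled = threading.Event()

    def cancel(self):
        self._cancelled.set()

    def cancelled(self):
        return self._cancelled.is_set()

    def wait_for_cancel(self, timeout):
        # Returns early if the task is cancelled while waiting.
        return self._cancelled.wait(timeout)


class DelayingExecutorSketch:
    """Submit tasks to a real executor after a fixed delay (hypothetical)."""

    def __init__(self, executor, delay):
        self._executor = executor
        self._delay = delay
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, fn, *args):
        task = DelayedTask(time.monotonic() + self._delay, fn, args)
        self._queue.put(task)
        return task

    def shutdown(self):
        # The sentinel is queued behind all pending tasks, so they drain first.
        self._queue.put(None)
        self._thread.join()

    def _run(self):
        # Same delay for every task, so FIFO order is also deadline order.
        while True:
            task = self._queue.get()
            if task is None:
                return
            if task.cancelled():  # check before waiting for the deadline
                continue
            remaining = task.deadline - time.monotonic()
            if remaining > 0:
                task.wait_for_cancel(remaining)
            if task.cancelled():  # check again before the real submission
                continue
            self._executor.submit(task.fn, *task.args)


results = []
with ThreadPoolExecutor(max_workers=1) as real_executor:
    wrapper = DelayingExecutorSketch(real_executor, delay=0.2)
    first = wrapper.submit(results.append, "stopped-1")
    first.cancel()  # e.g. another lifecycle event arrived during the delay
    wrapper.submit(results.append, "stopped-2")
    wrapper.shutdown()
print(results)
```

In this sketch the first task is cancelled during its delay and never reaches the real executor, while the second task is submitted once its deadline passes.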

Change-Id: I8fb3bb1e5506f2792522bf822939e7e8ab68763d
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
2026-02-04 12:15:12 +01:00
Sean Mooney c8d34ed3dc Fix blockio generation for LUN volumes
QEMU's scsi-block device driver does not support physical_block_size
and logical_block_size properties. When Cinder reports disk geometry
for LUN volumes, Nova was incorrectly including a <blockio> element
in the libvirt XML, causing QEMU to fail with:

    Property 'scsi-block.physical_block_size' not found

This fix adds a check to skip blockio generation when source_device
is 'lun', following the existing pattern used for serial at line 1356.
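
The shape of such a check can be sketched as below. This is an illustration only: the function, its parameters, and the use of ElementTree are assumptions for this sketch and do not reproduce the nova config classes.

```python
import xml.etree.ElementTree as ET


def build_disk_element(source_device, logical_block_size=None,
                       physical_block_size=None):
    """Build a <disk> element, omitting <blockio> for LUN devices.

    Hypothetical helper: names and arguments are illustrative only.
    """
    disk = ET.Element("disk", type="block", device=source_device)
    has_geometry = logical_block_size or physical_block_size
    # Skip <blockio> for 'lun' so the block size properties are never
    # passed on for a device that does not support them.
    if source_device != "lun" and has_geometry:
        blockio = ET.SubElement(disk, "blockio")
        if logical_block_size:
            blockio.set("logical_block_size", str(logical_block_size))
        if physical_block_size:
            blockio.set("physical_block_size", str(physical_block_size))
    return disk


lun_disk = build_disk_element("lun", 512, 4096)
plain_disk = build_disk_element("disk", 512, 4096)
print(ET.tostring(lun_disk, encoding="unicode"))
print(ET.tostring(plain_disk, encoding="unicode"))
```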

Generated-By: claude-code (Claude Opus 4.5)
Closes-Bug: #2127196
Change-Id: Idf87e936edd97aac719222942c9842a9aca4c270
Signed-off-by: Sean Mooney <work@seanmooney.info>
2026-02-03 22:15:19 +00:00
Ghanshyam Maan 82fd8ffdce Add 2nd RPC server for compute service
For the compute service graceful shutdown, we need two RPC servers.
The 1st RPC server will be used for new requests and the 2nd for
completing in-progress tasks. The 2nd RPC server will use the same
transport bus and the same endpoint (compute manager instance) but
listen on a different topic than the 1st RPC server. With two different
topics, other services (API, conductor, or compute) can choose which
topic they send an RPC request for the compute service to. That will be
done via the RPC client sending the request to a specific topic.

This change stops both RPC servers, but later in this series we will
keep the 2nd RPC server active so that the compute service can receive
the communication from other services that in-progress tasks require.

The next change in this series will use this 2nd RPC server. The tasks
(compute RPC client methods) that need to use this 2nd RPC server
will be modified in the next change.

Partial implement blueprint nova-services-graceful-shutdown-part1

Change-Id: I26656869f00efe6d89d993000dcf2e91683a217e
Signed-off-by: Ghanshyam Maan <gmaan@ghanshyammann.com>
2026-01-30 11:43:26 -08:00
huanhongda 53a613d994 Live migration with iothreads
In commit 76d64b9cb4 we enable
one io-thread per qemu instance. Live migration should update this.

Related-Bug: #2139351
Change-Id: I1476de288490c88a60db697fbb45b4f783821c14
Signed-off-by: hongda.xun <hongda.xun@easystack.cn>
2026-01-30 17:38:00 +08:00
Sean Mooney ba24639b8d Add regression test to reproduce bug 2139351
This test reproduces the current bug where the iothread pinning
is not updated for NUMA instances on live migration and
enhances the libvirt fixture to make this possible.

We also provide a sanity check for non-NUMA instances to show the
vcpu cpuset is correct.

Related-Bug: #2139351
Assisted-By: claude-code opus 4.5
Change-Id: Ib2c0d1f826ad4f31e3e9b3f61f2c9b2111bf7edd
Signed-off-by: Sean Mooney <work@seanmooney.info>
2026-01-29 15:19:24 +00:00
Balazs Gibizer 9f74d1c5f2 Enable mypy on nova/utils.py
As a follow up for a review comment in [1] this patch enables mypy for
nova/utils, fixes the existing mypy findings, and adds some trivial type
annotations where they make sense.

[1] https://review.opendev.org/c/openstack/nova/+/956089/comment/caec94ed_4fdb16bf/

Change-Id: I29ca69bd1e583adc1b1f408bd45de183649986d2
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
2026-01-29 11:57:04 +01:00
Balazs Gibizer c89e54cedc SubclassSignatureTestCase to use NoDBTestCase as base
We have a list of fixtures included in the test.TestCase base class
that prevent global data and threads leaking across test cases within
the same process. The SubclassSignatureTestCase did not use our base
class but it initializes a partial libvirt driver class that will soon
use a ThreadPoolExecutor in native threading mode. So we need the leak
protection here as well. So this patch moves SubclassSignatureTestCase
to use the NoDBTestCase base class.

Change-Id: I05e818e8e83757185e5af78a5a4771c90d9fa217
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
2026-01-29 11:54:25 +01:00
Balazs Gibizer a89c1b44c5 Libvirt event handling without eventlet
Our libvirt interface is not eventlet aware and not pure python. So
eventlet monkey patching is not enough. So the libvirt driver
implemented a native polling thread for libvirt and the queue + pipe
mechanism to push events from the native polling thread to the main
thread with the eventlet event loop.

We don't need all of these complications in native thread mode. There we
only need a single thread that polls libvirt for the events. The received
events can be handled directly on the polling thread as that is no
different from any other thread in the system now.

To make the change more understandable the event handling logic is moved
behind an abstraction that is implemented twice, once for eventlet with
the existing implementation just moved around, and once for native
threading with the simplified handling.

Change-Id: If479574cd91975810098afa8e3c220c7316a9431
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
2026-01-29 11:54:25 +01:00
melanie witt 8b3701490e Add vtpm_secret_(uuid|value) to LibvirtLiveMigrateData
This is needed in order to pass TPM secret information to the
destination over RPC to support the 'host' secret security mode.

The fields are nullable so that secret security modes 'user' and
'deployment' may set them to None.

A setting of None lets the other security modes convey that they are
actively choosing not to pass any data in the vTPM fields. This is
important for interacting with older compute hosts in the middle of a
rolling upgrade. We do not want to backlevel new LibvirtLiveMigrateData
objects involving vTPM because older compute hosts cannot support vTPM
live migration in any capacity.

Related to blueprint vtpm-live-migration

Change-Id: If2ff2a7bb41dea6e0959c965477b79f3f7d633e7
Signed-off-by: melanie witt <melwittt@gmail.com>
2026-01-28 12:41:54 -08:00
Zuul 59a7093915 Merge "Use the correct name for the ironic check job" 2026-01-28 08:18:07 +00:00
Zuul 4112a4491c Merge "Preserve vTPM state between power off and power on" 2026-01-28 01:43:47 +00:00
Zuul ce286865f9 Merge "[hacking]Do not mock threading.Event" 2026-01-27 20:42:15 +00:00
Zuul 134d3ac476 Merge "api: Simplify servers views (3/3)" 2026-01-27 14:17:53 +00:00
Zuul d3143aeec7 Merge "api: Simplify servers views (2/3)" 2026-01-27 14:13:32 +00:00
Zuul 2032cb2828 Merge "api: Simplify servers views (1/3)" 2026-01-27 13:53:24 +00:00
Steve Baker 1637397253 Use the correct name for the ironic check job
The job name has been an alias for 6 years [1] and the accurate
preferred name ironic-tempest-bios-ipmi-direct has been in place for 8
months [2].

The intent of job names is to accurately describe the configuration of
the job, and the name
ironic-tempest-ipa-wholedisk-bios-agent_ipmitool-tinyipa is now
inaccurate - specifically the job no longer uses tinyipa.

[1] https://opendev.org/openstack/ironic/commit/53f751dcfd86594160dc9be92b616ef5d0d70623
[2] https://opendev.org/openstack/ironic/blame/branch/master/zuul.d/ironic-jobs.yaml#L1210-L1236

Change-Id: I768a6d3c7f9f550a692dd1f6e0435228076f118f
Signed-off-by: Steve Baker <sbaker@redhat.com>
2026-01-27 11:15:02 +13:00
Steve Baker 791310ae1e Add VNC console support for the Ironic driver
Ironic is adding support for VNC consoles tracked under the following
spec[1]. This change provides support for the Nova Ironic driver to
access the consoles created by this feature effort.

This supersedes an existing Nova spec[2] to add VNC console support to
the Ironic driver, so this change can be considered to implement this
spec also. This change can be merged independently of the Ironic work,
as the Ironic driver handles the VNC console not being available.

The prerequisites for a graphical console being available for an Ironic
driver node are:

- Ironic is configured to enable graphical consoles
- The node ``console_interface`` is a graphical driver such as
  ``redfish-graphical`` or ``fake-graphical``
- ``nova-novncproxy`` can make network connections to the VNC servers
  which run adjacent to ``ironic-conductor``

The associated depends-on change adds the novnc validation check to the
baremetal basic ops, which is run in the job
ironic-tempest-ipa-wholedisk-bios-agent_ipmitool-tinyipa.

In the support matrix console.vnc support is set to partial for ironic
due to the current lack of vencrypt support on the ironic side.

[1] https://specs.openstack.org/openstack/ironic-specs/specs/approved/graphical-console.html
[2] https://specs.openstack.org/openstack/nova-specs/specs/2023.1/approved/ironic-vnc-console.html

Related-Bug: 2086715
Implements: blueprint ironic-vnc-console
Change-Id: Iec26c67e29f91954eafc6a5a81086e36798d3f26
Signed-off-by: Steve Baker <sbaker@redhat.com>
2026-01-27 10:06:12 +13:00
Balazs Gibizer 19203d684d [hacking]Do not mock threading.Event
Such a mock is too wide and will cause issues with our basic libraries and
test infrastructure, leading to race conditions and threads leaked across
tests.

We needed to remove a bunch of such mocks found by the new rule. In some
cases we needed to make the mocking more specific for a given Event
instance, in other cases the mock was not needed at all and the test case
was still not taking excessive time.

Related-Bug: #2136815
Change-Id: I3ae3740eb07bade4e0883db3e02c0a81e92b9a36
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
2026-01-26 20:26:56 +01:00
Zuul d840c63a18 Merge "api: Add response body schemas for server metadata APIs" 2026-01-26 14:48:14 +00:00
Zuul eabb1d1260 Merge "api: Remove networks key from quota schemas" 2026-01-26 14:48:01 +00:00
Zuul e67372b33e Merge "api: Add response body schemas for server tags API" 2026-01-25 03:50:50 +00:00
Zuul d6d8f28640 Merge "api: Add response body schemas for server migrations API" 2026-01-25 03:50:32 +00:00
Zuul 92898e8f77 Merge "api: Add response body schemas for migrations API" 2026-01-24 08:29:20 +00:00
Zuul f33f8c6e25 Merge "api: Add response body schemas for quota sets API" 2026-01-24 08:29:06 +00:00
Zuul 99a2835bd2 Merge "api: Add response body schemas for quota class sets API" 2026-01-24 07:28:38 +00:00
Zuul 63c68c9542 Merge "TPM: support instances with deployment secret security" 2026-01-23 22:30:44 +00:00