Use the firmware auto-selection feature in libvirt to find the best
UEFI firmware file for the requested features.
Firmware files may be reselected when a libvirt domain is created from
scratch, but they are kept during hard reboot (and during live
migration, which preserves the loader/nvram elements filled in by
libvirt).
Closes-Bug: #2122296
Related-Bug: #2122288
Implements: blueprint libvirt-firmware-auto-selection
Change-Id: Ie48b020597a1a2fb3280815eec5ba3565e396f9b
Signed-off-by: Takashi Kajinami <kajinamit@oss.nttdata.com>
As per part 1 of the graceful shutdown timeouts[1], this commit
adds or modifies the following timeouts/waits needed for graceful
shutdown:
- Override the default of the graceful_shutdown_timeout to 180.
- Add a new config option for manager shutdown timeout.
It also adds a graceful_shutdown() method on the manager side, which
will be called by the nova/service.py->stop() method before it stops
the 2nd RPC server. In part1, this will wait for the configurable wait
time, but part2 will implement a better solution to track the
in-progress tasks. The idea is to have this single interface from the
service manager (graceful_shutdown()) that will be called during
graceful shutdown and is responsible for finishing the required tasks
and cleanup.
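A minimal sketch of the call order described above, not the actual nova
code. Only graceful_shutdown() and stop() are named in this message; the
class names, the shutdown_wait attribute and the recorded step names are
illustrative assumptions.

```python
import time


class Manager:
    """Hypothetical service manager exposing the single shutdown interface."""

    def __init__(self, shutdown_wait):
        # Stand-in for the new manager shutdown timeout option (name assumed).
        self.shutdown_wait = shutdown_wait

    def graceful_shutdown(self):
        # Part 1 behaviour as described: just wait for the configured time.
        time.sleep(self.shutdown_wait)


class Service:
    """Hypothetical service wrapper; records the order of shutdown steps."""

    def __init__(self, manager):
        self.manager = manager
        self.steps = []

    def stop(self):
        # The manager gets its chance to finish work first.
        self.manager.graceful_shutdown()
        self.steps.append("manager.graceful_shutdown")
        # Only afterwards is the second RPC server stopped (stubbed here).
        self.steps.append("second_rpc_server.stop")


service = Service(Manager(shutdown_wait=0.01))
service.stop()
```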
Partial implement blueprint nova-services-graceful-shutdown-part1
[1] https://specs.openstack.org/openstack/nova-specs/specs/2026.1/approved/nova-services-graceful-shutdown-part1.html#graceful-shutdown-timeouts
Change-Id: I7c1934d3ec7854feac3fc8432627c25eba963ddf
Signed-off-by: Ghanshyam Maan <gmaan.os14@gmail.com>
This test has failed a few times; the most recent failure is:
- https://review.opendev.org/c/openstack/nova/+/975586 (PS8 run)
Traceback (most recent call last):
  File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/unit/test_utils.py", line 2185, in test_submit_second_while_delaying_first
    self.assertGreater(task2_runtime, 2.0)
  File "/usr/lib/python3.12/unittest/case.py", line 1269, in assertGreater
    self.fail(self._formatMessage(msg, standardMsg))
  File "/usr/lib/python3.12/unittest/case.py", line 715, in fail
    raise self.failureException(msg)
AssertionError: 1.997275639999998 not greater than 2.0
From the error, it seems we capture the start time after we submit
the task to the executor, so the executor starts the task slightly
before the test records the task start time.
Capture the task start time before the task is submitted so that the
runtime comparison is accurate.
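The ordering can be illustrated with a small sketch that uses the
standard library executor rather than the nova test helpers. The
timed_submit() helper is hypothetical and only shows where the clock is
started relative to submit().

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_submit(executor, fn, *args):
    """Run fn on the executor and return (result, elapsed seconds)."""
    # Start the clock BEFORE submitting, so the measured runtime can
    # never be shorter than the time the task actually spent running.
    start = time.monotonic()
    future = executor.submit(fn, *args)
    result = future.result()
    return result, time.monotonic() - start


with ThreadPoolExecutor(max_workers=1) as pool:
    result, elapsed = timed_submit(pool, time.sleep, 0.05)
```

If the clock were started after submit(), the worker could begin the
task first and the measured runtime could come out slightly below the
task's real duration, which matches the failure above.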
Change-Id: I5a9845813b614c58e0f5a66e07f8a8c732f38eb3
Signed-off-by: Ghanshyam Maan <gmaan.os14@gmail.com>
Some firmware requires the SMM feature. While the feature doesn't have
to be explicitly enabled when auto-selection is enabled, it should be
enabled explicitly when firmware files are pre-defined.
Partially-Implements: blueprint libvirt-firmware-auto-selection
Change-Id: Ia194dcfacd2b743761e720d947a6807689a96da3
Signed-off-by: Takashi Kajinami <kajinamit@oss.nttdata.com>
... so that we can load these from existing guest XML.
This is preparatory work for using firmware auto-selection by libvirt,
and is required to avoid re-selection during hard reboot.
Partially-Implements: blueprint libvirt-firmware-auto-selection
Change-Id: I899cb7d6ee364def8d1298b77c24cc5156c71126
Signed-off-by: Takashi Kajinami <kajinamit@oss.nttdata.com>
Extend the existing (but unused) guest XML generation logic for
firmware detection by adding the firmware feature flags needed to
require secure boot support.
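As a rough illustration of the kind of XML involved, the sketch below
builds an <os> element that asks libvirt to auto-select UEFI firmware
with a secure-boot feature flag. It is not the nova config class; the
build_os_element() helper is hypothetical, and the element and attribute
names are assumed from libvirt's domain XML format.

```python
import xml.etree.ElementTree as ET


def build_os_element(secure_boot):
    """Build an <os> element requesting firmware auto-selection."""
    # firmware="efi" asks libvirt to pick a matching UEFI firmware itself.
    os_elem = ET.Element("os", firmware="efi")
    firmware = ET.SubElement(os_elem, "firmware")
    # The feature flag tells libvirt whether secure boot is required.
    ET.SubElement(
        firmware,
        "feature",
        enabled="yes" if secure_boot else "no",
        name="secure-boot",
    )
    return os_elem


xml_text = ET.tostring(build_os_element(True), encoding="unicode")
```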
Partially-Implements: blueprint libvirt-firmware-auto-selection
Change-Id: I907c9c88f370a52b54b98e1e1cbda6c21d2bff62
Signed-off-by: Takashi Kajinami <kajinamit@oss.nttdata.com>
Add a non-secure boot scenario and a stateless firmware scenario to
demonstrate what the guest XML contents look like when firmware files
are selected by libvirt.
Partially-Implements: blueprint libvirt-firmware-auto-selection
Change-Id: I88f0b81c8455630145efca8c6349fc00a0c29835
Signed-off-by: Takashi Kajinami <kajinamit@oss.nttdata.com>
Using a string value was deprecated in oslo.middleware 3.15.0[1], which
was released 9 years ago. The value of this option has been treated as
a list value since then.
[1] 7e519d008f7743d75ec299095060a70d5fd00f99
The latest oslo.middleware release removed the deprecated handling.
Change-Id: Ib88c046af14f5d5de0d410a35a702b7a2322c832
Signed-off-by: Takashi Kajinami <kajinamit@oss.nttdata.com>
This change improves FairLockGuard to properly support two previously
unsupported (or broken) usage patterns:
1. Cross-thread sharing: When threads share the same FairLockGuard
instance, they now correctly wait for each other instead of raising
TypeError. Fixed by:
- Adding _active_thread tracking to identify the owning thread
- Restructuring lock acquisition order: named locks are now acquired
OUTSIDE of locks_lock to prevent deadlock when Thread-B waits on
locks held by Thread-A
- Only same-thread re-entry triggers the nesting logic, not
cross-thread access
2. Same-thread nesting: The same FairLockGuard instance can now be
nested within itself. Fixed by:
- Adding _nesting_depth counter initialized to 0
- Nested entries increment depth and return early (locks held)
- Exits decrement depth; locks only released when depth reaches 0
- This prevents lock leaks that would occur if inner exit cleared
self.locks before outer exit could release them
Additional improvements:
- Exception handling during partial lock acquisition now properly
releases any locks acquired before the failure
- Lock release moved outside locks_lock in __exit__ for consistency
The docstring has been updated to reflect that both patterns now work,
while continuing to discourage them in favor of creating separate
FairLockGuard instances for clarity.
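The depth-counter idea can be sketched in isolation as follows. This is
a simplified, hypothetical NestingGuard built on plain threading locks,
not the FairLockGuard implementation; it only shows owner tracking,
acquisition outside the state lock, and release at depth 0.

```python
import threading


class NestingGuard:
    """Hypothetical guard illustrating the nesting-depth idea only."""

    def __init__(self, locks):
        self._locks = list(locks)
        self._state_lock = threading.Lock()
        self._active_thread = None
        self._nesting_depth = 0

    def __enter__(self):
        me = threading.get_ident()
        with self._state_lock:
            if self._active_thread == me:
                # Same-thread re-entry: locks are already held.
                self._nesting_depth += 1
                return self
        # Acquire outside the state lock so that another thread waiting
        # here does not block the owner from updating the state.
        acquired = []
        try:
            for lock in self._locks:
                lock.acquire()
                acquired.append(lock)
        except BaseException:
            # Release whatever was acquired before the failure.
            for lock in reversed(acquired):
                lock.release()
            raise
        with self._state_lock:
            self._active_thread = me
            self._nesting_depth = 1
        return self

    def __exit__(self, exc_type, exc, tb):
        with self._state_lock:
            self._nesting_depth -= 1
            if self._nesting_depth > 0:
                # An outer context in this thread still needs the locks.
                return False
            self._active_thread = None
        # Release outside the state lock, mirroring acquisition.
        for lock in reversed(self._locks):
            lock.release()
        return False
```

Entering the same instance twice from one thread only bumps the depth,
and the underlying locks are released when the outermost context exits.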
New tests added:
- test_deep_nesting: Verifies 3+ levels of nesting
- test_nested_exception_outer_still_holds_locks: Verifies outer context
retains locks when inner context raises an exception
- test_empty_lock_list: Verifies empty lock list edge case
Related-Bug: #2048837
Generated-By: claude-code opus 4.5
Change-Id: Ia937b0e2d76c814360f168d5f33b821bfc61aade
Signed-off-by: Sean Mooney <work@seanmooney.info>
On Debian 13 (Trixie), libvirt packaging is modularized and
the libvirt-daemon-lock package (providing virtlockd) is
optional. The evacuate hook previously assumed all libvirt
services were installed and failed when stopping/starting
missing units.
Extract a reusable manage_libvirt_service.yaml task file that
checks if a service exists via systemctl list-unit-files
before managing its units. This prevents failures when
optional libvirt packages are not installed and future-proofs
against further packaging changes.
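The change itself is an Ansible task file; the same existence check is
sketched here in Python purely for illustration. The helper names are
hypothetical, and the parser assumes that each line of
`systemctl list-unit-files --no-legend` output starts with the unit
file name.

```python
import subprocess


def parse_unit_names(list_unit_files_output):
    """Return the unit names found in list-unit-files output."""
    names = set()
    for line in list_unit_files_output.splitlines():
        fields = line.split()
        if fields:
            # Assumption: the first column is the unit file name.
            names.add(fields[0])
    return names


def service_is_installed(unit):
    """Check whether systemd knows a unit file with this exact name."""
    proc = subprocess.run(
        ["systemctl", "list-unit-files", "--no-legend", "--no-pager", unit],
        capture_output=True,
        text=True,
        check=False,
    )
    return unit in parse_unit_names(proc.stdout)
```

A caller would stop or start a unit such as virtlockd.service only when
service_is_installed() returns True, and skip it otherwise.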
Generated-By: claude-code
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Change-Id: Ie84e2e8ab2d3065b1562ee5e256fa163541955f7
Signed-off-by: Sean Mooney <work@seanmooney.info>
This fixes an instance of us passing a disk image to qemu-img for
resize where we don't constrain the format. As has previously been
identified, it is never safe to do that when the image itself is not
trusted. In this case, an instance with a previously-raw disk image
being used by imagebackend.Flat is susceptible to the user writing a
qcow2 (or other) header to their disk causing the unconstrained
qemu-img resize operation to interpret it as a qcow2 file.
Since Flat maintains the intended disk format in the disk.info file,
and since we would have safety-checked images we got from glance,
we should be able to trust the image.format specifier, which comes
from driver_format in imagebackend, which is read from disk.info.
Since only raw or qcow2 files should be resized anyway, we can further
constrain it to those.
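A minimal sketch of the constraint, assuming the format has already
been read from disk.info. The build_resize_command() helper and the
ALLOWED_RESIZE_FORMATS name are hypothetical and do not reflect the
actual nova code path; `-f` is the qemu-img option that pins the image
format instead of letting it be probed.

```python
ALLOWED_RESIZE_FORMATS = ("raw", "qcow2")


def build_resize_command(path, fmt, size_bytes):
    """Build a qemu-img resize command with an explicit, allowed format."""
    if fmt not in ALLOWED_RESIZE_FORMATS:
        # Anything else (including a missing format) is refused rather
        # than left for qemu-img to probe from untrusted image content.
        raise ValueError(f"refusing to resize image with format {fmt!r}")
    return ["qemu-img", "resize", "-f", fmt, path, str(size_bytes)]
```

With the format pinned to raw, a qcow2 header written by the guest is
treated as ordinary disk data rather than as image metadata.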
Notes:
1. qemu-img refuses to resize some types of VMDK files, but it may
be able to resize others (there are many subformats). Technically,
Flat will allow running an instance directly from a VMDK file,
and so this change _could_ be limiting existing "unintentionally
works" behavior.
2. This assumes that disk.info is correct, present, etc. The code to
handle disk.info will regenerate the file if it's missing or
unreadable by probing the image without a safety check, which
would be unsafe. However, that is a much more sophisticated attack,
requiring either access to the system to delete the file or an
errant operator action in the first place.
Change-Id: I07cbe90b7a7a0a416ef13fbc3a1b7e2272c90951
Closes-Bug: #2137507
Signed-off-by: Dan Smith <dansmith@redhat.com>
Preserve NVRAM variable store during stop/start, hard reboot, live
migration, and volume retype.
This does not affect cold migration or shelve.
Before this change, for UEFI guests (hw_firmware_type=uefi), every time
the instance was started, the UEFI variable storage for that instance
(/var/lib/libvirt/qemu/nvram/instance-xxxxxxxx_VARS.fd) was deleted
and reinitialized from the default template.
The changes are based on this patch by Jonas Schäfer to preserve the
vTPM state:
https://review.opendev.org/c/openstack/nova/+/955657
Closes-Bug: #1633447
Closes-Bug: #2131730
Change-Id: I444a9285c07a04bf08a73772235f8dd73d75e513
Signed-off-by: Nicolai Ruckel <nicolai.ruckel@cloudandheat.com>
If a guest has pinned CPUs, the <iothreadpin> element in the domain
XML should also have the iothread attribute.
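For illustration, an element of the expected shape can be built as
below. The build_iothreadpin() helper is hypothetical and is not the
nova config class; the attribute names are assumed from libvirt's
domain XML format.

```python
import xml.etree.ElementTree as ET


def build_iothreadpin(iothread_id, cpuset):
    """Build an <iothreadpin> element carrying both required attributes."""
    # iothread identifies which IOThread is pinned; cpuset says where.
    return ET.Element(
        "iothreadpin", iothread=str(iothread_id), cpuset=cpuset
    )


pin = build_iothreadpin(1, "4-5")
```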
Closes-Bug: #2140537
Change-Id: I5c2df747a3fdfbd2ee31d50a3d716a0ccc787e15
Signed-off-by: lajoskatona <lajos.katona@est.tech>
Now that all of our controllers have full schema coverage, we can
assume that all controllers are validated and raise if that's not the
case.
Change-Id: I3a58be8551e7cf13835ad565aae4fc9dc4214bbd
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>