QEMU's scsi-block device driver does not support physical_block_size
and logical_block_size properties. When Cinder reports disk geometry
for LUN volumes, Nova was incorrectly including a <blockio> element
in the libvirt XML, causing QEMU to fail with:
  Property 'scsi-block.physical_block_size' not found
This fix adds a check to skip blockio generation when source_device
is 'lun', following the existing pattern used for serial at line 1356.
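A minimal sketch of the guard, with simplified names rather than the
actual nova.virt.libvirt code:

    def _set_blockio(conf, disk_info, block_size_info):
        # scsi-block (used when source_device is 'lun') rejects the
        # block-size properties, so only emit <blockio> for emulated
        # disks.
        if disk_info.get('source_device') == 'lun':
            return
        conf.logical_block_size = block_size_info.get('logical_block_size')
        conf.physical_block_size = block_size_info.get('physical_block_size')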
Generated-By: claude-code (Claude Opus 4.5)
Closes-Bug: #2127196
Change-Id: Idf87e936edd97aac719222942c9842a9aca4c270
Signed-off-by: Sean Mooney <work@seanmooney.info>
This changes the thread pool usage of the ComputeManager to go through
the concurrency-mode-aware util functions.
The concurrent live migration pool had a seemingly unlimited option
when configured with the value 0, but in reality GreenThreadPool has a
default worker size of 1000. In practice it is almost never right to
have more than one live migration running concurrently, and with
native threading having 1000 workers is just too costly. So we
decided to deprecate the value 0 and changed the implementation of
unlimited to mean 5 threads in native threading mode. We kept the
1000-greenthread pool in eventlet mode for backward compatibility.
The _sync_power_states periodic task also spawns tasks for each
instance to be synced. As these tasks and the caller share a data
structure, a lock is needed to avoid race conditions.
Also, the default pool size for these tasks is 1000 in our
configuration, which would use a lot of memory on a busy host in
native threading mode, so we changed the default value from 1000 to 5.
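A rough sketch of the resulting pattern, with illustrative names:

    import threading
    from concurrent import futures

    lock = threading.Lock()   # guards the shared bookkeeping dict
    shared_state = {}         # read by the caller, written by the tasks

    def sync_one(uuid, power_state):
        with lock:            # avoid races on the shared dict
            shared_state[uuid] = power_state

    # pool size lowered from 1000 to 5 to bound memory use with
    # native threads
    pool = futures.ThreadPoolExecutor(max_workers=5)
    for uuid, state in [('uuid-1', 'running'), ('uuid-2', 'shutdown')]:
        pool.submit(sync_one, uuid, state)
    pool.shutdown(wait=True)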
Change-Id: I9567d5fabdf086b5d0493103d9f6bde4f66af387
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
This is a follow-up for the release notes added in commit
35207ee8b5, which changed the default mode
for the scheduler and the API services. At that time we failed to note
the upgrade impact of that change, so this patch extends the reno with
an upgrade note.
Change-Id: I280e7eb9c1da6eeaf50e96e8b19e296961f2651a
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
The fault field is empty in the response of API GET /servers/detail
even when the instance (and hence its instance_faults DB entry) is in
a nova cell DB. In contrast, for API GET /servers/:id the fault is
retrieved correctly no matter which nova cell the instance belongs to.
Closes-Bug: #1856329
Change-Id: I1726f53cfeac0a67a5dacdddda2af2cc1db0af0f
Signed-off-by: Marius Leustean <marius.leustean@sap.com>
Make sure that a consistent program name is always set, so that
the same config sub-directory (/etc/{project}/{prog}.conf.d) is used
regardless of the way the API service is run.
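For illustration, oslo.config derives the {prog}.conf.d directory from
the prog argument, so a hedged sketch of the idea is:

    from oslo_config import cfg

    # Passing an explicit prog makes oslo.config consult
    # /etc/nova/nova-api.conf.d no matter how the WSGI app was
    # launched; otherwise prog is derived from sys.argv[0].
    cfg.CONF(args=[], project='nova', prog='nova-api')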
Closes-Bug: #2098514
Change-Id: Ib5c6d431176b83eefafddc1b35589015db6dfd04
Signed-off-by: Takashi Kajinami <kajinamit@oss.nttdata.com>
Ignore (1) stateless-mode firmware and (2) memory-device firmware,
which do not include a few core keys such as nvram-template. This is
a temporary (and backportable) workaround until firmware detection
using libvirt's internal feature is implemented by [1]
[1] https://blueprints.launchpad.net/nova/+spec/libvirt-firmware-auto-selection
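A hedged sketch of the filter over the firmware descriptor JSON; key
names other than nvram-template are assumptions:

    def usable_firmware(descriptors):
        for desc in descriptors:
            mapping = desc.get('mapping', {})
            if mapping.get('device') != 'flash':
                continue  # (2) skip memory-device firmware
            if 'nvram-template' not in mapping:
                continue  # (1) skip stateless-mode firmware
            yield desc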
Closes-Bug: #2122288
Change-Id: I99bc36fdd5df816c9ae374db71e4734fb7fc467b
Signed-off-by: Takashi Kajinami <kajinamit@oss.nttdata.com>
States were added to the Ironic API to enable the node servicing
feature, which can be performed on nodes provisioned with Nova
instances. Currently, if asked to delete these instances, Nova will
only remove the instance metadata and not tear them down.
This change has two parts:
- I have added the new, relevant states to _UNPROVISION_STATES in
driver.py, which now allows Nova to know that SERVIC* states and
DEPLOYHOLD are safe to unprovision from.
- I have added all existing Ironic states to ironic_states.py and to
the PROVISION_STATE_LIST constant, and check the state against it -- in
a case where a completely unknown state is returned, we should attempt
an unprovision.
This fix needs to be backported as far as possible, as this bug has
existed since Antelope / 2023.1 (DEPLOYHOLD) or Bobcat / 2023.2
(SERVIC*).
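A hedged sketch of the resulting check; the state tuples are
abbreviated, not the full constants:

    PROVISION_STATE_LIST = (
        'active', 'deleting', 'deploying', 'deploy hold',
        'service', 'servicing', 'service wait', 'service failed',
    )
    _UNPROVISION_STATES = (
        'active', 'deploy hold',
        'service', 'servicing', 'service wait', 'service failed',
    )

    def should_unprovision(provision_state):
        if provision_state not in PROVISION_STATE_LIST:
            # Completely unknown state: attempt an unprovision rather
            # than only deleting the instance metadata.
            return True
        return provision_state in _UNPROVISION_STATES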
Assisted-by: Claude Code
Closes-bug: #2131960
Change-Id: I31c70d35b0e6e9f8d2252bfb2f0bdec477cc6cc7
Signed-off-by: Jay Faulkner <jay@jvf.cc>
Update the server shares API policies to use
PROJECT_READER_OR_ADMIN and PROJECT_MEMBER_OR_ADMIN instead of
PROJECT_READER and PROJECT_MEMBER.
This aligns the server shares policies with other compute API
policies and ensures administrators can list, attach, show and
detach shares regardless of project policy overrides.
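For example, the index rule now looks roughly like this (the rule name
is assumed to follow the usual convention):

    from oslo_policy import policy

    from nova.policies import base

    shares_policies = [
        policy.DocumentedRuleDefault(
            name='os_compute_api:os-server-shares:index',
            check_str=base.PROJECT_READER_OR_ADMIN,
            description='List shares attached to a server',
            operations=[{'method': 'GET',
                         'path': '/servers/{server_id}/shares'}],
            scope_types=['project']),
    ]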
Signed-off-by: René Ribaud <rene.ribaud@gmail.com>
Change-Id: I2b237d56b08e3080475dc500e204298018af29c7
With the NFS, FC, and iSCSI Cinder volume backends, Nova explicitly
sets AIO mode ``io=native`` in the Libvirt guest XML. Operators may set
the new option to True in order to defer AIO mode selection to QEMU if
forcing ``io=native`` is not desired.
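A minimal sketch of the selection logic under the new option (names
are illustrative):

    def pick_aio_mode(defer_to_qemu, direct_io_supported):
        """Return the value for <driver io=...> or None to omit it."""
        if defer_to_qemu:
            # Operator opted in: leave io= unset so QEMU chooses the
            # AIO mode itself.
            return None
        return 'native' if direct_io_supported else 'threads'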
Closes-Bug: #2129788
Change-Id: I6e51706b5cb8be5becebbafe9108df1ba9e0f69f
Signed-off-by: melanie witt <melwittt@gmail.com>
If a host has multiple instances attached to the same shared
multiattach volume and you delete them in parallel, Nova needs to
correctly clean up the volume connection on the host when the last
instance is removed.
Currently we do not have a volume-level lock to guard the critical
section that determines whether the current disconnect is removing the
final usage of the volume.
This can lead to leaking the volume or other issues as noted in
bug: #2048837
This change introduces a FairLockGuard to ensure we acquire and
release the locks in a fair and ordered manner. The FairLockGuard is
used to lock the server delete with one lock per multiattach volume.
This ensures that disconnects of different volumes can happen in
parallel, but if we are disconnecting the same volume in multiple
greenthreads concurrently they will be serialised.
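The idea, sketched with oslo.concurrency's fair locks (which the
FairLockGuard presumably builds on):

    from oslo_concurrency import lockutils

    def disconnect_volume(volume_id, host_attachment_counts):
        # One fair lock per multiattach volume: deletes of different
        # volumes proceed in parallel, deletes sharing a volume are
        # serialised in arrival order.
        with lockutils.lock(volume_id, fair=True):
            host_attachment_counts[volume_id] -= 1
            if host_attachment_counts[volume_id] == 0:
                # Last usage on this host: safe to tear down the
                # host-level connection.
                remove_host_connection(volume_id)

    def remove_host_connection(volume_id):
        pass  # placeholder for the real host-level cleanup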
Assisted-By: Cursor Auto
Closes-Bug: #2048837
Change-Id: I67e10cace451259127a5d7da8fbdf7739afe3e51
Signed-off-by: Sean Mooney <work@seanmooney.info>
This patch implements parallel live migrations for the libvirt driver.
This is achieved through the introduction of a new configuration
parameter, `live_migration_parallel_connections`.
This eliminates a bottleneck on live migration speed by establishing
multiple connections for memory transfer, thus leveraging
multi-threaded behavior in QEMU.
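With libvirt this maps onto the parallel (multifd) migration support,
roughly as follows; the constants are real libvirt-python names, the
helper is illustrative:

    import libvirt  # VIR_MIGRATE_PARALLEL needs libvirt >= 5.2

    def migration_params(n_connections):
        # QEMU opens n_connections channels for the memory transfer.
        flags = libvirt.VIR_MIGRATE_LIVE | libvirt.VIR_MIGRATE_PARALLEL
        params = {
            libvirt.VIR_MIGRATE_PARAM_PARALLEL_CONNECTIONS: n_connections,
        }
        # used later as: dom.migrateToURI3(dest_uri, params, flags)
        return flags, params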
Implements-blueprint: libvirt-parallel-migrate
Change-Id: I98ff5f07f94d94f3aa0227591f425d532773adb0
Signed-off-by: Dmitriy Rabotyagov <dmitriy.rabotyagov@cleura.com>
This patch switches the default concurrency mode to native threading
for the services that gained native threading support in Flamingo:
nova-scheduler, nova-api, and nova-metadata.
The OS_NOVA_DISABLE_EVENTLET_PATCHING environment variable can still
be used to explicitly switch the concurrency mode back to eventlet by
setting OS_NOVA_DISABLE_EVENTLET_PATCHING=false.
We also ensure that the cover, docs, py3xx and functional tox targets
still run with eventlet, while py312-threading keeps running with
native threading.
Change-Id: I86c7f31f19ca3345218171f0abfa8ddd4f8fc7ea
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
This is a technical dead end and not something we're going to be able
to support long-term in pbr. We need to push users away from this.
Doing so highlights quite a few places where our docs need some work,
particularly in light of the recent removal of the eventlet servers.
Change-Id: I2ffaed710fac2612f5337aca5192af15eab46861
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
When retrieving multiple - or all - server groups, the code tries to
find the non-deleted members of each server group in every cell
individually. This is highly inefficient, which is especially
noticeable as the number of server groups rises.
We change this to query all members of all server groups we will reply
with (i.e. from the already limited list) in advance and to pass this
set of existing uuids into the function formatting the server group.
This is more efficient because we only do one large query per cell
instead of up to 1000 queries times the number of cells.
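A hedged sketch of the reshaped lookup:

    def collect_existing_member_uuids(cell_contexts, server_groups,
                                      query_existing_in_cell):
        # Gather the members of every group we are about to return,
        # then issue one query per cell for the whole set instead of
        # one query per group per cell.
        all_members = set()
        for group in server_groups:
            all_members.update(group.members)
        existing = set()
        for cell_ctx in cell_contexts:
            existing |= query_existing_in_cell(cell_ctx, all_members)
        return existing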
Change-Id: I3459ce7a8bec9a9e6f3a3b496a3e441078b86af0
Signed-off-by: Johannes Kulik <johannes.kulik@sap.com>
Partial-Bug: #2122109
When using the weigher, we need to target the right cell context for
the existing instances on the host.
fill_metadata also had an issue: we need to pass the per-instance
value from the updated dict, keyed by the instance uuid, not the whole
dict of updated instances.
Change-Id: I18260095ed263da4204f21de27f866568843804e
Closes-Bug: #2125935
Signed-off-by: Sylvain Bauza <sbauza@redhat.com>
Previous patches removed direct eventlet usage from nova-conductor, so
now we can run it with native threading as well. This patch documents
that possibility and switches both nova-conductor processes to native
threading mode in the nova-next job.
Change-Id: If26c0c7199cbda157f24b99a419697ecb6618fa6
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
Add file to the reno documentation build to show release notes for
stable/2025.2.
Use pbr instruction to increment the minor version number
automatically so that master versions are higher than the versions on
stable/2025.2.
Sem-Ver: feature
Change-Id: I7d967c1d5b1ac7fa2e601acfa25c3b5c3880056e
Signed-off-by: OpenStack Release Bot <infra-root@openstack.org>
Generated-By: openstack/project-config:roles/copy-release-tools-scripts/files/release-tools/add_release_note_page.sh
The /os-hypervisors/detail API endpoint was experiencing significant
performance issues in environments with many compute nodes when using
microversion 2.88 or higher, as it made sequential RPC calls to gather
uptime information from each compute node.
This change optimizes uptime retrieval (sketched after the list) by:
* Adding uptime to periodic resource updates sent by nova-compute to the
database, eliminating synchronous RPC calls during API requests
* Restricting RPC-based uptime retrieval to hypervisor types that support
it (libvirt and z/VM), avoiding unnecessary calls that would always fail
* Preferring cached database uptime data over RPC calls when available
Closes-Bug: #2122036
Assisted-By: Claude <noreply@anthropic.com>
Change-Id: I5723320f578192f7e0beead7d5df5d7e47d54d2b
Co-Authored-By: Sylvain Bauza <sbauza@redhat.com>
Signed-off-by: Sean Mooney <work@seanmooney.info>
When the VMCoreInfo device is enabled, the QEMU fw_cfg device in the
guest OS requires DMA between the host OS and the guest OS through the
device. However, DMA is prohibited when guest memory is encrypted
using SEV, and the attempt results in a kernel crash.
Do not add VMCoreInfo when memory encryption is enabled.
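A minimal sketch of the guard (the config-object API is illustrative):

    def maybe_add_vmcoreinfo(guest_devices, mem_encryption_enabled):
        if mem_encryption_enabled:
            # fw_cfg needs DMA between host and guest, which SEV
            # forbids; adding the device would crash the guest kernel.
            return
        guest_devices.append('vmcoreinfo')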
Closes-Bug: #2117170
Change-Id: I05c7b1ae46ccd8d9aa42456b493ac6ee7ddd8bae
Signed-off-by: Takashi Kajinami <kajinamit@oss.nttdata.com>
This is the last piece to allow users to request AMD SEV-ES for memory
encryption instead of AMD SEV. The CPU feature for memory encryption
can now be requested via the hw:mem_encryption_model flavor extra spec
or via the hw_mem_encryption_model image property.
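For example (values as described by the blueprint):

    # Requesting SEV-ES on a flavor via extra specs.
    extra_specs = {
        'hw:mem_encryption': 'true',
        'hw:mem_encryption_model': 'amd-sev-es',
    }
    # The equivalent image property is hw_mem_encryption_model.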
Implements: blueprint amd-sev-es-libvirt-support
Change-Id: Ifc9b86ad7db887cc22b2cd252fe8adc81fdc29c6
Signed-off-by: Takashi Kajinami <kajinamit@oss.nttdata.com>
Detect AMD SEV-ES support by kernel/qemu/libvirt and generate a nested
RP for ASID slots for SEV-ES under the compute node RP.
Deprecate the [libvirt] num_memory_encryption_guests option because
the option is effective only for SEV, and the maximum number of
SEV/SEV-ES guests can now be detected from the domain capabilities
presented by libvirt.
Note that creating an instance with memory encryption enabled now
requires the AMD SEV trait, because these instances can't run in the
SEV-ES slots added by this change.
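A hedged sketch of the provider-tree update; the helper and naming are
assumptions, while MEM_ENCRYPTION_CONTEXT is the resource class nova
already uses for SEV:

    def add_sev_providers(provider_tree, nodename, sev_slots,
                          sev_es_slots):
        for model, slots in (('amd-sev', sev_slots),
                             ('amd-sev-es', sev_es_slots)):
            if not slots:
                continue
            rp_name = '%s_%s' % (nodename, model)
            provider_tree.new_child(rp_name, nodename)
            provider_tree.update_inventory(rp_name, {
                'MEM_ENCRYPTION_CONTEXT': {'total': slots},
            })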
Partially-Implements: blueprint amd-sev-es-libvirt-support
Change-Id: I5968e75325b989225ed1fc6921257751ae227a0b
Signed-off-by: Takashi Kajinami <kajinamit@oss.nttdata.com>
This patch refines our logging, docs, and release notes about the
native threading mode of the scheduler, api, and metadata services to
ask for pre-production testing before enabling it in production.
Change-Id: I04bbb3d7e4664a0cab8b30f4c34ee71774536353
Signed-off-by: Balazs Gibizer <gibi@redhat.com>
This change tightens the validation around the attachment
update API to ensure that it can only be called if the source
volume has a non-empty migration status.
That means it will only accept a request to swap the volume if
it is the result of a cinder volume migration.
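A sketch of the tightened check (the exception is a stand-in for the
real API error):

    def validate_swap_allowed(volume):
        # Only accept the update if cinder set a migration status,
        # i.e. the swap is part of a cinder-orchestrated migration.
        if not volume.get('migration_status'):
            raise ValueError(
                'volume attachment update (swap) is only allowed '
                'during a cinder volume migration')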
This change is being made to prevent the instance domain
XML from getting out of sync with the nova BDM records
and cinder connection info. In the future support for direct
swap volume actions can be re-added if and only if the
nova libvirt driver is updated to correctly modify the domain.
The libvirt driver is the only driver that supported this API
outside of a cinder orchestrated swap volume.
If the domain XML and BDMs are allowed to get out of sync and an
admin later live-migrates the VM, the host path will not be updated
for the destination host. Normally this results in a live migration
failure, which often prompts the admin to cold migrate instead.
However, if the source device path exists on the destination, the
migration will proceed. This can lead to two VMs using the same host
block device. At best this will cause a crash or data corruption. At
worst it will allow one guest to access the data of another.
Prior to this change there was an explicit warning in the nova API ref
stating that humans should never call this API because it can lead to
this situation. Now it is considered a hard error due to the
security implications.
Closes-Bug: #2112187
Depends-on: https://review.opendev.org/c/openstack/tempest/+/957753
Change-Id: I439338bd2f27ccd65a436d18c8cbc9c3127ee612
Signed-off-by: Sean Mooney <work@seanmooney.info>