womax/nova

Author	SHA1	Message	Date
melanie witt	5a55a78d51	live migration: Avoid volume rollback mismatches The tl;dr is to 1) avoid trying to disconnect volumes on the destination if they were never connected in the first place and 2) avoid trying to disconnect volumes on the destination using block device info for the source. Details: * Only remotely disconnect volumes on the destination if the failure was not during pre_live_migration(). When pre_live_migration() fails, its exception handling deletes the Cinder attachment that was created before re-raising and returning from the RPC call. And the BDM connection_info in the database is not guaranteed to reference the destination because a failure could have happened after the Cinder attachment was created but before the new connection_info was saved back to the database. In this scenario, there is no way to reliably disconnect volumes in the destination remotely from the source because the destination connection_info needed to do it might not be available. * Due to the first point, this adds exception handling to disconnect the volumes while still on the destination, while the destination connection_info is still available instead of trying to do it remotely from the source afterward. * Do not pass Cinder volume block_device_info when calling rollback_live_migration_on_destination() because volume BDM records have already been rolled back to contain info for the source by that point. Not passing volume block_device_info will prevent driver.destroy() and subsequently driver.cleanup() from attempting to disconnect volumes on the destination using connection_info for the source. Closes-Bug: #1899835 Change-Id: Ia62b99a16bfc802b8ba895c31780e9956aa74c2d	2025-04-28 18:11:25 -07:00
melanie witt	8cafefb2bd	Amend functional reproducer for bug 1899835 This adds mocking of ComputeManager._live_migration_cleanup_flags() to simulate no shared storage. Otherwise the test detects shared storage and skips a second call to _disconnect_volume() that occurs in the bug scenario when storage is local. Related-Bug: #1899835 Change-Id: I06b19044876aab9b4585384352f8dccc39984526	2025-04-07 18:29:08 -07:00
Zuul	46736446ab	Merge "Extend invalidate_rp to only invalidate cache"	2025-04-07 17:22:59 +00:00
Zuul	9d910ec4bf	Merge "Imported Translations from Zanata"	2025-04-02 12:57:38 +00:00
Zuul	adfd486810	Merge "ironic: fix logging of validation errors"	2025-04-02 01:11:09 +00:00
Dan Smith	ba00d60b95	Extend invalidate_rp to only invalidate cache This makes invalidate_resource_provider() have a cacheonly flag that only invalidates our cache, but does not remove the provider from the tree for efficiency. Related to blueprint one-time-use-devices Change-Id: I04dd5e984c5671d866804c258422e4230fce37b7	2025-03-27 08:54:11 -07:00
René Ribaud	98226b60f3	FUP: Improve libvirt fixture for hostdevs This patch enhances the libvirt fixture to better align with the real libvirt output when handling hostdevs. It adds the alias tag, which libvirt provides to specify the hostdev name, and the address tag, which indicates the address seen by the guest. These two fields will be used in a subsequent patch to improve the comparison between source and destination XMLs during migration. Example: <hostdev mode='subsystem' type='pci' managed='yes'> <source> <address domain='0x0000' bus='0x82' slot='0x00' function='0x1'/> </source> <alias name='hostdev0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> </hostdev> The goal of this patch series is to enable VFIO device migration with kernel variant drivers. Implements: blueprint migrate-vfio-devices-using-kernel-variant-drivers Change-Id: I3ee3923f990dd6522a11849551a9d49c9fad426c	2025-03-26 10:04:37 +01:00
René Ribaud	c6a96a17db	FUP Update pci-passthrough and virtual-gpu documentation This patch adds the necessary documentation identified in: - pci-passthrough: Explaining live migration and known issues. - virtual-gpu: Updating the caveats section to clarify what to do when VF devices are available instead of `mdev`. The goal of this patch series is to enable VFIO device migration with kernel variant drivers. Implements: blueprint migrate-vfio-devices-using-kernel-variant-drivers Change-Id: I41271a8af5687fb1d18f9d0852492756e096720d	2025-03-26 10:02:41 +01:00
René Ribaud	28f82ba912	FUP Add a warning to make non-explicit live migration request debugging easier Today, when a user does not request live-migratable devices, the migration should fail. However, this failure is hard to detect because the end result is a NoValidHost error when Nova exhausts its reschedule attempts. As a result, it is difficult to determine why scheduling failed. This patch adds a warning to aid in debugging and identifying the root cause more easily. The goal of this patch series is to enable VFIO device migration with kernel variant drivers. Implements: blueprint migrate-vfio-devices-using-kernel-variant-drivers Change-Id: I64448f30e5d692396c129d9239679e74051cde7f	2025-03-26 10:02:41 +01:00
René Ribaud	5ac94abfdb	FUP improve comment accuracy and variable naming for tag removal This patch updates an incorrect comment to reflect the correct behavior. It also improves variable naming for tags that need to be removed from the device specifications. The goal of this patch series is to enable VFIO device migration with kernel variant drivers. Implements: blueprint migrate-vfio-devices-using-kernel-variant-drivers Change-Id: I0ae1da59014725aa0065a7f4cfa629367fa5eaeb	2025-03-26 10:02:41 +01:00
René Ribaud	4e4262cd3d	FUP Remove unnecessary PCI check This patch removes the _test_pci() method, which is no longer necessary since flavor-based requests can now be live migrated. The related tests have also been removed. This fixes a bug where a user requests a live migration with a flavor-based request and NUMA constraints (e.g., CPU affinity). In this case, the code encounters the _test_pci() method and fails because the check was originally designed to enforce port-based requests only, causing an unnecessary failure. Notes: This issue was discovered through functional tests that involve a mix of port-based and flavor-based requests. The failure in this scenario highlighted the unnecessary constraint. A functional test reproducing this issue in a mixed-mode scenario (port request + flavor-based request) will be provided in a subsequent FUP patch. The _test_pci() check was redundant, as a similar verification is already performed earlier in the migration process. Closes-Bug: 2103636 Implements: blueprint migrate-vfio-devices-using-kernel-variant-drivers Change-Id: Icbeaadd94658ed44917d724446d484f6497f29e5	2025-03-26 10:02:41 +01:00
Zuul	725a307693	Merge "Update master for stable/2025.1"	2025-03-25 15:57:34 +00:00
Zuul	778be04d4e	Merge "Fix case-sensitivity for metadata keys"	2025-03-25 15:57:27 +00:00
Zuul	76e3b573ba	Merge "Fix case sensitive comparison"	2025-03-25 15:56:01 +00:00
Zuul	caa379116e	Merge "wrap wsgi_app.init_application with latch_error_on_raise"	2025-03-25 04:35:20 +00:00
Zuul	9d05000bb8	Merge "unified limits: discover service ID and region ID"	2025-03-25 01:33:36 +00:00
Sean Mooney	8dcbbe43e7	wrap wsgi_app.init_application with latch_error_on_raise This change adds a latch_error_on_raise decorator which is applied to the init_application function in our common wsgi_app module. This decorator will catch all non-retryable exceptions and cause future invocations of the function to always return that same exception forever. A reset function is also added to the decorated function, which should be called in our base test class to prevent cross-test interactions. Closes-Bug: #2103811 Related-Bug: #1882094 Change-Id: I44b1f7e2acc36a5b557d6d8788f6099f52bbdfb8	2025-03-24 23:37:12 +00:00
Zuul	76c3c4c1bd	Merge "Ignore metadata tags in pci/stats _find_pool logic"	2025-03-19 22:04:07 +00:00
Zuul	c66d5735b0	Merge "Reproduce bug/2098496"	2025-03-19 19:26:09 +00:00
Balazs Gibizer	229fb3513a	Ignore metadata tags in pci/stats _find_pool logic The stats module uses the _find_pool() call to find a matching pool for a new device or a device that is being deallocated. If no existing pool matches the dev, then a new pool is created for it. The pool matching logic was faulty as it did not remove all the metadata keys, like rp_uuid, from the pool. So if the dev did not have that key but the pool did, then the dev did not match. On the other hand, the PCI allocation logic (when PCI in Placement is enabled) assumed that devices from a single rp_uuid are always in a single pool. As this assumption was broken by the above bug, the PCI allocation blindly tried to allocate resources for an rp_uuid from each matching pool, causing overallocation. The main fix in this patch is to ignore the metadata tags in _find_pool(). But two safety nets are also added to the allocation logic. The logic now asserts that the assumption is correct and if not (i.e. it found multiple pools with the same rp_uuid) then it bails out. It also does not ever blindly allocate the same rp_uuid request from multiple pools. Closes-Bug: #2098496 Change-Id: I9678230397fa1a3c735ee01ed756d5af3b4e1191	2025-03-19 18:25:59 +01:00
Pierre Riteau	9b7809b289	Fix missing backtick in configuration option help Change-Id: I00207d1837ba419f0dd5325ee5cbaeb678ad541b	2025-03-19 14:42:51 +01:00
OpenStack Proposal Bot	5ef6eae174	Imported Translations from Zanata For more information about this automatic import see: https://docs.openstack.org/i18n/latest/reviewing-translation-import.html Change-Id: Ifb48bcf17cda8936e4ec3b20269ca9580335ece3	2025-03-19 04:01:16 +00:00
OpenStack Release Bot	932d2334c2	Update master for stable/2025.1 Add file to the reno documentation build to show release notes for stable/2025.1. Use pbr instruction to increment the minor version number automatically so that master versions are higher than the versions on stable/2025.1. Sem-Ver: feature Change-Id: Iba42aa129140dc494d99dede17f5ea7b44062d62	2025-03-18 16:27:31 +00:00
Zuul	6042300453	Merge "Bump MIN_{LIBVIRT,QEMU} for "Epoxy""	2025-03-18 12:43:44 +00:00
Zuul	fc339b2559	Merge "Add Epoxy prelude section"	2025-03-18 10:06:15 +00:00
Sylvain Bauza	8197f7d5a6	Add Epoxy prelude section Shamelessly copied from the cycle highlights Change-Id: I9c949db80ad795d67e75c464eec6cc683e80f4af	2025-03-18 09:19:00 +01:00
Doug Goldstein	37888e875f	ironic: fix logging of validation errors When validation of the node fails, since switching to the SDK the address of the ValidationResult object is displayed instead of the actual message. This has been broken since patch Ibb5b168ee0944463b996e96f033bd3dfb498e304. Closes-Bug: 2100009 Change-Id: I8fbdaadd125ece6a3050b2fbb772a7bd5d7e5304 Signed-off-by: Doug Goldstein <cardoe@cardoe.com>	2025-03-17 17:04:01 -05:00
Balazs Gibizer	32afd0c644	Reproduce bug/2098496 Related-Bug: #2098496 Change-Id: I9a3091662fb5d1d0a41dbeff56d9680a748fe312	2025-03-17 17:50:48 +01:00
Zuul	6b45672b23	Merge "Update compute rpc alias for epoxy"	2025-03-17 13:56:55 +00:00
Zuul	1e1b74467d	Merge "doc: mark the maximum microversion for 2025.1 Epoxy"	2025-03-13 12:32:33 +00:00
Zuul	f71a0a6204	Merge "Fix serial console for ironic"	2025-03-12 12:26:06 +00:00
Sylvain Bauza	0d484ce37d	Add service version for Epoxy We agreed by I2dd906f34118da02783bb7755e0d6c2a2b88eb5d on the support envelope. Pre-RC1, we need to add a service version in the object. Post-RC1, depending on whether it's SLURP or not SLURP, we need to bump the minimum version or not. This patch only focuses on pre-RC1 stage. Given Flamingo will be skippable, we will need a post-RC1 patch for updating the min that will bump to Epoxy. HTH. Change-Id: Id74ebfeaaac7bd116b11ff7bdd86674feb825f0f	2025-03-11 11:38:40 +01:00
Zuul	a329c103cb	Merge "Update driver to map the targeted address for SR-IOV PCI devices"	2025-03-10 20:20:19 +00:00
Zuul	a0a83640b9	Merge "Update libvirt fixtures to support hostdevs"	2025-03-10 20:12:34 +00:00
Sylvain Bauza	a1a118c9f0	Update compute rpc alias for epoxy This adds an alias for Epoxy Change-Id: I2d2b5f80c13524e7aa8278029d0343d12f6d61fd	2025-03-10 16:08:22 +01:00
Sylvain Bauza	4a5e67cff7	doc: mark the maximum microversion for 2025.1 Epoxy We need it for this release. Change-Id: Ibc70045dbdd1b28bf94fd1bec1fac033fae84e26	2025-03-10 16:05:28 +01:00
Zuul	fd1ad4d582	Merge "Update conductor and filters allowing migration with SR-IOV devices"	2025-03-10 14:36:36 +00:00
Zuul	6e51c83d28	Merge "Fix parameter order in add_instance_info_to_node"	2025-03-10 14:09:22 +00:00
Zuul	d1c94e25b6	Merge "api: Address TODO in microversion v2.99"	2025-03-10 13:46:44 +00:00
Zuul	5f3133efc0	Merge "api: project/tenant and user IDs are not UUIDs"	2025-03-10 13:46:38 +00:00
Zuul	2cf4667780	Merge "libvirt: fix maxphysaddr passthrough dom parsing"	2025-03-10 12:13:36 +00:00
melanie witt	eb3a803cd7	unified limits: discover service ID and region ID In oslo.limit 2.6.0 service endpoint discovery was added, provided by three new config options: [oslo_limit] endpoint_service_type = ... endpoint_service_name = ... endpoint_region_name = ... We can use the same config options, if they are present, to look up the service ID and region ID we need when calling the GET /registered_limits API as part of the resource limit enforcement strategy. This way, the user will not have to configure endpoint_id. This will look for [oslo_limit]endpoint_id first and, if it is not set, it will do the discovery. Closes-Bug: #1931875 Change-Id: Ida14303115e00a1460e6bef4b6d25fc68f343a4e	2025-03-07 17:18:30 -08:00
Zuul	0bbb1d15f4	Merge "Update manager to allow vfio pci device live migration"	2025-03-07 20:17:49 +00:00
Zuul	276685b3db	Merge "api: Add response body schemas for console auth token APIs (v2.99)"	2025-03-06 20:37:31 +00:00
Michael Still	0954ec9e5c	Don't calculate the minimum compute version repeatedly. I have chosen to do a bit of a cleanup of the lookup of minimum compute manager versions; I didn't like how we looked up the minimum version several times for a single parent call for both create and resize. Change-Id: Ifc52d73b1328d3785e72be2c5cf741962c2b95da	2025-03-06 18:26:02 +11:00
Vasyl Saienko	bf8883ca3b	Fix serial console for ironic Align code related to the serial console after we switched to openstacksdk in the ironic virt driver. Closes-Bug: #2099872 Depends-On: https://review.opendev.org/c/openstack/requirements/+/942889 Change-Id: Ic25c5e8b9ac9cf87f4f96c9956140aa4f6576ded	2025-03-05 05:07:55 +00:00
Zuul	29d17552a7	Merge "Add live_migratable flag to PCI device specification"	2025-03-04 20:24:52 +00:00
Zuul	e1b33cdf0c	Merge "Augment the LiveMigrateData object"	2025-03-04 20:24:46 +00:00
Stephen Finucane	244f9b0ad1	api: Address TODO in microversion v2.99 There's a TODO to prevent passing random query strings to the '/os-console-auth-tokens' API that should be addressed while we are updating the API. Do it now. Change-Id: Ic19f75b1e26ae048df110f6cd9217b706bf3c0a4 Signed-off-by: Stephen Finucane <stephenfin@redhat.com>	2025-03-04 17:13:36 +00:00
Stephen Finucane	244ff89060	tests: Filter out eventlet deprecation warnings These are super annoying (and useless to boot, since there is nothing we can do about them in the near term). Shut them ⬇️⬇️⬇️ down ⬇️⬇️⬇️. Change-Id: I469dafa243b95749b34503c1f3e905d9d8c780d4 Signed-off-by: Stephen Finucane <stephenfin@redhat.com>	2025-03-04 15:44:44 +00:00
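
Commit 8dcbbe43e7 above describes a latch_error_on_raise decorator that remembers the first non-retryable exception and replays it on every later call, with a reset hook for tests. The following is only a minimal sketch of that idea, not Nova's actual code: the `retryable` parameter, the choice to re-raise (rather than return) the stored exception, and the example `init_application` body are all assumptions made for illustration.

```python
import functools


def latch_error_on_raise(retryable=()):
    """Latch the first non-retryable exception raised by the wrapped function.

    `retryable` is a hypothetical parameter: a tuple of exception types that
    are passed through without being latched.
    """
    def decorator(func):
        latched = None  # the stored exception, if any

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal latched
            if latched is not None:
                # A previous call failed fatally: replay the same error
                # without running the function again.
                raise latched
            try:
                return func(*args, **kwargs)
            except retryable:
                # Retryable errors are not latched; a later call may succeed.
                raise
            except Exception as exc:
                latched = exc
                raise

        def reset():
            """Clear the latched error (intended for test setup)."""
            nonlocal latched
            latched = None

        wrapper.reset = reset
        return wrapper
    return decorator


# Hypothetical usage: the function body here is a stand-in, not Nova's.
@latch_error_on_raise(retryable=(TimeoutError,))
def init_application(config_ok=True):
    if not config_ok:
        raise ValueError("bad configuration")
    return "app"
```

Calling `init_application.reset()` in a base test fixture would clear the latch between tests, which is the cross-test isolation the commit message mentions.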

61321 Commits