Adding more tests for graceful shutdown:
- shut down the destination compute and see how live and cold migration
progress
- start building an instance and, once the compute starts building it,
shut down the compute service and see whether the build finishes
- revert resize server
Partial implement blueprint nova-services-graceful-shutdown-part1
Change-Id: I57132fb7b7fa614dfc138508581ff5a67aaed906
Signed-off-by: Ghanshyam Maan <gmaan.os14@gmail.com>
During graceful shutdown, the compute service keeps a 2nd RPC
server active which can be used to finish the in-progress
operations. Like live migration, resize and cold migration
also perform RPC calls between the source and destination computes.
For those operations too, we can use the 2nd RPC server and make
sure they complete during graceful shutdown.
A quick overview of which RPC methods are involved in
resize/cold migration and which of them will use the 2nd RPC server:
Resize/cold migration:
- prep_resize: No, resize/migration has not started yet.
- resize_instance: Yes, this is where the resize/migration starts.
- finish_resize: Yes
- cross-cell resize case:
  - prep_snapshot_based_resize_at_dest: No, this is an initial check and
    the migration has not started.
  - prep_snapshot_based_resize_at_source: Yes, this starts the migration.
Confirm resize:
- confirm_resize: No
- cross-cell confirm resize case:
  - confirm_snapshot_based_resize: No
Revert resize:
- revert_resize: No
- check_instance_shared_storage: Yes. This is called from the destination
  to the source, so we need the source to respond to it so that the
  revert can continue.
- finish_revert_resize on source: Yes. At this stage, the revert resize is
  in progress and abandoning it here can leave the migration in an
  unrecoverable state.
- cross-cell revert case:
  - revert_snapshot_based_resize_at_dest: No
  - finish_revert_snapshot_based_resize_at_source: Yes
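The decision table above can be sketched as a simple lookup. The method
names mirror the compute RPC API, but the dict and helper below are
purely illustrative, not Nova's actual code:

```python
# Illustrative encoding of the resize/cold-migration decision table:
# which RPC methods should be dispatched to the 2nd ("alt") RPC server
# so they survive a graceful shutdown.

USES_ALT_RPC_SERVER = {
    # resize / cold migration
    'prep_resize': False,                               # not started yet
    'resize_instance': True,                            # migration starts here
    'finish_resize': True,
    'prep_snapshot_based_resize_at_dest': False,        # initial check only
    'prep_snapshot_based_resize_at_source': True,       # starts the migration
    # confirm resize
    'confirm_resize': False,
    'confirm_snapshot_based_resize': False,
    # revert resize
    'revert_resize': False,
    'check_instance_shared_storage': True,              # dest -> source call
    'finish_revert_resize': True,
    'revert_snapshot_based_resize_at_dest': False,
    'finish_revert_snapshot_based_resize_at_source': True,
}


def rpc_topic_for(method, default_topic='compute', alt_topic='compute-alt'):
    """Return the topic a call should target during graceful shutdown."""
    return alt_topic if USES_ALT_RPC_SERVER.get(method, False) else default_topic
```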
Partial implement blueprint nova-services-graceful-shutdown-part1
Change-Id: If08b698d012a75b587144501d829403ec616f685
Signed-off-by: Ghanshyam Maan <gmaan.os14@gmail.com>
For graceful shutdown of the compute service, it will have two RPC
servers. One RPC server is used for new requests and will be stopped
during graceful shutdown; the 2nd RPC server (listening on the
'compute-alt' topic) will be used to complete the in-progress operations.
We select the operations (case by case) and their RPC methods to use
the 2nd RPC server so that they will not be interrupted when shutdown
is initiated, and graceful shutdown will keep the 2nd RPC server active
for graceful_shutdown_timeout. A new method 'prepare_for_alt_rpcserver'
is added which falls back to the first RPC server if it detects an old
compute.
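The fallback behaviour of 'prepare_for_alt_rpcserver' could look
roughly like this; the version constant and the client class are
hypothetical stand-ins for illustration, not Nova's implementation:

```python
# Sketch of the 'prepare_for_alt_rpcserver' fallback idea: target the
# 'compute-alt' topic only when the remote compute is new enough;
# otherwise keep using the primary RPC server. All names here are
# assumptions, not Nova's real objects.

ALT_RPC_MIN_VERSION = 66  # hypothetical minimum service version


class FakeClient:
    """Minimal stand-in for an RPC client that can be re-targeted."""

    def __init__(self, topic='compute'):
        self.topic = topic

    def prepare(self, topic):
        # Return a new client pinned to the given topic.
        return FakeClient(topic=topic)


def prepare_for_alt_rpcserver(client, remote_service_version):
    """Pin the client to 'compute-alt', or fall back to the first
    RPC server when the remote compute is too old."""
    if remote_service_version < ALT_RPC_MIN_VERSION:
        return client  # old compute: keep the primary RPC server
    return client.prepare(topic='compute-alt')
```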
As this has an upgrade impact, it bumps the compute/service version and
adds release notes for the same.
The list of operations that should use the 2nd RPC server will
eventually grow; this commit moves the operations below to use the 2nd
RPC server:
* Live migration
- Live migration: it uses the 2nd RPC server and will try to complete
  the operation during shutdown.
- live_migration_force_complete does not need to use the 2nd RPC server.
  It is a direct RPC request from the API to the compute, and if it is
  rejected during shutdown, that is fine; it can be initiated again
  once the compute is up.
- live_migration_abort does not need to use the 2nd RPC server. Ditto,
  it is a direct RPC request from the API to the compute. It cancels a
  queued live migration, but if the migration has already started, the
  driver cancels it. If it is rejected during shutdown because
  RPC is stopped, that is fine and it can be initiated again.
* server external event
* Get server console
As graceful shutdown cannot be tested in tempest, this adds a new job
to test it. Currently it tests the live migration operation, and it can
be extended to other operations that will use the 2nd RPC server.
Partial implement blueprint nova-services-graceful-shutdown-part1
Change-Id: I4de3afbcfaefbed909a29a831ac18060c4a73246
Signed-off-by: Ghanshyam Maan <gmaan.os14@gmail.com>
On Debian 13 (Trixie), libvirt packaging is modularized and
the libvirt-daemon-lock package (providing virtlockd) is
optional. The evacuate hook previously assumed all libvirt
services were installed and failed when stopping/starting
missing units.
Extract a reusable manage_libvirt_service.yaml task file that
checks if a service exists via systemctl list-unit-files
before managing its units. This prevents failures when
optional libvirt packages are not installed and future-proofs
against further packaging changes.
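The existence check can be illustrated in Python (the real change is an
Ansible task file; the parsing of `systemctl list-unit-files` output
below is a sketch, with invocation made injectable for testing):

```python
# Sketch of the check done before managing a unit: only act on units
# that systemd actually knows about, so optional packages such as
# libvirt-daemon-lock (virtlockd) do not cause failures when absent.
import subprocess


def unit_exists(unit, list_output=None):
    """Return True if `unit` appears in systemd's unit-file listing.

    `list_output` can be injected for testing; otherwise
    `systemctl list-unit-files <unit>` is invoked directly.
    """
    if list_output is None:
        proc = subprocess.run(['systemctl', 'list-unit-files', unit],
                              capture_output=True, text=True)
        list_output = proc.stdout
    return any(line.split()[:1] == [unit]
               for line in list_output.splitlines())
```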
Generated-By: claude-code
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Change-Id: Ie84e2e8ab2d3065b1562ee5e256fa163541955f7
Signed-off-by: Sean Mooney <work@seanmooney.info>
A while back, change I52046e6f7acdfb20eeba67dda59cbb5169e5d17e disabled
cinder in the nova-ovs-hybrid-plug job and added checks for cinder
before attempting to run evacuate BFV tests.
Resource setup for BFV was, however, not bypassed, and the attempt to
set up a BFV server resource fails with:
keystoneauth1.exceptions.catalog.EndpointNotFound: publicURL endpoint
for volumev3 service not found
This adds a bypass to avoid attempting to create a BFV server when
cinder is not available.
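The bypass amounts to a guard like the following; the catalog shape and
function names are simplified assumptions for illustration:

```python
# Hypothetical sketch of the bypass: skip boot-from-volume (BFV) server
# setup when no volumev3 (cinder) service is in the service catalog,
# avoiding the EndpointNotFound failure seen in the job.

def cinder_available(catalog):
    """Return True if a volumev3 service is present in the catalog."""
    return any(svc.get('type') == 'volumev3' for svc in catalog)


def maybe_create_bfv_server(catalog, create_fn):
    """Create the BFV server only when cinder is available."""
    if not cinder_available(catalog):
        return None  # bypass: cinder is disabled in this job
    return create_fn()
```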
Change-Id: I52c7e5ce268bb38cee16c18c5523fe0e224970aa
Modifies the nova-ovs-hybrid-plug job to disable cinder and swift to
ensure we test for this going forward.
Change-Id: I52046e6f7acdfb20eeba67dda59cbb5169e5d17e
Recently a change landed in devstack [1] to install packages into a
global venv by default and the "nova" command was not symlinked for
compat, so jobs using run-evacuate-hook are failing with:
nova: command not found
We had intended to switch away from using novaclient CLI commands in our
scripts anyway, so we can just use this opportunity to switch to OSC.
[1]: If9bc7ba45522189d03f19b86cb681bb150ee2f25
Change-Id: Ifd969b84a99a9c0460bceb1a28fcee6e51cbb4ae
There's no q-agt service in an OVN deployment.
Change-Id: Ia25c966c70542bcd02f5540b5b94896c17e49888
Signed-off-by: Lucas Alvares Gomes <lucasagomes@gmail.com>
libvirtd was being restarted on the controller during negative
evacuation tests that rely on the service being stopped to cause an
evacuation failure.
This change adds various virt services to the list of services stopped
and now disabled on the host to ensure these don't cause systemd to
restart libvirtd:
* virtlogd.service
* virtlogd-admin.socket
* virtlogd.socket
* virtlockd.service
* virtlockd-admin.socket
* virtlockd.socket
Closes-Bug: #1903979
Change-Id: Ic83252bbda76c205bcbf0eef184ce0b201e224fc
The recent switch to Focal introduced a change in behaviour for the
libvirtd service that can now be restarted through new systemd socket
services associated with it once stopped. As we need it to remain
stopped during the initial negative evacuation tests on the controller
we now need to also stop these socket services and then later restart
them.
Change-Id: I2333872670e9e6c905efad7461af4d149f8216b6
This change reworks the evacuation parts of the original
nova-live-migration job into a zuulv3 native ansible role and initial
job covering local ephemeral and iSCSI/LVM volume attached instance
evacuation. Future jobs will cover ceph and other storage backends.
Change-Id: I380e9ca1e6a84da2b2ae577fb48781bf5c740e23
This changes the nova-multi-cell job to essentially
force cross-cell resize and cold migration. By "force"
I mean there is only one compute in each cell and
resize to the same host is disabled, so the scheduler
has no option but to move the server to the other cell.
This adds a new role to write the nova policy.yaml file
to enable cross-cell resize and a pre-run playbook so
that the policy file is set up before tempest runs.
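The policy file the role writes might look like this; the rule name
follows the cross-cell-resize blueprint's policy, but treat the exact
name and value as assumptions:

```yaml
# Hypothetical policy.yaml for the job; rule name/value assumed.
"compute:servers:allow_cross_cell_resize": ""
```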
Part of blueprint cross-cell-resize
Change-Id: Ia4f3671c40e69674afc7a96b5d9b198dabaa4224
For the most part this should be a pretty straight-forward
port of the run.yaml. The most complicated thing is executing
the post_test_hook.sh script. For that, a new post-run playbook
and role are added.
The relative path to devstack scripts in post_test_hook.sh itself
had to drop the 'new' directory: since we are no longer executing
the script through devstack-gate, the 'new' path does not
exist.
Change-Id: Ie3dc90862c895a8bd9bff4511a16254945f45478