Files
nova/roles/run-graceful-shutdown-tests/files/verify_cold_migration.sh
T
Ghanshyam Maan 996c4ff9e8 Prepare resize/cold migration for graceful shutdown
During graceful shutdown, compute service keep a 2nd RPC
server active which can be used to finish the in-progress
operations. Like live migration, resize and cold migrations
also perform RPC call among source and destination compute.
For those operation also, we can use 2nd RPC server and make
sure they will be completed during graceful shutdown.

A quick overview of what all RPC methods are involved in the
resize/cold migration and what all will be using 2nd RPC server:

Resize/cold migration
- prep_resize: No, resize/migration is not started yet.
- resize_instance: Yes, here the resize/migration starts.
- finish_resize: Yes
- cross cell resize case:
  - prep_snapshot_based_resize_at_dest: NO, this is initial check and
    migration is not started
  - prep_snapshot_based_resize_at_source: Yes, this start the migration

Confirm resize: NO
- confirm_resize: NO
- cross cell confirm resize case:
  - confirm_snapshot_based_resize - NO

Revert resize:
- revert_resize - NO
- check_instance_shared_storage: YES. This is called from dest to source
  so we need source to respond to it so that revert can continue.
- finish_revert_resize on source- YES, at this stage, revert resize is
  in progress and abandoning it here can lead migration to unreocverable
  state.
- cross cell revert case:
  - revert_snapshot_based_resize_at_dest: NO
  - finish_revert_snapshot_based_resize_at_source: YES

Partial implement blueprint nova-services-graceful-shutdown-part1

Change-Id: If08b698d012a75b587144501d829403ec616f685
Signed-off-by: Ghanshyam Maan <gmaan.os14@gmail.com>
2026-02-25 20:36:07 +00:00

36 lines
1.0 KiB
Bash
Executable File

#!/bin/bash
source /opt/stack/devstack/openrc admin
set -x
set -e
server=$1
# Wait for the server to finish cold migration and reach VERIFY_RESIZE state,
# which indicates the migration has completed and is awaiting confirmation.
timeout=360
count=0
migration_start=$(date +%s)
while true; do
status=$(openstack server show ${server} -f value -c status)
task_state=$(openstack server show ${server} -f value -c OS-EXT-STS:task_state)
if [ "${status}" == "VERIFY_RESIZE" ] && { [ "${task_state}" == "None" ] || [ -z "${task_state}" ]; }; then
migration_end=$(date +%s)
migration_duration=$((migration_end - migration_start))
echo "Cold migration completed in ${migration_duration} seconds."
break
fi
if [ "${status}" == "ERROR" ]; then
echo "Server went to ERROR status during cold migration"
exit 6
fi
sleep 5
count=$((count+1))
if [ ${count} -eq ${timeout} ]; then
echo "Timed out waiting for cold migration to complete"
exit 5
fi
done