60801 Commits

Author SHA1 Message Date
Zuul f25065b470 Merge "retry write_sys call on device busy" 2024-06-27 19:49:52 +00:00
Zuul fb2c9714d0 Merge "api: Add request body schemas for SG APIs" 2024-06-27 19:42:37 +00:00
Zuul bc1febbc07 Merge "tweak emulation job to avoid OOM errors" 2024-06-27 19:07:50 +00:00
Stephen Finucane 847608e75a api: Add request body schemas for SG APIs
These are deprecated but there's value in having a proper - if loose -
schema in place for API documentation purposes. Also, doing things this
way allows us to remove a whole load of hand-rolled stuff.

Change-Id: I4106cfa2a09d135f12892ed6d1f42f4151dc72e4
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2024-06-17 17:18:25 +01:00
Zuul 7dc4b1ea62 Merge "add functional repoducer for bug 2065927" 2024-05-31 10:44:09 +00:00
Zuul 96268d4e7a Merge "libvirt: Ensure both swtpm and swtpm_setup exist for vTPM support" 2024-05-28 09:07:13 +00:00
Sean Mooney 44c1b48b31 retry write_sys call on device busy
This change adds a retry_if_busy decorator
to the read_sys and write_sys functions in the filesystem
module that will retry reads and writes up to 5 times with
a linear backoff.

This allows nova to tolerate short periods of time where
sysfs returns device busy. If the retries are exhausted
and offlining a core fails, a warning is logged and the failure is
ignored. Onlining a core is always treated as a hard error if
retries are exhausted.

Closes-Bug: #2065927
Change-Id: I2a6a9f243cb403167620405e167a8dd2bbf3fa79
2024-05-27 18:31:31 +01:00
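
The retry behaviour described in the commit above can be illustrated with a minimal sketch. This is not nova's implementation: the parameter names (max_attempts, base_delay, sleep), the choice to count 5 attempts in total, and the decision to retry only on EBUSY are assumptions made for the example.

```python
import errno
import functools
import time


def retry_if_busy(max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Retry the wrapped call when it raises OSError with errno EBUSY.

    Hypothetical sketch: the delay grows linearly (base_delay * attempt),
    and the last failure is re-raised once the attempts are exhausted.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except OSError as exc:
                    # Anything other than "device busy" is not retried.
                    if exc.errno != errno.EBUSY or attempt == max_attempts:
                        raise
                    sleep(base_delay * attempt)
        return wrapper
    return decorator


# Usage: a fake sysfs write that reports "busy" twice before succeeding.
calls = {"count": 0}
delays = []


@retry_if_busy(max_attempts=5, base_delay=0.1, sleep=delays.append)
def flaky_write():
    calls["count"] += 1
    if calls["count"] < 3:
        raise OSError(errno.EBUSY, "Device or resource busy")
    return "ok"


result = flaky_write()
```

Passing the sleep function as a parameter keeps the example testable without real waiting. The caller still decides whether an exhausted retry is fatal, which matches the commit's distinction between offlining (warn and ignore) and onlining (hard error).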
Sean Mooney 3811c7f648 tweak emulation job to avoid OOM errors
This change increases the swap in the emulation job from
1G to 8G.
This change updates the default cirros image from 0.5.2
to 0.5.3 to avoid known kernel bugs.
This change reduces the tb_cache_size to 128.
The tempest concurrency is reduced to 4 to avoid
no valid host errors in the resize tests.

Change-Id: Ic1dde3d54f5ca12408ef53218773a27d55760705
2024-05-27 14:19:02 +01:00
Zuul ac4a67cbda Merge "docs: Follow up for persistent mdevs" 2024-05-27 11:01:32 +00:00
Zuul 3dfdc10f63 Merge "libvirt: Create persistent mdevs" 2024-05-27 11:01:26 +00:00
Zuul bded279a00 Merge "docs: Add more information about unified limits" 2024-05-23 19:48:39 +00:00
melanie witt c7e49dfa16 docs: Follow up for persistent mdevs
This addresses review feedback on change
I7e1d10e66a260efd0a3f2d6522aeb246c7582178 to add some clarifying text
to the docs and release note.

Related to blueprint persistent-mdevs

Change-Id: I472552c64cc2c2ce06896158664faac0199d90bd
2024-05-23 18:29:04 +00:00
Zuul d7d2fb1edd Merge "scheduler: AggregateMultitenancyIsolation to support unlimited tenant" 2024-05-23 15:35:54 +00:00
Zuul 4e3a41f0a4 Merge "Stop using split UEC image (mostly)" 2024-05-22 19:03:32 +00:00
Zuul 3a53d715cd Merge "[doc] Improve description for nova-manage db purge" 2024-05-22 02:44:50 +00:00
Alexey Stupnikov ac8729ac87 [doc] Improve description for nova-manage db purge
The --before argument is currently described in an ambiguous way: it
is not actually used to filter entries ARCHIVED before the specified
date. Instead, it compares the provided date with the "deleted_at" value
for most rows and "updated_at" or "created_at" for the remaining ones.

Since we already talk about time of deletion when describing the
--before argument of "nova-manage db archive_deleted_rows",
it makes sense to not provide extra details here as well.

Change-Id: Ib5940e88a52dc8d32303e27237e567c3481fc3dc
2024-05-21 20:19:18 +02:00
Zuul 4bc5ff1c99 Merge "fix py312 tox definitions" 2024-05-21 17:43:27 +00:00
Sean Mooney ee581a5c9d add functional repoducer for bug 2065927
Today if the write sys call to offline a CPU when
deleting an instance fails due to an OSError or ValueError,
the instance delete fails and the instance goes to error.

As reported in bug #2065927, this can happen as a result of
OSError: [Errno 16] Device or resource busy if the VM is
deleted shortly after it is started.

Related-Bug: #2065927
Change-Id: I1352a3a1e28cfe14ec8f32042ed35cb25e70338e
2024-05-21 17:57:29 +01:00
Zuul c60b81fa4b Merge "tox: Drop envdir" 2024-05-21 15:17:45 +00:00
Sean Mooney 7ff24958ee fix py312 tox definitions
Change I6de86f3e3e283ba404f927ea4c8164f791df3989
added the py312 functional job definition but did not
update the tox.ini to define it.
As a result it is running the unit tests, not the functional tests.
This change simply corrects that.

Change-Id: Id6ee76e0190469ac09baf0bc56a9022317c6f881
2024-05-21 12:03:41 +01:00
Zuul 61ad4f1f27 Merge "Enable virtio-scsi in nova-next" 2024-05-20 21:41:24 +00:00
Dan Smith eed3e2b47f Stop using split UEC image (mostly)
This reverts us back to using the standard disk image for most of our
tests, which is more representative of how people actually use nova.
This leaves the UEC image on a few jobs for the sake of comparison
data for the time being, and because we should actually test that
code path if we're going to say we support it.

Change-Id: I16ed92d342464325d4bef33c1e22b328bcfbe7d6
2024-05-20 08:24:18 -07:00
Dan Smith 84b0a481fe Enable OCaaS for several nova jobs
This uses the OCaaS feature in devstack which saves *minutes* of time
running devstack by effectively caching the openstackclient startup
instead of taking that hit for every invocation.

Change-Id: I78308128c6249f7f871e0231ce717b1ec0f88509
2024-05-20 08:24:18 -07:00
melanie witt d45379c6b4 docs: Add more information about unified limits
The admin docs are missing some details about enabling unified limits,
like oslo.limit configuration and Keystone roles. This adds more
information about what roles are needed for what actions, how to set
quota limits, quota enforcement, and unified limits in general.

This also removes a couple of tables from the user docs that show
obsolete/deprecated quota limits because they may be more confusing
than helpful considering we don't want new deployments to use them and
they add more clutter to the page.

More info is also added regarding the CLI commands for unified limits
and makes it consistent between the user and admin docs.

Change-Id: Id93f9997d1b217e0c2151c88323564f7a7fefc02
2024-05-17 19:17:05 +00:00
Zuul ab3ca1e205 Merge "Make python 3.12 unit and functional voting" 2024-05-16 15:42:38 +00:00
Zuul b6a846c6c8 Merge "Fix hacking test with syntax error" 2024-05-16 14:26:26 +00:00
Zuul fa44978141 Merge "Fix notification object hashes for python 3.12" 2024-05-16 14:26:18 +00:00
Pierre Riteau bfd3525863 Fix formatting issues in extra-specs docs
Change-Id: I693f70ffb7630fe99336fef52783ea55c492624d
2024-05-16 11:08:19 +02:00
Zuul d467edac49 Merge "do not use str(url) to stringify a URL for subsequent use" 2024-05-15 20:05:28 +00:00
Dan Smith 50b180023f Make python 3.12 unit and functional voting
Change-Id: I6de86f3e3e283ba404f927ea4c8164f791df3989
2024-05-15 12:29:14 -07:00
Dan Smith 3f0879ccc3 Fix hacking test with syntax error
This hacking test has a syntax error in it. On older pythons, this
does not prevent us from finding the second popen() use, but on
python 3.12 it does.

Change-Id: Ib74dc030118e0cb9fab548b112d32ce080969a15
2024-05-15 12:12:55 -07:00
Dan Smith 6ee938fd22 Fix notification object hashes for python 3.12
Python 3.12 changes the repr() syntax for the OrderedDict object,
which causes us to calculate different hashes for the notification
objects across the version gap. The get_extra_data() function wraps
a dict in OrderedDict after sorting the elements of the original dict,
but that is not entirely necessary. The sorted() list of dict.items()
is a perfectly reasonable representation of the things it is capturing,
so we can just eliminate the use of OrderedDict there and thus end
up calculating the same hashes on <3.12 and >=3.12.

NOTE: This changes a bunch of the hashes of notification objects
without bumping or changing versions. This is expected purely because
we are changing the hashing method, and thus this is not a violation
or upgrade concern.

Change-Id: I242150138deed7fe74b13d9c44b333293cd24ffa
2024-05-15 12:12:54 -07:00
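
The effect described in the commit above can be shown with a small sketch. The function names below are hypothetical and the hashing scheme (SHA-256 over repr()) is an assumption chosen for illustration. It is not the fingerprinting code nova actually uses.

```python
import hashlib
from collections import OrderedDict


def fingerprint_with_ordereddict(fields):
    # repr(OrderedDict(...)) is rendered differently on Python 3.12 and later
    # than on earlier versions, so this digest depends on the interpreter.
    text = repr(OrderedDict(sorted(fields.items())))
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def fingerprint_with_sorted_items(fields):
    # repr() of a plain list of (key, value) tuples is the same across
    # versions, so this digest is stable.
    text = repr(sorted(fields.items()))
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


sample = {"b": 2, "a": 1}
stable_repr = repr(sorted(sample.items()))
stable_hash = fingerprint_with_sorted_items(sample)
same_hash_other_order = fingerprint_with_sorted_items({"a": 1, "b": 2})
```

Because sorted() already fixes the ordering, wrapping the result in an OrderedDict adds nothing except a version-dependent repr(), which is the point the commit message makes.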
Zuul 2e6041a76e Merge "Remove SQLAlchemy tips jobs" 2024-05-15 16:51:17 +00:00
Mike Bayer acbe3e28e5 do not use str(url) to stringify a URL for subsequent use
The str(url) function in SQLAlchemy hides the password.
For a URL string that is to be re-used, use
render_as_string(hide_password=False).

Change-Id: I2ab28da5cc2b9ed3a1588259b2e94320662816bb
2024-05-15 16:23:58 +01:00
Zuul 5095336689 Merge "Upload glance image with --file in ceph job" 2024-05-14 16:40:32 +00:00
Zuul 7096423b34 Merge "Reject AZ changes during aggregate add / remove host" 2024-05-09 20:17:32 +00:00
Zuul 5470dedd4d Merge "Fix device_type=lun with boot_index" 2024-05-09 17:32:28 +00:00
Zuul 67119b7de3 Merge "Avoid setting serial on raw LUN devices" 2024-05-08 18:36:11 +00:00
Dan Smith a4e72f71fc Upload glance image with --file in ceph job
This enables our use of the OCaaS devstack feature, which can't support
image upload from stdin.

Change-Id: Idc5646ef6763447e3c1c68de03dea8197c305f6c
2024-05-08 10:57:06 -07:00
Zuul 114b8184e4 Merge "Make overcommit check for pinned instance pagesize aware" 2024-05-08 13:55:26 +00:00
Balazs Gibizer 3c0eadae0b Reject AZ changes during aggregate add / remove host
After this patch nova rejects the add host to aggregate API action
if the host has instances and the new aggregate for the host would
mean that these instances need to move from one AZ (even from the
default one) to another. Such AZ change is not implemented in nova
and currently leads to stuck instances.

Similarly nova will reject remove host from aggregate API action if the
host has instances and the aggregate removal would mean that the
instances need to change AZ.

Depends-On: https://review.opendev.org/c/openstack/tempest/+/821732

Change-Id: I19c4c6d34aa2cc1f32d81e8c1a52762fa3a18580
Closes-Bug: #1907775
2024-05-08 14:56:56 +02:00
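
The rule described in the commit above can be sketched as a pure function. Everything here is hypothetical: the function names, the "nova" default AZ name, and the representation of aggregates as a list of AZ names (None for an aggregate without AZ metadata) are assumptions for illustration, not nova's API.

```python
DEFAULT_AZ = "nova"  # assumed name of the default availability zone


def effective_azs(aggregate_azs):
    """Return the set of AZs a host belongs to, given its aggregates' AZs.

    A host whose aggregates carry no AZ falls back to the default AZ.
    """
    azs = {az for az in aggregate_azs if az is not None}
    return azs or {DEFAULT_AZ}


def may_add_host(current_azs, new_az, instance_count):
    """Allow the add unless it would move existing instances to another AZ."""
    if instance_count == 0:
        return True
    after = list(current_azs) + [new_az]
    return effective_azs(current_azs) == effective_azs(after)


def may_remove_host(current_azs, removed_az, instance_count):
    """Allow the removal unless it would move existing instances to another AZ."""
    if instance_count == 0:
        return True
    remaining = list(current_azs)
    remaining.remove(removed_az)
    return effective_azs(current_azs) == effective_azs(remaining)
```

Under these assumptions, a host with instances and no AZ aggregate cannot be added to an aggregate in "az1" (its instances would leave the default AZ), while an empty host can be moved freely.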
Zuul 95bfa492e9 Merge "[ironic] Fix rebooting instance" 2024-05-08 01:10:34 +00:00
Dan Smith 32546d9c1b Enable virtio-scsi in nova-next
This lets us test the direct-lun volume attachment model.

Change-Id: Ibc7bff377cc5b5572e2a11006116401babaac347
Depends-On: https://review.opendev.org/c/openstack/tempest/+/918457
2024-05-07 11:49:21 -07:00
Dan Smith 2f0c340d39 Fix device_type=lun with boot_index
Right now we'll fail to calculate the boot order of a set of BDMs if
one of them is a device_type=lun. This fixes that and teaches us
that it's just a "hd" from qemu's perspective.

Closes-Bug: #2065084
Change-Id: Ic1340918738d503fc797c9373fe2e1dd16b27a09
2024-05-07 11:14:30 -07:00
Dan Smith 575ff86a4f Avoid setting serial on raw LUN devices
Libvirt now enforces that device="lun" (i.e. raw device passthrough)
disks must not have the <serial> property set. We recently enabled
the ability to manage devices by alias instead of serial, but to
fully enable this use-case we need to avoid putting serial in the
XML to appease libvirt.

Related-Bug: #2065084
Change-Id: Ifa2df89f27e58e1e64ce046edeaf6e49a7c89490
2024-05-07 10:39:49 -07:00
Vasyl Saienko 0e766885f6 [ironic] Fix rebooting instance
The correct state for hard and soft reboots is rebooting [0]

[0] https://github.com/openstack/openstacksdk/blob/master/openstack/baremetal/v1/node.py#L44

Closes-Bug: #2064826
Change-Id: I18e0352b3638872e85ce91a3cfcbbfddc812ab67
2024-05-07 20:39:31 +03:00
Zuul 428990e7b7 Merge "Do not close returned image-chunk iterator & get_verifier early" 2024-05-06 17:57:32 +00:00
Zuul 07f05add31 Merge "api: Keep track of action controllers" 2024-05-03 00:52:47 +00:00
Zuul ca5be99837 Merge "Remove old excludes" 2024-05-01 16:59:06 +00:00
Takashi Kajinami b4ff81c329 Remove old excludes
These are detected as errors since the clean up was done[1] in
the requirements repository.

[1] 314734e938f107cbd5ebcc7af4d9167c11347406

Bump the minimum versions to avoid installing these known bad versions.

Change-Id: I5ab0c3a1ac208e3967e65c298573079283a7b6cd
2024-05-01 01:30:04 +09:00