After a baremetal instance is deleted, and its allocation is removed
in placement, the ironic node might start cleaning. Eventually nova
will notice and update the inventory to be reserved.
During this window, a new instance may have already picked this
ironic node.
When that race happens today the build fails with an error:
"Failed to reserve node ..."
This change tries to ensure the remaining alternative hosts are
attempted before aborting the build.
Clearly the race is still there, but this makes it less painful.
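
As a rough illustration of the idea (not the actual nova code path), the
retry over alternates can be sketched as below. The names
NodeReservationError, pick_first_reservable and the reserve callable are
hypothetical and exist only for this example.

```python
class NodeReservationError(Exception):
    """Hypothetical stand-in for the "Failed to reserve node ..." error."""


def pick_first_reservable(hosts, reserve):
    """Try each candidate host in order and return the first one reserved.

    ``hosts`` is the selected host followed by its alternates, and
    ``reserve`` is a callable that raises NodeReservationError on failure.
    Both are assumptions made for this sketch.
    """
    failed = []
    for host in hosts:
        try:
            reserve(host)
        except NodeReservationError:
            # The node was probably claimed or is still cleaning; move on
            # to the next alternate instead of aborting the build.
            failed.append(host)
            continue
        return host
    # Only give up once every candidate has been attempted.
    raise NodeReservationError("no host could be reserved: %r" % failed)
```

The race itself is untouched here; the loop only avoids failing the build
while untried alternates remain.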
Related-Bug: #1974070
Change-Id: Ie5cdc17219c86927ab3769605808cb9d9fa9fa4d
Unlike uwsgi, apache mod_wsgi does not support passing
commandline arguments to the python wsgi script it invokes.
As a result, while you can pass --config-file when hosting the
api and metadata wsgi applications with uwsgi, there is no
way to use multiple config files with mod_wsgi.
This change mirrors how this is supported in keystone today
by introducing a new OS_NOVA_CONFIG_FILES env var to allow
operators to optionally pass a ';' delimited list of config
files to load.
This change also adds docs for this env var and the existing
undocumented OS_NOVA_CONFIG_DIR.
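
A minimal sketch of how a ';' delimited env var could be turned into a
list of config files is shown below. The helper name
config_files_from_env and its handling of empty entries are assumptions
for illustration, not necessarily what the nova wsgi entry point does.

```python
import os


def config_files_from_env(environ=None, var="OS_NOVA_CONFIG_FILES"):
    """Return the config files named in a ';' delimited env var.

    ``environ`` defaults to os.environ; passing a dict makes the helper
    easy to exercise in isolation (an assumption of this sketch).
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(var, "")
    # Skip empty entries so an unset variable or a trailing ';' yields
    # no bogus file names.
    return [name.strip() for name in raw.split(";") if name.strip()]
```

For example, a value of "nova.conf;api.conf" would produce a two-element
list, and an unset variable would produce an empty list.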
Closes-Bug: 1994056
Change-Id: I8e3ccd75cbb7f2e132b403cb38022787c2c0a37b
For networks whose subnets have the DHCP service enabled, don't provide
the mtu value in the metadata. That way cloud-init will not configure it
"statically" in e.g. netplan's config file, and the guest OS will use the
MTU value provided by the DHCP service.
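
The rule can be sketched as follows. The function name, the
"dhcp_enabled" key and the use of None to mean "not provided" are all
assumptions made for this example; the real metadata code may represent
subnets and the omitted value differently.

```python
def metadata_mtu(network_mtu, subnets):
    """Return the MTU to publish in metadata, or None to leave it to DHCP.

    ``subnets`` is assumed to be a list of dicts with a boolean
    "dhcp_enabled" key (hypothetical shape for this sketch).
    """
    if any(subnet.get("dhcp_enabled") for subnet in subnets):
        # DHCP will hand out the MTU, so do not pin it in the metadata.
        return None
    return network_mtu
```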
Closes-Bug: #1899487
Change-Id: Ib775c2210349b72b3dc033554ac6d8b35b8d2d79
Add a check for quiesce failing because libvirt cannot connect to the
qemu guest agent inside the instance.
Closes-Bug: #1980720
Change-Id: I134a4060ace2678f76ae3606bf117c07194a8d92
A few tests related to volume detach are timing out in the
nova-lvm job (failing 100%[1]). The root cause of the timeout is not
known and it may take time to find and fix the issue. To unblock the
gate and keep running the rest of the tests in the lvm job, let's skip
the failing tests until they are fixed.
Related-Bug: #1998148
[1] https://zuul.opendev.org/t/openstack/builds?job_name=nova-lvm&branch=master&skip=0
Change-Id: Id29ce352df84168d0a45512e2c59820aefc75943
As per the 2023.1 testing runtime[1], we need to test on Ubuntu
Jammy (which will be taken care of by tempest and devstack patches
to move base jobs to Jammy) and run at least a single job on
Ubuntu Focal (for smooth upgrade). Also, python 3.10 testing is
voting now.
This commit adds a new job to run on Focal, which can be removed
in a future cycle when the testing runtime drops the requirement of
Focal testing. Also, make the python 3.10 functional and unit test
jobs voting (openstack-tox-py310 is running as part of the generic
template so we do not need to explicitly add that).
[1] https://governance.openstack.org/tc/reference/runtimes/2023.1.html
Change-Id: Ia43f73dba00b0b5932939bcc7d11b97a83072ee3
Libvirt 7.7 changed the mdev device naming to include the parent PCI
device when listing node devices. The domain, however, will still only
see the UUID and not see the parent PCI device. Changing the parsing to
simply drop the PCI identifier is not enough as the device cannot be
found when attempting to look up the new ID.
Modify the Libvirt Driver's _get_mediated_device_information to tolerate
different formats of the mdev name. This first uses the legacy behavior
by trying to look up the device name that is passed in (typically
mdev_<uuid> format) and, if that is not found, iterates the list of mdev
node devices until the right UUID is found and selects that one.
Note that the lookup of the mdev device by UUID is needed in order
to keep the ability to recreate assigned mediated devices on a reboot of
the compute node.
Additionally, the libvirt utils parsing method mdev_name2uuid has
been updated to tolerate both mdev_<uuid> and mdev_<uuid>_<pciid>
formats.
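
A minimal sketch of parsing that accepts both name formats is shown
below. It is an illustration only, not the code in nova's libvirt utils:
the function name mdev_name_to_uuid and the regular expression are
assumptions, and anything after the UUID is treated as an opaque suffix
because the exact <pciid> layout is not spelled out here.

```python
import re

# "mdev_" followed by a UUID whose dashes are replaced by underscores
# (8-4-4-4-12 hex digits), optionally followed by "_<anything>".
_MDEV_NAME_RE = re.compile(
    r"^mdev_([0-9a-fA-F]{8}(?:_[0-9a-fA-F]{4}){3}_[0-9a-fA-F]{12})(?:_.+)?$"
)


def mdev_name_to_uuid(mdev_name):
    """Extract the dashed UUID from an mdev node device name."""
    match = _MDEV_NAME_RE.match(mdev_name)
    if match is None:
        raise ValueError("not an mdev node device name: %r" % mdev_name)
    # Only the UUID group is kept; any trailing suffix is ignored.
    return match.group(1).replace("_", "-")
```

Anchoring on the fixed-length UUID, rather than stripping a prefix and
replacing every underscore, is what keeps the trailing identifier from
leaking into the result.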
Closes-Bug: 1951656
Change-Id: Ifed0fa16053228990a6a8df8d4c666521db7e329
Currently, when you delete an ironic instance, we trigger
an undeploy in ironic and we release our allocation in placement.
We do this well before the ironic node is actually available.
We have attempted to fix this by marking unavailable nodes
as reserved in placement. This works great until you try
to re-image lots of nodes.
It turns out that ironic nodes that are waiting for their automatic
clean to finish are returned as valid allocation candidates
for quite some time. Eventually we mark them as reserved.
This patch takes a strange approach: if we mark all nodes as
reserved as soon as the instance lands, we close the race.
That is, when the allocation is removed the node is still
unavailable until the next update of placement is done and
notices that the node has become available. That may or may
not have been after automatic cleaning. The trade off is
that when you don't have automatic cleaning, we wait a bit
longer to notice the node is available again.
Note, this is also useful when a broken Ironic node is
marked as in-maintenance while it is in-use by a nova
instance. In a similar way, we mark the node as reserved
immediately, rather than first waiting for the instance to be
deleted before reserving the resources in Placement.
Closes-Bug: #1974070
Change-Id: Iab92124b5776a799c7f90d07281d28fcf191c8fe
Add the following hacking rule.
* N372: Don't use the setDaemon method.
Use the daemon attribute instead.
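
For illustration, the pattern the rule asks for looks like the sketch
below. The helper name start_background_worker is hypothetical and only
serves as a wrapper for the example.

```python
import threading


def start_background_worker(target):
    """Start ``target`` in a daemon thread and return the thread."""
    worker = threading.Thread(target=target)
    # Set the attribute rather than calling worker.setDaemon(True),
    # which is deprecated since Python 3.10.
    worker.daemon = True
    worker.start()
    return worker
```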
Change-Id: Idb45421205f76d2d3b0576bd0504d261ed249edd
Related-Bug: 1987191
Signed-off-by: Takashi Natsume <takanattie@gmail.com>
The options list in the 'Related Options:' section isn't rendered
as a bulleted list for some params because of a missing blank line.
This change adds the missing blank line wherever needed in [1].
[1] https://docs.openstack.org/nova/latest/configuration/config.html
Change-Id: I7077aea2abcf3cab67592879ebd1fde066bfcac5
This inflates the cirros image to 1G for a more realistic scenario.
Technically we should have been doing something like this all along,
as the deployment guidance for ceph is to use a raw image, not a qcow2
one, so this also brings our testing closer to real-life deployments.
We also need to up the volume size tempest uses for various tests
to make sure we will fit.
Change-Id: I5c447e630aaf1413a5eac89c2e8103506d245221
Some config options in os_vif affect nova behavior, so we should add them
to nova.conf.sample in order to let people fine-tune them on demand
without looking into the code.
This will also change the nova config reference on docs.o.o.
Signed-off-by: Arnaud Morin <arnaud.morin@ovhcloud.com>
Change-Id: Icfba423fda037be9cf071022283985297a989b07