The nova-manage placement heal_allocations CLI is capable of healing
missing placement allocations due to port resource requests. To support
the new extended port resource request this code needs to be adapted
too.
When the heal_allocations command got the port resource request
support in Train, the only way to figure out the missing allocations was
to dig into the placement RP tree directly. Since then nova gained
support for interface attach with such ports, and to support that
placement gained support for in_tree filtering in allocation candidate
queries. So now the healing logic can be generalized to the following:
For a given instance
1) Find the ports that have a resource request but no allocation key in
the binding profile. These are the ports we need to heal.
2) Gather the RequestGroups from these ports and run an
allocation_candidates query restricted to the current compute of the
instance with in_tree filtering.
3) Extend the existing instance allocation with a returned allocation
candidate and update the instance allocation in placement.
4) Update the binding profile of these ports in neutron
The main change compared to the existing implementation is in step 2);
the rest is mostly the same.
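Step 1) can be illustrated with a minimal sketch. It assumes ports are
plain dicts shaped like Neutron port API responses; the helper name
ports_needing_heal and the sample data are hypothetical and are not
taken from the nova code.

```python
def ports_needing_heal(ports):
    """Return ports that request resources but record no allocation."""
    to_heal = []
    for port in ports:
        # A port without a resource request never needs an allocation.
        if not port.get("resource_request"):
            continue
        # The binding profile may be missing or None on unbound ports.
        profile = port.get("binding:profile") or {}
        if "allocation" not in profile:
            to_heal.append(port)
    return to_heal


# Hypothetical sample data: only port-a requests resources without
# having an allocation key in its binding profile.
sample_ports = [
    {
        "id": "port-a",
        "resource_request": {
            "resources": {"NET_BW_EGR_KILOBIT_PER_SEC": 1000},
        },
        "binding:profile": {},
    },
    {
        "id": "port-b",
        "resource_request": {
            "resources": {"NET_BW_EGR_KILOBIT_PER_SEC": 1000},
        },
        "binding:profile": {"allocation": "rp-uuid-1"},
    },
    {"id": "port-c", "resource_request": None, "binding:profile": {}},
]

healable_ids = [p["id"] for p in ports_needing_heal(sample_ports)]
print(healable_ids)
```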
Note that support for the old resource request format is kept alongside
the new resource request format until Neutron makes the new format
mandatory.
blueprint: qos-minimum-guaranteed-packet-rate
Change-Id: I58869d2a5a4ed988fc786a6f1824be441dd48484
At the moment, oslo.reports is enabled when running nova-api
standalone, but not when using uWSGI.
We're now updating the uWSGI entry point as well to include the
oslo.reports hook, which is extremely helpful when debugging
deadlocks.
Change-Id: I605f0e40417fe9b0a383cc8b3fefa1325f9690d9
The 'nova-manage placement audit' tool has functionality that can
delete orphaned allocations in placement. Add a section for it in the
doc for troubleshooting orphaned allocations.
Change-Id: I697de57cf7eb43c0993af2b1f5b3f5c4395ef097
This adds some basic documentation for the above command and also
includes some very generic osc commands to use when checking volume
attachments.
Blueprint: nova-manage-refresh-connection-info
Change-Id: Ib3d680654fe0809c9e8341dffd3a63ab02945a38
This patch adjusts the nova documentation about the extended port
resource request support in nova as the neutron API extension did not
land in Xena.
Change-Id: I3b961426745084bdb4a6d04468f5a3c762be4cfa
blueprint: qos-minimum-guaranteed-packet-rate
Currently, when 'nova-manage db archive_deleted_rows' is run with
the --until-complete option, the process will archive rows in batches
in a tight loop, which can cause problems in busy environments where
the aggressive archiving interferes with other requests trying to write
to the database.
This adds an option for users to specify an amount of time in seconds
to sleep between batches of rows while archiving with --until-complete,
allowing the process to be throttled.
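The throttling idea can be sketched as a batch loop that pauses between
iterations. This is only an illustration: archive_until_complete,
archive_batch and the sleep parameter are hypothetical names, not the
nova-manage implementation.

```python
import time


def archive_until_complete(archive_batch, sleep_seconds=0, sleep=time.sleep):
    """Call archive_batch() until it archives nothing, pausing in between.

    archive_batch is any callable returning the number of rows it
    archived. The sleep callable is injectable so the loop can be
    exercised without really waiting.
    """
    total = 0
    while True:
        archived = archive_batch()
        if archived == 0:
            return total
        total += archived
        if sleep_seconds:
            # Give other database writers a chance between batches.
            sleep(sleep_seconds)


# Simulate three non-empty batches followed by an empty one.
batches = iter([100, 100, 37, 0])
pauses = []
total = archive_until_complete(
    lambda: next(batches), sleep_seconds=5, sleep=pauses.append)
print(total, pauses)
```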
Closes-Bug: #1912579
Change-Id: I638b2fa78b81919373e607458e6f68a7983a79aa
The interface attach and detach logic is now fully adapted to the new
extended resource request format, and supports more than one request
group in a single port.
blueprint: qos-minimum-guaranteed-packet-rate
Change-Id: I73e6acf5adfffa9203efa3374671ec18f4ea79eb
Nova re-generates the resource request of an instance for each server
move operation (migrate, resize, evacuate, live-migrate, unshelve) to
find (or validate) a target host for the instance move. This patch
extends this logic to support the extended resource request from
neutron.
As the changes in the neutron interface code are called from the
nova-compute service during port binding, the compute service version
is bumped.
And a check is added to the compute-api to reject the move operations
with ports having extended resource request if there are old computes
in the cluster.
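The kind of check described above can be sketched as follows. This
assumes the extended format is recognizable by a request_groups key in
the port's resource_request; the function names, the exception class
and the version numbers are hypothetical placeholders, not nova's real
ones.

```python
class ExtendedResourceRequestNotSupported(Exception):
    """Raised when some computes are too old for the extended format."""


def has_extended_resource_request(port):
    # Assumption: the extended format nests groups under "request_groups".
    request = port.get("resource_request") or {}
    return "request_groups" in request


def check_move_allowed(ports, min_compute_version, required_version):
    """Reject the move if any compute is older than required_version."""
    if min_compute_version >= required_version:
        return
    for port in ports:
        if has_extended_resource_request(port):
            raise ExtendedResourceRequestNotSupported(
                "port %s uses the extended resource request but not all "
                "computes are upgraded" % port["id"])


old_style = {"id": "p1", "resource_request": {"resources": {}}}
new_style = {"id": "p2", "resource_request": {"request_groups": []}}

# Placeholder versions: 1 stands for an old compute, 2 for a new one.
check_move_allowed([old_style], min_compute_version=1, required_version=2)
check_move_allowed([new_style], min_compute_version=2, required_version=2)
```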
blueprint: qos-minimum-guaranteed-packet-rate
Change-Id: Ibcf703e254e720b9a6de17527325758676628d48
This adds the final missing pieces to support creating servers with
ports having extended resource request. As the changes in the neutron
interface code are called from the nova-compute service during port
binding, the compute service version is bumped. And a check is added to
the compute-api to reject such server create requests if there are old
computes in the cluster.
Note that some of the negative and SRIOV related interface attach
tests also start to pass as they do not depend on any of the
interface attach specific implementation. Interface attach is still
broken here, as the failing positive tests show.
blueprint: qos-minimum-guaranteed-packet-rate
Change-Id: I9060cc9cb9e0d5de641ade78c5fd7e1cc77ade46
Add a combination of commands to allow users to show existing stashed
connection_info for a volume attachment and update volume attachments
with fresh connection_info from Cinder by recreating the attachments.
Unfortunately we don't have an easy way to access host connector
information remotely (i.e. over the RPC API), meaning we need to also
provide a command to get the compute specific connector information
which must be run on the compute node that the instance is located on.
Blueprint: nova-manage-refresh-connection-info
Co-authored-by: Stephen Finucane <stephenfin@redhat.com>
Change-Id: I2e3a77428f5f6113c10cc316f94bbec83f0f46c1
There's only one driver now, which means there isn't really a driver at
all. Move the code into the manager altogether and avoid a useless layer
of abstraction.
Change-Id: I609df5b707e05ea70c8a738701423ca751682575
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Take the opportunity to clean up the docs quite a bit, ultimately
combining two disparate guides on the scheduler into one.
Change-Id: Ia72d39b4774d93793b381359b554c717dc9a6994
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
To implement the usage of the same_subtree query parameter in the
allocation candidate request, the minimum required placement
microversion first needs to be bumped from 1.35 to 1.36. This patch makes
that bump and updates the related nova upgrade check. Later patches will
modify the query generation to include the same_subtree param in the
request.
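For illustration, a rough sketch of what a query string using
same_subtree might look like. The group suffixes, the helper name
build_candidates_query and the choice of group_policy=none are
assumptions made for this example; the actual query generation is left
to the later patches mentioned above.

```python
from urllib.parse import urlencode


def build_candidates_query(groups):
    """Build an allocation candidates query string for suffixed groups.

    groups maps a request group suffix (e.g. "_BW") to a dict of
    resource class -> amount.
    """
    params = []
    for suffix, resources in groups.items():
        amounts = ",".join(
            "%s:%d" % (rc, amount) for rc, amount in sorted(resources.items()))
        params.append(("resources" + suffix, amounts))
    if len(groups) > 1:
        # Assumption: multiple suffixed groups need an explicit policy.
        params.append(("group_policy", "none"))
        # Ask for the groups to be satisfied from the same subtree.
        params.append(("same_subtree", ",".join(groups)))
    # Keep ":" and "," readable instead of percent-encoding them.
    return urlencode(params, safe=":,")


query = build_candidates_query({
    "_BW": {"NET_BW_EGR_KILOBIT_PER_SEC": 1000},
    "_PPS": {"NET_PACKET_RATE_KILOPACKET_PER_SEC": 100},
})
print("GET /allocation_candidates?" + query)
```

Such a request would need to be sent with placement microversion 1.36
or later, which is why the minimum is bumped here.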
Change-Id: I5bfec9b9ec49e60c454d71f6fc645038504ef9ef
blueprint: qos-minimum-guaranteed-packet-rate
To prepare for the unlikely event that Neutron merges and an operator
enables the port-resource-request-groups neutron API extension before
nova adds support for it, this patch rejects server creation if such
extension is enabled in Neutron. Enabling that extension has zero
benefits without nova support hence the harsh but simple rejection.
A subsequent patch will reject server lifecycle operations in a more
sophisticated way and as soon as we support some operations, like
boot, the deployer might rightfully choose to enable the Neutron
extension.
Change-Id: I2c55d9da13a570efbc1c862116cea31aaa6aa02e
blueprint: qos-minimum-guaranteed-packet-rate
Alembic does lots of new things. Provide docs for how to use this. We
also improve upgrade docs slightly, removing references to ancient
reviews that are no longer really helpful as well as calling out our N
-> N+1 constraint.
Change-Id: I3760b82ce3bd71aa0a760d7137d69dfa3f29dc1d
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
This looks more complicated than it is, but it's really quite simple.
Essentially we have to deal with two possible configurations:
- For existing deployments, the DB sync operation should apply any
outstanding sqlalchemy-migrate-based migrations, dummy apply the
initial alembic migration, and then apply any additional alembic-based
migrations requested (or any available, if no version is specified).
- For new deployments, the DB sync operation should apply the initial
alembic migration and any additional alembic-based migrations
requested (or any available, if no version is specified). No
sqlalchemy-migrate-based migrations will ever be applied.
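The two configurations can be summarised in a small sketch. The function
plan_db_sync and the step strings are hypothetical and only mirror the
description above; they do not reflect the real nova.db code.

```python
def plan_db_sync(existing_deployment, version=None):
    """Return the ordered list of steps a DB sync would perform."""
    steps = []
    if existing_deployment:
        # Bring the legacy schema fully up to date first.
        steps.append("apply outstanding sqlalchemy-migrate migrations")
        # The schema already matches, so only record the initial
        # alembic revision instead of running it.
        steps.append("dummy apply initial alembic migration")
    else:
        steps.append("apply initial alembic migration")
    target = version if version is not None else "latest available"
    steps.append("apply alembic migrations up to %s" % target)
    return steps


for deployment in (True, False):
    print(deployment, plan_db_sync(deployment))
```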
While we continue to allow users to request a specific database
migration version to upgrade to, we *do not* allow them to request a
sqlalchemy-migrate-based migration version. There's no good reason to do
this - the deployment won't run with an out-of-date DB schema (something
that's also true of the alembic migration, fwiw) - and we want to get
people off of sqlalchemy-migrate as fast as possible. A change in a
future release can remove the sqlalchemy-migrate-based migrations once
we're sure that they'll have upgraded to a release including all of the
sqlalchemy-migrate-based migrations (so Wallaby).
Tests are modified to validate the sanity of these operations. They're
mostly trivial changes, but we do need to do some funky things to ensure
that (a) we don't use logger configuration from 'alembic.ini' that will
mess with our existing logger configuration and (b) we re-use connection
objects as necessary to allow us to run tests against in-memory
databases, where a different connection would actually mean a different
database. We also can't rely on 'WalkVersionsMixin' from oslo.db since
that only supports sqlalchemy-migrate [1]. We instead must re-invent the
wheel here somewhat.
[1] https://github.com/openstack/oslo.db/blob/10.0.0/oslo_db/sqlalchemy/test_migrations.py#L42-L44
Change-Id: I850af601f81bd5d2ecc029682ae10d3a07c936ce
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
The two remaining modules, 'api_models' and 'api_migrations', are
moved to the new 'nova.db.api' module.
Change-Id: I138670fe36b07546db5518f78c657197780c5040
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Nested allocations are only partially supported in the nova-manage
placement heal_allocations CLI. This patch documents the missing support
and blocks healing instances with a VGPU or Cyborg device profile request
in the embedded flavor. Blocking is needed because if --force is used
with such instances, the tool could recreate an allocation that ignores
some of these resources.
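A minimal sketch of such a blocking check, assuming the requests show up
in the flavor extra specs under the resources:VGPU and
accel:device_profile keys. The helper name flavor_blocks_healing is
hypothetical and the real check in nova-manage may differ.

```python
def flavor_blocks_healing(extra_specs):
    """Return True if the flavor requests VGPU or a Cyborg device profile."""
    # Assumption: a Cyborg device profile is requested via this key.
    if "accel:device_profile" in extra_specs:
        return True
    # Extra spec values are strings, so convert before comparing.
    return int(extra_specs.get("resources:VGPU", 0)) > 0


print(flavor_blocks_healing({"resources:VGPU": "1"}))
print(flavor_blocks_healing({"hw:cpu_policy": "dedicated"}))
```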
Change-Id: I89ac90d2ea8bc268940869dbbc90352bfad5c0de
Related-Bug: bug/1939020
As a prerequisite for blueprint generic-mdevs we need to rename the
existing enabled_vgpu_types options and dynamically generated groups
into enabled_mdev_types.
There is no upgrade impact for existing users, as the original
options are still accepted.
NOTE(sbauza): As we have a lot of methods and objects named gpu-ish
let's just change what we need here and provide followups for
fixing internal tech debt later.
Change-Id: Idba094f6366a24965804b88da0bc1b9754549c99
Partially-Implements: blueprint generic-mdevs
Correct a variety of gaps and other issues seen while improving the
flavor docs.
Change-Id: I8d68016cecb0269a5f9af88b0a08578f85403e23
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Replace references to novaclient with OSC in the boot from volume guide.
This is essentially a revert of commit aa3964118, which was a revert of
an earlier attempt at doing this that fell down because it didn't
reflect the changes in CLI parameters between the different tools.
Change-Id: Ic99440dd618243517f64506e3da88885fc2c44c9
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>