59443 Commits

Author SHA1 Message Date
Zuul ffb810e2ba Merge "[yoga] Add support for VNIC_REMOTE_MANAGED" 2022-02-09 21:00:40 +00:00
Zuul 8bc76b3cc7 Merge "Filter computes without remote-managed ports early" 2022-02-09 21:00:32 +00:00
Zuul d99c15f4f1 Merge "Add supports_remote_managed_ports capability" 2022-02-09 21:00:10 +00:00
Zuul 6674f7668c Merge "Bump os-traits to 2.7.0" 2022-02-09 21:00:03 +00:00
Zuul 1aaedc8e16 Merge "Introduce remote_managed tag for PCI devs" 2022-02-09 20:59:55 +00:00
Zuul 7cb15fa245 Merge "[yoga] Include pf mac and vf num in port updates" 2022-02-09 20:59:46 +00:00
Zuul a656748cf9 Merge "Update announce self workaround opt description" 2022-02-09 20:42:47 +00:00
Zuul 6e126869f0 Merge "Cleanup old resize instances dir before resize" 2022-02-09 13:16:32 +00:00
Zuul f7fa3bf5fc Merge "neutron: Rework how we check for extensions" 2022-02-08 22:56:47 +00:00
Dmitrii Shcherbakov 0620678344 [yoga] Add support for VNIC_REMOTE_MANAGED
Allow instances to be created with VNIC_TYPE_REMOTE_MANAGED ports.
Those ports are assumed to require remote-managed PCI devices which
means that operators need to tag those as "remote_managed" in the PCI
whitelist if this is the case (there is no meta information or standard
means of querying this information).

The following changes are introduced:

* Handling for VNIC_TYPE_REMOTE_MANAGED ports during allocation of
  resources for instance creation (remote_managed == true in
  InstancePciRequests);

* Usage of the noop os-vif plugin for VNIC_TYPE_REMOTE_MANAGED ports
  in order to avoid the invocation of the local representor plugging
  logic since a networking backend is responsible for that in this
  case;

* Expectation of bind time events for ports of VNIC_TYPE_REMOTE_MANAGED.
  Events for those arrive early from Neutron after a port update (before
  Nova begins to wait in the virt driver code); therefore, Nova is set
  to avoid waiting for plug events for VNIC_TYPE_REMOTE_MANAGED ports;

* Making sure the service version is high enough on all compute services
  before creating instances with ports that have VNIC type
  VNIC_TYPE_REMOTE_MANAGED. Network requests are examined for the presence
  of port ids to determine the VNIC type via Neutron API. If
  remote-managed ports are requested, a compute service version check
  is performed across all cells.

Change-Id: Ica09376951d49bc60ce6e33147477e4fa38b9482
Implements: blueprint integration-with-off-path-network-backends
2022-02-09 01:23:27 +03:00
Dmitrii Shcherbakov c487c730d0 Filter computes without remote-managed ports early
Add a pre-filter for requests that contain VNIC_TYPE_REMOTE_MANAGED
ports in them: hosts that do not have either the relevant compute
driver capability COMPUTE_REMOTE_MANAGED_PORTS or PCI device pools
with "remote_managed" devices are filtered out early. Presence of
devices actually available for allocation is checked at a later
point by the PciPassthroughFilter.
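
The pre-filter rule described above can be sketched roughly as follows. This is an illustration only, not Nova's implementation: the `Host` dataclass, the `prefilter_hosts` function, and the dict-based pool layout are hypothetical. Only the trait name COMPUTE_REMOTE_MANAGED_PORTS and the "remote_managed" tag come from the commit message.

```python
from dataclasses import dataclass, field

REMOTE_MANAGED_TRAIT = "COMPUTE_REMOTE_MANAGED_PORTS"


@dataclass
class Host:
    # Hypothetical stand-in for a compute host record.
    name: str
    traits: set = field(default_factory=set)
    pci_pools: list = field(default_factory=list)  # one dict of tags per pool


def has_remote_managed_pool(host):
    # A pool counts only if it is tagged remote_managed == "true".
    return any(
        str(pool.get("remote_managed", "false")).lower() == "true"
        for pool in host.pci_pools
    )


def prefilter_hosts(hosts, request_has_remote_managed_ports):
    # Requests without remote-managed ports are left untouched.
    if not request_has_remote_managed_ports:
        return list(hosts)
    # Keep only hosts with both the driver capability and a tagged pool.
    # Whether a device is actually free is checked later, not here.
    return [
        h for h in hosts
        if REMOTE_MANAGED_TRAIT in h.traits and has_remote_managed_pool(h)
    ]
```

The sketch deliberately ignores device availability, mirroring the split in which the later PciPassthroughFilter step handles that check.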

Change-Id: I168d3ccc914f25a3d4255c9b319ee6b91a2f66e2
Implements: blueprint integration-with-off-path-network-backends
2022-02-09 01:23:27 +03:00
Dmitrii Shcherbakov d1e9ecb443 Add supports_remote_managed_ports capability
In order to support remote-managed ports the following is needed:

* Nova compute driver needs to support this feature;
* For the Libvirt compute driver, a given host needs to have the right
  version of Libvirt - the one which supports PCI VPD (7.9.0
  https://libvirt.org/news.html#v7-9-0-2021-11-01).

Therefore, this change introduces a new capability to track driver
support for remote-managed ports.

Change-Id: I7ea96fd85d2607e0af0f6918b0b45c58e8bec058
2022-02-09 01:23:27 +03:00
Dmitrii Shcherbakov 6294c144e7 Bump os-traits to 2.7.0
The new version contains changes needed by the multi-architecture
support and off-path SmartNIC DPU support code.

Needed-By: I168d3ccc914f25a3d4255c9b319ee6b91a2f66e2
Needed-By: Ia070a29186c6123cf51e1b17373c2dc69676ae7c
Change-Id: Ic1179f3e5e2c1aeb069972f21edffe5b003eb525
2022-02-09 01:23:27 +03:00
Dmitrii Shcherbakov 0d5f8ffc2b Introduce remote_managed tag for PCI devs
PCI devices may be managed remotely from the perspective of a hypervisor
host (e.g. by a SmartNIC DPU) which means that the VF control plane is
not available to the hypervisor. Depending on the presence of a
remote_managed device attribute in the InstancePCIRequest spec and
available device types in a pool, additional processing needs to be
done:

* Filtering of devices marked as `remote_managed: "true"` in the
  whitelist configuration so that they are not used in legacy SR-IOV
  and hardware offload requests;

* Early error reporting if PFs marked as remote_managed="true" are
  present in the whitelist configuration. This is not supported
  explicitly since allocating such PFs would remove the associated
  VFs from the pool and an instance with such PF and its VFs will
  not have access to the control plane required for representor
  interface plugging at the SmartNIC DPU side. This configuration
  is not valid which is enforced in the PCIDeviceStats code.

* Checking of the presence of a card serial number in the PCI VPD
  capability of a device if it was marked as `remote_managed: "true"`
  in the whitelist. The card serial number presence is mandatory
  because it is used for identification of a host in the networking
  backend that will handle the configuration of a given PCI device at
  the remote host side (i.e. representor plugging, flow programming).

For compatibility, all devices not explicitly marked as remote_managed
in the whitelist are assumed to have remote_managed attribute set to
False.
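
A minimal sketch of the three checks listed above, assuming whitelist tags arrive as a plain dict of strings. The function names, the "PF" device type string, and the choice to raise ValueError are hypothetical; they are not taken from the Nova code.

```python
def parse_remote_managed(tags):
    # Devices not explicitly tagged are treated as not remote-managed.
    value = str(tags.get("remote_managed", "false")).strip().lower()
    if value not in ("true", "false"):
        raise ValueError(f"invalid remote_managed value: {value!r}")
    return value == "true"


def validate_device(dev_type, tags, card_serial_number=None):
    """Return True if the device is remote-managed; raise on invalid configs."""
    if not parse_remote_managed(tags):
        return False
    if dev_type == "PF":
        # Allocating such a PF would take its VFs out of the pool.
        raise ValueError("PFs cannot be tagged remote_managed")
    if not card_serial_number:
        # The serial number identifies the host to the networking backend.
        raise ValueError("remote-managed device has no card serial number")
    return True


def eligible_for_legacy_sriov(dev_type, tags):
    # Remote-managed devices are excluded from legacy SR-IOV requests.
    return not parse_remote_managed(tags)
```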

Implements: blueprint integration-with-off-path-network-backends
Change-Id: Ic44d5e206326827d00a751da3cea67afe3929a08
2022-02-09 01:23:24 +03:00
Dmitrii Shcherbakov 1f71696ecc [yoga] Include pf mac and vf num in port updates
Retrieve PF mac and VF logical number at runtime for
a given VF PCI address and include them in port updates to Neutron.

Implements: blueprint integration-with-off-path-network-backends
Change-Id: I83a128a260acdd8bf78fede566af6881b8b82a9c
2022-02-07 23:38:41 +03:00
Tobias Urdin 9111b99f73 Cleanup old resize instances dir before resize
If there is a failed resize that also failed the cleanup
process performed by _cleanup_remote_migration() the retry
of the resize will fail because it cannot rename the current
instances directory to _resize.

This renames _cleanup_failed_migration(), which already implements the
logic we want, to _cleanup_failed_instance_base() and uses it to clean
up the directory for both migration and resize.

It then simply calls _cleanup_failed_instance_base() with
the resize dir path before trying a resize.
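
The idea can be sketched as below. This is a simplified local-filesystem illustration, not the libvirt driver code: `prepare_resize_dir` is a hypothetical name, and the real driver may run the cleanup remotely.

```python
import os
import shutil
import tempfile


def cleanup_failed_instance_base(inst_base):
    # Remove a directory left behind by a previously failed operation.
    if os.path.exists(inst_base):
        shutil.rmtree(inst_base)


def prepare_resize_dir(inst_base):
    resize_dir = inst_base + "_resize"
    # Without this cleanup, a stale non-empty "_resize" directory
    # makes the rename below fail on retry.
    cleanup_failed_instance_base(resize_dir)
    os.rename(inst_base, resize_dir)
    return resize_dir


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    inst = os.path.join(base, "instance-0001")
    os.mkdir(inst)
    os.mkdir(inst + "_resize")  # simulate leftovers from a failed resize
    print(prepare_resize_dir(inst))
    shutil.rmtree(base)
```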

Closes-Bug: 1960230
Change-Id: I7412b16be310632da59a6139df9f0913281b5d77
2022-02-07 18:14:44 +00:00
Tobias Urdin 2aa1ed5810 Update announce self workaround opt description
This updates the announce self workaround config opt
description to include info about instance being set
as tainted by libvirt.

Change-Id: I8140c8fe592dd54fc09a9510723892806db49a56
2022-02-07 11:25:35 +00:00
Zuul b6fe7521af Merge "docs: Follow-ups for cells v2, architecture docs" 2022-02-07 10:27:51 +00:00
Zuul 87dd10dcd4 Merge "[yoga] Add PCI VPD Capability Handling" 2022-02-05 09:33:35 +00:00
Zuul cc794b3641 Merge "skip test_tagged_attachment in nova-next" 2022-02-05 00:36:44 +00:00
Zuul bdaeadeb64 Merge "Revert "Revert resize: wait for events according to hybrid plug"" 2022-02-04 15:23:28 +00:00
Zuul f5427576c4 Merge "Migrate RequestSpec.numa_topology to use pcpuset" 2022-02-04 14:18:09 +00:00
Zuul 6fdd623288 Merge "Reproduce bug 1952941" 2022-02-04 12:42:47 +00:00
Sean Mooney b00ce99dd4 skip test_tagged_attachment in nova-next
This change adds
tempest.api.compute.servers.test_device_tagging.TaggedAttachmentsTest.test_tagged_attachment
to the tempest exclude regex.

Over the past few weeks we have noticed this test failing intermittently,
and it has now started to become a gate blocker. This test is executed in other
jobs that use the PC machine type and is only failing in the nova-next
job, which uses q35. As such, while we work out how to address this properly,
we skip it in nova-next.

Change-Id: I845ca5989a8ad84d7c04971316fd892cd29cfe1f
Related-Bug: #1959899
2022-02-04 12:28:10 +00:00
Zuul 37bd469199 Merge "Add 'hw:vif_multiqueue_enabled' flavor extra spec" 2022-02-03 18:51:08 +00:00
Zuul 26ce7b30b2 Merge "docs: Add new architecture guide" 2022-02-03 18:28:02 +00:00
Stephen Finucane 136f1deb6e docs: Follow-ups for cells v2, architecture docs
Based on review feedback on [1] and [2].

[1] If39db50fd8b109a5a13dec70f8030f3663555065
[2] I518bb5d586b159b4796fb6139351ba423bc19639

Change-Id: I44920f20213462a3abe743ccd38b356d6490a7b4
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2022-02-03 11:41:50 +00:00
Zuul b0633ac49b Merge "docs: Add a new cells v2 document" 2022-02-02 17:01:13 +00:00
Dmitrii Shcherbakov ab49f97b2c [yoga] Add PCI VPD Capability Handling
This change comes as a part of the "Off-path Networking Backends
Support" spec implementation.

https://review.opendev.org/c/openstack/nova-specs/+/787458

* Add VPD capability parsing support
  * The XML data from libvirt is parsed and formatted into PCI device
    JSON dict that is sent to Nova API and is stored in the extra_info
    column of a PciDevice.

    The code gracefully handles the lack of the capability since it is
    optional or Libvirt may not support it in a particular release.
    https://libvirt.org/news.html#v7-9-0-2021-11-01 (VPD capability
    was added in 7.9.0).
* Pass the serial number to Neutron in port updates
  If a card serial number is present based on the information from PCI
  VPD, pass it to Neutron along with other PCI-related information.
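
For illustration, extracting a card serial number from a VPD capability could look like the sketch below. The XML layout in `SAMPLE_XML` is an assumption modelled loosely on libvirt's node device XML, the values are made up, and `parse_vpd_serial` is a hypothetical helper rather than the function added by this change.

```python
import xml.etree.ElementTree as ET

# Assumed shape of a node device XML fragment with a VPD capability.
SAMPLE_XML = """
<device>
  <name>pci_0000_82_00_0</name>
  <capability type='pci'>
    <capability type='vpd'>
      <name>Example Card</name>
      <fields access='readonly'>
        <serial_number>SN0000000001</serial_number>
      </fields>
    </capability>
  </capability>
</device>
"""


def parse_vpd_serial(device_xml):
    """Return the card serial number, or None if VPD data is absent."""
    root = ET.fromstring(device_xml)
    for cap in root.iter("capability"):
        if cap.get("type") != "vpd":
            continue
        serial = cap.findtext("fields[@access='readonly']/serial_number")
        if serial and serial.strip():
            return serial.strip()
    # The capability is optional, so its absence is not an error.
    return None


def port_update_info(pci_info, device_xml):
    # Add the serial number to the port data only when one was found.
    info = dict(pci_info)
    serial = parse_vpd_serial(device_xml)
    if serial is not None:
        info["card_serial_number"] = serial
    return info
```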

Change-Id: I6445433142286728a8c7efadcf80d07082d60bc3
Implements: blueprint integration-with-off-path-network-backends
2022-02-01 17:31:04 +03:00
Zuul e8feef747f Merge "Deprecate the powervm driver" 2022-01-31 19:35:25 +00:00
Zuul 55566b90aa Merge "Add nova-ovs-hybrid-plug job" 2022-01-31 18:49:51 +00:00
Stephen Finucane 9fe4654273 api: Reject duplicate port IDs in server create
Specifying a duplicate port ID is currently "allowed" but results in an
integrity error when nova attempts to create a duplicate
'VirtualInterface' entry. Start rejecting these requests by checking for
duplicate IDs and rejecting offending requests. This is arguably an API
change because there isn't a HTTP 5xx error (server create is an async
operation), however, users shouldn't have to opt in to non-broken
behavior and the underlying instance was never actually created
previously, meaning automation that relied on this "feature" was always
going to fail in a later step. We're also silently failing to do what
the user asked (per flow chart at [1]).

[1] https://docs.openstack.org/nova/latest/contributor/microversions.html#when-do-i-need-a-new-microversion
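
A rough sketch of the duplicate check, assuming each requested network is a dict that may carry a "port" key. The helper names and the use of ValueError are hypothetical; the actual API layer would return an HTTP error response instead.

```python
from collections import Counter


def find_duplicate_port_ids(requested_networks):
    # Entries without a port ID (e.g. network-only requests) are ignored.
    port_ids = [net["port"] for net in requested_networks if net.get("port")]
    counts = Counter(port_ids)
    return sorted(pid for pid, count in counts.items() if count > 1)


def validate_requested_networks(requested_networks):
    duplicates = find_duplicate_port_ids(requested_networks)
    if duplicates:
        # Reject up front rather than failing later with an integrity
        # error when the duplicate interface entry is created.
        raise ValueError(
            "Duplicate port IDs in request: " + ", ".join(duplicates)
        )
```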

Change-Id: Ie90fb83662dd06e7188f042fc6340596f93c5ef9
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Closes-Bug: #1821088
2022-01-31 13:13:18 +00:00
Artom Lifshitz 0b0f40d1b3 Revert "Revert resize: wait for events according to hybrid plug"
This reverts commit 7a7a223602.

That commit was added because - tl;dr - upon revert resize, Neutron
with the OVS backend and the iptables security group driver would send
us the network-vif-plugged event as soon as we updated the port
binding.

That behaviour has changed with commit 66c7f00e1d. With that commit,
we started unplugging the vifs on the source compute host when doing a
resize. When reverting the resize, the vifs had to be re-plugged again,
regardless of the networking backend in use. This renders commit
7a7a223602 pointless, and it can be
reverted.

Conflicts - most have to do with context around this commit's code:

nova/compute/manager.py

    a2984b647a added provider_mappings to
    _finish_revert_resize_network_migrate_finish()'s signature

    750aef54b1 started using
    _finish_revert_resize_network_migrate_finish() in
    _finish_revert_snapshot_based_resize_at_source()

nova/network/model.py

    8b33ac0644 added get_live_migration_plug_time_events() and
    has_live_migration_plug_time_event()

    7da94440db added has_port_with_allocation()

nova/objects/migration.py

    f203da3838 added is_resize() and is_live_migration()

nova/tests/unit/compute/test_compute.py

    a0e60feb3e added request_spec to the test

nova/tests/unit/compute/test_compute_mgr.py

    be278006a5 added unit tests below ours

nova/tests/unit/network/test_network_info.py

    7da94440db (again) added tests for has_port_with_allocation()

nova/tests/unit/virt/libvirt/test_driver.py and
nova/virt/libvirt/driver.py are different in that attempting to
identify individual conflicts is a pointless exercise, as so much has
changed (mdev, vtpm, the recent wait for events during hard reboot
workaround config option, etc). They can be treated as
manual removal of any code that had to do with the bind-time events
logic (though guided by the conflict markers in git).

TODO(artom) There was a follow up commit,
78a08d44ea, that added the migration
parameter to finish_revert_migration(). This is no longer needed, as
the migration was only used to obtain plug-time events. We'll have to
undo that as well.

Closes-bug: 1952003
Change-Id: I3cb39a9ec2c260f422b3c48122b9db512cdd799b
2022-01-31 13:10:59 +00:00
Artom Lifshitz ded6168ad7 Add nova-ovs-hybrid-plug job
We have a gap in our testing of the external events interaction between
Nova and Neutron. The nova-next job tests with the OVS network
backend, and Neutron has jobs that test the OVN network backend, but
nothing tests OVS + the iptables security group firewall driver, aka
"hybrid plug". Add a job to test that.

Related-bug: 1952003
Change-Id: Ie42eaa2a39ef097b0eb69b8863bb342bae007fff
2022-01-31 13:08:50 +00:00
Stephen Finucane 452913a284 Remove Python 2-specific imports
Change-Id: I64810898cd9126cf619df0b8f60e6fa01958943e
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2022-01-28 12:27:06 +00:00
Stephen Finucane 0396bba4cc requirements: Remove os-xenapi
We no longer have a Xen driver. This is an unnecessary dependency.

Change-Id: Ic298fa9ac4a8935ce4e0dc17d8842d399d4eb808
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2022-01-28 12:26:53 +00:00
Zuul 125a8530cc Merge "libvirt: Create qcow2 disks with the correct size without extending" 2022-01-25 15:12:32 +00:00
Zuul f87a63a46e Merge "Add check job for FIPS" 2022-01-25 01:34:55 +00:00
Zuul 555d859be7 Merge "block_device: Ignore VolumeAttachmentNotFound during detach" 2022-01-25 00:48:05 +00:00
Zuul b9b1b4fa65 Merge "Add service version check workaround for FFU" 2022-01-25 00:37:20 +00:00
Dan Smith 7d2e481589 Add service version check workaround for FFU
We recently added a hard failure to nova service startup for the case
where computes were more than one version old (as indicated by their
service record). This helps to prevent starting up new control
services when a very old compute is still running. However, during an
FFU, control services that have skipped multiple versions will be
started and find the older compute records (which could not be updated
yet due to their reliance on the control services being up) and refuse
to start. This creates a cross-dependency which is not resolvable
without hacking the database.

This patch adds a workaround flag to allow turning that hard fail into
a warning to proceed past the issue. This less-than-ideal solution
is simple and backportable, but perhaps a better solution can be
implemented for the future.
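
The behaviour of such a workaround flag can be sketched as follows. The function, the exception class, and the `disable_check` parameter are hypothetical names chosen for illustration; they are not the option or code introduced by this patch.

```python
import logging

LOG = logging.getLogger(__name__)


class TooOldComputeService(Exception):
    """Raised when a compute service record is older than supported."""


def check_compute_versions(min_found, oldest_supported, disable_check=False):
    """Return True if versions are acceptable, False if only warned."""
    if min_found >= oldest_supported:
        return True
    msg = (
        f"Oldest compute service version {min_found} is below the "
        f"supported minimum {oldest_supported}"
    )
    if disable_check:
        # Workaround enabled: log and continue so that control services
        # can start during a fast-forward upgrade.
        LOG.warning(msg)
        return False
    raise TooOldComputeService(msg)
```

With the flag left at its default, startup still fails hard; setting it only downgrades the failure to a logged warning.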

Related-Bug: #1958883

Change-Id: Iddbc9b2a13f19cea9a996aeadfe891f4ef3b0264
2022-01-24 08:45:58 -08:00
Zuul 96731a499a Merge "conf: Allow cinderclient and os_brick to independently log at DEBUG" 2022-01-23 03:58:10 +00:00
Zuul ae779740c5 Merge "nova-next: Deploy noVNC from source instead of packages" 2022-01-22 02:18:09 +00:00
Zuul 5ad0d0cdbe Merge "Test aborting queued live migration" 2022-01-21 21:08:48 +00:00
Zuul 34f841333b Merge "Update centos 8 py36 functional job nodeset to centos stream 8" 2022-01-21 21:08:39 +00:00
Zuul 52b974acb7 Merge "db: Remove unnecessary warning filters" 2022-01-21 13:44:48 +00:00
Zuul 5c7fa6d139 Merge "db: Remove use of 'bind' arguments" 2022-01-21 13:44:40 +00:00
Zuul bb2984b9c7 Merge "[doc] propose Review-Priority label for contribs" 2022-01-20 17:05:49 +00:00
Zuul 7bb2172317 Merge "Remove deprecated opts from VNC conf" 2022-01-18 22:45:59 +00:00
Zuul 59c3a46887 Merge "libvirt: Add announce-self post live-migration workaround" 2022-01-18 20:03:47 +00:00