Allow enabling PCI scheduling in Placement

A new configuration option [filter_scheduler]pci_in_placement is added that allows enabling the scheduler logic for PCI device handling in Placement for flavor based PCI requests. blueprint: pci-device-tracking-in-placement Change-Id: I5ddf6d3cdc7e05cc4914b9b1e762fa02a5c7c550
2022-08-28 19:20:05 +02:00
parent 2cf58d3d1e
commit 2cb1eedeaf
10 changed files with 77 additions and 69 deletions
@@ -65,6 +65,10 @@ capabilities.
  :oslo.config:option:`pci.device_spec` configuration that uses the
  ``devname`` field.
 .. versionchanged:: 27.0.0 (2023.1 Antelope):
   Nova provides Placement based scheduling support for servers with flavor
   based PCI requests. This support is disable by default.
 Enabling PCI passthrough
 ------------------------
@@ -442,6 +446,24 @@ removed and VFs from the same PF is configured (or vice versa) then
 nova-compute will refuse to start as it would create a situation where both
 the PF and its VFs are made available for consumption.
 Since nova 27.0.0 (2023.1 Antelope) scheduling and allocation of PCI devices
 in Placement can also be enabled via
 :oslo.config:option:`filter_scheduler.pci_in_placement`. Please note that this
 should only be enabled after all the computes in the system is configured to
 report PCI inventory in Placement via
 enabling :oslo.config:option:`pci.report_in_placement`. In Antelope flavor
 based PCI requests are support but Neutron port base PCI requests are not
 handled in Placement.
 If you are upgrading from an earlier version with already existing servers with
 PCI usage then you must enable :oslo.config:option:`pci.report_in_placement`
 first on all your computes having PCI allocations and then restart the
 nova-compute service, before you enable
 :oslo.config:option:`filter_scheduler.pci_in_placement`. The compute service
 will heal the missing PCI allocation in placement during startup and will
 continue healing missing allocations for future servers until the scheduling
 support is enabled.
 If a flavor requests multiple ``type-VF`` devices via
 :nova:extra-spec:`pci_passthrough:alias` then it is important to consider the
 value of :nova:extra-spec:`group_policy` as well. The value ``none``
@@ -614,7 +614,7 @@ def update_provider_tree_for_pci(
    if updated:
        LOG.debug(
            "Placement PCI view needs allocation healing. This should only "
-            "happen if [scheduler]pci_in_placement is still disabled. "
+            "happen if [filter_scheduler]pci_in_placement is still disabled. "
            "Original allocations: %s New allocations: %s",
            old_alloc,
            allocations,
@@ -79,7 +79,8 @@ Possible Values:
    ``vendor_id`` and ``product_id`` values of the alias in the form of
    ``CUSTOM_PCI_{vendor_id}_{product_id}``. The ``resource_class`` requested
    in the alias is matched against the ``resource_class`` defined in the
-    ``[pci]device_spec``.
+    ``[pci]device_spec``. This field can only be used only if
    ``[filter_scheduler]pci_in_placement`` is enabled.
  ``traits``
    An optional comma separated list of Placement trait names requested to be
@@ -91,7 +92,8 @@ Possible Values:
    prefixed. The maximum allowed length of a trait name is 255 character
    including the prefix. Every trait in ``traits`` requested in the alias
    ensured to be in the list of traits provided in the ``traits`` field of
-    the ``[pci]device_spec`` when scheduling the request.
+    the ``[pci]device_spec`` when scheduling the request. This field can only
    be used only if ``[filter_scheduler]pci_in_placement`` is enabled.
 * Supports multiple aliases by repeating the option (not by specifying
  a list value)::
@@ -745,7 +745,26 @@ Possible values:
 Related options:
 * ``[filter_scheduler] aggregate_image_properties_isolation_namespace``
-""")]
+"""),
    cfg.BoolOpt(
        "pci_in_placement",
        default=False,
        help="""
 Enable scheduling and claiming PCI devices in Placement.
 This can be enabled after ``[pci]report_in_placement`` is enabled on all
 compute hosts.
 When enabled the scheduler queries Placement about the PCI device
 availability to select destination for a server with PCI request. The scheduler
 also allocates the selected PCI devices in Placement. Note that this logic
 does not replace the PCIPassthroughFilter but extends it.
 * ``[pci] report_in_placement``
 * ``[pci] alias``
 * ``[pci] device_spec``
 """),
 ]
 metrics_group = cfg.OptGroup(
    name="metrics",
@@ -22,6 +22,7 @@ from oslo_serialization import jsonutils
 from oslo_utils import versionutils
 from nova.compute import pci_placement_translator
 import nova.conf
 from nova.db.api import api as api_db_api
 from nova.db.api import models as api_models
 from nova import exception
@@ -30,6 +31,7 @@ from nova.objects import base
 from nova.objects import fields
 from nova.objects import instance as obj_instance
 CONF = nova.conf.CONF
 LOG = logging.getLogger(__name__)
 REQUEST_SPEC_OPTIONAL_ATTRS = ['requested_destination',
@@ -487,16 +489,8 @@ class RequestSpec(base.NovaObject):
    def _traits_from_request(spec: ty.Dict[str, ty.Any]) -> ty.Set[str]:
        return pci_placement_translator.get_traits(spec.get("traits", ""))
    # This is here temporarily until the PCI placement scheduling is under
    # implementation. When that is done there will be a config option
    # [scheduler]pci_in_placement to configure this. Now we add this as a
    # function to allow tests to selectively enable the WIP feature
    @staticmethod
    def _pci_in_placement_enabled():
        return False
    def generate_request_groups_from_pci_requests(self):
-        if not self._pci_in_placement_enabled():
+        if not CONF.filter_scheduler.pci_in_placement:
            return False
        for pci_request in self.pci_requests.requests:
@@ -552,7 +552,7 @@ class PciDeviceStats(object):
            # by it. This could happen if the instance only has neutron port
            # based InstancePCIRequest as that is currently not having
            # placement allocation (except for QoS ports, but that handled in a
-            # separate codepath) or if the [scheduler]pci_in_placement
+            # separate codepath) or if the [filter_scheduler]pci_in_placement
            # configuration option is not enabled in the scheduler.
            return pools
@@ -563,15 +563,15 @@ class PciDeviceStats(object):
                # NOTE(gibi): There can be pools without rp_uuid field if the
                # [pci]report_in_placement is not enabled for a compute with
                # viable PCI devices. We have a non-empty rp_uuids, so we know
-                # that the [scheduler]pci_in_placement is enabled. This is a
+                # that the [filter_scheduler]pci_in_placement is enabled. This
-                # configuration error.
+                # is a configuration error.
                LOG.warning(
                    "The PCI pool %s isn't mapped to an RP UUID but the "
                    "scheduler is configured to create PCI allocations in "
                    "placement. This should not happen. Please enable "
                    "[pci]report_in_placement on all compute hosts before "
-                    "enabling [scheduler]pci_in_placement in the scheduler. "
+                    "enabling [filter_scheduler]pci_in_placement in the "
-                    "This pool is ignored now.", pool)
+                    "scheduler. This pool is ignored now.", pool)
                continue
            if rp_uuid in rp_uuids:
@@ -809,7 +809,7 @@ class PciDeviceStats(object):
            # but the object is hard to change retroactively
            rp_uuids = request.spec[0].get('rp_uuids')
            if not rp_uuids:
-                # This can happen if [scheduler]pci_in_placement is not
+                # This can happen if [filter_scheduler]pci_in_placement is not
                # enabled yet
                # set() will signal that any PCI pool can be used for this
                # request
@@ -1616,15 +1616,7 @@ class PlacementPCIAllocationHealingTests(PlacementPCIReportingTests):
 class RCAndTraitBasedPCIAliasTests(PlacementPCIReportingTests):
    def setUp(self):
        super().setUp()
-        # TODO(gibi): replace this with setting the [scheduler]pci_in_placement
+        self.flags(group='filter_scheduler', pci_in_placement=True)
        # confing to True once that config is added
        self.mock_pci_in_placement_enabled = self.useFixture(
            fixtures.MockPatch(
                'nova.objects.request_spec.RequestSpec.'
                '_pci_in_placement_enabled',
                return_value=True
            )
        ).mock
    def test_boot_with_custom_rc_and_traits(self):
        # The fake libvirt will emulate on the host:
@@ -1737,12 +1729,13 @@ class RCAndTraitBasedPCIAliasTests(PlacementPCIReportingTests):
        self.assert_no_pci_healing("compute1")
    def test_device_claim_consistent_with_placement_allocation(self):
-        """As soon as [scheduler]pci_in_placement is enabled the nova-scheduler
+        """As soon as [filter_scheduler]pci_in_placement is enabled the
-        will allocate PCI devices in placement. Then on the nova-compute side
+        nova-scheduler will allocate PCI devices in placement. Then on the
-        the PCI claim will also allocate PCI devices in the nova DB. This test
+        nova-compute side the PCI claim will also allocate PCI devices in the
-        will create a situation where the two allocation could contradict and
+        nova DB. This test will create a situation where the two allocation
-        observes that in a contradicting situation the PCI claim will fail
+        could contradict and observes that in a contradicting situation the PCI
-        instead of allocating a device that is not allocated in placement.
+        claim will fail instead of allocating a device that is not allocated in
        placement.
        For the contradiction to happen we need two PCI devices that looks
        different from placement perspective than from the nova DB perspective.
@@ -1952,15 +1952,7 @@ class PCIServersTest(_PCIServersTestBase):
    def setUp(self):
        super().setUp()
        self.flags(group="pci", report_in_placement=True)
-        # TODO(gibi): replace this with setting the [scheduler]pci_prefilter
+        self.flags(group='filter_scheduler', pci_in_placement=True)
        # confing to True once that config is added
        self.mock_pci_in_placement_enabled = self.useFixture(
            fixtures.MockPatch(
                'nova.objects.request_spec.RequestSpec.'
                '_pci_in_placement_enabled',
                return_value=True
            )
        ).mock
    def test_create_server_with_pci_dev_and_numa(self):
        """Verifies that an instance can be booted with cpu pinning and with an
@@ -3026,15 +3018,7 @@ class PCIServersWithPreferredNUMATest(_PCIServersTestBase):
    def setUp(self):
        super().setUp()
        self.flags(group="pci", report_in_placement=True)
-        # TODO(gibi): replace this with setting the [scheduler]pci_in_placement
+        self.flags(group='filter_scheduler', pci_in_placement=True)
        # confing to True once that config is added
        self.mock_pci_in_placement_enabled = self.useFixture(
            fixtures.MockPatch(
                'nova.objects.request_spec.RequestSpec.'
                '_pci_in_placement_enabled',
                return_value=True
            )
        ).mock
    def test_create_server_with_pci_dev_and_numa(self):
        """Validate behavior of 'preferred' PCI NUMA policy.
@@ -14,7 +14,6 @@
 import collections
 from unittest import mock
 import fixtures
 from oslo_serialization import jsonutils
 from oslo_utils.fixture import uuidsentinel as uuids
 from oslo_utils import uuidutils
@@ -431,13 +430,8 @@ class _TestRequestSpecObject(object):
        self.assertListEqual([rg], spec.requested_resources)
        self.assertEqual(req_lvl_params, spec.request_level_params)
    # TODO(gibi): replace this with setting the config
    # [scheduler]pci_in_placement=True once that flag is available
    @mock.patch(
        'nova.objects.request_spec.RequestSpec._pci_in_placement_enabled',
        new=mock.Mock(return_value=True),
    )
    def test_from_components_flavor_based_pci_requests(self):
        self.flags(group='filter_scheduler', pci_in_placement=True)
        ctxt = context.RequestContext(
            fakes.FAKE_USER_ID, fakes.FAKE_PROJECT_ID
        )
@@ -1119,18 +1113,10 @@ class TestRemoteRequestSpecObject(test_objects._RemoteTest,
 class TestInstancePCIRequestToRequestGroups(test.NoDBTestCase):
    def setUp(self):
        super().setUp()
-        # TODO(gibi): replace this with setting the config
+        self.flags(group='filter_scheduler', pci_in_placement=True)
        # [scheduler]pci_in_placement=True once that flag is available
        self.mock_pci_in_placement_enabled = self.useFixture(
            fixtures.MockPatch(
                "nova.objects.request_spec.RequestSpec."
                "_pci_in_placement_enabled",
                return_value=True,
            )
        ).mock
    def test_pci_reqs_ignored_if_disabled(self):
-        self.mock_pci_in_placement_enabled.return_value = False
+        self.flags(group='filter_scheduler', pci_in_placement=False)
        spec = request_spec.RequestSpec(
            requested_resources=[],
@@ -0,0 +1,8 @@
 ---
 features:
  - |
    Since 26.0.0 (Zed) Nova supports tracking PCI devices in Placement. Now
    Nova also supports scheduling flavor based PCI device requests via
    Placement. This support is disable by default. Please read
    `documentation <https://docs.openstack.org/nova/latest/admin/pci-passthrough.html#pci-tracking-in-placement>`_
    for more details on what is supported how this feature can be enabled.