Allow enabling PCI scheduling in Placement

A new configuration option [filter_scheduler]pci_in_placement is added
that allows enabling the scheduler logic for PCI device handling in
Placement for flavor based PCI requests.

blueprint: pci-device-tracking-in-placement
Change-Id: I5ddf6d3cdc7e05cc4914b9b1e762fa02a5c7c550
This commit is contained in:
Balazs Gibizer
2022-08-28 19:20:05 +02:00
parent 2cf58d3d1e
commit 2cb1eedeaf
10 changed files with 77 additions and 69 deletions
+22
View File
@@ -65,6 +65,10 @@ capabilities.
:oslo.config:option:`pci.device_spec` configuration that uses the :oslo.config:option:`pci.device_spec` configuration that uses the
``devname`` field. ``devname`` field.
.. versionchanged:: 27.0.0 (2023.1 Antelope):
Nova provides Placement based scheduling support for servers with flavor
based PCI requests. This support is disable by default.
Enabling PCI passthrough Enabling PCI passthrough
------------------------ ------------------------
@@ -442,6 +446,24 @@ removed and VFs from the same PF is configured (or vice versa) then
nova-compute will refuse to start as it would create a situation where both nova-compute will refuse to start as it would create a situation where both
the PF and its VFs are made available for consumption. the PF and its VFs are made available for consumption.
Since nova 27.0.0 (2023.1 Antelope) scheduling and allocation of PCI devices
in Placement can also be enabled via
:oslo.config:option:`filter_scheduler.pci_in_placement`. Please note that this
should only be enabled after all the computes in the system is configured to
report PCI inventory in Placement via
enabling :oslo.config:option:`pci.report_in_placement`. In Antelope flavor
based PCI requests are support but Neutron port base PCI requests are not
handled in Placement.
If you are upgrading from an earlier version with already existing servers with
PCI usage then you must enable :oslo.config:option:`pci.report_in_placement`
first on all your computes having PCI allocations and then restart the
nova-compute service, before you enable
:oslo.config:option:`filter_scheduler.pci_in_placement`. The compute service
will heal the missing PCI allocation in placement during startup and will
continue healing missing allocations for future servers until the scheduling
support is enabled.
If a flavor requests multiple ``type-VF`` devices via If a flavor requests multiple ``type-VF`` devices via
:nova:extra-spec:`pci_passthrough:alias` then it is important to consider the :nova:extra-spec:`pci_passthrough:alias` then it is important to consider the
value of :nova:extra-spec:`group_policy` as well. The value ``none`` value of :nova:extra-spec:`group_policy` as well. The value ``none``
+1 -1
View File
@@ -614,7 +614,7 @@ def update_provider_tree_for_pci(
if updated: if updated:
LOG.debug( LOG.debug(
"Placement PCI view needs allocation healing. This should only " "Placement PCI view needs allocation healing. This should only "
"happen if [scheduler]pci_in_placement is still disabled. " "happen if [filter_scheduler]pci_in_placement is still disabled. "
"Original allocations: %s New allocations: %s", "Original allocations: %s New allocations: %s",
old_alloc, old_alloc,
allocations, allocations,
+4 -2
View File
@@ -79,7 +79,8 @@ Possible Values:
``vendor_id`` and ``product_id`` values of the alias in the form of ``vendor_id`` and ``product_id`` values of the alias in the form of
``CUSTOM_PCI_{vendor_id}_{product_id}``. The ``resource_class`` requested ``CUSTOM_PCI_{vendor_id}_{product_id}``. The ``resource_class`` requested
in the alias is matched against the ``resource_class`` defined in the in the alias is matched against the ``resource_class`` defined in the
``[pci]device_spec``. ``[pci]device_spec``. This field can only be used only if
``[filter_scheduler]pci_in_placement`` is enabled.
``traits`` ``traits``
An optional comma separated list of Placement trait names requested to be An optional comma separated list of Placement trait names requested to be
@@ -91,7 +92,8 @@ Possible Values:
prefixed. The maximum allowed length of a trait name is 255 character prefixed. The maximum allowed length of a trait name is 255 character
including the prefix. Every trait in ``traits`` requested in the alias including the prefix. Every trait in ``traits`` requested in the alias
ensured to be in the list of traits provided in the ``traits`` field of ensured to be in the list of traits provided in the ``traits`` field of
the ``[pci]device_spec`` when scheduling the request. the ``[pci]device_spec`` when scheduling the request. This field can only
be used only if ``[filter_scheduler]pci_in_placement`` is enabled.
* Supports multiple aliases by repeating the option (not by specifying * Supports multiple aliases by repeating the option (not by specifying
a list value):: a list value)::
+20 -1
View File
@@ -745,7 +745,26 @@ Possible values:
Related options: Related options:
* ``[filter_scheduler] aggregate_image_properties_isolation_namespace`` * ``[filter_scheduler] aggregate_image_properties_isolation_namespace``
""")] """),
cfg.BoolOpt(
"pci_in_placement",
default=False,
help="""
Enable scheduling and claiming PCI devices in Placement.
This can be enabled after ``[pci]report_in_placement`` is enabled on all
compute hosts.
When enabled the scheduler queries Placement about the PCI device
availability to select destination for a server with PCI request. The scheduler
also allocates the selected PCI devices in Placement. Note that this logic
does not replace the PCIPassthroughFilter but extends it.
* ``[pci] report_in_placement``
* ``[pci] alias``
* ``[pci] device_spec``
"""),
]
metrics_group = cfg.OptGroup( metrics_group = cfg.OptGroup(
name="metrics", name="metrics",
+3 -9
View File
@@ -22,6 +22,7 @@ from oslo_serialization import jsonutils
from oslo_utils import versionutils from oslo_utils import versionutils
from nova.compute import pci_placement_translator from nova.compute import pci_placement_translator
import nova.conf
from nova.db.api import api as api_db_api from nova.db.api import api as api_db_api
from nova.db.api import models as api_models from nova.db.api import models as api_models
from nova import exception from nova import exception
@@ -30,6 +31,7 @@ from nova.objects import base
from nova.objects import fields from nova.objects import fields
from nova.objects import instance as obj_instance from nova.objects import instance as obj_instance
CONF = nova.conf.CONF
LOG = logging.getLogger(__name__) LOG = logging.getLogger(__name__)
REQUEST_SPEC_OPTIONAL_ATTRS = ['requested_destination', REQUEST_SPEC_OPTIONAL_ATTRS = ['requested_destination',
@@ -487,16 +489,8 @@ class RequestSpec(base.NovaObject):
def _traits_from_request(spec: ty.Dict[str, ty.Any]) -> ty.Set[str]: def _traits_from_request(spec: ty.Dict[str, ty.Any]) -> ty.Set[str]:
return pci_placement_translator.get_traits(spec.get("traits", "")) return pci_placement_translator.get_traits(spec.get("traits", ""))
# This is here temporarily until the PCI placement scheduling is under
# implementation. When that is done there will be a config option
# [scheduler]pci_in_placement to configure this. Now we add this as a
# function to allow tests to selectively enable the WIP feature
@staticmethod
def _pci_in_placement_enabled():
return False
def generate_request_groups_from_pci_requests(self): def generate_request_groups_from_pci_requests(self):
if not self._pci_in_placement_enabled(): if not CONF.filter_scheduler.pci_in_placement:
return False return False
for pci_request in self.pci_requests.requests: for pci_request in self.pci_requests.requests:
+6 -6
View File
@@ -552,7 +552,7 @@ class PciDeviceStats(object):
# by it. This could happen if the instance only has neutron port # by it. This could happen if the instance only has neutron port
# based InstancePCIRequest as that is currently not having # based InstancePCIRequest as that is currently not having
# placement allocation (except for QoS ports, but that handled in a # placement allocation (except for QoS ports, but that handled in a
# separate codepath) or if the [scheduler]pci_in_placement # separate codepath) or if the [filter_scheduler]pci_in_placement
# configuration option is not enabled in the scheduler. # configuration option is not enabled in the scheduler.
return pools return pools
@@ -563,15 +563,15 @@ class PciDeviceStats(object):
# NOTE(gibi): There can be pools without rp_uuid field if the # NOTE(gibi): There can be pools without rp_uuid field if the
# [pci]report_in_placement is not enabled for a compute with # [pci]report_in_placement is not enabled for a compute with
# viable PCI devices. We have a non-empty rp_uuids, so we know # viable PCI devices. We have a non-empty rp_uuids, so we know
# that the [scheduler]pci_in_placement is enabled. This is a # that the [filter_scheduler]pci_in_placement is enabled. This
# configuration error. # is a configuration error.
LOG.warning( LOG.warning(
"The PCI pool %s isn't mapped to an RP UUID but the " "The PCI pool %s isn't mapped to an RP UUID but the "
"scheduler is configured to create PCI allocations in " "scheduler is configured to create PCI allocations in "
"placement. This should not happen. Please enable " "placement. This should not happen. Please enable "
"[pci]report_in_placement on all compute hosts before " "[pci]report_in_placement on all compute hosts before "
"enabling [scheduler]pci_in_placement in the scheduler. " "enabling [filter_scheduler]pci_in_placement in the "
"This pool is ignored now.", pool) "scheduler. This pool is ignored now.", pool)
continue continue
if rp_uuid in rp_uuids: if rp_uuid in rp_uuids:
@@ -809,7 +809,7 @@ class PciDeviceStats(object):
# but the object is hard to change retroactively # but the object is hard to change retroactively
rp_uuids = request.spec[0].get('rp_uuids') rp_uuids = request.spec[0].get('rp_uuids')
if not rp_uuids: if not rp_uuids:
# This can happen if [scheduler]pci_in_placement is not # This can happen if [filter_scheduler]pci_in_placement is not
# enabled yet # enabled yet
# set() will signal that any PCI pool can be used for this # set() will signal that any PCI pool can be used for this
# request # request
@@ -1616,15 +1616,7 @@ class PlacementPCIAllocationHealingTests(PlacementPCIReportingTests):
class RCAndTraitBasedPCIAliasTests(PlacementPCIReportingTests): class RCAndTraitBasedPCIAliasTests(PlacementPCIReportingTests):
def setUp(self): def setUp(self):
super().setUp() super().setUp()
# TODO(gibi): replace this with setting the [scheduler]pci_in_placement self.flags(group='filter_scheduler', pci_in_placement=True)
# confing to True once that config is added
self.mock_pci_in_placement_enabled = self.useFixture(
fixtures.MockPatch(
'nova.objects.request_spec.RequestSpec.'
'_pci_in_placement_enabled',
return_value=True
)
).mock
def test_boot_with_custom_rc_and_traits(self): def test_boot_with_custom_rc_and_traits(self):
# The fake libvirt will emulate on the host: # The fake libvirt will emulate on the host:
@@ -1737,12 +1729,13 @@ class RCAndTraitBasedPCIAliasTests(PlacementPCIReportingTests):
self.assert_no_pci_healing("compute1") self.assert_no_pci_healing("compute1")
def test_device_claim_consistent_with_placement_allocation(self): def test_device_claim_consistent_with_placement_allocation(self):
"""As soon as [scheduler]pci_in_placement is enabled the nova-scheduler """As soon as [filter_scheduler]pci_in_placement is enabled the
will allocate PCI devices in placement. Then on the nova-compute side nova-scheduler will allocate PCI devices in placement. Then on the
the PCI claim will also allocate PCI devices in the nova DB. This test nova-compute side the PCI claim will also allocate PCI devices in the
will create a situation where the two allocation could contradict and nova DB. This test will create a situation where the two allocation
observes that in a contradicting situation the PCI claim will fail could contradict and observes that in a contradicting situation the PCI
instead of allocating a device that is not allocated in placement. claim will fail instead of allocating a device that is not allocated in
placement.
For the contradiction to happen we need two PCI devices that looks For the contradiction to happen we need two PCI devices that looks
different from placement perspective than from the nova DB perspective. different from placement perspective than from the nova DB perspective.
@@ -1952,15 +1952,7 @@ class PCIServersTest(_PCIServersTestBase):
def setUp(self): def setUp(self):
super().setUp() super().setUp()
self.flags(group="pci", report_in_placement=True) self.flags(group="pci", report_in_placement=True)
# TODO(gibi): replace this with setting the [scheduler]pci_prefilter self.flags(group='filter_scheduler', pci_in_placement=True)
# confing to True once that config is added
self.mock_pci_in_placement_enabled = self.useFixture(
fixtures.MockPatch(
'nova.objects.request_spec.RequestSpec.'
'_pci_in_placement_enabled',
return_value=True
)
).mock
def test_create_server_with_pci_dev_and_numa(self): def test_create_server_with_pci_dev_and_numa(self):
"""Verifies that an instance can be booted with cpu pinning and with an """Verifies that an instance can be booted with cpu pinning and with an
@@ -3026,15 +3018,7 @@ class PCIServersWithPreferredNUMATest(_PCIServersTestBase):
def setUp(self): def setUp(self):
super().setUp() super().setUp()
self.flags(group="pci", report_in_placement=True) self.flags(group="pci", report_in_placement=True)
# TODO(gibi): replace this with setting the [scheduler]pci_in_placement self.flags(group='filter_scheduler', pci_in_placement=True)
# confing to True once that config is added
self.mock_pci_in_placement_enabled = self.useFixture(
fixtures.MockPatch(
'nova.objects.request_spec.RequestSpec.'
'_pci_in_placement_enabled',
return_value=True
)
).mock
def test_create_server_with_pci_dev_and_numa(self): def test_create_server_with_pci_dev_and_numa(self):
"""Validate behavior of 'preferred' PCI NUMA policy. """Validate behavior of 'preferred' PCI NUMA policy.
+3 -17
View File
@@ -14,7 +14,6 @@
import collections import collections
from unittest import mock from unittest import mock
import fixtures
from oslo_serialization import jsonutils from oslo_serialization import jsonutils
from oslo_utils.fixture import uuidsentinel as uuids from oslo_utils.fixture import uuidsentinel as uuids
from oslo_utils import uuidutils from oslo_utils import uuidutils
@@ -431,13 +430,8 @@ class _TestRequestSpecObject(object):
self.assertListEqual([rg], spec.requested_resources) self.assertListEqual([rg], spec.requested_resources)
self.assertEqual(req_lvl_params, spec.request_level_params) self.assertEqual(req_lvl_params, spec.request_level_params)
# TODO(gibi): replace this with setting the config
# [scheduler]pci_in_placement=True once that flag is available
@mock.patch(
'nova.objects.request_spec.RequestSpec._pci_in_placement_enabled',
new=mock.Mock(return_value=True),
)
def test_from_components_flavor_based_pci_requests(self): def test_from_components_flavor_based_pci_requests(self):
self.flags(group='filter_scheduler', pci_in_placement=True)
ctxt = context.RequestContext( ctxt = context.RequestContext(
fakes.FAKE_USER_ID, fakes.FAKE_PROJECT_ID fakes.FAKE_USER_ID, fakes.FAKE_PROJECT_ID
) )
@@ -1119,18 +1113,10 @@ class TestRemoteRequestSpecObject(test_objects._RemoteTest,
class TestInstancePCIRequestToRequestGroups(test.NoDBTestCase): class TestInstancePCIRequestToRequestGroups(test.NoDBTestCase):
def setUp(self): def setUp(self):
super().setUp() super().setUp()
# TODO(gibi): replace this with setting the config self.flags(group='filter_scheduler', pci_in_placement=True)
# [scheduler]pci_in_placement=True once that flag is available
self.mock_pci_in_placement_enabled = self.useFixture(
fixtures.MockPatch(
"nova.objects.request_spec.RequestSpec."
"_pci_in_placement_enabled",
return_value=True,
)
).mock
def test_pci_reqs_ignored_if_disabled(self): def test_pci_reqs_ignored_if_disabled(self):
self.mock_pci_in_placement_enabled.return_value = False self.flags(group='filter_scheduler', pci_in_placement=False)
spec = request_spec.RequestSpec( spec = request_spec.RequestSpec(
requested_resources=[], requested_resources=[],
@@ -0,0 +1,8 @@
---
features:
- |
Since 26.0.0 (Zed) Nova supports tracking PCI devices in Placement. Now
Nova also supports scheduling flavor based PCI device requests via
Placement. This support is disable by default. Please read
`documentation <https://docs.openstack.org/nova/latest/admin/pci-passthrough.html#pci-tracking-in-placement>`_
for more details on what is supported how this feature can be enabled.