317 lines
15 KiB
ReStructuredText
317 lines
15 KiB
ReStructuredText
=======================
|
|
Profiling With Eventlet
|
|
=======================
|
|
|
|
When performance of one of the Nova services is worse than expected, and other
|
|
sorts of analysis do not lead to candidate fixes, profiling is an excellent
|
|
tool for producing detailed analysis of what methods in the code are called the
|
|
most and which consume the most time.
|
|
|
|
Because most Nova services use eventlet_, the standard profiling tool provided
|
|
with Python, cProfile_, will not work. Something is required to keep track of
|
|
changing tasks. Thankfully eventlet comes with
|
|
``eventlet.green.profile.Profile``, a mostly undocumented class that provides a
|
|
similar (but not identical) API to the one provided by Python's ``Profile``
|
|
while outputting the same format.
|
|
|
|
.. note:: The eventlet Profile outputs the ``prof`` format produced by
|
|
``profile``, which is not the same as that output by ``cProfile``.
|
|
Some analysis tools (for example, SnakeViz_) only read the latter
|
|
so the options for analyzing eventlet profiling are not always
|
|
deluxe (see below).
|
|
|
|
Setup
|
|
=====
|
|
|
|
This guide assumes the Nova service being profiled is running devstack, but
|
|
that is not necessary. What is necessary is that the code associated with the
|
|
service can be changed and the service restarted, in place.
|
|
|
|
Profiling the entire service will produce mostly noise and the output will be
|
|
confusing because different tasks will operate during the profile run. It is
|
|
better to begin the process with a candidate task or method *within* the
|
|
service that can be associated with an identifier. For example,
|
|
``select_destinations`` in the ``FilterScheduler`` can be associated with the
|
|
list of ``instance_uuids`` passed to it and it runs only once for that set of
|
|
instance uuids.
|
|
|
|
The process for profiling is:
|
|
|
|
#. Identify the method to be profiled.
|
|
|
|
#. Populate the environment with sufficient resources to exercise the code. For
|
|
example you may wish to use the FakeVirtDriver_ to have nova aware of
|
|
multiple ``nova-compute`` processes. Or you may wish to launch many
|
|
instances if you are evaluating a method that loops over instances.
|
|
|
|
#. At the start of that method, change the code to instantiate a ``Profile``
|
|
object and ``start()`` it.
|
|
|
|
#. At the end of that method, change the code to ``stop()`` profiling and write
|
|
the data (with ``dump_stats()``) to a reasonable location.
|
|
|
|
#. Restart the service.
|
|
|
|
#. Cause the method being evaluated to run.
|
|
|
|
#. Analyze the profile data with the pstats_ module.
|
|
|
|
.. note:: ``stop()`` and ``start()`` are two of the ways in which the eventlet
|
|
``Profile`` API differs from the stdlib. There the methods are
|
|
``enable()`` and ``disable()``.
|
|
|
|
Example
|
|
=======
|
|
|
|
For this example we will analyze ``select_destinations`` in the
|
|
``FilterScheduler``. A known problem is that it does excessive work when
|
|
presented with too many candidate results from the Placement service. We'd like
|
|
to know why.
|
|
|
|
We'll configure and run devstack_ with FakeVirtDriver_ so there are several
|
|
candidate hypervisors (the following ``local.conf`` is also useful for other
|
|
profiling and benchmarking scenarios so not all changes are relevant here):
|
|
|
|
.. code-block:: ini
|
|
|
|
[[local|localrc]]
|
|
ADMIN_PASSWORD=secret
|
|
DATABASE_PASSWORD=$ADMIN_PASSWORD
|
|
RABBIT_PASSWORD=$ADMIN_PASSWORD
|
|
SERVICE_PASSWORD=$ADMIN_PASSWORD
|
|
VIRT_DRIVER=fake
|
|
# You may use different numbers of fake computes, but be careful: 100 will
|
|
# completely overwhelm a 16GB, 16VPCU server. In the test profiles below a
|
|
# value of 50 was used, on a 16GB, 16VCPU server.
|
|
NUMBER_FAKE_NOVA_COMPUTE=25
|
|
disable_service cinder
|
|
disable_service horizon
|
|
disable_service dstat
|
|
disable_service tempest
|
|
|
|
[[post-config|$NOVA_CONF]]
|
|
rpc_response_timeout = 300
|
|
|
|
# Disable filtering entirely. For some profiling this will not be what you
|
|
# want.
|
|
[filter_scheduler]
|
|
enabled_filters = '""'
|
|
# Send only one type of notifications to avoid notification overhead.
|
|
[notifications]
|
|
notification_format = unversioned
|
|
|
|
Change the code in ``nova/scheduler/filter_scheduler.py`` as follows:
|
|
|
|
.. code-block:: diff
|
|
|
|
|
|
diff --git a/nova/scheduler/filter_scheduler.py b/nova/scheduler/filter_scheduler.py
|
|
index 672f23077e..cb0f87fe48 100644
|
|
--- a/nova/scheduler/filter_scheduler.py
|
|
+++ b/nova/scheduler/filter_scheduler.py
|
|
@@ -49,92 +49,99 @@ class FilterScheduler(driver.Scheduler):
|
|
def select_destinations(self, context, spec_obj, instance_uuids,
|
|
alloc_reqs_by_rp_uuid, provider_summaries,
|
|
allocation_request_version=None, return_alternates=False):
|
|
"""Returns a list of lists of Selection objects, which represent the
|
|
hosts and (optionally) alternates for each instance.
|
|
|
|
:param context: The RequestContext object
|
|
:param spec_obj: The RequestSpec object
|
|
:param instance_uuids: List of UUIDs, one for each value of the spec
|
|
object's num_instances attribute
|
|
:param alloc_reqs_by_rp_uuid: Optional dict, keyed by resource provider
|
|
UUID, of the allocation_requests that may
|
|
be used to claim resources against
|
|
matched hosts. If None, indicates either
|
|
the placement API wasn't reachable or
|
|
that there were no allocation_requests
|
|
returned by the placement API. If the
|
|
latter, the provider_summaries will be an
|
|
empty dict, not None.
|
|
:param provider_summaries: Optional dict, keyed by resource provider
|
|
UUID, of information that will be used by
|
|
the filters/weighers in selecting matching
|
|
hosts for a request. If None, indicates that
|
|
the scheduler driver should grab all compute
|
|
node information locally and that the
|
|
Placement API is not used. If an empty dict,
|
|
indicates the Placement API returned no
|
|
potential matches for the requested
|
|
resources.
|
|
:param allocation_request_version: The microversion used to request the
|
|
allocations.
|
|
:param return_alternates: When True, zero or more alternate hosts are
|
|
returned with each selected host. The number
|
|
of alternates is determined by the
|
|
configuration option
|
|
`CONF.scheduler.max_attempts`.
|
|
"""
|
|
+ from eventlet.green import profile
|
|
+ pr = profile.Profile()
|
|
+ pr.start()
|
|
+
|
|
self.notifier.info(
|
|
context, 'scheduler.select_destinations.start',
|
|
dict(request_spec=spec_obj.to_legacy_request_spec_dict()))
|
|
compute_utils.notify_about_scheduler_action(
|
|
context=context, request_spec=spec_obj,
|
|
action=fields_obj.NotificationAction.SELECT_DESTINATIONS,
|
|
phase=fields_obj.NotificationPhase.START)
|
|
|
|
host_selections = self._schedule(context, spec_obj, instance_uuids,
|
|
alloc_reqs_by_rp_uuid, provider_summaries,
|
|
allocation_request_version, return_alternates)
|
|
self.notifier.info(
|
|
context, 'scheduler.select_destinations.end',
|
|
dict(request_spec=spec_obj.to_legacy_request_spec_dict()))
|
|
compute_utils.notify_about_scheduler_action(
|
|
context=context, request_spec=spec_obj,
|
|
action=fields_obj.NotificationAction.SELECT_DESTINATIONS,
|
|
phase=fields_obj.NotificationPhase.END)
|
|
+ pr.stop()
|
|
+ pr.dump_stats('/tmp/select_destinations/%s.prof' % ':'.join(instance_uuids))
|
|
+
|
|
return host_selections
|
|
|
|
Make a ``/tmp/select_destinations`` directory that is writable by the user
|
|
nova-scheduler will run as. This is where the profile output will go.
|
|
|
|
Restart the scheduler service. Note that ``systemctl restart`` may not kill
|
|
things sufficiently dead, so::
|
|
|
|
sudo systemctl stop devstack@n-sch
|
|
sleep 5
|
|
sudo systemctl start devstack@n-sch
|
|
|
|
Create a server (which will call ``select_destinations``)::
|
|
|
|
openstack server create --image cirros-0.4.0-x86_64-disk --flavor c1 x1
|
|
|
|
In ``/tmp/select_destinations`` there should be a file with a name using the
|
|
uuid of the created server with a ``.prof`` extension.
|
|
|
|
Change to that directory and view the profile using the pstats
|
|
`interactive mode`_::
|
|
|
|
python3 -m pstats ef044142-f3b8-409d-9af6-c60cea39b273.prof
|
|
|
|
.. note:: The major version of python used to analyze the profile data must be
|
|
the same as the version used to run the process being profiled.
|
|
|
|
Sort stats by their cumulative time::
|
|
|
|
ef044142-f3b8-409d-9af6-c60cea39b273.prof% sort cumtime
|
|
ef044142-f3b8-409d-9af6-c60cea39b273.prof% stats 10
|
|
Tue Aug 6 17:17:56 2019 ef044142-f3b8-409d-9af6-c60cea39b273.prof
|
|
|
|
603477 function calls (587772 primitive calls) in 2.294 seconds
|
|
|
|
Ordered by: cumulative time
|
|
List reduced from 2484 to 10 due to restriction <10>
|
|
|
|
ncalls tottime percall cumtime percall filename:lineno(function)
|
|
1 0.000 0.000 1.957 1.957 profile:0(start)
|
|
1 0.000 0.000 1.911 1.911 /mnt/share/opt/stack/nova/nova/scheduler/filter_scheduler.py:113(_schedule)
|
|
1 0.000 0.000 1.834 1.834 /mnt/share/opt/stack/nova/nova/scheduler/filter_scheduler.py:485(_get_all_host_states)
|
|
1 0.000 0.000 1.834 1.834 /mnt/share/opt/stack/nova/nova/scheduler/host_manager.py:757(get_host_states_by_uuids)
|
|
1 0.004 0.004 1.818 1.818 /mnt/share/opt/stack/nova/nova/scheduler/host_manager.py:777(_get_host_states)
|
|
104/103 0.001 0.000 1.409 0.014 /usr/local/lib/python3.6/dist-packages/oslo_versionedobjects/base.py:170(wrapper)
|
|
50 0.001 0.000 1.290 0.026 /mnt/share/opt/stack/nova/nova/scheduler/host_manager.py:836(_get_instance_info)
|
|
50 0.001 0.000 1.289 0.026 /mnt/share/opt/stack/nova/nova/scheduler/host_manager.py:820(_get_instances_by_host)
|
|
103 0.001 0.000 0.890 0.009 /usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/query.py:3325(__iter__)
|
|
50 0.001 0.000 0.776 0.016 /mnt/share/opt/stack/nova/nova/objects/host_mapping.py:99(get_by_host)
|
|
|
|
From this we can make a couple of useful inferences about ``get_by_host``:
|
|
|
|
* It is called once for each of the 50 ``FakeVirtDriver`` hypervisors
|
|
configured for these tests.
|
|
|
|
* It (and the methods it calls internally) consumes about 40% of the entire
|
|
time spent running (``0.776 / 1.957``) the ``select_destinations`` method
|
|
(indicated by ``profile:0(start)``, above).
|
|
|
|
Several other sort modes can be used. List those that are available by entering
|
|
``sort`` without arguments.
|
|
|
|
Caveats
|
|
=======
|
|
|
|
Real world use indicates that the eventlet profiler is not perfect. There are
|
|
situations where it will not always track switches between greenlets as well as
|
|
it could. This can result in profile data that does not make sense or random
|
|
slowdowns in the system being profiled. There is no one size fits all solution
|
|
to these issues; profiling eventlet services is more an art than science.
|
|
However, this section tries to provide a (hopefully) growing body of advice on
|
|
what to do to work around problems.
|
|
|
|
General Advice
|
|
--------------
|
|
|
|
* Try to profile chunks of code that operate mostly within one module or class
|
|
and do not have many collaborators. The more convoluted the path through
|
|
the code, the more confused the profiler gets.
|
|
|
|
* Similarly, where possible avoid profiling code that will trigger many
|
|
greenlet context switches; either specific spawns, or multiple types of I/O.
|
|
Instead, narrow the focus of the profiler.
|
|
|
|
* If possible, avoid RPC.
|
|
|
|
In nova-compute
|
|
---------------
|
|
|
|
The creation of this caveat section was inspired by issues experienced while
|
|
profiling ``nova-compute``. The ``nova-compute`` process is not allowed to
|
|
speak with a database server directly. Instead communication is mediated
|
|
through the conductor, communication happening via ``oslo.versionedobjects``
|
|
and remote calls. Profiling methods such as ``update_available_resource`` in
|
|
the ResourceTracker, which needs information from the database, results in
|
|
profile data that can be analyzed but is incorrect and misleading.
|
|
|
|
This can be worked around by temporarily changing ``nova-compute`` to allow it
|
|
to speak to the database directly:
|
|
|
|
.. code-block:: diff
|
|
|
|
diff --git a/nova/cmd/compute.py b/nova/cmd/compute.py
|
|
index 01fd20de2e..655d503158 100644
|
|
--- a/nova/cmd/compute.py
|
|
+++ b/nova/cmd/compute.py
|
|
@@ -50,8 +50,10 @@ def main():
|
|
|
|
gmr.TextGuruMeditation.setup_autorun(version, conf=CONF)
|
|
|
|
- cmd_common.block_db_access('nova-compute')
|
|
- objects_base.NovaObject.indirection_api = conductor_rpcapi.ConductorAPI()
|
|
+ # Temporarily allow access to the database. You must update the config file
|
|
+ # used by this process to set [database]/connection to the cell1 database.
|
|
+ # cmd_common.block_db_access('nova-compute')
|
|
+ # objects_base.NovaObject.indirection_api = conductor_rpcapi.ConductorAPI()
|
|
objects.Service.enable_min_version_cache()
|
|
server = service.Service.create(binary='nova-compute',
|
|
topic=compute_rpcapi.RPC_TOPIC)
|
|
|
|
The configuration file used by the ``nova-compute`` process must also be
|
|
updated to ensure that it contains a setting for the relevant database:
|
|
|
|
.. code-block:: ini
|
|
|
|
[database]
|
|
connection = mysql+pymysql://root:secret@127.0.0.1/nova_cell1?charset=utf8
|
|
|
|
In a single node devstack setup ``nova_cell1`` is the right choice. The
|
|
connection string will vary in other setups.
|
|
|
|
Once these changes are made, along with the profiler changes indicated in the
|
|
example above, ``nova-compute`` can be restarted and with luck some useful
|
|
profiling data will emerge.
|
|
|
|
.. _eventlet: https://eventlet.net/
|
|
.. _cProfile: https://docs.python.org/3/library/profile.html
|
|
.. _SnakeViz: https://jiffyclub.github.io/snakeviz/
|
|
.. _devstack: https://docs.openstack.org/devstack/latest/
|
|
.. _FakeVirtDriver: https://docs.openstack.org/devstack/latest/guides/nova.html#fake-virt-driver
|
|
.. _pstats: https://docs.python.org/3/library/profile.html#pstats.Stats
|
|
.. _interactive mode: https://www.stefaanlippens.net/python_profiling_with_pstats_interactive_mode/
|