54694 Commits

Author SHA1 Message Date
Zuul ffa85a9263 Merge "Add 'resource_request' to neutronv2/constants" 2019-07-22 21:59:56 +00:00
Zuul 438824b50a Merge "Use neutron contants in cmd/manage.py" 2019-07-22 21:59:48 +00:00
Zuul 5d8c8ab628 Merge "Move consts from neutronv2/api to constants module" 2019-07-22 21:59:38 +00:00
Zuul 8fc20874b8 Merge "nova-manage: heal port allocations" 2019-07-22 21:59:30 +00:00
Zuul c9bc00b364 Merge "Remove Newton-era min compute checks for server create with device tags" 2019-07-22 21:46:28 +00:00
Zuul d5c67a3d95 Merge "libvirt: move checking CONF.my_ip to init_host()" 2019-07-22 17:16:08 +00:00
Matt Riedemann e75fc2af93 Remove Newton-era min compute checks for server create with device tags
Compute service version 14 is from Newton [1] so we can remove
the API min compute version checks for creating a server with
bdm and vif device tags. Note that the docstrings which are
removed mentioned microversion support but that was wrong - the
request schema validation is what validates the microversion used.

[1] I8367f740d6d4ebaeb81bc74c6a82a8faf5cd8308

Change-Id: I97e67fb971b7a0cc2373b558907c7354646cf5fa
2019-07-22 10:14:57 -05:00
Zuul 063ef486e9 Merge "Exit 1 when db sync runs before api_db sync" 2019-07-20 03:26:41 +00:00
Zuul 4eb32cb5f0 Merge "Optimize SchedulerReportClient.delete_resource_provider" 2019-07-20 02:17:33 +00:00
Artom Lifshitz 30d8159d4e libvirt: move checking CONF.my_ip to init_host()
Migrations use the libvirt driver's get_host_ip_addr() method to
determine the dest_host field of the migration object.
get_host_ip_addr() checks whether CONF.my_ip is actually assigned to
one of the host's interfaces. It does so by calling
get_machine_ips(), which iterates over all of the host's interfaces.
If the host has many interfaces, this can take a long time, and
introduces needless delays in processing the migration.
get_machine_ips() is only used to print a warning, so this patch moves
the get_machine_ips() call to a single method in init_host(). This
way, a warning is still emitted at compute service startup, and
migration progress is not needlessly slowed down.

This patch also has a chicken and egg problem with the patch on top of
it, which poisons use of netifaces.interfaces() in tests. While this
patch fixes all the tests that break with that poison, it starts
breaking different tests because of the move of get_machine_ips() into
init_host(). Therefore, while not directly related to the bug, this
patch also preventatively mocks or stubs out any use of
get_machine_ips() that will get poisoned with the subsequent patch.

Closes-bug: 1837075
Change-Id: I58a4038b04d5a9c28927d914e71609e4deea3d9f
2019-07-19 15:22:43 -04:00
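
The idea in the commit above (run the slow interface scan once at startup rather than on every migration) can be illustrated with a minimal sketch. The class name `SketchDriver` and the injected `list_machine_ips` callable are hypothetical stand-ins, not nova's actual driver code.

```python
import logging

LOG = logging.getLogger(__name__)


class SketchDriver:
    """Hypothetical stand-in for a virt driver; not nova's real class."""

    def __init__(self, my_ip, list_machine_ips):
        self.my_ip = my_ip
        # Callable that enumerates every IP on the host; assumed to be slow
        # when the host has many interfaces.
        self._list_machine_ips = list_machine_ips

    def init_host(self):
        # Run the expensive scan once, at service startup, only to warn.
        if self.my_ip not in self._list_machine_ips():
            LOG.warning("my_ip address (%s) was not found on any of the "
                        "host's interfaces", self.my_ip)

    def get_host_ip_addr(self):
        # Called for every migration; no interface scan happens here.
        return self.my_ip
```

With this shape, repeated calls to `get_host_ip_addr()` never trigger the scan, while the warning is still emitted once when the service starts.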
zhangyangyang d29d1d1a9e Bump the openstackdocstheme extension to 1.20
Some options are now automatically configured by the version 1.20:
- project
- html_last_updated_fmt
- latex_engine
- latex_elements
- version
- release.

Change-Id: I3a5c7e115d0c4f52b015d0d55eb09c9836cd2fe7
2019-07-19 18:30:16 +08:00
Stephen Finucane 102bc41f90 bindep: Remove dead markers
Ubuntu 12.04 is rather long in the tooth now. Remove the bindep markers
for it.

Change-Id: Ie5c2d7ab1e3e637a1d42712e22a7a6e6d6427020
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2019-07-18 11:27:13 +01:00
Zuul 15e082a2a8 Merge "vif: Remove dead minimum libvirt checks" 2019-07-17 20:00:26 +00:00
Matt Riedemann 11cb42f396 Restore RT.old_resources if ComputeNode.save() fails
When starting nova-compute for the first time with a new node,
the ResourceTracker will create a new ComputeNode record in
_init_compute_node but without all of the fields set on the
ComputeNode, for example "free_disk_gb".

Later _update_usage_from_instances will set some fields on the
ComputeNode record (even if there are no instances on the node,
why - I don't know) like free_disk_gb.

This will make the eventual call from _update() to _resource_change()
update the value in the old_resources dict and return True, and then
_update() will try to update those ComputeNode changes to the database.
If that update fails, for example due to a DBConnectionError, the
value in old_resources will still be for the current version of the node
in memory but not what is actually in the database.

Note that this failure does not result in the compute service failing
to start because ComputeManager._update_available_resource_for_node
traps the Exception and just logs it.

A subsequent trip through the RT._update() method - because of the
update_available_resource periodic task - will call _resource_change
but because old_resources matches the current state of the node, it
returns False and the RT does not attempt to persist the changes to
the DB. _update() will then go on to call _update_to_placement
which will create the resource provider in placement along with its
inventory, making it potentially a candidate for scheduling.

This can be a problem later in the scheduler because the
HostState._update_from_compute_node method may skip setting fields
on the HostState object if free_disk_gb is not set in the
ComputeNode record - which can then break filters and weighers
later in the scheduling process (see bug 1834691 and bug 1834694).

The fix proposed here is simple: if the ComputeNode.save() in
RT._update() fails, restore the previous value in old_resources
so that the subsequent run through _resource_change will compare the
correct state of the object and retry the update.

An alternative to this would be killing the compute service on startup
if there is a DB error but that could have unintended side effects,
especially if the DB error is transient and can be fixed on the next
try.

Obviously the scheduler code needs to be more robust also, but those
improvements are left for separate changes related to the other bugs
mentioned above.

Also, ComputeNode.update_from_virt_driver could be updated to set
free_disk_gb if possible to work around the tight coupling in the
HostState._update_from_compute_node code, but that's also sort of
a whack-a-mole type change best made separately.

Change-Id: Id3c847be32d8a1037722d08bf52e4b88dc5adc97
Closes-Bug: #1834712
2019-07-17 10:29:10 +01:00
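
The fix described in the commit above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: `SketchTracker`, its `update()` method and the injected `save` callable are hypothetical, and the real ResourceTracker tracks ComputeNode objects rather than plain dicts.

```python
import copy


class SketchTracker:
    """Hypothetical, simplified stand-in for the resource tracker."""

    def __init__(self):
        # Last state we believe was persisted, keyed by node name.
        self.old_resources = {}

    def _resource_change(self, nodename, resources):
        # Record the new state and report whether it differs from the cache.
        if self.old_resources.get(nodename) != resources:
            self.old_resources[nodename] = copy.deepcopy(resources)
            return True
        return False

    def update(self, nodename, resources, save):
        # Remember the cached state before _resource_change overwrites it.
        previous = copy.deepcopy(self.old_resources.get(nodename))
        if self._resource_change(nodename, resources):
            try:
                save(resources)
            except Exception:
                # The write failed: restore the cache so the next periodic
                # run sees a difference again and retries the save.
                if previous is None:
                    self.old_resources.pop(nodename, None)
                else:
                    self.old_resources[nodename] = previous
                raise
```

Without the restore step, a failed save would leave the cache matching the in-memory state, and the next run would wrongly conclude that nothing needs to be written.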
Zuul b7c98befda Merge "vif: Resolve a TODO and update another" 2019-07-17 04:15:54 +00:00
Zuul ecb8c9c5ea Merge "vif: Stop using getattr for VIF lookups" 2019-07-16 20:47:08 +00:00
Zuul 9f9ff03f0a Merge "vif: Remove 'plug_vhostuser', 'unplug_vhostuser'" 2019-07-16 20:03:32 +00:00
Zuul b61c14f362 Merge "Use Adapter global_request_id kwarg" 2019-07-16 19:11:29 +00:00
Zuul cea01e5f4b Merge "Defaults missing group_policy to 'none'" 2019-07-16 17:37:05 +00:00
Zuul 0a62d9765b Merge "Remove Rocky-era min compute trusted certs compat check" 2019-07-16 14:00:26 +00:00
Balazs Gibizer ad4f798362 Defaults missing group_policy to 'none'
If more than one numbered request group is in the placement a_c query
then the group_policy is mandatory. Based on the PTG discussion [1]
'none' seems to be a good default policy from nova's perspective. So this
patch makes sure that if the group_policy is not provided in the flavor
extra_spec, there is more than one numbered group in the request, and
the flavor only provides one or zero groups (so groups are coming from
other sources like neutron ports), then the group_policy is defaulted to
'none'.

The reasoning behind this change: If more than one numbered request
group is coming from the flavor extra_spec then the creator of the
flavor is responsible for adding a group_policy to the flavor. So in this
case nova only warns but lets the request fail in placement to force the
fixing of the flavor. However, when numbered groups are coming from
other sources (like neutron ports) then the creator of the flavor
cannot know if additional groups will be included, so we don't want to
force the flavor creator but simply default the group_policy.

[1] http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005807.html

Change-Id: I0681de217ed9f5d77dae0d9555632b8d160bb179
2019-07-16 14:05:32 +02:00
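
The defaulting rule from the commit above can be expressed as a small decision function. This is only a sketch: the function name `resolve_group_policy` and its count-based parameters are assumptions made for illustration and do not mirror nova's actual request-spec code.

```python
import logging

LOG = logging.getLogger(__name__)


def resolve_group_policy(group_policy, flavor_groups, other_groups):
    """Return the group_policy to send to placement, or None if unset.

    flavor_groups: number of numbered request groups from the flavor.
    other_groups: number of numbered groups from other sources, such as
    neutron ports. Both parameters are hypothetical simplifications.
    """
    if group_policy is not None:
        # An explicit policy in the flavor extra_spec always wins.
        return group_policy
    if flavor_groups + other_groups <= 1:
        # With at most one numbered group the policy is not mandatory.
        return None
    if flavor_groups > 1:
        # The flavor creator must fix the flavor: warn and leave the policy
        # unset so that the request fails in placement.
        LOG.warning("flavor has %d numbered groups but no group_policy",
                    flavor_groups)
        return None
    # The extra groups come from other sources, so default the policy.
    return "none"
```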
Balazs Gibizer dc4ca77e4c Add 'resource_request' to neutronv2/constants
This patch replaces the string usage with references to the new
constant in neutronv2/constants.py

Change-Id: I6fc3cb0651f65b2448fbbb58989c920974f076c3
2019-07-16 13:23:36 +02:00
Balazs Gibizer d939eadd3b Use neutron contants in cmd/manage.py
Change-Id: Ief0305afd34546c12a50c308fc59eeb26d80eeca
2019-07-16 13:23:36 +02:00
Balazs Gibizer b7fbe96b0b Move consts from neutronv2/api to constants module
Change-Id: Ib083694d3881dee6b10c400239db90eca9163516
2019-07-16 11:10:11 +00:00
Zuul 03753b219e Merge "Add Python 3 Train unit tests" 2019-07-16 10:55:46 +00:00
Zuul a9741bcc77 Merge "doc: Add links to novaclient contributor guide" 2019-07-16 00:27:03 +00:00
Eric Fried 8068bb3f4a Use Adapter global_request_id kwarg
Release 3.15.0 of keystoneauth1 introduced the ability to pass
X-Openstack-Request-Id to request methods (get/put/etc) via a
global_request_id kwarg rather than having to put it in a headers dict.

This commit bumps the minimum ksa level to 3.15.0 and takes advantage of
the new kwarg to replace explicit header construction in
SchedulerReportClient (Placement) and neutronv2/api methods.

Also normalizes the way param lists were being passed from
SchedulerReportClient's REST primitives (get/put/post/delete) into the
Adapter equivalents. There was no reason for them to be different.

Change-Id: I2f6eb50f4cb428179ec788de8b7bd6ef9bbeeaf9
2019-07-15 14:30:35 -05:00
Balazs Gibizer 54dea2531c nova-manage: heal port allocations
Before I97f06d0ec34cbd75c182caaa686b8de5c777a576 it was possible to
create servers with neutron ports which had resource_request (e.g. a
port with QoS minimum bandwidth policy rule) without allocating the
requested resources in placement. So there could be servers for which
the allocation needs to be healed in placement.

This patch extends the nova-manage heal_allocation CLI to create the
missing port allocations in placement and update the port in neutron
with the resource provider uuid that is used for the allocation.

There are known limitations of this patch. It does not try to reimplement
Placement's allocation candidate functionality. Therefore it cannot
handle the situation when there is more than one RP in the compute
tree which provides the required traits for a port. In this situation
deciding which RP to use would require 1) the in_tree allocation candidate
support from placement, which is not available yet, and 2) information
about which PCI PF an SRIOV port's VF is allocated from and which RP
represents that PCI device in placement. This information is only
available on the compute hosts.

For the unsupported cases the command will fail gracefully. As soon as
migration support for such servers is implemented in the blueprint
support-move-ops-with-qos-ports the admin can heal the allocation of
such servers by migrating them.

During healing both placement and neutron need to be updated. If any of
those updates fail the code tries to roll back the previous updates for
the instance to make sure that the healing can be re-run later without
issue. However if the rollback fails then the script will terminate with
an error message pointing to documentation that describes how to
recover from such a partially healed situation manually.

Closes-Bug: #1819923
Change-Id: I4b2b1688822eb2f0174df0c8c6c16d554781af85
2019-07-15 17:22:40 +02:00
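
The update-then-roll-back behaviour described in the commit above follows a common pattern, sketched here in generic form. The names `run_with_rollback`, `HealFailed` and `RollbackFailed` are hypothetical, and the sketch deliberately does not assume in which order placement and neutron are updated.

```python
class HealFailed(Exception):
    """An update failed but earlier updates were undone; safe to re-run."""


class RollbackFailed(Exception):
    """An undo step failed; manual recovery is needed."""


def run_with_rollback(steps):
    """Apply each (apply, undo) pair in order, undoing on failure.

    steps is an iterable of (apply, undo) callables, for example one pair
    for the placement update and one pair for the neutron update.
    """
    done = []
    for apply_step, undo_step in steps:
        try:
            apply_step()
        except Exception as exc:
            # Undo the steps that already succeeded, newest first.
            for undo in reversed(done):
                try:
                    undo()
                except Exception as rb_exc:
                    raise RollbackFailed(
                        "rollback failed; recover manually") from rb_exc
            raise HealFailed("update failed; healing can be re-run") from exc
        done.append(undo_step)
```

Distinguishing the two exception types lets a caller report a retryable failure differently from a partially healed state that needs manual attention.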
Zuul 78f9961d29 Merge "Fix no propagation of nova context request_id" 2019-07-13 04:43:56 +00:00
Zuul 47a9829058 Merge "Remove needs:* todo from deprecated APIs api-ref" 2019-07-13 00:34:10 +00:00
Zuul ac40a4ee91 Merge "Revert resize: wait for events according to hybrid plug" 2019-07-12 17:25:51 +00:00
Stephen Finucane fea589cfe2 vif: Remove dead minimum libvirt checks
Our minimum for libvirt is now 3.0.0 while our QEMU minimum is 2.8.0,
meaning some checks for older versions can be removed.

Change-Id: Ibecdfb1e903d3c1f711e1d61212be00176110a9b
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2019-07-12 16:39:58 +01:00
Stephen Finucane 3f56e44b84 vif: Resolve a TODO and update another
One TODO noted that a block could be removed once we bump to libvirt
1.3.8 or greater. We require 3.0.0 now so that's resolved. Another one
looks like it should be resolved in 3.2.0 so the TODO is updated to
highlight this for future reviewers.

Change-Id: I5235751b1dbc77ecc919eec7f3e022cd70085051
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2019-07-12 16:39:58 +01:00
Stephen Finucane b4f3b2b09f vif: Stop using getattr for VIF lookups
'getattr' is really powerful and we make extensive use of it in nova.
However, the way we've used it for VIF lookups, where we use it to
retrieve functions by a key, seems to be a bit of an anti-pattern. Not
only does it completely break static code coverage analysers that we can
use to help us root out code that's not tested (or is tested but is
never used in production) but, more importantly, it makes it so much
more difficult to figure out what on earth is going on in an already
complex part of the codebase.

Be verbose and, in the absence of a true switch statement in Python, use
simple if-else blocks to do the lookups. Due to how this is done, we're
able to remove a few previously no-op functions. Funnily enough, this
actually results in fewer LOC despite being more explicit. #winning?

Change-Id: Idf08adca1e3a0d19e20bca2447c83f7372516cb7
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2019-07-12 16:39:58 +01:00
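
As a rough illustration of the change described in the commit above, the sketch below replaces a name-based `getattr` lookup with explicit branches. The handler names and the dict-shaped `vif` argument are hypothetical simplifications, not nova's actual VIF driver code.

```python
def plug_bridge(vif):
    # Hypothetical handler for one VIF type.
    return "plugged bridge %s" % vif["id"]


def plug_ethernet(vif):
    # Hypothetical handler for another VIF type.
    return "plugged ethernet %s" % vif["id"]


def plug(vif):
    """Dispatch on the VIF type with explicit branches.

    The pattern being replaced looked roughly like
    getattr(driver, 'plug_%s' % vif_type), which hides from static
    analysis which handlers are actually reachable.
    """
    vif_type = vif["type"]
    if vif_type == "bridge":
        return plug_bridge(vif)
    elif vif_type == "ethernet":
        return plug_ethernet(vif)
    raise ValueError("Unexpected vif_type=%s" % vif_type)
```

Because every handler is referenced by name, a coverage tool or a plain text search can show which functions are still used, and handlers that only existed to satisfy the lookup can be deleted.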
Stephen Finucane 17bc342c33 vif: Remove 'plug_vhostuser', 'unplug_vhostuser'
These will never be reached since the '_nova_to_osvif_vif_vhostuser'
function in 'nova.network.os_vif_util' provides a fallthrough case
since change Ifab3006454708ab290b93f02d82b794c334c3946.

Change-Id: I14ab55178692ff13df114a4c628430561df1a55e
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2019-07-12 16:39:58 +01:00
Zuul ff0feed25d Merge "Add host and hypervisor_hostname flag to create server" 2019-07-11 18:35:16 +00:00
Takashi NATSUME e11b589474 Fix no propagation of nova context request_id
Nova context request_id is not propagated for
port binding operations in neutron.
So fix it.

Change-Id: I76163c46b1f01ba7ff592d162b106ea2e5bb34cb
Closes-Bug: #1829914
2019-07-11 22:00:29 +09:00
Artom Lifshitz 7a7a223602 Revert resize: wait for events according to hybrid plug
Since 4817165fc5, when reverting a
resized instance back to the source host, the libvirt driver waits for
vif-plugged events when spawning the instance. When called from
finish_revert_resize() in the source compute manager, libvirt's
finish_revert_migration() does not pass vifs_already_plugged to
_create_domain_and_network(), making the latter use the default False
value.

When the source compute manager calls
network_api.migrate_instance_finish() in finish_revert_resize(), this
updates the port binding back to the source host. If Neutron is
configured to use OVS hybrid plug, it will send the vif-plugged event
immediately after completing this request. This happens before the
virt driver's finish_revert_migration() method is called. This causes
the wait in the libvirt driver to time out because the event is
received before Nova starts waiting for it.

The neutron ovs l2 agent sends vif-plugged events when two conditions
are met. First the port must be bound to the host managed by the
l2 agent and second, the agent must have completed configuring the
port on ovs. This involves assigning the port a local VLAN for tenant
isolation, applying security group rules if required and applying
QoS policies or other agent extensions like service function chaining.

During the boot process, we bind the port first to the host
then plug the interface into ovs which triggers the l2 agent to
configure it resulting in the emission of the vif-plugged event.
In the revert case, as noted above, since the vif is already plugged
on the source node when hybrid-plug is used, binding the port to the
source node fulfils the second condition to send the vif-plugged event.

Events sent immediately after port binding update are hereafter known
as "bind-time" events. For ports that do not use OVS hybrid plug,
Neutron will continue to send vif-plugged events only when Nova
actually plugs the VIF. These types of events are hereafter known as
"plug-time" events. OVS hybrid plug is a per agent setting, so for
a particular host, bind-time events are an all-or-nothing thing for the
ovs backend: either all VIF_TYPE=ovs ports have them, or no ovs ports
have them. In general, a host will only have one network backend.
The only exception to this is SR-IOV. SR-IOV is commonly deployed on
the same host as other network backends such as OVS or linuxbridge.
SR-IOV ports with VNIC_TYPE=direct-physical will always have only
bind-time events. If an instance mixes OVS ports with hybrid-plug=False
with direct physical ports, it will have both kinds of events.

For same-host resize reverts we do not update the binding host because the
host does not change, so we do not receive bind-time events. For a
same-host revert we therefore do not wait for bind-time events in the
compute manager.

This patch adds functions to the NetworkInfo model that return what
kinds of events each VIF has. These are then used in the migration
revert logic to decide when to wait for external events: in the
compute manager, when binding the port, for bind-time events,
and/or in libvirt, when plugging the VIFs, for plug-time events.

Closes-bug: #1832028
Closes-Bug: #1833902
Co-Authored-By: Sean Mooney <work@seanmooney.info>
Change-Id: I51673e58fc8d5f051df911630f6d7a928d123a5b
2019-07-10 19:56:31 -04:00
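
The bind-time versus plug-time classification described in the commit above can be sketched as follows. The `SketchVif` class, its fields and `split_events` are hypothetical simplifications; in particular, treating every VIF as plug-time on a same-host revert is an assumption of this sketch, not a statement about nova's NetworkInfo model.

```python
from dataclasses import dataclass


@dataclass
class SketchVif:
    """Hypothetical, reduced view of a VIF."""
    port_id: str
    vif_type: str = "ovs"
    vnic_type: str = "normal"
    ovs_hybrid_plug: bool = False

    def has_bind_time_event(self, same_host):
        if same_host:
            # The binding host is not updated, so no bind-time event.
            return False
        if self.vnic_type == "direct-physical":
            # Direct physical SR-IOV ports only have bind-time events.
            return True
        # OVS ports send the event at bind time only with hybrid plug.
        return self.vif_type == "ovs" and self.ovs_hybrid_plug


def split_events(vifs, same_host=False):
    """Return (bind_time, plug_time) lists of (event_name, port_id)."""
    bind_time, plug_time = [], []
    for vif in vifs:
        event = ("vif-plugged", vif.port_id)
        if vif.has_bind_time_event(same_host):
            bind_time.append(event)
        else:
            plug_time.append(event)
    return bind_time, plug_time
```

In this sketch the first list would be awaited around the port binding update and the second around the actual VIF plugging, which matches the two wait points the commit message names.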
Zuul bea9058f02 Merge "nova-lvm: Disable [validation]/run_validation in tempest.conf" 2019-07-10 22:58:25 +00:00
Zuul 470b557085 Merge "docs: Correct issues with 'openstack quota set' commands" 2019-07-10 21:26:52 +00:00
Zuul 2722cab1af Merge "libvirt: Remove unreachable native QEMU iSCSI initiator config code" 2019-07-10 17:32:48 +00:00
Zuul 8260979b71 Merge "Remove 'nova.virt.libvirt.compat'" 2019-07-10 17:06:39 +00:00
Zuul c8488c3117 Merge "db: Add vpmems to instance_extra" 2019-07-10 16:48:07 +00:00
Stephen Finucane 426790237e docs: Correct issues with 'openstack quota set' commands
Change Ic857918b15496049b5ccacde9515f130cc0bd7e9 against
openstack-manuals updated the quotas document to use openstackclient
commands in place of novaclient commands. It missed the fact that you
need to pass the '--class' parameter if you wish to set a quota for a
class rather than a project. Correct this.

Change-Id: I5dc65924fee65f6340d1495a9b1b992001c30731
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Closes-Bug: #1834057
2019-07-10 17:41:57 +01:00
Zuul 27fc32deca Merge "Perf: Use dicts for ProviderTree roots" 2019-07-10 15:21:03 +00:00
Takashi NATSUME de31466fdb doc: Fix a parameter of NotificationPublisher
The 'binary' parameter was changed to 'source' in
I95b5b0826190d396efe7bfc017f6081a6356da65,
but the notification document has not been updated yet.

Replace the 'binary' parameter with the 'source' parameter.

Change-Id: I141c90ac27d16f2e9c033bcd2f95ac08904a2f52
Closes-Bug: #1836005
2019-07-10 16:13:51 +09:00
Zuul 93cae754cf Merge "doc: Replace a wiki link with API ref guide link" 2019-07-09 23:23:33 +00:00
Zuul 79863472d6 Merge "libvirt: remove unused Service.get_by_compute_host mocks" 2019-07-09 22:49:36 +00:00
Zuul b233aeddf1 Merge "Fix GET /servers/detail host_status performance regression" 2019-07-09 22:49:16 +00:00
Takashi NATSUME 09d24e5ffa doc: Add links to novaclient contributor guide
Add links to the document for adding support for a new microversion
in python-novaclient.

Depends-On: https://review.opendev.org/667002
Change-Id: Ic58afe401464a0da2b19306e7cc6ce412f177b16
2019-07-09 17:33:47 -04:00