48902 Commits

Author SHA1 Message Date
Eric Fried 112cd9cd1f Proper error handling by _ensure_resource_provider
Previously, if _ensure_resource_provider encountered any error from the
placement REST API, it would (sometimes log a message and) return None.

Furthermore, a name conflict while creating the provider was treated the
same as a UUID conflict, which would actually result in None being
returned.

With this change set, the error paths that previously returned None now
raise one of the new ResourceProviderRetrievalFailed or
ResourceProviderCreationFailed exceptions; and the name conflict path is
detected and treated as an error condition.
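
The control flow described above can be sketched roughly as follows. This is
an illustration only, not the SchedulerReportClient code: FakePlacement and
its get/create methods are hypothetical stand-ins for the placement REST API,
and the two exception classes are local placeholders that merely reuse the
names from this commit message.

```python
class ResourceProviderRetrievalFailed(Exception):
    """Placeholder for the retrieval exception named in the commit."""


class ResourceProviderCreationFailed(Exception):
    """Placeholder for the creation exception named in the commit."""


class FakePlacement:
    """Hypothetical in-memory stand-in for the placement REST API."""

    def __init__(self):
        self.providers = {}  # uuid -> name

    def get(self, uuid):
        if uuid in self.providers:
            return 200, {"uuid": uuid, "name": self.providers[uuid]}
        return 404, None

    def create(self, uuid, name):
        # Both a duplicate UUID and a duplicate name yield a conflict.
        if uuid in self.providers or name in self.providers.values():
            return 409, None
        self.providers[uuid] = name
        return 201, {"uuid": uuid, "name": name}


def ensure_resource_provider(api, uuid, name):
    """Return the provider record, or raise; never return None."""
    status, body = api.get(uuid)
    if status == 200:
        return body
    if status != 404:
        raise ResourceProviderRetrievalFailed(uuid)

    status, body = api.create(uuid, name)
    if status == 201:
        return body
    if status == 409:
        # UUID conflict: someone else created it, so a re-fetch succeeds.
        # Name conflict: the re-fetch by UUID still finds nothing.
        status, body = api.get(uuid)
        if status == 200:
            return body
    raise ResourceProviderCreationFailed(name)
```

In this sketch a name conflict falls through to the final raise instead of
silently producing None, which is the behavior change the message describes.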

Note: This change set only touches the SchedulerReportClient side of
these error conditions - it makes no attempt to add error handling to
its callers.  Case in point, the API samples tests needed fixing because
they were previously running into the name conflict error condition, but
not noticing.  As currently implemented, the new exceptions will
percolate up to ComputeManager.update_available_resource_for_node like
any others coming from SchedulerReportClient, where they will be logged
and ignored.

Change-Id: I0c4ca6a81f213277fe7219cb905a805712f81e36
Closes-Bug: #1735430
2017-11-30 18:30:49 +00:00
Eric Fried 0792d7ad5b Nix log translations from scheduler.client.report
Remove log translation macros from nova/scheduler/client/report.py.

Also correct a couple of docstring typos.

Change-Id: I8d6ab08781fd132afc4db9da4c4c179617d6659f
2017-11-30 10:22:07 -06:00
Zuul 16a968363d Merge "Fix ValueError when loading old pci device record" 2017-11-30 12:21:33 +00:00
Zuul daa1cd6d76 Merge "XenAPI: resolve VBD unplug failure with VM_MISSING_PV_DRIVERS error" 2017-11-30 09:46:58 +00:00
Zuul 54f45b530c Merge "Enable cold migration with target host(2/2)" 2017-11-30 07:38:03 +00:00
Takashi NATSUME d2ce4ca9ec Enable cold migration with target host(2/2)
This function enables users to specify a target host
when cold migrating a VM instance.

This patch modifies the migration API.

APIImpact
    Add an optional parameter 'host' in cold migration action.

Change-Id: Iee356c4dd097c846b6ca8617ead6a061300c83f8
Implements: blueprint cold-migration-with-target-queens
2017-11-29 20:48:16 -05:00
Zuul b9c35aea70 Merge "Remove unnecessary self.flags and ConfPatcher" 2017-11-29 22:57:26 +00:00
Zuul ad2b8f0402 Merge "Add 'all_tenants' for GET sec group api ref" 2017-11-29 22:53:55 +00:00
Zuul faacfeb076 Merge "libvirt: do unicode conversion for error messages." 2017-11-29 20:05:36 +00:00
Zuul 690f8e4e24 Merge "Remove deprecated TrustedFilter" 2017-11-29 20:05:26 +00:00
Zuul b9c19ac8eb Merge "placement: add nested resource providers" 2017-11-29 19:55:39 +00:00
Matt Riedemann e727437b0c Fix ValueError when loading old pci device record
Old pci_devices records might not have a uuid value set
and when we load those out of the database, the
PciDevice._from_db_object code was blindly trying to set
the PciDevice.uuid field to None, which fails because the
PciDevice.uuid field is not nullable.

This change fixes the problem by skipping the 'uuid' field
if it's not set in the db record so that we can auto-generate
a uuid later and update the object with it, which also performs
our online data migration.

This is similar to how we handle the uuid online migration for
other objects like compute nodes, services and migrations.
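
A minimal sketch of the skip-then-generate pattern, assuming a simplified
object rather than the real PciDevice: PciDeviceSketch, from_db_object and
ensure_uuid are hypothetical names, and the non-nullable field is imitated
with a __setattr__ check.

```python
import uuid


class PciDeviceSketch:
    """Simplified stand-in for an object with a non-nullable uuid field."""

    fields = ("id", "address", "uuid")

    def __setattr__(self, name, value):
        if name == "uuid" and value is None:
            raise ValueError("uuid is not nullable")
        super().__setattr__(name, value)


def from_db_object(dev, db_row):
    """Copy DB columns onto the object, skipping an unset uuid."""
    for key in dev.fields:
        if key == "uuid" and db_row.get("uuid") is None:
            # Old record: leave uuid unset so it can be generated later.
            continue
        setattr(dev, key, db_row[key])
    return dev


def ensure_uuid(dev):
    """Generate a uuid on first use; stands in for the online migration."""
    if not hasattr(dev, "uuid"):
        dev.uuid = str(uuid.uuid4())
    return dev.uuid
```

Loading a row whose uuid column is NULL no longer raises; the value is filled
in the first time ensure_uuid is called and would then be saved back.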

Change-Id: I5de0979e280004c1ce0acc99d69cc96089a704f8
Closes-Bug: #1735188
2017-11-29 12:49:49 -05:00
OpenStack Proposal Bot f4c436c0cf Updated from global requirements
Change-Id: I3129f3c1b41a68d6c03d43f7bc1dba4a1d2b882d
2017-11-29 09:08:01 +00:00
Zuul 7e2d8cb89b Merge "PowerVM support matrix update" 2017-11-29 06:41:57 +00:00
Zuul 16ee32b0fd Merge "Updated from global requirements" 2017-11-29 02:22:37 +00:00
Zuul 851bf6c122 Merge "Save updated libvirt domain XML after swapping volume" 2017-11-29 00:57:52 +00:00
Zuul dffd50df39 Merge "PowerVM Driver: config drive" 2017-11-29 00:57:39 +00:00
OpenStack Proposal Bot 0ae992bfb3 Updated from global requirements
Change-Id: Ie857878bbf063fa687cba2e745329c3fc9ae1f92
2017-11-28 23:07:37 +00:00
melanie witt 5b008c6540 Save updated libvirt domain XML after swapping volume
When a user calls the volume-update API, we swap_volume in the libvirt
driver from the old volume attachment to the new volume attachment.
Currently, we're saving the domain XML with the old configuration prior
to updating the volume and upon a soft-reboot request, it results in an
error:

  Instance soft reboot failed: Cannot access storage file <old path>

and falls back to a hard reboot, which is like pulling the power cord,
possibly resulting in file system inconsistencies.

This change saves the new, updated domain XML after the volume swap
instead.

Closes-Bug: #1713857

Change-Id: I166cde5ad8b00699e4ec02609f0d7b69236d855d
2017-11-28 21:29:33 +00:00
Zuul 05c6b54eec Merge "Enable cold migration with target host(1/2)" 2017-11-28 21:00:28 +00:00
Jay Pipes b10f11d7e8 placement: add nested resource providers
Adds initial support for storing the relationship between parent and
child resource providers. Nested resource providers are essential for
expressing certain types of resources -- in particular SR-IOV physical
functions and certain SR-IOV field-programmable gate arrays. The
resources that these providers expose are of resource class
SRIOV_NET_VF and we will need a way of indicating that the physical
function providing these virtual function resources is tagged with
certain traits (representing vendor_id, product_id or the physical
network the PF is attached to).

The compute host is a resource provider which has an SR-IOV-enabled
physical function (NIC) as a child resource provider. The physical
function has an inventory containing some total amount of SRIOV_NET_VF
resources. These SRIOV_NET_VF resources are allocated to zero or more
consumers (instances) on the compute host.

                    compute host (parent resource provider)
                         |
                         |
                      SR-IOV PF  (child resource provider)
                         :
                        / \
                       /   \
                    VF1    VF2   (inventory of child provider)

The resource provider model gets two new fields:

 - root_provider_uuid: The "top" or "root" of the tree of nested
   providers
 - parent_provider_uuid: The immediate parent of the provider, or None
   if the provider is a root provider.

The database schema adds two new columns to the resource_providers
table that contain the internal integer IDs that correspond to the
user-facing UUID values:

 - root_provider_id
 - parent_provider_id

The root_provider_uuid field is non-nullable in the ResourceProvider
object definition, and this code includes an online data migration to
automatically populate the root_provider_id field with the value of the
resource_providers.id field for any resource providers already in the
DB.

The root_provider_id field value is populated automatically when a
provider is created. If the parent provider UUID is set, then the
root_provider_id is set to the root_provider_id value of the parent. If
parent is unset, root_provider_id is set to the value of the id
attribute of the provider being created. The corresponding UUID values
for root and parent provider are fetched in the queries that retrieve
resource provider data using two self-referential joins.

The root_provider_id column allows us to do extremely quick lookups of
an entire tree of providers without needing to perform any recursive
database queries.
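
The population rule and the non-recursive tree lookup can be illustrated with
an in-memory sketch. It assumes plain dicts in place of the resource_providers
table; ProviderStore, create and tree are hypothetical names, not the
placement code.

```python
import itertools


class ProviderStore:
    """In-memory stand-in for the resource_providers table."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.rows = {}     # internal id -> row
        self.by_uuid = {}  # uuid -> internal id

    def create(self, uuid, parent_uuid=None):
        if parent_uuid is not None and parent_uuid not in self.by_uuid:
            raise ValueError("parent provider does not exist")
        new_id = next(self._ids)
        if parent_uuid is None:
            # A root provider is its own root.
            parent_id, root_id = None, new_id
        else:
            parent = self.rows[self.by_uuid[parent_uuid]]
            # A child inherits the root of its parent.
            parent_id, root_id = parent["id"], parent["root_provider_id"]
        row = {
            "id": new_id,
            "uuid": uuid,
            "parent_provider_id": parent_id,
            "root_provider_id": root_id,
        }
        self.rows[new_id] = row
        self.by_uuid[uuid] = new_id
        return row

    def tree(self, root_uuid):
        """Fetch a whole tree with one flat filter, no recursion."""
        root_id = self.by_uuid[root_uuid]
        return sorted(
            r["uuid"] for r in self.rows.values()
            if r["root_provider_id"] == root_id
        )
```

Because every row carries the root id, the tree lookup is a single equality
filter regardless of how deep the nesting goes.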

Logic in this patch ensures that no resource provider can be deleted if
any of its children have any allocations active on them. We also check
to ensure that when created or updated, a resource provider's parent
provider UUID actually points to an existing provider.

It's important to point out that qualitative trait information is only
associated with a resource provider entity, not the resources that
resource provider has in its inventory. This is the reason why nested
resource providers are necessary. In the case of things like NUMA nodes
or SRIOV physical functions, if a compute host had multiple SRIOV
physical functions, each associated with a different network trait,
there would be no way to differentiate between the SRIOV_NET_VF
resources that those multiple SRIOV physical functions provided if the
containing compute host had a single inventory item containing the
total number of VFs exposed by both PFs.

Change-Id: I2d8df57f77a03cde898d9ec792c5d59b75f61204
blueprint: nested-resource-providers
Co-Authored-By: Moshe Levi <moshele@mellanox.com>
2017-11-28 15:29:28 -05:00
Zuul 379bbf5183 Merge "libvirt: remove extraneous retry assignment in cleanup method" 2017-11-28 20:26:18 +00:00
Zuul 82a6ca21ba Merge "Remove 'nova-manage quota refresh' command" 2017-11-28 20:02:40 +00:00
zhangyangyang aecc165a58 Remove deprecated TrustedFilter
The TrustedFilter and the related trusted_computing config options
were deprecated in Pike:

  If6e53feeb97e6050c1eb7962110ed89504c952fc

Co-Authored-By: Matt Riedemann <mriedem.os@gmail.com>

Change-Id: I0a7ab3a4fb2cfad567a8644bed4de574393ee11a
2017-11-28 14:54:31 -05:00
Zuul caff225a1a Merge "Regenerate and pass configdrive when rebuild Ironic nodes" 2017-11-28 19:44:41 +00:00
Zuul 5b5b5c8df3 Merge "Update the documentation links" 2017-11-28 19:09:16 +00:00
Zuul 96fcf0c942 Merge "[placement] Clean up TODOs in allocations.yaml gabbit" 2017-11-28 19:09:07 +00:00
Zuul a201ce1891 Merge "[placement] Fix GET PUT /allocations nits" 2017-11-28 19:01:43 +00:00
Zuul 403d13247e Merge "[placement] POST /allocations to set allocations for >1 consumers" 2017-11-28 19:01:37 +00:00
Zuul f0097ef7ed Merge "Add regression test for rebuild with new image doubling allocations" 2017-11-28 18:57:32 +00:00
Zuul 757626e781 Merge "Add instance action record for lock/unlock instances" 2017-11-28 18:57:10 +00:00
Zuul 0c5c964988 Merge "Implement query param schema for sec group APIs" 2017-11-28 18:56:55 +00:00
Zuul 42706270b9 Merge "Don't overwrite binding-profile" 2017-11-28 15:45:49 +00:00
Zuul ea51587508 Merge "Add instance action record for attach/detach/swap volumes" 2017-11-28 15:32:13 +00:00
Zuul 114817560b Merge "Update document related to host aggregate" 2017-11-28 15:26:28 +00:00
Chris Dent 453fd67da1 [placement] Fix GET PUT /allocations nits
In the review of I49f5680c15413bce27f2abba68b699f3ea95dcdc, a few
non-blocking nits were identified. This change addresses some of
those nits, fixing some typos, clarifying method names and what
microversion is in use at particular times.

Change-Id: Iff15340502ce43eba3b98db26aa0652b1da24504
2017-11-28 12:25:13 +00:00
Chris Dent 8caf4f5148 [placement] POST /allocations to set allocations for >1 consumers
This provides microversion 1.13 of the placement API, giving the
ability to POST to /allocations to set (or clear) allocations for
more than one consumer uuid.

It builds on the recent work to support a dict-based JSON format
when doing a PUT to /allocations/{consumer_uuid}.

Being able to set allocations for multiple consumers in one request
helps to address race conditions when cleaning up allocations during
move operations in nova.

Clearing allocations is done by setting the 'allocations' key for a
specific consumer to an empty dict.
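
As a rough illustration of such a request body, the helper below builds one
dict covering several consumers. The exact schema shown here (per-consumer
'allocations', 'project_id' and 'user_id' keys, with resources nested under
each resource provider uuid) is an assumption modeled on the dict-based PUT
format; build_allocations_payload is a hypothetical helper, and the
placement-api-ref remains the authority on the real format.

```python
def build_allocations_payload(per_consumer, project_id, user_id):
    """Build one POST /allocations body for several consumers.

    per_consumer maps consumer uuid -> {provider uuid -> {class: amount}}.
    An empty inner dict clears that consumer's allocations.
    """
    payload = {}
    for consumer_uuid, per_provider in per_consumer.items():
        payload[consumer_uuid] = {
            "allocations": {
                rp_uuid: {"resources": dict(resources)}
                for rp_uuid, resources in per_provider.items()
            },
            "project_id": project_id,
            "user_id": user_id,
        }
    return payload
```

For a move operation, one consumer could receive the new allocations while
another is cleared with an empty dict, both in the same request.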

Updates to placement-api-ref, rest version history and a reno are
included.

Change-Id: I239f33841bb9fcd92b406f979674ae8c5f8d57e3
Implements: bp post-allocations
2017-11-28 12:15:53 +00:00
Kevin_Zheng fbea321841 Add instance action record for lock/unlock instances
We currently don't record lock/unlock instance
actions. Recording them is useful for auditing and debugging.

This patch adds instance lock/unlock actions.

Change-Id: I09fadf79aac1a74465af48015ef97d9e9d4ac580
partial-implements: blueprint fill-the-gap-for-instance-action-records
2017-11-28 19:22:15 +08:00
Zuul b60a599b5f Merge "[placement] Symmetric GET and PUT /allocations/{consumer_uuid}" 2017-11-28 11:09:03 +00:00
ghanshyam 37987ee385 Add 'all_tenants' for GET sec group api ref
The GET /os-security-groups API accepts 'all_tenants' [1] as a
query parameter to list the security groups of all tenants, but
that parameter is missing from the api-ref [2].

[1]
https://github.com/openstack/nova/blob/e9104dbaef9bbccc6b19811125d439fdf9558428/nova/network/security_group/neutron_driver.py#L178
https://github.com/openstack/nova/blob/e9104dbaef9bbccc6b19811125d439fdf9558428/nova/compute/api.py#L5096

[2] https://developer.openstack.org/api-ref/compute/#list-security-groups

Closes-Bug: #1734406

Change-Id: I2946f05716c9030f7880ac423cc64b49c04b2992
2017-11-28 05:37:15 +00:00
Zuul fa2c1567c1 Merge "Refined fix for validating image on rebuild" 2017-11-28 03:48:43 +00:00
Guoqiang Ding 66a44c95f1 Update the documentation links
The documentation about "ops-guide" has been moved.

Change-Id: I151d1f989cb032c3a3775e5bfffcec58a2cf0121
2017-11-28 11:07:36 +08:00
Kevin_Zheng 1cea4f0135 Add instance action record for attach/detach/swap volumes
We currently don't record volume attach/detach/swap instance
actions. Recording them is useful for auditing and debugging.

This patch adds volume attach/detach/swap actions.

Change-Id: I0a3d15f3e3d0d8d920a79b519e17e3228e99f293
partial-implements: blueprint fill-the-gap-for-instance-action-records
2017-11-27 16:34:48 -05:00
Zuul 916795e0fd Merge "Updated from global requirements" 2017-11-27 20:57:50 +00:00
Matt Riedemann cacfd372ac Add regression test for rebuild with new image doubling allocations
Commit 984dd8ad6a makes a rebuild
with a new image go through the scheduler again to validate the
image against the instance.host (we rebuild to the same host that
the instance already lives on).

The problem is that change introduced a regression where the
FilterScheduler is going to think it's doing a resize to the same
host and double the allocations for the instance (and usage for the
compute node provider) in Placement, which is wrong since the
flavor is the same.

This adds a regression test to show the bug.

Change-Id: Ie0949b4e6101f0b29ec4542146d523a07a683991
Related-Bug: #1664931
2017-11-27 15:52:50 -05:00
Dan Smith f7c688b8ef Refined fix for validating image on rebuild
This aims to fix the issue described in bug 1664931 where a rebuild
fails to validate the existing host with the scheduler when a new
image is provided. The previous attempt to do this could cause rebuilds
to fail unnecessarily because we ran _all_ of the filters during a
rebuild, which could cause usage/resource filters to prevent an otherwise
valid rebuild from succeeding.

This aims to classify filters as useful for rebuild or not, and only apply
the former during a rebuild scheduler check. We do this by using an internal
scheduler hint, indicating our intent. This should (a) filter out
all hosts other than the one we're running on and (b) be detectable by
the filtering infrastructure as an internally-generated scheduling request
in order to trigger the correct filtering behavior.
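
The classification idea can be sketched as below. This is an assumption-laden
illustration, not the scheduler code: the run_on_rebuild attribute, the
'check_type' hint key, and the two example filters are hypothetical names
chosen for the sketch.

```python
class BaseFilter:
    """Filters opt in to rebuild checks; the default is to skip them."""

    run_on_rebuild = False

    def host_passes(self, host, request):
        raise NotImplementedError


class ImageArchFilter(BaseFilter):
    # Image properties can change on rebuild, so this check still matters.
    run_on_rebuild = True

    def host_passes(self, host, request):
        return request["image_arch"] in host["supported_arches"]


class RamFilter(BaseFilter):
    # Usage-based: the instance already consumes this RAM on its host.
    def host_passes(self, host, request):
        return host["free_ram_mb"] >= request["ram_mb"]


def filter_hosts(filters, hosts, request):
    """Apply only rebuild-relevant filters when the internal hint is set."""
    is_rebuild = request.get("hints", {}).get("check_type") == "rebuild"
    active = [f for f in filters if f.run_on_rebuild or not is_rebuild]
    return [
        h for h in hosts
        if all(f.host_passes(h, request) for f in active)
    ]
```

With the hint set, a full host still passes the rebuild check as long as it
supports the new image, while a normal scheduling request would reject it.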

Closes-Bug: #1664931
Change-Id: I1a46ef1503be2febcd20f4594f44344d05525446
2017-11-27 15:52:45 -05:00
Zuul b9d9de8962 Merge "Versioned notifications for service create and delete" 2017-11-27 19:41:04 +00:00
Zuul c6b352d1ce Merge "Implement query param schema for delete assisted vol" 2017-11-27 19:40:53 +00:00
Zuul bfe0130763 Merge "Add ProviderSummary.resource_class_names @property" 2017-11-27 19:23:18 +00:00
Zuul 43d28c2aab Merge "required traits for no sharing providers" 2017-11-27 19:23:03 +00:00