52917 Commits

Author SHA1 Message Date
Zuul ff37d50c06 Merge "Remove ironic/pike note from *_allocation_ratio help" 2018-12-06 23:44:32 +00:00
Zuul c9dca64fa6 Merge "Delete NeutronLinuxBridgeInterfaceDriver" 2018-12-06 11:28:58 +00:00
Zuul 66a030528f Merge "Clean up cpu_shared_set config docs" 2018-12-06 11:28:53 +00:00
Zuul 363415a8a8 Merge "modify the available link" 2018-12-06 07:41:34 +00:00
Zuul 6ddc903a41 Merge "Update mailinglist from dev to discuss" 2018-12-06 07:41:24 +00:00
Zuul aed3e24013 Merge "Add a bug tag for nova doc" 2018-12-05 17:59:27 +00:00
Zuul 5bf6f6304e Merge "Deprecate the nova-xvpvncproxy service" 2018-12-05 13:18:41 +00:00
Zuul e26ac8f24a Merge "Deprecate the nova-console service" 2018-12-05 13:05:06 +00:00
Zuul 3a8dd02c81 Merge "Always read-deleted=yes on lazy-load" 2018-12-05 10:54:45 +00:00
Zuul 4a688c3387 Merge "Remove utils.execute() calls from xenapi." 2018-12-05 03:21:24 +00:00
ZhongShengping ba0502182e Update mailinglist from dev to discuss
openstack-dev was decommissioned this night in https://review.openstack.org/621258
Update openstack-dev to openstack-discuss

Change-Id: If51f5d5eb710e06216f6d6981a70d70b6b5783cc
2018-12-05 09:44:35 +08:00
Zuul 5f648dda49 Merge "Refactor TestEvacuateDeleteServerRestartOriginalCompute" 2018-12-04 07:49:27 +00:00
Michael Still 38343cb1b2 Remove utils.execute() calls from xenapi.
They do not please me.

Change-Id: Ibe2f478288db42f8168b52dfc14d85ab92ace74b
2018-12-04 16:04:30 +11:00
Zuul 3ce9aa0192 Merge "Fix InstanceNotFound during _destroy_evacuated_instances" 2018-12-04 03:34:40 +00:00
Zuul 33c3759b85 Merge "SIGHUP n-cpu to clear provider tree cache" 2018-12-04 02:23:39 +00:00
Michael Still 1e8c2c0dcb Fix sloppy initialization of the new disk ops semaphore.
Some tests weren't calling init_host, so the semaphore was None.
This caused the smoke to come out of nova's tests in ways that
would be less confusing if they'd failed during the testing of
the implementing patch.

Instead, set the semaphore to being unbounded, and then override
that later if the user has in fact specified a limit. This relies
on init_host being called very early, but that should be true
already.

Change-Id: If144be253f78b14cef60200a46aefc02c0e19ced
Closes-Bug: #1806123
2018-12-03 10:19:22 +11:00
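
The fix described above (start unbounded, then override in init_host) might look roughly like the following. This is a minimal sketch, not the actual nova code: the `UnlimitedSemaphore` and `DiskOpsSketch` classes and the `convert_image` method are hypothetical names chosen for illustration.

```python
import threading


class UnlimitedSemaphore:
    """Hypothetical no-op context manager that never blocks."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False


class DiskOpsSketch:
    """Hypothetical holder for the disk-ops semaphore."""

    def __init__(self):
        # Unbounded by default, so code paths (such as tests) that never
        # call init_host() still get a usable context manager, not None.
        self.semaphore = UnlimitedSemaphore()

    def init_host(self, max_concurrent_disk_ops=0):
        # Override the default only when a positive limit is configured.
        if max_concurrent_disk_ops > 0:
            self.semaphore = threading.BoundedSemaphore(max_concurrent_disk_ops)

    def convert_image(self):
        # Stand-in for a disk-intensive operation guarded by the semaphore.
        with self.semaphore:
            return "converted"
```

With this shape, a caller that skips init_host() still works, which is the failure mode the commit message describes.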
Zuul 288c537fcd Merge "Revert "Add regression test for bug 1550919"" 2018-12-01 05:23:43 +00:00
Zuul 3c4018d37d Merge "Fix misuse of assertTrue" 2018-12-01 05:07:18 +00:00
Matt Riedemann 90d16c270a Revert "Add regression test for bug 1550919"
This reverts commit bbe88786fc.

The new tests are racy and causing a modest amount of
failures in the gate since the change merged, so it is
probably best to just revert the tests so they can be
robustified.

Change-Id: I18bd68ba6e59aba4c450eb85e6f4450d7044b1e9
Related-Bug: #1806126
2018-11-30 21:15:33 +00:00
Zuul 8446a1e58d Merge "Add I/O Semaphore to limit concurrent disk ops" 2018-11-30 03:25:18 +00:00
Eric Fried bbc2fcb8fb SIGHUP n-cpu to clear provider tree cache
An earlier change [1] allowed
[compute]resource_provider_association_refresh to be set to zero to
disable the resource tracker's periodic refresh of its local copy of
provider traits and aggregates. To allow for out-of-band changes to
placement (e.g. via the CLI) to be picked up by the resource tracker in
this configuration (or a configuration where the timer is set to a high
value) this change clears the provider tree cache when SIGHUP is sent to
the compute service. The next periodic will repopulate it afresh from
placement.

[1] Iec33e656491848b26686fbf6fb5db4a4c94b9ea8

Change-Id: I65a7ee565ca5b3ec6c33a2fd9e39d461f7d90ed2
2018-11-29 15:42:08 -06:00
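
The behaviour described above can be sketched as a cache that is only repopulated when empty, plus a SIGHUP handler that empties it. This is an illustrative sketch under stated assumptions: `ResourceTrackerSketch`, `refresh`, `handle_sighup` and `install` are hypothetical names, not nova's API, and `signal.SIGHUP` is only available on Unix-like systems.

```python
import signal


class ResourceTrackerSketch:
    """Hypothetical tracker holding a local copy of provider data."""

    def __init__(self):
        self.provider_tree = {}  # local cache; empty means "not loaded"

    def refresh(self, fetch):
        # With periodic refresh disabled, only repopulate an empty cache.
        if not self.provider_tree:
            self.provider_tree = dict(fetch())
        return self.provider_tree

    def handle_sighup(self, signum=None, frame=None):
        # Drop the cache; the next refresh() reloads it from the source.
        self.provider_tree.clear()


def install(tracker):
    # Register the handler for SIGHUP (Unix only).
    signal.signal(signal.SIGHUP, tracker.handle_sighup)
```

After `install(tracker)`, sending SIGHUP to the process clears the cache, so an out-of-band change is picked up on the next `refresh()` call.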
Takashi NATSUME 96b5ef3456 Fix misuse of assertTrue
If the first argument of assertTrue is True,
the assertion always passes.
Fix such calls because they verify nothing.

Change-Id: Ie954fc770c61956a80d472190e97646a39b7420f
Closes-Bug: #1805800
2018-11-29 09:52:19 +00:00
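
The misuse described above can be illustrated with a small unittest case. The test class and the `result` dictionary are made up for illustration; only the standard `unittest` behaviour is relied on, where the second argument of `assertTrue` is the failure message.

```python
import unittest


class AssertTrueMisuse(unittest.TestCase):
    def test_always_passes(self):
        result = {"enabled": False}
        # Misuse: the first argument is the literal True, so this can never
        # fail; the second argument is only used as the failure message.
        self.assertTrue(True, result["enabled"])

    def test_checks_value(self):
        result = {"enabled": True}
        # Correct: the value under test is the first argument.
        self.assertTrue(result["enabled"])
        # Or compare explicitly when an exact value is expected.
        self.assertEqual(True, result["enabled"])
```

Note that test_always_passes succeeds even though the value is False, which is why such assertions are useless.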
Eric Fried 8c318d0fb2 Remove get_node_uuid
get_node_uuid was added in [1] and it was used [2], but that code was
removed in Stein [3].

[1] I982b211e0315bdb9a816f346fafffd0f70e46d07
[2] https://github.com/openstack/nova/blob/76136bfb01076da37351aaf11751dd557cb97ca4/nova/compute/manager.py#L3939
[3] I0851e2d54a1fdc82fe3291fb7e286e790f121e92

Change-Id: I3cd3565b6651677552d8a27c9f7054b0322055fb
2018-11-28 16:24:05 -06:00
Zuul 3b2e42f371 Merge "Give drop_move_claim() correct docstring" 2018-11-28 18:50:50 +00:00
Zuul 62245235bc Merge "Add regression test for bug 1550919" 2018-11-28 00:05:06 +00:00
Dan Smith 604819b29c Always read-deleted=yes on lazy-load
For some reason we were only reading deleted instances when loading generic
fields and not things like flavor. That weird behavior isn't very helpful,
so this makes us always read deleted for that case. Some of the fields, like
tags, will short-circuit that and just immediately lazy-load an empty set.
But for anything else, we should allow reading that data if it's still there.

With this change, we are able to remove a specific read_deleted='yes' usage
from ComputeManager._destroy_evacuated_instances() which is handled with
the generic solution. TestEvacuateDeleteServerRestartOriginalCompute asserts
that the evacuate scenario is still fixed.

Related-Bug: #1794996
Related-Bug: #1745977

Change-Id: I8ec3a3a697e55941ee447d0b52d29785717e4bf0
2018-11-27 12:42:48 -05:00
Matt Riedemann 92dbeae1d4 Refactor TestEvacuateDeleteServerRestartOriginalCompute
This moves _check_allocation_during_evacuate into the
ProviderUsageBaseTestCase base class and drops the
overridden methods from TestEvacuateDeleteServerRestartOriginalCompute.

Change-Id: I6a084031c1d3ffa72b09d2194c44cdd80cc875fa
2018-11-27 12:42:48 -05:00
Matt Riedemann 05cd8d1282 Fix InstanceNotFound during _destroy_evacuated_instances
The _destroy_evacuated_instances method on compute
startup tries to cleanup guests on the hypervisor and
allocations held against that compute node resource
provider by evacuated instances, but doesn't take into
account that those evacuated instances could have been
deleted in the meantime which leads to a lazy-load
InstanceNotFound error that kills the startup of the
compute service.

This change does two things in the _destroy_evacuated_instances
method:

1. Loads the evacuated instances with a read_deleted='yes'
   context when calling _get_instances_on_driver(). This
   should be fine since _get_instances_on_driver() is already
   returning deleted instances anyway (InstanceList.get_by_filters
   defaults to read deleted instances unless the filters tell
   it otherwise - which we don't in this case). This is needed
   so that things like driver.destroy() don't raise
   InstanceNotFound while lazy-loading fields on the instance.

2. Skips the call to remove_allocation_from_compute() if the
   evacuated instance is already deleted. If the instance is
   already deleted, its allocations should have been cleaned
   up by its hosting compute service (or the API).

The functional regression test is updated to show the bug is
now fixed.

Change-Id: I1f4b3540dd453650f94333b36d7504ba164192f7
Closes-Bug: #1794996
2018-11-27 12:42:48 -05:00
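
The second point above (skip allocation removal for instances that are already deleted) can be sketched as follows. This is not the nova implementation: instances are modelled as plain dictionaries, and `cleanup_evacuated`, `destroy_guest` and `remove_allocation` are hypothetical names.

```python
def cleanup_evacuated(instances, destroy_guest, remove_allocation):
    """Clean up evacuated instances, tolerating already-deleted ones."""
    for instance in instances:
        # The guest is removed from the hypervisor in every case.
        destroy_guest(instance)
        if instance.get("deleted"):
            # Allocations of a deleted instance are assumed to have been
            # cleaned up elsewhere, so there is nothing to remove here.
            continue
        remove_allocation(instance)
```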
Artom Lifshitz 3e32e76d83 Give drop_move_claim() correct docstring
Previously, there was just a comment about removing usage on the
destination node. This is incorrect: usage is removed on the compute
host specified by the nodename parameter to the method. This patch
corrects this in a proper docstring.

Change-Id: I2f676966136a78bb9600626852584f838cb08c5b
2018-11-26 19:34:09 -05:00
zhufl 8545ba2af7 Add missing ws separator between words
This is to add missing ws separator between words, usually
in log messages.

Change-Id: I71bf4c5b5be4dbc89a28bf243b7d11cf1d612ab4
2018-11-26 23:42:18 +00:00
Zuul c1de096098 Merge "Add debug logs when doubling-up allocations during scheduling" 2018-11-26 22:14:06 +00:00
Matt Riedemann 5d536b0d3a Remove ironic/pike note from *_allocation_ratio help
The note in the cpu/ram/disk allocation ratio config
option help was referring to commit e7840cdf1 from Pike
when the ironic driver reported allocation_ratio=1.0
for VCPU/MEMORY_MB/DISK_GB resource inventory.

That code was removed in commit a985e34cd so we can
remove the related note from the config option help as
it no longer applies.

Change-Id: Ifd9dba0c24fde25d54761077c1374313019af1d8
2018-11-26 15:56:58 -05:00
Takashi NATSUME 168704349b Add a bug tag for nova doc
Add a default bug tag for nova doc in doc/source/conf.py.
The 'doc' tag(*) should be set by default
on bug reports for documentation bugs.

*: https://wiki.openstack.org/wiki/Nova/BugTriage#Tag_Owner_List

TrivialFix
Change-Id: Ib2de207d368d248464770fd0a9452e325f0a0596
2018-11-26 04:10:44 +00:00
Zuul 594c653dc1 Merge "Add HPET timer support for x86 guests" 2018-11-24 16:50:57 +00:00
Zuul 1a1ea8e2aa Merge "Use long_rpc_timeout in select_destinations RPC call" 2018-11-21 23:51:14 +00:00
Zuul a5d63f7e9e Merge "Make supports_direct_io work on 4096b sector size" 2018-11-21 22:39:33 +00:00
Zuul 1d444704a2 Merge "Default embedded instance.flavor.is_public attribute" 2018-11-21 21:20:00 +00:00
Jack Ding 728f20e8f4 Add I/O Semaphore to limit concurrent disk ops
Introduce an I/O semaphore to limit the number of concurrent
disk-IO-intensive operations. This could reduce disk contention from
image operations like image download, image format conversion, snapshot
extraction, etc.

The new config option max_concurrent_disk_ops can be set in nova.conf
per compute host and would be virt-driver-agnostic. It defaults to 0,
which means no limit.

blueprint: io-semaphore-for-concurrent-disk-ops
Change-Id: I897999e8a4601694213f068367eae9608cdc7bbb
Signed-off-by: Jack Ding <jack.ding@windriver.com>
2018-11-21 15:57:11 -05:00
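
The effect of such a limit can be sketched with a bounded semaphore around simulated disk operations. This is a standalone illustration, not nova code: `run_disk_ops` and `fake_disk_op` are hypothetical, and the sleep stands in for an image download or conversion.

```python
import threading
import time


def run_disk_ops(num_ops, limit):
    """Run num_ops fake disk operations; return the peak number running at once."""
    semaphore = threading.BoundedSemaphore(limit)
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def fake_disk_op():
        # At most `limit` threads can be inside this block at any time.
        with semaphore:
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.01)  # stand-in for disk-intensive work
            with lock:
                state["active"] -= 1

    threads = [threading.Thread(target=fake_disk_op) for _ in range(num_ops)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state["peak"]
```

Calling `run_disk_ops(8, 2)` starts eight workers, but the returned peak never exceeds the limit of two.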
Zuul 7217e38baf Merge "Remove v1 check in Cinder client version lookup" 2018-11-21 05:35:43 +00:00
Zuul 208db51fa1 Merge "Consider root id is None in the database case" 2018-11-21 02:00:01 +00:00
Jack Ding 9e884de68a Add HPET timer support for x86 guests
This commit adds support for the High Precision Event Timer (HPET) for
x86 guests in the libvirt driver. The timer can be set by image property
'hw_time_hpet'. By default it remains turned off. When it is turned on
the HPET timer is activated in libvirt.

If the image property 'hw_time_hpet' is incorrectly set to a
non-boolean, the HPET timer remains turned off.

blueprint: support-hpet-on-guest
Change-Id: I3debf725544cae245fd31a8d97650392965d480a
Signed-off-by: Jack Ding <jack.ding@windriver.com>
2018-11-20 22:39:37 +00:00
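
The "off unless explicitly and validly enabled" rule described above can be sketched as a strict boolean parse of the image property. This is only an illustration: `hpet_enabled` and the accepted string values in `TRUE_STRINGS` are assumptions, not the parsing nova or libvirt actually performs.

```python
# Assumed set of strings treated as "true"; anything else counts as off.
TRUE_STRINGS = {"1", "t", "true", "on", "y", "yes"}


def hpet_enabled(image_properties):
    """Return True only when 'hw_time_hpet' is a recognisable true value."""
    value = image_properties.get("hw_time_hpet", False)
    if isinstance(value, bool):
        return value
    # Non-boolean or unrecognised values leave the timer turned off.
    return str(value).strip().lower() in TRUE_STRINGS
```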
Zuul 47bcc39cd6 Merge "Add CellsV2 FAQ about API design decisions" 2018-11-20 15:55:36 +00:00
Zuul 72978c0758 Merge "Add description of custom resource classes" 2018-11-20 15:55:29 +00:00
Tetsuro Nakamura cdbedac920 Consider root id is None in the database case
There are cases where ``root_provider_id`` of a resource provider is
set to NULL just after it is upgraded to the Rocky release. In such
cases getting allocation candidates raises a KeyError.

This patch fixes that bug for cases where no sharing or nested
providers are in play.

Change-Id: I9639d852078c95de506110f24d3f35e7cf5e361e
Closes-Bug: #1799892
2018-11-20 14:53:59 +00:00
Sean McGinnis 82c5f9b239 Remove v1 check in Cinder client version lookup
The Cinder v1 API was deprecated in Juno and removed completely in
Queens. We do not support compatibility between Stein Nova and Queens
Cinder, so this check can be removed.

Change-Id: I947f50e921159f66b425f10e31a08a3e0840228e
Signed-off-by: Sean McGinnis <sean.mcginnis@gmail.com>
2018-11-20 14:52:12 +00:00
Dan Smith dc7039669f Add CellsV2 FAQ about API design decisions
At the Stein summit (and previous discussions) the topic of exposing
cellsv2 out of the API came up again. This patch adds two FAQ entries
reflecting my notes from early design decisions about why we did not
want to do that, along with more recent examples, such as FFU.

These are my feelings on the subject and I was asked to put these into
FAQ form for posterity to make the discussion easier in the future. I
would recommend that we agree on these and then codify them here.

Change-Id: I0499e141456fcca63f95bad25503c4e86c6aa369
2018-11-20 06:44:59 -08:00
Matt Riedemann 5af632e9ca Use long_rpc_timeout in select_destinations RPC call
Conductor RPC calls the scheduler to get hosts during
server create, which in a multi-create request with a
lot of servers and the default rpc_response_timeout, can
trigger a MessagingTimeout. Due to the old
retry_select_destinations decorator, conductor will retry
the select_destinations RPC call up to max_attempts times,
so thrice by default. This can clobber the scheduler and
placement while the initial scheduler worker is still
trying to process the beefy request and allocate resources
in placement.

This has been recreated in a devstack test patch [1] and
shown to fail with 1000 instances in a single request with
the default rpc_response_timeout of 60 seconds. Changing the
rpc_response_timeout to 300 avoids the MessagingTimeout and
retry loop.

Since Rocky we have the long_rpc_timeout config option which
defaults to 1800 seconds. The RPC client can thus be changed
to heartbeat the scheduler service during the RPC call every
$rpc_response_timeout seconds with a hard timeout of
$long_rpc_timeout. That change is made here.

As a result, the problematic retry_select_destinations
decorator is also no longer necessary and removed here. That
decorator was added in I2b891bf6d0a3d8f45fd98ca54a665ae78eab78b3
and was a hack for scheduler high availability where a
MessagingTimeout was assumed to be a result of the scheduler
service dying so retrying the request was reasonable to hit
another scheduler worker, but is clearly not sufficient
in the large multi-create case, and long_rpc_timeout is a
better fit for that HA type scenario to heartbeat the scheduler
service.

[1] https://review.openstack.org/507918/

Change-Id: I87d89967bbc5fbf59cf44d9a63eb6e9d477ac1f3
Closes-Bug: #1795992
2018-11-20 09:03:53 -05:00
Zuul ea26392239 Merge "Nix refs to ResourceProvider obj from libvirt UT" 2018-11-20 11:29:56 +00:00
Zuul ab78eb2c79 Merge "Fix server query examples" 2018-11-20 04:57:11 +00:00
Takashi NATSUME 54d3745101 Fix server query examples
The 'locked' query parameter is not supported
in the "List Servers Detailed" API.
So replace examples using the 'locked' query parameter
with examples using other query parameters.

Change-Id: Ibcea6147dd6716ad544e7ac5fa0df17f8c397a28
Closes-Bug: #1801904
2018-11-19 23:22:39 +00:00