56451 Commits

Author SHA1 Message Date
Brian Rosmaita bc29084012 Absolutely-non-inheritable image properties
Inheritance of image properties from the image an instance was booted
from to an image created from that instance is governed by the
non_inheritable_image_properties configuration option.  However, there
are some image properties (for example, those used for image signature
validation or to reference a cinder encryption key id) which it makes
no sense to inherit under any circumstances.  Additionally,
misconfiguration of the non-inheritable properties can lead to data
loss under the circumstances described in Bug #1852106.  So it would
be better if these properties were not subject to configuration.

The initial set of absolutely non-inheritable image properties
consists of those associated with cinder encryption keys and image
signature validation.
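
A minimal sketch of the idea, not Nova's actual implementation: a hardcoded set of properties is always excluded, and the operator-configured list can only add to it. The function name, the constant name, and the sample property names are assumptions made for illustration.

```python
# Hypothetical names throughout; this illustrates the idea only.
ABSOLUTELY_NON_INHERITABLE = frozenset({
    "cinder_encryption_key_id",
    "img_signature",
})


def inheritable_properties(image_props, configured_non_inheritable):
    """Return the properties a snapshot may inherit from its base image."""
    # The hardcoded set applies no matter what the operator configured.
    excluded = ABSOLUTELY_NON_INHERITABLE | set(configured_non_inheritable)
    return {k: v for k, v in image_props.items() if k not in excluded}


props = {
    "os_type": "linux",
    "cinder_encryption_key_id": "abc",
    "img_signature": "sig",
}
# Even with an empty configured list, the hardcoded properties are dropped.
result = inheritable_properties(props, configured_non_inheritable=[])
```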

Change-Id: I4332b9c343b6c2b50226baa8f78396c2012dabd1
Closes-bug: #1852106
2020-02-17 10:13:54 -05:00
Zuul 1fcd74730d Merge "Fix instance.hidden migration and querying" 2020-02-08 17:02:17 +00:00
Dan Smith 001f3a7bfe Fix instance.hidden migration and querying
It was discovered that default= on a Column definition in a schema migration
will attempt to update the table with the provided value, instead of just
translating on read, which is often the assumption. The Instance.hidden=False
change introduced in Train[1] used such a default on the new column, which caused
at least one real-world deployment to time out rewriting the instances table
due to size. Apparently SQLAlchemy-migrate also does not consider such a timeout
to be a failure and proceeds on. The end result is that some existing instances
in the database have hidden=NULL values, and the DB model layer does not convert
those to hidden=False when we read/query them, causing those instances to be
excluded from the API list view.

This change alters the 399 schema migration to remove the default=False
specification. This does not actually change the schema, but /will/ prevent
users who have not yet upgraded to Train from rewriting the table.

This change also makes the instance_get_all_by_filters() code handle hidden
specially, including false and NULL in a query for non-hidden instances.

A future change should add a developer trap test to ensure that future migrations
do not add default= values to new columns to avoid this situation in the future.
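
The query-side problem can be reproduced with a toy table. This is a sketch using the standard library's sqlite3 module and a made-up three-row table, not Nova's schema or its SQLAlchemy query code: a plain equality filter drops NULL rows, while a NULL-aware filter keeps them.

```python
import sqlite3

# Toy stand-in for the instances table (assumption: not the real schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, hidden BOOLEAN)")
conn.executemany(
    "INSERT INTO instances (id, hidden) VALUES (?, ?)",
    [(1, 0), (2, 1), (3, None)],  # row 3 was never rewritten by the migration
)

# 'hidden = 0' evaluates to NULL for row 3, so that row is filtered out.
naive = [row[0] for row in conn.execute(
    "SELECT id FROM instances WHERE hidden = 0 ORDER BY id")]

# Treating NULL like false returns every non-hidden instance.
null_aware = [row[0] for row in conn.execute(
    "SELECT id FROM instances WHERE hidden = 0 OR hidden IS NULL ORDER BY id")]
```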

[1] Iaffb27bd8c562ba120047c04bb62619c0864f594

Change-Id: Iace3f653b42c20887b40ee0105c8e9a4edeff1f7
Closes-Bug: #1862205
2020-02-07 08:54:56 -08:00
Zuul 69ce0f01b6 Merge "nova-net: Update API reference guide" 2020-02-06 17:12:25 +00:00
Zuul 38320397b0 Merge "Don't error out on floating IPs without associated ports" 2020-02-06 17:12:20 +00:00
Lee Yarwood afebcdc950 libvirt: Rename _is_storage_shared_with to _is_path_shared_with
This method only checks if a specific path is shared between two hosts
and has been renamed accordingly to avoid confusion.

Additionally the shared_storage variable used to store the returned
value from this method within migrate_disk_and_power_off is renamed to
shared_instance_path.

Change-Id: I426de20864321d664d3fe0e08a14e1af509c8a2b
2020-02-06 11:09:00 +00:00
Stephen Finucane aec3ca0765 Don't error out on floating IPs without associated ports
Floating IPs don't have to have an associated port, so there's no reason
to error out when this is the case.

Change-Id: I50c79843bf819b731c597dbfe72090cdf02c7641
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Closes-bug: #1861876
2020-02-06 10:04:24 +00:00
Zuul 014c1ab864 Merge "Avoid calling neutron for N networks" 2020-02-06 06:50:58 +00:00
Zuul 2d7871792e Merge "Revert "nova shared storage: rbd is always shared storage"" 2020-02-06 03:00:27 +00:00
Zuul de3f76956b Merge "Handle neutron without the fip-port-details extension" 2020-02-05 21:35:14 +00:00
Zuul 8f9d3c1646 Merge "nova-net: Remove now unnecessary nova-net workaround" 2020-02-05 18:38:40 +00:00
Zuul 097c31d4b4 Merge "nova-net: Remove use of legacy 'SecurityGroup' object" 2020-02-05 12:25:16 +00:00
Zuul dc50c6a8d6 Merge "Minor improvements to cell commands" 2020-02-05 10:36:59 +00:00
Zuul 64cf40c955 Merge "nova-net: Remove use of legacy 'Network' object" 2020-02-04 22:42:49 +00:00
Vladyslav Drok 9f65599892 Minor improvements to cell commands
It adds a bit more verbosity to cell update command, so that it
is more obvious for an operator which values are being used as
transport URL or database connection URL for a cell.

Change-Id: Ie567ae7da4508a4b6f797d4bc77347c84702a74e
2020-02-04 19:09:29 +01:00
Stephen Finucane 3e79cb7577 Avoid calling neutron for N networks
In the worst case scenario, we could list N floating IPs, each of which
has a different network. This would result in N additional calls to
neutron - one for each of the networks. Avoid this by calling neutron
once for all networks associated with the floating IPs.
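
A rough sketch of the batching pattern, assuming a hypothetical client with a list_networks(ids) method; FakeNeutron and networks_for_floating_ips are illustrative names, not the neutron client API or Nova's code.

```python
class FakeNeutron:
    """Hypothetical stand-in client that counts how often it is called."""

    def __init__(self, networks):
        self._networks = networks
        self.calls = 0

    def list_networks(self, ids):
        self.calls += 1
        return [self._networks[i] for i in ids if i in self._networks]


def networks_for_floating_ips(client, floating_ips):
    """Look up all networks referenced by the floating IPs in one call."""
    # Deduplicate first so each network is requested only once.
    network_ids = sorted({fip["floating_network_id"] for fip in floating_ips})
    networks = client.list_networks(network_ids)
    return {net["id"]: net for net in networks}


client = FakeNeutron({"n1": {"id": "n1"}, "n2": {"id": "n2"}})
fips = [
    {"id": "f1", "floating_network_id": "n1"},
    {"id": "f2", "floating_network_id": "n2"},
    {"id": "f3", "floating_network_id": "n1"},
]
by_id = networks_for_floating_ips(client, fips)
```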

Change-Id: If067a730b0fcbe3f59f4472b00c690cc43be4b3b
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2020-02-04 17:37:06 +00:00
Stephen Finucane eef658bf53 Handle neutron without the fip-port-details extension
The 'fip-port-details' API extension was added to neutron in Rocky [1]
and is optional. As a result, we cannot rely on the 'port_details' field
being present in API responses. If it is not, we need to make a second
query for all ports and build 'port_details' using the 'port_id' field.
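
The fallback could look roughly like the following. It is a sketch under assumptions: ensure_port_details and the list_ports callable are hypothetical, and floating IPs and ports are modelled as plain dicts.

```python
def ensure_port_details(floating_ips, list_ports):
    """Fill in 'port_details' for floating IPs that lack the field.

    list_ports is a hypothetical callable returning all ports as dicts
    that each carry an 'id' key.
    """
    if all("port_details" in fip for fip in floating_ips):
        return floating_ips  # extension is available; nothing to do

    # One extra query for all ports, indexed by ID for the lookup below.
    ports_by_id = {port["id"]: port for port in list_ports()}
    for fip in floating_ips:
        # An unassociated floating IP has no port, so it maps to None.
        fip.setdefault("port_details", ports_by_id.get(fip.get("port_id")))
    return floating_ips


fips = [{"id": "f1", "port_id": "p1"}, {"id": "f2", "port_id": None}]
filled = ensure_port_details(fips, lambda: [{"id": "p1", "status": "ACTIVE"}])
```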

[1] https://docs.openstack.org/releasenotes/neutron-lib/rocky.html#relnotes-1-14-0-stable-rocky-new-features

Change-Id: Ifb96f31f471cc0a25c1dfce2161a669b97a384ae
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Closes-bug: #1861876
2020-02-04 17:36:24 +00:00
Zuul b42c54752f Merge "Avoid fetching metadata when no subnets found" 2020-02-04 12:58:47 +00:00
Zuul 1c368e30ce Merge "Enable live migration with qos ports" 2020-02-04 12:58:41 +00:00
Zuul 7601efa5e3 Merge "nova-net: Remove use of legacy 'FloatingIP' object" 2020-02-03 20:47:38 +00:00
Zuul 1d9a131707 Merge "Use common server create function for qos func tests" 2020-02-03 19:55:40 +00:00
Zuul 8b109a262e Merge "Remove extra instance.save() calls related to qos SRIOV ports" 2020-02-03 19:55:35 +00:00
Balazs Gibizer 64cdb82b99 Enable live migration with qos ports
Previous patches in the blueprint implemented the support for live
migration with qos ports and added functional test coverage for the
various live migration scenarios. So this patch removes the API check
that rejected such operations and documents the new feature.

Change-Id: Ib9ef18fff28c463c9ffe3607d93428b689dc89fb
blueprint: support-move-ops-with-qos-ports-ussuri
2020-02-03 11:43:12 +01:00
Balazs Gibizer 52a03b195e Use common server create function for qos func tests
I1222fc21bde4158df1db70370c7f3bd319ec081f added a common helper for
server creation. This patch updates the existing qos tests to use that
helper.

Change-Id: I017163c6cdf8727be9913a6870cd91fec5f4d568
blueprint: support-move-ops-with-qos-ports-ussuri
2020-02-03 11:43:12 +01:00
Balazs Gibizer 56f29b3e4a Remove extra instance.save() calls related to qos SRIOV ports
When creating or moving an instance with a qos SRIOV port, the PCI
device claim on the destination compute needs to be restricted to select
PCI VFs from the same PF where the bandwidth for the qos port is
allocated from. This is achieved by updating the spec part of the
InstancePCIRequest with the device name of the PF by calling
update_pci_request_spec_with_allocated_interface_name(). Until now
such update of the instance object was directly persisted by the call.

During code review it came up that the instance.save() in the util
is not appropriate as the caller has a lot more context to decide when
to persist the changes.

The original eager instance.save was introduced when support was added to
the server create flow. Now I realized that the need for such a save was
due to a mistake in the original ResourceTracker.instance_claim() call
that loads the InstancePCIRequest from the DB instead of using the
requests through the passed in instance object. By removing the extra DB
call the need for eagerly persisting the PCI spec update is also
removed. It turned out that both the server create code path and every
server move code path eventually persist the instance object, either
at the end of the claim process or, in the case of live migration, in
the post_live_migration_at_destination compute manager call. This means
that the code can now be simplified, especially the live migration cases.

In the live migrate abort case we don't need to roll back the eagerly
persisted PCI change as now such change is only persisted at the end
of the migration but still we need to refresh pci_requests field of
the instance object during the rollback as that field might be stale,
containing dest host related PCI information.

Also, in the case of rescheduling during live migration, if the rescheduling
failed the PCI change needed to be rolled back to the source host by
dedicated code. But now those changes are never persisted until the
migration finishes, so this rollback code can be removed too.

Change-Id: Ied8f96b4e67f79498519931cb6b35dad5288bbb8
blueprint: support-move-ops-with-qos-ports-ussuri
2020-02-03 11:41:38 +01:00
Stephen Finucane 4bdecee385 docs: Fix the monkeypatching of blockdiag
blockdiag has a longstanding bug whereby it tries to access an attribute
on an 'io.BufferedReader' that doesn't exist. We had previously fixed
this in change Ibd32d30aacae65702d0ccbdb8a02b1667ec4e8ee, which undid
the damage blockdiag was doing. However, this worked because the monkey
patching that blockdiag does happens when the 'blockdiag.utils.compat' module
is loaded [1], which was happening implicitly with our import of
'blockdiag.imagedraw.textfolder' [2]. However, that module no longer
imports the 'compat' [3] module so this doesn't work. Fix the issue by
just importing the 'compat' module manually, triggering the monkey
patching...which we can then undo.

[1] https://github.com/blockdiag/blockdiag/blob/2.0.0/src/blockdiag/utils/compat.py#L19-L26
[2] https://github.com/blockdiag/blockdiag/blob/1.5.4/src/blockdiag/imagedraw/textfolder.py#L18
[3] https://github.com/blockdiag/blockdiag/tree/2.0.0/src/blockdiag/imagedraw/textfolder.py

Change-Id: Idacfff98842fde38fb39791090f2da3310b441b5
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2020-01-31 17:52:54 +00:00
Zuul c16315165c Merge "Use COMPUTE_SAME_HOST_COLD_MIGRATE trait during migrate" 2020-01-30 19:22:12 +00:00
Zuul b8f4e46939 Merge "zuul: Remove unnecessary 'USE_PYTHON3'" 2020-01-30 17:09:07 +00:00
Matt Riedemann 4921e822e7 Use COMPUTE_SAME_HOST_COLD_MIGRATE trait during migrate
This uses the COMPUTE_SAME_HOST_COLD_MIGRATE trait in the API during a
cold migration to filter out hosts that cannot support same-host cold
migration, which is all of them except for the hosts using the vCenter
driver.

For any nodes that do not report the trait, we won't know if they don't
because they don't support it or if they are not new enough to report
it, so the API has a service version check and will fall back to old
behavior using the config if the node is old. That compat code can be
removed in the next release.
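
The decision described above can be sketched as follows. The function name, the version constant, and its value are hypothetical; only the trait name comes from this change.

```python
# Hypothetical service version from which nodes report the trait.
MIN_VERSION_REPORTING_TRAIT = 2


def allows_same_host_cold_migrate(node_traits, service_version,
                                  config_allows_same_host):
    """Decide whether a node may be the target of a same-host cold migration."""
    if service_version >= MIN_VERSION_REPORTING_TRAIT:
        # New enough to report the trait, so its absence is meaningful.
        return "COMPUTE_SAME_HOST_COLD_MIGRATE" in node_traits
    # Old node: a missing trait tells us nothing, so use the config value.
    return config_allows_same_host


new_with_trait = allows_same_host_cold_migrate(
    {"COMPUTE_SAME_HOST_COLD_MIGRATE"}, 2, False)
new_without_trait = allows_same_host_cold_migrate(set(), 2, True)
old_node = allows_same_host_cold_migrate(set(), 1, True)
```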

As a result of this the FakeDriver capabilities are updated so the
FakeDriver no longer supports same-host cold migration and a new fake
driver is added to support that scenario for any tests that need it.

Change-Id: I7a4b951f3ab324c666ab924e6003d24cc8e539f5
Closes-Bug: #1748697
Related-Bug: #1811235
2020-01-29 09:44:47 +00:00
Zuul 9fa3600fca Merge "doc: define boot from volume in the glossary" 2020-01-29 03:52:50 +00:00
Zuul 838b364a6e Merge "Handle cell failures in get_compute_nodes_by_host_or_node" 2020-01-29 03:52:43 +00:00
Kobi Samoray 3177371568 Avoid fetching metadata when no subnets found
Metadata service uses the provider id to identify the requesting
instance.
However, when provider query doesn't find any networks, the request
should fail.
The same goes for the case where multiple ports are found.

Closes-Bug: #1841933
Change-Id: I8ce3703ca86a3a0769edd42a790d82796d1071d7
2020-01-28 14:35:52 +02:00
Zuul 80539a5e84 Merge "nova-net: Remove remaining nova-network quotas" 2020-01-27 12:33:15 +00:00
Zuul 575d988a2c Merge "Fix typos for update_available_resource reference" 2020-01-25 19:21:44 +00:00
Zuul 6db486e9fd Merge "libvirt: Add a default VirtIO-RNG device to guests" 2020-01-23 18:03:15 +00:00
Zuul 28639ffa22 Merge "Remove remaining Python 2.7-only dependencies" 2020-01-23 17:12:36 +00:00
Kashyap Chamarthy de512f2c02 libvirt: Add a default VirtIO-RNG device to guests
tl;dr: We're adding the default VirtIO-RNG device to ensure guests are
       not starved of entropy (and thus not hang) during boot time.

Background
----------

From Nova Git history, commit b94550f419 ("libvirt: configuration
element for a random number generator device") _did_ add a default RNG
device (but with its entropy source set to the undesirable '/dev/random').
However, the default RNG device was immediately removed in another
commit (605677c -- "libvirt: remove explicit /dev/random rng default"),
with this rationale:

    libvirt (or rather qemu) will default to /dev/random if no rng device
    path is specified [...]

    It's preferable for us to not duplicate this default to allow for a
    future where libvirt or the hypervisor needs to make more intelligent
    decisions about the default device to use.

The above reasoning doesn't hold up, because:

(a) libvirt does not make "policy" decisions, such as choosing an
    entropy source (or any other such).  Therefore Nova, as a management
    application, should make the decision here.

(b) More importantly, when QEMU exposes a VirtIO-RNG device to the
    guest, that device needs a source of entropy; and QEMU by default
    uses the legacy and problematic `/dev/random` as the source —
    instead of the preferred `/dev/urandom`.  So QEMU's default for
    VirtIO-RNG devices is not sufficient, and Nova should not rely on
    it.  (Discussion[+] on 'qemu-devel' list to consider changing QEMU's
    default.)

                    * * *

In this patch:

  - Make Nova configure a VirtIO-RNG device by default for guests.
    (Which will be using `/dev/urandom` as the default entropy source.)
    This will also work for Windows guests, when using VirtIO-Win
    drivers[*] on the Linux host.

  - The 'hw_rng_model' image metadata property is now rendered
    (temporarily) useless -- as it's not used anywhere outside the
    _add_rng_device() method.  But we don't want to deprecate it yet, as
    we may extend it (see code comment for details); document that.
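
For illustration, the kind of libvirt guest device element this implies can be built with the standard library. This is a sketch, not Nova's config classes: build_rng_device is a hypothetical helper, and the element layout follows the libvirt domain XML format for RNG devices.

```python
import xml.etree.ElementTree as ET


def build_rng_device(entropy_source="/dev/urandom"):
    """Return a libvirt <rng> device element as an XML string."""
    rng = ET.Element("rng", {"model": "virtio"})
    # The 'random' backend reads entropy from a host character device.
    backend = ET.SubElement(rng, "backend", {"model": "random"})
    backend.text = entropy_source
    return ET.tostring(rng, encoding="unicode")


rng_xml = build_rng_device()
```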

[*] https://docs.pagure.org/docs-fedora/create-windows-vms-using-virtio.html
[+] https://lists.nongnu.org/archive/html/qemu-devel/2018-09/msg02724.html
    -- "[RFC] Virtio RNG: Consider changing the default entropy source to
    /dev/urandom?"

Closes-Bug: #1789868

Change-Id: I28e66c9640c38d23b8c0dbd0b05f5260bfcf6d30
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
2020-01-23 13:24:52 +01:00
Stephen Finucane dafbe3503a Remove remaining Python 2.7-only dependencies
Change-Id: I7c50e73c02e710a357eca51b0e156e2bd2f3ec1d
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2020-01-23 09:27:23 +00:00
Zuul 1e63f29b74 Merge "Remove unused code" 2020-01-23 08:31:34 +00:00
Zuul 4be39f48f6 Merge "Func test for failed and aborted live migration" 2020-01-23 08:31:29 +00:00
Stephen Finucane 0b1a33ec9c nova-net: Update API reference guide
As highlighted in I77b1cfeab3c00c6c3d7743ba51e12414806b71d2, filtering
either floating IPs or floating IP pools by floating IP name will
actually fall back to filtering by ID. Update the API ref to reflect
this.

Change-Id: I00443ae111cbd1e1ec4d2c2ae1828ddaa095fd1a
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2020-01-22 13:50:30 +00:00
Zuul 058e77e26c Merge "functional: Add '_create_server' helper" 2020-01-21 19:37:41 +00:00
Zuul 23858ce72d Merge "Add ironic hypervisor doc" 2020-01-21 19:37:35 +00:00
Zuul 8e779d9221 Merge "functional: Stop setting Flavor.id" 2020-01-21 15:24:00 +00:00
Balazs Gibizer 4eafc9d5b1 Func test for failed and aborted live migration
The failed case covers the situation when the InstancePCIRequest cannot be
updated with the PF device names on the target host due to incorrect
naming of the device RPs in placement.

The abort case covers when the API user cancels the running live
migration.

Change-Id: I1222fc21bde4158df1db70370c7f3bd319ec081f
blueprint: support-move-ops-with-qos-ports-ussuri
2020-01-21 14:12:11 +01:00
Stephen Finucane 765e4e52bf functional: Stop setting Flavor.id
In change I475ea0fa5f2d5b197118f0ced5a0ff6907411972, we changed how we
generated flavor names and IDs to stop basing them on existing flavors.
However, the call to 'randint(10000)' that we used instead has proven
problematic since there is a high chance of collisions that will only
increase as the number of tests increases. We could switch back to the
previous scheme but that's unnecessary since there's actually no reason
we need to set 'Flavor.id' in the first place...so don't.
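
To put a number on the collision risk, here is a back-of-the-envelope calculation. It assumes IDs are drawn uniformly and independently from 10000 possible values, which is an assumption about the old helper rather than something stated above; collision_probability is a hypothetical name.

```python
def collision_probability(num_draws, pool_size=10000):
    """Probability that at least two of num_draws random IDs are equal."""
    p_unique = 1.0
    for i in range(num_draws):
        # The i-th draw must avoid the i values already taken.
        p_unique *= (pool_size - i) / pool_size
    return 1.0 - p_unique


p_100 = collision_probability(100)
p_500 = collision_probability(500)
```

Under that assumption, 100 flavors already collide in roughly four runs out of ten, and with 500 flavors a collision is all but certain.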

A now-unused remnant of the "old way" is also removed, since it was
spotted while fixing this.

Change-Id: Iab6245bc5ed8f95dae9c384b96e6bef0add7dca6
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Closes-bug: #1860417
2020-01-21 09:49:54 +00:00
Zuul a948a803b5 Merge "nova-net: Remove layer of indirection in 'nova.network'" 2020-01-20 19:17:02 +00:00
Stephen Finucane bce30de28a Remove unused code
This was missed in change Ie01ab1c3a1219f1d123f0ecedc66a00dfb2eb2c1.

Change-Id: I7c56d5ecc1131c20324f41c83dc97f41e4628d4d
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2020-01-20 14:44:54 +00:00
Stephen Finucane 5b36d8c054 functional: Add '_create_server' helper
This is a *very* common pattern. Centralize it. Only a few users are
added for now, but we can migrate more later (it's rather tedious work).

Change-Id: I84c58de90dad6d86271767363aef90ddac0f1730
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2020-01-20 14:32:43 +00:00
Zuul 824bc358c2 Merge "Set instance CPU policy to 'share' through image property" 2020-01-20 13:49:08 +00:00