Commit Graph

56467 Commits

Author SHA1 Message Date
Zuul a3d4ebd3c9 Merge "tests: Validate huge pages" 2020-02-19 13:58:37 +00:00
Zuul a7cc98e997 Merge "Absolutely-non-inheritable image properties" 2020-02-19 13:58:29 +00:00
Zuul 157daff9e4 Merge "Reject boot request for unsupported images" 2020-02-19 04:15:20 +00:00
Zuul 4f69d8e6c7 Merge "Add JSON schema and test for network_data.json" 2020-02-18 18:57:16 +00:00
Zuul 4de74604ce Merge "set default value to 0 instead of ''" 2020-02-18 16:19:30 +00:00
Zuul e69dbfa0d3 Merge "Recalculate 'RequestSpec.numa_topology' on resize" 2020-02-17 17:59:36 +00:00
Brian Rosmaita 963fd8c0f9 Reject boot request for unsupported images
Nova has never supported direct booting of an image of an encrypted
volume uploaded to Glance via the Cinder upload-volume-to-image
process, but instead of rejecting such a request, an 'active' but
unusable instance is created.  This patch allows Nova to use image
metadata to detect such an image and reject the boot request.

Change-Id: Idf84ccff254d26fa13473fe9741ddac21cbcf321
Related-bug: #1852106
Closes-bug: #1863611
2020-02-17 10:20:57 -05:00
Brian Rosmaita bc29084012 Absolutely-non-inheritable image properties
Inheritance of image properties from the image an instance was booted
from to an image created from that instance is governed by the
non_inheritable_image_properties configuration option.  However, there
are some image properties (for example, those used for image signature
validation or to reference a cinder encryption key id) which it makes
no sense to inherit under any circumstances.  Additionally,
misconfiguration of the non-inheritable properties can lead to data
loss under the circumstances described in Bug #1852106.  So it would
be better if these properties were not subject to configuration.

The initial set of absolutely non-inheritable image properties
consists of those associated with cinder encryption keys and image
signature validation.

Change-Id: I4332b9c343b6c2b50226baa8f78396c2012dabd1
Closes-bug: #1852106
2020-02-17 10:13:54 -05:00
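The idea in "Absolutely-non-inheritable image properties" can be sketched as a fixed block list that is always applied on top of whatever the operator configures. This is a minimal illustration, not Nova's code: the helper name and the two property keys are assumptions chosen for the example.

```python
# Hypothetical sketch: property keys and function name are illustrative
# assumptions, not taken from the Nova source.
ABSOLUTELY_NON_INHERITABLE = frozenset({
    "cinder_encryption_key_id",  # assumed key name for a cinder encryption key
    "img_signature",             # assumed key name for signature validation
})


def inheritable_properties(image_props, configured_non_inheritable=()):
    """Return the properties a snapshot image may inherit.

    The fixed set is always excluded, regardless of configuration, so a
    misconfigured option cannot re-enable inheritance of these keys.
    """
    blocked = ABSOLUTELY_NON_INHERITABLE | set(configured_non_inheritable)
    return {k: v for k, v in image_props.items() if k not in blocked}
```

Even with an empty configured list, the fixed keys are dropped, which is the point of taking them out of the operator's hands.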
Ilya Etingof 69ee625a66 Add JSON schema and test for network_data.json
Added JSON schema defining `network_data.json` contents and
beefed up the MetadataTest functional test cases to use a
real instance instead of a database shell. This way the
tests see real data in the metadata service like a real
network_data.json.

Besides internal Nova consumption, this schema might be
helpful to other tools (such as ironic or Glean) to
validate human-generated `network_data.json` prior to
using it.

Co-Authored-By: Balazs Gibizer <balazs.gibizer@est.tech>
Change-Id: Ie5a5a1fc81c7c2d3f61b72d19de464cfc9dab5ec
2020-02-17 15:35:24 +01:00
Kobi Samoray bfb8dcded6 Support large network queries towards neutron
While querying metadata from an LB source, the subnet query may return
a large number of networks.
That might exceed the maximum request length on the subsequent port query
towards Neutron.
This patch addresses that issue.

Closes-Bug: #1861087
Change-Id: I9d72c80574d08d8409ed0dcc0476f52a0d173a1e
2020-02-16 23:38:48 -08:00
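One common way to keep a request under a length limit is to split the ID list into chunks and issue one query per chunk. The sketch below shows that technique only; the commit message does not say how the patch solves the problem, and `list_ports`, `chunk_size`, and the `network_id` keyword are assumptions made for the example.

```python
def chunked(ids, size):
    """Yield consecutive slices of ids, each at most size long."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def list_ports_for_networks(list_ports, network_ids, chunk_size=100):
    """Query ports for many networks without building one huge request.

    list_ports is a hypothetical callable standing in for a client call
    that accepts a list of network IDs and returns a list of ports.
    """
    ports = []
    for chunk in chunked(list(network_ids), chunk_size):
        ports.extend(list_ports(network_id=chunk))
    return ports
```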
Zuul b9ac4e4905 Merge "Skip to run all integration jobs for policy-only changes." 2020-02-14 17:19:15 +00:00
Zuul 0d3aeb0287 Merge "Make RBD imagebackend flatten method idempotent" 2020-02-13 14:25:10 +00:00
Ghanshyam Mann db39391fe0 Skip to run all integration jobs for policy-only changes.
Currently we run all the integration jobs for policy-only
changes, which is not required. Policy-only changes
can be covered by unit and functional jobs plus a single tempest job
and a grenade job.

Change-Id: I4b2d321b7243ec149e9445035d1feb7a425e9a4b
2020-02-12 16:33:17 +00:00
jichenjc 560987f920 set default value to 0 instead of ''
The z/VM driver returned '' for hypervisor_version, but the
field is defined as fields.IntegerField(), which
means its value should be an int by default.

Closes-Bug: 1862750
Change-Id: Ib4f2ecbbb731943eda996d525ddaafd2260fd1a3
2020-02-11 10:01:37 +00:00
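To see why an empty string is a poor default for an integer field, consider a field that coerces its value with `int()`. This is a simplified stand-in, assuming int-based coercion; it is not the real `fields.IntegerField` implementation.

```python
def coerce_integer(value):
    """Coerce a value the way a strict integer field might: via int().

    Simplified illustration only; int('') raises ValueError, while 0 is
    accepted unchanged.
    """
    return int(value)
```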
Zuul 1fcd74730d Merge "Fix instance.hidden migration and querying" 2020-02-08 17:02:17 +00:00
Dan Smith 001f3a7bfe Fix instance.hidden migration and querying
It was discovered that default= on a Column definition in a schema migration
will attempt to update the table with the provided value, instead of just
translating on read, which is often the assumption. The Instance.hidden=False
change introduced in Train[1] used such a default on the new column, which caused
at least one real-world deployment to time out rewriting the instances table
due to size. Apparently SQLAlchemy-migrate also does not consider such a timeout
to be a failure and proceeds on. The end result is that some existing instances
in the database have hidden=NULL values, and the DB model layer does not convert
those to hidden=False when we read/query them, causing those instances to be
excluded from the API list view.

This change alters the 399 schema migration to remove the default=False
specification. This does not actually change the schema, but /will/ prevent
users who have not yet upgraded to Train from rewriting the table.

This change also makes the instance_get_all_by_filters() code handle hidden
specially, including false and NULL in a query for non-hidden instances.

A future change should add a developer trap test to ensure that future migrations
do not add default= values to new columns to avoid this situation in the future.

[1] Iaffb27bd8c562ba120047c04bb62619c0864f594

Change-Id: Iace3f653b42c20887b40ee0105c8e9a4edeff1f7
Closes-Bug: #1862205
2020-02-07 08:54:56 -08:00
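The querying half of "Fix instance.hidden migration and querying" comes down to treating NULL as false when filtering for non-hidden rows. The sketch below reproduces the effect with the standard-library `sqlite3` module; the table layout and the `list_instances` helper are simplified assumptions, not Nova's schema or its `instance_get_all_by_filters()` code.

```python
import sqlite3

# In-memory table with one explicit False, one NULL (a row the migration
# never rewrote), and one True.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instances (uuid TEXT, hidden INTEGER)")
conn.executemany(
    "INSERT INTO instances VALUES (?, ?)",
    [("a", 0), ("b", None), ("c", 1)],
)


def list_instances(conn, hidden=False):
    """List instance UUIDs, counting hidden=NULL as not hidden."""
    if hidden:
        sql = "SELECT uuid FROM instances WHERE hidden = 1 ORDER BY uuid"
    else:
        # A plain 'hidden = 0' would silently drop the NULL row.
        sql = ("SELECT uuid FROM instances "
               "WHERE hidden = 0 OR hidden IS NULL ORDER BY uuid")
    return [row[0] for row in conn.execute(sql)]
```

With a plain `hidden = 0` filter, instance "b" would be missing from the list, which matches the symptom described in the commit message.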
Zuul 69ce0f01b6 Merge "nova-net: Update API reference guide" 2020-02-06 17:12:25 +00:00
Zuul 38320397b0 Merge "Don't error out on floating IPs without associated ports" 2020-02-06 17:12:20 +00:00
Lee Yarwood afebcdc950 libvirt: Rename _is_storage_shared_with to _is_path_shared_with
This method only checks if a specific path is shared between two hosts
and has been renamed accordingly to avoid confusion.

Additionally the shared_storage variable used to store the returned
value from this method within migrate_disk_and_power_off is renamed to
shared_instance_path.

Change-Id: I426de20864321d664d3fe0e08a14e1af509c8a2b
2020-02-06 11:09:00 +00:00
Stephen Finucane aec3ca0765 Don't error out on floating IPs without associated ports
Floating IPs don't have to have an associated port, so there's no reason
to error out when this is the case.

Change-Id: I50c79843bf819b731c597dbfe72090cdf02c7641
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Closes-bug: #1861876
2020-02-06 10:04:24 +00:00
Zuul 014c1ab864 Merge "Avoid calling neutron for N networks" 2020-02-06 06:50:58 +00:00
Zuul 2d7871792e Merge "Revert "nova shared storage: rbd is always shared storage"" 2020-02-06 03:00:27 +00:00
Zuul de3f76956b Merge "Handle neutron without the fip-port-details extension" 2020-02-05 21:35:14 +00:00
Zuul 8f9d3c1646 Merge "nova-net: Remove now unnecessary nova-net workaround" 2020-02-05 18:38:40 +00:00
Zuul 097c31d4b4 Merge "nova-net: Remove use of legacy 'SecurityGroup' object" 2020-02-05 12:25:16 +00:00
Zuul dc50c6a8d6 Merge "Minor improvements to cell commands" 2020-02-05 10:36:59 +00:00
Zuul 64cf40c955 Merge "nova-net: Remove use of legacy 'Network' object" 2020-02-04 22:42:49 +00:00
Vladyslav Drok 9f65599892 Minor improvements to cell commands
This adds a bit more verbosity to the cell update command, so that it
is more obvious to an operator which values are being used as the
transport URL or database connection URL for a cell.

Change-Id: Ie567ae7da4508a4b6f797d4bc77347c84702a74e
2020-02-04 19:09:29 +01:00
Stephen Finucane 3e79cb7577 Avoid calling neutron for N networks
In the worst case scenario, we could list N floating IPs, each of which
has a different network. This would result in N additional calls to
neutron - one for each of the networks. Avoid this by calling neutron
once for all networks associated with the floating IPs.

Change-Id: If067a730b0fcbe3f59f4472b00c690cc43be4b3b
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2020-02-04 17:37:06 +00:00
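The batching described in "Avoid calling neutron for N networks" can be sketched as: collect the distinct network IDs first, make one lookup, then map results back. The `list_networks` callable is a hypothetical stand-in for a client call, and the dictionary shapes are assumptions for the example.

```python
def networks_for_floating_ips(floating_ips, list_networks):
    """Map each floating IP ID to its network using a single lookup.

    list_networks is a hypothetical callable that takes a list of network
    IDs and returns the matching network dicts in one request.
    """
    network_ids = sorted({fip["floating_network_id"] for fip in floating_ips})
    networks = list_networks(network_ids) if network_ids else []
    by_id = {net["id"]: net for net in networks}
    return {
        fip["id"]: by_id.get(fip["floating_network_id"])
        for fip in floating_ips
    }
```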
Stephen Finucane eef658bf53 Handle neutron without the fip-port-details extension
The 'fip-port-details' API extension was added to neutron in Rocky [1]
and is optional. As a result, we cannot rely on the 'port_details' field
being present in API responses. If it is not, we need to make a second
query for all ports and build 'port_details' using the 'port_id' field.

[1] https://docs.openstack.org/releasenotes/neutron-lib/rocky.html#relnotes-1-14-0-stable-rocky-new-features

Change-Id: Ifb96f31f471cc0a25c1dfce2161a669b97a384ae
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Closes-bug: #1861876
2020-02-04 17:36:24 +00:00
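The fallback described in "Handle neutron without the fip-port-details extension" might look roughly like the following: if a response lacks 'port_details', fetch the ports once and fill the field in from 'port_id'. The `list_ports` callable and the dictionary shapes are assumptions for the example; a floating IP with no associated port simply gets `None`.

```python
def ensure_port_details(floating_ips, list_ports):
    """Fill in 'port_details' on floating IPs when the API omitted it.

    list_ports is a hypothetical callable returning all port dicts; it is
    only invoked when at least one floating IP lacks 'port_details'.
    """
    missing = [fip for fip in floating_ips if "port_details" not in fip]
    if not missing:
        return floating_ips
    ports = {port["id"]: port for port in list_ports()}
    for fip in missing:
        port_id = fip.get("port_id")
        # Floating IPs need not be associated with a port.
        fip["port_details"] = ports.get(port_id) if port_id else None
    return floating_ips
```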
Zuul b42c54752f Merge "Avoid fetching metadata when no subnets found" 2020-02-04 12:58:47 +00:00
Zuul 1c368e30ce Merge "Enable live migration with qos ports" 2020-02-04 12:58:41 +00:00
Zuul 7601efa5e3 Merge "nova-net: Remove use of legacy 'FloatingIP' object" 2020-02-03 20:47:38 +00:00
Zuul 1d9a131707 Merge "Use common server create function for qos func tests" 2020-02-03 19:55:40 +00:00
Zuul 8b109a262e Merge "Remove extra instance.save() calls related to qos SRIOV ports" 2020-02-03 19:55:35 +00:00
Balazs Gibizer 64cdb82b99 Enable live migration with qos ports
Previous patches in the blueprint implemented the support for live
migration with qos ports and added functional test coverage for the
various live migration scenarios. So this patch removes the API check
that rejected such operations and documents the new feature.

Change-Id: Ib9ef18fff28c463c9ffe3607d93428b689dc89fb
blueprint: support-move-ops-with-qos-ports-ussuri
2020-02-03 11:43:12 +01:00
Balazs Gibizer 52a03b195e Use common server create function for qos func tests
I1222fc21bde4158df1db70370c7f3bd319ec081f added a common helper for
server creation. This patch updates the existing qos tests to use that
helper.

Change-Id: I017163c6cdf8727be9913a6870cd91fec5f4d568
blueprint: support-move-ops-with-qos-ports-ussuri
2020-02-03 11:43:12 +01:00
Balazs Gibizer 56f29b3e4a Remove extra instance.save() calls related to qos SRIOV ports
When creating or moving an instance with a qos SRIOV port, the PCI
device claim on the destination compute needs to be restricted to select
PCI VFs from the same PF that the bandwidth for the qos port is
allocated from. This is achieved by updating the spec part of the
InstancePCIRequest with the device name of the PF by calling
update_pci_request_spec_with_allocated_interface_name(). Until now
such an update of the instance object was directly persisted by the call.

During code review it came up that the instance.save() in the util
is not appropriate, as the caller has a lot more context to decide when
to persist the changes.

The original eager instance.save was introduced when support was added to
the server create flow. Now I realized that the need for such a save was
due to a mistake in the original ResourceTracker.instance_claim() call
that loads the InstancePCIRequest from the DB instead of using the
requests through the passed in instance object. By removing the extra DB
call the need for eagerly persisting the PCI spec update is also
removed. It turned out that both the server create code path and every
server move code path eventually persist the instance object, either
at the end of the claim process or, in the case of live migration, in
the post_live_migration_at_destination compute manager call. This means
that the code can now be simplified, especially the live migration cases.

In the live migrate abort case we don't need to roll back the eagerly
persisted PCI change as now such change is only persisted at the end
of the migration but still we need to refresh pci_requests field of
the instance object during the rollback as that field might be stale,
containing dest host related PCI information.

Also, in case of rescheduling during live migration, if the rescheduling
failed the PCI change needed to be rolled back to the source host by
specific code. But now those changes are never persisted until the
migration finishes, so this rollback code can be removed too.

Change-Id: Ied8f96b4e67f79498519931cb6b35dad5288bbb8
blueprint: support-move-ops-with-qos-ports-ussuri
2020-02-03 11:41:38 +01:00
Stephen Finucane 4bdecee385 docs: Fix the monkeypatching of blockdiag
blockdiag has a longstanding bug whereby it tries to access an attribute
on an 'io.BufferedReader' that doesn't exist. We had previously fixed
this in change Ibd32d30aacae65702d0ccbdb8a02b1667ec4e8ee, which undid
the damage blockdiag was doing. However, this worked because
blockdiag's monkey patching happens when the 'blockdiag.utils.compat' module
is loaded [1], which was happening implicitly with our import of
'blockdiag.imagedraw.textfolder' [2]. However, that module no longer
imports the 'compat' [3] module so this doesn't work. Fix the issue by
just importing the 'compat' module manually, triggering the monkey
patching...which we can then undo.

[1] https://github.com/blockdiag/blockdiag/blob/2.0.0/src/blockdiag/utils/compat.py#L19-L26
[2] https://github.com/blockdiag/blockdiag/blob/1.5.4/src/blockdiag/imagedraw/textfolder.py#L18
[3] https://github.com/blockdiag/blockdiag/tree/2.0.0/src/blockdiag/imagedraw/textfolder.py

Change-Id: Idacfff98842fde38fb39791090f2da3310b441b5
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2020-01-31 17:52:54 +00:00
Stephen Finucane 5c048345a5 tests: Validate huge pages
Validate basic huge page assignment to an instance. When assigning huge
pages to an instance, a NUMA topology will automatically be added, thus
this is a subclass of the NUMA tests.

This will prove useful as a starting base for functional tests if we
ever do manage to get memory pages modelled in placement.

Change-Id: I8c760ed242a8ffd9ad963a5f51364f541909cd4c
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2020-01-31 17:28:38 +00:00
Stephen Finucane c29f382f69 Recalculate 'RequestSpec.numa_topology' on resize
When resizing, it's possible to change the NUMA topology of an instance,
or remove it entirely, due to different extra specs in the new flavor.
Unfortunately we cache the instance's NUMA topology object in
'RequestSpec.numa_topology' and don't update it when resizing. This
means if a given host doesn't have enough free CPUs or mempages of the
size requested by the *old* flavor, that host can be rejected by the
filter.

Correct this by regenerating the 'RequestSpec.numa_topology' field as
part of the resize operation, ensuring that we revert to the old field
value in the case of a resize-revert.

Change-Id: I0ca50665b86b9fdb4618192d4d6a3bcaa6ea2291
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Co-Authored-By: He Jie Xu <hejie.xu@intel.com>
Closes-bug: #1805767
2020-01-31 15:45:46 +00:00
Zuul c16315165c Merge "Use COMPUTE_SAME_HOST_COLD_MIGRATE trait during migrate" 2020-01-30 19:22:12 +00:00
Zuul b8f4e46939 Merge "zuul: Remove unnecessary 'USE_PYTHON3'" 2020-01-30 17:09:07 +00:00
Matt Riedemann 4921e822e7 Use COMPUTE_SAME_HOST_COLD_MIGRATE trait during migrate
This uses the COMPUTE_SAME_HOST_COLD_MIGRATE trait in the API during a
cold migration to filter out hosts that cannot support same-host cold
migration, which is all of them except for the hosts using the vCenter
driver.

For any nodes that do not report the trait, we won't know if they don't
because they don't support it or if they are not new enough to report
it, so the API has a service version check and will fall back to the old
behavior using the config if the node is old. That compat code can be
removed in the next release.

As a result of this the FakeDriver capabilities are updated so the
FakeDriver no longer supports same-host cold migration and a new fake
driver is added to support that scenario for any tests that need it.

Change-Id: I7a4b951f3ab324c666ab924e6003d24cc8e539f5
Closes-Bug: #1748697
Related-Bug: #1811235
2020-01-29 09:44:47 +00:00
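The decision described in "Use COMPUTE_SAME_HOST_COLD_MIGRATE trait during migrate" can be sketched as a trait check with a configuration fallback for old nodes. The function and parameter names below are illustrative assumptions; in particular, `config_allows_same_host` stands in for "the config" mentioned in the message, whose exact option is not named there.

```python
TRAIT = "COMPUTE_SAME_HOST_COLD_MIGRATE"


def can_cold_migrate_to_same_host(node_traits, node_is_new_enough,
                                  config_allows_same_host):
    """Decide whether a node may be the target of a same-host cold migration.

    New-enough nodes are trusted to report the trait when they support the
    operation. For older nodes a missing trait is ambiguous, so the
    decision falls back to configuration (hypothetical parameter).
    """
    if node_is_new_enough:
        return TRAIT in node_traits
    return config_allows_same_host
```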
Zuul 9fa3600fca Merge "doc: define boot from volume in the glossary" 2020-01-29 03:52:50 +00:00
Zuul 838b364a6e Merge "Handle cell failures in get_compute_nodes_by_host_or_node" 2020-01-29 03:52:43 +00:00
Vladyslav Drok 65825ebfbd Make RBD imagebackend flatten method idempotent
If glance and nova are both configured with the RBD backend, but glance
does not return location information from the API, nova will fail to
clone the image from the glance pool and will download it from the API.
In this case, the image will already be flat, and a subsequent flatten
call will fail.

This commit makes the flatten call idempotent, so that it ignores already
flat images by catching ImageUnacceptable when requesting parent info
from ceph.

Closes-Bug: 1860990
Change-Id: Ia6c184c31a980e4728b7309b2afaec4d9f494ac3
2020-01-28 14:30:40 +01:00
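The idempotency pattern in "Make RBD imagebackend flatten method idempotent" is: ask for the parent first, and treat "no parent" as "already flat" rather than as an error. The sketch below uses stand-in classes so it runs without ceph; `FakeVolume`, `flatten_if_needed`, and the local `ImageUnacceptable` class are assumptions for the example, not Nova's implementation.

```python
class ImageUnacceptable(Exception):
    """Stand-in for the exception raised when an image has no parent."""


class FakeVolume:
    """Minimal stand-in for an RBD image that may be a clone of a parent."""

    def __init__(self, parent=None):
        self.parent = parent
        self.flatten_calls = 0

    def parent_info(self):
        if self.parent is None:
            raise ImageUnacceptable("image has no parent")
        return self.parent

    def flatten(self):
        self.flatten_calls += 1
        self.parent = None


def flatten_if_needed(volume):
    """Flatten a cloned volume; do nothing if it is already flat.

    Returns True if a flatten was performed, False otherwise.
    """
    try:
        volume.parent_info()
    except ImageUnacceptable:
        return False  # already flat, so a repeat call is harmless
    volume.flatten()
    return True
```

Calling `flatten_if_needed` twice on the same volume flattens it once and then becomes a no-op, which is what makes the operation safe to repeat.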
Kobi Samoray 3177371568 Avoid fetching metadata when no subnets found
The metadata service uses the provider id to identify the requesting
instance.
However, when the provider query doesn't find any networks, the request
should fail.
The same goes for the case where multiple ports are found.

Closes-Bug: #1841933
Change-Id: I8ce3703ca86a3a0769edd42a790d82796d1071d7
2020-01-28 14:35:52 +02:00
Zuul 80539a5e84 Merge "nova-net: Remove remaining nova-network quotas" 2020-01-27 12:33:15 +00:00
Zuul 575d988a2c Merge "Fix typos for update_available_resource reference" 2020-01-25 19:21:44 +00:00