Nova has never supported direct booting of an image of an encrypted
volume uploaded to Glance via the Cinder upload-volume-to-image
process, but instead of rejecting such a request, an 'active' but
unusable instance is created. This patch allows Nova to use image
metadata to detect such an image and reject the boot request.
Change-Id: Idf84ccff254d26fa13473fe9741ddac21cbcf321
Related-bug: #1852106
Closes-bug: #1863611
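A minimal sketch of the detection described above, assuming the image
metadata carries the cinder_encryption_key_id property that Cinder's
upload-volume-to-image flow sets (the function names here are
illustrative, not Nova's actual API):

```python
# Hypothetical sketch: reject a boot request when the image metadata
# references a cinder encryption key, marking it as an image of an
# encrypted volume that cannot be booted directly.

def is_encrypted_volume_image(image_properties):
    """Return True if the image was created from an encrypted volume."""
    return 'cinder_encryption_key_id' in image_properties


def validate_bootable(image_properties):
    """Raise instead of creating an 'active' but unusable instance."""
    if is_encrypted_volume_image(image_properties):
        raise ValueError(
            'Direct booting of an image of an encrypted volume is '
            'not supported')
```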
Inheritance of image properties from the image an instance was booted
from to an image created from that instance is governed by the
non_inheritable_image_properties configuration option. However, there
are some image properties (for example, those used for image signature
validation or to reference a cinder encryption key id) which it makes
no sense to inherit under any circumstances. Additionally,
misconfiguration of the non-inheritable properties can lead to data
loss under the circumstances described in Bug #1852106. So it would
be better if these properties were not subject to configuration.
The initial set of absolutely non-inheritable image properties
consists of those associated with cinder encryption keys and image
signature validation.
Change-Id: I4332b9c343b6c2b50226baa8f78396c2012dabd1
Closes-bug: #1852106
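The filtering can be sketched as below; the property names follow the
commit description (encryption key and image signature properties), but
the exact set Nova hard-codes may differ:

```python
# Sketch of stripping absolutely non-inheritable image properties
# before creating an image from an instance, regardless of the
# non_inheritable_image_properties configuration option.

NON_INHERITABLE = frozenset([
    'cinder_encryption_key_id',
    'img_signature',
    'img_signature_hash_method',
    'img_signature_key_type',
    'img_signature_certificate_uuid',
])


def strip_non_inheritable(properties):
    """Return a copy of the properties with the unsafe keys removed."""
    return {k: v for k, v in properties.items()
            if k not in NON_INHERITABLE}
```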
Added JSON schema defining `network_data.json` contents and
beefed up the MetadataTest functional test cases to use a
real instance instead of a database shell. This way the
tests see real data in the metadata service, such as a real
network_data.json.
Besides internal Nova consumption, this schema might be
helpful to other tools (such as ironic or Glean) to
validate human-generated `network_data.json` prior to
using it.
Co-Authored-By: Balazs Gibizer <balazs.gibizer@est.tech>
Change-Id: Ie5a5a1fc81c7c2d3f61b72d19de464cfc9dab5ec
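As a rough illustration of the kind of validation such a schema
enables, the top-level shape of a network_data.json document can be
checked like this (a deliberately minimal stand-in, not Nova's actual
JSON schema):

```python
# Minimal illustration: network_data.json has three top-level list
# sections: links, networks and services.

REQUIRED_TOP_LEVEL = ('links', 'networks', 'services')


def validate_network_data(doc):
    """Check that the document has the three required list sections."""
    for key in REQUIRED_TOP_LEVEL:
        if not isinstance(doc.get(key), list):
            raise ValueError(
                'network_data.json must contain a %r list' % key)
    return True
```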
While querying metadata from an LB source, the subnet query may return
a large number of networks.
That might exceed the maximum request length on the subsequent port
query to Neutron.
This patch addresses that issue.
Closes-Bug: #1861087
Change-Id: I9d72c80574d08d8409ed0dcc0476f52a0d173a1e
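The fix can be sketched as chunking the network-id list so each port
query stays under the request-length limit; the chunk size and client
signature here are illustrative assumptions:

```python
# Sketch: split a long network-id list into chunks and issue one
# Neutron port query per chunk instead of one oversized query.

def chunked(ids, size=50):
    """Yield successive fixed-size slices of the id list."""
    for i in range(0, len(ids), size):
        yield ids[i:i + size]


def list_ports_for_networks(client_list_ports, network_ids):
    """Collect ports across all networks using bounded-size queries."""
    ports = []
    for chunk in chunked(network_ids):
        ports.extend(client_list_ports(network_id=chunk))
    return ports
```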
Currently we run all the integration jobs for policy-only changes,
which is not required. Policy-only changes can be covered by the unit,
functional, and single tempest and grenade jobs.
Change-Id: I4b2d321b7243ec149e9445035d1feb7a425e9a4b
The z/VM driver returned '' for hypervisor_version, but the field is
defined as fields.IntegerField(), which means it should be an int
value by default.
Closes-Bug: 1862750
Change-Id: Ib4f2ecbbb731943eda996d525ddaafd2260fd1a3
It was discovered that default= on a Column definition in a schema migration
will attempt to update the table with the provided value, instead of just
translating on read, which is often the assumption. The Instance.hidden=False
change introduced in Train[1] used such a default on the new column, which caused
at least one real-world deployment to time out rewriting the instances table
due to size. Apparently SQLAlchemy-migrate also does not consider such a timeout
to be a failure and proceeds anyway. The end result is that some existing instances
in the database have hidden=NULL values, and the DB model layer does not convert
those to hidden=False when we read/query them, causing those instances to be
excluded from the API list view.
This change alters the 399 schema migration to remove the default=False
specification. This does not actually change the schema, but /will/ prevent
users who have not yet upgraded to Train from rewriting the table.
This change also makes the instance_get_all_by_filters() code handle hidden
specially, including both False and NULL in a query for non-hidden instances.
A future change should add a developer trap test to ensure that migrations
do not add default= values to new columns, so this situation cannot recur.
[1] Iaffb27bd8c562ba120047c04bb62619c0864f594
Change-Id: Iace3f653b42c20887b40ee0105c8e9a4edeff1f7
Closes-Bug: #1862205
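The filter semantics described above can be sketched as a predicate:
when listing non-hidden instances, rows with hidden=NULL (left over
from the interrupted migration) must match just like hidden=False:

```python
# Sketch of the hidden-filter fix: NULL (None) is treated as False
# when querying for non-hidden instances, so legacy rows are not
# excluded from the API list view.

def matches_hidden_filter(row_hidden, want_hidden):
    """Emulate the fixed instance_get_all_by_filters() behavior."""
    if want_hidden:
        return row_hidden is True
    # Non-hidden query: include both False and NULL values.
    return row_hidden in (False, None)
```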
This method only checks if a specific path is shared between two hosts
and has been renamed accordingly to avoid confusion.
Additionally the shared_storage variable used to store the returned
value from this method within migrate_disk_and_power_off is renamed to
shared_instance_path.
Change-Id: I426de20864321d664d3fe0e08a14e1af509c8a2b
Floating IPs don't have to have an associated port, so there's no reason
to error out when this is the case.
Change-Id: I50c79843bf819b731c597dbfe72090cdf02c7641
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Closes-bug: #1861876
This adds a bit more verbosity to the cell update command, so that it
is more obvious to an operator which values are being used as the
transport URL or database connection URL for a cell.
Change-Id: Ie567ae7da4508a4b6f797d4bc77347c84702a74e
In the worst case scenario, we could list N floating IPs, each of which
has a different network. This would result in N additional calls to
neutron - one for each of the networks. Avoid this by calling neutron
once for all networks associated with the floating IPs.
Change-Id: If067a730b0fcbe3f59f4472b00c690cc43be4b3b
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
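The batching can be sketched as follows, with an illustrative client
signature (not the actual neutronclient API): collect the distinct
network ids first, then make a single list call:

```python
# Sketch: one batched network lookup for all floating IPs instead of
# one call per floating IP's network.

def networks_for_fips(list_networks, fips):
    """Map network id -> network for every network referenced by a FIP."""
    network_ids = {fip['floating_network_id'] for fip in fips
                   if fip.get('floating_network_id')}
    if not network_ids:
        return {}
    # Single call covering every referenced network.
    networks = list_networks(id=sorted(network_ids))
    return {net['id']: net for net in networks}
```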
Previous patches in the blueprint implemented support for live
migration with qos ports and added functional test coverage for the
various live migration scenarios. This patch therefore removes the API
check that rejected such operations and documents the new feature.
Change-Id: Ib9ef18fff28c463c9ffe3607d93428b689dc89fb
blueprint: support-move-ops-with-qos-ports-ussuri
I1222fc21bde4158df1db70370c7f3bd319ec081f added a common helper for
server creation. This patch updates the existing qos tests to use that
helper.
Change-Id: I017163c6cdf8727be9913a6870cd91fec5f4d568
blueprint: support-move-ops-with-qos-ports-ussuri
During the creation or moving of an instance with a qos SRIOV port, the
PCI device claim on the destination compute needs to be restricted to
select PCI VFs from the same PF that the bandwidth for the qos port is
allocated from. This is achieved by updating the spec part of the
InstancePCIRequest with the device name of the PF by calling
update_pci_request_spec_with_allocated_interface_name(). Until now
such update of the instance object was directly persisted by the call.
During code review it came up that the instance.save() in the util
is not appropriate, as the caller has a lot more context to decide when
to persist the changes.
The original eager instance.save() was introduced when support was added
to the server create flow. It turns out that the need for such a save was
due to a mistake in the original ResourceTracker.instance_claim() call,
which loads the InstancePCIRequest from the DB instead of using the
requests on the passed-in instance object. By removing the extra DB
call, the need for eagerly persisting the PCI spec update is also
removed. Both the server create code path and every
server move code path eventually persist the instance object, either
at the end of the claim process or, in the case of live migration, in
the post_live_migration_at_destination compute manager call. This means
the code can now be simplified, especially the live migration cases.
In the live migration abort case we no longer need to roll back the
eagerly persisted PCI change, as such a change is now only persisted at
the end of the migration. We still need to refresh the pci_requests
field of the instance object during the rollback, as that field might be
stale, containing dest-host-related PCI information.
Also, in the case of rescheduling during live migration, if the
rescheduling failed, the PCI change needed to be rolled back to the
source host by specific code. Now those changes are never persisted
until the migration finishes, so this rollback code can be removed too.
Change-Id: Ied8f96b4e67f79498519931cb6b35dad5288bbb8
blueprint: support-move-ops-with-qos-ports-ussuri
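The changed contract can be sketched like this: the helper mutates the
in-memory request only and leaves persistence to the caller (the
function name and the parent_ifname key are illustrative stand-ins for
the util described above):

```python
# Sketch: update the PCI request spec with the allocated PF's
# interface name without eagerly persisting the instance.

def update_pci_request_spec(pci_request, pf_ifname):
    """Restrict the claim to VFs of the given PF, in memory only."""
    pci_request.setdefault('spec', [{}])
    pci_request['spec'][0]['parent_ifname'] = pf_ifname
    # Deliberately no instance.save() here: the caller persists the
    # instance once the create or move operation actually completes.
```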
Validate basic huge page assignment to an instance. When assigning huge
pages to an instance, a NUMA topology will automatically be added, thus
this is a subclass of the NUMA tests.
This will prove useful as a starting base for functional tests if we
ever do manage to get memory pages modelled in placement.
Change-Id: I8c760ed242a8ffd9ad963a5f51364f541909cd4c
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
When resizing, it's possible to change the NUMA topology of an instance,
or remove it entirely, due to different extra specs in the new flavor.
Unfortunately we cache the instance's NUMA topology object in
'RequestSpec.numa_topology' and don't update it when resizing. This
means if a given host doesn't have enough free CPUs or mempages of the
size requested by the *old* flavor, that host can be rejected by the
filter.
Correct this by regenerating the 'RequestSpec.numa_topology' field as
part of the resize operation, ensuring that we revert to the old field
value in the case of a resize-revert.
Change-Id: I0ca50665b86b9fdb4618192d4d6a3bcaa6ea2291
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Co-Authored-By: He Jie Xu <hejie.xu@intel.com>
Closes-bug: #1805767
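The regenerate-and-revert flow can be sketched as below, where
numa_from_flavor stands in for deriving the topology from the flavor's
extra specs (in Nova this is done by the hardware constraints helper):

```python
# Sketch: on resize, recompute the cached NUMA topology from the *new*
# flavor; on resize-revert, restore the old field value.

def apply_resize(spec, new_flavor, numa_from_flavor):
    """Return the updated spec plus the old topology for revert."""
    old_topology = spec.get('numa_topology')
    new_spec = dict(spec, numa_topology=numa_from_flavor(new_flavor))
    return new_spec, old_topology


def revert_resize(spec, old_topology):
    """Put the pre-resize NUMA topology back on the spec."""
    return dict(spec, numa_topology=old_topology)
```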
This uses the COMPUTE_SAME_HOST_COLD_MIGRATE trait in the API during a
cold migration to filter out hosts that cannot support same-host cold
migration, which is all of them except for the hosts using the vCenter
driver.
For any nodes that do not report the trait, we won't know whether that
is because they don't support it or because they are not new enough to
report it, so the API has a service version check and will fall back to
the old config-based behavior if the node is old. That compat code can
be removed in the next release.
As a result of this, the FakeDriver capabilities are updated so that
FakeDriver no longer supports same-host cold migration, and a new fake
driver is added to support that scenario for any tests that need it.
Change-Id: I7a4b951f3ab324c666ab924e6003d24cc8e539f5
Closes-Bug: #1748697
Related-Bug: #1811235
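The trait check plus service-version fallback can be sketched as
follows (the version threshold and config flag are illustrative):

```python
# Sketch: a node qualifies for same-host cold migration if it reports
# the trait; old nodes that cannot report it fall back to the config.

TRAIT = 'COMPUTE_SAME_HOST_COLD_MIGRATE'


def allow_same_host_cold_migrate(traits, service_version,
                                 min_version, config_allow):
    """Decide whether a host may receive a same-host cold migration."""
    if TRAIT in traits:
        return True
    if service_version < min_version:
        # Node too old to report the trait: use legacy config behavior.
        return config_allow
    return False
```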
If Glance and Nova are both configured with the RBD backend, but Glance
does not return location information from the API, Nova will fail to
clone the image from the Glance pool and will download it from the API.
In this case, the image will already be flat, and the subsequent flatten
call will fail.
This commit makes flatten call idempotent, so that it ignores already
flat images by catching ImageUnacceptable when requesting parent info
from ceph.
Closes-Bug: 1860990
Change-Id: Ia6c184c31a980e4728b7309b2afaec4d9f494ac3
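The idempotency fix can be sketched like this: an image with no RBD
parent is already flat, so the "no parent" error is swallowed instead
of propagated (ImageUnacceptable and the callables stand in for the
exception and Ceph calls Nova uses):

```python
# Sketch: make flatten a no-op for images that are already flat.

class ImageUnacceptable(Exception):
    """Stand-in for the exception raised when no parent info exists."""


def flatten(get_parent_info, do_flatten, volume):
    """Flatten the volume; return whether flattening was needed."""
    try:
        get_parent_info(volume)
    except ImageUnacceptable:
        # No parent: the image is already flat, nothing to do.
        return False
    do_flatten(volume)
    return True
```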
The metadata service uses the provider id to identify the requesting
instance.
However, when the provider query doesn't find any networks, the request
should fail.
The same goes for the case where multiple ports are found.
Closes-Bug: #1841933
Change-Id: I8ce3703ca86a3a0769edd42a790d82796d1071d7
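The tightened lookup can be sketched as: zero matching networks, or
anything other than exactly one matching port, fails instead of
guessing (function name and error type are illustrative):

```python
# Sketch: identify the requesting instance's port unambiguously or
# reject the metadata request.

def find_instance_port(networks, ports):
    """Return the single matching port, or fail the request."""
    if not networks:
        raise LookupError('no networks found for metadata provider')
    if len(ports) > 1:
        raise LookupError('multiple ports match the requesting instance')
    if not ports:
        raise LookupError('no port matches the requesting instance')
    return ports[0]
```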