This adds a new microversion to expose the instance action
event details in the
GET /servers/{server_id}/os-instance-actions/{request_id} API.
With the new microversion the "details" key is always returned
with each event dict but the value may be null because of old
records or events that did not fail.
The details are not constrained by policy like the traceback
field, since the details are like the fault message on the server
resource when the server is in ERROR status. That fault message is
likewise not constrained by policy, unlike the fault details, which
are a traceback like the event traceback field.
This commit adds a SYSTEM_READER ('rule: system_reader_api') role
to the Show Server Action Details API. With this default policy,
event fault details can be displayed. It also adds functional tests
for the os-instance-actions API covering nova and non-nova exceptions.
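As a rough illustration of the response shape described above, the
sketch below always returns the "details" key and gates only the
traceback. The helper name format_event, the can_show_traceback flag
and the sample event values are hypothetical and are not taken from
nova's code.

```python
def format_event(event, can_show_traceback):
    """Build an API event dict (hypothetical helper, not nova's code)."""
    body = {
        'event': event['event'],
        'result': event.get('result'),
        # Always returned; None for old records or events that did not fail.
        'details': event.get('details'),
    }
    if can_show_traceback:
        # Only the traceback stays behind a policy check.
        body['traceback'] = event.get('traceback')
    return body


# Illustrative records only.
failed_event = {'event': 'example_failed_event', 'result': 'Error',
                'details': 'Something failed.', 'traceback': 'Traceback ...'}
old_event = {'event': 'example_old_event', 'result': 'Success'}
```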
Co-Authored-By: Brin Zhang <zhangbailin@inspur.com>
Implements blueprint action-event-fault-details
Change-Id: I6fe4dd265b0030ce12f92771b255a3d795f03d01
Unlike x86, AArch64 doesn't have a default CPU model.
Usually, when the libvirt driver is used with cpu mode set to custom,
nova calls libvirt to return the default model. But for AArch64,
the supported CPU models vary according to machine type.
AArch64 uses "virt" as the default machine type. QEMU supports
several models for it, and we should choose "max" as the default.
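The selection logic can be sketched roughly as follows. This is only
an illustration under assumptions: default_custom_cpu_model and its
arguments are hypothetical names, and returning None simply stands
for "let libvirt report its default model".

```python
def default_custom_cpu_model(arch, configured_models):
    """Pick the CPU model for cpu mode "custom" (hypothetical helper)."""
    if configured_models:
        # An explicitly configured model always wins.
        return configured_models[0]
    if arch == 'aarch64':
        # AArch64 has no default model to ask for, so fall back to "max".
        return 'max'
    # None stands for: ask libvirt for its default model.
    return None
```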
Closes-Bug: #1864588
Change-Id: Ib2df50bda991a659fe10ef1dd9e7ab56800c34fb
Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>
Current tests do not have good coverage of existing policies.
Tests for policies either do not exist or, if they exist, do not
cover actual negative and positive testing.
For example, if a policy has an admin-only default rule, then the
test should verify:
- the policy check passes with a context that is admin or server owner
- the policy check fails with a context that is non-admin and not server owner
As discussed in policy-defaults-refresh, to change the policies
with new default roles and scope_type, we need to have enough
test coverage of existing policy behavior.
When we add the scope_type in policies or new default roles,
this test coverage will be extended to adopt the new changes
and also make sure we do not break the existing behavior.
This commit adds test coverage for the existing instance usage audit
log policies.
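The positive/negative pattern described above can be sketched like
this. It is a self-contained stand-in, not nova's test base class:
authorize, PolicyNotAuthorized and the context dicts are hypothetical,
and the rule modelled is a plain admin-only check.

```python
class PolicyNotAuthorized(Exception):
    """Raised when a context fails a policy check (stand-in class)."""


def authorize(context, rule):
    # Stand-in for a real policy enforcer: only an admin-only rule exists.
    if rule == 'admin_only' and not context['is_admin']:
        raise PolicyNotAuthorized(rule)
    return True


def check_policy(rule, authorized_contexts, unauthorized_contexts):
    """Verify both the positive and the negative side of one rule."""
    for context in authorized_contexts:
        assert authorize(context, rule)
    for context in unauthorized_contexts:
        try:
            authorize(context, rule)
        except PolicyNotAuthorized:
            continue
        raise AssertionError('%s unexpectedly passed %s' % (context, rule))
    return True


admin = {'name': 'admin', 'is_admin': True}
other_project_member = {'name': 'member', 'is_admin': False}
```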
Partial implement blueprint policy-defaults-refresh
Change-Id: I4a8b935829edb1d7fd7efb0291d71d3a9d2b7abd
Previously virDomainBlockRebase [1] was used by swap_volume to switch
between volumes presented to the compute host as block devices or files.
As outlined in the virDomainBlockCopy [2] documentation this command is
actually a superset of virDomainBlockRebase in our case:
> This command is a superset of the older virDomainBlockRebase() when used
> with the VIR_DOMAIN_BLOCK_REBASE_COPY flag, and offers better control
> over the destination format, the ability to copy to a destination that
> is not a local file, and the possibility of additional tuning
> parameters.
As such we can switch to virDomainBlockCopy and expand support for
swap_volume outside of just host block devices and files.
To allow swap_volume to support RBD volumes we also need the domain to
use the recently introduced -blockdev support within libvirt >= 6.0.0
and QEMU >= 4.2.0. New MIN_LIBVIRT_BLOCKDEV and MIN_QEMU_BLOCKDEV
version constants are introduced and used to determine when to switch to
the virDomainBlockCopy method of moving between volumes.
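A minimal sketch of the version gate, assuming versions are compared
as (major, minor, micro) tuples. The constant names and minimum
versions come from the text above; the helper use_block_copy is
hypothetical.

```python
# Minimum versions for -blockdev support, as stated in this change.
MIN_LIBVIRT_BLOCKDEV = (6, 0, 0)
MIN_QEMU_BLOCKDEV = (4, 2, 0)


def use_block_copy(libvirt_version, qemu_version):
    """Return True when swap_volume may use virDomainBlockCopy.

    Hypothetical helper: both versions are (major, minor, micro) tuples,
    so plain tuple comparison gives the expected ordering.
    """
    return (libvirt_version >= MIN_LIBVIRT_BLOCKDEV and
            qemu_version >= MIN_QEMU_BLOCKDEV)
```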
[1] https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockRebase
[2] https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockCopy
Closes-Bug: #1868996
Change-Id: I8e8035dcf508f5215bba9b7575c5c6abfe41da31
. Do not delete accelerator requests in stop code paths.
. In the start code path, get the list of accelerator requests from
Cyborg in the compute manager 'power_on'.
. Pass accel_info (said list) to the virt driver power_on.
. In the libvirt driver, pass that accel_info to driver power_on.
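The start/stop flow listed above can be sketched with stand-in
classes. Everything here is hypothetical (FakeCyborgClient,
FakeDriver, ComputeManagerSketch and their method names); it only
shows accel_info being fetched on start and left alone on stop.

```python
class FakeCyborgClient:
    """Stand-in for the Cyborg client; holds ARQs per instance uuid."""

    def __init__(self, arqs_by_instance):
        self.arqs_by_instance = arqs_by_instance

    def get_arqs_for_instance(self, instance_uuid):
        return list(self.arqs_by_instance.get(instance_uuid, []))


class FakeDriver:
    """Stand-in virt driver that records what it was called with."""

    def __init__(self):
        self.last_accel_info = None

    def power_on(self, instance, accel_info=None):
        self.last_accel_info = accel_info

    def power_off(self, instance):
        pass


class ComputeManagerSketch:
    def __init__(self, driver, cyborg):
        self.driver = driver
        self.cyborg = cyborg

    def power_on(self, instance):
        # Start path: fetch the accelerator requests and hand them down.
        accel_info = self.cyborg.get_arqs_for_instance(instance['uuid'])
        self.driver.power_on(instance, accel_info=accel_info)

    def power_off(self, instance):
        # Stop path: accelerator requests are deliberately not deleted.
        self.driver.power_off(instance)
```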
Change-Id: I8c94504b87aa4450d163fe2b33f6aa0eb5dae5ff
Blueprint: nova-cyborg-interaction
This patch series now works for many VM operations with libvirt:
* Creation, deletion of VM instances.
* Pause/unpause
The following works but is a no-op:
* Lock/unlock
Hard reboots are taken up in a later patch in this series.
Soft reboots work for accelerators unless some unrelated failure
forces a hard reboot in the libvirt driver.
Suspend is not supported yet. It would fail with this error:
libvirtError: Requested operation is not valid:
domain has assigned non-USB host devices
Shelve is not supported yet.
Live migration is not intended to be supported with accelerators now.
Change-Id: Icb95890d8f16cad1f7dc18487a48def2f7c9aec2
Blueprint: nova-cyborg-interaction
Add a table-driven prefilter to transform image metadata into required
traits. This requires a new config option to make the filter optional.
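A table-driven transform of this kind might look like the sketch
below. The property names, trait prefixes and the enabled flag are
assumptions made for illustration and are not confirmed by this
message.

```python
# Assumed mapping from image property to trait prefix (illustrative only).
IMAGE_PROPERTY_TO_TRAIT_PREFIX = {
    'hw_disk_bus': 'COMPUTE_STORAGE_BUS_',
    'hw_vif_model': 'COMPUTE_NET_VIF_MODEL_',
    'hw_video_model': 'COMPUTE_GRAPHICS_MODEL_',
}


def required_traits_from_image(image_properties, enabled=True):
    """Return the set of required traits implied by image metadata.

    The enabled flag stands in for the new config option that makes
    the prefilter optional.
    """
    if not enabled:
        return set()
    traits = set()
    for prop, prefix in IMAGE_PROPERTY_TO_TRAIT_PREFIX.items():
        value = image_properties.get(prop)
        if value:
            # Trait names are upper case with underscores.
            traits.add(prefix + value.upper().replace('-', '_'))
    return traits
```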
Change-Id: I257ff81e23cdae6f2b62ec3d071b8f8f32d97781
Implements: blueprint image-metadata-prefiltering
Co-Authored-By: Stephen Finucane <sfinucan@redhat.com>
This change extends parsing of domain capability XML to discover the
supported storage and video models. To do this, we alter the behavior of
'_get_storage_bus_traits' to prefer the data from the domain
capabilities API for 'qemu' and 'kvm' virt types, only falling back to
generating the storage traits statically for other virt types.
In addition, we extend the libvirt driver with '_get_video_model_traits'
and '_get_vif_model_traits' functions to generate sets of video models
and VIF models respectively that are supported by this host.
Finally, we start caching the static driver traits in a property to
avoid the need to recalculate them every time 'update_provider_tree' is
called. This is okay since these things will not change during runtime
unless libvirt or QEMU are upgraded, in which case the user really
should be restarting consumers of libvirt such as nova anyway.
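The parse-once-and-cache idea can be sketched as follows. The class
DriverSketch, the trait prefixes and the sample XML are illustrative
assumptions; the real driver obtains the XML from libvirt's domain
capabilities API.

```python
import functools
import xml.etree.ElementTree as ET

# Trimmed-down sample of domain capabilities XML (illustrative only).
DOMCAPS_XML = """
<domainCapabilities>
  <devices>
    <disk supported='yes'>
      <enum name='bus'>
        <value>ide</value>
        <value>virtio</value>
      </enum>
    </disk>
    <video supported='yes'>
      <enum name='modelType'>
        <value>vga</value>
        <value>virtio</value>
      </enum>
    </video>
  </devices>
</domainCapabilities>
"""


class DriverSketch:
    def __init__(self, domcaps_xml):
        self._domcaps_xml = domcaps_xml
        self.parse_count = 0

    @functools.cached_property
    def static_traits(self):
        # Computed on first access only; later accesses reuse the result.
        self.parse_count += 1
        root = ET.fromstring(self._domcaps_xml)
        buses = root.findall("./devices/disk/enum[@name='bus']/value")
        videos = root.findall("./devices/video/enum[@name='modelType']/value")
        traits = {'COMPUTE_STORAGE_BUS_' + v.text.upper() for v in buses}
        traits |= {'COMPUTE_GRAPHICS_MODEL_' + v.text.upper() for v in videos}
        return traits
```

Because the property is cached per object, picking up a libvirt or
QEMU upgrade requires creating a new driver object, which matches the
restart expectation stated above.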
Change-Id: I0bdf9ccf7bf3fb1f3136c1e4267b9c99732908d5
Partially-Implements: blueprint image-metadata-prefiltering
Each accelerator request in the accel_info list has an attach handle,
which in turn contains a PCI BDF for passthrough. They get composed
as hostdev devices into the VM's domain XML.
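As an illustration, a PCI BDF can be turned into a hostdev element
roughly like this. The sketch assumes the BDF arrives as a
"dddd:bb:ss.f" string and uses a hypothetical helper name; the actual
attach handle layout may differ.

```python
import re
import xml.etree.ElementTree as ET

# domain:bus:slot.function, all hexadecimal (assumed input format).
BDF_RE = re.compile(
    r'^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$')


def hostdev_xml_from_bdf(bdf):
    """Return a libvirt <hostdev> XML string for one PCI device."""
    match = BDF_RE.match(bdf)
    if match is None:
        raise ValueError('not a PCI BDF: %r' % bdf)
    domain, bus, slot, function = match.groups()
    hostdev = ET.Element('hostdev', mode='subsystem', type='pci',
                         managed='yes')
    source = ET.SubElement(hostdev, 'source')
    ET.SubElement(source, 'address', domain='0x' + domain, bus='0x' + bus,
                  slot='0x' + slot, function='0x' + function)
    return ET.tostring(hostdev, encoding='unicode')
```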
Blueprint: nova-cyborg-interaction
Change-Id: I08cd4787ab4c9539574237e26ba5bf6d4246b32e
Update the signature of the spawn() API for each virt driver
to include accel_info, which is a list of accelerator requests.
Change-Id: I4aac66c125a162bf35991a7d0c2638c7475ec0e7
Blueprint: nova-cyborg-interaction
* Call Cyborg with device profile name to get ARQs (Accelerator Requests).
Each ARQ corresponds to a single device profile group, which
corresponds to a single request group in request spec.
* Match each ARQ to associated request group, and thereby obtain the
corresponding RP for that ARQ.
* Call Cyborg to bind the ARQ to that host/device-RP.
* When Cyborg sends the ARQ bind notification events, wait for those
events with a timeout.
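The matching and waiting steps above can be sketched as below. This
is not nova's mechanism: it swaps in threading.Event for the event
wait, and the 'group_id' key, provider_by_group mapping and both
function names are hypothetical.

```python
import threading
import time


def match_arqs_to_providers(arqs, provider_by_group):
    """Return {arq_uuid: rp_uuid}, pairing each ARQ with its request group.

    The 'group_id' key and provider_by_group are hypothetical shapes.
    """
    return {arq['uuid']: provider_by_group[arq['group_id']] for arq in arqs}


def wait_for_bind_events(events, timeout):
    """Block until every bind event is set, or raise TimeoutError.

    events maps an ARQ uuid to a threading.Event that a notification
    handler sets; timeout is the total budget in seconds for all events.
    """
    deadline = time.monotonic() + timeout
    for arq_uuid, event in events.items():
        remaining = max(0.0, deadline - time.monotonic())
        if not event.wait(remaining):
            raise TimeoutError('ARQ %s was not bound in time' % arq_uuid)
```

Using one shared deadline keeps the total wait bounded by the timeout
no matter how many ARQs are pending.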
Change-Id: I0f8b6bf2b4f4510da6c84fede532533602b6af7f
Blueprint: nova-cyborg-interaction
Find the name of the device profile, if any, in flavor extra specs.
Get its profile groups (equiv to flavor request groups) from Cyborg.
Parse/validate them similar to extra_specs.
Generate RequestGroup objects and add them to the request spec
(in requested_resources field, following precedent).
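A rough sketch of these steps follows. The extra spec key
'accel:device_profile', the group key prefixes, the requester_id
format and the RequestGroupSketch class are assumptions made for
illustration; fetching the groups from Cyborg is left out.

```python
from dataclasses import dataclass, field


@dataclass
class RequestGroupSketch:
    """Simplified stand-in for a RequestGroup object."""
    requester_id: str
    resources: dict = field(default_factory=dict)
    required_traits: set = field(default_factory=set)


def get_device_profile_name(extra_specs):
    # Assumed extra spec key; returns None when no profile is requested.
    return extra_specs.get('accel:device_profile')


def groups_to_request_groups(dp_name, dp_groups):
    """Parse and validate profile groups, one request group per group."""
    request_groups = []
    for index, group in enumerate(dp_groups):
        rg = RequestGroupSketch(requester_id='%s_%d' % (dp_name, index))
        for key, value in group.items():
            if key.startswith('resources:'):
                rg.resources[key[len('resources:'):]] = int(value)
            elif key.startswith('trait:'):
                if value != 'required':
                    raise ValueError('unsupported trait value: %s' % value)
                rg.required_traits.add(key[len('trait:'):])
            # Other keys are ignored in this sketch.
        request_groups.append(rg)
    return request_groups
```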
Change-Id: Icd2ee9024dd4af0a7eb105eca14df8e458e9de77
Blueprint: nova-cyborg-interaction
Framework for communication with the Cyborg API.
- Standard keystoneauth1 config options for setting up authentication in
the [cyborg] section of nova*.conf.
- A new nova.accelerator.cyborg module containing a get_client method to
return a client containing a keystoneauth1 adapter pointing
to the Cyborg service with user- and service- based authentication.
- Requirements updates to pull in the os-service-types release
containing the 'accelerator' service type.
Change-Id: Iee0766269d61948ad701911e8b0e5e24d3d6eb04
Blueprint: nova-cyborg-interaction
I8af2ad741ca08c3d88efb9aa817c4d1470491a23 started to correctly fence the
subnode ahead of evacuation testing but missed that c-vol and g-api
were also running on the host. As a result, the BFV evacuation test will
fail if the volume being used is created on the c-vol backend hosted on
the subnode.
This change now avoids this by limiting the services stopped ahead of
the evacuation on the subnode to n-cpu and q-agt.
Change-Id: Ia7c317e373e4037495d379d06eda19a71412d409
Closes-Bug: #1868234
The InstanceActionEvent.pack_action_event_finish method was
not storing the exc_val in a field that was actually part
of the data model so it was never stored ("message" isn't a
column in the instance_actions_events table, "details" is).
This formats and stores the exc_val, if provided, using the
same utility code that is used to record the message for
an instance fault record.
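The formatting step might look roughly like the sketch below. It
assumes nova-style exceptions expose format_message(), that other
exceptions fall back to their class name, and that the text is
truncated to 255 characters; format_event_details is a hypothetical
name and the real utility may differ.

```python
def format_event_details(exc_val, max_len=255):
    """Turn an exception into a short, storable details string."""
    if exc_val is None:
        return None
    try:
        # Assumed: nova-style exceptions provide format_message().
        message = exc_val.format_message()
    except Exception:
        # Unknown exceptions may carry sensitive text, so fall back to
        # the class name only.
        message = exc_val.__class__.__name__
    return message[:max_len]
```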
Eventually we can build on this in the os-instance-actions API
by exposing the fault details to the user without needing to
expose the traceback (like the server fault "details" traceback).
Co-Authored-By: Brin Zhang <zhangbailin@inspur.com>
Part of blueprint action-event-fault-details
Change-Id: Ie3e11b38aac251c3f8911ed57dc5e7503aa91301