tl;dr: Use 'writeback' instead of 'writethrough' as the cache mode of
the target image for `qemu-img convert`. Two reasons: (a) if the image
conversion completes successfully, then 'writeback' calls fsync() to
safely write data to the physical disk; and (b) 'writeback' makes the
image conversion a _lot_ faster.
Back-of-the-envelope "benchmark" (on an SSD)
--------------------------------------------
(Ran both the tests thrice each; version: qemu-img-2.11.0)
With 'writethrough':
$> time (qemu-img convert -t writethrough -f qcow2 -O raw \
Fedora-Cloud-Base-29.qcow2 Fedora-Cloud-Base-29.raw)
real 1m43.470s
user 0m8.310s
sys 0m3.661s
With 'writeback':
$> time (qemu-img convert -t writeback -f qcow2 -O raw \
Fedora-Cloud-Base-29.qcow2 5-Fedora-Cloud-Base-29.raw)
real 0m7.390s
user 0m5.179s
sys 0m1.780s
I.e. ~103 seconds of elapsed wall-clock time for 'writethrough' vs. ~7
seconds for 'writeback' -- IOW, 'writeback' is about _14_ times faster!
Details
-------
Nova commit e6ce9557f8 ("qemu-img do not
use cache=none if no O_DIRECT support") was introduced to make instances
boot on filesystems that don't support 'O_DIRECT' (which bypasses the
host page cache and flushes data directly to the disk), such as 'tmpfs'.
In doing so it introduced the 'writethrough' cache for the target image
for `qemu-img convert`.
This patch proposes to change that to 'writeback'.
Let's address the 'safety' concern:
"What about data integrity in the event of a host crash (especially
on shared file systems such as NFS)?"
Answer: If the host crashes mid-way during image conversion, then
neither "data integrity" nor the cache mode in use matters. But if the
image conversion completes _successfully_, then 'writeback' will safely
write the data to the physical disk, just as 'writethrough' does.
So we are as safe as we can be, but with the extra benefit of image
conversion being _much_ faster.
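As an illustration only, the proposed change boils down to which value is
passed to `-t` when the command line is assembled. The helper name
`build_convert_cmd` below is hypothetical and is not Nova's actual code; the
flags are the ones used in the benchmark above.

```python
def build_convert_cmd(source, dest, in_format, out_format,
                      cache_mode="writeback"):
    """Return the argv list for `qemu-img convert` (hypothetical helper)."""
    # -t selects the cache mode used for the *target* image.
    # -f / -O are the source and output formats, as in the benchmark.
    return ["qemu-img", "convert", "-t", cache_mode,
            "-f", in_format, "-O", out_format, source, dest]


cmd = build_convert_cmd("Fedora-Cloud-Base-29.qcow2",
                        "Fedora-Cloud-Base-29.raw", "qcow2", "raw")
print(" ".join(cmd))
```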
* * *
The `qemu-img convert` command defaults to 'cache=writeback' for the
source image. And 'cache=unsafe' for the target, because if `qemu-img`
"crashes during the conversion, the user will throw away the broken
output file anyway and start over"[1]. And `qemu-img convert`
supports[2] fsync() for the target image since QEMU 1.1 (2012).
[1] https://git.qemu.org/?p=qemu.git;a=commitdiff;h=1bd8e175
-- "qemu-img convert: Use cache=unsafe for output image"
[2] https://git.qemu.org/?p=qemu.git;a=commitdiff;h=80ccf93b
-- "qemu-img: let 'qemu-img convert' flush data"
Closes-Bug: #1818847
Change-Id: I574be2b629aaff23556e25f8db0d740105be6f07
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Looks-good-to-me'd-by: Kevin Wolf <kwolf@redhat.com>
During port detach the unbind towards neutron happens before the
port allocation is removed from placement. The functional test only
waited for the port unbind before asserting the remaining allocations and
therefore it was racy.
Fortunately the instance.interface_detach.end is emitted after both
the unbind and the allocation shrink. So the test is changed to wait for
this notification instead.
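A minimal sketch of the idea, assuming notifications are collected in a plain
list of dicts with an "event_type" key. The helper `wait_for_notification` is
hypothetical and stands in for whatever fixture the functional test really
uses; it only shows the polling pattern of waiting on the last event in the
sequence before asserting.

```python
import time


def wait_for_notification(notifications, event_type, timeout=10.0,
                          interval=0.1):
    """Poll until a notification with the given event_type shows up."""
    deadline = time.monotonic() + timeout
    while True:
        matches = [n for n in notifications
                   if n["event_type"] == event_type]
        if matches:
            return matches[0]
        if time.monotonic() >= deadline:
            raise AssertionError(
                "notification %s not received in time" % event_type)
        time.sleep(interval)


# Usage: only assert on allocations once the detach has fully finished.
received = [{"event_type": "instance.interface_detach.end"}]
note = wait_for_notification(received, "instance.interface_detach.end")
```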
Change-Id: I53d76d6353ae634e387672e14943f518955b221e
Closes-Bug: #1819374
This adds an online data migration for the user_id field on
InstanceMapping. It does this by processing instance mappings that do
not have a value set for the field (i.e. are NULL in the database) and
querying the instances in each cell that need to be updated.
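The batch logic can be sketched roughly as follows, using in-memory dicts in
place of database rows. The function name `populate_user_id`, the dict layout,
and the `(found, done)` return value are assumptions made for illustration;
this is not the migration's actual code.

```python
def populate_user_id(mappings, instances_by_uuid, max_count):
    """Fill in user_id on up to max_count mappings that lack it.

    Returns (found, done): mappings examined that needed the value, and
    mappings actually updated.
    """
    found = done = 0
    for mapping in mappings:
        if found >= max_count:
            break
        if mapping["user_id"] is not None:
            continue  # already migrated
        found += 1
        instance = instances_by_uuid.get(mapping["instance_uuid"])
        if instance is None:
            continue  # instance not found in any cell; leave as-is
        mapping["user_id"] = instance["user_id"]
        done += 1
    return found, done


mappings = [
    {"instance_uuid": "u1", "user_id": None},
    {"instance_uuid": "u2", "user_id": "bob"},
    {"instance_uuid": "u3", "user_id": None},
]
instances = {"u1": {"user_id": "alice"}}
result = populate_user_id(mappings, instances, max_count=50)
```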
Part of blueprint count-quota-usage-from-placement
Change-Id: I8cc873ba63db7b806ab1de0a88fe8a87d4baeea9
The InstanceMapping user_id field is a new, non-nullable field
representing the user_id for the instance.
When new instance create requests come in, we create the instance
mapping. We will set user_id here before creating the record.
The virtual interface online data migration and the map_instances
routine create InstanceMapping records and, since the user_id field did
not previously exist, they were not setting it. We will populate user_id
in these cases.
Finally, whenever an API does a compute_api.get(), we can
opportunistically set and save user_id on the instance mapping if it is
not set.
Part of blueprint count-quota-usage-from-placement
Change-Id: Ic4bb7b49b90a3d6d7ce6c6c62d87836f96309f06
This adds the new user_id column from the instance_mappings table as
a field in the InstanceMapping object. There is already a project_id
field containing the project_id for the instance. The user_id field
will contain the corresponding user_id for the instance.
Part of blueprint count-quota-usage-from-placement
Change-Id: I0f523b2a2e09e1ece9e1911325e55cffd183a9d5
Micro-version 2.68 removed force evacuation. This change
updates gate/test_evacuate.sh to use micro-version 2.67.
Closes-Bug: #1819166
Change-Id: I44a3514b4b0ba1648aa96f92e896729c823b151c
The instance_mappings table already contains the project_id for an
instance. This adds the corresponding user_id. This also adds an index
on (user_id, project_id) because later patches in this series will be
using these columns for quota counting.
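To illustrate why the composite index helps, here is a small sketch using an
in-memory SQLite database. The table is reduced to the columns mentioned in
this series, and the index name is hypothetical; the real schema and
migration differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the instance_mappings table.
conn.execute(
    "CREATE TABLE instance_mappings ("
    " id INTEGER PRIMARY KEY,"
    " instance_uuid TEXT NOT NULL,"
    " project_id TEXT NOT NULL,"
    " user_id TEXT)")
# Composite index on (user_id, project_id); the name is an assumption.
conn.execute(
    "CREATE INDEX instance_mappings_user_id_project_id_idx"
    " ON instance_mappings (user_id, project_id)")
conn.executemany(
    "INSERT INTO instance_mappings (instance_uuid, project_id, user_id)"
    " VALUES (?, ?, ?)",
    [("u1", "demo", "alice"), ("u2", "demo", "alice"),
     ("u3", "demo", "bob"), ("u4", "other", "alice")])

# The kind of per-user, per-project count that quota checks would need.
count = conn.execute(
    "SELECT COUNT(*) FROM instance_mappings"
    " WHERE user_id = ? AND project_id = ?",
    ("alice", "demo")).fetchone()[0]
print(count)
```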
Part of blueprint count-quota-usage-from-placement
Change-Id: Id9eef7a58f66c73cd638c6c3e228447b7ab81e34
This change extends nova/pci/request.py with a method
that retrieves an instance's PCI request from a given VIF
if the given VIF required a PCI allocation during instance
creation.
The PCI request, if retrieved, belongs to a PCI device
in the compute node where the instance is running.
The change is required to facilitate SR-IOV live migration,
allowing VIF-related PCI resources to be claimed on
the destination node.
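A rough sketch of such a lookup, using plain dicts. The function name
`pci_request_from_vif`, the use of a "pci_slot" key in the VIF profile, and
the matching on "request_id" are assumptions for illustration, not the method
added to nova/pci/request.py.

```python
def pci_request_from_vif(vif, pci_devices, pci_requests):
    """Return the PCI request backing a VIF, or None if it has none."""
    # Assumption: a VIF that needed a PCI allocation records the device
    # address in its profile.
    pci_slot = vif.get("profile", {}).get("pci_slot")
    if pci_slot is None:
        return None
    for dev in pci_devices:  # devices on the node running the instance
        if dev["address"] != pci_slot:
            continue
        for req in pci_requests:
            if req["request_id"] == dev["request_id"]:
                return req
    return None


devices = [{"address": "0000:03:00.1", "request_id": "req-1"}]
requests = [{"request_id": "req-1", "count": 1}]
sriov_vif = {"profile": {"pci_slot": "0000:03:00.1"}}
normal_vif = {"profile": {}}
found = pci_request_from_vif(sriov_vif, devices, requests)
```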
Change-Id: I9ba475e91b8283f063db446de74d3e4b2de002c5
Partial-Implements: blueprint libvirt-neutron-sriov-livemigration
Today neutronv2's bind_ports_to_host() method is being used
by the conductor during live-migration to bind neutron ports
to the destination host when the neutron port binding API extension
is supported[1].
Until now, modifying the vnic_type and profile was not required,
as the only vnic_type officially supported for live-migration
was the 'normal' vnic_type.
Support for live-migration with SR-IOV ports requires updates to
the port's profile with host specific information, e.g. the
claimed PCI device address on the destination node.
- This change modifies bind_ports_to_host to accept a per port
dictionary for both vnic_type and port profile,
allowing the user to override each attribute on a per-port
basis.
- This change updates bind_ports_to_host to generate
a separate payload per vif instead of assuming all
vifs attached to an interface share the same
vnic_type and profile data.
- The change aims to extend the existing logic of
bind_ports_to_host to allow the flexibility for the
user to augment port attributes on a per-port basis.
The change does not protect the user from wrongfully
calling this method.
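The per-port payload generation could look something like the sketch below.
The helper name `build_binding_payloads` and the exact payload shape are
assumptions for illustration; the point is only that each port gets its own
body, with optional per-port overrides.

```python
def build_binding_payloads(port_ids, host, vnic_types=None,
                           port_profiles=None):
    """Build one binding payload per port (hypothetical helper)."""
    vnic_types = vnic_types or {}        # port_id -> vnic_type
    port_profiles = port_profiles or {}  # port_id -> profile dict
    payloads = {}
    for port_id in port_ids:
        binding = {"host": host}
        # Only ports with an explicit override get the extra attributes.
        if port_id in vnic_types:
            binding["vnic_type"] = vnic_types[port_id]
        if port_id in port_profiles:
            binding["profile"] = port_profiles[port_id]
        payloads[port_id] = {"binding": binding}
    return payloads


payloads = build_binding_payloads(
    ["port-a", "port-b"], "dest-host",
    vnic_types={"port-b": "direct"},
    port_profiles={"port-b": {"pci_slot": "0000:03:00.1"}})
```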
[1] https://blueprints.launchpad.net/nova/+spec/neutron-new-port-binding-api
Change-Id: I958685ee20676d45e5fbdf020b82d5844dcc85fe
Partial-Implements: blueprint libvirt-neutron-sriov-livemigration
Co-Authored-By: Adrian Chiris <adrianc@mellanox.com>
The existing PCI manager exposes a single method to free PCI resources,
free_instance(), which frees both claimed and allocated PCI resources
for an instance.
This change proposes to extend the PCI manager API with two methods:
1. free_instance_claims() : free PCI resource claims for an instance.
2. free_instance_allocations() : free PCI resource allocations for
an instance.
This change refactors free_instance() to use (1) and (2) from above.
This change is required to enable SR-IOV live migration, which
needs to free instance PCI allocations on the source node
in case of a successful migration and free instance PCI claims
on the destination node in case of an unsuccessful migration.
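The shape of the refactoring can be sketched as below. The class
`PciManagerSketch` and its dict-based bookkeeping are hypothetical
simplifications; only the three method names come from this change.

```python
class PciManagerSketch:
    """Toy model of the split between claims and allocations."""

    def __init__(self):
        self.claims = {}       # instance uuid -> claimed device addresses
        self.allocations = {}  # instance uuid -> allocated device addresses
        self.free_devices = []

    def free_instance_claims(self, instance_uuid):
        # E.g. on the destination node after a failed migration.
        self.free_devices.extend(self.claims.pop(instance_uuid, []))

    def free_instance_allocations(self, instance_uuid):
        # E.g. on the source node after a successful migration.
        self.free_devices.extend(self.allocations.pop(instance_uuid, []))

    def free_instance(self, instance_uuid):
        # The original entry point now delegates to the two new methods.
        self.free_instance_allocations(instance_uuid)
        self.free_instance_claims(instance_uuid)


mgr = PciManagerSketch()
mgr.claims["inst-1"] = ["0000:03:00.1"]
mgr.allocations["inst-1"] = ["0000:03:00.2"]
mgr.free_instance_claims("inst-1")
after_claims = list(mgr.free_devices)
mgr.free_instance("inst-1")
```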
Change-Id: Id961f0fc219f32a2cf0282859f228e87cb36ffeb
Partial-Implements: blueprint libvirt-neutron-sriov-livemigration
The admin and user flavor docs on pci.alias were not super
helpful by just throwing the user to the config docs or
flavor docs and letting them figure it out. This change
helps the reader by linking directly to the things being
referenced.
Also cleans up a pci.passthrough config option reference
while in here.
Change-Id: Ie2e28a14ff4655e38a5db3925adcd605ac773843