Add a retry for the xapi glance plugin to handle transient issues
when uploading the vhd.
An example issue we're seeing is a connection timeout:
['XENAPI_PLUGIN_FAILURE', 'upload_vhd', 'error',
"(110, 'Connection timed out')"]
To work around transient issues such as a connection timeout, we
should retry based on glance_num_retries before outright failing.
Change-Id: Ice6fdd3dd39ef40e5997d69209aaafa66bff5d6e
Fixes: bug #1134493
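The retry behaviour described in the upload_vhd change above could look roughly like the following. This is a minimal sketch, not the plugin's actual code: `PluginError`, `call_with_retries` and the fixed `delay` are hypothetical names, and the only detail taken from the commit is that the retry count comes from a setting such as glance_num_retries.

```python
import time


class PluginError(Exception):
    """Hypothetical stand-in for a XENAPI_PLUGIN_FAILURE."""


def call_with_retries(func, num_retries, sleep=time.sleep, delay=1.0):
    """Call func(), retrying up to num_retries times on PluginError.

    num_retries plays the role of glance_num_retries; delay is an
    assumed fixed pause between attempts.
    """
    attempts = num_retries + 1  # the first try plus the retries
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except PluginError:
            if attempt == attempts:
                raise  # out of retries: surface the original failure
            sleep(delay)
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting.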
Tox and run_tests.sh were running PEP8 checks against different
file-sets. This patch refactors the logic to determine which files to
run PEP8 checks on into `tools/run_pep8.sh` where it can be called by
both tox and `run_tests.sh`.
Additional fixes:
Some of our Python XenAPI Dom0 plugins don't end in *.py but should
still be checked by PEP8. This patch fixes the hacking.py violations
in the files and adds them back to the srcfiles list.
Merged tools/unused_imports.sh into tools/run_pep8.sh
Change-Id: Id5edd1acb644ab938beffc3473494a179d9d8cda
fix the N402 errors that have slipped in in the last 48 hrs since
starting this patch series.
fix an N401 error that our scanner currently can't find because it
only looks for doc strings on classes and defs.
this is the Zeno's paradox of patch series, but we're getting close.
Change-Id: I4a763bb4c812335d853eae05c72464f18ab93297
When the bandwidth polling task tries to update bw usage for an
instance that does not have an entry for the last two periods already
in the bw_usage_cache table, it will throw an exception in the
polling task. This was just a missing 'if' statement.
Also:
xenserver: fix swapped bw data in xs plugin.
This fixes bug 1075255
Change-Id: I44bb143039fcdfc8dacb13b67ae8f79dc5f38777
This patch adds a new dom0 plugin which supports downloading images via
BitTorrent. Torrent metadata files are assumed to be served from a
webserver which is specified by the `torrent_base_url` config.
Under the hood, the dom0 plugin calls out to rasterbar's libtorrent via
Python bindings in order to perform the initial download as well as the
seeding thereafter.
Implements BP xenserver-bittorrent-images
Change-Id: I824720a6e3a37317080a22cd7405d2a88172c3ef
This patch removes our legacy handling of swap in the image. Now that
we're generating swap on-the-fly, this stop-gap solution can go away.
Change-Id: Ied3198f77af8dabb6cfbf2ab9cfb3a4eb18e32ea
Windows can take longer than the default 30 seconds for resetnetwork
requests. Double the timeout for the command to 60 seconds, but add
a flag so it can be changed without code changes in the future.
At the same time, add a flag for all other agent requests too.
Change-Id: Iba91c37fd5596ea0dd63c20f74925972df1ca715
This changes the method used to poll xenserver for bandwidth data.
The recommended way of collecting such data from xenserver (namely the
RRD files provided by the hosts) does not seem to be reliable: the data
will sometimes be correct, often will be significantly under (> 10%),
and occasionally will show artifacts, such as phantom 4 GB bandwidth
'spikes'.
This patch changes that to use the much simpler method of simply polling the
byte counters on the VIF network devices on the host. (We have old non-nova
code that does that on xenserver, and that method is known to work).
This should also make it much easier for hypervisors other than
xenserver to implement bandwidth polling, as polling the counters is a rather
more universal method.
Fixes bug 1055737
Change-Id: I6a280d8bbfcc74914f888b11bc09349a270a5f58
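As an illustration of the counter-polling approach in the change above, here is a minimal sketch that extracts per-VIF byte counters from text in the Linux `/proc/net/dev` layout. Reading `/proc/net/dev` is an assumption made for this example (the commit does not say where the plugin reads the counters from), and `parse_vif_counters` is a hypothetical name.

```python
def parse_vif_counters(proc_net_dev_text):
    """Return {vif_name: {"rx_bytes": int, "tx_bytes": int}}.

    Expects text in the /proc/net/dev layout: two header lines, then
    one "name: <8 receive fields> <8 transmit fields>" line per device.
    Counters are reported as seen by the host.
    """
    counters = {}
    for line in proc_net_dev_text.splitlines():
        if ":" not in line:
            continue  # header lines carry no "name:" prefix
        name, _, stats = line.partition(":")
        name = name.strip()
        if not name.startswith("vif"):
            continue  # only VIF devices are of interest here
        fields = stats.split()
        if len(fields) < 16:
            continue  # malformed line, skip rather than guess
        counters[name] = {
            "rx_bytes": int(fields[0]),  # first receive field
            "tx_bytes": int(fields[8]),  # first transmit field
        }
    return counters
```

Usage over a period would then be the difference between two successive readings for the same device.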
Fixes bug 1055431.
As the scheduler wants to boot a vm_mode=xen type of image, the host's
"supported_instances" capability is used for finding a good candidate.
In the Xapi case, this field was not populated.
This fix modifies the xapi xenhost plugin so that the Xen host
capabilities are returned to the compute node as "host_capabilities".
On the compute side, this information is used to extract the
"supported_instances" information.
Change-Id: I2da11ab81f74b5b52e2c30832a694470978e21b0
The dom0 plugin code had been using `pickle` for serializing input and
`json` for serializing output which was needlessly inconsistent. This
patch makes the code use `pickle`--chosen for its better handling of
`datetime` objects--for both sending and receiving data.
This patch also refactors the code so that neither the caller nor the
callee need to explicitly worry about serialization: the caller just
passes in args and kwargs, and the callee's function signature just
accepts the args and kwargs as usual.
Bonus: Removes unnecessary imports
Change-Id: I3abb42eeebd8d37d67e6c26fa7bcae66d876b3ee
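The serialization refactoring above can be pictured with a small sketch in which the caller passes ordinary args and kwargs and the callee has an ordinary signature, with `pickle` used in both directions. The helper names (`encode_call`, `dispatch`, `decode_result`) are hypothetical and do not come from the plugin code.

```python
import datetime
import pickle


def encode_call(*args, **kwargs):
    """Caller side: bundle args and kwargs into one pickled payload."""
    return pickle.dumps({"args": args, "kwargs": kwargs})


def dispatch(func, payload):
    """Callee side: unpack the payload, call func, pickle the result."""
    params = pickle.loads(payload)
    result = func(*params["args"], **params["kwargs"])
    return pickle.dumps(result)


def decode_result(payload):
    """Caller side: recover the callee's return value."""
    return pickle.loads(payload)
```

A `datetime` argument survives the round trip unchanged, which is the property the commit cites for choosing `pickle`. Note that `pickle.loads` must only be used on data from a trusted peer.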
This introduces a new glance_api_insecure setting that can be used to
not verify the certificate of the glance server against the certificate
authorities.
Fix bug 1042081.
Change-Id: I0a9f081425854e9c01e00dfd641e42276c878c67
Communicating with the agent requires polling for a response. The
operation uses xenstore, which is lightweight, yet the interval
in between polls was 3 seconds. This would cause longer than necessary
sleeps when an instance was booting, making the overall boot slower.
Change-Id: I560c05887128f1a0e29228e859cca25ded4eceec
Unlike every other agent command, the resetnetwork command would not
wait for a response. All failures were silently ignored. Change this
to at least log a message if an error occurs.
Change-Id: I40e323607b2ce50869f3bf11e4582ff83cbed1c0
The core problem is that XenServer's `VDI.copy` call drops the
destination file directly into the SR. This means that half-completed
files are visible with no way to distinguish these from fully-copied
files.
We had some code that attempted to mitigate this issue by checking
physical_utilisation against an expected value. The problem with this
code is that it didn't account for VDI chaining where the
physical_utilisation would not necessarily match the parent.
The net effect of this was that 'cloned' VDIs would never be found
because their physical_utilisation was far below what was expected.
The workaround is to create our own `_safe_copy_vdi` which is isolated
and atomic. Long term, `VDI.copy` should be fixed so that half-completed
files are never stored in the SR.
Change-Id: I6eb3cb5259f9ee1c7394e58f76105a8b39bfc720
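The "isolated and atomic" idea behind `_safe_copy_vdi` can be shown on plain files: copy into a staging directory that nothing else scans, then rename into place. This is only a sketch of the general pattern under the assumption that both directories are on the same filesystem; it is not the actual `_safe_copy_vdi` implementation, and `safe_copy_file` is a hypothetical name.

```python
import os
import shutil


def safe_copy_file(src, staging_dir, dest_dir):
    """Copy src into dest_dir without exposing a half-written file.

    The slow copy happens in staging_dir (isolated); the final
    os.rename is atomic on POSIX when both directories live on the
    same filesystem, so readers of dest_dir see all or nothing.
    """
    name = os.path.basename(src)
    staged = os.path.join(staging_dir, name)
    shutil.copyfile(src, staged)
    final = os.path.join(dest_dir, name)
    os.rename(staged, final)
    return final
```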
The VHD sequence validation code was erroneously counting `swap.vhd`,
which caused it to raise an exception when a corresponding numbered VHD
was not found.
The fix is to simply ignore the `swap.vhd` file.
Other unknown VHDs will generate an exception, but from a
sanity-checking perspective, this is a Good Thing(tm).
Fixes bug 1030939
Change-Id: Ic82ae27a4af7ea8f7669fd006aea1a310b691218
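The validation rule described above (ignore `swap.vhd`, reject unknown VHDs, require sequence numbers 0 to n-1 with no gaps) might be sketched as follows. The `N.vhd` file naming and the names `BadSequenceError` and `validate_vhd_sequence` are assumptions made for illustration.

```python
import re


class BadSequenceError(Exception):
    """Hypothetical error for a staging area with bad VHD numbering."""


def validate_vhd_sequence(filenames):
    """Return the sorted sequence numbers, or raise BadSequenceError."""
    seq_nums = []
    for name in filenames:
        if name == "swap.vhd":
            continue  # swap is not part of the numbered chain
        match = re.fullmatch(r"(\d+)\.vhd", name)
        if match is None:
            raise BadSequenceError("unexpected VHD: %s" % name)
        seq_nums.append(int(match.group(1)))
    seq_nums.sort()
    # A valid chain is exactly 0, 1, ..., n-1 with no gaps or repeats.
    if seq_nums != list(range(len(seq_nums))):
        raise BadSequenceError("bad sequence: %s" % seq_nums)
    return seq_nums
```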
This is a sanity check to ensure the footer timestamps on a VHD are
reasonable (e.g. not in the future). This condition can occur if the
local times for the source and destination machines in a migration are
not in agreement, requiring an adjustment to /etc/localtime and/or NTP
reconfiguration.
Without this check there is a risk of importing a corrupt VHD into the
SR causing the entire SR to become corrupted.
Change-Id: I17228e50d6f54632f3bfc32a682e511f876517ec
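A footer timestamp check of the kind described above could be sketched like this. It assumes the layout from the published VHD format specification (a "conectix" cookie at the start of the footer, and a 4-byte big-endian timestamp at offset 24 counting seconds since 2000-01-01 00:00:00 UTC). How the plugin itself reads the timestamp is not stated in the commit, and the function names are hypothetical.

```python
import calendar
import struct
import time

# Unix time of the VHD epoch, 2000-01-01 00:00:00 UTC.
VHD_EPOCH = calendar.timegm((2000, 1, 1, 0, 0, 0))


def footer_timestamp(footer):
    """Return the footer's timestamp as Unix time."""
    if len(footer) < 28 or footer[:8] != b"conectix":
        raise ValueError("not a VHD footer")
    (seconds,) = struct.unpack(">I", footer[24:28])
    return VHD_EPOCH + seconds


def timestamp_is_sane(footer, now=None):
    """True if the footer timestamp is not in the future."""
    if now is None:
        now = time.time()
    return footer_timestamp(footer) <= now
```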
This will provide a bit more visibility into what's happening when a dom0
plugin generates an exception.
Change-Id: Ia529956ee4fc56e49efdcf2cca4f42fc8ebcc3ea
This adds a check to raise a sensible error if the VHDs in the staging
area are not sequence numbered properly, meaning 0 to n-1 with no gaps.
The previous error was an UnboundLocalError which made it difficult to
pinpoint the root cause.
Change-Id: I6b9e4f854c271bf73711480568be384ba883775d
The strategy for removing the limit is to refactor migrations so that
they work nearly identically to snapshots, meaning sequence-numbered
VHDs are rsynced over into a staging-area and then imported into the SR
using the `import_vhds` function.
Change-Id: Ibf5c82c52ae7d505ea9e54d64fcc8b8fdce4d05d
Recent versions of xenserver use a newer, but buggy, version of rsync
that will attempt to parse out the username and fail if it isn't in
the destination. So, add it to the destination to ensure this works
with both older and newer versions of rsync.
Change-Id: I9b7f05a8ea5cf5b7fae1a55a2b8557b2bfe5b865
Snapshots and migrations were coded with a simplifying assumption that
the maximum length of a VDI chain would be 3. Now that fast-cloning has
been added, this assumption no longer holds.
The goal of this patch is to remove the restriction for snapshots. A
follow-on patch will remove the restriction for migrations.
This patch changes the image-format for XenAPI images. Instead of naming
the VHDs 'base', 'image', and 'snap', they are now numbered starting
with 0 as the leaf and going to N as the base-copy (root).
Old-style images are still supported.
Change-Id: Ieb073b42dc25db7cee4dfca7ff6525f7e7f46e8e
Fixes bug 1022681
If deleting the kernel/ramdisk fails because the files don't exist
anymore, then ignore the error and continue. This will ensure that the
instance will get destroyed properly even if the files were deleted
outside of nova.
Change-Id: I5d1f95ea2a6f552c48efbb9e92bb36767df19e34
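The tolerant delete described above amounts to swallowing only the "file does not exist" error. A minimal sketch, with `remove_if_exists` as a hypothetical helper name:

```python
import errno
import os


def remove_if_exists(path):
    """Delete path; return False instead of failing if it is already gone."""
    try:
        os.remove(path)
    except OSError as exc:
        if exc.errno != errno.ENOENT:
            raise  # permission problems etc. should still surface
        return False
    return True
```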
Nova has additional pep8 "plugins" that they expect to run as part of the
gate. This patch will run tools/hacking.py instead of pep8 directly. Also,
it fixes the hacking violations in contrib, plugins and smoketests.
Fixes bug 1010136
Change-Id: I86d8789218c197d5d4a43d1201465d340646a395
Adds a call to retrieve the current uptime on a specific hypervisor.
This version of the patch only adds the XenAPI variant; other virt
drivers will raise a NotImplementedError until they implement the
get_host_uptime() method.
Change-Id: Ie259589757a460fcd91a49a8dd8099e4d91524e7
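The driver arrangement described above, where the base class raises NotImplementedError until a driver overrides get_host_uptime(), can be sketched as below. The class names, the no-argument signature and the injected `run_uptime` callable are assumptions for illustration, not Nova's actual interfaces.

```python
class ComputeDriver:
    """Hypothetical base class for virt drivers."""

    def get_host_uptime(self):
        # Drivers that have not implemented the call fail loudly.
        raise NotImplementedError()


class XenLikeDriver(ComputeDriver):
    """Hypothetical driver that implements the call."""

    def __init__(self, run_uptime):
        # run_uptime is an assumed callable that asks the host.
        self._run_uptime = run_uptime

    def get_host_uptime(self):
        return self._run_uptime()


class OtherDriver(ComputeDriver):
    """Hypothetical driver that has not implemented the call yet."""
```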
The kernel and ramdisk VDI manipulation code really has nothing to do
with Glance so it doesn't make a lot of sense for it to exist in the
Glance Dom0 plugin.
This patch refactors the code out to its own plugin and then applies a
few misc cleanups.
Change-Id: I363d54ea3c2d51aa6a6c1635b4fb59ebb9ce1fc0
The virt-layer code was refactored so that a dict was used to pass
around which VDIs are present. This code makes the Dom0 plugin return
that same data structure so we don't have to perform an extra conversion
step.
Change-Id: Ib4f1b0082138d233eb0c3873bbc553395510bc8d
`utils.py` was added by the Dom0 plugin refactoring but the
corresponding declaration in the SPEC file was forgotten.
Change-Id: If12d7389a51928b1f741063e12b3b5a9015d0656
Windows agent requires an argument of either 'agent' or 'xentools' to
the 'version' command. All we care about is 'agent', so add it. The
unix agent happily ignores the arg.
Fixes bug 997805
Change-Id: Ic369c8a2850173057da9d3175a02b5864d7a6514
Instance types define disk names as root, swap and ephemeral. The
XenAPI driver however uses os, swap and ephemeral. Standardize on
calling them 'root' disks instead of 'os' disks.
Change-Id: Ia34346d463d06cb971537c305602926ceb0dc175