Commit Graph

126 Commits

Author SHA1 Message Date
Zuul 9dbf7b7ca9 Merge "Drop delete_build_requests_with_no_instance_uuid online migration" 2019-07-23 22:43:45 +00:00
Zuul 8fc20874b8 Merge "nova-manage: heal port allocations" 2019-07-22 21:59:30 +00:00
Zuul 063ef486e9 Merge "Exit 1 when db sync runs before api_db sync" 2019-07-20 03:26:41 +00:00
Balazs Gibizer 54dea2531c nova-manage: heal port allocations
Before I97f06d0ec34cbd75c182caaa686b8de5c777a576 it was possible to
create servers with neutron ports which had a resource_request (e.g. a
port with a QoS minimum bandwidth policy rule) without allocating the
requested resources in placement. So there could be servers for which
the allocation needs to be healed in placement.

This patch extends the nova-manage heal_allocations CLI to create the
missing port allocations in placement and update the port in neutron
with the resource provider uuid that is used for the allocation.

There are known limitations of this patch. It does not try to reimplement
Placement's allocation candidate functionality. Therefore it cannot
handle the situation when there is more than one RP in the compute
tree which provides the required traits for a port. Deciding which RP
to use in that situation would require 1) the in_tree allocation
candidate support from placement, which is not available yet, and 2)
information about which PCI PF an SRIOV port's VF is allocated from
and which RP represents that PCI device in placement. This information
is only available on the compute hosts.

For the unsupported cases the command will fail gracefully. As soon as
migration support for such servers is implemented in the blueprint
support-move-ops-with-qos-ports, the admin can heal the allocation of
such servers by migrating them.

During healing, both placement and neutron need to be updated. If either
update fails, the code tries to roll back the previous updates for the
instance to make sure that the healing can be re-run later without
issue. However, if the rollback fails, the script terminates with an
error message pointing to documentation that describes how to
recover from such a partially healed situation manually.
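
A sketch of the resulting workflow (the subcommand and --verbose flag
already exist; any output is omitted here):

  # Heal allocations, now including port allocations, for all instances:
  nova-manage placement heal_allocations --verbose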

Closes-Bug: #1819923
Change-Id: I4b2b1688822eb2f0174df0c8c6c16d554781af85
2019-07-15 17:22:40 +02:00
Mark Goddard e99937c9a9 Exit 1 when db sync runs before api_db sync
Since cells v2 was introduced, nova operators must run two commands to
migrate the database schemas of nova's databases - nova-manage api_db
sync and nova-manage db sync. It is necessary to run them in this order,
since the db sync may depend on schema changes made to the api database
in the api_db sync. Executing the db sync first may fail, for example
with the following seen in a Queens to Rocky upgrade:

nova-manage db sync
ERROR: Could not access cell0.
Has the nova_api database been created?
Has the nova_cell0 database been created?
Has "nova-manage api_db sync" been run?
Has "nova-manage cell_v2 map_cell0" been run?
Is [api_database]/connection set in nova.conf?
Is the cell0 database connection URL correct?
Error: (pymysql.err.InternalError) (1054, u"Unknown column
        'cell_mappings.disabled' in 'field list'") [SQL: u'SELECT
cell_mappings.created_at AS cell_mappings_created_at,
cell_mappings.updated_at AS cell_mappings_updated_at,
cell_mappings.id AS cell_mappings_id, cell_mappings.uuid AS
cell_mappings_uuid, cell_mappings.name AS cell_mappings_name,
cell_mappings.transport_url AS cell_mappings_transport_url,
cell_mappings.database_connection AS
cell_mappings_database_connection, cell_mappings.disabled AS
cell_mappings_disabled \nFROM cell_mappings \nWHERE
cell_mappings.uuid = %(uuid_1)s \n LIMIT %(param_1)s'] [parameters:
{u'uuid_1': '00000000-0000-0000-0000-000000000000', u'param_1': 1}]
(Background on this error at: http://sqlalche.me/e/2j85)

Despite this error, the command actually exits zero, so deployment tools
are likely to continue with the upgrade, leading to issues down the
line.

This change modifies the command to exit 1 if the cell0 sync fails.

This change also clarifies this ordering in the upgrade and nova-manage
documentation, and adds information on exit codes for the command.
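
A deployment tool can now rely on the exit status to enforce the
ordering, e.g. (illustrative):

  nova-manage api_db sync   # must run first
  nova-manage db sync       # exits 1 if the cell0 sync fails
  echo $?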

Change-Id: Iff2a23e09f2c5330b8fc0e9456860b65bd6ac149
Closes-Bug: #1832860
2019-07-04 09:16:41 +01:00
Zuul 0824fd1864 Merge "Clarify --before help text in nova manage" 2019-06-22 00:42:53 +00:00
Eric Fried d8ad9f986e Clarify --before help text in nova manage
The --before option to nova manage db purge and archive_deleted_rows
accepts a string to be parsed by dateutil.parser.parse() with
fuzzy=True. This is fairly forgiving, but doesn't handle e.g. "now - 1
day". This commit adds some clarification to the help strings, and some
examples to the docs.
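
For example (illustrative; absolute or fuzzy-parseable dates work,
relative expressions do not):

  nova-manage db archive_deleted_rows --before "2019-06-01" --max_rows 1000
  nova-manage db purge --before "Jun 15 2019"
  nova-manage db purge --before "now - 1 day"   # not supported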

Change-Id: Ib218b971784573fce16b6be4b79e0bf948371954
2019-06-19 20:07:12 +00:00
Stephen Finucane 009fd0f35b docs: Remove references to nova-consoleauth
We're going to remove all the code, but first, remove the docs.

Part of blueprint remove-consoleauth

Change-Id: Ie96e18ea7762b93b4116b35d7ebcfcbe53c55527
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2019-06-17 15:18:31 +01:00
melanie witt 2fc3c9453a Literalize CLI options in docs
This puts CLI options under doc/source/cli/ in literal quotes for nicer
doc renderings.

Change-Id: Iafb90ec020de4de88fc59f1f15f1a6e0972e78fb
2019-06-13 18:59:09 +00:00
melanie witt 5c544c7e2a Warn for duplicate host mappings during discover_hosts
When the 'nova-manage cell_v2 discover_hosts' command is run in parallel
during a deployment, simultaneous attempts to map the same compute or
service hosts result in tracebacks:

  "DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, u\"Duplicate
  entry 'compute-0.localdomain' for key 'uniq_host_mappings0host'\")
  [SQL: u'INSERT INTO host_mappings (created_at, updated_at, cell_id,
  host) VALUES (%(created_at)s, %(updated_at)s, %(cell_id)s,
  %(host)s)'] [parameters: {'host': u'compute-0.localdomain',
  'cell_id': 5, 'created_at': datetime.datetime(2019, 4, 10, 15, 20,
  50, 527925), 'updated_at': None}]

This adds more information to the command help and adds a warning
message when duplicate host mappings are detected with guidance about
how to run the command. The command will return 2 if a duplicate host
mapping is encountered and the documentation is updated to explain
this.

This also adds a warning to the scheduler periodic task to recommend
enabling the periodic on only one scheduler to prevent collisions.

We choose to warn and stop instead of ignoring DBDuplicateEntry because
there could potentially be a large number of parallel tasks competing
to insert duplicate records where only one can succeed. If we ignore
and continue to the next record, the large number of tasks will
repeatedly collide in a tight loop until all get through the entire
list of compute hosts that are being mapped. So we instead stop the
colliding task and emit a message.
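
A deployment script can watch for the new return code, e.g. (a sketch):

  nova-manage cell_v2 discover_hosts --verbose
  if [ $? -eq 2 ]; then
      echo "duplicate host mapping detected; re-run from a single node"
  fi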

Closes-Bug: #1824445

Change-Id: Ia7718ce099294e94309103feb9cc2397ff8f5188
2019-06-13 17:18:16 +00:00
Jake Yip e822360b66 Add --before to nova-manage db archive_deleted_rows
Add a parameter to limit the archival of deleted rows by date. That is,
only rows related to instances deleted before the provided date will be
archived.

This option works together with --max_rows; if both are specified,
both take effect.
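
For example (illustrative):

  # Archive at most 500 rows per table, and only rows belonging to
  # instances deleted before 2018-01-01:
  nova-manage db archive_deleted_rows --before "2018-01-01" --max_rows 500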

Closes-Bug: #1751192
Change-Id: I408c22d8eada0518ec5d685213f250e8e3dae76e
Implements: blueprint nova-archive-before
2019-05-23 11:07:08 +10:00
Zuul 44e686c727 Merge "Add --instance option to heal_allocations" 2019-05-09 19:22:50 +00:00
Matt Riedemann 270d5d351e Add nova-status upgrade check for minimum required cinder API version
The compute API has required cinder API >= 3.44 since Queens [1] for
working with the volume attachments API as part of the wider
volume multi-attach support.

In order to start removing the compatibility code in the compute API
this change adds an upgrade check for the minimum required cinder API
version (3.44).

[1] Ifc01dbf98545104c998ab96f65ff8623a6db0f28
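
The check appears in the standard upgrade check output; the table below
is a trimmed, illustrative sketch:

  nova-status upgrade check
  +---------------------------+
  | Upgrade Check Results     |
  +---------------------------+
  | Check: Cinder API         |
  | Result: Success           |
  | Details: None             |
  +---------------------------+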

Change-Id: Ic9d1fb364e06e08250c7c5d7d4bdb956cb60e678
2019-05-03 11:53:12 -04:00
Zuul dd6bd75355 Merge "Query in_tree to placement" 2019-05-02 21:55:38 +00:00
Tetsuro Nakamura 575fd08e63 Query in_tree to placement
This patch adds the translation of `RequestGroup.in_tree` to the
actual placement query and bumps microversion to enable it.

The release note for this change is added.

Change-Id: I8ec95d576417c32a57aa0298789dac6afb0cca02
Blueprint: use-placement-in-tree
Related-Bug: #1777591
2019-04-17 08:52:59 +00:00
Stephen Finucane 7954b2714e Remove 'nova-manage cell' commands
These are no longer necessary with the removal of cells v1. A check for
cells v1 in 'nova-manage cell_v2 simple_cell_setup' is also removed,
meaning this can no longer return the '2' exit code.

Part of blueprint remove-cells-v1

Change-Id: I8c2bfb31224300bc639d5089c4dfb62143d04b7f
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2019-04-16 18:26:17 +01:00
Matt Riedemann c92b297896 Add --instance option to heal_allocations
This resolves one of the TODOs in the heal_allocations CLI
by adding an --instance option to the command which, when
specified, will process just the single instance given.
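
For example (illustrative; <instance_uuid> is a placeholder):

  # Heal allocations for one instance instead of iterating all cells:
  nova-manage placement heal_allocations --instance <instance_uuid>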

Change-Id: Icf57f217f03ac52b1443addc34aa5128661a8554
2019-04-15 10:34:59 -04:00
Matt Riedemann ded3e4d900 Add --dry-run option to heal_allocations CLI
This resolves one of the TODOs in the heal_allocations CLI
by adding a --dry-run option which will still print the output
as we process instances but not commit any allocation changes
to placement, just print out that they would happen.
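
For example (illustrative):

  # Report what would be healed without committing anything to placement:
  nova-manage placement heal_allocations --dry-run --verbose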

Change-Id: Ide31957306602c1f306ebfa48d6e95f48b1e8ead
2019-04-11 18:15:09 -04:00
Matt Riedemann aaee893a5e Drop delete_build_requests_with_no_instance_uuid online migration
This migration was added in Ocata and backported to Newton in change
I8a05ee01ec7f6a6f88b896f78414fb5487e0071e to deal with Mitaka-era
build_requests records that would not have an instance_uuid value
and thus raise a ValueError in BuildRequest._from_db_object (because
BuildRequest.instance_uuid is not nullable).

This is essentially a revert of that change now since operators
have had long enough to run the migration. If anyone were to do a
skip-level upgrade from Mitaka to Train (which we don't support; we
require you to roll through each release) and hit an issue with this,
they could simply execute this on their nova_api DB:

  DELETE FROM build_requests WHERE instance_uuid IS NULL;

Change-Id: Ie9593657544b7aef1fd7a5c8f01e30e09e3fcce6
2019-04-08 17:48:14 -04:00
Matt Riedemann cec1808050 Drop migrate_keypairs_to_api_db data migration
This was added in Newton:

  I97b72ae3e7e8ea3d6b596870d8da3aaa689fd6b5

And was meant to migrate keypairs from the cell
(nova) DB to the API DB. Before that though, the
keypairs per instance would be migrated to the
instance_extra table in the cell DB. The migration
to instance_extra was dropped in Queens with change:

  Ie83e7bd807c2c79e5cbe1337292c2d1989d4ac03

As the commit message on ^ mentions, the 345 cell
DB schema migration required that the cell DB keypairs
table was empty before you could upgrade to Ocata.

The migrate_keypairs_to_api_db routine only migrates
keypairs to the API DB if there are entries in the
keypairs table in the cell DB. Because of that blocker
migration in Ocata, that cannot be the case anymore, so
migrate_keypairs_to_api_db is just wasting time
querying the database during the online_data_migrations
routine without actually migrating anything. We
should just remove it.

Change-Id: Ie56bc411880c6d1c04599cf9521e12e8b4878e1e
Closes-Bug: #1822613
2019-04-03 11:42:48 -04:00
Zuul c756e868b6 Merge "Remove cells v1 (for the most part) from the docs" 2019-03-08 01:29:10 +00:00
Surya Seetharaman 660f717394 Update --max-rows parameter description for archive_deleted_rows
Since API tables do not have the concept of soft-delete, we purge
the instance_mappings, request_specs and instance_group_member records
of deleted instances while they are archived. The ``nova-manage db
archive_deleted_rows`` command offers a ``--max-rows`` parameter which
actually specifies the batch size of the iteration that moves the
soft-deleted records from the tables to their shadow tables. This
patch clarifies that the batch size does not include the API table
records that are purged, so that users are not confused when the
``--verbose`` output of the command reports more rows than specified.
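
For example, a run like the following (output sketched, not verbatim)
can legitimately report more than 50 rows, because the purged API
records are reported as well:

  nova-manage db archive_deleted_rows --max_rows 50 --verbose
  +-------------------+-------------------------+
  | Table             | Number of Rows Archived |
  +-------------------+-------------------------+
  | instances         | 50                      |
  | instance_mappings | 50                      |
  +-------------------+-------------------------+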

Change-Id: I652854c7192b996a33ed343a51a0fd8c7620e876
Closes-Bug: #1794994
2019-03-06 06:35:55 +00:00
Matt Riedemann bc5ef2ff06 Remove cells v1 (for the most part) from the docs
As discussed in the mailing list [1], since cells v1
has been deprecated since Pike and the biggest user
of it (CERN, as far as we know) moved to cells v2
in Queens, we can start rolling back the cells v1
specific documentation to avoid confusing people
new to nova about what cells is and to spare them
from having to understand that there was an optional v1.

There are still a few mentions of cells v1 left in
here for things like adding a new cell which need
to be re-written and for that I've left a todo.

Users can still get at cells v1 specific docs from
published stable branches and/or rebuilding the
docs from before this change.

[1] http://lists.openstack.org/pipermail/openstack-discuss/2019-February/002569.html

Change-Id: Idaa04a88b6883254cad9a8c6665e1c63a67e88d3
2019-02-13 13:59:09 -05:00
Zuul 13f8b54414 Merge "Allow run metadata api per cell" 2019-01-15 02:20:14 +00:00
Kevin_Zheng e2e372b2b1 Allow run metadata api per cell
Adds configuration option ``[api]/local_metadata_per_cell``
to allow users to run the Nova metadata API service per cell. Doing
this avoids querying the API DB for instance information each
time an instance queries for its metadata.
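
A minimal nova.conf snippet for a metadata API service deployed inside
a cell (opting in with True; a sketch based on the option name above):

  [api]
  local_metadata_per_cell = True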

Implements blueprint run-meta-api-per-cell

Change-Id: I2e6ebb551e782e8aa0ac90169f4d4b8895311b3c
2019-01-14 10:20:50 -05:00
Matt Riedemann 63a32c230c Remove "API Service Version" upgrade check
With change I11083aa3c78bd8b6201558561457f3d65007a177
the code for the API Service Version upgrade check no
longer exists and therefore the upgrade check itself
is meaningless now.

Change-Id: I68b13002bc745c2c9ca7209b806f08c30272550e
2018-12-19 20:41:04 -05:00
Matt Riedemann 7737acdc20 Remove "Resource Providers" upgrade check
With placement being extracted from nova, the
"Resource Providers" nova-status upgrade check no
longer works as intended since the placement data
will no longer be in the nova_api database. As a
result the check can fail on otherwise properly
deployed setups with placement extracted.

This check was originally intended to ease the upgrade
to Ocata when placement was required for nova to work,
as can be seen from the newton/ocata/pike references
in the code.

Note that one could argue that the check as a concept
is still useful for fresh installs, to make sure everything
is deployed correctly and nova-compute is properly
reporting into placement, but for it to be maintained
we would have to change it to no longer rely on the
nova_api database and instead use the placement REST API,
which, while possible, might not be worth the effort or
maintenance cost.

For simplicity and expediency, the check is simply removed
in this change.

Related mailing list discussion can be found here [1].

[1] http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000454.html

Change-Id: I630a518d449a64160c81410245a22ed3d985fb01
Closes-Bug: #1808956
2018-12-19 20:31:36 -05:00
Zuul af9977c71c Merge "Restore nova-consoleauth to install docs" 2018-12-17 16:36:28 +00:00
Zuul 5ca357c100 Merge "Use links to placement docs in nova docs" 2018-12-08 10:05:43 +00:00
Zuul 5bf6f6304e Merge "Deprecate the nova-xvpvncproxy service" 2018-12-05 13:18:41 +00:00
Zuul e26ac8f24a Merge "Deprecate the nova-console service" 2018-12-05 13:05:06 +00:00
melanie witt 983e6ea551 Restore nova-consoleauth to install docs
The installation of the nova-consoleauth service was removed
from the docs prematurely. The nova-consoleauth service
is still being used in Rocky, with the removal being possible in
Stein.

This should have been fixed as part of change
Ibbdc7c50c312da2acc59dfe64de95a519f87f123 but was missed.

This is also related to the release note update in Rocky
under change Ie637b4871df8b870193b5bc07eece15c03860c06.

Co-Authored-By: Matt Riedemann <mriedem.os@gmail.com>

Closes-Bug: #1793255
Related-Bug: #1798188

Change-Id: Ied268da9e70bd2807c2dfe7a479181fbec52979d
2018-11-28 15:21:55 -05:00
Takashi NATSUME 7dd7d9a5fa Use links to placement docs in nova docs
Placement documents have been published since
I667387ec262680af899a628520c107fa0d4eec24.

So use links to placement documents
https://docs.openstack.org/placement/latest/
in nova documents.

Change-Id: I218a6d11fea934e8991e41b4b36203c6ba3e3dbf
2018-11-26 05:39:56 +00:00
Zuul edee8e6f8d Merge "Add nova-status upgrade check for consoles" 2018-10-30 22:44:07 +00:00
melanie witt d2535b0261 Add nova-status upgrade check for consoles
This checks if a deployment is currently using consoles and warns
the operator to set [workarounds]enable_consoleauth = True on their
console proxy host if they are performing a rolling upgrade which is
not yet complete.

Partial-Bug: #1798188

Change-Id: Idd6079ce4038d6f19966e98bcc61422b61b3636b
2018-10-26 04:34:49 +00:00
Matt Riedemann c86f309c56 Add more documentation for online_data_migrations CLI
This is a follow up to commit c4c6dc736 to clarify some
confusing comments in the code, add more comments in
the actual runtime code, and also provide an example
in the CLI man page docs along with an explanation of
the output, specifically for the case that $found>0
but done=0 and what that means.
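
The documented case looks roughly like this (the table is sketched and
the migration name is hypothetical):

  nova-manage db online_data_migrations
  +------------------------+--------------+-----------+
  | Migration              | Total Needed | Completed |
  +------------------------+--------------+-----------+
  | example_data_migration | 10           | 0         |
  +------------------------+--------------+-----------+

Here 10 rows were found needing migration ($found>0) but none could be
completed in this run (done=0), which is the situation the new docs
explain.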

Change-Id: I0691caab2c44d3189504c54e51bb263ecdc5d1d2
Related-Bug: #1794364
2018-10-24 16:14:00 -04:00
imacdonn 3eea37b85b Handle online_data_migrations exceptions
When online_data_migrations raise exceptions, nova/cinder-manage catches
the exceptions, prints fairly useless "something didn't work" messages,
and moves on. Two issues:

1) The user(/admin) has no way to see what actually failed (exception
   detail is not logged)

2) The command returns exit status 0, as if all possible migrations have
   been completed successfully - this can cause failures to get missed,
   especially if automated

This change adds logging of the exceptions, and introduces a new exit
status of 2, which indicates that no updates took effect in the last
batch attempt, but some are (still) failing, which requires intervention.
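
With distinct exit codes, automation can retry batches and stop on a
stuck migration, e.g. (a sketch; treating exit status 1 as "partial
progress, run again" is an assumption about the batch semantics):

  rc=1
  while [ $rc -eq 1 ]; do
      nova-manage db online_data_migrations --max-count 50
      rc=$?
  done
  # rc=0: all migrations complete
  # rc=2: no updates took effect and some are failing; intervene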

Change-Id: Ib684091af0b19e62396f6becc78c656c49a60504
Closes-Bug: #1796192
2018-10-16 15:49:51 +00:00
Stephen Finucane 4e6cffe45e Deprecate the nova-xvpvncproxy service
This is a relic that has long since been replaced by the noVNC proxy
service. Start preparing for its removal.

Change-Id: Icb225dec3ad291b751e475bd3703ce0eb30b44db
2018-10-15 10:03:13 +01:00
Stephen Finucane f18ae13e36 Deprecate the nova-console service
As discussed on the mailing list [1].

[1] http://lists.openstack.org/pipermail/openstack-dev/2018-October/135413.html

Change-Id: I1f1fa1d0f79bec5a4101e03bc2d43ba581dd35a0
2018-10-15 10:03:08 +01:00
Zuul 2f635fa914 Merge "Validate transport_url in nova-manage cell_v2 commands" 2018-09-25 10:10:24 +00:00
Christian Berendt d3e254e47b Add missing backticks in nova-manage docs
Change-Id: I8f7a4c8029a54902d531d98da62a0d8bd7cc383d
2018-09-18 15:02:36 +02:00
Eric Fried 2833785f59 Report client: _reshape helper, placement min bump
Add a thin wrapper to invoke the POST /reshaper placement API with
appropriate error checking. This bumps the placement minimum to the
reshaper microversion, 1.30.

Change-Id: Idf8997d5efdfdfca6967899a0882ffb9ecf96915
blueprint: reshape-provider-tree
2018-08-24 15:39:18 -05:00
Zuul 63c1f9a17a Merge "Add nova-status upgrade check for request spec migrations" 2018-07-28 02:39:19 +00:00
Zuul 99d2a34d1f Merge "Add nova-manage placement sync_aggregates" 2018-07-25 18:56:26 +00:00
Matt Riedemann aa6360d683 Add nova-manage placement sync_aggregates
This adds the "nova-manage placement sync_aggregates"
command which will compare nova host aggregates to
placement resource provider aggregates and add any
missing resource provider aggregates based on the nova
host aggregates.

At this time, it's only additive: the command
does not remove resource provider aggregates whose
matching nodes are not found in nova host aggregates.
That likely needs to happen in a change that provides
an opt-in option for that behavior, since it could be
destructive for externally-managed provider aggregates
for things like ironic nodes or shared storage pools.
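
For example (the bare command as introduced by this change):

  # Mirror nova host aggregates to placement; additive only, nothing
  # is removed on the placement side:
  nova-manage placement sync_aggregates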

Part of blueprint placement-mirror-host-aggregates

Change-Id: Iac67b6bf7e46fbac02b9d3cb59efc3c59b9e56c8
2018-07-24 11:19:23 -04:00
Matt Riedemann 660e328a25 Use consumer generation in _heal_allocations_for_instance
If we're updating existing allocations for an instance due
to the project_id/user_id not matching the instance, we should
use the consumer_generation parameter, new in placement 1.28,
to ensure we don't overwrite the allocations while another
process is updating them.

As a result, the include_project_user kwarg to method
get_allocations_for_consumer is removed since nothing else
is using it now, and the minimum required version of placement
checked by nova-status is updated to 1.28.

Change-Id: I4d5f26061594fa9863c1110e6152069e44168cc3
2018-07-23 14:09:55 -04:00
Matt Riedemann 6b6d81cf2b Heal allocations with incomplete consumer information
Allocations created before microversion 1.8 didn't have project_id
/ user_id consumer information. In Rocky those will be migrated
to have consumer records, but using configurable sentinel values.

As part of heal_allocations, we can detect this and heal the
allocations using the instance.project_id/user_id information.

This is something we'd need if we ever use Placement allocation
information counting quotas.

Note that we should be using Placement API version 1.28 with
consumer_generation when updating the allocations, but since
people might backport this change the usage of consumer
generations is left for a follow up patch.

Related to blueprint add-consumer-generation

Change-Id: Idba40838b7b1d5389ab308f2ea40e28911aecffa
2018-07-13 11:29:54 -04:00
Matt Riedemann e73f828057 Add nova-status upgrade check for request spec migrations
We can't easily add a blocker db sync migration to make
sure the migrate_instances_add_request_spec online data
migration has been run, since we have to iterate both the cells
(for instances) and the API DB (for request specs), and that's
not something we should do during a db sync call.

But we want to eventually drop the online data migration and
the accompanying compat code found in the api and conductor
services.

This adds a nova-status upgrade check for missing request specs
and fails if any existing non-deleted instances are found which
don't have a matching request spec.

Related to blueprint request-spec-use-by-compute

Change-Id: I1fb63765f0b0e8f35d6a66dccf9d12cc20e9c661
2018-07-11 14:34:03 -04:00
Zuul 44c8aec3f0 Merge "Mention nova-status upgrade check CLI in upgrade doc" 2018-06-29 15:45:58 +00:00
Matt Riedemann 1476b030bd Fix CLI docs for nova-manage api_db commands
There were a few changes needed here:

1. There is no "API cell database", just the API
   database, so this removes mentions of cells.

2. The VERSION argument was missing from the sync help.

3. The sync command does not create a database, it upgrades
   the schema. Wording for that was borrowed from the
   nova-manage db sync help.

4. Starting in Rocky, the api_db sync command also upgrades
   the schema for the optional placement database, if configured,
   so that's mentioned here as well.
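
The corrected synopsis is along these lines (illustrative):

  # Upgrade the API database schema, optionally up to a specific version:
  nova-manage api_db sync [VERSION]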

Change-Id: Ibc49f93b8bd51d9a050acde5ef3dc8aad91321ca
Closes-Bug: #1778733
2018-06-26 10:16:55 -04:00