Some things that were missed in previous patches and are thrown together
here:
- Add alembic as an explicit dependency (we were getting it transitively
from oslo.db). We also bump the sqlalchemy dependency to a 1.4.x
release, which is the minimum supported by our chosen version of
alembic (more on this below)
- Remove tooling related to the old migrations
- Fix the tox whitelisting of the flaky MySQL tests
On the SQLAlchemy front, we opt for 1.4.13. Technically, alembic should
support anything from 1.4.0; however, with SQLAlchemy >= 1.4.0, < 1.4.13
we see errors like the following in some tests:
    sqlalchemy.exc.InvalidRequestError: Entity namespace for
    "count(instance_mappings.id)" has no property "queued_for_delete"
There's nothing specific about this in the release notes for 1.4.13 [1]
but it definitely fixes things.
[1] https://docs.sqlalchemy.org/en/14/changelog/changelog_14.html#change-1.4.13
Change-Id: I4c8eb13f11aa7471c26a5ba326319aef245c9836
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Alembic does lots of new things. Provide docs for how to use this. We
also improve upgrade docs slightly, removing references to ancient
reviews that are no longer really helpful as well as calling out our N
-> N+1 constraint.
Change-Id: I3760b82ce3bd71aa0a760d7137d69dfa3f29dc1d
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Let's take advantage of the features that alembic provides [1]. We use
this opportunity to clean up how we do our base model creation and
remove the downgrade section from the migration script template, since
we don't support downgrades.
[1] https://alembic.sqlalchemy.org/en/latest/autogenerate.html
Change-Id: I18846a5c7557db45bb63b97c7e8be5c4367e4547
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
This looks more complicated than it is, but it's really quite simple.
Essentially we have to deal with two possible configurations:
- For existing deployments, the DB sync operation should apply any
outstanding sqlalchemy-migrate-based migrations, dummy apply the
initial alembic migration, and then apply any additional alembic-based
migrations requested (or any available, if no version is specified).
- For new deployments, the DB sync operation should apply the initial
alembic migration and any additional alembic-based migrations
requested (or any available, if no version is specified). No
sqlalchemy-migrate-based migrations will ever be applied.
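The two paths above can be sketched roughly as follows. This is a
minimal illustration with made-up helper names, not the real nova-manage
code, which differs in detail:

```python
# A minimal sketch of the two sync paths above, with made-up helper
# names; the real nova-manage logic differs in detail.
def db_sync(has_legacy_schema, apply_legacy, stamp_initial,
            alembic_upgrade, version=None):
    """Bridge sqlalchemy-migrate and alembic during a DB sync.

    has_legacy_schema is True for an existing sqlalchemy-migrate-managed
    database; the three callables stand in for the real migration steps.
    """
    if has_legacy_schema:
        # Existing deployment: finish any outstanding legacy migrations,
        # then mark ("stamp") the initial alembic migration as applied
        # without running it, since the schema already matches.
        apply_legacy()
        stamp_initial()
    # Both paths end the same way: upgrade to the requested alembic
    # revision, or to 'head' if no version was specified.
    alembic_upgrade(version or 'head')
```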
While we continue to allow users to request a specific database
migration version to upgrade to, we *do not* allow them to request a
sqlalchemy-migrate-based migration version. There's no good reason to do
this - the deployment won't run with an out-of-date DB schema (something
that's also true of the alembic migration, fwiw) - and we want to get
people off of sqlalchemy-migrate as fast as possible. A change in a
future release can remove the sqlalchemy-migrate-based migrations once
we're sure that all deployments will have upgraded through a release
that includes all of the sqlalchemy-migrate-based migrations (so
Wallaby).
Tests are modified to validate the sanity of these operations. They're
mostly trivial changes, but we do need to do some funky things to ensure
that (a) we don't use logger configuration from 'alembic.ini' that will
mess with our existing logger configuration and (b) we re-use connection
objects as necessary to allow us to run tests against in-memory
databases, where a different connection would actually mean a different
database. We also can't rely on 'WalkVersionsMixin' from oslo.db since
that only supports sqlalchemy-migrate [1]. We instead must re-invent the
wheel here somewhat.
[1] https://github.com/openstack/oslo.db/blob/10.0.0/oslo_db/sqlalchemy/test_migrations.py#L42-L44
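Point (b) above is easy to demonstrate with the stdlib sqlite3 module
alone; this is just an illustration of the behavior, not the test code:

```python
import sqlite3

# Each new connection to an SQLite ':memory:' URL gets its own, empty
# database, which is why the tests must re-use one connection object
# rather than letting a fresh one be opened per operation.
conn_a = sqlite3.connect(':memory:')
conn_a.execute('CREATE TABLE t (id INTEGER)')

conn_b = sqlite3.connect(':memory:')
# conn_b sees none of conn_a's tables: it is a different database.
tables_b = conn_b.execute(
    "SELECT name FROM sqlite_master WHERE type='table'").fetchall()
```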
Change-Id: I850af601f81bd5d2ecc029682ae10d3a07c936ce
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
We're going to be extensively reworking these to handle the alembic
switchover in a future change. Ahead of that rework, restructure things
somewhat to simplify them and remove unnecessary noise.
Change-Id: Ie9d00e25b7e99104155466060a2b6f8a884a5e0a
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Rewrap some code just to make it a little less fugly to read.
Change-Id: If78bbd578bbba73fc85446ad55d34d3addd6c4af
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
The pattern here is the same as the one used for the main database in
change I1fa2feaee78213ad81f1889ce54888696f58d98c. Once again, we're
mostly copy-pasting from the existing schema. Reviewers can easily diff
'nova/db/api/legacy_migrations/versions/067_train.py' against
'nova/db/api/migrations/versions/d67eeaabee36_initial_version.py' in
order to sanity check the new schema files.
Change-Id: Ic65ac5cec46ace7bd49243d052deb2a434bb1099
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
This wasn't as complicated as feared. We're mostly able to copy-paste
the existing sqlalchemy-migrate migration and simply change the pattern
of calls from monkey-patched 'create' methods such as:
    agent_builds = Table('agent_builds', meta,
        sa.Column('created_at', sa.DateTime),
        ...
        mysql_charset='utf8'
    )
    agent_builds.create()
to explicit alembic APIs like:
    op.create_table(
        'agent_builds',
        sa.Column('created_at', sa.DateTime),
        ...
        mysql_charset='utf8'
    )
Reviewers are encouraged to diff the old migration file,
'nova/db/main/legacy_migrations/versions/402_train.py' against the new
one, 'nova/db/main/migrations/versions/8f2f1571d55b_initial_version.py',
to ease their job. The only significant divergences are the removal of
a single reference to the 'migrate_version' table created by
sqlalchemy-migrate, which obviously won't exist in an alembic-only
world, along with some reordering of tables. The latter step is
necessary since Alembic is not smart enough to correctly order the
creation of tables so that tables that reference (via a foreign key) one
or more other tables are created after the table(s) they reference.
Since we now have to create tables using the 'create_table' API as noted
above, rather than by creating a 'Table' instance and calling the
sqlalchemy-migrate-provided 'create' API as we could previously, we must
reorder our calls. The alternative would be to leave the creation of any
'ForeignKeyConstraint' until later but that seems no better and would
arguably be harder to read.
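The required reordering amounts to a topological sort over the
foreign-key graph. A toy illustration (not nova code) of the constraint:

```python
# Toy illustration of the reordering constraint (not nova code): a table
# must be created after every table it references via a foreign key,
# since op.create_table() emits the FK constraints inline.
def order_tables(tables):
    """tables: dict mapping a table name to the set of names it references."""
    ordered, done = [], set()

    def visit(name):
        if name in done:
            return
        done.add(name)
        # Create all referenced tables first, then this one.
        for ref in sorted(tables.get(name, ())):
            visit(ref)
        ordered.append(name)

    for name in sorted(tables):
        visit(name)
    return ordered
```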
Note that this isn't yet wired up to anything: users can run it manually
but the nova-manage commands we provide are still connected to the
sqlalchemy-migrate commands. This will change shortly when we add shims
to handle the conversion.
Change-Id: I1fa2feaee78213ad81f1889ce54888696f58d98c
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
We have a class of bugs in our current database schema that were
introduced by now-squashed migrations, whereby we applied a new
constraint to a column but forgot to apply the same change to the column
in the shadow table. Currently, we preserve the behavior of these bugs
through use of the 'ALTER' statement. This results in more complex SQL
for the initial migration. More importantly, the ALTER statement is not
actually supported by SQLite, so the underlying migration engine must
instead mimic the behavior by creating a new table with the updated
schema and a temporary name, copying all data across from the old table
to the new one, deleting the old table, and renaming the new table.
While sqlalchemy-migrate handled this transparently for us, therefore
allowing us to mostly ignore the issue, alembic makes this lack of
support in SQLite explicit by insisting on the 'batch_alter_table'
context manager [1] wrapping any ALTER statements. This context manager
is "a little tricky" (zzzeek's words, not mine) and is therefore
best avoided. Fortunately, this is easily done through the addition of a
couple of checks within the '_create_shadow_tables' helper that simply
modify the affected columns before creating the table.
[1] https://alembic.sqlalchemy.org/en/latest/ops.html#alembic.operations.Operations.batch_alter_table
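The approach can be sketched as below. This is illustrative only: the
real '_create_shadow_tables' helper works on SQLAlchemy Column objects,
and the specific column shown is a made-up example, not one of the
actual divergences.

```python
# Illustrative sketch only; the real '_create_shadow_tables' helper and
# the affected columns differ. The idea: patch known divergent column
# definitions *before* creating the shadow table, so no ALTER statement
# (and thus no SQLite batch mode) is needed afterwards.
def create_shadow_table(table_name, columns, create_table):
    # Hypothetical example of a squashed-migration bug: a constraint was
    # changed on the real table but never on its shadow twin.
    fixups = {
        ('shadow_instances', 'uuid'): {'nullable': True},
    }
    fixed = []
    for col in columns:
        overrides = fixups.get((table_name, col['name']), {})
        # Apply the fix while building the column list, instead of
        # issuing an ALTER once the table exists.
        fixed.append({**col, **overrides})
    create_table(table_name, fixed)
```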
Change-Id: Ib3c8ed876791cda96486d43384f7aec0c487fd2f
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Introduce a new 'nova.db.api.api' module to hold API database-specific
helpers, plus a generic 'nova.db.utils' module to hold code suitable for
both main and API databases. This highlights a level of complexity
around connection management that is present for the main database but
not for the API database. This is because we need to handle the
complexity of cells for the former but not the latter.
Change-Id: Ia5304c552ce552ae3c5223a2bfb3a9cd543ec57c
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
The two remaining modules, 'api_models' and 'api_migrations', are
moved to the new 'nova.db.api' module.
Change-Id: I138670fe36b07546db5518f78c657197780c5040
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Build on changes I714dce969088e17caaa1daddc881ebca6834e0f1, which
modified how we registered options for both the main and API databases,
and Iad2e4da4546b80a016e477577d23accb2606a6e4, which removed support for
experimental database reconnects, and start ignoring the
'use_db_reconnect' option. This means the option will no longer appear
in our documentation and won't be recognised by any of the validation
tooling if present in a nova.conf file.
No tests are present because I'm not entirely sure how to test this.
You can validate it manually, though, by attempting to set the option in
a test and watching things blow up like so:
    oslo_config.cfg.NoSuchOptError: no such option use_db_reconnect in
    group [database]
A second unnecessary 'deepcopy()' is removed. This was missed in change
I714dce969088e17caaa1daddc881ebca6834e0f1.
Change-Id: Ie7e4ccaf32f7222b8e305a38b48de7980744a98f
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Merge these, removing an unnecessary layer of abstraction, and place
them in the new 'nova.db.main' directory. The resulting change is huge,
but it's mainly the result of 's/sqlalchemy import api/main import api/'
and 's/nova.db.api/nova.db.main.api/' with some necessary cleanup. We
also need to rework how we do the blocking of API calls since we no
longer have a 'DBAPI' object that we can monkey patch as we were doing
before. This is now done via a global variable that is set by the 'main'
function of 'nova.cmd.compute'.
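The mechanism can be sketched roughly as follows; the names here are
hypothetical, not the actual nova identifiers:

```python
# Rough sketch of the mechanism (hypothetical names): with no 'DBAPI'
# object left to monkey-patch, a module-level flag is flipped by the
# compute service's entry point and checked by the DB API functions.
DISABLE_DB_ACCESS = False


class DBNotAllowed(Exception):
    pass


def set_db_access_blocked():
    # Called from the 'main' function of the compute service entry point.
    global DISABLE_DB_ACCESS
    DISABLE_DB_ACCESS = True


def instance_get(instance_uuid):
    if DISABLE_DB_ACCESS:
        raise DBNotAllowed('instance_get')
    return {'uuid': instance_uuid}  # stand-in for the real DB query
```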
The main impact of this change is that it's no longer possible to set
'[database] use_db_reconnect' and have all APIs automatically wrapped in
a DB retry. Seeing as this behavior is experimental, isn't applied to
any of the API DB methods (which don't use oslo.db's 'DBAPI' helper),
and is used explicitly in what would appear to be the critical cases
(via the explicit 'oslo_db.api.wrap_db_retry' decorator), this doesn't
seem like a huge loss.
Change-Id: Iad2e4da4546b80a016e477577d23accb2606a6e4
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Deep copy the database options, allowing us to reuse them exactly. This
was actually suggested in I7b74c4bb4a3edc335523c62ec63e57fdd5c51f39,
which added the current code but was rejected at the time because it
"would make our lives harder when we need to deprecate those cloned API
DB opts." I don't know why we'd ever want to deprecate the cloned
options so I see no reason not to take this approach now. Doing so
results in less complicated/weird code and also sets us up nicely for
future changes that'll remove some of the now unnecessary abstraction
layers provided by oslo.db.
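A simplified illustration of the approach (the real code clones
oslo.db's option definitions, not plain dicts):

```python
import copy

# Deep-copying the main database options gives the API database an
# identical but fully independent option list.
main_db_opts = [{'name': 'connection', 'default': None, 'secret': True}]
api_db_opts = copy.deepcopy(main_db_opts)

# Mutating one copy no longer affects the other.
api_db_opts[0]['default'] = 'sqlite:///api.db'
```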
Change-Id: I714dce969088e17caaa1daddc881ebca6834e0f1
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
This was previously set really low, to 1 second, which was leading to
more involved flows such as test_delete_while_in_verify_resize_status
timing out when the target calls the conductor to confirm the resize on
the source.
This change simply increases the timeout in the test, but we might want
to think about moving this call over to rpc_long_timeout, as this could
be an issue in real-world deployments.
Closes-Bug: #1938021
Change-Id: Ibba2d1506a0b026d35d7bf35384ec6439f438b01
Bug #1912310 details an underlying QEMU live migration failure seen
during both volume backend and block migration based live migration
tests in the nova-live-migration job. The failure rate is getting high
enough now that we should mark these jobs as non-voting until it is
resolved.
Change-Id: Ifc211fadb3ca9c7c0a74d7dcd225bdb3f3008d26
Related-Bug: #1912310
As discussed in I332d4f33ea6b9506cc24ac12e5c0994f208a3107 this is useful
enough to be copied into InstanceHelperMixin.
Most of this change is migrating the original functional test that
introduced _wait_for_volume_attach over to use _IntegratedTestBase.
Change-Id: I345dce93c6e2593c1ff1b863425f9a854c49ab9d
When attaching a volume to a running instance the nova-api validates
that the volume is not already attached to the instance. However
nova-compute is responsible for actually creating the BDM entry in the
database. If sending attach requests fast enough it can be possible
that the same "attach_volume" request can be sent to nova-compute for
the same volume/instance combination.
To work around this we add a check in nova-compute to validate that
the volume has not been attached in the mean time.
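A hypothetical sketch of the compute-side re-check (names made up, not
the real nova code): the API-side validation can race, so the attach
path validates again against the BDM records that exist by the time the
request is actually processed.

```python
# Hypothetical sketch of the compute-side re-check (names made up).
class VolumeAlreadyAttached(Exception):
    pass


def attach_volume(instance_uuid, volume_id, existing_bdms):
    """existing_bdms: (instance_uuid, volume_id) pairs already recorded."""
    if (instance_uuid, volume_id) in set(existing_bdms):
        # A duplicate request slipped past the API check; refuse to
        # create a second BDM for the same volume/instance combination.
        raise VolumeAlreadyAttached(volume_id)
    # ... proceed to create the BDM and perform the attach ...
    return (instance_uuid, volume_id)
```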
Closes-Bug: #1937375
Change-Id: I92f35514efddcb071c7094370b79d91d34c5bc72
As seen in bug #1849425 os-brick can fail to extend an underlying volume
device on the compute silently, returning a new_size of None to n-cpu.
While this should ultimately be addressed in os-brick, n-cpu can also
handle this before we eventually run into a TypeError when attempting
floor division later in the volume extend flow.
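A sketch of the defensive check, with hypothetical names rather than
the actual nova/os-brick call signatures:

```python
# Sketch of the defensive check (hypothetical names): if os-brick fails
# to report a new size, raise a clear error instead of hitting a
# TypeError later when the size is floor-divided down to GiB.
def extend_volume(connector_extend, volume_id):
    new_size = connector_extend(volume_id)
    if new_size is None:
        raise ValueError(
            'os-brick reported no new size for volume %s' % volume_id)
    # Safe now: new_size is an integer number of bytes.
    return new_size // (1024 ** 3)
```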
Change-Id: Ic8091537274a5ad27fb5af8939f81ed154b7ad7c
Closes-Bug: #1849425
As seen before in bug #1913451 the nova-lvm job creates large RAW
snapshots during each run due to the nature of using LVM storage for the
ephemeral instance disks.
Id425aa546f1a5973bae8be9c017782d18f0b4a47 recently landed, capping the
per-tenant image size total at 1000 MiB, which is far too low for this
job given the RAW snapshots it creates. This in turn resulted in 504
failures when n-cpu attempted to upload snapshots to g-api.
This change simply increases the per-tenant limit to 10 GiB, which
should be more than enough for all of the tests covered by the nova-lvm
job.
Closes-Bug: #1938765
Change-Id: I824655387a10ac9b813c1b7b2399e25ed95f7fc3
Change I9b16a3a849937aba5b90ed1ab9a80b7f0103f673 attempted to fix a
non-deterministic failure in the test_archive_task_logs functional test
by adding a time override. That fix, however, missed that the
osloutils_fixture.TimeFixture clears the time override when it exits
and does not restore any previous time override.
This removes usage of the osloutils_fixture.TimeFixture and instead
sets time overrides and restores previous time overrides. Restoration
of the original time overrides also made an adjustment to the --before
option value necessary in order to pick up the first task_log record.
Closes-Bug: #1934519
Change-Id: Ic28d2e53d5f50e89635e19df08699f2b8c5cea84
Some associated logic that checks which underlying libvirt hypervisor
is being used is also reworked to read more easily.
Change-Id: I9ad5c2c74ecfa7a2a07e2888b628d2ecd4534523
Consider the following situation:
- Using the Ironic virt driver
- Replacing (so removing and re-adding) all baremetal nodes
associated with a single nova-compute service
The update resources periodic will have destroyed the compute node
records because they're no longer being reported by the virt driver.
If we then attempt to manually delete the compute service record, the
database layer will raise an exception, as there are no longer any
compute node records for the host. Previously, this exception would
get bubbled up as an error 500 in the API. This patch catches it and
allows service deletion to complete successfully.
Closes-Bug: #1860312
Change-Id: I2f9ad3df25306e070c8c3538bfed1212d6d8682f
Consider the following situation:
- Using the Ironic virt driver
- Replacing (so removing and re-adding) all baremetal nodes
associated with a single nova-compute service
The update resources periodic will have destroyed the compute node
records because they're no longer being reported by the virt driver.
If we then attempt to manually delete the compute service record, the
database layer will raise an exception, as there are no longer any
compute node records for the host. This exception gets bubbled up as
an error 500 in the API. This patch adds a unit test to demonstrate
this.
Related-Bug: #1860312
Change-Id: I03eec634b25582ec9643cacf3e5868c101176983