Full shutdown document added #10

bbezak · 2021-03-29T15:25:42Z

No description provided.

oneswig

Nice work, Bartosz. Just a couple of questions and suggestions.

source/full_shutdown.rst

source/operations_and_monitoring.rst

oneswig · 2021-03-29T15:43:28Z

source/full_shutdown.rst

+
+   Stop Ceph
+   ---------
+   Procedure based on `Red Hat documentation <https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/4/html/administration_guide/understanding-process-management-for-ceph#powering-down-and-rebooting-a-red-hat-ceph-storage-cluster_admin>`__ 


A link to equivalent community documentation would be better, but the closest I found was https://docs.ceph.com/en/latest/rados/operations/operating/, which doesn't cover setting all the flags below.

oneswig · 2021-03-29T15:48:08Z

source/full_shutdown.rst

+
+.. code-block:: bash
+
+   systemctl poweroff


There might be a serialised form of the shutdown invocation using Kayobe's tools (https://docs.openstack.org/kayobe/latest/administration/overcloud.html#running-commands), perhaps with a small delay on the shutdown command so that it doesn't immediately cut off the Ansible connection.
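A minimal sketch of that idea, assuming the `kayobe overcloud host command run` command described at the link above, together with its `--become` and `--limit` arguments. The helper name `shutdown_hosts` and the group names in the usage note are hypothetical. The delay is passed to `shutdown` so that the command returns before the host goes down.

```shell
#!/usr/bin/env bash
# Hypothetical helper: schedule a delayed poweroff on a group of overcloud
# hosts through Kayobe, so the Ansible connection can close cleanly first.
#   $1 - Ansible limit (host or group name, deployment specific)
#   $2 - delay in minutes before the host powers off (default: 1)
shutdown_hosts() {
    local limit="$1"
    local delay_minutes="${2:-1}"
    kayobe overcloud host command run \
        --limit "$limit" \
        --become \
        --command "shutdown -h +${delay_minutes}"
}
```

Calling `shutdown_hosts compute` and then `shutdown_hosts controllers` would serialise the shutdown by group; the group names depend on the deployment's inventory.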

markgoddard

Nice addition. Looks like there is some room for automation here, but that can be added iteratively.

source/full_shutdown.rst

markgoddard · 2021-03-30T09:29:09Z

source/full_shutdown.rst

+.. code-block:: bash
+
+   for i in `openstack server list --all-projects -c ID -f value` ; \
+   do openstack server stop $i ; done


Is this asynchronous? Should we check for success?
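One way to check for success, sketched under the assumption that the stop request returns before the instance has actually powered off: issue the stop requests, then poll until no instance is reported as `ACTIVE`. The function names, the timeout, and the polling interval are hypothetical, and admin credentials are assumed to be loaded in the environment.

```shell
#!/usr/bin/env bash
# Request a stop for every instance in every project.
stop_all_servers() {
    local id
    for id in $(openstack server list --all-projects -c ID -f value); do
        # Report failures (for example, an instance that is already stopped)
        # without aborting the loop.
        openstack server stop "$id" || echo "stop request failed for $id" >&2
    done
}

# Poll until no instance is ACTIVE, or give up after a timeout.
#   $1 - timeout in seconds (default: 600)
#   $2 - polling interval in seconds (default: 10)
wait_for_shutoff() {
    local timeout="${1:-600}"
    local interval="${2:-10}"
    local waited=0
    local active
    while true; do
        active=$(openstack server list --all-projects --status ACTIVE -c ID -f value)
        if [ -z "$active" ]; then
            return 0
        fi
        if [ "$waited" -ge "$timeout" ]; then
            echo "instances still ACTIVE after ${timeout}s:" >&2
            echo "$active" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
}
```

Running `stop_all_servers && wait_for_shutoff 900 15` would block until every instance has left the `ACTIVE` state or the timeout expires. Instances in other states, such as `ERROR` or `PAUSED`, are not covered by this check.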

markgoddard · 2021-03-30T09:30:37Z

source/full_shutdown.rst

+   - Stop the Ceph clients from using any Ceph resources (RBD, RADOS Gateway, CephFS)
+   - Check if cluster is in healthy state
+
+   .. code-block:: bash


Does it need to be indented more to be part of the bullet?

markgoddard · 2021-03-30T09:31:02Z

source/full_shutdown.rst

+
+   - Stop CephFS (if applicable)
+
+   Stop CephFS cluster by reducing the number of ranks to 1, setting the cluster_down flag, and then failing the last rank.


Again, indentation?

markgoddard · 2021-03-30T09:32:26Z

source/full_shutdown.rst

+----------------------------
+
+Set maintenance mode in bifrost to prevent nodes from automatically
+powering back on


Another option is to power off via Bifrost.

markgoddard · 2021-03-30T09:35:31Z

source/full_shutdown.rst

+
+
+Full Power on Procedure
+-----------------------


This needs to be a different heading style. Alternatively (preferably?) this section could go in another page called cold_start.rst.

Or change the page to be: "Shutdown and power on procedures"

markgoddard · 2021-03-30T09:36:08Z

source/full_shutdown.rst

+* Shut down controllers
+* Shut down Ceph nodes (if applicable)
+* Shut down seed VM
+* Shut down Ansible control host


This one isn't covered

We probably shouldn't make any assumptions about what or where this is. It may not be the seed hypervisor, which should also be called out explicitly.

markgoddard · 2021-03-30T09:37:15Z

source/full_shutdown.rst

+* Perform a graceful shutdown of all virtual machine instances
+* Stop Ceph (if applicable)
+* Put all nodes into maintenance mode in Bifrost
+* Shut down compute nodes


nit: this lists shutting down different types of nodes separately, but the procedure only stops the services separately, then shuts down all nodes at once.

markgoddard · 2021-03-30T09:40:24Z

source/full_shutdown.rst

+* Remove nodes from maintenance mode in bifrost
+* Recover MariaDB cluster
+* Start Ceph (if applicable)
+* Check that all docker containers are running


nit: they haven't been started

markgoddard · 2021-03-30T09:43:52Z

source/full_shutdown.rst

+
+.. code-block:: bash
+
+   kayobe# kayobe overcloud database recover


Wondering if it would be cleaner to stop the containers before shutdown, to avoid them starting up in a broken state.

markgoddard · 2021-04-08T09:28:55Z

Looks like quite a few comments still to be addressed. It's quite hard to review larger changes when force-pushed. Could you add commits, then squash at the end?

oneswig · 2021-04-08T11:25:28Z

source/full_shutdown.rst

+following order:
+
+* Perform a graceful shutdown of all virtual machine instances
+* Stop Ceph (if applicable)


This might be too early to stop Ceph, in case the OpenStack services are still using Ceph state (e.g. image uploads). Perhaps stop Ceph at the point where the Ceph nodes are shut down.

bbezak · 2021-04-09T08:08:17Z

> Looks like quite a few comments still to be addressed. It's quite hard to review larger changes when force-pushed. Could you add commits, then squash at the end?

Sure, makes perfect sense - that was a Gerrit habit ;)

priteau · 2023-01-09T16:15:23Z

This would be nice to complete and merge.

bump to 2021

86a01a4

bbezak requested review from priteau, oneswig and mnasiadka March 29, 2021 15:25

bbezak force-pushed the full-shutdown branch from e935e16 to d07c6bb Compare March 29, 2021 15:26

oneswig reviewed Mar 29, 2021


markgoddard suggested changes Mar 30, 2021


Full shutdown document added

1ce5bae

bbezak force-pushed the full-shutdown branch from d07c6bb to 1ce5bae Compare March 30, 2021 09:51

oneswig reviewed Apr 8, 2021


bbezak marked this pull request as draft June 26, 2024 09:29

