Full shutdown document added #10

Draft: wants to merge 2 commits into base: master

Conversation

@bbezak bbezak commented Mar 29, 2021

No description provided.

@oneswig oneswig left a comment

Nice work Bartosz, just a couple of questions / suggestions


Stop Ceph
---------
Procedure based on `Red Hat documentation <https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/4/html/administration_guide/understanding-process-management-for-ceph#powering-down-and-rebooting-a-red-hat-ceph-storage-cluster_admin>`__

If there's something equivalent in the community docs it would be better, but the closest I found was https://docs.ceph.com/en/latest/rados/operations/operating/ and it doesn't cover setting all the flags below.


.. code-block:: bash

   systemctl poweroff

There might be a serialised form of shutdown invocation using Kayobe's tools: https://docs.openstack.org/kayobe/latest/administration/overcloud.html#running-commands. Perhaps also add a small delay to the shutdown command so that it doesn't immediately chop off the Ansible connection.
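
A rough sketch of what that could look like, one host group at a time. The function name `delayed_poweroff` and the group names in the usage line are hypothetical, and the sketch assumes `kayobe overcloud host command run` accepts `--become` and `--limit` in the deployed Kayobe release:

```shell
# Sketch only: schedule a delayed poweroff on one Kayobe host group at a time.
# Usage (group names are site-specific assumptions):
#   delayed_poweroff 1 compute controllers
delayed_poweroff() {
    local delay_minutes="$1"
    shift
    local group
    for group in "$@"; do
        # "shutdown -h +N" returns immediately and halts N minutes later,
        # so the Ansible connection is not cut off mid-task.
        kayobe overcloud host command run --become --limit "$group" \
            --command "shutdown -h +${delay_minutes}" || return 1
    done
}
```

Each group is handled in the order given, so the caller controls the sequence.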

@markgoddard markgoddard left a comment

Nice addition. Looks like there is some room for automation here, but that can be added iteratively.

.. code-block:: bash

   for i in $(openstack server list --all-projects -c ID -f value); do
       openstack server stop "$i"
   done

Is this asynchronous? Should we check for success?
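
One way to check, as a sketch: poll until no instance reports `ACTIVE`. The function name, timeout, and interval are hypothetical. Note that it only looks at `ACTIVE`, so paused or errored instances are not caught:

```shell
# Sketch only: wait until no server is ACTIVE, or give up after a timeout.
# Returns 0 when none are ACTIVE, 1 on timeout, 2 if the listing itself fails.
wait_for_no_active_servers() {
    local timeout="${1:-600}" interval="${2:-10}" waited=0 active
    while :; do
        active=$(openstack server list --all-projects --status ACTIVE \
            -c ID -f value) || return 2
        [ -z "$active" ] && return 0
        [ "$waited" -ge "$timeout" ] && return 1
        sleep "$interval"
        waited=$((waited + interval))
    done
}
```

Running `wait_for_no_active_servers 600 10` after the stop loop would make a failed or slow shutdown visible before the procedure continues.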

- Stop the Ceph clients from using any Ceph resources (RBD, RADOS Gateway, CephFS)
- Check if cluster is in healthy state

.. code-block:: bash

Does it need to be indented more to be part of the bullet?


- Stop CephFS (if applicable)

Stop CephFS cluster by reducing the number of ranks to 1, setting the cluster_down flag, and then failing the last rank.

Again, indentation?
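
For reference, the three steps in that sentence could be sketched as below. The function name is hypothetical, and the command forms are an assumption based on older Ceph releases; newer releases may use `ceph fs set <name> down true` instead of the `cluster_down` flag. The sketch does not wait for extra ranks to finish stopping:

```shell
# Sketch only: stop a CephFS file system in the order described above.
# Usage: stop_cephfs FS_NAME
stop_cephfs() {
    local fs="$1"
    if [ -z "$fs" ]; then
        echo "usage: stop_cephfs FS_NAME" >&2
        return 2
    fi
    # Reduce the number of ranks to 1.
    ceph fs set "$fs" max_mds 1 || return 1
    # Set the cluster_down flag so standbys do not take over.
    ceph fs set "$fs" cluster_down true || return 1
    # Fail the last remaining rank.
    ceph mds fail "$fs:0" || return 1
}
```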

----------------------------

Set maintenance mode in bifrost to prevent nodes from automatically
powering back on

Another option is to power off via Bifrost.
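
A minimal sketch of the maintenance-mode step, assuming the `openstack baremetal` CLI is available with credentials for the Bifrost Ironic API (for example from inside the Bifrost container). The function name and default reason text are hypothetical:

```shell
# Sketch only: put every node known to Ironic into maintenance mode.
# Usage: set_maintenance_all "planned full shutdown"
set_maintenance_all() {
    local reason="${1:-full shutdown}" node
    for node in $(openstack baremetal node list -f value -c UUID); do
        openstack baremetal node maintenance set --reason "$reason" "$node" \
            || return 1
    done
}
```

The power-off alternative would follow the same loop with `openstack baremetal node power off "$node"`.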



Full Power on Procedure
-----------------------

This needs to be a different heading style. Alternatively (preferably?) this section could go in another page called cold_start.rst.

Or change the page to be: "Shutdown and power on procedures"

* Shut down controllers
* Shut down Ceph nodes (if applicable)
* Shut down seed VM
* Shut down Ansible control host

This one isn't covered

We probably shouldn't make any assumptions about what or where this is. It may not be the seed hypervisor, which should also be called out explicitly.

* Perform a graceful shutdown of all virtual machine instances
* Stop Ceph (if applicable)
* Put all nodes into maintenance mode in Bifrost
* Shut down compute nodes

nit: this lists shutting down different types of nodes separately, but the procedure only stops the services separately, then shuts down all nodes at once.

* Remove nodes from maintenance mode in bifrost
* Recover MariaDB cluster
* Start Ceph (if applicable)
* Check that all docker containers are running

nit: they haven't been started
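
Once the containers have been started, the check itself could be sketched like this on each host. The function names are hypothetical:

```shell
# Sketch only: print the names of containers that are not running on this host.
stopped_containers() {
    # Repeated --filter flags on the same key are combined with OR.
    docker ps --all --filter status=exited --filter status=created \
        --format '{{.Names}}'
}

# Returns 0 only when no stopped containers are found.
all_containers_running() {
    [ -z "$(stopped_containers)" ]
}
```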


.. code-block:: bash

   kayobe# kayobe overcloud database recover

Wondering if it would be cleaner to stop the containers before shutdown, to avoid them starting up in a broken state.

@markgoddard

Looks like quite a few comments still to be addressed. It's quite hard to review larger changes when force-pushed. Could you add commits, then squash at the end?

following order:

* Perform a graceful shutdown of all virtual machine instances
* Stop Ceph (if applicable)

This might be early for stopping Ceph, in case the OpenStack services are still using Ceph state (eg, image uploads). Perhaps stop Ceph at the point where the Ceph nodes are shut down.

Member Author

bbezak commented Apr 9, 2021

> Looks like quite a few comments still to be addressed. It's quite hard to review larger changes when force-pushed. Could you add commits, then squash at the end?

sure, makes perfect sense - that was Gerrit habit ;)

priteau commented Jan 9, 2023

This would be nice to complete and merge.

@bbezak bbezak marked this pull request as draft June 26, 2024 09:29
4 participants