==============
Upgrading Ceph
==============

This section describes how to upgrade from one version of Ceph to another.
The Ceph upgrade procedure is described :ceph-doc:`here <cephadm/upgrade>`.

The Ceph release series is not strictly tied to the StackHPC OpenStack
release; however, this configuration does define a default Ceph release series
and container image tag. The default release series is currently |ceph_series|.

Prerequisites
=============

Before starting the upgrade, ensure any appropriate prerequisites are
satisfied. These will be specific to each deployment, but here are some
suggestions:

* Ensure that expected test suites are passing, e.g. Tempest.
* Resolve any Prometheus alerts.
* Check for unexpected ``ERROR`` or ``CRITICAL`` messages in OpenSearch
  Dashboards.
* Check Grafana dashboards.
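
It is also sensible to confirm that the Ceph cluster itself reports
``HEALTH_OK`` before starting. For example, using standard Ceph commands run
via ``cephadm shell``:

.. code-block:: console

   sudo cephadm shell -- ceph -s
   sudo cephadm shell -- ceph health detail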

Consider whether the Ceph cluster needs to be upgraded within or outside of a
maintenance/change window.

Preparation
===========

Ensure that the local Kayobe configuration environment is up to date.
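
For example, if the configuration is tracked in Git and activated with the
standard ``kayobe-env`` script, this might look like the following (the
checkout path is illustrative):

.. code-block:: console

   cd ~/src/kayobe-config
   git pull
   source kayobe-env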

If you wish to use a different Ceph release series, set
``cephadm_ceph_release``.

If you wish to use different Ceph container image tags, set the following
variables (see the example after this list):

* ``cephadm_image_tag``
* ``cephadm_haproxy_image_tag``
* ``cephadm_keepalived_image_tag``
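
For example, these variables might be set in a Kayobe configuration file such
as ``etc/kayobe/cephadm.yml`` (the file name and the values below are
illustrative; use the release series and image tags appropriate for your
deployment):

.. code-block:: yaml

   # Ceph release series to deploy (example value).
   cephadm_ceph_release: "reef"

   # Ceph container image tags (example values).
   cephadm_image_tag: "v18.2.2"
   cephadm_haproxy_image_tag: "2.3"
   cephadm_keepalived_image_tag: "2.1.5"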

Upgrading Host Packages
=======================

Prior to upgrading the Ceph storage cluster, it may be desirable to upgrade
system packages on the hosts.

Note that these commands do not affect packages installed in containers, only
those installed on the host.

In order to avoid downtime, it is important to control how package updates are
rolled out. In general, Ceph monitor hosts should be updated *one by one*. For
Ceph OSD hosts it may be possible to update packages in batches of hosts,
provided there is sufficient capacity to maintain data availability.
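
To help choose sensible batch sizes, it can be useful to review the cluster
layout and available capacity first, for example:

.. code-block:: console

   sudo cephadm shell -- ceph osd tree
   sudo cephadm shell -- ceph df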

For each host or batch of hosts, perform the following steps.

Place the host or batch of hosts into maintenance mode:

.. code-block:: console

   sudo cephadm shell -- ceph orch host maintenance enter <host>

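If desired, confirm that the host has entered maintenance mode by listing
hosts and checking the status column:

.. code-block:: console

   sudo cephadm shell -- ceph orch host ls
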
To update all eligible packages, use ``*``, escaping if necessary:

.. code-block:: console

   kayobe overcloud host package update --packages "*" --limit <host>

If the kernel has been upgraded, reboot the host or batch of hosts to pick up
the change:

.. code-block:: console

   kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/reboot.yml -l <host>

Remove the host or batch of hosts from maintenance mode:

.. code-block:: console

   sudo cephadm shell -- ceph orch host maintenance exit <host>

Wait for Ceph health to return to ``HEALTH_OK``:

.. code-block:: console

   ceph -s

Wait for Prometheus alerts and errors in OpenSearch Dashboards to resolve, or
address them.

Once happy that the system has been restored to full health, move on to the
next host or batch of hosts.

Sync container images
=====================

If using the local Pulp server to host Ceph images
(``stackhpc_sync_ceph_images`` is ``true``), sync the new Ceph images into the
local Pulp:

.. code-block:: console

   kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/pulp-container-{sync,publish}.yml -e stackhpc_pulp_images_kolla_filter=none

Upgrade Ceph services
=====================

Start the upgrade. If using the local Pulp server to host Ceph images:

.. code-block:: console

   sudo cephadm shell -- ceph orch upgrade start --image <registry>/ceph/ceph:<tag>

Otherwise:

.. code-block:: console

   sudo cephadm shell -- ceph orch upgrade start --image quay.io/ceph/ceph:<tag>

Check the upgrade status:

.. code-block:: console

   ceph orch upgrade status

Wait for Ceph health to return to ``HEALTH_OK``:

.. code-block:: console

   ceph -s

Watch the cephadm logs:

.. code-block:: console

   ceph -W cephadm

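If the upgrade needs to be interrupted, cephadm allows it to be paused,
resumed or stopped (see the upstream cephadm upgrade documentation linked in
the introduction for details):

.. code-block:: console

   sudo cephadm shell -- ceph orch upgrade pause
   sudo cephadm shell -- ceph orch upgrade resume
   sudo cephadm shell -- ceph orch upgrade stop
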
Upgrade Cephadm
===============

Update the Cephadm package:

.. code-block:: console

   kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/cephadm-deploy.yml -e cephadm_package_update=true

Testing
=======

At this point it is recommended to perform a thorough test of the system to
catch any unexpected issues. This may include:

* Check Prometheus, OpenSearch Dashboards and Grafana
* Smoke tests
* All applicable Tempest tests
* Horizon UI inspection

Cleaning up
===========

Prune unused container images:

.. code-block:: console

   kayobe overcloud host command run -b --command "docker image prune -a -f" -l ceph