Containerized OpenStack Backup Considerations¶
Backup of the containerized OpenStack application is performed as part of the StarlingX backup procedures.
See Back Up System Data Overview.
Recommendations¶
After executing the backup and restore procedure, including the latest procedural changes, previously backed-up OpenStack VMs that were booted from NFS, iSCSI, or FC volumes may enter an ERROR state. In addition, any attempt to create new volumes after the restore fails, with the newly created volumes transitioning to an ERROR state.
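To identify instances affected in this way, you can filter the server list by the ERROR state (a quick check, assuming the standard OpenStack CLI is available):

$ source /var/opt/openstack/admin-openrc
$ openstack server list --all-projects --status ERROR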
Note
Replace all instances of wr-openstack with stx-openstack.
Backup and Restore - Glance using PVC Backend¶
Glance, Cinder, and Nova can be configured to use NetApp PVCs for data persistence; however, these PVCs are not automatically backed up or restored by the StarlingX OpenStack backup and restore playbooks. As a result, the following issues may occur after the Backup and Restore procedure completes:
Glance images previously stored in the glance-images PVC continue to appear in the output of openstack image list but are no longer usable because the underlying image data is missing (for example, volume creation from image fails).

Cinder volume backups previously stored in the cinder-backup PVC continue to appear in the output of openstack volume backup list but are no longer usable because the backup data is unavailable (for example, volume creation from backup fails).

Nova ephemeral volumes previously stored in the nova-instances PVC are no longer available, causing the affected ephemeral instances to enter an ERROR state.
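Before running the backup, it can help to confirm which of these PVCs exist in your deployment (a quick check; the PVC names here match those referenced above but may vary with your configuration, and any PVC that is not configured is simply reported as not found):

$ kubectl get pvc -n openstack glance-images cinder-backup nova-instances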
Procedural Changes: Manually back up and restore NetApp PVCs using Kubernetes Volume Snapshots.
Procedure
Pre-backup: Create a snapshot of the Glance PVC.
Run these steps before executing backup.yml.
Get Trident snapshot class.
TRIDENT_SNAPCLASS=$(kubectl get volumesnapshotclass \
  -o jsonpath='{.items[?(@.driver=="csi.trident.netapp.io")].metadata.name}')

Create a snapshot name.
SNAP_TS=$(date +%Y%m%d%H%M%S)
GLANCE_SNAP=glance-images-snap-${SNAP_TS}

Create a VolumeSnapshot.
cat << EOF | kubectl apply -f -
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ${GLANCE_SNAP}
  namespace: openstack
spec:
  volumeSnapshotClassName: ${TRIDENT_SNAPCLASS}
  source:
    persistentVolumeClaimName: glance-images
EOF

Wait until the snapshot is ready.
kubectl wait volumesnapshot/${GLANCE_SNAP} -n openstack \
  --for=jsonpath='{.status.readyToUse}'=true --timeout=120s

Capture the snapshot details.
GLANCE_VSC=$(kubectl get volumesnapshot ${GLANCE_SNAP} -n openstack \
  -o jsonpath='{.status.boundVolumeSnapshotContentName}')
GLANCE_SNAPSHOT_HANDLE=$(kubectl get volumesnapshotcontent ${GLANCE_VSC} \
  -o jsonpath='{.status.snapshotHandle}')

Ensure the snapshot is retained.
kubectl patch volumesnapshotcontent ${GLANCE_VSC} \
  --type=merge -p '{"spec":{"deletionPolicy":"Retain"}}'

Save the snapshot metadata for the restore.
echo "GLANCE_SNAP=${GLANCE_SNAP}" > /home/sysadmin/br-snapshot-info.txt echo "GLANCE_SNAPSHOT_HANDLE=${GLANCE_SNAPSHOT_HANDLE}" >> /home/sysadmin/br-snapshot-info.txt kubectl get pvc glance-images -n openstack \ -o jsonpath='GLANCE_SC={.spec.storageClassName} {"\n"}GLANCE_SIZE={.spec.resources.requests.storage}{"n"} ' \ >> /home/sysadmin/br-snapshot-info.txt cat /home/sysadmin/br-snapshot-info.txt
Back up the application.
ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml \
  -e "ansible_become_pass=<password> \
  admin_password=<password> \
  openstack_app_name=wr-openstack \
  skip_os_dbs=['Database','information_schema','performance_schema','mysql','horizon','panko','gnocchi','sys']"
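The backup tarball is written to the backup directory on the controller (by default /opt/backups, the same directory passed as initial_backup_dir during restore). A quick sanity check:

$ ls -lh /opt/backups/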
Pre-Restore cleanup.
Remove VMs. If a StarlingX OpenStack deployment exists, all VMs must be removed before restore.
$ source /var/opt/openstack/admin-openrc
$ openstack server list --all
$ openstack server delete <vm_uuid>
Remove and re-upload the application.
Warning
Ensure all VMs are deleted before proceeding.
$ source /etc/platform/openrc
$ system application-remove wr-openstack

Wait until the status is uploaded:

$ watch system application-show wr-openstack
$ system application-delete wr-openstack
$ system application-upload wr-openstack.tgz

Wait again until the status is uploaded:

$ watch system application-show wr-openstack
Apply the --reuse-values workaround.

sudo find / -path "*restore-openstack*" -type f -exec grep -l "system helm-override-update" {} \; 2>/dev/null | while read f; do
  tmp=$(mktemp)
  sudo sed -e '/--reuse-values/!s/system helm-override-update/system helm-override-update --reuse-values/g' \
           -e 's/show_multiple_locations=True/show_multiple_locations=False/g' "$f" > "$tmp"
  chmod 644 "$tmp"
  sudo mount --bind "$tmp" "$f"
done

Verify:
grep "helm-override-update" /usr/share/ansible/stx-ansible/playbooks/roles/restore-openstack/restore/tasks/main.yml
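The bind mounts applied by this workaround persist only until the next reboot. Once the restore has completed, you can revert the change on the file verified above (assuming main.yml was the only file patched):

$ sudo umount /usr/share/ansible/stx-ansible/playbooks/roles/restore-openstack/restore/tasks/main.yml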
Restore the Glance PVC (before running restore_openstack.yml).
$ source /home/sysadmin/br-snapshot-info.txt

TRIDENT_SNAPCLASS=$(kubectl get volumesnapshotclass \
  -o jsonpath='{.items[?(@.driver=="csi.trident.netapp.io")].metadata.name}')
GLANCE_VSC_RESTORE=snapcontent-restore-${GLANCE_SNAP}

Create the openstack namespace (this is required before restore).
$ kubectl create ns openstack
Create VolumeSnapshotContent (pre-provisioned).
cat << EOF | kubectl apply -f -
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  name: ${GLANCE_VSC_RESTORE}
spec:
  deletionPolicy: Retain
  driver: csi.trident.netapp.io
  source:
    snapshotHandle: ${GLANCE_SNAPSHOT_HANDLE}
  sourceVolumeMode: Filesystem
  volumeSnapshotClassName: ${TRIDENT_SNAPCLASS}
  volumeSnapshotRef:
    name: ${GLANCE_SNAP}
    namespace: openstack
EOF

Recreate the VolumeSnapshot.
cat << EOF | kubectl apply -f -
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ${GLANCE_SNAP}
  namespace: openstack
spec:
  volumeSnapshotClassName: ${TRIDENT_SNAPCLASS}
  source:
    volumeSnapshotContentName: ${GLANCE_VSC_RESTORE}
EOF

Verify readiness.
$ kubectl get volumesnapshotcontent ${GLANCE_VSC_RESTORE}
$ kubectl get volumesnapshot ${GLANCE_SNAP} -n openstack

Recreate the PVC from the snapshot.
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: glance-images
  namespace: openstack
  annotations:
    helm.sh/resource-policy: keep
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: ${GLANCE_SIZE}
  storageClassName: ${GLANCE_SC}
  dataSource:
    name: ${GLANCE_SNAP}
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF

$ kubectl get pvc -n openstack glance-images

Wait until STATUS = Bound.
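Rather than polling the PVC manually, you can block until it binds (a convenience sketch; jsonpath-based wait conditions require kubectl 1.23 or later):

$ kubectl wait pvc/glance-images -n openstack \
    --for=jsonpath='{.status.phase}'=Bound --timeout=300s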
Restore OpenStack.
ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_openstack.yml \
  -e "initial_backup_dir=/opt/backups \
  ansible_become_pass=<password> \
  admin_password=<password> \
  backup_filename=<wr-openstack-backup-tarball>.tgz \
  openstack_app_name=wr-openstack"
Clean up the snapshot resources.
$ source /home/sysadmin/br-snapshot-info.txt
$ kubectl delete volumesnapshot ${GLANCE_SNAP} -n openstack
$ kubectl patch volumesnapshotcontent ${GLANCE_VSC_RESTORE} \
    --type=merge -p '{"metadata":{"finalizers":[]}}'
$ kubectl delete volumesnapshotcontent ${GLANCE_VSC_RESTORE}

Reattach VMs (as needed).
Note
Skip if VMs are already ACTIVE.
$ source /var/opt/openstack/admin-openrc
$ nova reset-state --active <uuid>
$ openstack server stop <uuid>
$ openstack server shelve <uuid>
$ openstack server unshelve <uuid>
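If many instances are affected, the same recovery sequence can be scripted across every server in the ERROR state. A minimal sketch, assuming admin credentials are already sourced:

for uuid in $(openstack server list --all-projects --status ERROR -f value -c ID); do
  nova reset-state --active "$uuid"
  openstack server stop "$uuid"
  openstack server shelve "$uuid"
  openstack server unshelve "$uuid"
done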
Results

These steps clear the ERROR state, stop reboot loops, and reattach volumes.
Note
Always source the correct environment before running commands. If you are using a new shell, source one of the following:
/etc/platform/openrc
/var/opt/openstack/admin-openrc
/home/sysadmin/br-snapshot-info.txt
Do not proceed with restore if VMs still exist.
Ensure Glance PVC reaches Bound state before restore.
Backup and Restore - Glance/Cinder Backend in HTTP- and HTTPS-Enabled Environments¶
When StarlingX OpenStack is deployed with TLS enabled (recommended for production environments), a configuration-escaping issue may cause PostgreSQL to fail while processing the SQL file during execution of the restore-openstack playbook. As a result, application configuration overrides are not fully restored, and the StarlingX OpenStack restore operation fails during the application reapply phase.
Procedural Changes: Execute the restore-openstack playbook with TLS disabled (HTTP mode) to restore StarlingX OpenStack, then reapply the application with the required configuration overrides to re-enable TLS after the restore completes.
Back up the environment.
$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml \
    -e "ansible_become_pass=<password> \
    admin_password=<password> \
    openstack_app_name=wr-openstack \
    skip_os_dbs=['Database','information_schema','performance_schema','mysql','horizon','panko','gnocchi','sys']"
Pre-Restore Cleanup.
Remove VMs. All VMs must be deleted before restore.
$ source /var/opt/openstack/admin-openrc
$ openstack server list --all
$ openstack server delete <vm_uuid>
Remove and re-upload the application.
Warning
Do not proceed unless all VMs are deleted before restore.
$ source /etc/platform/openrc
$ system application-remove wr-openstack

Wait until the status is uploaded:

$ watch system application-show wr-openstack
$ system application-delete wr-openstack
$ system application-upload wr-openstack.tgz

Wait again until the status is uploaded:

$ watch system application-show wr-openstack
Apply the --reuse-values workaround.

$ sudo find / -path "*restore-openstack*" -type f -exec grep -l "system helm-override-update" {} \; 2>/dev/null | while read f; do
    tmp=$(mktemp)
    sudo sed -e '/--reuse-values/!s/system helm-override-update/system helm-override-update --reuse-values/g' \
             -e 's/show_multiple_locations=True/show_multiple_locations=False/g' "$f" > "$tmp"
    chmod 644 "$tmp"
    sudo mount --bind "$tmp" "$f"
  done

Verify:
grep "helm-override-update" /usr/share/ansible/stx-ansible/playbooks/roles/restore-openstack/restore/tasks/main.yml
Run the following only if the environment is HTTPS.
$ source /etc/platform/openrc
$ system service-parameter-list | grep endpoint
$ system service-parameter-delete <endpoint-id>
Restore (Runs in HTTP Mode).
$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_openstack.yml \
    -e "initial_backup_dir=/opt/backups \
    ansible_become_pass=<password> \
    admin_password=<password> \
    backup_filename=<wr-openstack-backup-tarball>.tgz \
    openstack_app_name=wr-openstack"
Note
If running in an HTTP environment, proceed to the "Reattach VMs (If needed)" step below.
Post-Restore verification.
$ source /var/opt/openstack/admin-openrc
$ openstack endpoint list
Certificate Location. Certificates generated before backup are reused.
/home/sysadmin/openstack-certs/
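Before reconfiguring HTTPS, confirm the certificate files are still present in that directory:

$ ls -l /home/sysadmin/openstack-certs/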
Reconfigure HTTPS (post-restore). Set the endpoint domain.
$ system service-parameter-add openstack helm endpoint_domain=<your-lab>.wrs.com
Create clients override.
cat <<EOF > clients_override.yaml
serviceEndpointPattern: "{service_name}-{endpoint_domain}"
EOF
Apply Helm overrides.
$ source /etc/platform/openrc
$ system helm-override-update wr-openstack clients openstack --reuse-values --set openstackCertificateFile=/home/sysadmin/openstack-certs/openstack-helm.pem
$ system helm-override-update wr-openstack clients openstack --reuse-values --set openstackCertificateKeyFile=/home/sysadmin/openstack-certs/openstack-helm.pem
$ system helm-override-update wr-openstack clients openstack --reuse-values --set openstackCertificateCAFile=/home/sysadmin/openstack-certs/openstack-helm-ca.crt
$ system helm-override-update wr-openstack clients openstack --reuse-values --values clients_override.yaml
Verify that the overrides are applied.
$ system helm-override-show wr-openstack clients openstack
Re-apply the application.
$ system application-apply wr-openstack
Verify HTTPS endpoints.
$ source /var/opt/openstack/admin-openrc
$ openstack endpoint list
Endpoints should now show HTTPS.
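To confirm no endpoint is still using HTTP, you can filter the URL column (a quick check; an empty result means all endpoints are HTTPS):

$ openstack endpoint list -f value -c URL | grep -v '^https'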
Reattach VMs (If needed).
Note
Skip if VMs are already ACTIVE.
$ source /var/opt/openstack/admin-openrc
$ nova reset-state --active <uuid>
$ openstack server stop <uuid>
$ openstack server shelve <uuid>
$ openstack server unshelve <uuid>
Results

These steps clear the ERROR state, stop reboot loops, and reattach volumes.
Note
Always source the correct environment before running commands. If you are using a new shell, source one of the following:
/etc/platform/openrc
/var/opt/openstack/admin-openrc
/home/sysadmin/br-snapshot-info.txt
For example:
$ source /var/opt/openstack/admin-openrc
Note
If the environment is HTTPS, the restore process must run in HTTP mode first, and then HTTPS is re-applied afterward.
Do not proceed if VMs still exist.
Ensure application state transitions are complete before moving to the next steps.