Containerized OpenStack Backup Considerations

Backup of the containerized OpenStack application is performed as part of the StarlingX backup procedures.

See Back Up System Data Overview.

Recommendations

After executing the backup and restore procedure, including the latest procedural changes, previously backed up OpenStack VMs that were booted from NFS, iSCSI, or FC volumes may enter an ERROR state. In addition, any attempt to create new volumes after the restore fails, with newly created volumes transitioning to an ERROR state.

Note

Replace all instances of wr-openstack with stx-openstack.

Backup and Restore - Glance using PVC Backend

Glance, Cinder, and Nova can be configured to use NetApp PVCs for data persistence; however, these PVCs are not automatically backed up or restored by the StarlingX OpenStack backup and restore playbooks. As a result, the following issues may occur after the Backup and Restore procedure completes:

  • Glance images previously stored in the glance-images PVC continue to appear in the output of openstack image list but are no longer usable because the underlying image data is missing (for example, volume creation from image fails).

  • Cinder volume backups previously stored in the cinder-backup PVC continue to appear in the output of openstack volume backup list but are no longer usable because the backup data is unavailable (for example, volume creation from backup fails).

  • Nova ephemeral volumes previously stored in the nova-instances PVC are no longer available, causing the affected ephemeral instances to enter an ERROR state.

Procedural Changes: Manually back up and restore NetApp PVCs using Kubernetes Volume Snapshots.
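Before starting, it can help to confirm which of the three PVCs named above actually exist, since only those need a manual snapshot. The following is a minimal sketch and not part of the official procedure. The function name report_openstack_pvcs is hypothetical, and the sketch assumes kubectl is configured for the cluster and that the PVCs live in the openstack namespace.

```shell
# Hypothetical helper: report which of the PVCs discussed above exist.
# Assumes the PVCs live in the "openstack" namespace.
report_openstack_pvcs() {
    for pvc in glance-images cinder-backup nova-instances; do
        if kubectl get pvc "$pvc" -n openstack >/dev/null 2>&1; then
            echo "$pvc: present (needs a manual snapshot)"
        else
            echo "$pvc: not found"
        fi
    done
}
```

Call report_openstack_pvcs with no arguments. The procedure that follows covers glance-images only.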

Procedure

  1. Pre-backup: Create a snapshot of the Glance PVC.

    Run before backup.yml

    1. Get Trident snapshot class.

      TRIDENT_SNAPCLASS=$(kubectl get volumesnapshotclass \
       -o jsonpath='{.items[?(@.driver=="csi.trident.netapp.io")].metadata.name}')
      
    2. Create a snapshot name.

      SNAP_TS=$(date +%Y%m%d%H%M%S)
      GLANCE_SNAP=glance-images-snap-${SNAP_TS}
      
    3. Create a VolumeSnapshot.

      cat << EOF | kubectl apply -f -
      apiVersion: snapshot.storage.k8s.io/v1
      kind: VolumeSnapshot
      metadata:
       name: ${GLANCE_SNAP}
       namespace: openstack
      spec:
       volumeSnapshotClassName: ${TRIDENT_SNAPCLASS}
       source:
        persistentVolumeClaimName: glance-images
      EOF
      
    4. Wait until the snapshot is ready.

      kubectl wait volumesnapshot/${GLANCE_SNAP} -n openstack \
       --for=jsonpath='{.status.readyToUse}'=true --timeout=120s
      
    5. Capture the snapshot details.

      GLANCE_VSC=$(kubectl get volumesnapshot ${GLANCE_SNAP} -n openstack \
       -o jsonpath='{.status.boundVolumeSnapshotContentName}')
      
      GLANCE_SNAPSHOT_HANDLE=$(kubectl get volumesnapshotcontent ${GLANCE_VSC} \
       -o jsonpath='{.status.snapshotHandle}')
      
    6. Ensure the snapshot is retained.

      kubectl patch volumesnapshotcontent ${GLANCE_VSC} \
       --type=merge -p '{"spec":{"deletionPolicy":"Retain"}}'
      
    7. Save the metadata needed for restore.

      echo "GLANCE_SNAP=${GLANCE_SNAP}" > /home/sysadmin/br-snapshot-info.txt
      echo "GLANCE_SNAPSHOT_HANDLE=${GLANCE_SNAPSHOT_HANDLE}" >> /home/sysadmin/br-snapshot-info.txt
      
      kubectl get pvc glance-images -n openstack \
       -o jsonpath='GLANCE_SC={.spec.storageClassName}{"\n"}GLANCE_SIZE={.spec.resources.requests.storage}{"\n"}' \
       >> /home/sysadmin/br-snapshot-info.txt
      cat /home/sysadmin/br-snapshot-info.txt
      
  2. Back up the application.

    ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml \
     -e "ansible_become_pass=<password> \
        admin_password=<password> \
        openstack_app_name=wr-openstack \
        skip_os_dbs=['Database','information_schema','performance_schema','mysql','horizon','panko','gnocchi','sys']"
    
  3. Pre-Restore cleanup.

    1. Remove VMs. If a StarlingX OpenStack deployment exists, all VMs must be removed before restore.

      $ source /var/opt/openstack/admin-openrc
      $ openstack server list --all
      $ openstack server delete <vm_uuid>
      
    2. Remove and re-upload the application.

      Warning

      Ensure all VMs are deleted before proceeding.

      $ source /etc/platform/openrc
      $ system application-remove wr-openstack
      # Wait until status = uploaded
      $ watch system application-show wr-openstack
      
      $ system application-delete wr-openstack
      $ system application-upload wr-openstack.tgz
      # Wait again until status = uploaded
      $ watch system application-show wr-openstack
      
  4. Apply the --reuse-values workaround.

    sudo find / -path "*/restore-openstack/*" -type f -exec grep -l "system helm-override-update" {} \; 2>/dev/null | while read -r f; do
      tmp=$(mktemp)
      sudo sed -e '/--reuse-values/!s/system helm-override-update/system helm-override-update --reuse-values/g' -e 's/show_multiple_locations=True/show_multiple_locations=False/g' "$f" > "$tmp"
      chmod 644 "$tmp"
      sudo mount --bind "$tmp" "$f"
    done
    

    Verify:

    grep "helm-override-update" /usr/share/ansible/stx-ansible/playbooks/roles/restore-openstack/restore/tasks/main.yml

  5. Restore Glance PVC (Before restore_openstack.yml).

    $ source /home/sysadmin/br-snapshot-info.txt
    
    TRIDENT_SNAPCLASS=$(kubectl get volumesnapshotclass \
     -o jsonpath='{.items[?(@.driver=="csi.trident.netapp.io")].metadata.name}')
    
    GLANCE_VSC_RESTORE=snapcontent-restore-${GLANCE_SNAP}
    
    1. Create namespace (this is required before restore).

      $ kubectl create ns openstack
      
    2. Create VolumeSnapshotContent (pre-provisioned).

      cat << EOF | kubectl apply -f -
      apiVersion: snapshot.storage.k8s.io/v1
      kind: VolumeSnapshotContent
      metadata:
       name: ${GLANCE_VSC_RESTORE}
      spec:
       deletionPolicy: Retain
       driver: csi.trident.netapp.io
       source:
        snapshotHandle: ${GLANCE_SNAPSHOT_HANDLE}
       sourceVolumeMode: Filesystem
       volumeSnapshotClassName: ${TRIDENT_SNAPCLASS}
       volumeSnapshotRef:
        name: ${GLANCE_SNAP}
        namespace: openstack
      EOF
      
    3. Recreate VolumeSnapshot.

      cat << EOF | kubectl apply -f -
      apiVersion: snapshot.storage.k8s.io/v1
      kind: VolumeSnapshot
      metadata:
       name: ${GLANCE_SNAP}
       namespace: openstack
      spec:
       volumeSnapshotClassName: ${TRIDENT_SNAPCLASS}
       source:
        volumeSnapshotContentName: ${GLANCE_VSC_RESTORE}
      EOF
      
    4. Verify readiness.

      $ kubectl get volumesnapshotcontent ${GLANCE_VSC_RESTORE}
      $ kubectl get volumesnapshot ${GLANCE_SNAP} -n openstack
      
    5. Recreate PVC from the snapshot.

      cat << EOF | kubectl apply -f -
      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
       name: glance-images
       namespace: openstack
       annotations:
        helm.sh/resource-policy: keep
      spec:
       accessModes:
        - ReadWriteMany
       resources:
        requests:
         storage: ${GLANCE_SIZE}
       storageClassName: ${GLANCE_SC}
       dataSource:
        name: ${GLANCE_SNAP}
        kind: VolumeSnapshot
        apiGroup: snapshot.storage.k8s.io
      EOF
      
      $ kubectl get pvc -n openstack glance-images
      # Wait until STATUS = Bound
      
  6. Restore OpenStack.

    ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_openstack.yml \
     -e "initial_backup_dir=/opt/backups \
        ansible_become_pass=<password> \
        admin_password=<password> \
        backup_filename=<wr-openstack-backup-tarball>.tgz \
        openstack_app_name=wr-openstack"
    
  7. Cleanup Snapshot Resources.

    $ source /home/sysadmin/br-snapshot-info.txt
    $ GLANCE_VSC_RESTORE=snapcontent-restore-${GLANCE_SNAP}
    $ kubectl delete volumesnapshot ${GLANCE_SNAP} -n openstack
    $ kubectl patch volumesnapshotcontent ${GLANCE_VSC_RESTORE} \
       --type=merge -p '{"metadata":{"finalizers":[]}}'
    $ kubectl delete volumesnapshotcontent ${GLANCE_VSC_RESTORE}
    
  8. Reattach VMs (as needed).

    Note

    Skip if VMs are already ACTIVE.

    $ source /var/opt/openstack/admin-openrc
    $ nova reset-state --active <uuid>
    $ openstack server stop <uuid>
    $ openstack server shelve <uuid>
    $ openstack server unshelve <uuid>
    

Results

  • Clear ERROR state

  • Stop reboot loops

  • Reattach volumes

Note

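Always source the correct environment before running commands. If you are using a new shell, source whichever of the following the next command needs:

  • /etc/platform/openrc for system commands

  • /var/opt/openstack/admin-openrc for openstack and nova commands

  • /home/sysadmin/br-snapshot-info.txt for the saved snapshot variables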

Do not proceed with restore if VMs still exist.

Ensure Glance PVC reaches Bound state before restore.
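The two conditions above can be checked with small guard functions rather than by reading command output. This is a sketch under stated assumptions: the function names are hypothetical, the openstack CLI is assumed to be reachable for the first check (so it must run before the application is removed), and the second check is meant to run just before restore_openstack.yml.

```shell
# Hypothetical guard: succeed only if no VMs remain.
# Run it before "system application-remove", while the OpenStack API is up.
assert_no_servers() {
    # -f value -c ID prints one server ID per line, without table borders.
    servers=$(openstack server list --all-projects -f value -c ID) || {
        echo "could not list servers" >&2
        return 1
    }
    if [ -n "$servers" ]; then
        echo "VMs still exist; delete them before restoring:" >&2
        echo "$servers" >&2
        return 1
    fi
    echo "no VMs remain"
}

# Hypothetical guard: succeed only if the glance-images PVC is Bound.
# Run it just before restore_openstack.yml.
assert_glance_pvc_bound() {
    phase=$(kubectl get pvc glance-images -n openstack \
        -o jsonpath='{.status.phase}' 2>/dev/null) || phase=""
    if [ "$phase" != "Bound" ]; then
        echo "glance-images PVC is '${phase:-missing}', expected Bound" >&2
        return 1
    fi
    echo "glance-images PVC is Bound"
}
```

Both functions return a non-zero status on failure, so they can gate the next command, for example assert_glance_pvc_bound && ansible-playbook ... restore_openstack.yml.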

Backup Glance / Cinder Backend in HTTP and HTTPS Enabled Environment

When StarlingX OpenStack is deployed with TLS enabled (recommended for production environments), a configuration-escaping issue may cause PostgreSQL to fail while processing the SQL file during execution of the restore-openstack playbook. As a result, application configuration overrides are not fully restored, and the StarlingX OpenStack restore operation fails during the application reapply phase.

Procedural Changes: Execute the restore-openstack playbook with TLS disabled (HTTP mode) to restore StarlingX OpenStack, then reapply the application with the required configuration overrides to enable TLS after restore completes.

  1. Back up the environment.

    $ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml \
    -e "ansible_become_pass=<password> \
    admin_password=<password> \
    openstack_app_name=wr-openstack \
    skip_os_dbs=['Database','information_schema','performance_schema','mysql','horizon','panko','gnocchi','sys']"
    
  2. Pre-Restore Cleanup.

    1. Remove VMs. All VMs must be deleted before restore.

      $ source /var/opt/openstack/admin-openrc
      $ openstack server list --all
      $ openstack server delete <vm_uuid>
      
    2. Remove and re-upload the application.

      Warning

      Do not proceed unless all VMs are deleted.

      $ source /etc/platform/openrc
      $ system application-remove wr-openstack
      
      1. Wait until the status is uploaded, then delete and re-upload the application.

        $ watch system application-show wr-openstack
        $ system application-delete wr-openstack
        $ system application-upload wr-openstack.tgz
        
      2. Wait again until the status is uploaded.

        $ watch system application-show wr-openstack
        
  3. Apply the --reuse-values workaround.

    $ sudo find / -path "*/restore-openstack/*" -type f -exec grep -l "system helm-override-update" {} \; 2>/dev/null | while read -r f; do
      tmp=$(mktemp)
      sudo sed -e '/--reuse-values/!s/system helm-override-update/system helm-override-update --reuse-values/g' -e 's/show_multiple_locations=True/show_multiple_locations=False/g' "$f" > "$tmp"
      chmod 644 "$tmp"
      sudo mount --bind "$tmp" "$f"
    done
    

    Verify:

    grep "helm-override-update" /usr/share/ansible/stx-ansible/playbooks/roles/restore-openstack/restore/tasks/main.yml

  4. Run the following only if the environment is HTTPS. Deleting the endpoint service parameter lets the restore run in HTTP mode.

    $ source /etc/platform/openrc
    $ system service-parameter-list | grep endpoint
    $ system service-parameter-delete <endpoint-id>
    
  5. Restore (Runs in HTTP Mode).

    $ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_openstack.yml \
    -e "initial_backup_dir=/opt/backups \
    ansible_become_pass=<password> \
    admin_password=<password> \
    backup_filename=<wr-openstack-backup-tarball>.tgz \
    openstack_app_name=wr-openstack"
    

    Note

    If the environment uses HTTP only, skip the HTTPS steps and go to the Reattach VMs step below (if needed).

  6. Post-Restore verification.

    $ source /var/opt/openstack/admin-openrc
    $ openstack endpoint list
    
  7. Certificate Location. Certificates generated before backup are reused.

    /home/sysadmin/openstack-certs/
    
  8. Reconfigure HTTPS (Post-Restore). Set Endpoint Domain.

    $ system service-parameter-add openstack helm endpoint_domain=<your-lab>.wrs.com
    
    1. Create clients override.

      cat <<EOF > clients_override.yaml
      serviceEndpointPattern: "{service_name}-{endpoint_domain}"
      EOF
      
    2. Apply Helm overrides.

      $ source /etc/platform/openrc
      $ system helm-override-update wr-openstack clients openstack --reuse-values --set openstackCertificateFile=/home/sysadmin/openstack-certs/openstack-helm.pem
      $ system helm-override-update wr-openstack clients openstack --reuse-values --set openstackCertificateKeyFile=/home/sysadmin/openstack-certs/openstack-helm.pem
      $ system helm-override-update wr-openstack clients openstack --reuse-values --set openstackCertificateCAFile=/home/sysadmin/openstack-certs/openstack-helm-ca.crt
      $ system helm-override-update wr-openstack clients openstack --reuse-values --values clients_override.yaml
      
  9. Verify that the overrides are applied.

    $ system helm-override-show wr-openstack clients openstack
    
  10. Re-apply the application.

    $ system application-apply wr-openstack
    
  11. Verify HTTPS endpoints.

    $ source /var/opt/openstack/admin-openrc
    $ openstack endpoint list
    

    Endpoints should now show HTTPS.

  12. Reattach VMs (if needed).

    Note

    Skip if VMs are already ACTIVE.

    $ source /var/opt/openstack/admin-openrc
    $ nova reset-state --active <uuid>
    $ openstack server stop <uuid>
    $ openstack server shelve <uuid>
    $ openstack server unshelve <uuid>
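The grep in the workaround's Verify step prints the matching lines but leaves the comparison to the reader. The sketch below fails if any system helm-override-update line still lacks --reuse-values. The function name check_reuse_values is hypothetical; the file to pass is the main.yml path shown in the Verify step.

```shell
# Hypothetical check: succeed only if every "system helm-override-update"
# line in the given file also carries --reuse-values.
check_reuse_values() {
    target=$1
    [ -r "$target" ] || { echo "cannot read $target" >&2; return 1; }
    # The first grep keeps the override commands, the second keeps those
    # still missing the flag. "--" stops grep from parsing the pattern as
    # an option; "|| true" keeps an empty result from counting as an error.
    missing=$(grep -n "system helm-override-update" "$target" \
        | grep -v -- "--reuse-values" || true)
    if [ -n "$missing" ]; then
        echo "lines without --reuse-values in $target:" >&2
        echo "$missing" >&2
        return 1
    fi
    echo "all helm-override-update calls use --reuse-values"
}
```

A file with no system helm-override-update lines also passes, so this complements the grep in the Verify step rather than replacing it.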
    

Results

  • Clear ERROR state

  • Stop reboot loops

  • Reattach volumes

Note

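Always source the correct environment before running commands. If you are using a new shell, source whichever of the following the next command needs:

  • /etc/platform/openrc for system commands

  • /var/opt/openstack/admin-openrc for openstack and nova commands

  • /home/sysadmin/br-snapshot-info.txt for the saved snapshot variables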

For example:

$ source /var/opt/openstack/admin-openrc
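With admin-openrc sourced, the HTTPS endpoint verification can also be scripted. This is a sketch only: the function name check_https_endpoints is hypothetical, and restricting the check to public endpoints is an assumption, since internal and admin endpoints may legitimately remain on HTTP in some deployments.

```shell
# Hypothetical check: succeed only if every public endpoint URL uses https.
# Assumption: only the public interface is expected to switch to HTTPS.
check_https_endpoints() {
    urls=$(openstack endpoint list --interface public -f value -c URL) || {
        echo "could not list endpoints" >&2
        return 1
    }
    [ -n "$urls" ] || { echo "no endpoints returned" >&2; return 1; }
    # Keep any URL that does not start with https://
    non_https=$(echo "$urls" | grep -v '^https://' || true)
    if [ -n "$non_https" ]; then
        echo "endpoints not using HTTPS:" >&2
        echo "$non_https" >&2
        return 1
    fi
    echo "all public endpoints use HTTPS"
}
```

Run it after the application has been re-applied with the TLS overrides.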

Note

If the environment is HTTPS, the restore process must run in HTTP mode first, and then HTTPS is re-applied afterward.

  • Do not proceed if VMs still exist

  • Ensure application state transitions complete before moving to next steps
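The watch commands in the procedure require someone to read the status and interrupt manually. The polling helper below is one way to wait for a state transition in a script. The function name wait_for_app_status and the timeout defaults are hypothetical, and the parsing assumes that system application-show prints a table containing a row of the form "| status | uploaded |".

```shell
# Hypothetical helper: poll until the application reaches a given status.
# Usage: wait_for_app_status <app> <status> [timeout_s] [interval_s]
wait_for_app_status() {
    app=$1
    want=$2
    timeout=${3:-600}
    interval=${4:-10}
    elapsed=0
    status=""
    while [ "$elapsed" -le "$timeout" ]; do
        # Assumes table output with a row like "| status | uploaded |".
        status=$(system application-show "$app" 2>/dev/null \
            | awk -F'|' '$2 ~ /^ *status *$/ {gsub(/ /, "", $3); print $3}')
        if [ "$status" = "$want" ]; then
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "timed out waiting for $app to reach '$want' (last: '$status')" >&2
    return 1
}
```

For example, wait_for_app_status wr-openstack uploaded can follow both system application-remove and system application-upload in place of the watch commands.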