Replace Vault cluster node¶
Introduction¶
This article shows how to replace a Vault node in a cluster made highly available by means of the subordinate hacluster charm. Replacement involves removing a vault unit and then adding a new one. This is done with generic Juju commands and actions available to the hacluster charm.
Important
This procedure will not result in cloud downtime provided that at least one functional Vault node is present at all times.
Warning
This procedure will involve a sealed Vault instance. Please ensure that the requisite number of unseal keys are available before continuing.
Procedure¶
If the unit being removed is in a ‘lost’ state (as seen in juju status) please first see the Notes section.
List the application units¶
Display the units, in this case for the vault application:
juju status vault
This article will be based on the following (partial) output:
Unit                     Workload  Agent  Machine  Public address  Ports     Message
vault/0*                 active    idle   1/lxd/4  10.246.114.76   8200/tcp  Unit is ready (active: false, mlock: disabled)
  vault-hacluster/1      active    idle            10.246.114.76             Unit is ready and clustered
  vault-mysql-router/0*  active    idle            10.246.114.76             Unit is ready
vault/3                  active    idle   0/lxd/8  10.246.114.83   8200/tcp  Unit is ready (active: true, mlock: disabled)
  vault-hacluster/2      active    idle            10.246.114.83             Unit is ready and clustered
  vault-mysql-router/25  active    idle            10.246.114.83             Unit is ready
vault/4                  active    idle   2/lxd/9  10.246.114.84   8200/tcp  Unit is ready (active: false, mlock: disabled)
  vault-hacluster/0*     active    idle            10.246.114.84             Unit is ready and clustered
  vault-mysql-router/24  active    idle            10.246.114.84             Unit is ready
In this example, unit vault/3 will be removed.
Pause the subordinate hacluster unit¶
Pause the hacluster unit that corresponds to the principal application unit being removed. Here, unit vault-hacluster/2 corresponds to unit vault/3:
juju run-action --wait vault-hacluster/2 pause
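The pairing between a principal unit and its hacluster subordinate can also be read mechanically from the status listing. The following is a minimal sketch, not part of the official procedure. It assumes the unit rows have the column layout shown in this article (public address in the fifth column for principal units and in the fourth for subordinates, which have no Machine entry) and that subordinates are listed after their principal. The function name find_hacluster_unit is hypothetical.

```shell
# find_hacluster_unit UNIT
# Reads `juju status` unit rows on stdin and prints the hacluster
# subordinate that shares the public address of the principal UNIT.
# Assumption: subordinate rows follow their principal and have no
# Machine column, so their address is field 4 instead of field 5.
find_hacluster_unit() {
    awk -v unit="$1" '
        { name = $1; sub(/\*$/, "", name) }   # drop the leader marker
        name == unit { addr = $5; next }      # remember the principal address
        addr != "" && name ~ /-hacluster\// && $4 == addr { print name; exit }
    '
}

# Example (requires a Juju client and a deployed model):
#   juju status vault | find_hacluster_unit vault/3
```

Matching on the public address is a convenience for this layout only; the relationship should still be confirmed by eye before pausing anything.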
Remove the principal application unit¶
Remove the principal application unit:
juju remove-unit vault/3
This will also remove the hacluster subordinate unit (and any other subordinate units).
Add a principal application unit¶
Scale out the existing vault application and place the new (containerised) unit on the same host that the removed unit was on (machine 0):
juju add-unit --to lxd:0 vault
Caution
If network spaces are in use the above command will not succeed. See Juju issue LP #1969523 for a workaround.
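The placement directive used above follows from the Machine column of the removed unit: in 0/lxd/8, the number before the first slash is the host machine. A small sketch of that derivation, assuming the machine ID has the host/lxd/index form shown in the status output (the function name lxd_placement_for is hypothetical):

```shell
# lxd_placement_for MACHINE_ID
# Turns a container machine ID such as 0/lxd/8 into the placement
# directive lxd:0, i.e. "a new LXD container on the same host".
lxd_placement_for() {
    host="${1%%/*}"            # keep everything before the first slash
    printf 'lxd:%s\n' "$host"
}

# Example (requires a Juju client):
#   juju add-unit --to "$(lxd_placement_for 0/lxd/8)" vault
```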
The new juju status output now contains:
Unit                     Workload  Agent  Machine  Public address  Ports     Message
vault/0*                 active    idle   1/lxd/4  10.246.114.76   8200/tcp  Unit is ready (active: false, mlock: disabled)
  vault-hacluster/1      active    idle            10.246.114.76             Unit is ready and clustered
  vault-mysql-router/0*  active    idle            10.246.114.76             Unit is ready
vault/4                  active    idle   2/lxd/9  10.246.114.84   8200/tcp  Unit is ready (active: true, mlock: disabled)
  vault-hacluster/0*     active    idle            10.246.114.84             Unit is ready and clustered
  vault-mysql-router/24  active    idle            10.246.114.84             Unit is ready
vault/6                  blocked   idle   0/lxd/9  10.246.114.83   8200/tcp  Unit is sealed
  vault-hacluster/28     active    idle            10.246.114.83             Unit is ready and clustered
  vault-mysql-router/40  active    idle            10.246.114.83             Unit is ready
Notice that the new vault unit (vault/6) is sealed.
Unseal the new Vault instance¶
Here we will assume that the original Vault deployment was initialised with a requirement of three unseal keys.
Set an environment variable based on the address of the newly-introduced unit, and unseal the instance:
export VAULT_ADDR="http://10.246.114.83:8200"
vault operator unseal
vault operator unseal
vault operator unseal
For more information on unsealing Vault, see the Unseal Vault cloud operation.
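Since one key is supplied per invocation, the number of vault operator unseal calls matches the unseal threshold. A minimal sketch of that repetition, assuming a vault client on the PATH and VAULT_ADDR already exported; the function name unseal_n_times is hypothetical, and each call still prompts interactively for a key:

```shell
# unseal_n_times THRESHOLD
# Runs `vault operator unseal` THRESHOLD times, stopping at the first
# failure. Each invocation prompts for one unseal key.
unseal_n_times() {
    threshold="$1"
    i=1
    while [ "$i" -le "$threshold" ]; do
        echo "Supplying unseal key $i of $threshold"
        vault operator unseal || return 1
        i=$((i + 1))
    done
}

# Example, using the address of the new unit from this article:
#   export VAULT_ADDR="http://10.246.114.83:8200"
#   unseal_n_times 3
```

Afterwards, vault status should report the instance as no longer sealed.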
Verify cloud services¶
The final juju status vault (partial) output is:
Unit                     Workload  Agent  Machine  Public address  Ports     Message
vault/0*                 active    idle   1/lxd/4  10.246.114.76   8200/tcp  Unit is ready (active: false, mlock: disabled)
  vault-hacluster/1      active    idle            10.246.114.76             Unit is ready and clustered
  vault-mysql-router/0*  active    idle            10.246.114.76             Unit is ready
vault/4                  active    idle   2/lxd/9  10.246.114.84   8200/tcp  Unit is ready (active: true, mlock: disabled)
  vault-hacluster/0*     active    idle            10.246.114.84             Unit is ready and clustered
  vault-mysql-router/24  active    idle            10.246.114.84             Unit is ready
vault/6                  active    idle   0/lxd/9  10.246.114.83   8200/tcp  Unit is ready (active: false, mlock: disabled)
  vault-hacluster/28     active    idle            10.246.114.83             Unit is ready and clustered
  vault-mysql-router/40  active    idle            10.246.114.83             Unit is ready
Ensure that all cloud services are working as expected.
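Part of this verification can be scripted against the status listing. The sketch below is an illustration only. It assumes that a healthy cluster shows every vault unit with workload status active and exactly one unit reporting active: true, as in the output shown in this article. The function name check_vault_units is hypothetical.

```shell
# check_vault_units
# Reads `juju status` unit rows on stdin, prints a one-line summary and
# returns 0 only if all vault units are "active" and exactly one of
# them reports "active: true" in its message.
check_vault_units() {
    awk '
        $1 ~ /^vault\/[0-9]+[*]?$/ {
            total++
            if ($2 != "active") not_ready++
            if ($0 ~ /active: true/) active_nodes++
        }
        END {
            printf "units=%d active_true=%d not_ready=%d\n", total, active_nodes, not_ready
            exit !(total > 0 && active_nodes == 1 && not_ready == 0)
        }
    '
}

# Example (requires a Juju client and a deployed model):
#   juju status vault | check_vault_units
```

A zero exit status here only covers the vault units themselves; services that depend on Vault still need to be checked separately.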
Notes¶
If the principal application unit has transitioned to a ‘lost’ state before removal (e.g. it dropped off the network due to a hardware failure), then:
the first step (pause the hacluster unit) can be skipped;
the second step (remove the principal unit) can be replaced by:
juju remove-machine N --force
Here, N is the Juju machine ID (as shown by the juju status command) of the machine on which the unit to be removed is running.
Warning
Removing the machine by force will naturally remove any other units that may be present, including those from an entirely different application.
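The value of N can be read from the Machine column of the unit's row. A minimal sketch, assuming the row layout shown in this article (Machine in the fourth column for principal units); the function name machine_of_unit is hypothetical:

```shell
# machine_of_unit UNIT
# Reads `juju status` unit rows on stdin and prints the Machine column
# of the principal unit UNIT.
machine_of_unit() {
    awk -v unit="$1" '
        { name = $1; sub(/\*$/, "", name) }   # drop the leader marker
        name == unit { print $4; exit }
    '
}

# Example (requires a Juju client and a deployed model):
#   juju status vault | machine_of_unit vault/3
```

For the example unit vault/3 in the first listing, the Machine column is 0/lxd/8, which identifies the container rather than host machine 0.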