
etcd backup and recovery in Kubernetes - Commands & Configuration

Introduction
etcd stores the full state and configuration of a Kubernetes cluster. If this data is lost or corrupted, the cluster can stop working. Backing up etcd regularly and knowing how to restore it keeps the cluster recoverable.
Typical situations that call for a backup or a restore:
When you want to save a snapshot of your cluster state before making big changes.
When your Kubernetes cluster is not responding due to etcd data corruption.
When migrating your cluster data to a new etcd instance or server.
When you want to recover your cluster after accidental deletion of resources.
When performing disaster recovery drills to ensure cluster resilience.
Commands
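This command creates a snapshot of the etcd database and saves it to /tmp/etcd-backup.db. It connects to etcd over TLS using the etcd CA certificate plus a client certificate and key. /tmp is often cleared on reboot, so copy the snapshot to durable storage afterwards.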
Terminal
ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup.db --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key
Expected Output
Snapshot saved at /tmp/etcd-backup.db
--endpoints - Specifies the etcd server address
--cacert - CA certificate for secure connection
--cert - Client certificate for authentication
--key - Client key for authentication
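The flag list is easy to mistype when it is repeated on every call. As a minimal sketch, the connection settings can be exported once through etcdctl's ETCDCTL_* environment variables, and a small helper can write each snapshot to a timestamped file. The function name backup_etcd and the backup directory are hypothetical; the endpoint and certificate paths are the ones used in the command above.

```shell
# Connection settings from the command above, exported once so that
# every later etcdctl call picks them up from the environment.
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key

# backup_etcd DIR (hypothetical helper):
# saves a timestamped snapshot into DIR and prints the snapshot path.
backup_etcd() {
  local dir="$1"
  local file="$dir/etcd-backup-$(date +%Y%m%d-%H%M%S).db"
  mkdir -p "$dir" || return 1
  # etcdctl's own messages go to stderr so stdout carries only the path.
  etcdctl snapshot save "$file" >&2 || return 1
  printf '%s\n' "$file"
}
```

A call such as backup_etcd /var/backups/etcd (an example directory, not one the tutorial defines) prints the path of the new snapshot, which can then be passed to the status check.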
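This command inspects the saved snapshot to confirm it is readable and reports its hash, revision, key count, and size. In etcd 3.5 and later, etcdutl snapshot status is the preferred form of this command.
Terminal
ETCDCTL_API=3 etcdctl snapshot status /tmp/etcd-backup.db
Expected Output
a1b2c3d4, 12345, 1500, 12 MB
Fields in order: hash, revision, total keys, total size. Values are illustrative.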
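A backup that was never checked can fail exactly when it is needed. The sketch below wraps the status check in a hypothetical helper, verify_snapshot, that first rejects a missing or empty file and then asks etcdctl for a table view of the snapshot details.

```shell
# verify_snapshot FILE (hypothetical helper):
# fails if FILE is missing or empty, otherwise prints the snapshot status.
verify_snapshot() {
  local file="$1"
  if [ ! -s "$file" ]; then
    echo "missing or empty snapshot: $file" >&2
    return 1
  fi
  # --write-out=table renders the same fields with column headers.
  ETCDCTL_API=3 etcdctl snapshot status "$file" --write-out=table
}
```

The function returns etcdctl's exit status, so it can gate a later step, for example verify_snapshot /tmp/etcd-backup.db && echo "snapshot looks usable".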
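Stop the Kubernetes API server so that nothing writes to etcd during the restore. The command below applies when kube-apiserver runs as a systemd service. On kubeadm clusters it runs as a static pod, and is stopped by moving its manifest out of /etc/kubernetes/manifests instead. etcd itself must also be stopped before its data directory is replaced.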
Terminal
systemctl stop kube-apiserver
Expected Output
No output (command runs silently)
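On kubeadm-built control planes (an assumption suggested by the /etc/kubernetes/pki paths used here), kube-apiserver and etcd usually run as static pods, so there is no systemd unit to stop. The kubelet stops a static pod once its manifest leaves the manifests directory and starts it again when the manifest returns. A minimal sketch, with hypothetical helper names and a hypothetical parking directory:

```shell
# Default kubeadm manifest location; override if your kubelet uses another path.
MANIFEST_DIR="${MANIFEST_DIR:-/etc/kubernetes/manifests}"
# Hypothetical holding directory for the manifests of stopped pods.
PARKED_DIR="${PARKED_DIR:-/etc/kubernetes/manifests-stopped}"

# stop_static_pod NAME: move NAME.yaml out of the manifests directory.
stop_static_pod() {
  mkdir -p "$PARKED_DIR" && mv "$MANIFEST_DIR/$1.yaml" "$PARKED_DIR/"
}

# start_static_pod NAME: move NAME.yaml back so the kubelet recreates the pod.
start_static_pod() {
  mv "$PARKED_DIR/$1.yaml" "$MANIFEST_DIR/"
}
```

For a restore, this would typically mean stop_static_pod kube-apiserver and stop_static_pod etcd beforehand, then start_static_pod etcd and start_static_pod kube-apiserver once the data directory is in place. The kubelet needs a short while to react to each change.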
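Restore the snapshot into a new data directory rather than over the live one. The target directory must not already exist. In etcd 3.5 and later, etcdutl snapshot restore is the preferred form of this command.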
Terminal
ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcd-backup.db --data-dir=/var/lib/etcd-new
Expected Output
Log lines ending with a "restored snapshot" message that names /var/lib/etcd-new (the exact format varies by etcd version)
--data-dir - Directory to restore the etcd data into
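With etcd stopped, rename the current etcd data directory and move the restored directory into its place. Keeping the old directory gives you a fallback if the restored data turns out to be unusable.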
Terminal
mv /var/lib/etcd /var/lib/etcd-old && mv /var/lib/etcd-new /var/lib/etcd
Expected Output
No output (command runs silently)
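Running the swap a second time with the one-liner above would fail or nest directories, because /var/lib/etcd-old already exists. As a sketch, a hypothetical helper swap_data_dir can check that the restored directory exists and give the old copy a timestamped name so earlier fallbacks are never overwritten.

```shell
# swap_data_dir LIVE RESTORED (hypothetical helper):
# moves LIVE aside under a timestamped name, then moves RESTORED into place.
swap_data_dir() {
  local live="$1" restored="$2"
  local old="${live}-old-$(date +%Y%m%d-%H%M%S)"
  if [ ! -d "$restored" ]; then
    echo "restored directory not found: $restored" >&2
    return 1
  fi
  # Keep the previous data as a fallback instead of deleting it.
  if [ -d "$live" ]; then
    mv "$live" "$old" || return 1
  fi
  mv "$restored" "$live"
}
```

With the paths from this tutorial the call would be swap_data_dir /var/lib/etcd /var/lib/etcd-new, run only while etcd is stopped.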
Start the Kubernetes API server again once the restored data directory is in place. If you stopped etcd for the swap, start it first.
Terminal
systemctl start kube-apiserver
Expected Output
No output (command runs silently)
Check the health of the etcd endpoint to confirm it is running properly after recovery.
Terminal
ETCDCTL_API=3 etcdctl endpoint health --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key
Expected Output
https://127.0.0.1:2379 is healthy: successfully committed proposal: took = 1.234ms
--endpoints - Specifies the etcd server address
--cacert - CA certificate for secure connection
--cert - Client certificate for authentication
--key - Client key for authentication
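etcd may need a few seconds after a restart before it reports healthy, so a single check can fail even when the recovery worked. A minimal sketch of a retry loop around the same health command follows. The helper name wait_for_etcd and its default of 10 attempts, 3 seconds apart, are arbitrary choices for illustration.

```shell
# wait_for_etcd [ATTEMPTS] [DELAY_SECONDS] (hypothetical helper):
# retries the health check until it passes or the attempts run out.
wait_for_etcd() {
  local attempts="${1:-10}" delay="${2:-3}" i=1
  while [ "$i" -le "$attempts" ]; do
    if ETCDCTL_API=3 etcdctl endpoint health \
        --endpoints=https://127.0.0.1:2379 \
        --cacert=/etc/kubernetes/pki/etcd/ca.crt \
        --cert=/etc/kubernetes/pki/etcd/server.crt \
        --key=/etc/kubernetes/pki/etcd/server.key >/dev/null 2>&1; then
      echo "etcd healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "etcd still unhealthy after $attempts attempts" >&2
  return 1
}
```

Calling wait_for_etcd 20 5 would allow up to 20 checks, five seconds apart, before giving up with a non-zero exit status.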
Key Concept

If you remember nothing else from this pattern, remember: always stop the Kubernetes API server before restoring etcd data to avoid corruption.

Common Mistakes
Trying to restore etcd snapshot while kube-apiserver is running
This can cause data corruption or conflicts because the API server is actively using etcd data.
Stop the kube-apiserver service before restoring etcd data.

Not using the correct certificates and keys when running etcdctl commands
etcd uses secure communication and will reject commands without proper authentication.
Always provide the correct --cacert, --cert, and --key flags pointing to valid certificates.

Overwriting the current etcd data directory without backing it up
If the restore fails, you lose the original data and cannot recover.
Rename or move the original data directory before replacing it with restored data.
Summary
Use etcdctl snapshot save to create a backup of your etcd data securely.
Stop the kube-apiserver before restoring etcd data to prevent conflicts.
Restore the snapshot to a new directory, then replace the old data directory safely.
Restart the kube-apiserver and verify etcd health to confirm successful recovery.