Hi all
I’m running several docker containers with local persistent volumes that I would like to back up. I haven’t found an easy method to do so.
What do you use / recommend to do that? AFAIK you can’t just rsync the volume directory while the container is running.
Use bind mounts instead of docker volumes. Then you just have normal directories to back up, the same as you would anything else.
In general, it’s not a problem to back up files while the container is running. The exception to this is databases. To have reliable database backups, you need to stop the container (or quiesce/pause the database if it supports it) before backing up the raw database files (including SQLite).
This is your answer. It also has the benefit of allowing you to have a nice folder structure for your Docker setup, where you have a folder for each service holding the corresponding compose yaml and data folder(s)
Exactly the reason why I always exchange the volumes in any compose file with bind mounts.
Also you don’t have the problem of many dangling volumes.
I don’t even understand what the advantage is to using volumes rather than mounts? So I too always use mounts.
I think volumes are useful when you don’t want to deal with those files on the host. Mainly for development environments.
I wasn’t able to understand volumes at first, and my teammate told me I had to use binders to run mysql. My project folder used to have a docker/mysql/data. Now I just point MySQL data to a volume so I don’t lose data between restarts. And I don’t have to deal with a mysql folder in my project with files I would never touch directly.
In my opinion, volumes are useful for development / testing environments.
I’m not sure either. The only thing I could come up with is that with volumes you don’t have to worry about file ownership. That’s usually taken care of for you with volumes from what I understand.
it’s better to stop the service mounting those volumes before backing them up or you may break something with hot backup
docker volume is an exact same normal directory under /var/lib/docker, there’s no difference with regard to backup consistency.
there’s no silver bullet here, it’s best to use tools specific to whatever is running in the container, e.g. wal-g for postgres, etc.
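To make the "it's just a directory" point concrete, here is a minimal sketch that asks Docker where a named volume lives on the host. The volume name `myapp_data` in the usage note is hypothetical; `docker volume inspect` with `--format` is the standard CLI.

```shell
#!/bin/sh
# Print the host directory that backs a named Docker volume.
# The volume name is passed as the first argument.
volume_path() {
    docker volume inspect --format '{{ .Mountpoint }}' "$1"
}
```

Something like `sudo rsync -a "$(volume_path myapp_data)/" /mnt/backup/myapp_data/` would then copy it like any other folder (both paths are made up), with the same caveats about live databases as for bind mounts.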
Rsync works fine for most data. (I use borgbackup) For any database data, create a dump using pg_dump or mysqldump or whatever. Then backup the dump and all other volumes but exclude the db volume.
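A rough sketch of that dump-then-copy approach, in case it helps. Everything named here is an assumption for illustration: a Postgres container called `db`, a database `app`, data under `/srv/docker` with the raw database files in `/srv/docker/db-data`, and plain rsync standing in for borgbackup.

```shell
#!/bin/sh
# Sketch: dump the database first, then copy everything except the raw DB files.
# Hypothetical names: container "db", user "postgres", database "app",
# raw database files in $SRC/db-data.
SRC=${SRC:-/srv/docker}
DEST=${DEST:-/mnt/backup/docker}

backup_with_dump() {
    mkdir -p "$DEST" || return 1
    # Write to a temp file first so a failed dump never replaces a good one.
    docker exec db pg_dump -U postgres app > "$DEST/app.sql.tmp" || return 1
    mv "$DEST/app.sql.tmp" "$DEST/app.sql" || return 1
    # Copy all other data; the leading slash anchors the exclude at the top of SRC.
    rsync -a --exclude=/db-data/ "$SRC/" "$DEST/files/"
}
```

Call `backup_with_dump` at the end of the script. For MySQL the same shape should work with `mysqldump` in place of `pg_dump`.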
I personally use a script which stops all containers, rsyncs the bind mounts (normal folders on the filesystem) and then restarts them. It runs every night so it isn’t a problem that services are down for a few minutes.
Ideally, you would also make a database dump instead of just backing up the bind mounts.
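For anyone who wants a starting point, a stop / copy / restart script along those lines could look roughly like this. It is only a sketch: the source and destination paths are hypothetical, and it stops every running container rather than a selected few.

```shell
#!/bin/sh
# Sketch of a stop / copy / restart backup. Paths are hypothetical.
SRC=${SRC:-/srv/docker}
DEST=${DEST:-/mnt/backup/docker}

nightly_backup() {
    # Remember which containers are running so only those get restarted.
    running=$(docker ps -q)
    # $running is left unquoted on purpose: one argument per container ID.
    [ -n "$running" ] && docker stop $running
    rsync -a --delete "$SRC/" "$DEST/"
    status=$?
    # Restart even if rsync failed, so a backup error does not leave services down.
    [ -n "$running" ] && docker start $running
    return $status
}
```

Call `nightly_backup` at the end of the script and schedule it from cron, for example `0 3 * * * /usr/local/bin/nightly-backup.sh` (the script path is made up).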
My persistent volumes are in a ZFS dataset, and I use Sanoid to periodically snapshot the dataset and Syncoid to transfer these snapshots to my backup host.
There is some official documentation on this: https://docs.docker.com/storage/volumes/#back-up-restore-or-migrate-data-volumes
I personally just rsync my volumes to backblaze :)
In general there is no problem rsync’ing the bind-mounted directory, but that depends on the application running in the container. For example, you should not copy the files of a running database: the copy may end up inconsistent if data is written while it’s being copied.
I have a script that reads all my compose files to determine each container’s persistent data (though this could also be done with docker inspect) and then uses docker cp to pipe it into restic, which can use data from stdin.
docker cp mycontainer:/files - | restic backup --stdin --stdin-filename mycontainer
Stopping databases is on my todo list.
Besides using bind mounts (as @[email protected] mentions), you can run a backup container that mounts the volume you would like to back up. The backup container would handle backing up the volume at a regular interval.
This is what I do in the docker-compose and k3s containers I back up. I can recommend autorestic as the container for backup, but there are a lot of options.
You can copy data from docker volumes to somewhere on the host node to do the backup from there. You can also have a container using the volumes and from there you can send directly to remote, or map a directory on the host node to copy the files to.
If you are running a database or something stateful, look at best practices for backup and then adjust to it.
Or don’t use volumes and map onto the host node directly. Each works, and has its own advantages/disadvantages.
Bind mounts are easy to maintain and back up. However, if you share data among multiple containers, docker volumes are recommended, especially for managing state.
Backup volumes:
docker run --rm --volumes-from dbstore -v "$(pwd)":/backup ubuntu tar cvf /backup/backup.tar /dbdata
- Launch a new container and mount the volume from the dbstore container
- Mount a local host directory as /backup
- Pass a command that tars the contents of the dbdata volume to a backup.tar file inside /backup directory.
Database volume backup without stopping the service: bash into the container, dump it, and copy it out with docker cp. Run it periodically via crontab
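A possible shape for that dump-and-copy step, using `docker exec` instead of an interactive shell so it can run unattended. This is a sketch under assumptions: a MySQL container named `db`, a database named `app`, a host directory `/mnt/backup/db`, and `MYSQL_ROOT_PASSWORD` set inside the container.

```shell
#!/bin/sh
# Sketch: dump a running MySQL container and copy the dump to the host.
# Hypothetical names: container "db", database "app", host dir /mnt/backup/db.
# Assumes MYSQL_ROOT_PASSWORD is set in the container's environment.
OUT=${OUT:-/mnt/backup/db}

dump_db() {
    mkdir -p "$OUT" || return 1
    stamp=$(date +%F)
    # Single quotes defer $MYSQL_ROOT_PASSWORD to the shell inside the container.
    # --single-transaction gives a consistent dump for InnoDB tables without locking.
    docker exec db sh -c 'mysqldump --single-transaction -u root -p"$MYSQL_ROOT_PASSWORD" app > /tmp/app.sql' || return 1
    # Copy the dump out of the container, one file per day.
    docker cp db:/tmp/app.sql "$OUT/app-$stamp.sql"
}
```

Call `dump_db` at the end of the script and add it to crontab, for example `30 2 * * * /usr/local/bin/dump-db.sh` (hypothetical path).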