Making a project backup in a simple way

This post was published more than a year ago. The information below may be out of date, but it might still be useful.

Photo by Markus Winkler on Unsplash

It is 2022-07-07 today. Two days ago I stupidly and accidentally lost literally all the data on this server, including this blog and some pet projects.

There was no regular backup routine. All the backups I had at that moment were dated May, which is better than nothing at all. Well, either I've become very tidy in my tech habits, or life still teaches me nothing ¯\_(ツ)_/¯

I set up the server from scratch and redeployed my projects from those backups. It took me several hours spread over two days. Unfortunately, I don't remember exactly what was permanently lost. Shit happens.

Before that accident I didn't bother with backups: I already had some, the server is (almost) always stable and fine, the bills are always paid on time, so what could possibly happen? That was the wrong kind of laziness.

This time I was lazy properly, as any tech person should be: if you don't want to spend time on a boring routine, automate it and stop spending.

Since my VPS is deployed without any containers-n-shit, my solution will be simple too: bash + mysqldump + gzip + rsync + crontab + a second VPS as a reserve remote storage.

At this second you may have already figured out what I'm going to do, and decided to close the tab, because even your grandfather didn't do backups like this. Well, I don't give a shit and will tell you about my solution anyway, attaching a gist that's ready to use.

Initial situation

Basically all we have and need is:

  • a project consisting of
    • project files
    • a database (MariaDB/MySQL)
  • a burnt chair
  • a reserve server with
    • a lot of free space
    • ssh access to it


First we must set up an SSH connection between the production and reserve servers. It takes about two minutes. I mean, I didn't time it, but it's as easy as it gets.

On the production server:

$ ssh-keygen
$ cat ~/.ssh/id_rsa.pub

The public key will be printed; copy it somewhere.

On the reserve server:

$ echo "<publickey>" >> ~/.ssh/authorized_keys

Now try to connect from prod to reserve via ssh:

$ ssh <user>@<reserve-host>

If you connected without a password, great: we're almost done.

The next step is to make a database dump and archive the project files. For the first operation we just use mysqldump piped into gzip; tar -zcf handles the second one.

$ mysqldump -q <parameters> <dbname> | gzip > <path/to/file>.sql.gz
$ tar -zcf <projname>.tar.gz <path/to/proj>

Now we have two archives that must be sent to the remote server. Use rsync or scp to ship them to the reserve safe place.

But first you need to check that these backups are recoverable. For the DB dump this can be done with:

$ zcat <path/to/file>.sql.gz | mysql -u <dbuser> -p <dbname>

And for project sources:

$ tar -xzf <projname>.tar.gz

It's enough to do this once, or periodically, as you wish. If everything is okay, send the archives.

Remote storage just adds some confidence that we always have good backups. But we will also still keep copies locally. Relying on local copies alone would be very risky, but sometimes (almost never, to be honest) they are useful for quick access, or (more realistically) for retrying the upload later.

File structure will be based on local date and time:

directory mask:  %Y.%m.%d
file mask:       %H.%M
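
Put into bash, building those paths might look like the sketch below (the helper names and the `/backup` root are my assumptions, not taken from the post's gist):

```shell
# Build backup paths from the current local time.
# day_dir/time_tag are illustrative helper names.

day_dir()  { date +%Y.%m.%d; }   # one directory per day, sortable by name
time_tag() { date +%H.%M; }      # time of day baked into the file name

BACKUP_ROOT="/backup"            # assumed location, adjust to taste
SNAP_DIR="$BACKUP_ROOT/$(day_dir)"
DB_FILE="$SNAP_DIR/db_$(time_tag).sql.gz"

echo "$DB_FILE"                  # e.g. /backup/2022.07.07/db_14.30.sql.gz
```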

On the remote server we should create the same directory right after the local one:

$ mkdir -p /backup/$(date +%Y.%m.%d)
$ ssh <user>@<reserve-host> "mkdir -p /backup/$(date +%Y.%m.%d)/"

This way we can easily see when each snapshot was taken and sort the list by name alone.

Well, it seems everything is set up; all that remains is to send the archives:

$ # in fact, both variants are identical:
$ rsync --progress <filename> <user>@<reserve-host>:<path>/<filename>
$ scp <filename> <user>@<reserve-host>:<path>/<filename>

It is reasonable to send the database backup first: data usually changes more often than files. Then we send the tarball with project sources: if the project is under git and well documented, restoring it is not a big deal even if rsync/scp halted due to a bad connection or something. Last, I send the log file with the history of all operations: it's a useful file, of course, but it's okay if it doesn't get through.

Anyway we can send them later to the same or another reserve storage.

This is a ready-to-use script:
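
The gist itself isn't reproduced here, but a minimal sketch of such a script could look like this. All variable names, the file layout, and the `run_backup` function are my illustration of the steps above, not the actual gist; fill the variables in for your own setup:

```shell
#!/usr/bin/env bash
# Sketch of a backup script: dump the DB, archive the sources,
# and ship both to a reserve server over ssh. Illustrative names only.
set -euo pipefail

# --- fill these in for your setup ---
DB_NAME=""; DB_USER=""; DB_PASS=""
PROJ_DIR=""                       # path to project files
REMOTE=""                         # <user>@<reserve-host>
BACKUP_ROOT="/backup"             # same layout on both servers

run_backup() {
    local day now snap db_file src_file
    day=$(date +%Y.%m.%d)         # one directory per day
    now=$(date +%H.%M)            # time stamp in the file names
    snap="$BACKUP_ROOT/$day"

    mkdir -p "$snap"
    ssh "$REMOTE" "mkdir -p '$snap'"

    db_file="$snap/db_$now.sql.gz"
    src_file="$snap/src_$now.tar.gz"

    mysqldump -q -u "$DB_USER" -p"$DB_PASS" "$DB_NAME" | gzip > "$db_file"
    tar -zcf "$src_file" "$PROJ_DIR"

    # database first: it changes more often than the sources
    rsync --progress "$db_file"  "$REMOTE:$snap/"
    rsync --progress "$src_file" "$REMOTE:$snap/"
}

if [ -n "$DB_NAME" ] && [ -n "$PROJ_DIR" ] && [ -n "$REMOTE" ]; then
    run_backup
else
    echo "fill in the variables at the top first"
fi
```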

Fill in the needed variables with proper values and add a new crontab entry to run the backup, for example, every 6 hours:

0 */6 * * * <path/to/script>

You can also implement some additional functionality of your choice, but I think these points are important:

  • clean up old backups automatically on all storages:
    • local ones shouldn't be kept for long, because:
      • they are unreliable: keeping them on the prod server is bad practice, as it's a single point of failure;
      • they use disk space which is better spent on anything other than old data;
    • backups on the reserve server can be kept much longer, but its disk isn't limitless either;
  • log and compare archive hashes: I definitely want to know whether the local and remote archives are identical;
  • don't try to send backups that were never created.
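For the first two points, a rough sketch along the lines above (the retention periods, function names, and the `REMOTE` variable are my assumptions):

```shell
#!/usr/bin/env bash
# Sketch of automatic cleanup and local/remote hash comparison.
set -euo pipefail

LOCAL_KEEP_DAYS=2                 # local copies die fast
REMOTE_KEEP_DAYS=30               # the reserve server keeps them longer
REMOTE="${REMOTE:-}"              # <user>@<reserve-host>, assumed set elsewhere

# Remove per-day backup directories older than N days under a backup root.
prune() {                         # usage: prune <backup-root> <days>
    find "$1" -mindepth 1 -maxdepth 1 -type d -mtime +"$2" -exec rm -rf {} +
}

# True if a local archive and its remote copy have the same SHA-256.
same_hash() {                     # usage: same_hash <local-file> <remote-file>
    local l r
    l=$(sha256sum "$1" | cut -d' ' -f1)
    r=$(ssh "$REMOTE" "sha256sum '$2'" | cut -d' ' -f1)
    [ "$l" = "$r" ]
}

# prune /backup "$LOCAL_KEEP_DAYS"
# a similar prune over ssh would use REMOTE_KEEP_DAYS on the reserve side
```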
