Bill_MI (Bill In Michigan) Premium,MVM join:2001-01-03 Royal Oak, MI kudos:1
| reply to chrisretusn
Re: Why ever use a hard link? I got looking at this, thanks. Is this more for having last week and last month snapshot versions of a file always available within the same file system?
I see the advantage how the overwhelming number of files in these backups that don't change can be a tiny hardlink. But remote backups is pretty much rsync, right? |
|
leibold Premium,MVM join:2002-07-09 Sunnyvale, CA kudos:6
| Hardlinks are useful whenever you want to be able to preserve many versions of a changing filesystem with minimal storage requirements.
It doesn't mean that the links have to be on the original filesystem that you are backing up, you can use the hardlinks on a remote backup filesystem (using rsync or similar programs to create and maintain the remote copy).
The advantage of creating the hardlinks in the original filesystem is instant access to prior snapshots (you don't have to restore from your backup first). However, it is very important to keep in mind that this is no substitute for a proper backup! With hardlinks, any damage to the current file also affects all prior snapshots (since there is only one real file with many links to it). Also, the snapshots track adding/removing files correctly but not necessarily modifications to individual files. Unless you break the link and make a copy of the file first, modifying a file (depending on how that modification is made) will change the snapshot version too. Some applications (especially editors) always unlink the original and create a new, modified file, and therefore those programs work well with hardlink snapshot archiving.
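To make the in-place versus unlink-and-replace distinction concrete, here is a minimal Python sketch. The file names and the function name demo_hardlink_snapshot are made up for illustration, and it assumes the temporary directory sits on a filesystem that supports hard links.

```python
import os
import tempfile


def demo_hardlink_snapshot():
    """Compare an in-place edit with an editor-style save on a hard-linked file."""
    with tempfile.TemporaryDirectory() as d:
        current = os.path.join(d, "current.txt")    # hypothetical "live" file
        snapshot = os.path.join(d, "snapshot.txt")  # hypothetical snapshot name

        with open(current, "w") as f:
            f.write("version 1\n")
        os.link(current, snapshot)  # second name for the same inode, no data copied

        # In-place modification: both names still share one inode,
        # so the "snapshot" changes too.
        with open(current, "a") as f:
            f.write("appended in place\n")
        with open(snapshot) as f:
            after_in_place = f.read()

        # Editor-style save: write a new file, then rename it over the old name.
        # The old inode stays alive through the snapshot link.
        tmp = current + ".tmp"
        with open(tmp, "w") as f:
            f.write("version 2\n")
        os.replace(tmp, current)

        with open(snapshot) as f:
            after_replace = f.read()
        with open(current) as f:
            current_now = f.read()
        same_inode = os.stat(current).st_ino == os.stat(snapshot).st_ino

    return after_in_place, after_replace, current_now, same_inode
```

The first returned value shows the snapshot picking up the in-place append; the second shows it surviving the rename-over save unchanged, after which the two names no longer share an inode.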
If you are interested in snapshots you may also want to check out a new version of links: reflinks or cowlinks. They behave like a hardlink when you first create them (must be on the same filesystem, all links refer to the same allocated blocks of data in the filesystem) but become separate files when modified. Most filesystems do not support this, but ZFS, Btrfs, Reiser4 and Ext3cow (a modified version of Ext3) do. I have been meaning to experiment with this but haven't done it yet.
Edit: cow = copy on write (the data is initially shared and copied only at the time of writing/modifying). |
|
Bill_MI (Bill In Michigan) Premium,MVM join:2001-01-03 Royal Oak, MI kudos:1
| Thanks, leibold. The light came on with rsnapshot's words of "creating the illusion of multiple full snapshots". I was missing what the goal was. The use of hardlinks is now obvious. |
|
chrisretusn Retired Premium join:2007-08-13 Philippines kudos:1 | reply to Bill_MI
said by Bill_MI: I got looking at this, thanks. Is this more for having last week and last month snapshot versions of a file always available within the same file system?
Even though the name rsnapshot implies snapshots, I wouldn't call it that. The first backup is a full backup; each subsequent run basically copies the previous backup to the next backup set. In a nutshell: the first run creates daily.0. The second daily run copies daily.0 to daily.1, then runs rsync to apply the changes, and the result becomes the new daily.0. Unchanged files are hard links. The most recent backup is always daily.0. I back up across multiple file systems. I have rsnapshot configured for 7 daily, 4 weekly and 6 monthly backup sets.
said by Bill_MI: I see the advantage how the overwhelming number of files in these backups that don't change can be a tiny hardlink. But remote backups is pretty much rsync, right?
Both remote and local backups are done with rsync. I back up this system I am typing on to another hard drive in this computer. This computer also acts as a backup server: it pulls backups from remote computers and stores them on a hard drive in this computer. I normally do full system backups starting at root "/" and selectively exclude unnecessary directories and files. |
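The rotation described above can be sketched roughly like this in Python. This is not rsnapshot's actual code: rsnapshot drives rsync, while this sketch swaps in a plain content comparison, and the names take_snapshot, backup_root and keep are assumptions for illustration. It handles regular files only and ignores weekly/monthly sets, permissions, symlinks and exclude lists.

```python
import filecmp
import os
import shutil


def take_snapshot(source, backup_root, keep=7):
    """Rotate daily.N directories, then build a fresh daily.0 from source.

    Files unchanged since the previous run are hard-linked to the previous
    snapshot; new or changed files are copied.
    """
    def snap(n):
        return os.path.join(backup_root, f"daily.{n}")

    os.makedirs(backup_root, exist_ok=True)

    # Drop the oldest set, then shift daily.N to daily.N+1 (oldest first).
    if os.path.isdir(snap(keep - 1)):
        shutil.rmtree(snap(keep - 1))
    for n in range(keep - 2, -1, -1):
        if os.path.isdir(snap(n)):
            os.rename(snap(n), snap(n + 1))

    # Build the new daily.0, linking against daily.1 where content is identical.
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        new_dir = os.path.normpath(os.path.join(snap(0), rel))
        prev_dir = os.path.normpath(os.path.join(snap(1), rel))
        os.makedirs(new_dir, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            prev = os.path.join(prev_dir, name)
            new = os.path.join(new_dir, name)
            if os.path.isfile(prev) and filecmp.cmp(src, prev, shallow=False):
                os.link(prev, new)       # unchanged: just another name, no new data
            else:
                shutil.copy2(src, new)   # new or changed: real copy
```

Because changed files are written as new files rather than modified in place, the older daily.N sets keep their old contents, and files deleted from the source simply do not appear in the new daily.0.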
|
koitsu Premium,MVM join:2002-07-16 Mountain View, CA kudos:19 | FWIW, when I ran my co-lo for 18+ years, I went through many backup methods/models and ended up (very happily) using rsnapshot. On FreeBSD, dump and restore are really wonderful, but only when they don't deadlock, screw up, or piss off the kernel/underlying filesystem (which is the case today with SU+J, i.e. soft updates with journalling: you can't use dump on that any more due to SU+J design bugs that lock the system up).
rsync is used for the data transmission part (i.e. getting filesystem data off one machine and storing it on another) and the writing-to-the-disk part. You can use it across SSH (the default) or run rsyncd and use the native rsync protocol instead (which is often a better choice if you have lots of data, because even using SSH with an alternate cipher still makes SSH a network I/O bottleneck). We did use SSH and with a very specific set of settings (I had to test all the ciphers to see which was fastest, etc.).
rsnapshot just basically acts as a very intelligent wrapper (with config file support) around rsync.
We stored (at first) 30 days of backups, but as more and more customers began using more and more disk, by August 2012 we had cut that down to 12 days. Our backups were stored on a ZFS raidz1 pool consisting of three (3) 1TB disks, so we had 2TB of space to work with.
The advantage to rsnapshot is mainly ease of use when it comes to restoring from backups. I cannot tell you how much my users and few paying customers loved this thing, particularly since our backups were done nightly, across a dedicated gigE network, and stored on a server that exported the backups as an NFS filesystem. The boxes had an NFS mount (read-only) to filer:/backups/machinename, mounted locally as /backups, and one could do cd /backups ; cd daily.3 ; ls -l and get a full filesystem listing and treat that just as one would a full level 0 backup -- meaning that despite rsnapshot/rsync only sending over the differences, due to the use of hardlinks it behaves like a full filesystem dump no matter whether you're in daily.3 (4th most recent backup) or daily.0 (most recent backup). Customers never needed to call me if they needed to restore some data from a backup -- they could do it themselves.
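One way to see why every daily.N looks like a full backup while only the differences take up space is to count each inode once. A rough sketch, assuming the snapshot directories are readable locally; snapshot_usage is a hypothetical helper, not part of rsnapshot, and it looks at file sizes only (not allocated blocks or directory overhead).

```python
import os


def snapshot_usage(snapshot_dirs):
    """Return (apparent_bytes, unique_bytes) for the files under snapshot_dirs.

    apparent_bytes counts every directory entry, as if each snapshot were a
    full copy; unique_bytes counts each (device, inode) pair only once.
    """
    seen = set()
    apparent = 0
    unique = 0
    for top in snapshot_dirs:
        for root, _dirs, files in os.walk(top):
            for name in files:
                st = os.lstat(os.path.join(root, name))
                apparent += st.st_size
                key = (st.st_dev, st.st_ino)
                if key not in seen:  # hard links to the same file share this key
                    seen.add(key)
                    unique += st.st_size
    return apparent, unique
```

Called with something like the list of daily.0 through daily.3 directories, the gap between the two numbers is the storage the hard links are saving.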
Certain things were excluded from the backup list (such as /backups itself ;-) ), as well as things like /dev, /var/run, and lots of other places.
Anyway, sorry for rambling, but rsnapshot is somewhat hard to describe in text. It's a lot easier to sit down with an SA and show it to them. Once they see it in action it clicks almost immediately (once you bring up the hard links part) and they go "holy crap, that is SLICK".
You just gotta make sure you don't run out of space, as well as inodes, on your storage filesystem for the backups. :-) |
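A quick way to keep an eye on both limits from a script is os.statvfs (Unix only). A small sketch; the function name and the dictionary keys are made up here, and some filesystems (Btrfs, for example) allocate inodes dynamically, so the inode figures may not be meaningful there.

```python
import os


def free_space_and_inodes(path):
    """Report free bytes and free inodes available to an unprivileged user."""
    st = os.statvfs(path)
    return {
        "free_bytes": st.f_bavail * st.f_frsize,  # free blocks * fragment size
        "free_inodes": st.f_favail,               # inodes still available
        "total_inodes": st.f_files,               # total inodes on the filesystem
    }
```

Pointing it at the backup mount before each run and bailing out below some threshold is one simple guard, since every file in every snapshot needs a directory entry and every changed file needs a new inode.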
|
Bill_MI (Bill In Michigan) Premium,MVM join:2001-01-03 Royal Oak, MI kudos:1 | Oh yes, it IS quite slick! I do use an rsync script to snapshot selected directories - all semi-manually. I'll definitely be looking into rsnapshot as I only maintain a single snapshot with rsync now. |
|