I have a 56 TB local Unraid NAS that is parity protected against single drive failure, and while I think a single drive failing and being parity recovered covers data loss 95% of the time, I’m always concerned about two drives failing or a site-/system-wide disaster that takes out the whole NAS.
For other larger local hosters who are smarter and more prepared, what do you do? Do you sync it off site? How do you deal with cost and bandwidth needs if so? What other backup strategies do you use?
(Sorry if this standard scenario has been discussed - searching didn’t turn up anything.)
Not all data is equal. I back up the things I absolutely cannot lose and YOLO everything else. My love for this hobby does not extend to buying racks of hard drives.
True words of wisdom here from a self hosting perspective.
Same, my Unraid server is over 40TB but I only have ~1.5TB of critical data, being my Immich photos and some files. I have an on-site and an off-site Raspberry Pi, each with a 4TB NVMe SSD, for nightly backups.
Personally I deal with it by prioritizing the data.
I have about the same total size Unraid NAS as you, but the vast majority is downloaded or ripped media that would be annoying to replace, but not disastrous.
My personal photos, videos, and other documents, which are irreplaceable, only make up a few TB, which is pretty manageable to maintain true local and cloud backups of.
Not sure if that helps at all in your situation.
I have the data I actually care about in a RAIDZ1 array with a hot standby, and it is synced to the cloud. The rest (the vast majority) is in a RAIDZ5. If I lose it, I “lose” it. It’s recoverable if I decide I want it again.
You’ll think I’m crazy, and you’re not wrong, but: sneakernet.
Every time I run the numbers on cloud providers, I’m stuck with one conclusion: shit’s expensive. Way more expensive than the cost of a few hard drives when calculated over the life expectancy of those drives.
So I use hard drives. I periodically copy everything to external, encrypted drives. Then I put those drives in a safe place off-site.
On top of that, I run much leaner and more frequent backups of more dynamic and important data. I offload those smaller backups to cloud services. Over the years I’ve picked up a number of lifetime cloud storage subscriptions from not-too-shady companies, mostly from Black Friday sales. I’ve already gotten my money’s worth out of most of them and it doesn’t look like they’re going to fold anytime soon. There are a lot of shady companies out there so you should be skeptical when you see “lifetime” sales, but every now and then a legit deal pops up.
I will also confess that a lot of my data is not truly backed up at all. If it’s something I could realistically recreate or redownload, I don’t bother spending much of my own time and money backing it up unless it’s, like, really really important to me. Yes, it will be a pain in the ass when shit eventually hits the fan. It’s a calculated risk.
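Running that cloud-vs-drives math is easy to do yourself. Here’s a rough sketch; every price in it is an assumption for illustration (cloud at $5/TB/month, bulk drives at $15/TB), so plug in real quotes before deciding:

```python
# Rough cloud-vs-drives cost comparison. All prices are assumptions
# for illustration only; check current quotes before deciding.
def cloud_cost(tb: float, usd_per_tb_month: float, years: float) -> float:
    """Total cost of renting cloud storage over a period."""
    return tb * usd_per_tb_month * 12 * years

def drive_cost(tb: float, usd_per_tb: float, copies: int = 1) -> float:
    """One-time cost of buying enough drives for `copies` full copies."""
    return tb * usd_per_tb * copies

data_tb = 56
print(f"cloud over 5 years: ${cloud_cost(data_tb, 5.0, 5):,.0f}")
print(f"two sets of drives: ${drive_cost(data_tb, 15.0, copies=2):,.0f}")
```

The drives come out far cheaper over a typical drive lifetime, which is the whole argument for sneakernet; what the drives don’t buy you is automation or geographic redundancy without legwork.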
I am watching this thread with great interest, hoping to be swayed into something more modern and robust.
That is old-old-school. It works, though. You have to be a bit scheduled about it, to encompass current and future important data. IIRC AWS built a 100-petabyte storage container and a truck (Snowmobile) to haul it around, to basically do the same thing, just in much larger amounts.
Sneakernet crew here too. My work offsite backup is in my backpack. A few times per week I do a sync, which takes a few minutes, and take it home again. (The sync archives old versions of files, and the drive is encrypted.)
We tried several cloud-based solutions and they were all rather expensive or just plain hard to run to completion or both.
For me, I only back up data I can’t replace, which is a small subset of the capacity of my NAS. Personal data like photos, password manager databases, personal documents, etc. get locally encrypted, then synced to a cloud storage provider. I have my encryption keys stored in a location that’s automatically synced to various personal devices and one off-site location maintained by a trusted party. I have the backups and encryption key sync configured to keep n old versions of the files (where the value of n depends on how critical the file is).
Incremental synchronization really keeps the bandwidth and storage costs down, and the amount of data I’m backing up makes file-level backup a very reasonable option.
If I wanted to back up everything, I would set up a second system off-site and run backups over a secure tunnel.
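The keep-n-versions retention part is simple enough to roll yourself. A minimal sketch; the timestamped naming scheme is made up for illustration, and real tools (borg, restic, rclone) have this built in:

```python
# Toy keep-the-newest-n-versions pruning logic. The "name.YYYYMMDD"
# naming convention is invented for this example.
def versions_to_delete(versions: list[str], keep: int) -> list[str]:
    """Given version names that sort oldest-to-newest, return the ones to prune."""
    if keep <= 0:
        raise ValueError("keep must be positive")
    ordered = sorted(versions)  # oldest first
    return ordered[:-keep] if len(ordered) > keep else []

backups = ["vault.20240101", "vault.20240201", "vault.20240301", "vault.20240401"]
print(versions_to_delete(backups, keep=2))  # prints the two oldest
```

In practice you’d vary `keep` per file class, exactly as described above: a higher n for the password database, a lower n for bulk documents.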
I’ve been following this post since the first comment.
And I have just put together my own RAID1 1TB NAS. I didn’t think that 1TB would serve me forever, more like “a good start”.
But the numbers I’ve been seeing in here… you guys are nuts 😆
Well, first: while RAID is great, it’s not a replacement for backups. RAID is mostly useful when uptime is imperative, but it does not protect against user errors, software errors, filesystem corruption, ransomware, or a power surge killing the entire array.
Since uptime isn’t an issue on my home nas, instead of parity I simply have cold backups which (supposedly) I plug in from time to time to scrub the filesystems.
If an online drive dies I can simply restore it from backup and accept the downtime. For my anime I have just one single backup, but for my most important files I have 2 backups, just in case one fails. (Unfortunately both on-site.)
On the other hand, for a client of mine whose server uptime is imperative, in addition to RAID I have 2 automatic daily backups (ideally one should be off-site but isn’t; at least they are on different floors of the same building).
I have 3 main NASes:
- 78TB (52TB usable) hot storage, RAIDZ1
- 160TB (120TB usable) warm storage, RAIDZ2
- 48TB (24TB usable) off-site, ZFS mirror
I rsync every day from hot to off site.
And once a month I turn on my warm storage and sync it.
Warm and hot storage is at the same location.
Off site storage is with a family friend who I trust. Data isn’t encrypted aside from in transit. That’s something else I’d like to mess with later.
Core vital data, about 10TB of it, is sprinkled around different continents. I have 2 nodes in 2 countries for vital data. These are with family.
I think I have 5 total servers.
Cost is a lot obviously, but pieced together over several years.
The world will end before my data gets destroyed.
But would your data survive a nearby gamma-ray burst?
Amateurs not keeping at least one backup off-planet SMH
I put a QNAP on the ISS. Expensive, but I sleep soundly.
I have a 120TB unraid server at home, and a 40TB unraid server at work. Both use 2 x parity disks.
The critical work stuff backs up to home, and the critical home stuff backs up to work.
The media is disposable.
Both servers then back up to Crashplan on separate accounts - work uses the Australian server on a business account, home uses the US server on a personal account.
I figure I should be safe unless Australia and the US are nuked simultaneously… At which point my data integrity is probably not the most pressing issue.
why is your work stuff at home and why is your personal stuff at work ಠ_ಠ
Yeah, I guess it probably makes more sense when it’s my business… Maybe not if you’re an employee at some corporation randomly hosting backups of your dog photos.
I dunno. At a big company they probably won’t notice an extra TB of storage cost… So long as you’re discreet with the transfers.
With another large NAS.
In a different location
Well, I personally have about 50TB, with one local copy and one remote copy, but I’m very lucky to have access to old enterprise storage.
What are your recovery needs?
It’s OK to take 6 months to back up to a cloud provider, but do you need all your data to be recovered in a short period of time? If so, cloud isn’t the solution; you’d need a duplicate set of drives nearby (but not close enough for the same flood, fire, etc.).
But if you’re OK waiting for the data to download again (and check the storage provider’s costs for that specific scenario), then your main factor is how much data changes after that initial first upload.
Sorry - shortly after posting this and the initial Q&A I left for a trip.
I could definitely wait those time periods for a first backup and a restore, since I assume it’ll be a once-in-10-years situation at worst. Data changes after the first upload should be slow enough to keep up.
No worries, I don’t have a time limit on responses 😉
But… it took me something like ~3 days to get an initial backup done.
Then ~3 years later I was at a different provider doing the same thing.
What I did do differently was to split the data into different backup pools (i.e. photos, music, work, etc.) rather than 1 monolithic pool… that’ll make a difference.
That does make sense - it also matches how I have currently separated my files, so it’s a valuable idea. Thanks!
Backblaze offers unlimited data for a single computer at $99/year.
There might be some fine print that excludes your setup but might be worth investigating.
Only Windows (maybe Mac).
Wine or there is a Docker container that runs the Backblaze client.
Yeah, people have done workarounds and stuff to get their entire NAS backed up but those seemed sketchy and bad when I looked into it.
If you break their TOS, you’ll likely lose your data. So… be careful. Mind you, I haven’t read their TOS, so I don’t know if those workarounds are breaking it.
Oh shit.
Honestly, I’d buy 6 external 20TB drives and make 2 copies of your data on them (3 drives each), then leave them somewhere safe but not at home. If you have friends or family able to store them, that’d do, but a safety deposit box is also good.
If you want to make frequent updates to your backups, you could attach them to a Raspberry Pi and put it on Tailscale, then just rsync changes regularly. Of course that means wherever you’re storing the backup needs room for such a setup.
I often wonder why there isn’t a sort of collective backup sharing thing going on amongst self hosters. A sort of “I’ll host your backups if you host mine” sort of thing. Better than paying a cloud provider at any rate.
That NAS software company Linus (of Linus Tech Tips) funded has a feature like this planned, I think.
An open-source standalone implementation would be dope as hell. Sure, it’d mean you’d need to double your NAS capacity (since you’d have to provide as much storage as you use), but that’s way easier than building a second NAS and storing/maintaining it somewhere else, or constantly paying for and managing a cloud backup.
Such a system would need a strict time limit for restoration after a catastrophe. Otherwise leeching would be too easy.
That’s an incredibly good point. Bad actors are the worst. Some ideas:
- Maybe you’d need to contribute your storage capacity +10% (or more), to account for your own and others’ downtime during disasters.
- A time limit after disasters would be necessary. It’s difficult to think of a proper time limit though, as even a month might not be enough time if your entire house burns down.
- Maybe a payment system could be set up where, if your server doesn’t ping for a week, your credit card is automatically charged (after pinging you with many emails). Sure, that’d suck, but it’d be better than losing your data, and cheaper overall than paying for cloud backups. I’m not sure where that money would go. Maybe distributed to those who didn’t experience a disaster, or maybe to the software project, though that would mean people are profiting from a disaster. Maybe it could go to a charity of your choice or something.
Definitely a difficult problem to solve. I’m sure people smarter than me have ideas beyond mine.
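The “charge after a week of silence, with warning emails first” part is at least straightforward to express. A sketch of the dead-man-switch classification; the one-week threshold is the one proposed above, the two-day warning window and everything else is invented:

```python
from datetime import datetime, timedelta

WARN_AFTER = timedelta(days=2)    # start emailing the member (assumed value)
CHARGE_AFTER = timedelta(days=7)  # the one-week threshold proposed above

def member_state(last_ping: datetime, now: datetime) -> str:
    """Classify a member by how long their node has been silent."""
    silence = now - last_ping
    if silence >= CHARGE_AFTER:
        return "charge"  # bill the card, keep hosting their backup
    if silence >= WARN_AFTER:
        return "warn"    # send reminder emails
    return "ok"

now = datetime(2024, 6, 10)
print(member_state(datetime(2024, 6, 9), now))  # ok
print(member_state(datetime(2024, 6, 7), now))  # warn
print(member_state(datetime(2024, 6, 1), now))  # charge
```

The hard part isn’t this logic, it’s everything around it: who runs the billing, how disputes work, and how you verify a peer actually still holds your data.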
> A time limit after disasters would be necessary. It’s difficult to think of a proper time limit though, as even a month might not be enough time if your entire house burns down.
And also accounting for low-bandwidth connections… what’s more, some shitty providers even have monthly data caps.
> Maybe a payment system could be set up where, if your server doesn’t ping for a week, your credit card is automatically charged (after pinging you with many emails).
Yeah, that would be almost a necessary feature: being able to hold on to the backup when you really can’t restore yet.
I use the AWS S3 Deep Archive storage class, at $0.001 per GB per month. But your upload bandwidth really matters in this case; I only have a subset of the most important things backed up this way, as otherwise it would take months just to upload a single backup. Using rclone sync instead of just uploading the whole thing each time helps, but you still have to get that first upload done somehow…
I have a complicated system where:
- borgmatic backups happen daily, locally
- those backups are stored on a btrfs subvolume
- a Python script will make a read-only snapshot of that volume once a week
- the snapshot is synced to s3 using rclone with --checksum --no-update-modtime
- once the upload is complete the btrfs snapshot is deleted
I’ve also set up encryption in rclone so that all the data is encrypted and unreadable by AWS.
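Since the weekly step is a Python script anyway, the snapshot-sync-cleanup sequence can be sketched like this. All paths and the remote name are placeholders; the rclone flags are the ones mentioned above:

```python
import subprocess

# Placeholder paths/remote - adjust for your own layout.
BACKUP_VOL = "/mnt/backups"          # btrfs subvolume holding the borgmatic repos
SNAP = "/mnt/backups-snap"           # where the read-only snapshot goes
REMOTE = "s3crypt:my-backup-bucket"  # rclone crypt remote layered over S3

def pipeline_cmds() -> list[list[str]]:
    """Build the snapshot -> sync -> delete-snapshot command sequence."""
    return [
        ["btrfs", "subvolume", "snapshot", "-r", BACKUP_VOL, SNAP],
        ["rclone", "sync", "--checksum", "--no-update-modtime", SNAP, REMOTE],
        ["btrfs", "subvolume", "delete", SNAP],
    ]

def run_pipeline(dry_run: bool = True) -> None:
    for cmd in pipeline_cmds():
        if dry_run:
            print(" ".join(cmd))  # just show what would run
        else:
            subprocess.run(cmd, check=True)  # abort the chain on first failure

run_pipeline(dry_run=True)
```

Snapshotting first means rclone uploads a frozen, consistent view of the backups even while borgmatic keeps writing new ones.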
It is cheap as long as you don’t need to restore your data. Downloading data from S3 costs a lot. OP asked about 56TB of storage, for which data retrieval would cost about $4.7k.
https://aws.amazon.com/s3/pricing/ under data transfer
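That ballpark figure roughly checks out if you run S3’s tiered internet egress pricing. The tier prices below are assumptions taken from the pricing page at one point in time, and Deep Archive retrieval fees come on top of egress:

```python
# Tiered S3 internet egress estimate. Tier prices are assumptions
# (verify on the pricing page); Deep Archive retrieval fees are extra.
TIERS = [
    (10 * 1024, 0.090),   # first 10 TB per month, $/GB
    (40 * 1024, 0.085),   # next 40 TB
    (100 * 1024, 0.070),  # next 100 TB
]

def egress_cost(gb: float) -> float:
    """Walk the pricing tiers and sum the cost for `gb` of egress."""
    total, remaining = 0.0, gb
    for tier_gb, price in TIERS:
        chunk = min(remaining, tier_gb)
        total += chunk * price
        remaining -= chunk
        if remaining <= 0:
            break
    return total

print(f"${egress_cost(56 * 1024):,.2f}")  # in the ballpark of the figure above
```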
I’m aware, but I myself have < 3TB and if I actually need it I’ll be more happy to pay. It’s my “backup of last resort”, I keep other backups on site and infrequently on a portable HDD offsite.
Don’t do this. It’s a god damn nightmare to delete
How so? I can easily just delete the whole s3 bucket.
Maybe I’m thinking of glacier. It took months trying to delete that.
A second offsite NAS (my old one) with the same capacity for the larger files
Backblaze B2 and a Hetzner storage box for Really Important stuff.
Okay Mr. Money Bags
It’s literally a Raspberry Pi 3B+ and a USB hard drive in a plastic storage box at my parents’ house 😅
Tape or Backblaze.