Performance of Backblaze vs. Arq Backup

Ohh, you gave me an idea. Is B2 pricing competitive? If I tried B2 with Arq and didn’t like it, does Backblaze have a way to “take over” that backup from B2 so I don’t have to start from scratch, or is the format completely different?

I have been using Backblaze for some time; I followed a TidBITS recommendation after CrashPlan ended up being a non-option. I have a lot of image files, and I recall that the initial backup did take some time, but I was expecting that.

Since then, Backblaze just works away in the background, backing up four sets of enclosures on a reasonably continuous basis. I do not notice any impact on the operation of my Macs. I am based in Australia, so there is a fair distance between the servers and me.

If I have done a reasonably serious project, I might check Backblaze soon afterwards to make sure the backups are there. Otherwise I give myself a weekly reminder to go into Backblaze and check the recency of certain files. Backblaze also sends an email notification advising how close it is to a 100% backup.

I did have a major crash and was grateful to Backblaze, as I was able to recover almost every file.

If your AWS bill is very small, don’t worry about the cost details I listed. I doubt Arq has to make any actual data-retrieval requests as part of backing up, so Glacier restore times don’t matter. If Arq needs anything, it’s just metadata, which is an API call away.

Backblaze B2 is price-competitive at $0.005/GB/month.

Why did you switch from Wasabi, which is $0.0059/GB/month with no API or egress fees, to AWS? Did your backup bill go up or down?
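For a rough sense of scale, here’s a quick sketch comparing monthly storage costs at the B2 and Wasabi rates quoted above. The S3 figures are my own assumptions (approximate list prices), and none of this includes API or egress charges:

```python
# Rough monthly storage-cost comparison at the per-GB rates quoted in
# this thread. The two S3 rates are assumptions (approximate list
# prices), and API/egress charges are deliberately ignored.
RATES_PER_GB_MONTH = {
    "Backblaze B2": 0.005,
    "Wasabi": 0.0059,
    "S3 Standard": 0.023,                 # assumed
    "S3 Glacier Deep Archive": 0.00099,   # assumed
}

def monthly_cost(gigabytes: float) -> dict:
    """Estimated monthly storage cost per provider for a given data size."""
    return {name: round(rate * gigabytes, 2)
            for name, rate in RATES_PER_GB_MONTH.items()}

for name, cost in monthly_cost(1000).items():
    print(f"{name}: ${cost}/mo for 1 TB")
```

At 1 TB, B2 works out to about $5/month and Wasabi to about $5.90/month, so the storage rates alone don’t explain a big bill difference; request and egress fees would.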

B2 and Cloud Backup are separate services, but it wouldn’t hurt to ask.

Thanks for your reply.

You’re much more confident than I am. If Arq does data verification of some type, it may need more than a byte count to do it. It may hash the data itself; I don’t know. It may even require actually fetching the data to validate it against the source file. Maybe that is slow?

I switched because it appeared cheaper. And it’s hard to compare because my volume goes up and down as I’ve been cleaning house. But in general, I used to get bills from Wasabi for about $15-19/mo, and AWS is more like $1 :sweat_smile:

Arq’s backup format, I am sure, is completely different from Backblaze’s. The B2 repositories created and used by Arq are formatted just like the ones Arq uses for a local backup. I am sure they are proprietary, and that Backblaze uses its own proprietary storage method. (Plus my B2 data is encrypted, and I’d guess Arq and Backblaze use similar but different-enough encryption schemes.)


AWS storage is extremely durable; there’s not really any reason for Arq to double-check file integrity after the initial upload. The AWS upload API involves hashing data parts on the client, which are confirmed on the server side, so Arq wouldn’t need to implement its own verification.
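For what it’s worth, the client-side check S3’s upload API supports is just a base64-encoded MD5 digest of each part, sent as the `Content-MD5` header; S3 rejects the upload if its own hash of the received bytes doesn’t match. A minimal sketch of computing that value:

```python
# Sketch of the client-side integrity value the S3 upload API accepts:
# a base64-encoded MD5 digest of the part's bytes (Content-MD5 header).
# If the server-side hash of the received data doesn't match, the
# upload fails, so the client knows the part arrived intact.
import base64
import hashlib

def content_md5(data: bytes) -> str:
    """Base64-encoded MD5 digest, as S3 expects in the Content-MD5 header."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")

part = b"example backup data"
print(content_md5(part))
```

So verification is effectively a side effect of the upload itself; no separate read-back pass is needed.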

Wow, that’s great! In that case, I wouldn’t worry about the costs of early deletion of files from Glacier storage; they don’t seem significant enough to matter. Which leaves the original questions: why Arq causes noticeable performance issues on your computer, and why the cleanup stage takes as long as it does.

Yes, I’m sure. But Arq supports many types of destination, not just AWS, including local folders, and I doubt they have destination-specific logic that disables verification. But neither is there anything in the GUI describing a verification phase at all, so I’m just speculating.

Maybe I have no choice but to start all over, either with a different storage class or with Backblaze.

I had a similar issue on Windows 10 after upgrading from Arq 5 to Arq 7: backups that would finish in minutes on Arq 5 would take days to complete on Arq 7.

I tried completely reinstalling Arq 7 and starting fresh. Same problem. Arq support wasn’t willing to work on the problem (they said they couldn’t do anything unless I told them why it wasn’t working and showed them exactly how to recreate it). So I went back to Arq 5, and performance reverted: backups complete in minutes.

The point is there is something different in Arq 7’s backup engine that results in atrocious performance in some cases.

My suspicion (which I did share with Arq support!) is that it is due to Microsoft OneDrive’s “Files On-Demand” feature, where the file system is manipulated to hold a kind of placeholder entry for files that are in the OneDrive cloud but not currently on the computer. OneDrive uses NTFS reparse points for this, and it uses them even when the file actually is stored locally. I have some evidence that this fools backup programs into thinking that every file has changed even when it hasn’t.
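As a sketch of what a backup tool could check for: on Windows, reparse points show up in `os.stat()` results via the file-attribute flags. This hypothetical helper (not anything Arq actually does, as far as I know) reports False on platforms where the attribute doesn’t exist:

```python
# Hypothetical check for the OneDrive "Files On-Demand" placeholders
# described above. They are implemented as NTFS reparse points, which
# Python exposes on Windows through st_file_attributes. On macOS/Linux
# that field doesn't exist, so this simply reports False.
import os
import stat

def is_reparse_point(path: str) -> bool:
    """True if the path is an NTFS reparse point (Windows only)."""
    st = os.lstat(path)
    attrs = getattr(st, "st_file_attributes", 0)  # 0 on non-Windows
    return bool(attrs & stat.FILE_ATTRIBUTE_REPARSE_POINT)
```

A backup tool that treats any reparse point as “changed” would re-examine every OneDrive-managed file on every run, which would match the symptoms here.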

Do you have something like this on your Mac? OneDrive, Apple’s Desktop/Documents redirection into iCloud, or something similar for Google Drive? Maybe Arq 7 has the same problem on macOS as it does on Windows.

Thanks so much for sharing this.

I believe the “upgrade” from 5 to 7 also introduced this problem, though I don’t have an easy way to prove it. So your data point is helpful!

I don’t use OneDrive or Google Drive (other than the cloud version). I do use Dropbox. But again, that was all true before I upgraded to Arq 7. I think their algorithm has changed.

I just set up a new AWS bucket and configured a fresh backup with “Standard” as the storage class. (I don’t think a new bucket was necessary; it appears I just needed to add the same local folders with a new storage class. But I want to keep them separate so I can simply delete the other bucket if the new one turns out to be faster.)

We’ll see how long it takes to give me a full backup, and then I’ll check what the incrementals look like…

I believe that the upgrade from 5 to 6 introduced thinning; thinning wasn’t a feature in 5. I’m wondering if the performance issue you’re seeing is due to backup-set thinning?

Perhaps you didn’t see my earlier posts where I do suspect the thinning of being the culprit. The logs seem to suggest that (see my OP).

Thinning is important. But you have to be able to do it efficiently.
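To illustrate the concept, here’s a minimal sketch of a grandfather-father-son style thinning policy. This is not Arq’s actual algorithm, just an example of what thinning a backup set means:

```python
# Illustrative backup-set thinning (NOT Arq's actual algorithm):
# keep every snapshot from the last day, the newest snapshot per day
# for the last month, and the newest snapshot per ISO week beyond that.
from datetime import datetime, timedelta

def thin(snapshots: list[datetime], now: datetime) -> list[datetime]:
    keep = set()
    seen_days, seen_weeks = set(), set()
    for ts in sorted(snapshots, reverse=True):  # newest first
        age = now - ts
        if age <= timedelta(days=1):
            keep.add(ts)                        # keep everything < 1 day old
        elif age <= timedelta(days=30):
            if ts.date() not in seen_days:      # newest per calendar day
                seen_days.add(ts.date())
                keep.add(ts)
        else:
            week = ts.isocalendar()[:2]         # (year, ISO week)
            if week not in seen_weeks:          # newest per ISO week
                seen_weeks.add(week)
                keep.add(ts)
    return sorted(keep)
```

The decision itself is cheap; the expensive part is deleting the now-unreferenced data, which with deduplicated storage means re-walking reference counts, and with Glacier-class storage may also incur early-deletion fees.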

Would love to hear from @yevp about whether Backblaze handles file retention/thinning and other maintenance in the data center, rather than relying on all the processing being done remotely by the desktop client.

Dave,
I was a long-time CrashPlan user and advocated for other friends to use it as well. I switched to Backblaze for my iMacs and have been very happy. Never have I had any indication that the client running on macOS (through several versions, now on macOS 11) is slowing other processes. Those friends have also switched to Backblaze, and I’ve had no gripes from them either.

I also use B2 for storing snapshots of files that I expect to stay unchanged, like tax returns and records, my Photos library, and backups made from other systems.

My restores have all been of individual files; no very large restores that required Backblaze to send me a storage device.


So, I followed my own theory about the Glacier Deep Archive storage class of S3 being the culprit here, and started all over with the Standard storage class at Amazon S3.

It took a couple of days to do a full backup, of course. But overall, backup durations seem much shorter. I can’t provide numbers, because I had been playing around with the backup frequency, and if you do more frequent backups, you should expect them to run faster. But even when I was running backups to Deep Archive as frequently as I could, they took a long time. Now, hourly backups typically run 1.5 to 2.5 hours, which is much better.

Still not sure about the performance impact on my Mac. I feel like I’ve still had slowdowns which were alleviated by pausing Arq. I do have “Throttle Disk I/O” on now, which was not always on before, partly because I wanted to make sure that wasn’t the cause of the slow backups.

So I’ll be keeping an eye on performance. And I won’t switch to Backblaze for now, since I may have this under control.

Thanks for all the input!


I still have two business computers backing up to CrashPlan for $10 each. They are both under a terabyte. This article compares Backblaze and CrashPlan, favoring CrashPlan:

CrashPlan is aimed at small businesses and is not really viable for individual users, which is Backblaze’s market. CrashPlan has made that quite clear.

UPDATE:

  1. Performance - I don’t think this has been a problem, since I switched to Standard Storage Class at AWS. But I really don’t use my “main” Mac that much these days anyway, so I’m not sure I’m comfortable drawing too many conclusions about that.
  2. Price - This is a problem. The bills from AWS were high at first, but I thought maybe that was due to the initial upload. Then I got the latest monthly bill, and it was $65! Sorry, that’s a no-can-do.

I probably shouldn’t have deleted my Glacier Deep Archive data, but oh well. I was optimistic. I have now initiated a fresh Deep Archive-class backup.

Note: I didn’t realize at first that you can select the storage class on a PER-FOLDER basis. Not everything in the backup plan, or going to that destination, has to use the same storage class. That’s a handy level of flexibility, and Arq manages it rather nicely in its GUI.

So I guess this is an experiment in progress. You may not care, but hopefully diving into this mystery will benefit someone other than myself!


UPDATE (not that anyone is following or cares…)

My AWS monthly bill was getting over $30/mo, even using Glacier Deep Archive. I began to troubleshoot whether I had a ton of legacy junk using up space, but the AWS console is so intractable that I gave up.

I was going to head back to Wasabi, but frankly that interface is not thrilling either.

Then it hit me: I’m already a paying Dropbox customer. I have the 2TB plan because it’s the cheapest option that gives me essential features like Selective Sync and the cloud-only option, both of which are invaluable for having everything everywhere in a way that doesn’t kill my local storage. And Dropbox in general sits at the core of accessing my files from multiple Macs, multiple users, iPhones, iPads, etc.

But I was only using about 5% of that 2TB. To add my backups, I needed more, so I upgraded to the 3TB option for an incremental monthly amount. And I moved Arq to using Dropbox last week, and it caught up and seems to run faster than any of the previous options. Plus now, I’m using the space that I pay for, and the overall cost is back down where it should be. And it’s one less bill and account to worry about.

And I don’t have to deal with cryptic consoles, storage classes, etc.

So far so good.


Wow, that’s a change from the past. Dropbox (and OneDrive, which I’ve also used) was always much slower for me than B2 or AWS with Arq.

One other PITA about backing up to one of the syncing services with Arq is that you have to be awfully careful to remember to turn on Selective Sync when you set up the syncing app on a new Mac. Otherwise it takes forever to download/sync, and Arq creates a lot of files.

I guess I don’t back up a lot of data, because my AWS bill (for Glacier storage on Arq of my photo library and iTunes library and media) is about $2.50 a month and my B2 (for documents, mail folders, etc.) is about $2 a month.

I’ve recently switched to the idea of putting almost everything in Sync.com (I used to use Dropbox, but I just don’t love it on macOS) and almost nothing elsewhere in my home folder. I do back up the Sync.com library from my always-on Mac mini, as well as my iCloud Drive from there. I do have some stuff in iCloud Drive, but I also don’t love some of its limitations compared with Dropbox/OneDrive (and now Sync.com).

I’m beginning to get tired of all the apps that are starting to store data in the ~/Library/Containers folder these days. Such a pain when setting up Arq to back up those folders.

I completely agree. Storing data in ~/Library/Containers just continues to fill up my SSD, whether on my MacBook Pro or Mac Pro. I might look into Sync.com, as I’d never heard of it. I still have a CrashPlan subscription for my Mac Pro, but to be honest, I’ve never had more trouble navigating a site or figuring out how to do things than with CrashPlan. I’m getting out of it as soon as I can find out how. So I use Arq backup because Backblaze in the past got too bombastic with certain rules and backups and started charging me extra. That’s more likely a reflection of me not understanding their geeky language. Best, Patrick