Ohh you gave me an idea. Is B2 pricing competitive? If I tried B2 with Arq and didn’t like it, does BackBlaze have a way to “take over” that backup from B2 so I don’t have to start from scratch, or is the format all different?
I have been using BackBlaze for some time - I followed a TidBits recommendation after CrashPlan ended up being a non-option. I have a lot of image files and I recall that the initial backup did take some time, but I was expecting that.
Since then BackBlaze just works away in the background backing up four sets of enclosures on a reasonably continuous basis. I do not notice any impact on the operations of my Macs. I am based in Australia and so there is a bit of distance between the servers and me.
If I have done a reasonably serious project, then I might check BackBlaze soon afterwards to make sure the backups are there. Otherwise I give myself a weekly reminder to go into BackBlaze and check for the recency of certain files. BackBlaze also sends an email notification to advise how close it is to 100% backup.
I did have a major crash issue and was grateful to BackBlaze as I was able to recover almost every file.
If your AWS bill is very small, don’t worry about the cost details I listed. I doubt Arq has to make any actual data retrieval requests as part of backing up data so Glacier restore times don’t matter. If Arq needs anything, it’s just metadata, which is just an API call away.
Backblaze B2 is price competitive, $0.005/GB/month.
Why did you switch from Wasabi, which is $0.0059/GB/month with no API or egress fees, to AWS? Did your backup bill go up or down?
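For a rough sense of how the two per-GB rates quoted above compare, here is a minimal Python sketch. It only multiplies stored gigabytes by the rates mentioned in this thread; the backup sizes are hypothetical, and it deliberately ignores API fees, egress fees, and any minimum charges a provider may apply, so treat it as back-of-the-envelope arithmetic rather than a real bill.

```python
# Per-GB monthly storage rates as quoted in this thread (USD).
B2_RATE = 0.005
WASABI_RATE = 0.0059


def monthly_storage_cost(stored_gb: float, rate_per_gb: float) -> float:
    """Storage-only cost for one month; ignores API, egress, and minimum charges."""
    return stored_gb * rate_per_gb


if __name__ == "__main__":
    for gb in (250, 1000, 3000):  # hypothetical backup sizes
        b2 = monthly_storage_cost(gb, B2_RATE)
        wasabi = monthly_storage_cost(gb, WASABI_RATE)
        print(f"{gb:>5} GB  B2 ${b2:.2f}  Wasabi ${wasabi:.2f}")
```

At these rates the storage-only difference is under a dollar per terabyte per month, so fees outside this calculation are likely to matter more than the headline rate.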
B2 and Cloud Backup are separate services but it wouldn’t hurt to ask.
Thanks for your reply.
You’re much more confident than I am. If Arq does a data verification of some type, it may need more than a byte size to do it. It may hash the data itself. I don’t know. But it may require actually fetching the data to validate it against the source file. Maybe that is slow?
I switched because it appeared cheaper. And it’s hard to compare because my volume goes up and down as I’ve been cleaning house. But in general, I used to get bills from Wasabi for about $15-19/mo, and AWS is more like $1.
I am sure Arq’s backup format is completely different from BackBlaze’s. The B2 repositories created and used by Arq are formatted just like the ones Arq uses for a local backup. I am sure that they are proprietary and that BackBlaze uses their own proprietary storage method. (Plus my B2 is encrypted and I am going to guess that Arq and BackBlaze use a similar but different enough encryption algorithm.)
AWS storage is extremely durable, so there’s not really any reason for Arq to double-check file integrity after the initial upload. The AWS API for uploading involves hashing data parts on the client, which are confirmed on the server side, so Arq wouldn’t need to create their own.
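To illustrate the kind of client-side hashing described above, here is a minimal Python sketch, not Arq's actual code. It splits data into parts and computes, for each part, a Base64-encoded MD5 digest, which is the format of the HTTP `Content-MD5` header that S3 can check against the bytes it receives. The function names and the tiny part size are assumptions for illustration only; whether Arq sends this header or uses a different checksum is not something this thread establishes.

```python
import base64
import hashlib


def content_md5(part: bytes) -> str:
    """Base64-encoded MD5 digest, the format of the HTTP Content-MD5 header."""
    return base64.b64encode(hashlib.md5(part).digest()).decode("ascii")


def split_into_parts(data: bytes, part_size: int) -> list[bytes]:
    """Cut data into consecutive chunks of at most part_size bytes."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


if __name__ == "__main__":
    payload = b"example backup data " * 4   # stand-in for file contents
    for number, part in enumerate(split_into_parts(payload, 32), start=1):
        # A client would send this value alongside the part; the server
        # recomputes the digest and rejects the upload on a mismatch.
        print(number, len(part), content_md5(part))
```

The point is that corruption in transit is caught at upload time, without the client having to download the data again later to compare it.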
Wow, that’s great! In that case, I wouldn’t worry about the costs related to early deletion of files from Glacier storage; they don’t seem significant enough to matter. Which leaves the original question: why Arq causes noticeable performance issues on your computer and why the cleanup stage takes as long as it does.
Yes I’m sure. But Arq supports many types of destination, not just AWS, including local folders; and I doubt they have destination-specific logic that disables verification, eh? But neither is there anything in the GUI describing a verification phase at all, so I’m just speculating.
Maybe I have no choice but to start all over either with a different storage class or else BackBlaze.
I had a similar issue on Windows 10 after upgrading from Arq 5 to Arq 7: backups that would finish in minutes on Arq 5 would take days to complete on Arq 7.
I tried completely reinstalling Arq 7 and starting fresh. Same problem. Arq support wasn’t willing to work to solve the problem (they said they couldn’t do anything unless I told them why it wasn’t working and showed them exactly how to recreate it). So I went back to Arq 5 and it reverted to the previous performance: backups complete in minutes.
The point is there is something different in Arq 7’s backup engine that results in atrocious performance in some cases.
My suspicion (which I did tell to Arq support!) is that it is due to Microsoft OneDrive’s use of “Files on Demand”, which is where the file system is manipulated to have a kind of placeholder entry for files that are in the OneDrive cloud but not currently on the computer. OneDrive is using NTFS reparse points for this, and it uses them even when the file actually is stored locally. I have some evidence that this fools backup programs into thinking that every file has changed even when it hasn’t.
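For anyone who wants to check whether a folder contains files carrying reparse points, here is a minimal Python sketch. It relies on the Windows attribute bit `FILE_ATTRIBUTE_REPARSE_POINT` (0x400), which Python exposes on Windows through `st_file_attributes`. The function names are made up for illustration, and this only counts flagged files; it says nothing about how Arq or any other backup program actually decides that a file has changed.

```python
import os

# Windows file attribute bit that marks a reparse point.
FILE_ATTRIBUTE_REPARSE_POINT = 0x400


def has_reparse_point(attrs: int) -> bool:
    """True if the attribute bitmask has the reparse-point bit set."""
    return bool(attrs & FILE_ATTRIBUTE_REPARSE_POINT)


def count_reparse_points(root: str) -> int:
    """Count files under root whose attributes include a reparse point."""
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # unreadable entry; skip it
            # st_file_attributes exists only on Windows; treat as 0 elsewhere.
            attrs = getattr(st, "st_file_attributes", 0)
            if has_reparse_point(attrs):
                count += 1
    return count
```

Running `count_reparse_points` on a OneDrive folder and on an ordinary folder would show whether the two differ in the way suspected above.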
Do you have something like this on your Mac? OneDrive, or Apple’s Desktop/Documents redirection into iCloud, or something like this for Google Drive? Maybe Arq 7 has the same problem on macOS as it does on Windows.
Thanks so much for sharing this.
I believe the “upgrade” from 5 to 7 also introduced this problem, though I don’t have an easy way to prove it. So your data point is helpful!
I don’t use OneDrive or Google Drive (other than the cloud version). I do use Dropbox. But again, that was all true before I upgraded to Arq 7. I think their algorithm has changed.
I just set up a new AWS bucket and configured a fresh backup with “Standard” as the storage class. (I don’t think a new bucket was necessary; it appears I just need to add the same local folders with a new storage class; but I just want to keep them separate so I can just delete the other bucket if the new one turns out to be faster).
We’ll see how long it takes to give me a full backup and then I’ll check what the incrementals look like…
I believe that the upgrade from 5 to 6 introduced thinning - that thinning wasn’t a thing with 5. I’m wondering if the performance issue you’re seeing is because of backup set thinning?
Perhaps you didn’t see my earlier posts where I do suspect the thinning of being the culprit. The logs seem to suggest that (see my OP).
Thinning is important. But you have to be able to do it efficiently.
Would love to hear from @yevp about whether BB handles file retention / thinning or other activities in the data center rather than relying on all the processing to be done remotely by the desktop client?
I was a long time CrashPlan user and advocate for other friends to use it also. I switched to Backblaze for my iMacs and have been very happy. Never have I had any indication that the client running on macOS (through several versions, now on macOS 11) is slowing other processes. Those friends have also switched to BackBlaze and I’ve had no gripes from them either.
I also use B2 for storing snapshots of files that I expect to stay unchanged; like tax returns and records, my Photo library, backups made from other systems.
My restores have all been just individual files; no very large restores that have required BackBlaze to send me a storage device.
So, I followed my own theory about the Glacier Deep Archive storage class of S3 being the culprit here, and started all over with the “Standard” storage class at Amazon S3.
It took a couple days to do a full backup, of course. But overall, it seems that the backup durations are much briefer. I can’t provide numbers, though, because I had been playing around with the backup frequency; and if you do more frequent backups, you should expect them to run faster. But even when I was running them frequently to Glacier Deep Archive (or as frequently as I could), they took a long time. Now, hourly backups are running typically 1.5 to 2.5 hours, which is much better.
Still not sure about the performance impact on my Mac. I feel like I’ve still had slowdowns which were alleviated by pausing Arq. I do have “Throttle Disk I/O” on now, which was not always on before, partly because I wanted to make sure that wasn’t the cause of the slow backups.
So I’ll be keeping an eye on performance. And I won’t switch to BackBlaze for now, since I may have this under control.
Thanks for all the input!
I still have two business computers backing up to Crashplan for $10 each. They are both under a terabyte. This article compares Backblaze and Crashplan, favoring Crashplan: