Auditing Free Drive Space: Where Have All the Gigabytes Gone?

Originally published at: Auditing Free Drive Space: Where Have All the Gigabytes Gone? - TidBITS

Curious about how Time Machine snapshots can supposedly prevent the space occupied by deleted files from being recovered right away, Adam Engst ran some tests and came away more confused than when he started.

I can’t explain what you saw, but I’ve seen enough in the past to believe that Apple’s GUI tools (including the Finder) are incapable of showing you the complete picture.

Old-school Unix utilities like the df command can give you a more complete picture. On a modern Mac with Time Machine snapshots, there are a lot of nearly identical volumes listed, since each snapshot on a Time Machine volume is mounted, but it’s not too bad if you just make the Terminal window very wide and scroll up to see the lines you care about:

$ df -h
Filesystem                                                    Size   Used  Avail Capacity iused       ifree %iused  Mounted on
/dev/disk1s1s1                                               1.8Ti   14Gi  864Gi     2%  553779 19538474981    0%   /
devfs                                                        198Ki  198Ki    0Bi   100%     684           0  100%   /dev
/dev/disk1s5                                                 1.8Ti  3.0Gi  864Gi     1%       3 19539028757    0%   /System/Volumes/VM
/dev/disk1s3                                                 1.8Ti  598Mi  864Gi     1%    3811 19539024949    0%   /System/Volumes/Preboot
/dev/disk1s6                                                 1.8Ti  4.4Mi  864Gi     1%      18 19539028742    0%   /System/Volumes/Update
/dev/disk1s2                                                 1.8Ti  981Gi  864Gi    54% 3517961 19535510799    0%   /System/Volumes/Data
map auto_home                                                  0Bi    0Bi    0Bi   100%       0           0  100%   /System/Volumes/Data/home
/dev/disk3s2                                                 3.6Ti  1.3Ti  2.3Ti    37% 3303636 39064833804    0%   /Volumes/Time Machine
com.apple.TimeMachine.2022-05-11-212100.backup@/dev/disk3s2  3.6Ti  847Gi  2.3Ti    27% 3282330 39064855110    0%   /Volumes/.timemachine/3B5AA177-B31B-4AF6-AE89-96F69A2BE834/2022-05-11-212100.backup
com.apple.TimeMachine.2022-05-18-093552.backup@/dev/disk3s2  3.6Ti  864Gi  2.3Ti    27% 3288855 39064848585    0%   /Volumes/.timemachine/3B5AA177-B31B-4AF6-AE89-96F69A2BE834/2022-05-18-093552.backup
...

Of note:

  • Some lines are not actually file systems, but ways to access device drivers or kernel data. In the above example these are the devfs file system (mounted at /dev, containing filenames that represent access points for device drivers) and the map auto_home file system (mounted at /System/Volumes/Data/home and, I think, used to represent certain kinds of remote network file systems that may auto-mount in an on-demand fashion).

  • The -h option tells the df command to display sizes in human-understandable units (e.g. Mi, Gi, Ti, etc.) instead of counts of bytes.

  • Notice that all the volumes on disk1 are identical size and have the same available space. This is because they all belong to a single APFS container and APFS volumes sharing a single container all share the same free space.

  • All of the Time Machine snapshot volumes show the same size and available space. This is because, like all other volumes in an APFS container, they share the same free space. Even though snapshots are read-only, they still have “free space”.
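As a rough illustration of the human-readable units mentioned above, here is a minimal Python sketch that formats byte counts with binary suffixes in the style of df -h. The human helper and its rounding rule are my own assumptions, not df's actual implementation; shutil.disk_usage is a standard-library call that reports total, used and free bytes for one mounted volume.

```python
import shutil

UNITS = ["Bi", "Ki", "Mi", "Gi", "Ti", "Pi"]

def human(n_bytes: int) -> str:
    """Format a byte count with binary (power-of-1024) suffixes, roughly like df -h."""
    value = float(n_bytes)
    unit = 0
    while value >= 1024 and unit < len(UNITS) - 1:
        value /= 1024
        unit += 1
    # Assumption: one decimal place below 10, none otherwise (approximates df's rounding).
    if unit > 0 and value < 10:
        return f"{value:.1f}{UNITS[unit]}"
    return f"{value:.0f}{UNITS[unit]}"

# Report the volume mounted at "/" in the same three columns df shows.
usage = shutil.disk_usage("/")
print("size", human(usage.total), "used", human(usage.used), "avail", human(usage.free))
```

Running this against each volume of one APFS container should show the same "avail" figure for all of them, for the reason given in the list above.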

Note, however, what is not present:

  • There is no information about snapshots that are not mounted, which means all the snapshots on your internal file system (unless you mount them).
  • There is no information about how much space will become available if a snapshot is deleted. This is because it’s impossible to compute without knowing the reference count on all of the disk blocks used by the files in the snapshot. Any block whose count is not 1 will not be freed when the snapshot is deleted.
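The reference-count point can be sketched with a toy model. This is only an illustration, assuming each snapshot or volume is a set of block numbers; the names and block layout are made up, and real APFS metadata is far more involved.

```python
from collections import Counter

def freed_by_deleting(snapshot: str, owners: dict[str, set[int]]) -> int:
    """Count blocks that would be released if `snapshot` were deleted."""
    # Reference count of a block = number of owners that still point at it.
    refcount = Counter(block for blocks in owners.values() for block in blocks)
    # Only blocks referenced by this snapshot alone (count == 1) are freed.
    return sum(1 for block in owners[snapshot] if refcount[block] == 1)

# Hypothetical layout: block numbers referenced by the live volume and two snapshots.
owners = {
    "live volume": {1, 2, 3, 6},
    "snapshot A": {1, 2, 3, 4},
    "snapshot B": {1, 2, 4, 5},
}

print(freed_by_deleting("snapshot A", owners))  # 0: every block is shared
print(freed_by_deleting("snapshot B", owners))  # 1: only block 5 is unique
```

In this model, snapshot A references four blocks but deleting it frees nothing, which is why the size of a snapshot says little about the space its deletion recovers.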

We know that some parts of the GUI include more than this in their count of “free” space. For example, the “About This Mac” window includes “purgeable” storage: space that the system can free up should it need to (probably including caches, temporary files, trashed files and local snapshots). That is why, on my system, it reports 934 GB free even though df reports 864 GiB (which is about 928 GB, not 934 GB):

Screen Shot 2023-02-24 at 15.48.24

The Finder also reports this 934 GB size, and says that there is about 6 GB (the difference) as “purgeable”:

Screen Shot 2023-02-24 at 15.51.15

Disk Utility, on the other hand, reports the actual free space (928 GB), since it doesn’t know about purgeable content.
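For anyone who wants to check the unit conversion, here is a small Python sketch of the arithmetic. It assumes df -h reports binary units (1 GiB = 2^30 bytes) and the GUI reports decimal units (1 GB = 10^9 bytes); the figures are the ones quoted above.

```python
GIB = 2**30   # binary gigabyte, the unit df -h labels "Gi"
GB = 10**9    # decimal gigabyte, the unit assumed for the GUI figures

df_avail_gib = 864      # "Avail" column from the df output
finder_free_gb = 934    # figure reported by About This Mac and the Finder

df_avail_gb = df_avail_gib * GIB / GB
print(f"{df_avail_gb:.1f} GB")                   # 927.7 GB
print(f"{finder_free_gb - df_avail_gb:.1f} GB")  # 6.3 GB
```

The remainder of roughly 6 GB matches the purgeable amount the Finder shows.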

It’s worth noting that in your screen captures, there is a lot of purgeable data, and it varies as you go:

  • The first image says 137 GB available, but 38 GB purgeable. So there’s really only 99 GB free.
  • The second image shows 146 GB free, with 47 GB purgeable. So the actual free space is exactly the same - 99 GB. Emptying the trash didn’t create any free space, but increased the amount that can be purged (presumably by deleting snapshots).
  • The Disk Utility image shows 100 GB free, which is about what I’d expect to see.
  • The fourth image shows 164 GB free, but with 47 GB purgeable. So now there is 117 GB free. I don’t know why the free space only rose by 18 GB when 36 GB (twice that) was actually purged by deleting the snapshot.
  • The final image shows about 160 GB free, with about 35 GB purgeable. So there is 125 GB of actual free space.
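The subtraction in the list above can be laid out in a few lines of Python. The labels are just my shorthand for the screenshots; the numbers are the available and purgeable figures quoted in each bullet, in GB.

```python
# (label, reported available GB, reported purgeable GB)
screenshots = [
    ("first image", 137, 38),
    ("second image", 146, 47),
    ("fourth image", 164, 47),
    ("final image", 160, 35),
]

# Actual free space = what the GUI calls available, minus what is merely purgeable.
actual = {label: available - purgeable for label, available, purgeable in screenshots}

for label, free in actual.items():
    print(f"{label}: {free} GB actually free")
# first image: 99 GB actually free
# second image: 99 GB actually free
# fourth image: 117 GB actually free
# final image: 125 GB actually free
```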

It’s still hard to understand, and I completely agree with your conclusion, but hopefully the numbers will make a little more sense.

Great details, and I agree that the purgeable number is one of the big wildcards in all this. I was hoping that, by working fairly quickly, the changes would reflect only what I was doing explicitly and not some background activity.

But at this point, who knows! If you sit and watch the Available and Used lines in the Get Info window for a drive, they’ll change constantly.

FWIW, some six hours later in the day, Get Info is now reporting 149 GB available, 34 GB of which is purgeable. So the purgeable number has barely changed, but 10 GB of space has disappeared since my final screenshot in the article, even though I’ve done nothing significant. Small amounts of email have come into Mimestream, and I’ve been writing in Google Docs, but certainly not 10 GB worth in either case.

It also occurred to me that you might be seeing the effect of duplicate (or partial-duplicate) files sharing blocks.

If some of those files were duplicated on an APFS volume, then they would be sharing the same blocks. So deleting one wouldn’t create any new free space and deleting both would only free half of the number you’d get by adding up their file sizes.
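Here is a toy Python model of that effect. It assumes a file is just a set of block numbers and that an APFS duplicate references exactly the same blocks as its original; the file names and block counts are invented for illustration.

```python
def blocks_in_use(files: dict[str, set[int]]) -> int:
    """Number of distinct blocks referenced by at least one file."""
    return len(set().union(*files.values())) if files else 0

original = set(range(100))                    # a 100-block file
files = {
    "movie.mov": original,
    "movie copy.mov": set(original),          # duplicate sharing every block
}

summed_sizes = sum(len(blocks) for blocks in files.values())  # 200, what adding file sizes gives
before = blocks_in_use(files)                                 # 100 blocks really in use

del files["movie copy.mov"]
after_one = blocks_in_use(files)    # still 100: deleting one copy frees nothing

del files["movie.mov"]
after_both = blocks_in_use(files)   # 0: deleting both frees 100 blocks, half of 200

print(summed_sizes, before, after_one, after_both)  # 200 100 100 0
```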

Thank you for this. I’ve been trying to understand why the available space on my M2 Mac mini seems to jump around unpredictably by many GB. I’ll try to stop thinking about it now.

By the way, is there any way to safely purge ‘purgeable’ space?

My understanding is that purgeable space is space that macOS can take back if it needs to, not something you can control manually.

Understood. My HD shows 25 GB of purgeable space. I’ll wait and see how much is given back as free space.

It won’t be freed up over time. It will be freed when you start running low/out of free space and macOS needs to make room for something.

IMO, if you are running so low on space that macOS starts purging content, then you probably should have started deleting stuff quite a while ago.

There’s plenty of space, hundreds of GB. Just like to keep track of what’s happening.

Having deprived us of progress bars, Apple now nominates “free space” as the last true source of randomness.

Instead of “Absolutely Perplexing Fluctuating Space” I’d uncertainly suggest you call it “Approximately Probable Fluctuating Space”. 😉

I didn’t actually assume the F really stood for Fluctuating, but that is me.

These issues are in play for me constantly, not necessarily because of snapshots, though those too, but because I’m more than happy to use and abuse my favorite APFS capability, copy-on-write, which enables me to, say, think nothing about duplicating a VM or a sparse bundle or an iMovie library or a Photos library before making changes to it, since those copies take up no additional space. It’s so convenient! I can operate safely without doubling the space I’m using! But it also means that disk math is impossible, because of course it appears that I have hundreds more GB in use than I actually do.

The part that feels frustrating to me is that, in theory, a utility could analyze APFS data structures and figure this stuff out – I mean, after all, the computer has to know at some level what’s a lightweight duplicate or snapshot versus a block not otherwise represented. But I’m unaware of any tool, even seven years later, that works beneath the APFS file level, other than, of course, fsck_apfs. I don’t know if this is because Apple hasn’t released sufficient technical detail about APFS to make it possible, or whether it’s just because it’s so complex that no one wants to make it, but it’s a bummer. We could use it.

Also problematic is that if you make use of these APFS features, they’re part of the largely non-copyable container they’re in, meaning what fits on one computer may well not fit on another drive of the same size. This was a problem for me when cloning one computer that had about 1.9 TB used on a 2 TB drive – Migration Assistant just refused. I ended up using the Legacy Clone feature of Carbon Copy Cloner, which utilizes a somewhat unreliable Apple-provided underlying container copying mechanism, but it’s hardly something I’d want to depend on. I’d at least think Migration Assistant ought to have the capability of figuring out what is real space occupied when moving from machine to machine, but, no.

As for whether it’s possible to intentionally purge purgeable space – DaisyDisk does this, and in general has long been a tremendously excellent tool for managing disk space. It’s right up there with Carbon Copy Cloner as far as essential Mac tools go for me. I don’t think it has any way of knowing about what’s an APFS copy-on-write clone and what isn’t, but it at least helps me figure out how much space is hidden (meaning, snapshots and other stuff outside the immediate file system), and what the big fish are – or at least big fish candidates – when I’m looking to clear some space up.

Yeah, as soon as you get into any kind of behavior like this, the APFS Uncertainty Principle kicks into play big time. I was just revisiting an article Howard Oakley wrote about disk images, and how you can create a non-sparse image only to have it converted to a sparse image, but depending on what you do with it, it can revert to being non-sparse again.

I imagine that an APFS audit tool could be created, but it might be of relatively limited utility, given that the Fluctuating part of APFS would render whatever number it reports quite variable. It might also take a fair amount of time to run, as it evaluates every sparse image and duplicated block and snapshot to see how much data is really in use, such that it might not be accurate even by the time it finished.

Ah, I missed this one from Howard. Thanks for invoking it – fascinating reading.

YMMV, but I’d be more than happy to let something like this run all night to yield only ballpark accuracy! My question is whether Apple has even released enough information about APFS to make such an audit possible. Or to allow a program like Carbon Copy Cloner to theoretically recreate the internal APFS relationships on a target volume, so that it ends up with roughly the same space used as the source volume, without having to unreliably block-copy the entire container. (And if not, couldn’t at least Migration Assistant do that? Or an expansion of “diskutil apfs” or “cp”?)

What about a tool to analyze your drive to utterly maximize disk space by identifying redundant true copies of things and turning them into lightweight APFS clones, or sparseify anything that can be sparseified? It would just be cool if the doors were open for devs to give power users tools to do powerful things, rather than one’s file storage being an opaque, unknowable, dynamic organism. Though, of course, Apple has been trending away from that for quite some time in most domains. (Sometimes you get something new and surprising though, like Shortcuts.)

I don’t know if I really am capable of going full zen on my disk space, as much as I’d like to; regardless, I don’t have much choice but to accept the mystery.

Yeah, this drives me nuts! After being a Mac consultant for over 33 years, I’ve become increasingly critical of Apple’s continual disregard for ease of use and understanding for their non-technical users.

This got critical for me on a previous Mac that had a lot of media files on it, when I tried to install an OS update and was told “not enough free space.” That caused me to do a dive (not as deep as Adam or David C.) and learn about local Time Machine backups, and how to purge them. So this can be more than just an ‘academic question’.

And to The Mac Doctor: I’ve always been very pissed off at Apple’s “an error has occurred” (and there’s not a damn thing you can do about it) attitude towards error logging, etc. I’m not sure which is worse, “An error has occurred” or “Error 4231” with NO WAY to find the meaning for that particular error code. Sometimes I have been able to spend some time (hours) digging through console logs, etc., to figure out what went wrong. As macOS gets increasingly complex (a lot of that due to paranoid security), things are much more likely to break in strange (and often not repeatable/Heisenbug) ways.

Thanks for this coverage! It’s enough to validate that I’m not crazy or an idiot since I can’t figure out what appears to be simple math… calculating free drive space.

Besides COW, another way that storage is oversubscribed is Time Machine’s use of hard links, if I recall correctly. Hard links are additional directory entries in a filesystem pointing to the same inode. The data will only free up when its link count drops to zero. Until then, some ways of counting consumed space may double-count such storage.
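A minimal Python sketch of the hard-link double-counting issue, run in a throwaway temporary directory. The file names and the apparent_and_unique_bytes helper are made up for illustration; os.link and the st_nlink, st_dev and st_ino stat fields are standard on Unix-like systems.

```python
import os
import tempfile

def apparent_and_unique_bytes(directory: str) -> tuple[int, int]:
    """Sum file sizes naively, and again counting each inode only once."""
    apparent = 0
    unique = 0
    seen = set()
    for entry in os.scandir(directory):
        info = entry.stat(follow_symlinks=False)
        apparent += info.st_size
        key = (info.st_dev, info.st_ino)   # identifies the underlying inode
        if key not in seen:
            seen.add(key)
            unique += info.st_size
    return apparent, unique

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "backup-1.dat")
    with open(original, "wb") as handle:
        handle.write(b"x" * 4096)
    # Second directory entry pointing at the same inode.
    os.link(original, os.path.join(tmp, "backup-2.dat"))

    link_count = os.stat(original).st_nlink
    apparent, unique = apparent_and_unique_bytes(tmp)
    print(link_count, apparent, unique)  # 2 8192 4096
```

The naive total counts the same 4096 bytes twice, while deduplicating by inode gives the amount of data actually stored.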

But I’m probably off topic now 😅

I think that the use of APFS snapshots in Time Machine is instead of hard links. Can’t remember where I read that, but I’m pretty sure hard links were the best that could be done on HFS+, but snapshots are a more robust and elegant solution for Time Machine incremental backups.

Aha, thanks for that! Yeah, I learned what I learned from a white paper when it first came out, I think.

More off topic, but I’m not a fan of APFS. It seems to be nothing but trouble. Drives I cannot boot from or back up, all kinds of ghost images floating around in utilities that display drives, corrupt file systems that cannot be repaired…

We explored ZFS ages ago and abandoned it, right? I wonder why that was… ZFS has emerged as the standard elsewhere, such as for Proxmox virtualization.