ROTW: sorting in the Finder

Rant Of The Week. Consider a set of files named

1foo
02foo
3foo
04foo
5foo

If you sorted this list “by name”, you would expect

02foo
04foo
1foo
3foo
5foo

But NO! A change, maybe last year, makes the Finder interpret digits in the name as numeric values. So the Finder puts them in the order I showed first, by interpreting “02” as the value 2 instead of as a character sequence.

WTF, Apple? Which part of “name” do you not understand? It’s to be interpreted as a STRING, not as some bizarre algorithm that tries to convert part of the string to a number! (I ran into this when looking at a set of downloaded files, a monthly newsletter, where they changed their naming convention from 2 digit sequence numbers to 3 digit sequence numbers.)
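For anyone who wants to reproduce the two orderings outside the Finder, here is a minimal Python sketch. The `natural_key` helper is a hypothetical illustration, not Apple's actual algorithm: it only assumes that each run of digits is compared as an integer and everything else as plain text (no case folding or locale rules).

```python
import re

def natural_key(name):
    # re.split with a capture group puts the digit runs at odd indexes,
    # so those are converted to integers and the rest stays as text.
    parts = re.split(r"(\d+)", name)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

names = ["1foo", "02foo", "3foo", "04foo", "5foo"]

by_characters = sorted(names)                # strict character order
by_numbers = sorted(names, key=natural_key)  # digit runs compared as values

print(by_characters)  # ['02foo', '04foo', '1foo', '3foo', '5foo']
print(by_numbers)     # ['1foo', '02foo', '3foo', '04foo', '5foo']
```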

I think you are mistaken. Apple’s Finder has sorted strings of numbers as units for a very long time (since 2001). If you want the other kind of sorting, by each character, you need to do it in Terminal.

OK, but I sure consider that as “unnatural”…

See also: File System Details: The Finder: Filename Sorting Rules:

You’ve got a contrived example, but imagine this sequence of files:

File1.txt
File2.txt
File15.txt
File27.txt
File1234.txt
File2513.txt

This is a natural sorting for files with names like this. Why should users be forced to provide leading zeros on the numeric part of the name in order to make the files sort this way?

In contrast, a strict ASCII-sort would produce:

File1.txt
File1234.txt
File15.txt
File2.txt
File2513.txt
File27.txt

Which I think most people would find harder to use when looking for a file (or worse, a range of sequentially-numbered files).
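To make the leading-zero workaround concrete, here is a small Python sketch of what a user would have to do to get numeric order out of a strict character sort. The `pad_numbers` helper and the width of 4 are assumptions chosen for this example only.

```python
import re

def pad_numbers(name, width=4):
    # Left-pad every run of digits with zeros to a fixed width.
    return re.sub(r"\d+", lambda m: m.group().zfill(width), name)

names = ["File1.txt", "File2.txt", "File15.txt",
         "File27.txt", "File1234.txt", "File2513.txt"]

strict = sorted(names)
padded = sorted(pad_numbers(n) for n in names)

print(strict)
# ['File1.txt', 'File1234.txt', 'File15.txt',
#  'File2.txt', 'File2513.txt', 'File27.txt']
print(padded)
# ['File0001.txt', 'File0002.txt', 'File0015.txt',
#  'File0027.txt', 'File1234.txt', 'File2513.txt']
```

The padded names sort numerically under a plain character sort, but only as long as no number outgrows the chosen width.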

Continuing…

The Apple algorithm gets more interesting if you have multiple numeric fields. Each group of digits is sorted numerically, with the intervening characters sorted as Unicode. So imagine a set of files with timestamps encoded into the filename:

File_YYYY-MM-DD_HH-mm-ss.sss

So you might have a file like “File_2024-04-18_12-36-25.123”, encoding April 18, 2024, 12:36:25 and 123 ms.

A strict ASCII sort will work to sort these files by timestamp, only if you use leading zeros. But many people consider it unnatural to use leading zeros for day, month and hour fields. So you may find files like:

File_2024-4-3_10-23-47.488
File_2024-4-18_12-36-25.123
File_2024-5-2_9-30-12.478
File_2024-5-2_13-17-11.917
File_2024-5-14_1-51-55.034
File_2024-10-2_0-08-42.572

With the Apple scheme, the above sequence of files will sort by the dates encoded in the filename. With strict ASCII, you’ll need to include leading zeros for all the fields that have single-digit values in order to get this sorting order.
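Here is a rough Python sketch of the multi-field case. It is not the Finder's real implementation: the hypothetical `natural_key` below simply compares each run of digits as an integer and the text between them character by character, which is enough to reproduce the behavior described for these names.

```python
import re

def natural_key(name):
    # Digit runs land at odd indexes after the split; compare them as integers.
    parts = re.split(r"(\d+)", name)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

names = [
    "File_2024-10-2_0-08-42.572",
    "File_2024-5-2_13-17-11.917",
    "File_2024-4-18_12-36-25.123",
    "File_2024-5-14_1-51-55.034",
    "File_2024-4-3_10-23-47.488",
    "File_2024-5-2_9-30-12.478",
]

for name in sorted(names, key=natural_key):
    print(name)   # chronological: 4-3, 4-18, 5-2 (9h), 5-2 (13h), 5-14, 10-2

for name in sorted(names):
    print(name)   # strict order starts with the October file
```

Note that the millisecond field only compares correctly here because it always has three digits.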

NOT a contrived example. Here’s a snapshot of the actual filenames:
[Image: Screenshot 2024-04-19 at 13.34.34]

p.s. I’ve worked on multiple ISO and IEEE standards committees in my career. I’ve noticed that a lot of the time they don’t “standardize existing practice” but rather standardize what they think the practice should be. The last one I worked on was taken over by a couple of people who had very perverse views of the world, and who pretty much drove the rest of the participants out because we got so tired of explaining time and time again why what they wanted was not ‘existing practice’ or ‘was already provided by the standard’ or ‘contradicted other parts of the standard.’ Of course, if you can write an incomprehensible standard, there’s money to be made in consulting explaining how to use it.

I think the logical thing to do would be to allow users to choose which sort they want. Default it to the current method, but allow users to choose for any particular folder to switch it to strict alphanumeric sort.

One thing this would dramatically help with is sorting file names that contain hexadecimal numbers. Under the current sort algorithm, the digits 0–9 are sorted as numbers, but A–F are sorted as letters, making hex numbers sort completely bonkers. Allowing the user to sort folders containing such filenames by strict alphanumeric would alleviate this.
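To illustrate the hex problem, here is a Python sketch with made-up filenames (`img_009.raw` and friends are hypothetical, as is the `natural_key` helper, which only approximates a digits-as-numbers sort). It assumes fixed-width, uppercase hex fields, which is the case where a strict character sort gives the right answer.

```python
import re

def natural_key(name):
    # Only the decimal digits 0-9 are grouped into numbers; A-F stay text.
    parts = re.split(r"(\d+)", name)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

# Hex values 0x009, 0x00A, 0x010, 0x01F, 0x0FF, 0x100
names = ["img_100.raw", "img_00A.raw", "img_0FF.raw",
         "img_009.raw", "img_01F.raw", "img_010.raw"]

strict = sorted(names)
digits_as_numbers = sorted(names, key=natural_key)

print(strict)
# ['img_009.raw', 'img_00A.raw', 'img_010.raw',
#  'img_01F.raw', 'img_0FF.raw', 'img_100.raw']
print(digits_as_numbers)
# ['img_00A.raw', 'img_0FF.raw', 'img_01F.raw',
#  'img_009.raw', 'img_010.raw', 'img_100.raw']
```

The strict sort follows the hex values; the digit-grouping sort splits "00A" into the number 0 followed by the text "A", which scrambles the sequence.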

The fundamental problem with Apple’s current sort algorithm is that it assumes that consecutive digits in a filename should always be treated as a numerical quantity. This assumption is as wrong as assuming that they should always be treated as individual characters—for some purposes, the former is correct, and for others, the latter. For those of us with lengthy backgrounds in computers, the latter is what we usually expect, and we tend to choose filenames that reflect that. (For example, when I include a date in a filename, I always use a YYYY-MM-DD format, which will always sort correctly by date under either method.)
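A quick Python check of the YYYY-MM-DD claim, using hypothetical `report_…` filenames and the same kind of simplified digits-as-numbers key (an approximation, not Apple's code). Because every field is zero-padded to a fixed width, both methods agree.

```python
import re

def natural_key(name):
    # Compare digit runs (odd indexes after the split) as integers.
    parts = re.split(r"(\d+)", name)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

names = ["report_2024-10-02.txt", "report_2024-04-18.txt",
         "report_2023-12-31.txt", "report_2024-04-03.txt"]

strict = sorted(names)
digits_as_numbers = sorted(names, key=natural_key)

print(strict == digits_as_numbers)  # True
print(strict[0], strict[-1])        # report_2023-12-31.txt report_2024-10-02.txt
```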

Of course, offering such an option to users would require Apple to admit that users should have a choice in such things, which runs counter to Apple’s current interface practices.

Do you know if Windows offers the kind of option you suggest? I once thought that third-party apps like Pathfinder were a fix for Mac users, but I’m unsure whether the current versions of that include it.

Completely disagree. This is one of my favourite features of Apple’s platforms (and as @tom3 notes, it’s not a recent change; it’s been integral to my file naming since I started using Mac OS X). Numbers are not strings, they have meaning, and MacOS/iOS treat them as such. It’s part of having an intelligent interface, not one that just transfers the computer’s internal representation to the user and forces them into strange behaviours like padding out numbers with 0s. It’s kind of crazy that other OS’s still don’t do this – 05, 06, 07, 1, 10, 101, 2, 3, 4 is not a natural or sensible sort order!

It looks like Windows OS has done the same as MacOS since XP:

I’ve had similar problems with wanting to keep the original file name somehow, but the sorting being wrong because a naming scheme was changed. Two methods I’ve used previously:

  1. Quick way; simply folder them into separate folders named something relevant:
    eg. “Series 1” and “Series 2”, or perhaps “2000-2015” and “2016-present”, or similar.

  2. Just resort to adding my own extra string bit to help the sorting, but keeping the original title in the string too for searching and reference, as it often solves most problems. So in your example, something like:

2014-07 - 53.MRH14-07-Jul2014-LE.pdf
2014-08 - 54.MRH14-08-Aug2014-LE.pdf
2014-09 - 55.MRH14-09-Sep2014-LE.pdf
2014-10 - 56.MRH14-10-Oct2014-LE.pdf

2023-03 - 052.RE-2023-03-W-2.pdf
2023-04 - 053.RE-2023-04-W.pdf
2023-05 - 054.RE-2023-05-W-2.pdf
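A small Python sketch of the second method, in case it helps. The `with_sort_prefix` helper is hypothetical, and it assumes you supply the year and month yourself rather than parsing them out of the original name.

```python
def with_sort_prefix(original_name, year, month):
    # Zero-padded "YYYY-MM - " prefix; the original name is kept intact after it.
    return f"{year:04d}-{month:02d} - {original_name}"

renamed = [
    with_sort_prefix("052.RE-2023-03-W-2.pdf", 2023, 3),
    with_sort_prefix("56.MRH14-10-Oct2014-LE.pdf", 2014, 10),
    with_sort_prefix("53.MRH14-07-Jul2014-LE.pdf", 2014, 7),
]

for name in sorted(renamed):
    print(name)
# 2014-07 - 53.MRH14-07-Jul2014-LE.pdf
# 2014-10 - 56.MRH14-10-Oct2014-LE.pdf
# 2023-03 - 052.RE-2023-03-W-2.pdf
```

Since the prefix is fixed-width, the result sorts chronologically whether digits are treated as numbers or as characters.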


EDIT: I should say that I prefer dots to dashes, as despite it not being ISO standard, it allows me more options to show “from-to” info using the dash (eg. “2014.10-2015.03” as a folder/file name). So in your case I’d be using:
2014.07 - 53.MRH14-07-Jul2014-LE.pdf
2014.08 - 54.MRH14-08-Aug2014-LE.pdf

2023.03 - 052.RE-2023-03-W-2.pdf
2023.04 - 053.RE-2023-04-W.pdf

As others have said, this behavior has been there for a long time. I think it is really cool and superior to a simple alpha sort.

At work, we recently implemented title sorting based on the Unicode Collation Algorithm, just like Apple does. That is what our customers and product folk want. We have a large search corpus, over 2X bigger than Google, and the sorting works just fine.

I’ve been using Unix for 45 years and Macs for 30. I have no problem switching back and forth between Unix ls and the Finder.

Adding sorting options to the Finder would increase the QA expense to satisfy almost nobody. Sorry to dismiss your rant, but this is the first time I’ve heard someone complain about the Finder sorting.

I fail to see much increase in QA expense from adding a single option to the “sort by” menu, an option that is already the default on the Terminal side. All the code for that sort is already present; it’s just a matter of allowing it to be applied to individual folders. Add it, test it, and once it works properly, it’s done.

You could even make it a hidden option that’s available only by setting a flag via Terminal, to reduce the support issues from people who might choose it accidentally. Again, this is all just combining things that already exist in the code—you don’t have to implement any new algorithms or functions to enable this.

Because for the most part, those of us who are bothered by it know that complaining about it in a forum like this is wasted effort, shouting into the ether. We’ve been around the Apple block enough times to know that the chance of Apple ever bothering to implement something like this is minuscule.

There are various deficiencies in the Finder. For instance, it doesn’t understand regular expressions. In such cases, I go into Terminal and use the underlying Unix commands to sort, etc.