Modern CSV Lets You Manipulate CSV Files Directly

Originally published at: Modern CSV Lets You Manipulate CSV Files Directly - TidBITS

If you don’t know what CSV stands for, stop right here. I’ll save you the time of even reading that it’s “comma-separated values” and is a funny, least-common-denominator format used to pass field-based data in rows—tabular data—between apps. For those who read “CSV” and thought, “Aha!” I want to introduce you to an excitingly useful new app.

If you’re only roughly familiar with CSV, let’s review the basics. A CSV file represents values for fields, like those used in a spreadsheet, as a set of items identically presented row after row in a text file. These fields have to be separated from one another on each row, or delimited, using characters that let a CSV parser know that they’re used for field division.

Rows are always delimited by a Return character. In modern macOS and Unix, a Return character is actually a Line Feed (LF), while Windows uses a Carriage Return/Line Feed (CR LF) combination. (These characters came into computing from typewriter technology, where a carriage return moved the carriage back to the starting point and a line feed advanced the paper.) I’m unaware of any controversy surrounding using Return characters to delimit rows.

Delimiting columns is trickier. Any character used to delimit fields could also appear in the data. In the data part of a column, you have to encode the delimiter character in a special way, such as enclosing it in quotation marks or preceding it with a backslash, among other techniques, a process called “escaping.”

For many years, I thought tab-delimited files (TSV or tab-separated values) ruled the roost. Because tabs are used primarily for formatting in text, they seemed to me to be much less likely to appear in tabular data. Using a tab limited the need for escaping in the data portion of a file. However, I think that’s changed. Many apps and services only offer CSV as an export option and list CSV as the only plain-text import supported. (Some of those also quietly support TSV even when they don’t use the terminology). I’m not a fan of CSV because so many characters require escaping—starting with the frequently used comma. To avoid escaping commas, CSV exports often add quotation marks around all the fielded portions—like "lastname, firstname", "phone"—but then you have to deal with quotation marks instead. But there’s no point in arguing for tabs or other, more obscure characters at this point: it seems that commas have won.

Despite its uptick in popularity, the CSV style of encoding tabular data isn’t new; it was first used by the IBM Fortran compiler in 1972. The names “comma-separated values” and “CSV” date to 1983 and the SuperCalc spreadsheet. It wasn’t until 2005 that CSV files were formally standardized with RFC 4180—perhaps getting a standard gave CSV the advantage over TSV in the market. Whether most products and Web apps conform accurately to this standard is unknown.

CSV files can be opened in any text editor and are human-readable, but few people look at raw CSV files because they’re primarily used as an interchange format for moving data between systems. To work with the data in a CSV file, you generally import it into a spreadsheet or database-driven system. In today’s world, most people work with tabular data in Microsoft Excel, Apple’s Numbers, or Google Sheets. BBEdit has a few helpful features for working with columnar data directly, but they are unusual among text editors.

Because of my event management work for the Finger Lakes Runners Club, I frequently handle CSV files. For races where I’m managing the timing system, I need to download registration data to merge onto bib number labels and print lists of runners so volunteers can give each runner the correct bib number, so the local Wilderness Search & Rescue team can identify which runners have passed which checkpoints, so clothing coordinators know who ordered race shirts, and for any number of other reasons. After a race, I often have to download race results from the timing system and tweak them before uploading the processed information to the club website.

Because CSV is an interchange format, I usually import files into a spreadsheet, make any necessary changes, and then save, print, or export for whatever my next step is. I usually use Excel for processing because it can save an opened CSV without a separate export step, which Numbers requires. Google Sheets would also require exporting and would clutter my Google Drive with temporary documents that I need only briefly.

Please don’t interpret my usage of Excel as an endorsement, though. I have a fractious relationship with Excel, particularly when working with running times, which spreadsheets treat like times of day and often reformat in weird ways. A 5:24 mile has nothing in common with 5:24 AM, so it’s essential to block any automatic date formatting. And don’t get me started about ZIP codes that start with a zero!

Might there be a better way of working with CSV files?

Introducing Modern CSV

I ran across an intriguing app earlier this year that has become my go-to tool for working with CSV files: Modern CSV. When you launch it, it looks like a spreadsheet, displaying data in rows and columns, but it doesn’t require that you write formulas to manipulate data. Instead, it has an extensive set of data manipulation capabilities that you apply directly to the contents of a CSV file. In essence, Modern CSV uses CSV as its native format and lets you choose common data transformation, conversion, concatenation, and other actions from menus instead of pre-parsing files in a text editor or building formulas and juggling results columns in a spreadsheet app. With CSV as the native file format, you skip all that to work directly with tabular data.

Modern CSV main window

I have to note that Modern CSV is a cross-platform app with support for macOS, Windows, and Linux. That’s great for those who shift between platforms, but it means that Modern CSV doesn’t look particularly Mac-like. You won’t have trouble using Modern CSV, but aspects of its interface are both unfamiliar and ugly. The speed and flexibility with which I can work with CSV files in Modern CSV outweigh such aesthetic considerations for me, so I have changed my settings in the Finder so that all .csv files open in Modern CSV. (Select a CSV file, choose File > Get Info, choose Modern CSV from the Open With pop-up menu, and click Change All under “Use this application to open all documents like this one.”)

To help you understand what Modern CSV offers, I’m going to walk you through a sampling of real files that I need to work with, nicely anonymized by asking ChatGPT to generate random names to replace the actual ones. Some of these actions are simple, others complex, and a few somewhat contrived, but overall, they’re the sort of tasks I need to accomplish regularly. Advanced features in Modern CSV that go beyond what I need include a high-performance read-only mode for huge files, support for multiple delimiters and line endings, transposition of columns and rows, export and import of JSON and XML files, and support for multiple character encodings.

It’s worth keeping in mind that most of what I find so helpful in Modern CSV can be accomplished in a spreadsheet. But unless you’re already an Excel or Numbers expert, figuring out how to perform some action efficiently in a spreadsheet may take longer than doing it inefficiently or in another app. Feature discoverability in spreadsheets is extremely low. So, if you notice me praising Modern CSV for something you know a spreadsheet can do, it’s likely because I didn’t know it was possible.

Modern CSV offers a solid set of features for free, but you can increase its functionality with the $39 Premium Personal or the $59 Premium Business licenses—both are one-time fees, not subscriptions. Whether you need to upgrade to a paid plan depends entirely on the transformations you commonly make. Although I tested the Premium Business version of Modern CSV, I think the Premium Personal version would suffice for my needs.

Sort Columns

I frequently want to sort columns numerically or alphabetically. Although this is simple enough in most spreadsheets, Modern CSV makes it even easier by sorting the entire table when you double-click a column header. Double-click it a second time to reverse the sort order.

Many CSV files have a header row describing each column’s contents. That’s helpful, and it’s often essential to retain it if you’re uploading the edited CSV file to a system that requires a header row to understand the column layout. However, having text like “Bib” at the top of a numeric column of bib numbers can sometimes confuse sorting unless the app handling knows that row is header information, as most spreadsheets do (or can be convinced to do).

Header information isn’t usually a problem because Modern CSV usually assumes the first line is a header and excludes it from sorting. If your CSV file contains multiple non-data rows or multiple internal tables, you may have to identify a particular row as a header. Control-click the desired row and choose Set Header Rows to Selected Row. You can do the same thing with columns.

As you’ll see in the Data > Sort menu, simple column sorting is just the start for Modern CSV. It can also sort rows and selected cells, and, for the truly persnickety, make all sorts case-sensitive.

Modern CSV Data > Sort commands

Delete Columns and Rows

My downloaded data usually contains numerous extra columns that get in the way. Although Modern CSV doesn’t have much of an advantage over spreadsheets here, deleting columns is a breeze. Just select the desired columns or rows by Shift- or Command-clicking, Control-click one, and choose Remove Column(s) or Remove Row(s). I do this often enough that I’ve memorized the Command-Shift-L and Command-Shift-K keyboard shortcuts.

However, Modern CSV takes deletion to new levels. With a single command from Data > Delete, you can remove duplicate or empty rows or columns, which could make cleaning up a file far quicker. Although the menu options are extremely wordy, their long names help you understand what they’ll do.

Modern CSV Data > Delete commands

Move Columns, Rows, and More

When developing a list of runners for bib pickup, I want to ensure the Bib column is first. It’s not always in that position to start, so I need to move it. To move a column in Modern CSV, select a column, press Shift, and then drag it to the new position. Moving rows is just as easy, but I usually rely on sorting for rows. While Numbers and Google Sheets don’t require pressing Shift, I find Excel’s interface maddening: you must select the column, move the pointer out of the column header area, position it over the edge of the column such that it turns into a hand, press Shift, and then drag the column. If you fail to press Shift, Excel warns that it will overwrite the data in the destination column, which is an insane bit of user interface.

What’s really cool, though, is that Modern CSV can also move selected cells around. If you use the Data > Move commands, Modern CSV shuffles the data to avoid leaving empty space; Shift-dragging the selected cells replaces the contents in the destination cells (indicated by highlighting). This capability could be helpful with CSV files with a lot of manual formatting that breaks across rows and columns, something I’ve seen occasionally in output from older systems. In Excel and Numbers, dragging cells allows only moving or copying to overwrite the destination.

Modern CSV Data > Move commands

Join, Split, and Trim Columns

Whenever you receive data with people’s names as a combination of first and last (and any initials or other pieces before, after, or in between), you know you’ll need it broken into first name and last name for the ultimate source. Likewise, if you get a CSV with first and last names in separate columns, you know you’ll have to combine them. It’s just a principle of the universe.

In my running work, I often need to save space on printed lists or results tables, so I like to combine separate first and last name columns. Doing that in a spreadsheet requires creating a new Full Name column, writing a formula that concatenates the two names with a space between them, and then filling the formula to the bottom of the sheet. Then, I have to select that entire column (excluding header data) and use a special paste operation to insert the formula’s results while severing the relationship with the original columns—only then can I delete the original columns.

Joining columns in Modern CSV involves much less work: select them and choose Data > String Manipulation > Join Selected Cells Horizontally. A dialog prompts you for what text you want to separate the joined data; for a name, it’s a space. Press Return and the joined data appears in the first column; the other one becomes empty.

Modern CSV joining column example

Modern CSV’s other string manipulation commands let you split and trim selected cells in various ways.

Modern CSV Data > String Manipulation commands

The trimming is beneficial for me because I often need to remove tenths of seconds from the end of times (turning 17:56.3 into 17:56, for instance). There’s no easy way to perform that operation on duration in a spreadsheet app, so in the past, I would have used a grep search in BBEdit to remove the tenths. It’s much easier to use the Keep Only Left Part of Selected Cells option in Modern CSV, entering -2 for the index to tell it to delete the two rightmost characters. Note that header rows are subject to string manipulation, so you must deselect them to avoid unwanted changes.

Modern CSV trimming example

One final note. When race results include times that are under and over an hour, such as 0:58:36 and 1:02:46, I want to remove the 0: from the under-hour times. I could use Modern CSV’s Find and Replace commands, but it’s even easier to select just the under-hour times and trim off the first two characters with Keep Only Right Part of Selected Cells.

Normalize Case

I’ll admit to a certain level of obsessiveness here, but as a publishing professional, I can’t stand publishing results with random capitalization, often the result of a Caps Lock key being left on. Fixing such mistakes is harder than it’s worth in a spreadsheet, but Modern CSV can normalize case easily.

Select some data (probably a column or two) and choose Data > Case > Title Case. (Yes, I realize that a few people prefer their names in a specific case, but no one has ever complained, and without such a complaint, I’m going to set the house style for my results.)

Other case options include UPPER CASE, lower case, UpperCamelCase, lowerCamelCase, UPPER_SNAKE_CASE, lower_snake_case, and Convert to Acronym, which produces an uppercase word using the first letter of each word.

Find and Replace

In addition to string manipulation commands, Modern CSV provides powerful find and replace capabilities for when you want to match and make changes instead of applying an operation across a range of cells. The buttons below the sidebar let you perform case-sensitive searches, regular expression searches, match only whole words, match only entire cells, search only within selected cells, highlight matches, and preview replacements.

Imagine I have a set of times where some contain a leading zero that I want to remove. In the screenshot below, I’m searching for the string 0: within selected cells and highlighting the matches. The red highlighted cells indicate that there are several unwanted matches. Modern CSV has at least two ways to remove just leading zeros and their colons. First, what I show below restricts the replacement to the selected cells (highlighted in blue). A second approach would be to use a regular expression search for ^0: since the ^ indicates the following characters must be at the start of a line.

Modern CSV Find/Replace example

Filter Columns

I often need to work with a subset of my data—everyone who purchased shirts, for instance. Sorting sometimes helps with that, but a better approach is to use a filter that restricts the visible rows to just those that match the filter. Modern CSV has a plethora of filtering commands that I don’t entirely understand yet, but the easiest one lets you select one or more cells in a column and then choose Data > Filter > Filter Selected Column(s) with Selected Cell Contents. That displays only the rows whose cells in the selected column match those you had selected.

Modern CSV Data > Filter commands

For a more visible and flexible way of doing that, choose Data > Filter > Select Values in Selected Column to Filter. That brings up the File Analysis window’s Unique Values tab. Along with showing statistics about the contents of the column, the window lets you select the desired items (everything but None, shown below) and then click the Filter Selected Values In button to show just the matching rows. (Clicking Filter Selected Values Out would show only the rows containing None.)

Modern CSV Unique Values pane

If all that’s more than you need, Modern CSV has simple Hide/Show commands that let you control what you see without deleting data or structure.

Other Features

It’s impossible to list all the Modern CSV features that you might find helpful. Others that I’ve appreciated on occasion include:

  • The status bar at the bottom of the window (see below) displays the file grid size and selected cell grid. Cell length is the number of characters in the current cell. The Sum and Mean fields report on those calculations if you select cells containing numbers.
  • If you click the tiny button in the lower-right corner of the Modern CSV window, it displays a File Metadata sidebar with helpful information and controls. For instance, if you’ve opened a tab-delimited file but need to save a CSV, you can change the delimiter in the Save Parameters section to a comma. This feature might even help me with one app—the crufty HyTek Meet Manager—that requires a semicolon as a delimiter.
    Modern CSV metadata sidebar
  • The Data > Add or Update ID Column command is an easy way to create a column of sequential numbers starting with 1. If you’d prefer to start with another number or increment by something other than 1, choose Data > Multi-cell > Fill Selected Cells with Series, and enter your starting number and increment value. The Data > Numbers commands also provide interesting options for filling columns with random numbers.
  • Those who work with huge CSV files might appreciate Modern CSV’s extensive set of commands for scrolling or jumping around in a file, like Go To > Go To Cell. The Go To menu also has commands for following hyperlinks, emailing selected addresses, searching the Web for selected text, and searching for places on a map.
    Modern CSV Go To Cell dialog

What about Printing?

It might seem as though Modern CSV has an answer to nearly every CSV-related need you could have. Unfortunately, there is one lapse: printing. I often need to print lists of runners for race coordination, and Modern CSV can’t print at all. To an extent, this isn’t entirely surprising since printing probably differs between macOS, Windows, and Linux.

As a result, I often find myself opening a CSV file downloaded from a race registration system in Modern CSV, quickly removing unnecessary columns and rows, sorting and editing appropriately, saving, and opening in Excel, which has fairly flexible printing options. I appreciate that Excel can print just selected cells and can scale a printout to a single page wide. (Google Sheets can print selected cells, too, but Numbers can’t.)

Although this lack of printing support reduces the utility of filtering and hiding data in Modern CSV, its File > Save Visible Data As command provides a workaround, albeit at the cost of creating another file to print externally.

Happily, when I asked Modern CSV’s developer about printing, he said it was at the top of his priority list and would be included in the next revision to the app, which is due in a few months. I may never launch Excel after that.

Despite its current inability to print and occasionally wacky cross-platform interface, Modern CSV is a powerful tool to have at your disposal. It may be too specialized for those who work with CSV files only occasionally, but for people like me whose Downloads folders are littered with CSVs, it’s easy to look past Modern CSV’s peculiarities.

5 Likes

Thank you for brining this app to light - It works well. Another reason to support TidBITS.

1 Like

I’ve found Easy CSV Editor (on the Mac App store) to be a nice Mac-like CSV editor as well, if you’re looking for another Excel alternative…

4 Likes

Easy CSV Editor looks great, but lack of any way to try before you buy makes it a hard sell.

How does it compare with Modern CSV?

I’ve been using Modern CSV to manipulate sales data from itch-io, data from Google Sheets (copy and paste), and with an action to convert the tabular data into markdown it even ends up on my blog.

2 Likes

… And in classic (pre-Unix) Mac OS, lines are delimited with a CR character only. So there are basically three kinds of text files:

  • CR - classic Mac and some legacy 8-bit operating systems
  • LF - Unix and Unix-like systems (including modern macOS)
  • CR-LF - Microsoft operating systems (DOS, Windows, etc.)

Robust text editors should be able to load text files with any of the three line-ending formats, identify the one used, and save changes using the format that the file originally used, not changing the format unless explicitly told to do so.

Unfortunately, there are still some editors that aren’t this smart and may only support one format. Most notable is the Windows Notepad app, which has never gotten universal line-ending support since its introduction in Windows 1.0. Trying to load a file that has the wrong format may result in errors, bad display (e.g. all text run together on one line) or may mangle the file when it is saved.

2 Likes

Not having to edit CSV data, I find FileMaker Pro excellent for any kind of CSV usage.

1 Like

Excel has three CSV export options:

  • CSV UTF-8 (Comma-delimited) (.csv)
  • Macintosh Comma-separated Values (.csv)
  • MS-DOS Comma-separated Values (.csv)

The Data Merge function in InDesign for example only seems to work with MS-DOS Comma-separated Values unless the other two CSV options are first further massaged in a text editor to get rid the UTF8 BOM (Byte Order Mark).

What does Modern CSV do regarding BOMs?

If you are looking for a CSV editor that can process multiple files, the smasi CSV-Wizard is worth checking out. In addition, you can combine several processing steps as a batch and call them up for repetitive tasks (e.g. for data migrations).

A free version for macOS and also Windows is available to try out (see download from Mac App Store).

That works for times where the tenth part is lower than 5, but if the time were 17:56.5 for example the correct shortened value would be 17:57. A spreadsheet can easily do that. When a column is assigned a time value you can use a formatting rule to display the value as hh:mm which will assure appropriate rounding.

As long as it is consistently applied, rounding can be done in many ways. Only the person doing the work can decide what is “appropriate”.

As a game developer I use all of:

  • ceil
  • floor (Adam’s way)
  • //1 (integer division, equivalent to floor but faster)
  • round (to round floats)

Their behaviour with negative numbers might be surprising.

Further reading: Figuring out round, floor and ceil with integer division

3 Likes

I have used both on a Mac, Modern CSV and Easy CS Editor - extensively with millions of CSV rows.

On a Mac, Easy CSV Editor is hands-down the better product! As you already mentioned, the UI of Modern CSV is lacking the “Mac polish”. but also from a functional perspective I use Easy CSV Editor a lot more. Well, like in 99% of all cases. There is also a mobile app - but i prefer the OSX app.

3 Likes

Are you able to save a workflow so the process is automated for subsequent runs?

In one of the few reviews I can find of Easy CSV Editor it says “it is not possible to copy one cell into others” what do they mean by that?

Hah! I almost added that but decided that it was getting too far into the weeds that early in the article. :-)

I don’t really know (or understand the issue, honestly), but you can change the encoding of any CSV from the File Metadata sidebar when saving.

There are fairly specific USATF rules about how rounding should happen (you actually round up all hundredths to tenths unless it’s .00) but for the trail races I’m mostly dealing with, it’s way too fussy and far more precision than is warranted. So yeah, I just truncate most of the time. For sprints in track meets, I follow the USATF rules surrounding hand timing.

Nothing is easy when working with durations in a spreadsheet. :slight_smile: And using formatting doesn’t help when you need to export a CSV for uploading to a results website. The data has to actually disappear, not simply be hidden by a formatting rule.

No, but it’s a good suggestion for the developer. I’ve thought about it, but my CSV files are never quite standard enough (or what I want out of them varies too much) for automation to work.

No, unfortunately not. Still, the software will remember some important setups for every file opened, e.g. some details about separators etc.

(Edit: I just found out that you can use AppleScript to automatize workflows. I just never used it - I would use csvkit then.)

If you want to try out some professional workflows you can use the free command-line CSV toolkit “csvkit” (see https://csvkit.readthedocs.io/ en/latest/tutorial/1_getting_started.html ). I used this successfully. But Easy CSV Editor is a perfect tool to do some visual data exploration and it is way easier to “see & understand” the data.

In one of the few reviews I can find of Easy CSV Editor it says “it is not possible to copy one cell into others” what do they mean by that?

You’d have to ask the author of that review, who miscommunicated, I’m willing to bet.

Thanks @pukku and @rgv for mentioning Easy CSV Editor. Since I use Numbers extensively and primarily for non-formulaic lists and tables, my needs aren’t as intensive as Adam’s, while I’m perhaps more annoyed than he at non-standard interfaces, so that product sounds more likely to be what I need.

1 Like

But I’m asking somebody in this thread who uses the app. If you buy it, maybe you can chime in with your experience?

Is there a specific reason why you use Modern CSV in the remaining 1% of the cases?

1 Like

But I’m asking somebody in this thread who uses the app. If you buy it, maybe you can chime in with your experience?

I will, sure. I just can’t imagine that you can’t copy and paste between cells when that basic functionality is basically built into the coding framework of any app that works with text. The reviewer must mean something else, and I can’t imagine what.