ChatGPT Proves Useful for Data Exploration

Originally published at: ChatGPT Proves Useful for Data Exploration - TidBITS

Sometimes, when asked questions about data generated by tech systems I’ve created, I feel like I’m channeling Dr. McCoy in the original Star Trek. “Dammit, Jim, I’m a writer, not a statistician!” I have a firm grasp on integer math but a tenuous one on statistics, data analysis, and related concepts.

That ignorance—and, let’s be honest, lack of interest in learning skills in that field—caused me to turn to ChatGPT the other day for answers to some questions that would have taken me far too long to figure out any other way. I recently subscribed to ChatGPT Plus, which provides more extensive access to GPT-4o and better handling of file uploads, making this kind of data analysis possible. Other AI assistants like Claude offer similar capabilities.

Understanding Referral Sources

Here’s the problem: The Finger Lakes Runners Club puts on 25 races per year, but the number of races and minor data inconsistencies make large-scale analysis of the registration data daunting. For instance, for some of our races, we ask where registrants learned about the race, giving them a set of choices. Our intent is to learn which of our promotional efforts are the most successful.

I can easily imagine how to build a spreadsheet with separate sheets for each race and use lookups to count the number of each answer for each race. However, actually building such a spreadsheet would take hours, particularly troubleshooting the funky lookup formulas. It hasn’t happened. My recent realization that ChatGPT can analyze CSV and Excel files pointed to another solution.

First, I uploaded my files by dragging them into a new chat. (ChatGPT could accept only ten at a time, so I had to upload in batches.) I asked ChatGPT to create a table counting the number of times each item in the Source column appeared across all the race registrations. Magically, it did! Once I saw it, though, the table wasn’t quite what I had in mind: it included an index column and summarized all races together. I changed gears slightly, asking ChatGPT to remove the index column and add per-race data to the rows in the table. Its first try confused the rows and columns, but asking it to transpose the data fixed that. A few more commands gave me a Total row and column, an alphabetically sorted list of races, and shorter column names.
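
If you’d rather script this step yourself, the tally ChatGPT performed boils down to a few lines of pandas. This is a minimal sketch under stated assumptions: the file contents, race names, and a `Source` column are toy stand-ins for the club’s real spreadsheets.

```python
import pandas as pd
from io import StringIO

# Toy stand-ins for two race registration CSVs (hypothetical data).
race_files = {
    "Turkey Trot": StringIO("Name,Source\nA,From a Friend\nB,Email\nC,From a Friend\n"),
    "Skunk Cabbage": StringIO("Name,Source\nD,Email\nE,Website\n"),
}

# Count each Source value per race.
counts = {race: pd.read_csv(f)["Source"].value_counts() for race, f in race_files.items()}

# Races as rows, sources as columns, sorted alphabetically,
# with a Total row and column appended last.
table = pd.DataFrame(counts).T.fillna(0).astype(int).sort_index()
table.loc["Total"] = table.sum()
table["Total"] = table.sum(axis=1)
print(table)
```

With real files, the `StringIO` objects would simply be file paths passed to `pd.read_csv()`.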

ChatGPT generated table of data

I confirmed a few of the numbers and calculations by cross-checking them against my original spreadsheets and doing spot-checks with simple formulas. The overall goal was also sufficiently unimportant that I wouldn’t have been all that perturbed by a few minor mistakes. Unfortunately, apart from some generalities about which approaches were the most common and a few outliers—the large “From a Friend” suggests that Turkey Trot works largely by word of mouth—the data was hard to understand.

Next, I asked ChatGPT to switch from raw numbers to percentages, which allowed me to compare the relative effectiveness of each promotional channel more easily. However, examining each number closely in the table still required significant effort, so I had ChatGPT create a visualization. It suggested a stacked bar chart and produced the one below.
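
The percentage conversion is essentially one pandas operation: divide each row by its own total. This sketch uses made-up counts; the final plotting call is shown commented out since the chart itself needs matplotlib.

```python
import pandas as pd

# Assumed per-race source counts (toy numbers, not real registration data).
table = pd.DataFrame(
    {"From a Friend": [40, 10], "Email": [10, 30]},
    index=["Turkey Trot", "Skunk Cabbage"],
)

# Convert each row to percentages of that race's registrations.
pct = table.div(table.sum(axis=1), axis=0).mul(100).round(1)
print(pct)

# A stacked bar chart like the one below would then be a single call:
# pct.plot(kind="bar", stacked=True)
```

The `round(1)` step is also why a few bars can fall slightly short of 100% — the rounding Adam mentions.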

ChatGPT generated stacked bar chart

Frankly, this chart is exactly what I need to pass on to the club’s Communications Team so they can evaluate what they’re doing. (A few of the bars aren’t quite at 100% due to an earlier rounding step, but again, I’m interested in the forest here, not a few twigs.)

Evaluating Late Race Registrations

Buoyed by the relative ease of extracting the referral source data, I tackled a more complex problem. Over the past few years, we’ve seen a significant uptick in late race registrations, with many people registering in the last day or two before a race. Not having a feel for how large a race will be until the last few days causes uncertainty and stress for race directors, most notably when estimating how much food to buy for post-race snacks.

We’ve played with various techniques for encouraging earlier registrations, including pleas to register earlier, price increases, and early cutoffs, but none have moved the needle. People apologize for being a pain, but they don’t stop. Our races aren’t expensive enough that price increases make much difference. Early cutoffs generate anguished emails from those who missed out and people showing up at the race begging for day-of-race registration, both of which ultimately cause more stress and work. Since we’re unwilling to employ draconian policies inappropriate for a volunteer-led community club focused on inclusivity, late registrants have become a fact of our modern world.

However, knowing how many people will likely register in the last week would help us estimate food needs better and avoid worrying that insufficient PR has been done. Since I had already uploaded all those race registration spreadsheets to ChatGPT, I decided to see if it could help us visualize the percentage of runners who sign up late.

Since not all races had registrations on race day, and that date couldn’t be calculated any other way, I had to feed ChatGPT a list of the actual race dates. After that, I got it to create bar charts that showed how many people signed up the day of a race, one day before, two days before, and so on. As you can see, some races are much more heavily weighted to late registrations (Turkey Trot, left) than others (Skunk Cabbage, right).
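
The underlying calculation is simple once the times of day are stripped: subtract each registration date from the supplied race date. A minimal sketch, with hypothetical timestamps and an assumed race date:

```python
import pandas as pd

# Hypothetical registration timestamps; the real spreadsheets have many more rows.
regs = pd.DataFrame({"registered": pd.to_datetime(
    ["2024-11-26 09:15", "2024-11-27 18:02", "2024-11-28 06:30"])})
race_date = pd.Timestamp("2024-11-28")

# Drop the time-of-day component, then compute whole days before the race.
days_before = (race_date - regs["registered"].dt.normalize()).dt.days
counts = days_before.value_counts().sort_index()
print(counts)  # index 0 = race day, 1 = one day before, ...
```

The `dt.normalize()` call is the “ignore the times” instruction in code form, sidestepping the timezone rabbit hole entirely.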

ChatGPT generated bar charts

However, when I asked ChatGPT to combine all the races, convert the raw numbers to percentages of the whole, and condense the days into weeks, the resulting chart was stark—over 50% of people sign up in the last week.
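
Condensing days into weeks is an integer division away. A sketch with assumed pooled counts (toy numbers, not the club’s actual totals):

```python
import pandas as pd

# Assumed registrations pooled across races, indexed by days before the race.
days_before = pd.Series([120, 40, 30, 20, 10, 10, 10, 5, 5],
                        index=range(9))

# Group days 0-6 into week 0, days 7-13 into week 1, and so on.
weeks = days_before.groupby(days_before.index // 7).sum()
pct = (weeks / weeks.sum() * 100).round(1)
print(pct)  # week 0 = the final week before the race
```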

ChatGPT generated bar chart

I would typically have stopped here, but I was curious if ChatGPT could suggest other helpful ways of examining the information. It offered a handful, including a cumulative line chart, a heatmap, a stacked area chart, and a percentage stacked bar chart.

Other suggested visualization from ChatGPT

Only the cumulative line chart made sense to me, and even there, I had trouble internalizing what the slope of the line meant. So, I had ChatGPT change the chronological order so that the race day was on the right side of the graph and the line increased over time. I also asked it to switch from raw numbers to percentages again, which produced graphs like this for Turkey Trot and Skunk Cabbage.
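
The cumulative chart reduces to sorting the counts so race day lands last, then taking a running total as a percentage. A sketch with toy numbers:

```python
import pandas as pd

# Toy registration counts keyed by days before the race (assumed data).
counts = pd.Series({30: 5, 10: 15, 3: 30, 0: 50})

# Descending days-before puts race day (0) on the right; accumulate,
# then express as a percentage of all registrations.
cumulative = counts.sort_index(ascending=False).cumsum()
pct = cumulative / cumulative.iloc[-1] * 100
print(pct)

# pct.plot() would draw the rising line, ending at 100% on race day.
```

Reading off where `pct` crosses 50 gives the “half the field has registered by day N” figure for each race.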

ChatGPT generated cumulative line charts

Now we’re talking! For Turkey Trot, which gets a lot of late registrations, I can easily see that 50% of registrants will sign up in the last three days. However, the 50% mark for Skunk Cabbage comes about two months beforehand. I plan to share these charts with the various race directors so they can understand what’s likely to happen with registrations this year.

To be clear, I wouldn’t have thought of doing a cumulative line chart on my own, and even if I had, I’m not sure I would have been able to construct these charts. Certainly not without hours of work, rather than the 15 or 20 minutes I spent giving ChatGPT directions.

The Extreme Oddity of an AI Assistant

My retelling of how I arrived at these charts glosses over all the back and forth it took. In most cases, I’d get something shockingly impressive after a prompt or two. Within minutes, I’d have tables or charts that looked reasonable.

However, even if I didn’t care about extreme accuracy, the numbers needed to be roughly correct. Several times, when I spot-checked a number, it was utterly wrong. That also happens in spreadsheets I build, but instead of figuring out where I’d gone wrong in a formula, I just told ChatGPT to fix the mistake. I had to repeat myself on a few occasions until the numbers aligned with what I had uploaded.

Most of my time was spent on little things. While working on the table of referral sources, I wanted to change how it was sorted. That was easily accomplished, but ChatGPT also sorted the Total row and column rather than leaving them at the bottom and right. It had no conception that they were different from the data and should remain in those positions, though it was happy to put them back there when I complained. It also used decimal notation in some places, even though everything was a whole number, so I had to tell it to stick with integers.

Also, when building the graphs, I had to make ChatGPT use integers on both the X and Y axes because the data did not contain half-registrations or half-days. In fact, it stumbled over the fact that the registration dates did include times—it was starting down the timezone rabbit hole—but I short-circuited that problem by telling it to ignore the times. That alone was vastly easier than fighting with date formats in a spreadsheet, which I find almost entirely inscrutable.

Even stranger was having to change formatting with prompts. Changing column titles, chart titles, and even the location of the legend required that I explain what I wanted rather than just doing it myself.

I seldom work with people on this sort of thing, so most of my iterative work happens in my head without ever being externalized. Upon reflection, though, the experience wasn’t all that dissimilar from when Josh Centers first developed the financial graphs we use when covering Apple’s quarterly reports. He would show me a graph, and I would ask him to change the color, tweak a title, or adjust the legend spacing. A few times, when I asked why the graphs didn’t look right, he found that a formula had gone awry and was calculating the numbers wrong. ChatGPT made much weirder errors than Josh, but it was also much faster to respond to requests like “Change the graphs so the line increases over time rather than decreasing.”

I did encounter a couple of unexpected snags. First, like a cat in a tree, ChatGPT has trouble turning around. If I sent it down a path that I later decided was a dead end, telling ChatGPT to go back and try a different approach often failed.

I think the problem lies in ChatGPT’s memory limits. It processes instructions in the current thread but doesn’t retain the full history of operations indefinitely. For example, once I told ChatGPT to switch from raw numbers to percentages, the raw numbers were no longer part of its context, making it difficult to revert. Without a mechanism to “go back,” earlier stages of data were essentially erased, requiring me to start over in a new chat.

The solution was to start over in a new chat, but that required repeating all the desired prompts up to the point I wanted to try a different approach. It might have been possible to copy all those prompts from the previous chat and combine them into a single one in the new chat, but I didn’t try that.

The second snag was that ChatGPT would get “tired” after a while. I apologize for the anthropomorphizing, but things it had done without issue earlier eventually became problematic. It started forgetting about some of the races I had uploaded, and when I complained, it made up data entirely. Eventually, it threw in the towel, saying:

It seems the working dataset for registrations (filtered_corrected_days_before_all_df) is no longer available. To accurately recalculate the total number of registrations, I’ll need to reload and process the original registration files. Could you please re-upload the necessary files or confirm how you’d like to proceed?

I suspect that this is the same problem as trying to get it to revert to an earlier result and start over. With a limited context, too much of my data had been tweaked, munged, and transmogrified to continue the conversation in any useful fashion. In other words, ChatGPT was like the little boy in The Far Side cartoon who says, “Mrs. Johnson, may I be excused? My brain is full.”

Despite some snags, generating tables and graphs by simply uploading data and prompting ChatGPT felt like a glimpse into the future. Once I adapted my approach to interacting with ChatGPT—framing directions clearly and anticipating iterative revisions as I would when working with a person—the process became surprisingly efficient. For anyone faced with analyzing large datasets, ChatGPT offers a compelling alternative to traditional tools. I encourage you to experiment with it the next time you want to explore or visualize complex data quickly.

Many people find that Claude can be better at handling some of the more advanced uses you describe, so simply switching from ChatGPT to Claude might make the task of analysing data easier.

But another option is moving away from the provided consoles to a service that specialises in reading your local files directly and working with them (no uploads or copy/paste involved). I am biased towards Beyond Better for such tasks. Its original design was as a coding assistant (which it does really well), but it’s evolving into a general-purpose assistant for reading files and getting Claude (or ChatGPT) to analyse them and generate new docs/reports.

Beyond Better is currently limited to local text files, but an upcoming version will support cloud files (e.g., Google Docs) as well as various document formats for reading and writing. Since Beyond Better uses AI “tools” to get its work done, as the set of tools evolves, so will BB’s ability to accomplish tasks outside the confines of the AI console.

I use BB every day to make myself 10-100x more productive than I was last year; check it out.

That issue of the accuracy of results seems like one of the major complexities of the whole process. If you have to spot-check results, only to find made-up data (even if that’s only partially true), how can one trust the robustness of these results?

Interesting write up though. I can see this being a useful tool in future, just not sure we’re there yet.

I don’t see that as an AI issue. If you hired someone to do the work, you would still need to spot-check the results. The benefit is that the speed and cost of using AI are much better than hiring someone else.

Trusting the robustness is a key point, and a lot of that is down to being able to repeat the steps you have done, or “replicability” as it is called in science. Even Excel spreadsheets are notoriously poor for replicability. While I am not expecting Adam to learn something like R for his data analysis, asking ChatGPT to create the R code to do something would be a way forward. It is as amazingly good at that as it is bad at math.

I hadn’t used R before, so I was quite curious how well Beyond Better would assist. I was quite impressed.

BB gave me instructions for installing Rscript, while BB analysed a CSV I saved in the project folder (I chose the Quality of Life datasource from Kaggle). BB (using Claude) wrote an Rscript to generate some plots. BB failed to run it due to missing libraries, so it automatically installed those libraries and then ran the script again.** I got a few different graphs and plots (as PNG files), and a text summary explaining key observations. The whole exercise was less than 5 minutes, and most of that time was downloading Rscript.

Thanks for the tip Jon, I’m quite excited to see what else I can get BB to do with data analysis and creating plots.

** I had to add Rscript to the ‘allowed commands’ before BB would do anything useful, but otherwise it “just worked”.

Thank you for your article. I have had exactly the same experiences with ChatGPT Plus, including the “loss of memory” when a long conversation goes down an incorrect path and you can’t backtrack.

And one of the more annoying things I’ve found is that when I enter exactly the same request a week later with a new chat and some new data, the results differ markedly from the previous week.

But I am developing a series of repeatable, explicit instructions in a “library” so I can use them again and again.

And often I’ve found it’s useful to compose my request in TextEdit and massage it a bit before I paste it into ChatGPT.

I was struck by the results of registration timing for the Turkey Trot races. I am involved in processing the entries for the two juried shows run by the State of the Art Gallery. Even though the organizers are careful to open the entry process months in advance of the deadline, the number of entries per week shows, like Turkey Trot registrations, exponential growth. I find this a bit puzzling. If someone wants to enter, why wait until the last moment?

Apparently, it’s not just artists. Some law of human behavior is manifesting. I propose “Engst’s Law”: Given a deadline and a sufficiently long period of opportunity to meet the deadline, the number of submissions will grow exponentially.

This is an excellent example of using ChatGPT as an assistant. I suspect many people don’t realize it can be used this effectively. Personally, my favorite use case is brainstorming obscure topics—it’s surprisingly helpful!

Regarding your approach: An important step in processes like the one described in the article is to edit prompts where ChatGPT has responded inaccurately, rather than simply adding a new prompt. Rephrasing the original prompt to guide ChatGPT more effectively often helps prevent it from heading down the wrong path.

Remember that editing a prompt erases all the conversation after that prompt!

Agreed! I’m currently studying physics and math on my own, and have found that there are MANY cases where notation and terminology differ between disciplines, and over time. I have found ChatGPT is very good at deciphering these differences, and at getting answers to very specific questions that a professor/instructor could easily answer, but that would be completely impractical to obtain using search engines.

Hmm! I’m not having any luck getting started with it on my 27-inch iMac. It wants to download and install a BB Server, but every time I do that, it fails and goes back to saying the server needs to be installed.

As far as Claude goes, I’m having enough trouble focusing on one of these chatbots. Trying to replicate everything in a second one to see if it would do better seems problematic. 🙂

As @cngarrison pointed out, that’s always the case, whether you hire someone or you’re just trying to figure out whether your calculations are correct. In this case, as I said, I was looking at a fairly big picture, so spot-checking to make sure that picture was right was all I needed.

Yeah, I was thinking about that a little, but in this case, I don’t really want to replicate the results often. Maybe I’d do something like this once a year, but if I wanted highly replicable results, I’d need to build a full system. Which in turn would require me to know what I wanted. In this case, lowering the barrier to entry was the big win.

I think that’s smart!

Oh, that’s interesting—I had no idea. That would also make it easier to build a library of successful prompts since you would have already done the iterative work to make sure it worked.

Hah! Yeah, this has become a true fact of the modern world—it’s another case of “just in time” thinking. I think technology in general causes it because we can do things so much more quickly and more accurately that our internal sense of how early we need to start has shifted.

For instance, I always used to plan out car trips in advance so I’d know where to go and be certain that I’d have appropriate directions and maps and whatnot. Now, almost no matter what, I don’t even put in destinations until I’m sitting in the car with Maps up on the iPhone.

With race registrations, people know they can register in mere minutes, so it’s easy to put off until the very end. Back when we had to register by mailing checks and paper forms, you knew you had to do it at least a week or two in advance to be sure it would work.

There may also be a bit of FOMO too. If you wait until the last minute to register for a race, you can assuage any incipient fear of missing out by knowing that you didn’t get a better offer before the race.

Bruce Schneier has an interesting essay about AI mistakes, pointing out that while they can be different and weirder than human mistakes, that’s not necessarily as true as it might seem.

Other reasons are apparent too… it might be people not knowing if they’re going to have time around work or other commitments (obviously), or even potential illness (especially during winter, when colds/flu/bugs spread easily) and therefore having to cancel and/or deal with fee refunds etc.

All are admin headaches that can be avoided simply by late registration, provided the event is not likely to be ‘sold-out’ or registrations are ‘closed’ on a certain early date.

I don’t envy you with such issues.

Those are absolutely the case (the other big one is not knowing what the weather would be like), but they have always been the case, whereas the trend toward ever-later registrations is more recent.

We used to be more perturbed by it, but over the past few years, we’ve come to terms with it just being the new normal.

I find the same phenomenon with managing non-running and non-arts events with which I am involved.

So I definitely like your proposed “Engst’s Law” and its description: “Given a deadline and a sufficiently long period of opportunity to meet the deadline, the number of presentations will grow exponentially.”

I substituted “presentations” for “submissions” given the combination of deadlines and the manifestation of essays and papers and PowerPoint slides as well as submissions.
