Functional and Conceptual Pitfalls in Siri’s ChatGPT Integration

Originally published at: Functional and Conceptual Pitfalls in Siri’s ChatGPT Integration - TidBITS

Apple has integrated ChatGPT into Siri with the second set of Apple Intelligence features that debuted in this week’s operating system releases. Don’t get too excited.

I’ve been trying to use this feature in the betas and now in the release versions of macOS 15.2 and iOS 18.2, and if anything, it has increased my frustration when interacting with Siri. Worse, I fear that some deeper issues may argue against the integration of Siri and ChatGPT.

Functional Problems

A few of the functional problems I’ve encountered include:

  • You must enable ChatGPT separately for each device, so the first time you issue an involved query to Siri on a new device, your query will fail, and you’ll be prompted to enable ChatGPT. It doesn’t feel welcoming.
  • On my M1 MacBook Air, I am continually told that ChatGPT is unavailable and to try again later, even while it works fine on my iPhone 16 Pro. My MacBook Air is seemingly cursed because Apple Intelligence summarization never works on it, either. A call to Apple support is in my future.
  • For privacy reasons, Siri asks if you want to use ChatGPT on each prompt that goes beyond what Siri can answer internally or with a simple search. That’s annoying, but you can eliminate the confirmation step in Settings > Apple Intelligence & Siri > ChatGPT > Confirm ChatGPT Requests.

    Screenshot (ChatGPT in Siri): At least ChatGPT got the general area of Settings right.

  • On my M1 MacBook Air, triggering Siri by clicking its Dock icon brings up Type to Siri, and I have to click the microphone button to be able to dictate to it. That’s a change: on my 27-inch iMac, which doesn’t support Apple Intelligence, clicking the Siri button causes Siri to listen to the microphone instantly.
  • On the iPhone, you can invoke Siri by holding the side button or using “Hey Siri.” However, if you want to continue the conversation with ChatGPT, you may find that Siri doesn’t always listen while its splooshy animation joggles around the edge of the screen. Holding the side button down was more reliable but more awkward. Similar animations appear at times when you can’t dictate, too, so you can’t assume fancy graphics mean Siri is listening.
  • Depending on what words you use, Siri may give you a seemingly random response or provide Web search results rather than allowing you to engage with ChatGPT. Those random responses may even come during a discussion with ChatGPT. To ensure your prompts go to ChatGPT, say its name somewhere in your prompt.
    Screenshot: Siri’s conversational flubs
  • Although you can talk to ChatGPT using Siri, its responses always come back as text. That’s fine in many cases, but anyone accustomed to ChatGPT’s Advanced Voice Mode (where ChatGPT provides spoken responses) will be disappointed.
  • The longer your prompt to ChatGPT, the more likely it is that Siri will stop listening at some point and send whatever it has up to that point. If you think it’s irritating when people interrupt you while you’re speaking, just wait until Siri indicates it’s bored with what you’re saying.
  • There’s no way to review the ChatGPT transcript to refer to previous responses—you can see only the last response. However, if you sign into your ChatGPT account in Settings > Apple Intelligence & Siri > ChatGPT, you can view the full transcripts of all your chats. (Signing into my ChatGPT account also fails on the M1 MacBook Air. Cursed, I say!) At least you can delete abortive transcripts that you mistakenly triggered with Siri.

Deeper Concerns

As frustrating as these issues are, I have deeper concerns. Is Siri a good way to interact with ChatGPT? Will increasing the number of ways Siri can mess up reduce our desire to use it? We hoped Siri’s Apple Intelligence enhancements—and particularly the ChatGPT integration—would make Siri less frustrating. Might the reverse be true?

Apple markets Siri as a digital assistant, capable of carrying out simple commands and performing highly directed searches. In my experience, Siri works fairly well for playing music using artist names, controlling HomeKit devices, setting timers, and making reminders. Some searches, such as asking about tomorrow’s weather, also work reasonably well.

But using Siri to trigger Web searches is frustrating, particularly if you become accustomed to using Siri on a HomePod. Such prompts usually generate “I’ve found some Web results. I can show them to you if you ask again from your iPhone” rather than a useful response. Also, although Apple has recently made slight improvements in Siri’s ability to maintain context in a conversation, we have 13 years of experience with Siri failing at anything but single, separate commands. If Siri heads off down an incorrect response path, our only recourse is to shut it up with “Hey Siri, stop” and then issue a differently worded request rather than redirecting the conversation.

In contrast, ChatGPT cannot carry out commands of any sort, and it’s not a search engine, although OpenAI recently gave paying ChatGPT Plus users access to such capabilities. Much has been written about how generative AI systems get facts wrong and make things up, and that’s not wrong—if you want to search the Web, use a search engine. ChatGPT is far more valuable for analyzing data, creating content, and exploring unfamiliar topics. It’s designed for conversation, with follow-up queries, comments, and additional information necessary for optimal results. Siri and ChatGPT simply don’t do the same sort of things.

Apple doesn’t want us to anthropomorphize Siri, but that’s nearly impossible when speaking to a digital assistant that responds with a natural-sounding voice. So when Siri responds randomly, sometimes stops listening to you before you’re done speaking, and is generally a lousy conversationalist, it’s impossible to avoid the feelings of pique that a person with similar conversational traits would trigger. You wouldn’t keep trying with such a person, and many of us won’t keep trying with Siri either.

Even knowing that you can direct any prompt to ChatGPT by including its name in what you say isn’t sufficient to keep Siri from giving seemingly random responses, as some of the examples in my conversation screenshot above show. Siri suggested I call emergency services, created a reminder, did a Web search, thought I was asking for driving directions, and was just generally confused a few times. Yes, my prompts were an attempt to speak naturally, but isn’t Siri supposed to be able to handle that now?

If Apple Intelligence is going to improve Siri, it has to understand what’s being asked and do something sensible. In the past, Siri’s failure mode was mostly binary—it either did what you wanted or failed in a predictable way. With Apple Intelligence, Siri seems primed to fail in ever broader and more unpredictable ways, which could reduce our enthusiasm for using it for even simple tasks.

Apple will undoubtedly keep working on Siri, but I worry that it will be too little, too late. From experience, I know that I’m unlikely to retry a particular task with Siri after failing enough times, as has been the case with trying to add text to a note in Notes.

Apple Intelligence: a one-wheeled bicycle for the mind made by generative AI.
(apologies to Steve Jobs)

;-)

Like you, I fail to see how Siri presents a good conduit for ChatGPT.

To me, this reeks of Apple just trying to get onto a hype bandwagon they feel they need to join to remain relevant. And I’m afraid that’s driven much more by their concern for the stock market than by actual use cases or user experience.

IMHO 99% of all Apple users would be much better served if they finally made Siri less dumb and actually useful, rather than trying to shoehorn somebody else’s LLM into it. And yet, the no-longer-dumb-as-a-rock Siri still remains many months away (x.4 or x.5 updates…?) and that’s before we know if those purported changes will even raise it to a useful level.

Federico Viticci has taken a pretty deep dive into the integration of Apple Intelligence and ChatGPT, which I haven’t digested yet.

Thanks for that deep dive link, Alan. Definitely a worthy read for anyone looking to investigate the pluses and minuses of using Siri’s integration with ChatGPT.

I just bookmarked that author’s website to read more of his writings.

Siri and ChatGPT should be a match made in heaven but I fear the current implementation makes them both worse.

In my personal experience, Siri is a major disappointment - typically failing to understand my wishes or unable to follow (what I feel are) the simplest instructions. The frustration is so complete I no longer use Siri for anything other than asking for directions in the car - and even then it often fails miserably.

IMHO, ChatGPT works how Siri should. We’re currently planning an extended European holiday and we spent about an hour chatting back and forth with ChatGPT about distances, airfares, flight times, rail passes and travel options as though we were speaking to a very knowledgeable travel agent. My wife - a former airline reservations agent who’s not a ChatGPT user - was actually staggered at what I was doing and the information ChatGPT was generating.

In the current climate, ChatGPT isn’t the weak link. Siri is the Achilles’ heel, and Apple Intelligence will be a disappointment until Apple finds a way to make it considerably better.

Double-check everything it tells you. Chatbots, including this one, have a habit of “hallucinating” facts. Don’t assume anything it tells you is real or you may have a rude awakening when you discover that some of its “facts” that you use as the basis for trip planning turn out to be wrong.

The Viticci article merits a read; his take is pretty good. I deleted Image Playground after a quick play with it. As for ChatGPT, we are still in a transitional process with Siri integration, and I dunno how much I’ll use any of it. I do want to develop some approaches for research assistance with it, but day to day, hmmm.

In theory, that’s also happening now with Siri being able to handle more natural language input. I haven’t noticed much of a win there yet. What’s coming is Siri having more personal context and screen awareness, neither of which were things I was looking for.

It’s a good article and goes further into several of the criticisms I leveled at the Siri/ChatGPT integration.

And I do agree—Apple is several years behind here, and even if the company comes up with its own LLM, it will have a very hard time catching up.

I don’t think that really addresses its true shortcomings at all. The fact that it cannot even open a simple system setting shows the real problem is basic functionality, not understanding lingo or trying to be “intelligent”. It basically comes down to: do your job.

Me: Open Control Center settings.
Siri: Sorry, I can’t help you with that.

2nd try
Me: Open Settings
… Settings launches…
Me: Now, open Control Center
Siri: It doesn’t look like you have an app named “Control Center”.
:man_facepalming: :man_facepalming: :man_facepalming:

None of that requires AI or anything fancy. It just reveals that very basic functionality is — after all these years — still sorely missing.

Hey Siri, play Avi Kaplan
Siri: You don’t have Avi Kaplan in your library

I manually pick an Avi Kaplan song, figuring it’s a pronunciation issue

Hey Siri, what’s playing?
Siri: I can’t get that information for you

It’s a song that resides on my phone and it’s playing. Siri seriously can’t tell me what it is anymore?

I sometimes wonder if my music issues are because I don’t subscribe to Apple Music, but they are frustrating nonetheless.

Diane

I guess that my experience with Siri is very different from nearly everyone who complains about it. I should mention that I never use it to play music - that seems to be a use case which causes problems for many. I am also very uninterested in a “conversational” Siri though the contextual improvements that Apple plans will be very welcome.

I use it for simple commands like “start a timer for xx minutes”, “open xxx app”, “take me to xxx” place, “add xxx to grocery” (a reminder) and the like. It works well for unit conversions (even “how many drops in a milliliter”). Its success rate for these things is at least 95%. It even works better in a car using CarPlay.

I agree that there are fundamental incompatibilities between Siri and an LLM. (But I don’t trust LLMs as a source of information and have no interest in Apple’s integration of ChatGPT into Apple Intelligence.)

That’s why I said “in theory.” :slight_smile:

I just don’t get it. Siri can “Open Wi-Fi settings” or “Open Bluetooth settings” or “Open Notification settings” or “Open Battery settings.”

But why can’t it do “Open Control Center settings” or “Open General settings”?

And “Open Camera settings” just opens the Settings app.

Could Apple really have hard-coded only those things that work? I assumed the natural language enhancements would allow it to listen to an open command and apply it to anything in the Settings app by likely matches.

Because Siri is not (yet?) actually trying to understand anything you say. It is a simple scripting engine with a voice interface. Probably not very different from AppleScript. If someone in Cupertino doesn’t write/update a script to recognize a particular keyword or command, then Siri can’t do it.

And FWIW, Amazon’s Alexa is the same way. It maps voice commands onto AWS Lambda functions running on Amazon cloud servers. See also the Alexa Presentation Language documentation.
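
To make the "scripting engine with a voice interface" idea concrete, here is a minimal Python sketch of a hard-coded command table. It is purely illustrative: the phrases, the `HANDLERS` table, and the `handle` function are all made up for this example, and nothing here reflects how Siri or Alexa is actually implemented.

```python
# Hypothetical illustration only: NOT Siri's or Alexa's real implementation.
# Each recognized phrase has to be registered by hand, one entry at a time.
HANDLERS = {
    "open wi-fi settings": lambda: "Opening Wi-Fi settings",
    "open bluetooth settings": lambda: "Opening Bluetooth settings",
    "open settings": lambda: "Opening Settings",
}


def handle(utterance: str) -> str:
    """Look up the normalized utterance; anything unregistered simply fails."""
    key = utterance.strip().lower().rstrip(".")
    handler = HANDLERS.get(key)
    if handler is None:
        return "Sorry, I can't help you with that."
    return handler()
```

With a table like this, `handle("Open Wi-Fi settings")` succeeds, while `handle("Open Control Center settings.")` falls through to the apology, simply because nobody added that entry.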

Well, that’s the question. With the first release of Apple Intelligence, Apple described Siri like this:

More Natural and Conversational Siri
Siri becomes more natural, flexible, and deeply integrated into the system experience. It has a brand-new design with an elegant glowing light that wraps around the edge of the screen when active on iPhone, iPad, or CarPlay. On Mac, users can place Siri anywhere on their desktop to access it easily as they work. Users can type to Siri at any time on iPhone, iPad, and Mac, and can switch fluidly between text and voice as they use Siri to accelerate everyday tasks. With richer language-understanding capabilities, Siri can follow along when users stumble over their words and maintain context from one request to the next. In addition, with extensive product knowledge, Siri can now answer thousands of questions about the features and settings of Apple products. Users can learn everything from how to take a screen recording to how to easily share a Wi-Fi password.

With the second set of Apple Intelligence features, Apple only talks about the ChatGPT integration with Siri. However, at the end of that press release, the company discusses what’s coming:

Even More Capabilities Coming Soon
Additional Apple Intelligence capabilities will be available in the months to come. Siri will be even more capable, with the ability to draw on a user’s personal context to deliver intelligence that’s tailored to them. Siri will also gain onscreen awareness, and will be able to take hundreds of new actions in and across Apple and third-party apps.

So should Siri be performing better now? Apple’s PR would suggest yes. But real-world testing doesn’t seem to agree.

So far, it sounds like an improved speech-to-text engine and the ability for its scripts to access a broader range of local data, but aside from offering to shunt requests to ChatGPT, I’m not seeing anything that implies more than a more advanced scripting engine.

WE WERE PROMISED A LESS STUPID SIRI! :slight_smile:

What I don’t understand about Siri is why Apple wouldn’t have algorithmically fed it (for instance) the hierarchical tree of all the Settings screens along with the text strings of all the settings inside. I mean, Siri can open whatever apps are on your iPhone and those change all the time. Why hard-code stuff related to Settings?
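
As a rough sketch of what "algorithmically feeding" Siri the Settings hierarchy might look like, the Python below flattens a tree of panes into an index and fuzzy-matches a spoken request against it. Everything here is an assumption for illustration: the abbreviated `SETTINGS_TREE`, the helper names, and the 0.6 similarity cutoff are invented, and this is not a description of anything Apple ships.

```python
from difflib import get_close_matches

# Hypothetical, heavily abbreviated Settings tree; pane names are illustrative.
SETTINGS_TREE = {
    "Wi-Fi": {},
    "Bluetooth": {},
    "General": {"About": {}, "Software Update": {}},
    "Control Center": {},
    "Camera": {"Formats": {}, "Record Video": {}},
}


def index_panes(tree, prefix=()):
    """Flatten the tree into {lowercased pane name: full path tuple}.

    If two panes shared a name, the later one would overwrite the earlier;
    a real index would need to handle such duplicates.
    """
    index = {}
    for name, children in tree.items():
        path = prefix + (name,)
        index[name.lower()] = path
        index.update(index_panes(children, path))
    return index


PANES = index_panes(SETTINGS_TREE)


def find_pane(utterance):
    """Return the path of the closest-matching pane, or None if nothing is close."""
    words = utterance.lower().strip(" .")
    # Drop the filler words of an "Open X settings" request.
    for filler in ("open ", " settings"):
        words = words.replace(filler, " ")
    query = " ".join(words.split())
    match = get_close_matches(query, list(PANES), n=1, cutoff=0.6)
    return PANES[match[0]] if match else None
```

Because the index is built from the tree rather than typed in by hand, a request such as `find_pane("Open Control Center settings")` resolves without anyone registering that phrase, and a near miss like "Software Updates" still lands on the nested Software Update pane.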

Well, Siri is making baby steps, at least for me. We have several grocery-related shopping lists. Siri never had a problem with “add milk to the Albertsons list,” but could never find the “Trader Joe’s” list. Somewhere along the line (post iOS 18, I think), Siri became able to find that list too. Woo hoo!

I asked Siri what the latest features in tvOS were, and it offered to ask ChatGPT.

Gah.

All I want is a conversational Siri who does basic things well and can anticipate what I need next.

I am underwhelmed by the latest manifestations of Apple Intelligence. None seem particularly useful and productive to me. As other commentators have suggested, Apple Intelligence needs more development before it earns a positive rating.

I have been using the MacGPT app for some time and am continually surprised by its capability to support me in my various activities. Contrary to Adam’s view, it is a very good search engine, to the point where the MacGPT app has almost completely replaced my need to search with Apple Safari and Google. I can get good results within seconds and avoid the tedious business of sifting out Google’s sponsored placements and dealing with its lack of search precision.

Because of the MacGPT app, I am not all that interested in getting closely involved with Apple Intelligence on my Mac. There is an iPhone version of the MacGPT app, and it works well: it can listen to you and speak its conclusions to you.

But there is a monthly cap on the free version of the MacGPT app and that is a downside. If you are running a business, then the paid version would be worthwhile.