On his One Foot Tsunami blog, friend-of-TidBITS Paul Kafasis relates what happened when he asked Siri who won Super Bowl XIII. Siri failed miserably, causing him to roll the dice again for every Super Bowl from 1 to 60 (even though 59 and 60 haven’t yet been played). The results are horrifyingly hilarious.
So, how did Siri do? With the absolute most charitable interpretation, Siri correctly provided the winner of just 20 of the 58 Super Bowls that have been played. That’s an absolutely abysmal 34% completion percentage. If Siri were a quarterback, it would be drummed out of the NFL.
At Daring Fireball, John Gruber provides further context and insights by comparing Siri with several other answer engines. In testing these platforms with a randomly chosen question (“Who won the 2004 North Dakota high school boys’ state basketball championship?”), ChatGPT and Kagi answered perfectly. DuckDuckGo received partial credit for providing relevant but incomplete information. (In my testing, Perplexity returned the correct answers but confused the girls’ results with the boys’.) Both Apple’s “new Siri” (enhanced by Apple Intelligence) and Google’s AI Overview failed, delivering a medley of wrong answers. “Old Siri” declined to answer the question, instead providing links as a standard search engine would, with the top link being a PDF containing the answer.
Nailing AI accuracy is trickier than it appears, and Apple clearly has much to learn. For now, restrict Siri usage to simple stuff and always double-check its answers.
I thought I could manipulate Siri into giving the correct answer by asking, “What would ChatGPT say if I asked it, ‘Who won the 2004 North Dakota high school boys’ state basketball championship?’”
But her answer was, and this is not a joke: “Siri Is Super Dumb and Getting Dumber”.
Could Stalin have been worse? You betcha. But that’s also quite a silly question.
A much better question is, by which date will Siri no longer suck? Or better yet, will Siri ever reach the point where she can do simple useful things like open a certain iOS setting or change the view mode in Maps? Or correctly query the web, as, for example, DDG’s simple AI Assist does (at zero cost, by the way)?
The party line so far has been that Apple Intelligence is the magic wand that will make sucky Siri magically turn into something good. And yet, in spite of all the effort that has no doubt gone into it and all the hoopla surrounding Apple Intelligence, none of this has really done anything for Siri. Childish cartoon images of yourself make not Siri useful.
So I guess I have to wonder, at which point do reasonable and patient believers become just dumb followers getting played? Is Tim Apple in it for the win (think Steve’s “when we do it, it’s because we do it best”), or is this all just shallow marketing fluff?
The problem with AppleScript is that it is a programming language that purports to use natural language syntax, but in fact requires inscrutable combinations of exactly the right phrasing, else it doesn’t work. For example, these two statements are not the same in AppleScript:
My guess is that the problem stemmed from the NFL’s insistence on using Roman numerals for all but one Super Bowl. I did a spot check with Type to Siri yesterday and Siri got them all right, including “who won Super Bowl L?” (the one that used 50 instead, since I think L is also shorthand for loser), so I’m guessing all the attention made some poor person or people at Apple make sure that it was fixed in Siri.
I have little patience for ongoing stories about Siri’s poor performance. It’s been that way for a long time, we all know it, and it’s just become noise for me. Apple knows it, and we know they are taking steps to fix it; let’s hope they get it right when the LLM-based Siri debuts, supposedly next year. Until then, “Hey Siri, set a timer for 5 minutes” works great for me (that’s just about all I use Siri for), and anything like who won a particular Super Bowl I manually look up in other places (Wikipedia mostly; for who was in what movie I look in IMDb, or I do DuckDuckGo searches). Knowing that Siri is so bad at this, why would you continue using it? I guess a spot check occasionally to see if it’s better?
I’m sure this is also magnified by the fact that the logos frequently typeset the number between, in front of or behind the words. I remember, years ago, finding it funny trying to pronounce “Super XIX Bowl”. But the logos got really boring, starting from #45 (XLV).
I only agree with you in part, Doug. Yes, it has become a worn out trope to state that “Siri sucks.” And when all the voice assistants sucked just as bad, it was so much static.
But as the others improve and Siri inexplicably remains mired in the egg-timer market, it’s important to keep reminding Apple that a product they’d like to be ubiquitous is still getting things wrong on a large scale. Every Saturday night, for example, SNL’s first music slot is sponsored in part by Apple, and the performance is followed by the enticing invitation “Want to hear more? Ask Siri to play … on Apple Music.” When it takes more than one request to get that right, it’s a lost opportunity.
This reminds me of the Apple Maps 1 debacle. It was really bad, and it took what was widely reported as a huge, expensive effort by Apple to release Maps 2 and wash the bad impression out of our minds. (I love Maps now; with CarPlay it becomes a seamless and reliable part of my vehicle.)
Siri doesn’t suck as bad as Maps 1, but it’s bad enough that it can’t often execute a successful navigation request for Maps 2. Apple may be working on it, but marketing such a limp utility as a tentpole technology is not something we should quietly accept. Until Siri lives up to the expectations that Apple itself set for it, we should continue to let them know that we know how mediocre it really is.
My friend and I used to laugh about how often we swore at Siri when driving.
I now cry!
ETA (estimated time of arrival), which used to work, is now “Etaa”!
Apple often takes its time when releasing new products, but I believe the wait is usually worth it. However, I think it’s time for Siri to retire. It feels a bit outdated at this point, and it highlights that Apple is more focused on hardware, with software playing a secondary role. It’s a bit disheartening to see a company with such a devoted customer base facing challenges like this. It reminds me of Nokia’s struggles in the past!
I too stopped using Siri long ago & it’s been disabled on all my Apple products…
The only place that I might be using it is with CarPlay (with the allow-once feature).
The problem with AI is that it doesn’t actually understand anything. What it does is look at the inputs and return the output that best fits them. That mapping is determined by fitting a model against a large number of inputs and comparing its output to what has been determined to be the correct output; the parameters are adjusted until it predicts well. This can work well. The problem is that they want to show it off with probably the worst application. There are a huge number of questions that can be asked about the world, and there is a good chance that an AI will not have seen anything close to a given question, in which case it produces spurious results. In statistics this is called poor out-of-sample prediction, meaning that the model predicts poorly for observations that lie outside the original data. So what can they do? Maybe use more complex models or larger training data. The problem is that doing so takes longer. The Chinese seem to have developed a system that is more efficient, so that should allow larger training data to be used.
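To make the out-of-sample point concrete, here is a toy sketch in Python. It is only an illustration of the statistical idea, not a description of how Siri or any other assistant works: the "model" is a plain straight-line fit, and the underlying relationship (y = x squared), the data ranges, and all function names are assumptions chosen for the example.

```python
# Toy illustration of poor out-of-sample prediction (all names hypothetical).
# A straight line is fitted to data from a curved relationship, then scored
# on inputs inside and far outside the range it was fitted on.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mean_abs_error(slope, intercept, xs, ys):
    """Average absolute gap between the line's predictions and the truth."""
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)

def truth(x):
    """The 'real world' the model is trying to approximate (assumed)."""
    return x * x

# Training data: inputs between 0 and 1.
train_x = [i / 10 for i in range(11)]
train_y = [truth(x) for x in train_x]

# Out-of-sample data: inputs between 3 and 4, nothing like the training set.
test_x = [3 + i / 10 for i in range(11)]
test_y = [truth(x) for x in test_x]

slope, intercept = fit_line(train_x, train_y)
in_sample_error = mean_abs_error(slope, intercept, train_x, train_y)
out_of_sample_error = mean_abs_error(slope, intercept, test_x, test_y)

print(f"in-sample error:     {in_sample_error:.3f}")
print(f"out-of-sample error: {out_of_sample_error:.3f}")
```

The line looks fine on the inputs it was fitted to and falls apart on inputs far away from them, which is the same failure described above, just at a scale small enough to read.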