Siri Semester Exam Grade Improves to C From D+
Semester exam results are in. In our latest 800-question test of AI assistant Siri, she understood 99% of queries and correctly answered 75% of them, earning a C grade. In April of 2017, Siri earned a D+ on the same test, where she understood 94% of queries and correctly answered 66%. These tests are conducted with the same methodology and question set as our smart speaker tests found here. The test involves asking 800 questions divided into five categories (local, commerce, navigation, information, and command) designed to test the full range of abilities and accuracy of an AI assistant.
Performance. Siri’s 9 percentage point improvement in correct answers (from 66% to 75%) over an 8-month period is more or less in line with the high rate of improvement we are noting with smart speaker-based assistants like Amazon Alexa and Google Home. For comparison, Alexa answered 64% of queries correctly and Google Home 81%. These results, however, cannot be compared to one another directly, as a smartphone-based digital assistant responds differently to queries: it is geared more toward calling up information on the device’s screen or controlling the device itself. Nonetheless, Siri’s December performance, compared to our test in April of this year and to previous tests with a similar methodology and question set that we conducted in 2012, 2013, and 2014, shows improvement in all five categories.
We’re tough graders at Loup Ventures. During our testing of Siri, we only counted an answer as correct when she was able to deliver a single concise answer herself rather than bringing you to search results that might help you find an answer. This means “I found this on the web for…” is counted as incorrect. Siri improved 9 percentage points since our April test but remains far away from the A grade that we expect will drive AI assistant technology to mainstream adoption. Test scores should be in the 90% range (90% of queries answered correctly) to receive an A grade. The takeaway is that Siri does not yet act as the fabric that connects our computing experience, as we hope AI assistants one day will. We can live without Siri for now, but at the rate of improvement we are seeing, we expect her to be indispensable in 2 years.
Methodology. Just as we did in April of this year, we asked Siri 800 questions on an iPhone X and 8 Plus during the last week of December 2017. The queries covered five categories: Local, Commerce, Navigation, Information, and Command. Siri was graded on two metrics: did she understand what was asked (this can be seen on the device’s screen), and did she answer or execute correctly? It is important to note that we have slightly modified our question set to be more reflective of the changing abilities of AI assistants. As voice computing becomes more versatile and digital assistants become more capable, we will continue to update our question set to reflect those improvements. Our changes included questions around the use of smart home devices. We tested Siri with Philips Hue smart lighting and Wemo Mini smart plugs.
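For readers who want to reproduce the arithmetic, the two-metric scoring described above can be sketched roughly as follows. This is a minimal illustration and not our actual grading tooling: the `Result` record, the function names, and the idea of storing one record per question are all assumptions made for the example.

```python
from dataclasses import dataclass

# The five query categories named in the methodology.
CATEGORIES = ("Local", "Commerce", "Navigation", "Information", "Command")


@dataclass
class Result:
    """One graded question (hypothetical record layout)."""
    category: str
    understood: bool  # was the query understood, as shown on screen?
    correct: bool     # was it answered or executed correctly?


def score(results):
    """Return (understood %, correct %) for a list of Result records."""
    total = len(results)
    if total == 0:
        raise ValueError("no results to score")
    understood = sum(r.understood for r in results)
    correct = sum(r.correct for r in results)
    return 100 * understood / total, 100 * correct / total


def score_by_category(results):
    """Return {category: (understood %, correct %)} for categories with data."""
    scores = {}
    for category in CATEGORIES:
        subset = [r for r in results if r.category == category]
        if subset:
            scores[category] = score(subset)
    return scores


def point_change(old_pct, new_pct):
    """Change between two test rounds, in percentage points."""
    return new_pct - old_pct
```

Under these assumptions, 600 correct answers out of 800 questions scores 75%, and `point_change(66, 75)` gives the 9 percentage point gain between the April and December tests. Note that this is a difference in percentage points, not a 9% relative increase.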
Disclaimer: We actively write about the themes in which we invest: artificial intelligence, robotics, virtual reality, and augmented reality. From time to time, we will write about companies that are in our portfolio. Content on this site including opinions on specific themes in technology, market estimates, and estimates and commentary regarding publicly traded or private companies is not intended for use in making investment decisions. We hold no obligation to update any of our projections. We express no warranties about any estimates or opinions we make.