At its WWDC 2017 keynote on Monday, Apple showed off the fruits of its AI research labors. We saw a Siri assistant that’s smart enough to interpret your intentions, an updated Metal 2 graphics suite designed for machine learning and a Photos app that can do everything its Google rival does without an internet connection. Being at the front of the AI pack is a new position for Apple to find itself in. Despite setting off the AI arms race when it introduced Siri in 2011, Apple has long lagged behind its competitors in this field. It’s amazing what a year of intense R&D can do.
Well, technically, it’s been three years of R&D, but Apple had a bit of trouble getting out of its own way for the first two. See, back in 2011, when Apple released the first version of Siri, the tech world promptly lost its mind. “Siri is as revolutionary as the Mac,” the Harvard Business Review crowed, though CNN found that many people feared the company had unwittingly invented Skynet v1.0. But as revolutionary as Siri appeared to be at first, its luster quickly wore off once the general public got ahold of it and recognized the system’s numerous shortcomings.
Fast forward to 2014. Apple is at the end of its rope with Siri’s listening and comprehension issues. The company realizes that minor tweaks to Siri’s processes can’t fix its underlying problems and a full reboot is required. So that’s exactly what it did. The original Siri relied on hidden Markov models, a statistical tool used to model time series data (essentially reconstructing the sequence of states in a system based only on the output data), to recognize temporal patterns in handwriting and speech.
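To make "reconstructing the sequence of states based only on the output data" concrete, here is a minimal sketch of Viterbi decoding, the standard way to recover the most likely hidden states of a Markov model. It is a toy, not Siri's recognizer: the two hidden sounds, the three acoustic labels and every probability below are made up for illustration.

```python
# Toy hidden Markov model. The hidden states stand in for spoken sounds and
# the observations for noisy acoustic labels. All names and numbers are
# hypothetical and chosen only to keep the arithmetic easy to follow.
states = ["s", "i"]
start_p = {"s": 0.6, "i": 0.4}
trans_p = {
    "s": {"s": 0.7, "i": 0.3},
    "i": {"s": 0.4, "i": 0.6},
}
emit_p = {
    "s": {"hiss": 0.5, "hum": 0.4, "tone": 0.1},
    "i": {"hiss": 0.1, "hum": 0.3, "tone": 0.6},
}


def viterbi(observations):
    """Return the most probable hidden-state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               state that path was in at time t - 1)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(column)

    # Start from the most likely final state and follow the back-pointers.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return list(reversed(path))


print(viterbi(["hiss", "hum", "tone"]))  # ['s', 's', 'i']
```

The decoder never sees the hidden sounds directly. It only sees the labels and picks the state sequence that best explains them, which is the job the hidden Markov models did for early speech recognizers.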
The company replaced and supplemented these models with a variety of machine learning techniques including Deep Neural Networks and “long short-term memory networks” (LSTMNs). These neural networks are effectively more generalized versions of the Markov model. However, because they possess memory and can track context, as opposed to simply learning patterns as Markov models do, they’re better equipped to understand nuances like grammar and punctuation to return a result closer to what the user really intended.
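The "memory" in a long short-term memory network comes from a cell state that gates decide to keep, overwrite or expose at each step. The following is a rough single-unit sketch of the standard LSTM update, assuming scalar inputs and hand-picked weights (the `WEIGHTS` values and the `run` helper are hypothetical; a real network learns its weights and uses vectors and matrices). It is not Apple's model.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a single-unit LSTM cell with a scalar input."""
    f = sigmoid(w["f_x"] * x + w["f_h"] * h_prev + w["f_b"])    # forget gate
    i = sigmoid(w["i_x"] * x + w["i_h"] * h_prev + w["i_b"])    # input gate
    o = sigmoid(w["o_x"] * x + w["o_h"] * h_prev + w["o_b"])    # output gate
    g = math.tanh(w["g_x"] * x + w["g_h"] * h_prev + w["g_b"])  # candidate value
    c = f * c_prev + i * g      # keep part of the old memory, add new content
    h = o * math.tanh(c)        # expose part of the memory as the output
    return h, c


# Hand-picked, hypothetical weights: the forget gate stays almost fully open
# (so memory persists) and the input gate opens only when x is large.
WEIGHTS = {
    "f_x": 0.0, "f_h": 0.0, "f_b": 5.0,
    "i_x": 10.0, "i_h": 0.0, "i_b": -5.0,
    "o_x": 0.0, "o_h": 0.0, "o_b": 5.0,
    "g_x": 5.0, "g_h": 0.0, "g_b": 0.0,
}


def run(inputs, w=WEIGHTS):
    """Feed a sequence through the cell and return the final output."""
    h, c = 0.0, 0.0
    for x in inputs:
        h, c = lstm_step(x, h, c, w)
    return h


print(run([0, 0, 0, 0]))  # nothing was ever stored
print(run([1, 0, 0, 0]))  # the first input is still remembered three steps later
```

The two sequences end with identical inputs, yet the outputs differ because the cell state carried the first input forward. That carried context is what a plain Markov model, which looks only at the current state, does not have.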
The new system quickly spread beyond Siri. As Steven Levy points out, “You see it when the phone identifies a caller who isn’t in your contact list (but who did email you recently). Or when you swipe on your screen to get a shortlist of the apps that you are most likely to open next. Or when you get a reminder of an appointment that you never got around to putting into your calendar.”
By the WWDC 2016 keynote, Apple had made some solid advancements in its AI research. “We can tell the difference between the Orioles who are playing in the playoffs and the children who are playing in the park, automatically,” Apple senior vice president Craig Federighi told the assembled crowd.
During WWDC 2016, the company also released its neural network API, Basic Neural Network Subroutines (BNNS), an array of functions enabling third-party developers to construct neural networks for use on devices across the Apple ecosystem.
However, Apple had yet to catch up with the likes of Google and Amazon, both of which had either already released an AI-powered smart home companion (looking at you, Alexa) or were just about to (Home would be released that November). This was due in part to the fact that Apple faced severe difficulties recruiting and retaining top AI engineering talent because it steadfastly refused to allow its researchers to publish their findings. That’s not so surprising coming from a company so famous for its tight-lipped R&D efforts that it once sued a news outlet because a drunk engineer left a prototype phone in a Palo Alto bar.
“Apple is off the scale in terms of secrecy,” Richard Zemel, a professor in the computer science department at the University of Toronto, told Bloomberg in 2015. “They’re completely out of the loop.” The level of secrecy was so severe that new hires to the AI teams were reportedly directed not to announce their new positions on social media.
“There’s no way they can just observe and not be part of the community and take advantage of what is going on,” Yoshua Bengio, a professor of computer science at the University of Montreal, told Bloomberg. “I believe if they don’t change their attitude, they will stay behind.”
Luckily for Apple, those attitudes did change, and quickly. After buying Seattle-based machine learning startup Turi for around $200 million in August 2016, Apple hired AI expert Russ Salakhutdinov away from Carnegie Mellon University that October. It was his influence that finally pushed Apple’s AI out of the shadows and into the light of peer review.
In December 2016, while speaking at the Neural Information Processing Systems conference in Barcelona, Salakhutdinov stunned his audience when he announced that Apple would begin publishing its work, going so far as to display an overhead slide reading, “Can we publish? Yes. Do we engage with academia? Yes.”
Later that month Apple made good on Salakhutdinov’s promise, publishing “Learning from Simulated and Unsupervised Images through Adversarial Training.” The paper looked at the shortcomings of using simulated objects to train machine vision systems. It showed that while simulated images are easier to train with than photographs, since they come with labels already attached, the results don’t work particularly well in the real world. Apple’s solution employed a deep-learning technique based on Generative Adversarial Networks (GANs), which pitted two neural networks against one another: a “refiner” network that tries to make simulated images look photo-realistic and a “discriminator” network that tries to tell the refined images from real photos. This way, researchers can exploit the ease of training networks using simulated images without the drop in performance once those systems are out of the lab.
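For readers who want the adversarial idea in symbols, the textbook GAN objective is a two-player game between an image-producing network G and a discriminator D. This is the general formulation, not necessarily the exact loss used in Apple's paper, and the symbols are the conventional ones rather than anything taken from this article: x is a real image, z is the input handed to G (random noise in the textbook case, a simulated image in a setup like Apple's), and D outputs the probability that its input is real.

```latex
\min_{G} \max_{D} \;
  \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]
```

D is rewarded for scoring real images high and generated ones low, while G is rewarded for pushing D's score on its own output toward "real." Training alternates between the two until the generated images are hard to tell apart from photographs.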
In January 2017, Apple further signaled its seriousness by joining Amazon, Facebook, Google, IBM and Microsoft in the Partnership on AI. This industry group seeks to establish ethics, transparency and privacy guidelines in the field of AI research while promoting research and cooperation between its members. The following month, Apple drastically expanded its Seattle AI offices, renting a full two floors at Two Union Square and hiring more staff.
“We’re trying to find the best people who are excited about AI and machine learning — excited about research and thinking long term but also bringing those ideas into products that impact and delight our customers,” Apple’s director of machine learning Carlos Guestrin told GeekWire.
By March 2017, Apple had hit its stride. Speaking at the EmTech Digital conference in San Francisco, Salakhutdinov laid out the state of AI research, discussing topics ranging from using “attention mechanisms” to better describe the content of photographs to combining curated knowledge sources like Freebase and WordNet with deep-learning algorithms to make AI smarter and more efficient. “How can we incorporate all that prior knowledge into deep-learning?” Salakhutdinov said. “That’s a big challenge.”
That challenge could soon be a bit easier once Apple finishes developing the Neural Engine chip that it was reported this May to be working on. Unlike Google devices, which shunt the heavy computational lifting required by AI processes up to the cloud where it is processed on the company’s Tensor Processing Units, Apple devices have traditionally split that load between the onboard CPU and GPU.
This Neural Engine will instead handle AI processes as a dedicated standalone component, freeing up valuable processing power on the CPU and GPU. This would not only save battery life by diverting load from the power-hungry GPU but also boost the device’s onboard AR capabilities and help further advance Siri’s intelligence, potentially exceeding the capabilities of Google’s Assistant and Amazon’s Alexa.
But even without the added power that a dedicated AI chip can provide, Apple’s recent advancements in the field have been impressive to say the least. In the span between two WWDCs, the company managed to release a neural network API, drastically expand its research efforts, poach one of the country’s top minds in AI from one of the nation’s foremost universities, reverse two years of backwards policy, join the industry’s working group as a charter member and finally — finally — deliver a Siri assistant that’s smarter than a box of rocks. Next year’s WWDC is sure to be even more wild.