Keep It on the Chip
Alexa, Siri, and Google Assistant use the cloud to process voice commands. In a clever twist, the MIT chip handles much more of that processing itself, easing the burden on other components—and saving power. “Even if power consumption isn’t an issue, hardware accelerators can be useful for making devices simpler and lower cost,” says Michael Price, who designed the new chip. “If you can offload a difficult computation from the main processor, that processor doesn’t have to be as fast.”
That means manufacturers can use a less expensive processor. Cheaper is good, but increased efficiency is better, and Price set out to radically reduce the power required to drive voice-assistant features. Smartphones generally burn about 1 watt to process a single speech-recognition query, Price says. The system his team developed requires roughly 1/1,000 as much in the worst case, and some basic voice-processing functions sipped just 0.2 milliwatts, a figure 5,000 times better than the 1-watt benchmark.
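The arithmetic behind those figures is easy to check. A back-of-envelope sketch, using only the wattages quoted above:

```python
# Power figures quoted in the text, in watts.
smartphone_w = 1.0                       # ~1 W per speech-recognition query
chip_worst_case_w = smartphone_w / 1000  # worst case: about 1/1,000 as much
basic_function_w = 0.2e-3                # 0.2 mW for basic voice processing

print(round(smartphone_w / chip_worst_case_w))  # 1000
print(round(smartphone_w / basic_function_w))   # 5000, the "5,000x" claim
```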
The primary energy saving comes from making the chip more adept at recognizing speech. Instead of streaming audio over a web connection to a server, the processor converts speech to text locally; processing those queries as text consumes far less power. The chip is also more frugal about detecting speech in the first place: a low-power circuit listens for ambient noise to be interrupted by a voice, then wakes the primary system only when a voice command registers.
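That two-stage idea, a cheap always-on detector gating an expensive recognizer, can be sketched in a few lines. This is purely illustrative: it assumes a simple RMS-energy threshold as the low-power stage and a stand-in `recognizer` function, since the article doesn't describe the chip's actual circuit.

```python
import math

def rms_energy(frame):
    """Root-mean-square energy of one audio frame (a list of samples)."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def detect_voice(frame, threshold=0.1):
    """Low-power stage: flag a frame only if it is loud enough to
    plausibly contain speech. The threshold is illustrative."""
    return rms_energy(frame) > threshold

def process_audio(frames, recognizer):
    """Gate the expensive recognizer behind the cheap detector, so most
    frames (silence, ambient noise) never reach it."""
    for frame in frames:
        if detect_voice(frame):
            yield recognizer(frame)

# Example: only the loud frame wakes the (stand-in) recognizer.
quiet = [0.01] * 160
loud = [0.5] * 160
results = list(process_audio([quiet, loud, quiet],
                             recognizer=lambda f: "command"))
print(results)  # ['command']
```

The savings come from the fact that `detect_voice` is trivial compared with full recognition, and it runs on every frame while the recognizer runs on almost none.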
The research process included a counterintuitive finding. Price’s team tested three voice-detection circuits and found that the most power-hungry of them produced the greatest overall energy savings. Why? Because it registered fewer false positives than the others, which often woke the speech-processing circuitry after mistaking ambient noise for a voice command.
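The tradeoff is easy to see with a toy model: average draw is the always-on detector plus the recognizer weighted by how often it gets woken needlessly. All the numbers below are hypothetical, chosen only to show the shape of the effect:

```python
# Hypothetical milliwatt figures, not measurements from the MIT chip.
recognizer_mw = 1.0  # active recognizer draw while (needlessly) awake

def average_power(detector_mw, false_wake_fraction):
    """Average draw = always-on detector + recognizer weighted by the
    fraction of time false positives keep it running."""
    return detector_mw + false_wake_fraction * recognizer_mw

# A cheap but sloppy detector vs. a costlier, more accurate one.
cheap_detector = average_power(detector_mw=0.01, false_wake_fraction=0.30)
costly_detector = average_power(detector_mw=0.05, false_wake_fraction=0.05)
print(costly_detector < cheap_detector)  # True: the hungrier detector wins
```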
Its minuscule power requirement means you could see the chip providing voice-control capabilities in the next wave of smaller IoT devices with tiny batteries. That said, Price couldn’t comment on whether you’ll see the technology in consumer products anytime soon. MIT developed the chip specifically for battery-powered gadgets, but similar components could impact how plug-in devices such as the Amazon Echo and Google Home work.
“If an in-home device is doing speech recognition locally and that turns out to be a processing bottleneck, then our technology could be useful,” Price says.
Find Your Voice
The sound clips recorded by your Echo and Home stay on the companies’ servers even after they’re processed. Amazon and Google have excellent security and privacy protection records, but more on-device processing means less personal data stored in the cloud. Price says converting speech to text on the device before zinging it to a server would eliminate some info from the captured data, such as the speaker’s age, accent, gender, and ambient noise in the background.
“Of course, privacy is up to the system designer,” Price says. “There’s nothing stopping them from saving the audio on a device or transmitting it, even if the speech recognition is done locally.” True. But any technology that makes voice commands more efficient and less creepy is a good thing.