Qualcomm and Audio Analytic partnership: 2021 will be the year smartphones understand the acoustic world around us

The AI sound recognition revolution is here.

Today we announce that our ai3-nano™ software and Acoustic Scene Recognition AI technology are pre-validated and optimized to run in always-on, low-power mode on the Qualcomm® Sensing Hub. As part of the new Qualcomm® Snapdragon™ 888 5G Mobile Platform, the 2nd generation Qualcomm Sensing Hub was unveiled yesterday at the Snapdragon Tech Summit Digital 2020. As a result, smartphone OEMs can now create high-value benefits and features for consumers based on the phone knowing whether the user is in a chaotic, lively, calm, or boring acoustic environment.

One of the many capabilities OEMs can deliver based on this contextual acoustic information is adapting how phones behave when calls or notifications arrive. No more embarrassing moments when you’ve left a chaotic coffee shop and returned to the calm office, only for your phone to ring at maximum volume at the most inappropriate moment. And no more missing important calls because your phone is still on vibrate when you’re in a busy bar.
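As a rough illustration of that idea, the sketch below maps each of the four acoustic scene labels to a ringer setting. It is a minimal example under stated assumptions, not part of ai3-nano™ or any Qualcomm API: the `Scene`, `RingerProfile`, `PROFILES` and `profile_for` names, and the volume values, are hypothetical and chosen only to show how an OEM might act on a scene label.

```python
from dataclasses import dataclass
from enum import Enum


class Scene(Enum):
    """The four acoustic scene labels named in this announcement."""
    CHAOTIC = "chaotic"
    LIVELY = "lively"
    CALM = "calm"
    BORING = "boring"


@dataclass(frozen=True)
class RingerProfile:
    """Hypothetical ringer settings; not a real platform type."""
    volume: float  # 0.0 (silent) to 1.0 (maximum)
    vibrate: bool


# Assumed policy: louder and vibrating in noisy scenes, quieter in calm ones.
# The specific values are illustrative and would be tuned by the OEM.
PROFILES = {
    Scene.CHAOTIC: RingerProfile(volume=1.0, vibrate=True),
    Scene.LIVELY: RingerProfile(volume=0.7, vibrate=True),
    Scene.CALM: RingerProfile(volume=0.3, vibrate=False),
    Scene.BORING: RingerProfile(volume=0.2, vibrate=False),
}


def profile_for(scene: Scene) -> RingerProfile:
    """Return the ringer profile to apply for the detected scene."""
    return PROFILES[scene]
```

In this sketch, walking from a busy coffee shop (`Scene.CHAOTIC`) back into a quiet office (`Scene.CALM`) would lower the ring volume without the user touching the phone.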

Plus, for the very first time, smartphone users will not have to choose between a voice assistant and sound recognition. Our compact ai3-nano™ software takes up less than 40kB of space on the chipset, enabling it to run concurrently with the low-power audio subsystem within the Qualcomm® AI Engine and the Qualcomm Aqstic™ audio codec. Qualcomm Aqstic is a core part of the Qualcomm® Voice Assist technology, which supports voice assistants like Amazon Alexa, Google Assistant and more.

As well as always-on Acoustic Scene Recognition, we are also announcing further new applications for sound recognition on smartphones:

  • Media Tagging – Automatically tag the audio content of videos and photos to enable creative editing, social sharing, or easy retrieval. Quickly find that special moment where your child was laughing on holiday. Or edit and share creative content across social media that takes advantage of sound-related effects and filters applied when a guitar is played.
  • Sound+AR Gaming – Trigger funny video filters and effects based on the sounds you make for spontaneous silliness in video chats or games with friends and family. Old MacDonald Had a Farm will never be the same again.
  • Accessibility – Reliably recognize important sounds around users whose hearing is impaired. Alert users to the sound of danger, like smoke and CO alarms, or offer that helping hand around the house by alerting them to a knock at the door.

As with all of Audio Analytic’s technology, these applications run on-device. This means that no information is sent to the cloud for analysis, so consumers can take advantage of cool new features without worrying about their privacy. Plus, thanks to its ultra-compact code footprint, Audio Analytic’s embedded software can support multiple use cases without draining the battery.

Audio Analytic CEO Dr Chris Mitchell commented:

“Sound recognition is the most exciting branch of artificial intelligence right now. As humans, we make sense of the world around us through sound, and by empowering machines with a human-like sense of hearing we’re enabling the next wave of innovation on smartphones.”

“Our ability to deliver such breathtaking performance in such a small footprint is the result of our cutting-edge research and development in the field of sound recognition. This achievement builds on our complete expertise in machine learning: optimizing models based on a large amount of high-quality, diverse data (our Alexandria™ dataset contains 30m labelled recordings across 1,000 sound classes), developing acoustically-smart augmentation techniques, designing new DNN network architectures, creating new training methods such as a patented loss function framework built around the unique characteristics of sound and much, much more.”