Arm Blueprint blog | Dr Chris Mitchell, Audio Analytic
29 Jul 2021
Founder and CEO of Audio Analytic, Dr Chris Mitchell, sets out the three key challenges facing AI and machine learning: performance, compactness and data provenance.
Through artificial intelligence (AI), we offer consumers a world where the products that they buy are more helpful. That helpfulness manifests itself in many ways, from voice assistants offering more natural user interfaces to autonomous vehicles that will one day get us from A to B safely. Nearly every electronic product that we use can benefit from some form of machine learning (ML) intelligence that delivers tangible value to consumers.
However, for the industry to deliver on the promise of AI, there are three key challenges facing companies big and small:
- AI must perform well to satisfy the users’ high expectations
- AI must be compact because irrespective of the target hardware, being optimized to minimize power, memory and computational load unlocks wider value
- AI must be built on data that has been collected ethically and legally to reduce serious legal and reputational risks both now and in the future
Audio Analytic has established itself as the leader in the AI field of sound recognition. We were set up to solve the specific challenge of recognizing the broader world of sounds beyond just speech and music, and to bring that capability to consumer devices. Because we started in an applied context from day one, rather than from academic results, we discovered and solved a number of the challenges that the wider AI community is only starting to awaken to.
1. Performance
AI has moved along the hype cycle and is no longer the plaything of early adopters. In the past, simply having AI seemed to matter more than the value it delivered. That is no longer enough. You can’t explain to a consumer that their device didn’t recognize an image because the lighting wasn’t right. In the same way, a device can’t confuse the sound of a baby crying with that of an ambulance approaching.
There is now an expectation from consumers that ‘AI’ is mature. They aren’t going to accept poor performance, and they aren’t going to change the way they use products to adapt to the way the AI wants to work.
For me, performance is the combination of two factors – accuracy and robustness. AI has to be accurate (high true positives and low false positives), but it also has to deliver that performance in all environments in which consumers will use it. You can’t train a model only using data from your Silicon Valley parking lot when consumers expect the end product to work in New York, Cambridge, Rome, London or Beijing. This is why robustness is a critical component of performance.
The ability to deliver high performance is impacted by two things deep within the ML Pipeline: the data you use and the training methods you deploy.
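To make the distinction between accuracy and robustness concrete, here is a minimal sketch in Python. It computes the true positive rate and false positive rate separately for each recording environment, then reports the worst case across environments. The function names, environment labels and sample records are all hypothetical and chosen only for illustration; they are not taken from any Audio Analytic tooling.

```python
from collections import defaultdict


def rates_by_environment(records):
    """Compute true/false positive rates per environment.

    records: iterable of (environment, actual, predicted) tuples, where
    actual and predicted are booleans (was the sound present / detected).
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for environment, actual, predicted in records:
        if actual:
            key = "tp" if predicted else "fn"
        else:
            key = "fp" if predicted else "tn"
        counts[environment][key] += 1

    rates = {}
    for environment, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        rates[environment] = {
            # None marks a rate that cannot be computed from the data.
            "tpr": c["tp"] / positives if positives else None,
            "fpr": c["fp"] / negatives if negatives else None,
        }
    return rates


def worst_case(rates):
    """Robustness view: lowest TPR and highest FPR over all environments."""
    tprs = [r["tpr"] for r in rates.values() if r["tpr"] is not None]
    fprs = [r["fpr"] for r in rates.values() if r["fpr"] is not None]
    return {"tpr": min(tprs, default=None), "fpr": max(fprs, default=None)}


# Hypothetical evaluation results: (environment, actual, predicted).
sample_records = [
    ("parking_lot", True, True),
    ("parking_lot", True, True),
    ("parking_lot", False, False),
    ("parking_lot", False, False),
    ("city_street", True, True),
    ("city_street", True, False),
    ("city_street", False, True),
    ("city_street", False, False),
]

per_environment = rates_by_environment(sample_records)
print(per_environment)
print(worst_case(per_environment))
```

A model that looks accurate when all records are pooled can still fail in one environment, which is why the sketch reports the worst case rather than an average.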
2. Compactness
Even with powerful cloud supercomputers and dedicated NPU chips, designing compact AI systems is still critical to success. The cloud isn’t suitable for every AI application. Consumers aren’t comfortable with their data leaving their devices and being stored or analyzed in the cloud, and in some cases, it isn’t practical. Take sound recognition, for example: it doesn’t have a wake word equivalent, so you’d have to stream audio 24×7 to a server. This doesn’t fit with a privacy-conscious consumer base and negatively impacts battery life and connection requirements.
Consumers have to rely on the AI without it draining their battery or coming at the cost of other features. For example, you can’t ask consumers to switch off their voice assistant to benefit from sound recognition – that’s just a poor experience. If they want both, they should be able to access both.
Whether device manufacturers are deploying AI on a dedicated NPU chip or on a Cortex-M or Cortex-A processor, space is always at a premium. Well-trained models and optimized inference software mean that ‘TinyML’ can bring benefits without the burden on computation, memory or power. What’s more, manufacturers can deliver multiple AI-based capabilities to consumers.
As with ‘performance’, the key to compactness also lies deep within the ML Pipeline with data and training. Yes, hardware acceleration can offer some improvements, as can quantization, but the real gains in compactness come from optimized architectures, models and large amounts of relevant training and evaluation data.
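For readers unfamiliar with quantization, the following is a minimal sketch of one common form, symmetric 8-bit linear quantization, written in plain Python. It is an illustration under stated assumptions rather than a description of any particular product: the function names and the example weights are hypothetical, and a single scale factor is used for the whole list of weights.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the quantized integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, for illustration only.
weights = [0.5, -1.0, 0.25, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Storage estimate: 4 bytes per float32 weight versus 1 byte per int8
# weight, plus 4 bytes for the single float32 scale factor.
bytes_float32 = 4 * len(weights)
bytes_int8 = len(weights) + 4

print(quantized, scale)
print(restored)
print(bytes_float32, bytes_int8)
```

The saving approaches a factor of four as the number of weights grows, at the cost of a small rounding error per weight. That is a fixed, one-off reduction, which fits the point above that the larger gains come from the architecture, the model and the data.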
3. Data provenance
The number one critical ingredient to ML is data, and the well-known concept of ‘garbage-in, garbage-out’ is especially apposite. Training and evaluation data needs to be devoid of bias and needs to be representative of the target application.
For example, in sound recognition, the challenge is that the human perception of sound and a machine’s perception of sound are very different. We understand sounds as concepts, so a poor recording of a car horn is still recognizable to a human as a car horn. To a machine, a recording from a phone, compressed and uploaded to YouTube and then played through a consumer-grade speaker isn’t the same thing at all.
As well as poor data leading to poor performance, a neglectful approach to data collection exposes everybody to significant legal and reputational risks. Whether you are designing the ML system yourself or licensing it in from another company, you need to know where the data originates. That detailed traceability enables you to deliver exceptional performance levels because you know whether your data is ethical, diverse, and representative. It also enables you to demonstrate that you have the necessary permission and consent to use the data in commercial applications.
Regulators worldwide are now focusing more on the issue of data collection and consent for ML. As an industry, we should expect to answer questions over whether we have permission to use data and whether it infringes somebody else’s intellectual property.
It all comes down to traceability. Anybody building commercial AI systems will need to connect their model with the data used to train it and the relevant consent or licenses to show that they have permission to use it for ML tasks.
Those who can’t and don’t will find themselves facing significant fines and long-term reputational damage. Two of my colleagues recently wrote whitepapers on the impact of poor data. The first whitepaper looks at the technical risks, while the second looks at the legal and reputational risks. If you want to understand the subject in more detail, they are both a great place to start.
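The traceability described above, connecting a model to its training data and to the consent or licenses behind that data, can be sketched as a simple data structure. This is a hypothetical illustration in Python: the class names, field names and sample records are assumptions made for the example and do not represent any specific company's system or any regulatory requirement.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DataRecord:
    """One item of training data and the evidence of permission to use it."""
    record_id: str
    source: str                 # where the recording came from
    license_id: Optional[str]   # None means no documented license
    consent_for_ml: bool        # was consent given for ML use specifically?


@dataclass
class ModelManifest:
    """Links a trained model to every record used to train it."""
    model_name: str
    training_records: List[DataRecord]

    def untraceable_records(self) -> List[str]:
        """Return IDs of records lacking a license or consent for ML use."""
        return [
            record.record_id
            for record in self.training_records
            if record.license_id is None or not record.consent_for_ml
        ]


# Hypothetical manifest, for illustration only.
manifest = ModelManifest(
    model_name="example-sound-model",
    training_records=[
        DataRecord("rec-001", "commissioned recording", "LIC-42", True),
        DataRecord("rec-002", "downloaded clip", None, False),
    ],
)

print(manifest.untraceable_records())
```

Keeping the license and consent fields on each record, rather than on the dataset as a whole, means a single problematic recording can be identified and removed without discarding everything else.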
To succeed in the era of intelligent computing, you have to conquer all three of these challenges. The result is reliable technology that underpins new experiences and value in every facet of consumers’ lives.