Four principles for deploying AI responsibly

Telecoms.com periodically invites expert third parties to share their views on the industry’s most pressing issues. In this piece Alejandro Saucedo, Machine Learning Engineering Director at Seldon, looks at best practice for the use of artificial intelligence.

Artificial Intelligence (AI) is in the process of transforming every industry, with over one in three organisations now having AI in widespread or limited production. But as is the case with any technology, AI comes with substantial economic and social risks such as the proliferation of unethical biases, the dilution of accountability, and data privacy violations.

In order to avoid these risks and deploy AI responsibly, the onus is on both regulators and industry to develop processes and standards for the practitioners and users working with the technology. To this end, the team at the Institute for Ethical AI & Machine Learning has put together principles for responsible AI, so that practitioners can embed them by design in the infrastructure and processes surrounding production AI and machine learning systems.

This article provides a breakdown of four of the eight principles: bias evaluation, explainability, human augmentation, and reproducibility.

Bias evaluation

In a sense, AI models carry inherent biases, as they are designed to discriminate in favour of the relevant answers. That’s because at the heart of intelligence is the ability to recognise and act on patterns we see in the world. In developing AI models, we seek to replicate this ability, encouraging them to spot patterns in the data they’re fed and develop biases accordingly. For example, a model that looks at the chemical data of proteins would inherently learn a bias towards those with structures that can fold a certain way, so as to identify which are useful in medical applications.

So we should be careful when speaking out against AI bias. What we usually mean by AI bias is in fact undesired or unjustifiable bias, such as bias that discriminates on the grounds of protected characteristics like race, sexuality, or nationality.

But why would an AI model develop an unethical bias? The answer comes down to the data it’s fed. Models reflect the biases present in the data they’re trained on, so if training data is unrepresentative or incorporates pre-existing biases, the resulting model will reproduce them. As they say in computer science: garbage in, garbage out.

Teams must also create processes and procedures to identify any undesirable biases in an AI’s training data, in the training and assessment of the model, and in the model’s operational lifecycle. A good example to follow if you are deploying AI is the eXplainable AI Framework from the Institute for Ethical AI & Machine Learning, which is covered in more detail in the following section.
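One simple check such a process might include is comparing how often a model returns a favourable outcome for each group. The following is a minimal sketch, not part of any named framework: the function names, the "group" and "approved" fields, and the sample decisions are all hypothetical and invented for illustration.

```python
from collections import defaultdict


def selection_rates(records, group_key, outcome_key):
    """Return the share of positive outcomes for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in records:
        group = row[group_key]
        totals[group] += 1
        if row[outcome_key]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def disparity_ratio(rates):
    """Ratio of the lowest to the highest group rate (1.0 means parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Hypothetical model decisions, invented purely for illustration.
decisions = [
    {"group": "A", "approved": True},
    {"group": "A", "approved": True},
    {"group": "A", "approved": False},
    {"group": "A", "approved": True},
    {"group": "B", "approved": True},
    {"group": "B", "approved": False},
    {"group": "B", "approved": False},
    {"group": "B", "approved": False},
]

rates = selection_rates(decisions, "group", "approved")
ratio = disparity_ratio(rates)
print(rates, ratio)
```

A low ratio does not prove that a bias is unjustified, but it flags where domain experts should look more closely at the data and the model.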

Explainability

In order to ensure that an AI model is fit for purpose, it’s also important to involve the relevant domain experts. Such individuals can help teams ensure that a model is evaluated against the right performance metrics, going beyond purely statistical, accuracy-driven measures. It’s worth emphasising that the term domain experts does not only cover technical experts, but also social science and humanities experts relevant to the use case.

For this to work, though, it is also important to ensure that the predictions of the model can be interpreted by the relevant domain experts. However, advanced AI models often use state-of-the-art deep learning techniques that may not make it simple to explain why a specific prediction was made.

To combat this difficulty, organisations can draw on a range of machine learning explainability techniques and tools to decipher the predictions of AI models. A comprehensive list of these tools and techniques can be reviewed here.
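As one illustration of what such a technique can look like, the sketch below implements permutation importance: shuffle one input feature at a time and measure how much the model’s accuracy drops. The article does not single out this technique, and the toy model, data, and function names here are assumptions chosen only to keep the example self-contained.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the given label."""
    correct = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return correct / len(rows)


def permutation_importance(model, rows, labels, n_features, repeats=20, seed=0):
    """Average drop in accuracy when each feature column is shuffled."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            shuffled = [
                row[:j] + [value] + row[j + 1:]
                for row, value in zip(rows, column)
            ]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / repeats)
    return importances


def toy_model(row):
    """Hypothetical model that only looks at the first feature."""
    return 1 if row[0] > 0.5 else 0


rows = [[0.1, 5.0], [0.9, 1.0], [0.2, 3.0], [0.8, 4.0], [0.3, 2.0], [0.7, 6.0]]
labels = [toy_model(row) for row in rows]

importances = permutation_importance(toy_model, rows, labels, n_features=2)
print(importances)
```

Because the toy model ignores the second feature, shuffling it never changes a prediction and its importance is zero, which is the kind of signal a domain expert can sanity-check against their own understanding.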

After explainability comes the operationalisation of an AI model, when it is overseen and monitored by the relevant stakeholders. The lifecycle of a model only truly begins once it is correctly deployed in production. Only once it is up and running can a model suffer a drop in performance as external pressures are placed upon it, be it concept drift or a change in the environment in which the model operates.
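One way to watch for this kind of change is to compare the distribution of an input feature in production against the data the model was trained on. The sketch below uses the Population Stability Index, a technique the article itself does not name. The bin edges, sample values, and the 0.2 alert threshold are illustrative assumptions, not recommended settings.

```python
import math


def bin_shares(values, edges, floor=1e-6):
    """Share of values in each bin; values outside the edges are not counted.

    Empty bins are floored at a small value so the logarithm stays defined.
    """
    counts = [0] * (len(edges) - 1)
    for value in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            in_bin = edges[i] <= value < edges[i + 1]
            if in_bin or (is_last and value == edges[-1]):
                counts[i] += 1
                break
    return [max(count / len(values), floor) for count in counts]


def population_stability_index(reference, live, edges):
    """Sum over bins of (live - reference) * ln(live / reference)."""
    ref_shares = bin_shares(reference, edges)
    live_shares = bin_shares(live, edges)
    return sum(
        (cur - ref) * math.log(cur / ref)
        for ref, cur in zip(ref_shares, live_shares)
    )


# Hypothetical feature values seen at training time and in production.
reference = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.15]
live = [0.7, 0.8, 0.9, 0.85, 0.95, 0.6, 0.75, 0.8, 0.9, 0.65]
edges = [0.0, 0.25, 0.5, 0.75, 1.0]

ALERT_THRESHOLD = 0.2  # assumed threshold, chosen only for this example
psi = population_stability_index(reference, live, edges)
drift_detected = psi > ALERT_THRESHOLD
print(psi, drift_detected)
```

A check like this only detects a shift in the inputs. Whether the shift actually harms the model still needs to be confirmed against fresh labelled outcomes.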

Human augmentation

When deploying AI, it’s vital to first assess the current requirements of the original non-automated process, including outlining the risks of undesirable outcomes. This will then allow for a deeper understanding of the process and help to identify areas where human intervention may be needed to mitigate risk.

For instance, an AI that recommends meal plans to professional athletes has far fewer high-impact risk factors than an AI model that automates the backend loan approval process for a bank, suggesting human intervention is less necessary for the former than the latter. When a team identifies potential risk points in AI workflows, they can then consider implementing a “human-in-the-loop” (HITL) review process.

HITL ensures that once a process is automated, there still exist various touchpoints where human intervention is needed to review outcomes, making it easier to provide a correction or undo a decision when necessary. This process can involve teams of both technologists and sector specialists (for example, an underwriter for a bank loan, or a nutritionist for a meal plan) to assess the decisions made by AI models and ensure they adhere to best practice.
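A review touchpoint of this kind can be as simple as a routing rule that decides which automated decisions go straight through and which are queued for a person. The sketch below is one possible shape for such a rule: the Decision fields, the 0.9 confidence threshold, and the example cases are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    case_id: str
    prediction: str
    confidence: float  # model's confidence in its prediction, from 0 to 1
    high_impact: bool  # flagged during the team's risk assessment


def route(decision, min_confidence=0.9):
    """Return 'human_review' for risky or uncertain decisions, else 'auto'."""
    if decision.high_impact or decision.confidence < min_confidence:
        return "human_review"
    return "auto"


# Hypothetical cases mirroring the meal plan and loan examples above.
cases = [
    Decision("meal-001", "plan_b", confidence=0.95, high_impact=False),
    Decision("meal-002", "plan_a", confidence=0.60, high_impact=False),
    Decision("loan-001", "approve", confidence=0.97, high_impact=True),
]

routes = {case.case_id: route(case) for case in cases}
print(routes)
```

Here every loan decision reaches an underwriter regardless of confidence, while meal plans are only escalated to a nutritionist when the model is unsure.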

Reproducibility

Reproducibility refers to the ability of teams to repeatedly run an algorithm on a data point and get the same result back each time. This is a core component of responsible AI as it is essential to ensure that a model’s previous predictions would be re-issued should a re-run be carried out at a later stage.

Naturally, reproducibility is tricky to achieve, largely because the outputs of AI models can vary with a number of background factors, such as:

The code used to compute the AI inference
The weights learned from the data used
The environment, infrastructure, and configuration that the code ran on
The inputs and input structure provided to the model

This is a complex problem, especially when an AI model is deployed at scale with a myriad of other tools and frameworks to consider. To this end, teams need to develop robust practices to help control for the above and implement tools to help improve reproducibility. As a starting point, many such tools can be found in this list here.
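One lightweight practice is to store a manifest alongside every prediction that captures the four factors listed above, so that a later re-run can be checked against it. The following is a minimal sketch under stated assumptions: the manifest fields, the function names, and the example values are hypothetical, and real weights would normally be hashed from a file rather than from a small Python list.

```python
import hashlib
import json
import platform
import sys


def sha256_of(obj):
    """Hash any JSON-serialisable object in a stable, order-independent way."""
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_manifest(code_version, weights, config, model_input):
    """Record the code, weights, environment, and input behind a prediction."""
    return {
        "code_version": code_version,
        "weights_sha256": sha256_of(weights),
        "config_sha256": sha256_of(config),
        "input_sha256": sha256_of(model_input),
        "environment": {
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
    }


# Hypothetical values, invented purely for illustration.
weights = [0.25, -1.5, 3.0]
config = {"threshold": 0.5, "batch_size": 1}
model_input = {"age": 42, "income": 30000}

manifest = build_manifest("v1.0.0", weights, config, model_input)
print(json.dumps(manifest, indent=2))
```

If a re-run produces a different manifest, the differing field points to which of the four factors changed.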

Key takeaways

With high-level principles like the above, industry can ensure that best practice is followed and AI is used responsibly. The adoption of principles like these is crucial to ensuring that AI fully lives up to its economic potential and doesn’t become a technology that disempowers the vulnerable, reinforces unethical biases, or erodes accountability. Instead, it can be a technology that we can use to drive growth, productivity, efficiency, innovation, and greater benefits for all.