The Institute for Ethical AI & Machine Learning

The Responsible Machine Learning Principles

A practical framework to develop AI responsibly

The 8 principles of responsible ML development provide a practical framework to support technologists when designing, developing or maintaining systems that learn from data.

If these principles resonate with you, we invite you to join the Ethical ML Network (BETA), and be part of a global network of leaders driving forward positive change in this area.

The Responsible Machine Learning Principles

The Responsible Machine Learning Principles are a practical framework put together by domain experts.
Their purpose is to provide guidance for technologists to develop machine learning systems responsibly.

1. Human augmentation

I commit to assess the impact of incorrect predictions and, when reasonable, design systems with human-in-the-loop review processes.

2. Bias evaluation

I commit to continuously develop processes that allow me to understand, document and monitor bias in development and production.

3. Explainability by justification

I commit to develop tools and processes to continuously improve transparency and explainability of machine learning systems where reasonable.

4. Reproducible operations

I commit to develop the infrastructure required to enable a reasonable level of reproducibility across the operations of ML systems.

5. Displacement strategy

I commit to identify and document relevant information so that business change processes can be developed to mitigate the impact on workers whose work is being automated.

6. Practical accuracy

I commit to develop processes to ensure my accuracy and cost metric functions are aligned to the domain-specific applications.

7. Trust by privacy

I commit to build and communicate processes that protect and handle data with stakeholders that may interact with the system directly and/or indirectly.

8. Data risk awareness

I commit to develop and improve reasonable processes and infrastructure to ensure data and model security are being taken into consideration during the development of machine learning systems.

Continue reading for more detail on each principle

1. Human augmentation

I commit to assess the impact of incorrect predictions and, when reasonable, design systems with human-in-the-loop review processes.

When introducing automation through machine learning systems, it's easy to forget the impact that wrong predictions can have in full end-to-end automation.

Technologists should understand the consequences of incorrect predictions, especially when automating critical processes that can have a significant impact on human lives (e.g. justice, health, transport, etc).

However this isn't limited to obvious critical use-cases - enabling subject-domain-experts as human-in-the-loop reviewers at the end of ML systems can have significant benefits.

Join the network

1. Human augmentation

What are some examples where I should look towards adding human-in-the-loop review processes?

Automatic prison sentence scrutiny

A fully end-to-end machine learning system that predicts prison sentences automatically is a classic example of a system that should be deployed carefully, ideally with a human-in-the-loop review. This applies especially when, as in this example, the inner workings of the model cannot be explained, an issue addressed in Principle #3 (Explainability by justification).

Fraud detection evaluation

Fraud detection is a perfect example where a human-in-the-loop process design should be considered necessary. Instead of removing humans from the process completely, a domain expert can be asked to verify some of the results from the model to ensure its performance is aligned with the objectives.

Often a partial automation (i.e. having 3 people instead of 50 performing a specific process) may still have significant value, and provide an extra layer of safety.
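One way to picture this kind of partial automation is a confidence-based router: confident predictions are handled automatically and everything else goes to a reviewer. The following is only a minimal sketch. The names `Decision` and `route_prediction`, and the 0.05 / 0.95 thresholds, are hypothetical illustrations and are not part of the principles themselves.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    transaction_id: str
    fraud_probability: float
    route: str  # "auto_approve", "auto_block" or "human_review"


def route_prediction(transaction_id: str, fraud_probability: float,
                     low: float = 0.05, high: float = 0.95) -> Decision:
    """Automate only the confident cases; send the rest to a reviewer.

    The thresholds are illustrative assumptions and would need to be
    chosen with domain experts for a real system.
    """
    if not 0.0 <= fraud_probability <= 1.0:
        raise ValueError("fraud_probability must be in [0, 1]")
    if fraud_probability <= low:
        route = "auto_approve"
    elif fraud_probability >= high:
        route = "auto_block"
    else:
        route = "human_review"
    return Decision(transaction_id, fraud_probability, route)


# Hypothetical model scores for three transactions.
scores = {"t1": 0.01, "t2": 0.50, "t3": 0.99}
decisions = [route_prediction(tid, p) for tid, p in scores.items()]
review_queue = [d.transaction_id for d in decisions if d.route == "human_review"]
print(review_queue)
```

Widening or narrowing the gap between the two thresholds is what trades reviewer workload against the safety margin.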

Temporary manual review process

When rolling out automation systems, the ultimate objective may be to fully automate a process end-to-end. However, when reasonable, it may be necessary to deploy the system with a human-in-the-loop review in place. The system's precision and recall can then be evaluated during a production period, and the process can be fully automated once its performance is deemed acceptable.

2. Bias evaluation

I commit to continuously develop processes that allow me to understand, document and monitor bias in development and production.

When building systems that have to make non-trivial decisions, we will always face the computational and societal bias that is inherent in data, which is impossible to avoid, but is possible to document and/or mitigate.

However we should take a step back from only trying to embed ethics directly into the algorithms themselves. Instead, technologists should focus on building processes & methods to identify & document the inherent bias in the data, features and inference results, and subsequently the implications of this bias.

Given that the implications of the bias identified are specific to the domain, and use-case of the technology, technologists should be able to create, identify and explain the bias in the data and features, so the right processes can be put in place to mitigate potential risks.

2. Bias evaluation

What are some examples where I should look towards having effective bias evaluation?

Pragmatic evaluation of bias

As a technologist it is important to obtain an understanding of how potential biases might arise. Once the different sub-categories for bias are identified it's possible to evaluate the results on a breakdown based on precision, recall and accuracy for each of the potential inference groups.
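A breakdown like the one described above could look roughly as follows. This is a sketch in plain Python; the function name `metrics_by_group` and the toy labels, predictions and group assignments are made up for illustration.

```python
from collections import defaultdict


def metrics_by_group(y_true, y_pred, groups):
    """Compute precision, recall and accuracy separately for each group."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1 and pred == 1:
            key = "tp"
        elif truth == 0 and pred == 1:
            key = "fp"
        elif truth == 1 and pred == 0:
            key = "fn"
        else:
            key = "tn"
        counts[group][key] += 1

    report = {}
    for group, c in counts.items():
        predicted_pos = c["tp"] + c["fp"]
        actual_pos = c["tp"] + c["fn"]
        total = sum(c.values())
        report[group] = {
            # None marks a metric that is undefined for this group.
            "precision": c["tp"] / predicted_pos if predicted_pos else None,
            "recall": c["tp"] / actual_pos if actual_pos else None,
            "accuracy": (c["tp"] + c["tn"]) / total,
        }
    return report


# Toy data: the model is perfect on group "a" but not on group "b".
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

report = metrics_by_group(y_true, y_pred, groups)
for group, metrics in sorted(report.items()):
    print(group, metrics)
```

The overall accuracy here is 0.75, which hides the fact that all of the errors fall on one group.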

Google's What-If Tool demo on income classification provides an interactive way to visualise and assess model and data bias - it's possible to see that "race" and "sex" are two of the strongest features.

Having "the right" datasets

Whether data comes from manual labelling, collection from a data source or generation through simulation, it is important to appreciate that obtaining representative and balanced datasets is a non-trivial task.

"Don't expect good data by boring the hell out of underpaid people" - wise words from the core spaCy team, and a solid wake-up call for technologists to make explicit efforts when obtaining or generating training and evaluation datasets.

Equity, equality and beyond

The deployment of a biased system can have the effect of reinforcing pre-existing societal bias (Professor Gina Neff provides an insight in her talk, "Does AI Have Gender?"). It is certainly possible for the system to be configured in such a way that it works towards reducing that bias.

However this is an extremely sensitive and complex issue. For example, do we want to configure the system for equality, or for equity? These decisions should not be taken lightly. For most (if not all) cases, the decision should go beyond the technologists themselves.

For reasons like this, this commitment encourages technologists to focus on identifying and documenting the biases present together with their potential impact. Ethical decisions should be considered together with the relevant industry stakeholders (ethics boards, regulatory bodies, etc).

3. Explainability by justification

I commit to develop tools and processes to continuously improve transparency and explainability of machine learning models where reasonable.

With the deep learning hype, technologists often throw large amounts of data into complex ML pipelines hoping something will work, without understanding how the pipelines work internally. However technologists should invest reasonable effort where necessary to continuously improve the tools and processes that allow them to explain results based on the features and models chosen.

It is possible to use different tools and approaches to make ML systems more explainable, such as by adding domain knowledge through features themselves instead of just allowing deep/complex models to infer them.

Even though in certain situations accuracy may decrease, the transparency and explainability gains may be significant.

3. Explainability by justification

What are some examples where I could get a better understanding on compliance by design?

Explainability through feature importance

Often the challenge of explainability can be simplified by reducing the scope of what needs to be explainable. On some occasions, it is possible to increase the explainability of the model by analysing the features and inference results.

Getting a better understanding of the importance of each feature for each result helps technologists explain the model itself. There are several tools that can help with this, including TensorBoard's What-If Tool, as well as "SHAP (SHapley Additive exPlanations)", which allow for an understanding of the effect of each feature.
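To give a feel for what feature importance means in practice, here is a minimal sketch of permutation importance written in plain Python. Note that this is a simpler, different technique from SHAP and does not use either of the tools mentioned above. The `toy_model`, the data and the function name are all hypothetical.

```python
import random


def permutation_importance(predict, rows, labels, n_features, seed=0):
    """Accuracy drop when one feature column is shuffled at a time.

    A large drop suggests the model relies on that feature; a drop of
    zero suggests the feature does not influence the predictions.
    """
    rng = random.Random(seed)

    def accuracy(data):
        return sum(predict(r) == y for r, y in zip(data, labels)) / len(labels)

    baseline = accuracy(rows)
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the label
        permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(baseline - accuracy(permuted))
    return importances


def toy_model(row):
    # Stand-in for a trained model: it only ever looks at feature 0.
    return 1 if row[0] > 0.5 else 0


data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [toy_model(r) for r in rows]

importances = permutation_importance(toy_model, rows, labels, n_features=2)
print(importances)
```

Feature 1 is ignored by the toy model, so shuffling it cannot change any prediction and its importance is exactly zero.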

Domain knowledge to increase explainability

Bons.ai has a great insight on explainability that shows how it is possible to introduce explainability even in very complex models by introducing domain knowledge.

Deep learning models are able to identify and abstract complex patterns that humans may not be able to see in data. However, there are many situations where, by introducing a-priori expert domain knowledge into the features, or by abstracting key patterns identified in the deep learning models as actual features, it is possible to break down the model into subsequent, more explainable pieces.

4. Reproducible operations

I commit to develop the infrastructure required to enable a reasonable level of reproducibility across the operations of ML systems.

Often production machine learning systems don't have the capabilities to diagnose or respond effectively when something bad happens with a model, let alone reproduce the same results.

In production systems, it is important to perform standard procedures, such as reverting a model to a previous version, or reproducing an input to debug a specific functionality, which introduces complexity in infrastructure.

There are tools and best practices for machine learning operations. These aid reproducibility of machine learning systems by providing ways to abstract computational graphs and archive data at each step of transformation pipelines. These should be adopted to provide a reasonable level of reproducibility of operations.

4. Reproducible operations

What are some examples to develop infrastructure that enables reproducibility?

Abstracting each computational step

In order to make a machine learning model reproducible, it is necessary to abstract its constituent components: namely 1) data, 2) configuration/environment, and 3) computational graph. If all these three points are abstracted, it is possible to have a basis for model reproducibility.
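As a rough illustration of abstracting these three components, each one can be reduced to a content hash and recorded in a manifest alongside the trained model. This is a sketch using only the standard library; `fingerprint`, `build_manifest` and the example inputs are hypothetical, and real tooling would hash actual files, environments and pipeline definitions.

```python
import hashlib
import json


def fingerprint(obj) -> str:
    """Stable SHA-256 hash of any JSON-serialisable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_manifest(data_rows, config, pipeline_steps):
    """Record one hash per component: data, configuration, computational graph."""
    manifest = {
        "data": fingerprint(data_rows),
        "config": fingerprint(config),
        "graph": fingerprint(pipeline_steps),
    }
    # The run id is derived from the three component hashes above.
    manifest["run_id"] = fingerprint(manifest)
    return manifest


data_rows = [[5.1, 3.5, 0], [6.2, 2.9, 1]]
config = {"learning_rate": 0.1, "seed": 7, "python": "3.11"}
pipeline_steps = ["normalise", "train_logistic_regression"]

first = build_manifest(data_rows, config, pipeline_steps)
second = build_manifest(data_rows, config, pipeline_steps)
changed = build_manifest(data_rows, {**config, "seed": 8}, pipeline_steps)

print(first == second)                       # identical inputs, identical manifest
print(changed["run_id"] == first["run_id"])  # a config change gives a new run id
```

Comparing manifests shows which of the three components changed between two runs, which is the starting point for reproducing or reverting one of them.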

Pachyderm has an excellent breakdown of how to abstract each computational step together with its components. Similarly, Seldon Core provides a flexible way to orchestrate the operations and serving of models in production.

Adopting Open Standards

It is often important to decide what the level of abstraction will be, as it is possible to focus on building very complex layers to abstract multiple machine learning libraries with specific data input/output formats.

There are multiple formats for trained machine learning models - the most popular include: Open Neural Network Exchange Format, Neural Network Exchange Format, and Predictive Model Markup Language.

5. Displacement strategy

I commit to identify and document relevant information so that business change processes can be developed to mitigate the impact on workers whose work is being automated.

When rolling out systems that automate medium to large-scale processes, there is almost always an impact at an organisational or industry level, which would affect multiple individuals.

As technologists we should look beyond the technology itself, and have initiative to support the necessary stakeholders so they can develop a change-management strategy when rolling out the technology.

Although often technologists themselves may not be leading the operational transformation, it is still important to make sure the processes are in place when relevant, irrespective of the type of work being automated (i.e. skilled or otherwise).

5. Displacement strategy

What are some examples where I should look towards developing displacement strategies?

Processes to reduce impact

There are currently a lot of articles covering the jobs being automated by AI (e.g. assembly line workers, field technicians, call center workers, etc), as well as technical articles providing insights on how to deploy machine learning models across production systems.

However the impact on the individuals who are part of the processes being automated is often forgotten. Fortunately business change has existed for a long time, and startups have been partnering with delivery partners such as the Big Three management consultancy firms. It is important for technologists to understand their potential impact, and subsequently the actions that can be taken to mitigate it.

Jevons paradox

A very interesting concept relevant to the current state of AI is the Jevons paradox. It describes how, during the industrial revolution, innovations allowed machines to produce the same output with less coal.

Intuitively, it was thought that the total coal required to power industry would therefore decrease. What happened instead is that, as the cost of performing the same action fell and became commoditised, more demand arose and total coal consumption actually increased. The rise of Excel and, in some areas, the rise of AI could be seen as analogous.

AI business change strategies

When planning the rollout of a new technology to automate a process, there are a number of people whose roles, or at least responsibilities, will be automated. If this is not taken into consideration, these people will not have a transition plan and it won't be possible to fully benefit from the time and resources gained from the automation.

Technologists should make sure they are able to raise the relevant concerns when business change or operational transformation plans are being set up, as this would make a significant positive impact in the rollout of the technology.

6. Practical accuracy

I commit to develop processes to ensure my accuracy and cost metric functions are aligned to the domain-specific applications.

When building systems that learn from data, it is important to obtain a thorough understanding on the underlying means to assess accuracy.

Often it is not enough to use plain accuracy or default/basic cost metrics, as what may be "correct" for a computer may be "wrong" for a human (and vice-versa).

Ensuring the right challenge is being addressed in the right way can be achieved by breaking down the implications of F1-score metrics from a domain-specific perspective, as well as exploring alternative cost functions based on domain knowledge.

6. Practical accuracy

What are some examples where I could understand practical accuracy use-cases?

Beyond accuracy

It is not uncommon for teams to get stuck on default accuracy targets, doing everything possible to increase percentages naively. It is important to go beyond accuracy, and understand the performance of the model.

There is a large toolbox of different approaches that can be used to aid us in finding the most suitable accuracy metrics to use. This includes core fundamentals, such as precision, recall, F1-score, learning curves, error bars, confusion matrices and beyond. Technologists should make sure they understand and apply the fundamentals at all times.
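A small worked example shows why accuracy alone can mislead. This is a sketch in plain Python: the helper names are hypothetical, and the data is a made-up imbalanced set where a model that always predicts the negative class still scores 95% accuracy.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn


def summarise(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# 95 negative cases and 5 positive cases; the "model" always predicts 0.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

summary = summarise(y_true, y_pred)
print(summary)
```

Accuracy is 0.95 while recall and F1 are both 0, because none of the five positive cases is ever found.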

Domain specific metrics

When tackling an industry or application-specific problem, technologists should make sure they question what the implications of different types of errors are, as well as what the right way of evaluating these errors should be.

In system critical situations, there may be constraints where some types of errors are less critical than others. Similarly, there is often a lot of domain knowledge that can be abstracted in the cost functions to understand what answers may be intuitively correct to humans and how to represent these as mathematical functions.
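One simple way to encode such domain knowledge is to assign a different cost to each type of error and choose the decision threshold that minimises the total cost. The following is a sketch only; the cost values (a missed positive costing 50 times a false alarm), the scores and the function names are hypothetical.

```python
def expected_cost(y_true, scores, threshold, cost_fn=50.0, cost_fp=1.0):
    """Total cost of errors when predicting positive for score >= threshold."""
    total = 0.0
    for truth, score in zip(y_true, scores):
        pred = 1 if score >= threshold else 0
        if truth == 1 and pred == 0:
            total += cost_fn  # missed positive (false negative)
        elif truth == 0 and pred == 1:
            total += cost_fp  # false alarm (false positive)
    return total


def best_threshold(y_true, scores, candidates, **costs):
    """Pick the candidate threshold with the lowest total cost."""
    return min(candidates,
               key=lambda th: expected_cost(y_true, scores, th, **costs))


y_true = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.3, 0.4, 0.6, 0.45, 0.9]
candidates = [0.3, 0.5, 0.7]

domain_choice = best_threshold(y_true, scores, candidates,
                               cost_fn=50.0, cost_fp=1.0)
symmetric_choice = best_threshold(y_true, scores, candidates,
                                  cost_fn=1.0, cost_fp=1.0)
print(domain_choice, symmetric_choice)
```

With symmetric costs the highest threshold wins; once a missed positive is treated as far more expensive, the lowest threshold becomes the better choice even though it produces more false alarms.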

7. Trust by privacy

I commit to build and communicate processes that protect and handle data with stakeholders that may interact with the system directly and/or indirectly.

When developing large-scale systems that learn from data, there are often a large number of stakeholders that may be affected directly and indirectly.

Building trust with relevant stakeholders is done not only by informing them what data is being held, but also through the processes around the data, as well as an understanding of why protecting the data is important.

Technologists should enforce privacy by design across systems, as well as continuous processes to build trust not only with users, but also relevant stakeholders such as procurement frameworks, operational users, and beyond.

7. Trust by privacy

What are some examples around building trust with stakeholders that interact with my models and systems?

Privacy at the right levels

One key way to establish trust with users and relevant stakeholders is by showing the right process and technologies are in place to protect personal data.

Uber's use of Differential Privacy is a prime example: they introduced a system that adds noise to query results, where the noise is relative to the level of granularity required by the query. This ensures that analysts still get access to the relevant datasets, whilst avoiding exposure of personal information.
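To illustrate the general idea of adding calibrated noise to a query result, here is a minimal sketch of the textbook Laplace mechanism applied to a counting query. It is not Uber's implementation; the function names, the epsilon value and the toy data are assumptions made for illustration.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon: float, rng: random.Random) -> float:
    """Noisy answer to 'how many values satisfy the predicate?'.

    Adding or removing one person changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is 1 / epsilon. A smaller
    epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for v in values if predicate(v))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)
trip_lengths_km = [1.2, 3.4, 8.9, 12.5, 0.7, 22.0, 5.5, 9.1]  # made-up data


def is_long_trip(km):
    return km > 5.0


noisy = private_count(trip_lengths_km, is_long_trip, epsilon=0.5, rng=rng)
print(round(noisy, 2))
```

Individual answers are perturbed, but averaged over many queries the noise cancels out, which is why aggregate analysis remains useful.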

Personal data via metadata

Technologists should make explicit effort to understand the potential implications of metadata involved, and whether the metadata can expose unexpected personal information from relevant users or stakeholders.

The Cambridge Analytica scandal is the most relevant example, and a good generalisation for similar situations. Direct and indirect users that interact with a system may give access to their data without realising, until it's too late, what private information could be extracted from the metadata.

8. Data risk awareness

I commit to develop and improve reasonable processes and infrastructure to ensure data and model security are being taken into consideration during the development of machine learning systems.

Autonomous decision-making systems open the doors to new potential security breaches.

More importantly, it is critical to be aware that a large percentage of security breaches occur due to human error as opposed to actual hacks (e.g. someone sending the dataset attached in an email by accident, or losing their laptop/phone).

Technologists should commit to prepare for both types of security risks through explicit efforts, such as educating relevant personnel, establishing processes around data, and assessing the implications of ML backdoors (such as adversarial attacks).

8. Data risk awareness

What are some examples where I should focus to become aware of potential risks in my data and models?

Adversarial patch tricking models

It is worth remembering that machine learning systems are ultimately functions: given the right inputs, it is possible to obtain a chosen output. Adversarial patches can be used to trick machine learning models into misclassifying examples by adding only a small amount of noise to the input. The AI Journal has a great video where they show how this could trick self-driving cars.

Security intelligence has a great write-up on this, as well as some suggestions on how to protect ourselves. As always with cybersecurity it is impossible to fully protect from attackers, but it's certainly possible to introduce processes that mitigate basic loopholes.
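The underlying weakness can be shown on a very small scale. The sketch below applies a gradient-sign style perturbation to a toy linear classifier. It is not a patch attack on an image model; the weights, input and epsilon are made up, and it only illustrates how a small, bounded change to every feature can flip a prediction.

```python
def score(weights, bias, x):
    """Linear decision score: positive means class 1."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(weights, bias, x):
    return 1 if score(weights, bias, x) > 0 else 0


def adversarial_example(weights, bias, x, epsilon):
    """Move each feature by at most epsilon towards the opposite class.

    For a linear model the gradient of the score with respect to the
    input is the weight vector, so the sign of each weight gives the
    direction that changes the score fastest.
    """
    direction = -1.0 if classify(weights, bias, x) == 1 else 1.0

    def sign(w):
        return (w > 0) - (w < 0)

    return [xi + direction * epsilon * sign(w) for xi, w in zip(x, weights)]


# Hypothetical model parameters and input.
weights = [0.5, -0.4, 0.3, -0.2]
bias = 0.0
x = [0.2, 0.1, 0.1, 0.1]

x_adv = adversarial_example(weights, bias, x, epsilon=0.1)
print(classify(weights, bias, x), classify(weights, bias, x_adv))
```

No feature moves by more than 0.1, yet the predicted class changes, which is why input validation and monitoring for unusual inputs are worth including in the processes mentioned above.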

Email sent to the wrong person

A very large percentage of data breaches are caused by simple human errors, such as sending the data to the wrong email address. Mimecast has an interesting article which points out this is the case with very sensitive data in healthcare.

It is important that technologists take into consideration the whole lifecycle of the machine learning algorithm, including the processes and infrastructure used to store the training data, accuracy results, documentation, trained model, orchestration of the model, inference results and beyond.