The Institute for Ethical AI & Machine Learning

THE ML ENGINEER 🤖

Issue #103

This week in Issue #103:

UK Data Strategy Consultation
Applying the MLOps Lifecycle
Break into NLP with Andrew Ng
NLP Applications Podcast
Uber on Scale Data Queries
Featured OSS Production ML Libraries
Awesome AI Guidelines to check out this week
+ more 🚀

If you would like to suggest articles, ideas, papers, libraries, jobs or events, or to provide feedback, just hit reply or email us at a@ethical.institute. We have received a lot of great suggestions in the past; thank you very much for your support!

UK Data Strategy Consultation

The UK's Department for Digital, Culture, Media & Sport launched a consultation to gather expert insights on the National Data Strategy. We contributed to this consultation through the Association for Computing Machinery's European Technology Policy Committee; our contribution is included in the full document outlining the comments collected from the committee.

Applying the MLOps Lifecycle

A great overview of the MLOps lifecycle, covering a simplified architecture of how the constituent components interact as a machine learning model evolves. It covers training, deployment and monitoring, as well as more specific needs such as scoping considerations and further terminology.

Break into NLP with Andrew Ng

NLP is an essential part of the practical application of AI. Andrew Ng and the Deeplearning.ai team have put together a panel of experts in the NLP field, who dive into their current projects and the future of NLP, and offer career advice for ML practitioners or non-MLEs hoping to break into NLP.

Uber on Scale Data Queries

Uber deals with complex, large-scale challenges that require real-time data queries across a broad range of distributed and varied datasets and datastores. In this blog post they cover how they leverage Apache Pinot to achieve low-latency analytical queries across the Uber marketplace ecosystem in multi-cluster, multi-data-store contexts.

NLP Applications Podcast

The Data Exchange podcast presents a conversation with Google AI Resident Jack Morris on Adversarial Attacks, Data Augmentation and Adversarial Training in the field of NLP.

OSS: Adversarial Robustness

The topic for this week's featured production machine learning libraries is Adversarial Robustness. We are currently looking for more libraries to add - if you know of any that are not listed, please let us know or feel free to open a pull request. The four featured libraries this week are:

AdvBox - generate adversarial examples from the command line with zero coding using PaddlePaddle, PyTorch, Caffe2, MxNet, Keras, and TensorFlow. Includes 10 attacks and 6 defenses. Used to implement StealthTshirt at DEFCON!
Foolbox - the second biggest adversarial library. Has an even longer list of attacks than ART, but no defenses or evaluation metrics. Geared more towards computer vision. Its code is easier to understand and modify than ART's, and it is also better for exploring blackbox attacks on surrogate models.
IBM Adversarial Robustness 360 Toolbox (ART) - at the time of writing, this is the most complete off-the-shelf resource for testing adversarial attacks and defenses. It includes a library of 15 attacks, 10 empirical defenses, and some nice evaluation metrics. Neural networks only.
CleverHans - a library for testing adversarial attacks and defenses, maintained by some of the most important names in adversarial ML, namely Ian Goodfellow (ex-Google Brain, now Apple) and Nicolas Papernot (Google Brain). Comes with some nice tutorials!

If you know of any libraries that are not in the "Awesome MLOps" list, please do give us a heads up or feel free to add a pull request!

OSS: Awesome AI Guidelines

As AI systems become more prevalent in society, we face bigger and tougher societal challenges. We have seen a large number of resources that aim to tackle these challenges in the form of AI Guidelines, Principles, Ethics Frameworks, etc. However, there are so many resources that the space is hard to navigate. Because of this, we started an Open Source initiative that aims to map the ecosystem and make it simpler to navigate. You can find multiple principles in the repo - some examples include the following:

AI & Machine Learning 8 principles for Responsible ML - The Institute for Ethical AI & Machine Learning has put together 8 principles for responsible machine learning that are to be adopted by individuals and delivery teams designing, building and operating machine learning systems
An Evaluation of Guidelines - The Ethics of AI Ethics: a research paper that analyses multiple ethics principles
ACM's Code of Ethics and Professional Conduct - This is the code of ethics that was put together in 1992 by the Association for Computing Machinery and updated in 2018
From What to How - An initial review of publicly available AI Ethics Tools, Methods and Research to translate principles into practices

If you know of any guidelines that are not in the "Awesome AI Guidelines" list, please do give us a heads up or feel free to add a pull request!