Subscribe to the Machine Learning Engineer Newsletter

Receive curated articles, tutorials and blog posts from experienced Machine Learning professionals.

Issue #51
This week in Issue #51:
If you would like to suggest articles, ideas, papers, libraries, jobs or events, or to provide feedback, just hit reply or send us an email! We have received a lot of great suggestions in the past; thank you very much for your support!
"AI Dungeon 2", a new dungeon-crawling game built using the GPT-2 model, has been very well received by the community. This "open-world" game lets you interact with the storyline by typing actions; each action is followed by a result that extends a story generated on the go. This is a very creative use of pre-trained language models, and an exciting one that could be worth exploring in many different industry applications.
Given easy-to-use machine learning libraries like scikit-learn and Keras, it is straightforward to fit many different machine learning models on a given predictive modeling dataset. The challenge of applied machine learning, therefore, becomes how to choose among the range of models available for your problem. Machine Learning Mastery has put together a great article covering what model selection is, the considerations involved and the techniques available.
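One common model selection technique is k-fold cross-validation: each candidate is trained on part of the data, scored on the held-out part, and the candidate with the lowest average error is chosen. The sketch below is a minimal, standard-library illustration of that idea and is not taken from the article. The toy dataset, the two candidate models and the helper names (fit_mean, fit_line, cv_mse) are all assumptions made for this example; in practice you would more likely reach for scikit-learn's cross_val_score.

```python
import random
from statistics import mean


def fit_mean(xs, ys):
    """Baseline candidate: always predict the mean of the training targets."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Second candidate: least-squares line y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    b = my - a * mx
    return lambda x: a * x + b


def cv_mse(fit, xs, ys, k=5):
    """Mean squared error averaged over k held-out folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Every k-th point (offset by the fold number) is held out for testing.
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        fold_errors.append(mean((model(xs[i]) - ys[i]) ** 2 for i in test_idx))
    return mean(fold_errors)


# Hypothetical toy data: y is roughly 2x + 1 plus Gaussian noise.
rng = random.Random(0)
xs = [i / 10 for i in range(50)]
ys = [2 * x + 1 + rng.gauss(0, 0.3) for x in xs]

# Score every candidate the same way, then keep the one with the lowest error.
candidates = {"mean_baseline": fit_mean, "least_squares_line": fit_line}
scores = {name: cv_mse(fit, xs, ys) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores)
```

Scoring every candidate on data it was not trained on is what makes the comparison fair; comparing training error alone would tend to favour the most flexible model.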
Code-review methodologies have brought robust development practices into software development. An exciting new project, ReviewNB, now extends existing frameworks to bring code-review functionality to Jupyter notebooks specifically. It is a visual diff for Jupyter notebooks, delivered as a GitHub app that communicates with the GitHub APIs directly and processes changes, which are then displayed as side-by-side diffs. A very exciting project, and certainly a space to keep an eye on.
Netflix's data-science team has open-sourced its Metaflow Python library, a key part of the 'human-centered' machine-learning infrastructure it uses for building and deploying data-science workflows. It's great to see tech giants contributing to open source, especially in areas that are currently progressing at breakneck speed, namely the intersection between data science, DevOps and software engineering.
Adversarial detection algorithms are growing in popularity due to rising concern about the exploitation of production machine learning models. The data science team at Seldon has put together a great tutorial outlining how to use Adversarial Variational Autoencoder Detection algorithms, specifically on the MNIST dataset (and more generally on image datasets).
The theme for this week's featured ML libraries is Data Science Notebooks. The four featured libraries this week are:
  • ML Workspace - All-in-one web IDE for machine learning and data science. Combines Jupyter, VS Code, TensorFlow, and many other tools/libraries into one Docker image.
  • Polynote - Polynote is an experimental polyglot notebook environment. Currently, it supports Scala and Python (with or without Spark), SQL, and Vega.
  • Stencila - Stencila is a platform for creating, collaborating on, and sharing data-driven content that is transparent and reproducible.
  • RMarkdown - The rmarkdown package is a next generation implementation of R Markdown based on Pandoc.
If you know of any libraries that are not in the "Awesome MLOps" list, please do give us a heads up or feel free to open a pull request.
As AI systems become more prevalent in society, we face bigger and tougher societal challenges. We have seen a large number of resources that aim to tackle these challenges in the form of AI Guidelines, Principles, Ethics Frameworks, etc.; however, there are so many resources that they are hard to navigate. Because of this, we started an Open Source initiative that aims to map the ecosystem to make it simpler to navigate. We will be showcasing three resources from our list so we can check them out every week. This week's resources are:
  • Oxford's Recommendations for AI Governance - A set of recommendations from Oxford's Future of Humanity Institute, which focus on the infrastructure and attributes required for efficient design, development, and research around the ongoing work of building and implementing AI standards.
  • San Francisco City's Ethics & Algorithms Toolkit - A risk management framework for government leaders and staff who work with algorithms, providing a two-part assessment process: an algorithmic assessment process, and a process to address the risks.
  • ISO/IEC's Standards for Artificial Intelligence - The ISO's initiative for Artificial Intelligence standards, which includes a large set of subsequent standards ranging across Big Data, AI Terminology, Machine Learning frameworks, etc.
  • Linux Foundation AI Landscape - The official list of tools in the AI landscape curated by the Linux Foundation, which contains well-maintained and well-used tools and frameworks.
We feature conferences that have core ML tracks (primarily in Europe for now) to help our community stay up to date with great events coming up.
Technical & Scientific Conferences
  • Data Natives [21/11/2019] - Data conference in Berlin, Germany.
  • ODSC Europe [19/11/2019] - The Open Data Science Conference in London, UK.
Business Conferences
  • Big Data LDN 2019 [13/11/2019] - Conference for strategy and tech on big data in London, UK.
About us
The Institute for Ethical AI & Machine Learning is a UK-based research centre that carries out world-class research into responsible machine learning systems.
Check out our website
© 2018 The Institute for Ethical AI & Machine Learning