The Institute for Ethical AI & Machine Learning

THE ML ENGINEER 🤖

Issue #52

This week in Issue #52:

Top Python ML Libraries in 2019
NeurIPS 2019 Videos are Out
Modern NLP with spaCy Podcast
Testing Guide for Software
Spotify on Better ML Infrastructure
Featured OSS Production ML Libraries
Awesome AI Guidelines to check out this week
AI conferences
+ more 🚀

Forward the email, or share the online version on 🐦 Twitter, 💼 LinkedIn and 📕 Facebook!

If you would like to suggest articles, ideas, papers, libraries, jobs or events, or to provide feedback, just hit reply or send us an email at a@ethical.institute! We have received a lot of great suggestions in the past. Thank you very much, everyone, for your support!

Top Python ML Libraries in 2019

Over the last year we have seen a large number of open source libraries released. This article highlights 10 Python libraries that came out in 2019 and are worth watching, many of which are machine learning related. The list includes HTTPX, Starlette, FastAPI, Immutables, Pyodide, Modin, Streamlit, Transformers, Detectron2 and Metaflow.

NeurIPS 2019 Videos are Out

Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia and oral and poster presentations of refereed papers. The videos for this year's conference are now online and available at https://slideslive.com/neurips/

Modern NLP with spaCy Podcast

spaCy is an awesome open source NLP library! It's easy to use, has widespread adoption, and integrates the latest language models. Ines Montani and Matthew Honnibal (core developers of spaCy and co-founders of Explosion) join the Practical AI podcast to discuss the history of the project, its capabilities, and the latest trends in NLP. They also dive into the practicalities of taking NLP workflows to production.

Testing Guide for Software

As software approaches production scale, it requires an appropriate amount of testing at both the component and the system level. The approaches involved in testing systems, especially in machine learning, become more ambiguous and benefit from the best practices that have been gathered over time. The testing guide on Martin Fowler's blog is an excellent and comprehensible source of information about testing, which can be adopted not only for traditional software projects but also for machine learning and data science projects.

Spotify on Better ML Infrastructure

When Spotify launched, people were amazed that they could access almost the world's entire music catalog instantaneously. More users and more features led to more systems that relied on machine learning to scale inferences across a growing user base. As these ML systems were built, Spotify's engineers reached a point where they spent more of their time maintaining data and backend systems in support of the ML-specific code than iterating on the model itself. They realized they needed to standardize best practices and build tooling to bridge the gaps between data, backend, and ML. This blog post outlines their experience building just that, and how they leverage TensorFlow Extended (TFX) and Kubeflow in their Paved Road for ML systems.

OSS: Data Science Notebooks

The theme for this week's featured ML libraries is Data Science Notebooks. The four featured libraries this week are:

ML Workspace - All-in-one web IDE for machine learning and data science. Combines Jupyter, VS Code, TensorFlow, and many other tools/libraries into one Docker image.
Polynote - Polynote is an experimental polyglot notebook environment. Currently, it supports Scala and Python (with or without Spark), SQL, and Vega.
Stencila - Stencila is a platform for creating, collaborating on, and sharing data-driven content that is transparent and reproducible.
RMarkdown - The rmarkdown package is a next generation implementation of R Markdown based on Pandoc.

If you know of any libraries that are not in the "Awesome MLOps" list, please do give us a heads up or feel free to add a pull request!

OSS: Awesome AI Guidelines

As AI systems become more prevalent in society, we face bigger and tougher societal challenges. We have seen a large number of resources that aim to tackle these challenges in the form of AI guidelines, principles, ethics frameworks, etc. However, there are so many resources that they are hard to navigate. Because of this, we started an open source initiative that aims to map the ecosystem to make it simpler to navigate. Every week we showcase a few resources from our list so you can check them out. This week's four resources are:

Oxford's Recommendations for AI Governance - A set of recommendations from Oxford's Future of Humanity Institute which focus on the infrastructure and attributes required for efficient design, development, and research around the ongoing work of building and implementing AI standards.
San Francisco City's Ethics & Algorithms Toolkit - A risk management framework for government leaders and staff who work with algorithms, providing a two-part assessment process: an algorithmic assessment process, and a process to address the risks.
ISO/IEC's Standards for Artificial Intelligence - The ISO's initiative for Artificial Intelligence standards, which include a large set of subsequent standards ranging across Big Data, AI Terminology, Machine Learning frameworks, etc.
Linux Foundation AI Landscape - The official list of tools in the AI landscape curated by the Linux Foundation, which contains well maintained and used tools and frameworks.

If you know of any guidelines that are not in the "Awesome AI Guidelines" list, please do give us a heads up or feel free to add a pull request!

MLConf: Conferences & Events

We feature conferences that have core ML tracks (primarily in Europe for now) to help our community stay up to date with great events coming up.

Technical & Scientific Conferences

EurNLP 2019 [11/10/2019] - European NLP Research summit in London, UK.

Data Natives [21/11/2019] - Data conference in Berlin, Germany.

ODSC Europe [19/11/2019] - The Open Data Science Conference in London, UK.

Khipu AI [11/11/2019] - Latin American Meeting in Artificial Intelligence in Montevideo, Uruguay.

Business Conferences

Predictive Analytics World [18/11/2019] - Conference for Business AI in Berlin, Germany.

Big Data LDN 2019 [13/11/2019] - Conference for strategy and tech on big data in London, UK.

About us

The Institute for Ethical AI & Machine Learning is a UK-based research centre that carries out world-class research into responsible machine learning systems.

Check out our website