The Institute for Ethical AI & Machine Learning

THE ML ENGINEER 🤖
Issue #21

This week in Issue #21:

Alibi black box ML explanations, Karpathy's tips on NNs, Nando on learning to learn, a gentle intro to ImageNet, SparkML with Kafka, computer vision, ML streaming libraries, upcoming ML conferences, data science / ML engineering jobs and more 🚀.

Support the ML Engineer!

Forward the email, or share the online version on 🐦 Twitter, 💼 LinkedIn and 📕 Facebook!

If you would like to suggest articles, ideas, papers, libraries, jobs or events, or to provide feedback, just hit reply or send us an email at a@ethical.institute! We have received a lot of great suggestions in the past, so thank you very much to everyone for your support!

Alibi for black box explanations

The team behind SeldonIO has open sourced a library for explaining black box machine learning models. Their initial release brings together three incredibly interesting (and very well documented) approaches to explainability: anchor methods, contrastive explanations, and trust scores. It's highly recommended to read through the documentation, as they have provided thorough high-level explanations of their approaches, together with references to really interesting research in this area.

Karpathy's tips on training NNs

Andrej Karpathy has put together yet another incredible resource, this time with tips on training neural networks. The post was inspired by his recent tweet outlining the most common mistakes made when training NNs. In it, Andrej proposes a recipe for dealing with the natural challenges of the leaky abstractions and silent failures that come with neural networks. His tips include "becoming one with the data", setting up training/evaluation skeletons, overfitting first and then regularizing, tuning, and "squeezing out the juice".

Nando on learning to learn

Nando de Freitas gave a talk in London last week with the Alan Turing Institute. It covered a wide range of topics, from a high-level introduction to machine learning to a more specific overview of the work he is currently focusing on: meta-learning. The fields of meta-learning and multi-task learning are incredibly interesting, as they require machine learning models to succeed at a wide variety of tasks without access to the sheer amounts of data that deep learning usually requires. Nando released a paper a few years back titled "Learning to learn by gradient descent by gradient descent", which showcases how this technique can be exploited to build more general and even reusable algorithms.

A gentle intro to ImageNet

Machine Learning Mastery comes back this week with a great post shedding light on a topic you may have heard about repeatedly: the "ImageNet Challenge". In this tutorial, Jason provides an overview of what the ImageNet Challenge is, together with some insight into the dataset (21k classes and 1m+ images), and talks about the deep learning achievements that have appeared over the last few years.

SparkML Kafka Environment

Following up on our talk last week on real-time machine learning using Kafka and Spark Streaming, this week we have open sourced the one-click deploy foundation built on docker-compose, together with a simple tutorial that allows you to get started quickly with real-time ML streams. The brief overview provides instructions on how to 1) run the whole stack after installing docker-compose, 2) run a producer that pushes data to the stream, 3) run a consumer that processes the data, and 4) monitor the whole stack using Grafana and Kafka Manager.
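
For readers who want a feel for the producer/consumer pattern before spinning up the full stack, here is a minimal sketch in plain Python. It is only an illustration under loud assumptions: an in-memory `queue.Queue` stands in for a Kafka topic, and the names `producer`, `consumer` and `run_pipeline` are hypothetical and not taken from the tutorial. The consumer updates a running mean as each record arrives, which is the kind of incremental processing a real-time ML stream tends to rely on.

```python
import queue
import threading

# Unique marker that tells the consumer the toy stream has ended.
SENTINEL = object()


def producer(topic, readings):
    """Push each reading onto the in-memory 'topic', then signal completion."""
    for value in readings:
        topic.put(value)
    topic.put(SENTINEL)


def consumer(topic, results):
    """Read records one at a time and append the running mean after each."""
    count, mean = 0, 0.0
    while True:
        value = topic.get()
        if value is SENTINEL:
            break
        count += 1
        mean += (value - mean) / count  # incremental mean update
        results.append(mean)


def run_pipeline(readings):
    """Wire a producer and a consumer together through one shared queue."""
    topic = queue.Queue()  # stand-in for a Kafka topic (assumption)
    results = []
    worker = threading.Thread(target=consumer, args=(topic, results))
    worker.start()
    producer(topic, readings)
    worker.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([2.0, 4.0, 6.0]))
```

In the real stack the queue would be replaced by a Kafka client, and the consumer would typically run as a separate process rather than a thread.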

An introduction to computer vision

Tryolabs has put together a great and extensive introduction to computer vision, where they outline the areas in which computer vision has been used. The post also covers all-things-computer-vision from a higher level, and provides links that could give newcomers to the field a good intuition not only for the techniques but also for the business and practical applications of these tools.

MLOps = Featured OS Libraries

The theme for this week's featured ML libraries is real-time machine learning with data streaming pipelines, which falls under our Responsible ML Principle #4. This week we want to dive deeper and feature some fast-growing libraries in this space. The four featured libraries on data stream processing are:

Apache Flink - Open source stream processing framework with powerful stream and batch processing capabilities.
Faust - Streaming library built on top of Python's asyncio library, using an async Kafka client and inspired by the Kafka Streams library.
Kafka Streams - Kafka client library for building applications and microservices where the input and output are stored in Kafka clusters.
Spark Streaming - Micro-batch processing for streams using the Apache Spark framework as a backend, supporting stateful exactly-once semantics.
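
To make the micro-batch idea mentioned for Spark Streaming a little more concrete, here is a small sketch in plain Python. It does not use the Spark API, and it makes a simplifying assumption: batches are cut by record count, whereas Spark Streaming groups records by time interval. The function names `micro_batches` and `batch_sums` are hypothetical, chosen only for this illustration.

```python
from itertools import islice


def micro_batches(stream, batch_size):
    """Yield consecutive lists of up to batch_size records from a stream."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # stream exhausted
        yield batch


def batch_sums(stream, batch_size):
    """Run one small batch computation (a sum) per micro-batch."""
    return [sum(batch) for batch in micro_batches(stream, batch_size)]


if __name__ == "__main__":
    # Records 1..7 in batches of 3; the last batch is partial.
    print(batch_sums(range(1, 8), 3))  # prints [6, 15, 7]
```

The point of the pattern is that each small batch can be handled with ordinary batch logic, in contrast to record-at-a-time processing.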

If you know of any libraries that are not in the "Awesome MLOps" list, please do give us a heads up or feel free to open a pull request!

MLConf = Conferences & Events

We feature conferences that have core ML tracks (primarily in Europe for now) to help our community stay up to date with great events coming up.

Technical Conferences

PyCon + PyData Florence [02/05/2019] - Python X comes this year with a PyData focus in Florence, Italy.

AI Conference Beijing [18/06/2019] - O'Reilly's signature applied AI conference in Asia in Beijing, China.

RAAIS 2019 [28/06/2019] - The Research and Applied AI Summit in London, UK.

Data Natives [21/11/2019] - Data conference in Berlin, Germany.

ODSC Europe [19/11/2019] - The Open Data Science Conference in London, UK.

spaCy IRL [05/07/2019] - spaCy's first in-person conference in Berlin, Germany.

EurNLP [11/10/2019] - Europe's NLP research conference (pronounced "Your NLP") in London, UK.

Khipu AI [11/11/2019] - Latin American Meeting in Artificial Intelligence in Montevideo, Uruguay.

Business Conferences

World Summit AI Americas [10/04/2019] - Large scale AI summit in Montreal, Canada.
- Come join our panel on AI Ethics and Tools.

AI Expo Global [19/04/2019] - Global conference on artificial intelligence in London, UK.
- Come join us at our talk on AI orchestration at scale.

Predictive Analytics World [18/11/2019] - Conference for Business AI in Berlin, Germany.

Big Data LDN 2019 [13/11/2019] - Conference for strategy and tech on big data in London, UK.

MLJobs = Jobs & Careers

We showcase Machine Learning Engineering jobs (primarily in London for now) to help our community stay up to date with great opportunities that come up. It seems that the demand for data scientists continues to rise!

Leadership Opportunities

Algorithmia is hiring for a VP of Engineering in Seattle, USA
Fractal Labs is hiring for a VP of Engineering in London
Distributed is hiring for a VP of Engineering in London
FactMata is hiring for a Head of Machine Learning in London
Brainpool.ai is hiring for a Head of Machine Learning in London, UK
Cytora is hiring for a Data Science Director in London

Mid-level Opportunities

Proportunity is hiring for a Senior Machine Learning Engineer in London
Twitter is hiring for a Senior Machine Learning Engineer in London
Atlas ML is hiring for a Lead NLP Engineer in London
StreetBees is hiring for a Senior Data Scientist in London
Expedia is hiring for a Principal Data Scientist in London
QuantumBlack is hiring for a Senior Machine Learning Engineer in London
Tractable is hiring for a Senior Deep Learning Engineer

Junior Opportunities

Seldon is hiring for a Machine Learning / Data Engineer in London
Migacore is hiring for a Machine Learning Engineer in London
CloudNC is hiring for a Machine Learning Engineer in London
Babylon Health is hiring for a Machine Learning Engineer in London
Chattermill is hiring for a Machine Learning Engineer in London