Subscribe to the Machine Learning Engineer Newsletter

Receive curated articles, tutorials and blog posts from experienced Machine Learning professionals.


THE ML ENGINEER
Issue #6

 
 
This week in Issue #6:
 
Support the ML Engineer!
Forward the email, or share the online version on 🐦 Twitter, 💼 LinkedIn and 📕 Facebook!
 
If you would like to suggest articles, ideas, papers, libraries, jobs or events, or to provide feedback, just hit reply or send us an email at a@ethical.institute!
 
 
 
 
In-house development of end-to-end machine learning orchestration frameworks is a growing trend among tech companies. A few months ago, Uber released an article sharing key lessons learned from the first 3 years of working with their in-house machine learning framework "Michelangelo". Michelangelo is an awesome example of machine learning operations in practice. If you are curious and wish to know more about it, here is an in-depth article that introduces Michelangelo and here is a video that provides an overview.
 
 
The role of the data engineer as we've known it is changing rapidly. This Fishtown Analytics article provides great insight into how new tools are automating the "boring parts" of data engineering, freeing engineers to focus on higher-level, more important tasks. An interesting first-hand overview of what will be a very fast-evolving role over the next few years.
 
 
An interesting perspective from opensource.com on why data scientists like MLOps and, more specifically, Kubernetes. The article covers the fundamentals of "Principle 4 - reproducibility" and how MLOps tools like Kubernetes allow data scientists to focus on the more important tasks, whilst being able to work at scale without much added complexity. There is also a great accompanying video covering the concepts in the article. If you are curious to extend your knowledge and learn some DevOps fundamentals, here is a great tutorial that teaches "enough Docker to be useful".
 
 
A really awesome (and visually pleasing) introduction to probability and statistics. "Seeing Theory" is a short course that introduces basic probability, compound probability, probability distributions, frequentist inference, Bayesian inference and regression analysis. Kudos to Daniel Kunin from Brown University for such a great interactive course.
 
 
An incredibly insightful discussion about the friction between software engineering, data science and data engineering teams in industry. The article summarises a panel conversation between several professionals in this field, and covers some of the main challenges that teams face where these different professions intersect. It covers key points around collaboration, tension points, ownership, mindset, communication and training. It also covers some core differences between academia and industry. This last point is especially interesting, as there has been some tension at scientific conferences between data science and engineering, and more specifically over whether a breakthrough in the latter also counts as a breakthrough in the former. The discussion in the review of a recent paper submitted to ICLR 2019 provides an interesting insight into this.
 
 
Machine Learning Mastery comes back this week with yet another great hands-on tutorial. This week Jason explains how to "Accelerate Learning of Deep Neural Networks With Batch Normalization". The tutorial includes an overview of how to create and configure a BatchNormalization layer using the Keras API, how to add the layer to a deep learning model, and how to update a model to use batch normalisation to accelerate training on a binary classification problem.
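For readers who want a feel for what a batch normalisation layer computes before diving into the tutorial, here is a minimal sketch in plain Python. It is not taken from the tutorial and it is not the Keras implementation: the function name `batch_normalize` and its parameters `gamma`, `beta` and `eps` are illustrative assumptions, and it only covers the training-time step for a single feature (no moving averages, no learned updates).

```python
import math


def batch_normalize(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise one feature across a mini-batch, then scale and shift.

    Hypothetical helper for illustration only: gamma and beta stand in for
    the scale and shift values a real layer would learn during training.
    """
    n = len(batch)
    mean = sum(batch) / n
    # Biased (population) variance over the mini-batch.
    var = sum((x - mean) ** 2 for x in batch) / n
    # eps keeps the division stable when the variance is close to zero.
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]


activations = [2.0, 4.0, 6.0, 8.0]
normalized = batch_normalize(activations)
print(normalized)
```

With the default `gamma=1.0` and `beta=0.0`, the output has a mean of roughly zero and a variance of roughly one, which is the property that tends to make training deeper networks more stable.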
 
 
 
This week's edition is focused on privacy-preserving machine learning frameworks, which fall under our Responsible ML Principle #7. The four featured libraries this week from the Awesome MLOps list are:
 
  • TensorFlow Privacy - A Python library that includes implementations of TensorFlow optimizers for training machine learning models with differential privacy.
  • TF-Encrypted - A Python library built on top of TensorFlow for researchers and practitioners to experiment with privacy-preserving machine learning.
  • PySyft - A Python library for secure, private Deep Learning. PySyft decouples private data from model training, using Multi-Party Computation (MPC) within PyTorch.
  • Uber SQL Differential Privacy - Uber's open source framework that enforces differential privacy for general-purpose SQL queries.
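To make the idea of differential privacy behind some of these libraries more concrete, here is a minimal sketch of the classic Laplace mechanism applied to a counting query, using only the Python standard library. It does not use or reproduce the API of any library listed above: `laplace_noise`, `private_count` and the sample `ages` data are hypothetical names made up for illustration, and the standard `random` module is not suitable for real privacy guarantees.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential samples with mean
    `scale` follows a Laplace distribution centred on zero.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng):
    """Return a noisy count of the values matching `predicate`.

    Hypothetical helper: adding or removing one record changes a count
    by at most 1, so the sensitivity of the query is 1.
    """
    true_count = sum(1 for v in values if predicate(v))
    sensitivity = 1.0
    # Smaller epsilon means more noise and stronger privacy.
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # seeded only to make the illustration repeatable
ages = [23, 35, 41, 29, 52, 67, 38]  # made-up sample data
noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng)
print(noisy)
```

The true answer here is 3, but each call returns 3 plus random noise, so a single released answer reveals little about whether any one individual is in the data. Production libraries layer much more on top of this, such as privacy budget tracking and secure noise generation.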
 
If you know of any libraries that are not in the "Awesome MLOps" list, please do give us a heads-up or feel free to open a pull request.
 
 
 
Since last week, we have been showcasing Machine Learning Engineering jobs (primarily in London for now) to help our community stay up to date with great opportunities as they come up.
 
Junior Opportunities
 
 
Mid-level Opportunities
 
Leadership Opportunities
 
 
From this week on, we will be featuring conferences that are primarily about machine learning or that have a core ML track (primarily in Europe for now) to help our community stay up to date with great events coming up.
 
Technical Conferences
 
 
 
 
 
Business Conferences