
THE ML ENGINEER 🤖
Issue #42
 
 
 
 
Forward the email, or share the online version on 🐦 Twitter, 💼 LinkedIn and 📕 Facebook!
 
If you would like to suggest articles, ideas, papers, libraries, jobs or events, or to provide feedback, just hit reply or send us an email at a@ethical.institute! We have received a lot of great suggestions in the past; thank you all very much for your support!
 
 
 
Serving machine learning models at scale is one of the biggest challenges in production ML. The KFServing project aims to tackle this. KFServing is a cross-industry open source collaboration (currently led by multiple technology companies including Seldon, Google, Microsoft, IBM and Bloomberg) with the objective of developing a fully fledged machine learning serving and orchestration framework on Kubernetes. This initiative is incredibly exciting because it has several tech leaders collaborating on defining what production ML could look like, and working towards abstracting some of the very complex and heterogeneous production ML terminology into standardised protocols and interfaces.
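
To give a feel for what a standardised serving protocol can look like, here is a minimal sketch that only builds (and does not send) a prediction request. It assumes the TensorFlow-Serving-style REST convention, with a `/v1/models/<name>:predict` path and an `"instances"` list in the JSON body; the host `models.example.com`, the model name `iris-classifier` and the helper `build_predict_request` are hypothetical and used purely for illustration.

```python
import json
from urllib.request import Request


def build_predict_request(host: str, model_name: str, instances: list) -> Request:
    """Build (but do not send) a prediction request for one deployed model."""
    # Assumed URL layout: /v1/models/<model_name>:predict
    url = f"http://{host}/v1/models/{model_name}:predict"
    # Assumed body layout: a JSON object holding one "instances" list,
    # with one entry per input row.
    body = json.dumps({"instances": instances}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and model name, for illustration only.
request = build_predict_request(
    "models.example.com", "iris-classifier", [[5.1, 3.5, 1.4, 0.2]]
)
```

The point of a shared convention like this is that the client code stays the same regardless of which framework trained the model behind the endpoint.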
 
 
 
Machine learning models trained on very large datasets introduce new complexities, including high memory usage, heavy compute requirements, black-box constraints and more. The team at Monzo has put together a great overview that outlines the key concepts often taken into consideration when moving a model into production, and dives into their use-case leveraging the HuggingFace library.
 
 
 
Machine learning systems have historically been constrained either to vertical use-cases or to a subset of the model's lifecycle (training vs data analysis vs deployment). Lately there has been an increase in end-to-end machine learning systems that are flexible enough to fit any use-case. Apple has released a paper describing their approach to this, which they have named "Overton". The paper describes the challenge, the architectural components, and how engineers interact with the system. This is certainly an exciting space, in which we'll be seeing a lot of great innovations in the next few years.
 
 
 
Historically in data science, converting an idea into an interactive application has taken a non-trivial amount of time. A new tool called Streamlit provides a way to easily build interactive applications from complex data science tools without the need to deal with the underlying infrastructural complexities (wrapping the backend in a microservice, exposing endpoints, building a UI to consume them, etc). Really awesome tool, definitely recommend checking it out.
 
 
 
As an organisation scales and teams become more distant, there is a risk that innovation stagnates, and many of the challenges in the organisational structure start to be reflected in the product/service interfaces, often for the worse. Amazon provides an interesting retrospective view of how they have tackled this in order to build modern applications at Amazon Web Services.
 
 
 
 
 
 
The theme for this week's featured ML libraries is Machine Learning Deployment and Orchestration Libraries, and we're happy to share brand new additions to that section. The four featured libraries this week are:
 
  • Seldon - Open source platform for deploying and monitoring machine learning models in Kubernetes - (Video)
  • KFServing - Serverless framework to deploy and monitor machine learning models in Kubernetes - (Video)
  • Redis-AI - A Redis module for serving tensors and executing deep learning models. Expect changes in the API and internals.
  • Model Server for Apache MXNet (MMS) - A model server for Apache MXNet from Amazon Web Services that is able to run MXNet models as well as Gluon models (Amazon's SageMaker runs a custom version of MMS under the hood)
 
If you know of any libraries that are not in the "Awesome MLOps" list, please do give us a heads up or feel free to open a pull request.
 
 
 
We feature conferences that have core ML tracks (primarily in Europe for now) to help our community stay up to date with great events coming up.
 
Technical & Scientific Conferences
 
 
 
  • Data Natives [21/11/2019] - Data conference in Berlin, Germany.
 
  • ODSC Europe [19/11/2019] - The Open Data Science Conference in London, UK.
 
 
 
Business Conferences
 
 
  • Big Data LDN 2019 [13/11/2019] - Conference for strategy and tech on big data in London, UK.
 
 
 
We showcase Machine Learning Engineering jobs (primarily in London for now) to help our community stay up to date with great opportunities that come up.
 
Leadership Opportunities
 
Mid-level Opportunities
 
Junior Opportunities
 
 
 
 
 
© 2018 The Institute for Ethical AI & Machine Learning