Subscribe to the Machine Learning Engineer Newsletter

Receive curated articles, tutorials and blog posts from experienced Machine Learning professionals.


THE ML ENGINEER 🤖
Issue #55
 
 
This week in Issue #55:
 
 
 
If you would like to suggest articles, ideas, papers, libraries, jobs or events, or to provide feedback, just hit reply or send us an email at a@ethical.institute! We have received a lot of great suggestions in the past, so thank you all very much for your support!
 
 
 
With the rise of large-scale machine learning applications, it is becoming increasingly critical for practitioners to learn the best practices in machine learning system design. This great booklet covers four main steps of designing machine learning systems: 1) project setup, 2) data pipeline, 3) modeling, and 4) serving. The booklet also contains 27 open-ended machine learning system design questions that might come up in machine learning interviews.
 
 
 
As the role of machine learning engineer becomes more prominent in industry, the community is contributing more useful content that defines the role, sets out best practices, and even offers advice on job interviews. This presentation by Machine Learning Engineer Chip Huyen provides great insight into the role of the MLE, together with advice on how best to approach machine learning interviews.
 
 
 
A fantastic resource that provides a very accessible introduction to online learning, and comes with a set of lecture notes from Boston University's “Introduction to Online Learning” course. This first lecture offers an initial insight into the topic, with a strong technical foundation as well as an exercise to put the learnings into practice.
 
 
 
Amazon Applied Scientist Rakesh Chada has put together a great post that showcases the power of GPT-2. The language model GPT-2 from OpenAI is one of the most coherent generative models for text out there. While its generation capabilities are impressive, its ability to perform some Natural Language Understanding (NLU) tasks zero-shot seems even more fascinating to Rakesh. This blog post highlights some of those capabilities and takes a deep dive into one fun use case: converting singular nouns in English to their plural counterparts (and vice versa).
 
 
 
The open source software (OSS) movement has created some of our most important and widely used technologies, including operating systems, web browsers, databases and (of course) machine learning. Our world would not function, or at least not function as well, without open source software. In this podcast, Peter Levene shares some of his experience working with open source as a developer, entrepreneur and investor around business models for open source projects.
 
 
 
 
 
OSS: Data Stream Processing
 
The theme for this week's featured ML libraries is Data Stream Processing. The four featured libraries this week are:
 
  • Apache Flink - Open source stream processing framework with powerful stream and batch processing capabilities.
  • Faust - Streaming library built on top of Python's asyncio library, using an async Kafka client and inspired by the Kafka Streams library.
  • Kafka Streams - Kafka client library for building applications and microservices where the input and output are stored in Kafka clusters.
  • Spark Streaming - Micro-batch processing for streams using the Apache Spark framework as a backend, supporting stateful exactly-once semantics.
 
If you know of any libraries that are not in the "Awesome MLOps" list, please do give us a heads up or feel free to open a pull request.
 
 
 
 
As AI systems become more prevalent in society, we face bigger and tougher societal challenges. We have seen a large number of resources that aim to tackle these challenges in the form of AI guidelines, principles, ethics frameworks, etc. However, there are so many resources that they are hard to navigate. Because of this, we started an open source initiative that aims to map the ecosystem and make it simpler to navigate. Each week we showcase three resources from our list so you can check them out. This week's resources are:
 
 
 
 
 
© 2018 The Institute for Ethical AI & Machine Learning