Subscribe to the Machine Learning Engineer Newsletter

Receive curated articles, tutorials and blog posts from experienced Machine Learning professionals.


THE ML ENGINEER 🤖
Issue #32
 
The ML Engineer 🤖 has reached 1400+ subscribers 🚀 and the open source Production Machine Learning repo has surpassed 1300 stars (over 30% increase in a week 🔥🔥🔥). A massive thank you to all our subscribers and community members for all your support ✨🎉😃
 
 
This week in Issue #32:
 
 
Forward the email, or share the online version on 🐦 Twitter, 💼 LinkedIn and 📕 Facebook!
 
If you would like to suggest articles, ideas, papers, libraries, jobs, events or provide feedback just hit reply or send us an email to a@ethical.institute! We have received a lot of great suggestions in the past, thank you very much for everyone's support!
 
 
 
IBM Principal Engineer Nick Pentreath and Ben Lorica dive into end-to-end machine learning pipelines, and discuss the challenges and opportunities in unlocking the potential of machine learning at scale. During this conversation, they cover fundamental topics not only in the training phase of machine learning but also in the deployment, monitoring and governance of machine learning systems at scale. An excellent overview + deep dive on an incredibly important topic.
 
 
 
Uber Engineering is making deep learning more accessible through their open source code-free deep learning framework called Ludwig. As they mention, Ludwig is unique in its ability to help make deep learning easier to understand for non-experts and enable faster model improvement iteration cycles for experienced machine learning developers and researchers alike. Of course, with great power comes great responsibility, so we recommend that any newcomers to the deep learning world check out and follow our 8 principles for responsible Machine Learning.
 
 
 
An incredibly insightful research paper which could have a significant impact on privacy. The authors propose a method that can accurately estimate the likelihood of a specific person being correctly re-identified, even in a heavily incomplete dataset. Some of their results are impressive: "Using our model, we find that 99.98% of Americans would be correctly re-identified in any dataset using 15 demographic attributes". With the rise of privacy protection laws such as GDPR, it will be important to consider these kinds of loopholes and semi-indirect (but still fully relevant) challenges.
 
 
 
Great positive take by VentureBeat on two of the biggest drivers of change in technology in 2019: open source and artificial intelligence. In this article they cover openness and collaboration, the spaceborne computer example, open source software + hardware, and augmenting human capability with AI.
 
 
 
Yet another great article by Uber Engineering Manager Gergely Orosz on "Operating a Large, Distributed System in a Reliable Way". In this article Gergely takes us through a high-level overview of the key themes he has identified while managing the payments system at Uber. He covers fundamental (and super interesting) concepts including Monitoring, Oncall, Anomaly Detection, Alerting, Outages, Incident Management Processes, Postmortems, Incident Reviews, a Culture of Ongoing Improvements, Failover Drills, Capacity Planning & Blackbox Testing and more (much, much more).
 
 
 
 
 
OSS: Stream Processing
The theme for this week's featured ML libraries is Stream Processing, which you can find in our Production Machine Learning ecosystem list. These libraries are an incredibly exciting addition that falls under our Responsible ML Principle #4. The four featured libraries this week are:
 
  • Apache Flink - Open source stream processing framework with powerful stream and batch processing capabilities.
  • Faust - Streaming library built on top of Python's asyncio library, using an async Kafka client and inspired by the Kafka Streams library.
  • Kafka Streams - Kafka client library for building applications and microservices where the input and output are stored in Kafka clusters.
  • Spark Streaming - Micro-batch processing for streams using the Apache Spark framework as a backend, supporting stateful exactly-once semantics.
If you know of any libraries that are not in the "Awesome MLOps" list, please do give us a heads up or feel free to open a pull request.
 
 
 
We feature conferences that have core ML tracks (primarily in Europe for now) to help our community stay up to date with great events coming up.
 
Technical & Scientific Conferences
 
 
 
  • Data Natives [21/11/2019] - Data conference in Berlin, Germany.
 
  • ODSC Europe [19/11/2019] - The Open Data Science Conference in London, UK.
 
 
 
 
Business Conferences
 
 
  • Big Data LDN 2019 [13/11/2019] - Conference for strategy and tech on big data in London, UK.
 
 
 
We showcase Machine Learning Engineering jobs (primarily in London for now) to help our community stay up to date with great opportunities that come up.
 
Leadership Opportunities
 
Mid-level Opportunities
 
Junior Opportunities
 
 
 
 
 
© 2018 The Institute for Ethical AI & Machine Learning