The Institute for Ethical AI & Machine Learning

Subscribe to the Machine Learning Engineer Newsletter

Receive curated articles, tutorials and blog posts from experienced Machine Learning professionals.

THE ML ENGINEER 🤖

Issue #32

The ML Engineer 🤖 has reached 1400+ subscribers 🚀 and the open source Production Machine Learning repo has surpassed 1300 stars (an increase of over 30% in a week 🔥🔥🔥). A massive thank you to all our subscribers and community members for your support ✨👏🎉😃

This week in Issue #32:

End-to-end ML Pipelines in Enterprise
Code-free deep learning with Ludwig
ML Reidentification and Privacy Issues
How OSS and AI will take us to the moon
Managing large-scale distributed systems
Stream Processing OSS Libraries
AI conferences
ML jobs
+ more 🚀

Forward the email, or share the online version on 🐦 Twitter, 💼 LinkedIn and 📕 Facebook!

If you would like to suggest articles, ideas, papers, libraries, jobs or events, or provide feedback, just hit reply or send us an email at a@ethical.institute! We have received a lot of great suggestions in the past, so thank you very much to everyone for your support!

E2e ML Pipelines in Enterprise

IBM Principal Engineer Nick Pentreath and Ben Lorica dive into end-to-end machine learning pipelines and discuss the challenges and opportunities in unlocking the potential of machine learning at scale. During this conversation, they cover fundamental topics not only in the training phase of machine learning but also in the deployment, monitoring and governance of machine learning systems at scale. An excellent overview and deep dive on an incredibly important topic.

Code-free deep learning Ludwig

Uber Engineering is making deep learning more accessible through its open source, code-free deep learning framework called Ludwig. As they mention, Ludwig is unique in its ability to help make deep learning easier to understand for non-experts and enable faster model improvement iteration cycles for experienced machine learning developers and researchers alike. Of course, with great power comes great responsibility, so we recommend that any newcomers to the deep learning world check out and follow our 8 principles for responsible Machine Learning.

ML Reidentification and Privacy

An incredibly insightful research paper which could have a significant impact on privacy. The authors propose a method that can accurately estimate the likelihood of a specific person being correctly re-identified, even in a heavily incomplete dataset. Some of their results are impressive: "Using our model, we find that 99.98% of Americans would be correctly re-identified in any dataset using 15 demographic attributes". With the rise of privacy protection laws such as GDPR, it will be important to consider these kinds of loopholes and semi-indirect (but still fully relevant) challenges.
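
To make the re-identification risk more concrete, here is a minimal sketch that measures how many records in a toy dataset are unique on a chosen set of demographic attributes. It is a plain empirical uniqueness count, not the statistical model proposed in the paper, and the records, attribute names and the `uniqueness_rate` helper are all hypothetical.

```python
from collections import Counter


def uniqueness_rate(records, attributes):
    """Return the fraction of records whose values for `attributes` appear exactly once."""
    if not records:
        return 0.0
    # Build one tuple per record from the selected attributes.
    keys = [tuple(record[name] for name in attributes) for record in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(records)


# Hypothetical toy data, made up purely for illustration.
records = [
    {"zip": "10001", "birth_year": 1980, "gender": "F"},
    {"zip": "10001", "birth_year": 1980, "gender": "M"},
    {"zip": "10001", "birth_year": 1975, "gender": "F"},
    {"zip": "10002", "birth_year": 1980, "gender": "F"},
    {"zip": "10002", "birth_year": 1980, "gender": "F"},
]

for attrs in (["zip"], ["zip", "birth_year"], ["zip", "birth_year", "gender"]):
    print(attrs, uniqueness_rate(records, attrs))
```

In this toy data nobody is unique on zip code alone, but three of the five records become unique once birth year and gender are added. Each extra attribute can only keep or increase the share of unique records, which is the intuition behind the paper's headline figure.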

OSS + AI will take us to the Moon

Great positive take by VentureBeat on two of the biggest game changers in technology in 2019: open source and artificial intelligence. In this article they cover openness and collaboration, the spaceborne computer example, open source software and hardware, and augmenting human capability with AI.

Large Scale Distributed Systems

Yet another great article by Uber Engineering Manager Gergely Orosz on "Operating a Large, Distributed System in a Reliable Way". In this article Gergely takes us through a high-level overview of the key themes he has identified while managing the payments system at Uber. In this post he covers fundamental (and super interesting) concepts including Monitoring, Oncall, Anomaly Detection, Alerting, Outages, Incident Management Processes, Postmortems, Incident Reviews, a Culture of Ongoing Improvements, Failover Drills, Capacity Planning & Blackbox Testing and more (much, much more).

OSS: Stream Processing

The theme for this week's featured ML libraries is Stream Processing, which you can find in our Production Machine Learning ecosystem list. These libraries are an incredibly exciting addition that falls under our Responsible ML Principle #4. The four featured libraries this week are:

Apache Flink - Open source stream processing framework with powerful stream and batch processing capabilities.
Faust - Streaming library built on top of Python’s asyncio library using the async Kafka client, inspired by the Kafka Streams library.
Kafka Streams - Kafka client library for building applications and microservices where the input and output are stored in Kafka clusters.
Spark Streaming - Micro-batch processing for streams using the Apache Spark framework as a backend, supporting stateful exactly-once semantics.
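
For readers new to the topic, a building block that all four libraries support is the windowed aggregation: grouping a stream of events into time windows and aggregating each window. Below is a minimal pure-Python sketch of a tumbling (fixed, non-overlapping) window count. It uses none of the libraries above, and the event format and the `tumbling_window_counts` helper are hypothetical.

```python
from collections import Counter, defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per key in fixed, non-overlapping time windows.

    `events` is an iterable of (timestamp_seconds, key) pairs.
    Returns a dict mapping each window start time to a Counter of keys.
    """
    windows = defaultdict(Counter)
    for timestamp, key in events:
        # Round the timestamp down to the start of its window.
        window_start = (timestamp // window_seconds) * window_seconds
        windows[window_start][key] += 1
    return dict(windows)


# Hypothetical events: (seconds since start, event type).
events = [
    (0, "click"), (3, "view"), (9, "click"),
    (10, "click"), (14, "view"), (21, "view"),
]

for start, counts in sorted(tumbling_window_counts(events, 10).items()):
    print(start, dict(counts))
```

A real stream processor computes this incrementally over unbounded input, with state management and fault tolerance; this sketch simply processes a finite list in memory to show the windowing logic.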

If you know of any libraries that are not in the "Awesome MLOps" list, please do give us a heads up or feel free to add a pull request!

MLConf = Conferences & Events

We feature conferences that have core ML tracks (primarily in Europe for now) to help our community stay up to date with great events coming up.

Technical & Scientific Conferences

EurNLP 2019 [11/10/2019] - European NLP research summit (pronounced "Your NLP") in London, UK.

Data Natives [21/11/2019] - Data conference in Berlin, Germany.

ODSC Europe [19/11/2019] - The Open Data Science Conference in London, UK.

Khipu AI [11/11/2019] - Latin American Meeting in Artificial Intelligence in Montevideo, Uruguay.

Business Conferences

Predictive Analytics World [18/11/2019] - Conference for Business AI in Berlin, Germany.

Big Data LDN 2019 [13/11/2019] - Conference for strategy and tech on big data in London, UK.

MLJobs = Jobs & Careers

We showcase Machine Learning Engineering jobs (primarily in London for now) to help our community stay up to date with great opportunities that come up.

Leadership Opportunities

Algorithmia is hiring for a VP of Engineering in Seattle, USA
Fractal Labs is hiring for a VP of Engineering in London

Mid-level Opportunities

Seldon is hiring for a Senior Machine Learning Engineer in London
Proportunity is hiring for a Senior Machine Learning Engineer in London
Atlas ML is hiring for a Lead NLP Engineer in London
StreetBees is hiring for a Senior Data Scientist in London
Tractable is hiring for a Senior Deep Learning Engineer

Junior Opportunities

Migacore is hiring for a Machine Learning Engineer in London
Babylon Health is hiring for a Machine Learning Engineer in London