THE ML ENGINEER 🤖
Issue #33
 
The ML Engineer 🤖 has reached 1500+ subscribers 🚀 and the open source Production Machine Learning repo has also surpassed 1500 stars (non-stop growth 🔥🔥🔥). A massive thank you to all our subscribers and community members for all your support ✨🎉😃
 
 
This week in Issue #33:
 
 
Forward the email, or share the online version on 🐦 Twitter, 💼 LinkedIn and 📕 Facebook!
 
If you would like to suggest articles, ideas, papers, libraries, jobs or events, or to provide feedback, just hit reply or send us an email at a@ethical.institute! We have received a lot of great suggestions in the past, so thank you very much for everyone's support!
 
 
 
LinkedIn open sources Brooklin, a distributed service for streaming data in near real-time at scale, currently powering over 2 trillion messages per day at LinkedIn. Data streaming is paving the way for real-time machine learning use cases. This is also a very interesting project because it doesn't aim to replace OSS projects like Kafka; instead it sits at a higher level, providing a single solution for streaming across various data stores and messaging systems (Kafka, Azure Event Hubs, Kinesis, etc). In this post, they showcase how Brooklin can be used as a streaming bridge across these heterogeneous messaging services, as a way to mirror Kafka clusters, and beyond.
 
 
 
The TensorFlow ecosystem enters the ML model interpretability arena with tf-explain (TFExplain) - a library that offers interpretability methods to understand model predictions. The library is adapted to the TensorFlow 2.0 workflow, using the tf.keras API where possible, and provides: 1) heatmap visualisations & gradient analysis, 2) off-training & Keras callback usage, and 3) TensorBoard integration.
 
 
 
A project that would have been considered a dream for smart-city enthusiasts has been fully open sourced. The Urban Modelling Group at University College Dublin has captured a major area of Dublin city centre (around 5.6 km^2) and made it available as the densest LiDAR point cloud and imagery dataset (260 million points out of 1.4 billion are labelled).
 
 
 
A five-part video series released by Al Jazeera covering high-level concepts that break down some of the biggest challenges in AI through a mainstream media lens. The five parts cover: 1) Trust & bias, 2) Big Tech monopolies, 3) Misinformation, 4) Surveillance, and 5) Regulation around data & privacy.
 
 
 
An excellent project that tries to simplify one of the most popular concepts around machine learning. "Machines gone wrong" covers foundational topics in the challenges of AI such as an explanation of AI ethics, why AI is different when talking about these issues, as well as some key themes like algorithmic bias.
 
 
 
 
 
OSS: Stream Processing
The theme for this week's featured ML libraries is Stream Processing, which you can find in our Production Machine Learning ecosystem list. These libraries are an incredibly exciting addition that falls under our Responsible ML Principle #4. The four featured libraries this week are:
 
  • Apache Flink - Open source stream processing framework with powerful stream and batch processing capabilities.
  • Faust - Streaming library built on top of Python’s asyncio library, using an async Kafka client and inspired by the Kafka Streams library.
  • Kafka Streams - Kafka client library for building applications and microservices where the input and output are stored in Kafka clusters.
  • Spark Streaming - Micro-batch processing for streams using the Apache Spark framework as a backend, supporting stateful exactly-once semantics.
If you know of any libraries that are not in the "Awesome MLOps" list, please do give us a heads up or feel free to open a pull request.
 
 
 
We feature conferences that have core ML tracks (primarily in Europe for now) to help our community stay up to date with great events coming up.
 
Technical & Scientific Conferences
 
 
 
  • Data Natives [21/11/2019] - Data conference in Berlin, Germany.
 
  • ODSC Europe [19/11/2019] - The Open Data Science Conference in London, UK.
 
 
 
 
Business Conferences
 
 
  • Big Data LDN 2019 [13/11/2019] - Conference for strategy and tech on big data in London, UK.
 
 
 
We showcase Machine Learning Engineering jobs (primarily in London for now) to help our community stay up to date with great opportunities that come up.
 
Leadership Opportunities
 
Mid-level Opportunities
 
Junior Opportunities
 
 
 
 
 
© 2018 The Institute for Ethical AI & Machine Learning