Subscribe to the Machine Learning Engineer Newsletter

Receive curated articles, tutorials and blog posts from experienced Machine Learning professionals.


THE ML ENGINEER 🤖
Issue #27
 
 
 Last week we gave a 3-hour tutorial diving into AI explainability at the O'Reilly AI Conference in Beijing, and this coming week we'll be speaking on machine learning orchestration at KubeCon Shanghai, Open Source Summit, OSCon and Slush China 🚀 If you're around, drop us a line or say hello on Twitter!
 
 
This week in Issue #27:
 
 
Forward the email, or share the online version on 🐦 Twitter, 💼 LinkedIn and 📕 Facebook!
 
If you would like to suggest articles, ideas, papers, libraries, jobs, events or provide feedback just hit reply or send us an email to a@ethical.institute! We have received a lot of great suggestions in the past, thank you very much for everyone's support!
 
 
 
Last week we attended China's first Ray meetup, and it was a huge pleasure to see Ion Stoica, Apache Spark founder and Databricks Chairman, present a technical deep dive on Ray, followed by a set of great talks from Alibaba, Didi and Ant Financial engineering leaders on Ray use cases. Ray is a fast and simple framework for building and running distributed applications, and comes with a broad set of tools including Tune (rapid hyperparameter search), RLlib (scalable reinforcement learning) and distributed training, among several other features.
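 
For a sense of how little code it takes to parallelise work with Ray, here is a minimal sketch using its core task API (ray.remote / ray.get); the score_batch function and the batch sizes are made up purely for illustration.
 
    import ray

    ray.init()  # start Ray locally; on a cluster you would pass the head node address

    @ray.remote
    def score_batch(batch):
        # stand-in for any CPU-heavy work, e.g. feature extraction or model scoring
        return sum(batch)

    # fan the batches out as parallel tasks and gather the results
    futures = [score_batch.remote(list(range(i * 100, (i + 1) * 100))) for i in range(8)]
    print(ray.get(futures))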
 
 
 
Janis Klaise, Data Scientist at Seldon, joins Daniel Whitenack and Chris Benson on their Practical AI podcast to talk about the challenges of production machine learning, and how Seldon is tackling them with open source, particularly around explainable machine learning with Alibi's black-box model explanations. Janis provides an introduction to the challenges of production machine learning, as well as the different approaches that can be used for machine learning explainability.
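 
As a taster of what black-box explanations look like in practice, below is a minimal sketch of Alibi's AnchorTabular explainer wrapped around a scikit-learn classifier; exact argument names and the shape of the returned explanation vary between Alibi versions, so treat it as an illustrative outline rather than a copy-paste recipe.
 
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from alibi.explainers import AnchorTabular

    # train any black-box model; the explainer only ever sees its prediction function
    data = load_iris()
    clf = RandomForestClassifier(n_estimators=100).fit(data.data, data.target)

    explainer = AnchorTabular(clf.predict_proba, data.feature_names)
    explainer.fit(data.data)  # learns feature percentiles used to perturb instances

    explanation = explainer.explain(data.data[0], threshold=0.95)
    print(explanation)  # recent Alibi versions expose the rule as explanation.anchor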
 
 
 
A great blog post that summarises the set of principles presented in a talk by Patrick Ball at the Data & Society Research Institute, titled "Principled Data Processing": transparency, accountability, reproducibility and scalability, which truly resonate with our 8 principles for responsible machine learning.
 
 
 
Machine Learning Mastery comes back this week with a deep dive analysing the results of classical and machine learning methods for time series forecasting. In this post, James covers three key findings: 1) classical methods like ETS and ARIMA outperform ML/DL methods for one-step forecasting on univariate datasets, 2) classical methods like Theta and ARIMA outperform ML/DL models for multi-step forecasting on univariate datasets, and 3) ML/DL methods do not yet deliver on their promise for univariate time series forecasting.
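 
For readers who want to try the classical baseline themselves, here is a minimal walk-forward, one-step ARIMA sketch using statsmodels (the import path below assumes statsmodels 0.12 or newer); the synthetic series and the (1, 1, 1) order are placeholders, not recommendations.
 
    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(0)
    series = np.cumsum(rng.normal(size=120))  # toy univariate series

    history, predictions = list(series[:100]), []
    for t in range(100, len(series)):
        fit = ARIMA(history, order=(1, 1, 1)).fit()   # refit on everything seen so far
        predictions.append(fit.forecast(steps=1)[0])  # one-step-ahead forecast
        history.append(series[t])                     # walk forward with the true value

    print("one-step MAE:", np.mean(np.abs(np.array(predictions) - series[100:])))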
 
 
 
Ben Lorica and Ihab Ilyas bring us an excellent piece this week covering machine learning solutions for data integration, cleaning, and data generation, which are quickly gaining traction and popularity. This post covers fundamental topics like data integration and cleaning, data programming, and market validation: https://www.oreilly.com/ideas/the-quest-for-high-quality-data
 
 
 
 
 
OSS: Adversarial Robustness
The theme for this week's featured ML libraries is Adversarial Robustness, covering tools for crafting adversarial attacks and testing defences. These libraries are an incredibly exciting addition that fall under our Responsible ML Principle #8, and the whole section was contributed by one of the Fellows at the Institute, Ilja Moisejevs from Calypso AI. The four featured libraries this week are listed below, followed by a short sketch of what an adversarial attack looks like in code:
 
  • CleverHans - library for testing adversarial attacks / defenses maintained by some of the most important names in adversarial ML, namely Ian Goodfellow (ex-Google Brain, now Apple) and Nicolas Papernot (Google Brain). Comes with some nice tutorials!
  • Foolbox - second biggest adversarial library. Has an even longer list of attacks - but no defenses or evaluation metrics. Geared more towards computer vision. Code easier to understand / modify than ART - also better for exploring blackbox attacks on surrogate models.
  • IBM Adversarial Robustness Toolbox (ART) - at the time of writing this is the most complete off-the-shelf resource for testing adversarial attacks and defenses. It includes a library of 15 attacks, 10 empirical defenses, and some nice evaluation metrics. Neural networks only.
  • AdvBox - generate adversarial examples from the command line with 0 coding using PaddlePaddle, PyTorch, Caffe2, MxNet, Keras, and TensorFlow. Includes 10 attacks and also 6 defenses. Used to implement StealthTshirt at DEFCON!
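 
To make the idea concrete, here is a minimal FGSM (fast gradient sign method) sketch written in plain PyTorch rather than with any of the libraries above; the tiny model and random inputs are placeholders, and each library listed here provides far more complete, battle-tested versions of this and many other attacks.
 
    import torch
    import torch.nn as nn

    def fgsm_attack(model, x, y, eps=0.03):
        # perturb x by eps in the direction that increases the classification loss
        x_adv = x.clone().detach().requires_grad_(True)
        loss = nn.functional.cross_entropy(model(x_adv), y)
        loss.backward()
        return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

    # usage sketch: a throwaway linear model and a random "image" batch
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
    x, y = torch.rand(4, 1, 28, 28), torch.randint(0, 10, (4,))
    x_adv = fgsm_attack(model, x, y)
    print((x_adv - x).abs().max())  # perturbation is bounded by eps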
 
 
If you know of any libraries that are not in the "Awesome MLOps" list, please give us a heads up or feel free to open a pull request.
 
 
 
We feature conferences that have core ML tracks (primarily in Europe for now) to help our community stay up to date with great events coming up.
 
Technical & Scientific Conferences
 
  • AI Conference Beijing [18/06/2019] - O'Reilly's signature applied AI conference, held in Beijing, China.
 
 
 
  • Data Natives [21/11/2019] - Data conference in Berlin, Germany.
 
  • ODSC Europe [19/11/2019] - The Open Data Science Conference in  London, UK.
 
 
 
 
 
Business Conferences
 
 
  • Big Data LDN 2019 [13/11/2019] - Conference for strategy and tech on big data in London, UK.
 
 
 
We showcase Machine Learning Engineering jobs (primarily in London for now) to help our community stay up to date with great opportunities that come up. It seems that the demand for data scientists continues to rise!
 
Leadership Opportunities
 
Mid-level Opportunities
 
Junior Opportunities
 
 
 
 
 
© 2018 The Institute for Ethical AI & Machine Learning