
THE ML ENGINEER
Issue #11

 
 
This week in Issue #12:
Bias and explainability in machine learning, federated learning with PyTorch, ultra data visualisation deep dive, practical tutorial on recommenders, intro to learning curves for model evaluation, cybersecurity in development of ML, data optimisation frameworks, new AI conferences, ML jobs and more!
 
Support the ML Engineer!
Forward the email, or share the online version on 🐦 Twitter, 💼 LinkedIn and 📕 Facebook!
 
If you would like to suggest articles, ideas, papers, libraries, jobs, events or provide feedback just hit reply or send us an email to a@ethical.institute!
 
 
 
 
A deep dive into algorithmic bias and explainability in data and machine learning using the XAI Library. This video walks through a case study that automates a loan approval process to show how undesired bias can be introduced at each stage. The talk also covers how these undesired biases can be tackled using open source tools, together with a three-step process consisting of 1) data analysis, 2) model evaluation and 3) production monitoring.
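As a rough illustration of the first step (data analysis), the sketch below compares approval rates across groups in a toy loan dataset. It does not use the XAI Library; the field names (`gender`, `approved`), the records and the helper functions are all hypothetical, made up for this example only.

```python
from collections import defaultdict

def approval_rates(records, group_key, label_key="approved"):
    """Return the share of approved applications for each group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [approved, total]
    for row in records:
        counts[row[group_key]][0] += 1 if row[label_key] else 0
        counts[row[group_key]][1] += 1
    return {group: approved / total for group, (approved, total) in counts.items()}

def rate_ratio(rates):
    """Lowest group rate divided by the highest (1.0 means equal rates).

    Assumes at least one group has a non-zero approval rate.
    """
    return min(rates.values()) / max(rates.values())

# Hypothetical toy data: eight loan applications with a made-up protected attribute.
loans = [
    {"gender": "f", "approved": True},
    {"gender": "f", "approved": False},
    {"gender": "f", "approved": False},
    {"gender": "f", "approved": False},
    {"gender": "m", "approved": True},
    {"gender": "m", "approved": True},
    {"gender": "m", "approved": True},
    {"gender": "m", "approved": False},
]

rates = approval_rates(loans, "gender")
ratio = rate_ratio(rates)
print(rates, ratio)
```

A ratio well below 1.0 does not prove the process is unfair on its own, but it is the kind of imbalance worth investigating before a model is trained on the data.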
 
 
A great practical tutorial introducing federated learning, using the PySyft library to distribute an MNIST deep learning task across multiple devices. Federated learning, which we covered a few weeks ago, is a method that allows machine learning models to be trained across multiple edge devices in the network instead of on a central server.
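To make the idea concrete, here is a minimal sketch of federated averaging in plain Python. It does not use PySyft or MNIST: the one-parameter linear model, the simulated clients and every function name are assumptions chosen to keep the example short. Each client trains on its own private data, and only the resulting weights are sent back to be averaged.

```python
import random

def local_update(w, data, lr=0.3, epochs=5):
    """Run a few gradient-descent steps on one client's private data."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Combine client weights, weighting each by its number of samples."""
    total = sum(client_sizes)
    return sum(w * n for w, n in zip(client_weights, client_sizes)) / total

def make_client_data(rng, n, true_w=3.0):
    """Simulate one device's local dataset for the model y = true_w * x."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, true_w * x + rng.gauss(0, 0.01)) for x in xs]

rng = random.Random(0)
clients = [make_client_data(rng, n) for n in (20, 50, 30)]

global_w = 0.0
for _ in range(30):
    # Each client starts from the current global weight; raw data never leaves the client.
    local_ws = [local_update(global_w, data) for data in clients]
    global_w = federated_average(local_ws, [len(d) for d in clients])

print(global_w)
```

The server only ever sees the weights returned by `local_update`, which is the property that makes the approach attractive for data that has to stay on edge devices.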
 
 
Really awesome free e-book on all things data visualisation using R. This very comprehensive resource introduces best practices in data analysis, R usage, data transformation, colour/display selection, graph usage, models, maps and more (much more). You can also buy a hard copy on Amazon to support the cause.
 
 
Hands-on, end-to-end tutorial on recommender systems that dives into the movie dataset. It follows a CRISP-DM-like process to explain each stage of model development (e.g. data understanding, preparation, etc). The tutorial covers fundamental concepts in recommender systems such as explicit/implicit feedback and dives into practical examples implementing Collaborative Filtering (ALS), Neural Collaborative Filtering, Restricted Boltzmann Machine, Smart Adaptive Recommendations, Surprise SVD and Vowpal Wabbit.
 
 
Learning curves are a fundamental technique for evaluating machine learning models. Machine Learning Mastery brings us a gentle introduction to using learning curves to diagnose model performance. It covers what learning curves are, an example of how to implement them, and an overview of how to read the graphs to diagnose an underfit, overfit or well-fit model.
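For a feel of what this looks like in code, the sketch below records training and validation loss per epoch for a tiny linear model and applies a crude rule of thumb to the two curves. This is not the article's code: the model, the synthetic data, the `diagnose` helper and its thresholds (`high_loss`, `rebound`) are illustrative assumptions.

```python
import random

def mse(w, b, data):
    """Mean squared error of the line y = w*x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train_with_curves(train, val, epochs=100, lr=0.1):
    """Fit y = w*x + b by gradient descent, logging both losses each epoch."""
    w, b = 0.0, 0.0
    train_curve, val_curve = [], []
    n = len(train)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in train) / n
        w -= lr * grad_w
        b -= lr * grad_b
        train_curve.append(mse(w, b, train))
        val_curve.append(mse(w, b, val))
    return train_curve, val_curve

def diagnose(train_curve, val_curve, high_loss=1.0, rebound=0.1):
    """Crude heuristic; both thresholds are arbitrary illustrative values."""
    if train_curve[-1] > high_loss:
        return "underfit"  # training loss never got low
    if val_curve[-1] > min(val_curve) * (1 + rebound):
        return "overfit"   # validation loss climbed back up from its best value
    return "good fit"

# Synthetic data for y = 2x + 1 with a little noise, split 80/20.
rng = random.Random(42)
xs = [rng.uniform(-1, 1) for _ in range(100)]
points = [(x, 2 * x + 1 + rng.gauss(0, 0.1)) for x in xs]
train, val = points[:80], points[80:]

train_curve, val_curve = train_with_curves(train, val)
print(diagnose(train_curve, val_curve))
```

In practice the two lists would be plotted against the epoch number and read by eye; the heuristic only mirrors the usual reading of the shapes and is no substitute for looking at the graph.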
 
 
It is common to hear about applications of machine learning to cybersecurity, but it is just as critical to look at cybersecurity within machine learning itself. This great post on the O'Reilly blog provides an overview of the forms vulnerabilities can take in machine learning model development, together with a few conceptual steps to consider when securing machine learning development.
 
 
 
 
We are excited to see the Awesome MLOps list growing to almost 300 stars now! Thanks to everyone for your support! This week's edition focuses on new libraries for Data Storage Optimisation, which fall under our Responsible ML Principle #4. The four featured libraries this week are:
 
  • Alluxio - A virtual distributed storage system that bridges the gap between computation frameworks and storage systems.
  • EdgeDB - NoSQL interface for Postgres that allows object-style interaction with the stored data.
  • BayesDB - Database that allows for built-in non-parametric Bayesian model discovery and querying of data through a database-like interface.
  • Apache Arrow - In-memory columnar representation of data compatible with Pandas, Hadoop-based systems, etc.
 
If you know of any libraries that are not in the "Awesome MLOps" list, please do give us a heads up or feel free to open a pull request.
 
 
 
We feature conferences that have core ML tracks (primarily in Europe for now) to help our community stay up to date with great events coming up.
 
Technical Conferences
 
  • DataFest19 [11/03/2019] - Two-week festival of Data Innovation hosted across Scotland, UK.
 
 
  • AI Conference Beijing [18/06/2019] - O'Reilly's signature applied AI conference in Asia, held in Beijing, China.
 
 
  • Data Natives [21/11/2019] - Data conference in Berlin, Germany.
 
  • ODSC Europe [19/11/2019] - The Open Data Science Conference in London, UK.
 
 
Business Conferences
 
 
 
 
  • Big Data LDN 2019 [13/11/2019] - Conference for strategy and tech on big data in London, UK.
 
 
 
We showcase Machine Learning Engineering jobs (primarily in London for now) to help our community stay up to date with great opportunities that come up. It seems that the demand for data scientists continues to rise!
 
Junior Opportunities
 
 
Mid-level Opportunities
 
Leadership Opportunities
 
 
 
© 2018 The Institute for Ethical AI & Machine Learning