THE ML ENGINEER
Issue #11

 
 
This week in Issue #11:
Top sources for machine learning datasets, major reinforcement learning achievements in 2018, getting better deep learning results, a beginner's intro to deep NLP, feature flags in the world of AI, StyleGANs and language models for Airbnb listings, Function-as-a-Service (FaaS) frameworks, new AI conferences, ML jobs and more!
 
Support the ML Engineer!
Forward the email, or share the online version on 🐦 Twitter, 💼 LinkedIn and 📕 Facebook!
 
If you would like to suggest articles, ideas, papers, libraries, jobs or events, or provide feedback, just hit reply or send us an email at a@ethical.institute!
 
 
 
 
It can be quite hard to find suitable datasets for a given machine learning problem, or even just to experiment with. This post provides a great list of sources for finding datasets. It also includes a description, usage examples and, in some cases, the algorithm code to solve the machine learning problem associated with each dataset.
 
 
A really insightful article listing 10 of the "top reinforcement learning papers in 2018". It includes a very readable summary of each paper, together with a brief overview of the core idea, its achievement and other insights such as thoughts from the community, future research areas, etc.
 
 
Machine Learning Mastery brings us a one-week (completely free) mini-course to "get better performance from your deep learning in 7 days". The course covers key topics including a high-level conceptual framework, batch size, learning rate schedules, batch normalisation, weight regularisation, noise and early stopping. What better way to start your week tomorrow than with a 7-day commitment to improving your deep learning skills?
 
 
A really accessible set of tutorials introducing beginners to deep NLP with PyTorch. The GitHub repo includes tutorials for Neural Machine Translation, Movie Rating Classification, News Category Classification and Question Answering for SQuAD. The repo also provides references to relevant papers for further reading.
 
 
A humorous use of StyleGANs and a language model trained on OpenDataSoft’s Airbnb Listings. The thisairbnbdoesnotexist.com website creates a new synthetic listing every time you refresh the page. This project was inspired by thispersondoesnotexist.com, which does the same thing but with synthetic faces (and minus the text). Perhaps we are seeing a new this-X-doesnotexist.com trend.
 
 
In this blog post, Slack's director of engineering makes a brief case for feature flags in machine learning. Software engineers have always had a love-hate relationship with feature flags, and some say they are one of the worst kinds of technical debt. Regardless, feature flags are rarely left out of larger enterprise codebases, and they have slowly made their way into DevOps infrastructure.
 
 
 
 
We are excited to see the Awesome MLOps list growing to almost 300 stars now! Thanks to everyone for your support! This week's edition focuses on Function-as-a-Service (FaaS) frameworks, which fall under our Responsible ML Principle #4. The four featured libraries this week are:
 
  • OpenFaaS - Serverless functions framework with RESTful API on Kubernetes
  • Fission - Serverless functions as a service framework on Kubernetes
  • Hydrosphere ML Lambda - Open source model management cluster for deploying, serving and monitoring machine learning models and ad-hoc algorithms with a FaaS architecture
  • Hydrosphere Mist - Serverless proxy for Apache Spark clusters
 
If you know of any libraries that are not in the "Awesome MLOps" list, please do give us a heads up or feel free to open a pull request.
 
 
 
We feature conferences that have core ML tracks (primarily in Europe for now) to help our community stay up to date with great events coming up.
 
Technical Conferences
 
  • DataFest19 [11/03/2019] - Two-week festival of Data Innovation hosted across Scotland, UK.
 
 
  • AI Conference Beijing [18/06/2019] - O'Reilly's signature applied AI conference in Asia in Beijing, China.
 
 
  • Data Natives [21/11/2019] - Data conference in Berlin, Germany.
 
  • ODSC Europe [19/11/2019] - The Open Data Science Conference in London, UK.
 
 
Business Conferences
 
 
 
 
  • Big Data LDN 2019 [13/11/2019] - Conference for strategy and tech on big data in London, UK.
 
 
 
We showcase Machine Learning Engineering jobs (primarily in London for now) to help our community stay up to date with great opportunities that come up. It seems that the demand for data scientists continues to rise!
 
Junior Opportunities
 
 
Mid-level Opportunities
 
Leadership Opportunities
 
 
 
© 2018 The Institute for Ethical AI & Machine Learning