The Institute for Ethical AI & Machine Learning

THE ML ENGINEER 🤖
Issue #15

This week in Issue #15:

The NLP of human noises, the challenges of debiasing AI, feature visualisations, the GAN stroke of genius, essential NLP tools & tips, AI comedy generated by humans, privacy preserving ML libraries, upcoming AI conferences, new Machine Learning jobs and more 🚀.

Support the ML Engineer!

Forward the email, or share the online version on 🐦 Twitter, 💼 LinkedIn and 📕 Facebook!

If you would like to suggest articles, ideas, papers, libraries, jobs, events or provide feedback just hit reply or send us an email to a@ethical.institute! We have received a lot of great suggestions in the past, thank you very much for everyone's support!

AI debiasing doesn't debias bias

There has been a lot of very interesting research in the area of algorithmic bias. This paper is no exception - a great insight into how some of the debiasing methods proposed in previous research only cover up the systematic gender biases identified, as opposed to actually removing them. Highly recommended (relatively short) read: "Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them".

The NLP of human noises

Very interesting piece of research that goes beyond words. This paper dives into the noises humans vocalise during conversations. These include all the "hmm"s, "umm"s, "ehem"s and beyond. What's better is that the authors have made available a really interesting interactive visualisation which allows us to see how these sounds are clustered together. Try it out by hovering with your mouse (you may want to use headphones or lower the volume of your laptop).

Feature visualisation via activation

This interesting post goes beyond feature importance and introduces the concept of "feature visualisation". As they put it, feature visualisation allows us to wear "lenses" to look through the eyes of the network. This is great as it promises to help us answer the question of "What have these networks learned that allows them to classify images so well?". This area could complement metrics such as feature importance and introduce explainability into models that could then allow domain experts to interpret the reasoning behind complex networks.
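One common way to produce such visualisations is activation maximisation: start from a random input and nudge it with gradient ascent until a chosen neuron fires as strongly as possible. Below is a minimal sketch of that idea, assuming a single hypothetical neuron `tanh(w . x + b)` instead of a real image network; the names `neuron_activation` and `visualise_feature`, the weights and the step size are all illustrative choices of ours, not taken from the post.

```python
import numpy as np

def neuron_activation(x, w, b):
    """Activation of one hypothetical neuron: tanh(w . x + b)."""
    return np.tanh(w @ x + b)

def visualise_feature(w, b, steps=300, lr=0.5, seed=0):
    """Gradient ascent on the input x to maximise the neuron's activation.

    The input is kept on the unit sphere so it cannot grow without bound.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=w.shape)
    x /= np.linalg.norm(x)
    for _ in range(steps):
        a = neuron_activation(x, w, b)
        grad = (1.0 - a ** 2) * w      # derivative of tanh(w . x + b) w.r.t. x
        x = x + lr * grad              # ascent step
        x /= np.linalg.norm(x)         # project back onto the unit sphere
    return x

# Illustrative weights for the toy neuron (assumed, not learned).
w = np.array([0.3, -0.4, 0.5, 0.1])
b = 0.1

x_star = visualise_feature(w, b)
# Cosine similarity between the optimised input and the weight vector.
cosine = float(x_star @ w / np.linalg.norm(w))
print("optimised input:", np.round(x_star, 3))
print("cosine similarity with w:", round(cosine, 4))
```

For this toy neuron the optimised input simply lines up with the weight vector. With a deep image network the same loop is run on pixels through automatic differentiation, usually with extra regularisation, and the result is the kind of "lens" image the post describes.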

The GAN stroke of genius

NVIDIA comes back with a very interesting piece of research. They use a deep neural network to convert doodles into photorealistic images. This could be a great tool for graphic designers (and other creative jobs), allowing experts to prototype and experiment faster before diving into their craft. In their video they actually show how this tool can be used. We may finally be able to keep up with the legendary tutorials of Bob Ross thanks to these types of tools.

Essential NLP Tools, Code & Tips

Great post which outlines key fundamentals in NLP. The post includes key methods such as tokenisation, bag of words, stop word removal, stemming, lemmatisation and topic modelling. For those keen to dive deeper, check out the PyTorch tutorial we shared a few weeks ago, which provides examples on Neural Machine Translation, Question-Answering Matching, News Category Classification and Movie Rating Classification.
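To give a flavour of the first few methods, here is a minimal sketch in plain Python that chains tokenisation, stop word removal, a deliberately naive suffix-stripping stemmer and a bag of words count. The stop word list, the suffix list and the function names are our own illustrative assumptions, not code from the post; in practice you would likely reach for a library such as NLTK or spaCy.

```python
import re
from collections import Counter

# Tiny illustrative stop word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "is", "are", "on", "and", "of", "to", "in"}

# Suffixes stripped by the naive stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "s")

def tokenise(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def remove_stop_words(tokens):
    """Drop tokens that appear in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]

def naive_stem(token):
    """Strip the first matching suffix, keeping at least 3 characters.

    This is a crude stand-in for a real stemmer such as Porter's.
    """
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def bag_of_words(text):
    """Count stemmed, non-stop-word tokens in the text."""
    tokens = remove_stop_words(tokenise(text))
    return Counter(naive_stem(t) for t in tokens)

bow = bag_of_words("The cats are chasing the cat, and the dog chased cats.")
print(bow)  # counts: cat=3, chas=2, dog=1
```

Note how "chasing" and "chased" collapse to the same stem "chas", which is not a real word. That is the usual trade-off of stemming, and the reason lemmatisation exists as a dictionary-based alternative.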

AI comedy generated by humans

Edinburgh AI NLP researcher Naomi Saphra delivers a comedy sketch where she covers topics around her research adventures, including funding, sources, projects and beyond. The internet has seen great comedy sketches diving into programming (such as this JavaScript talk); perhaps comedy in technical fields will become a bigger thing. If anyone in our community knows of any events of this kind, we'd be happy to feature them in our upcoming events section below.

MLOps = Featured OS Libraries

We are excited to see the Awesome MLOps list growing to over 300 stars now! Thanks to everyone for your support! This week's edition is focused on new libraries for Privacy Preserving Machine Learning, which fall under our Responsible ML Principle #6. The four featured libraries this week are:

TensorFlow Privacy - A Python library that includes implementations of TensorFlow optimizers for training machine learning models with differential privacy.
TF-Encrypted - A Python library built on top of TensorFlow for researchers and practitioners to experiment with privacy-preserving machine learning.
PySyft - A Python library for secure, private Deep Learning. PySyft decouples private data from model training, using Multi-Party Computation (MPC) within PyTorch.
Uber SQL Differential Privacy - Uber's open source framework that enforces differential privacy for general-purpose SQL queries.
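
For readers new to differential privacy, the core idea behind several of these libraries can be sketched with the classic Laplace mechanism: answer a query with calibrated random noise added, so that no single record can be confidently inferred from the result. The snippet below is a minimal illustration for a counting query, assuming NumPy; the function `private_count`, the sample data and the choice of epsilon are hypothetical and do not reflect the API of any library listed above.

```python
import numpy as np

def private_count(values, predicate, epsilon, rng):
    """Return a noisy count of the values that satisfy the predicate.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the Laplace noise scale is 1 / epsilon.
    """
    true_count = sum(1 for v in values if predicate(v))
    sensitivity = 1.0
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

rng = np.random.default_rng(42)

# Hypothetical dataset: ages of eight people.
ages = [23, 35, 41, 29, 52, 38, 61, 27]

# Noisy answer to "how many people are 40 or older?" (the true answer is 3).
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print("noisy count:", round(float(noisy), 2))
```

A smaller epsilon means more noise and stronger privacy, while a larger epsilon gives more accurate answers. The featured libraries apply the same trade-off to much harder settings, such as gradient updates during training or arbitrary SQL queries.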

If you know of any libraries that are not in the "Awesome MLOps" list, please do give us a heads up or feel free to open a pull request!

MLConf = Conferences & Events

We feature conferences that have core ML tracks (primarily in Europe for now) to help our community stay up to date with great events coming up.

Technical Conferences

DataFest19 [11/03/2019] - Two-week festival of Data Innovation hosted across Scotland, UK.

PyCon + PyData Florence [02/05/2019] - Python X comes this year with a PyData focus in Florence, Italy.

AI Conference Beijing [18/06/2019] - O'Reilly's signature applied AI conference in Asia in Beijing, China.

RAAIS 2019 [28/06/2019] - The Research and Applied AI Summit in London, UK.

Data Natives [21/11/2019] - Data conference in Berlin, Germany.

ODSC Europe [19/11/2019] - The Open Data Science Conference in London, UK.

Business Conferences

AI Expo Global [19/04/2019] - Global conference on artificial intelligence in London, UK.
- Come join us at our talk on AI orchestration at scale.

Predictive Analytics World [18/11/2019] - Conference for Business AI in Berlin, Germany.

Big Data LDN 2019 [13/11/2019] - Conference for strategy and tech on big data in London, UK.

MLJobs = Jobs & Careers

We showcase Machine Learning Engineering jobs (primarily in London for now) to help our community stay up to date with great opportunities that come up. It seems that the demand for data scientists continues to rise!

Junior Opportunities

Seldon is hiring for a Machine Learning / Data Engineer in London
Migacore is hiring for a Machine Learning Engineer in London
CloudNC is hiring for a Machine Learning Engineer in London
Babylon Health is hiring for a Machine Learning Engineer in London
Chattermill is hiring for a Machine Learning Engineer in London

Mid-level Opportunities

Proportunity is hiring for a Senior Machine Learning Engineer in London
Twitter is hiring for a Senior Machine Learning Engineer in London
Atlas ML is hiring for a Lead NLP Engineer in London
StreetBees is hiring for a Senior Data Scientist in London
Expedia is hiring for a Principal Data Scientist in London
QuantumBlack is hiring for a Senior Machine Learning Engineer in London
Tractable is hiring for a Senior Deep Learning Engineer

Leadership Opportunities

Fractal Labs is hiring for a VP of Engineering in London
Distributed is hiring for a VP of Engineering in London
FactMata is hiring for a Head of Machine Learning in London
Brainpool.ai is hiring for a Head of Machine Learning in London, UK
Cytora is hiring for a Data Science Director in London