THE ML ENGINEER 🤖
Issue #17
Support the ML Engineer!
If you would like to suggest articles, ideas, papers, libraries, jobs or events, or to provide feedback, just hit reply or send us an email at a@ethical.institute! We have received a lot of great suggestions in the past. Thank you very much for everyone's support!
Check out this great tutorial on pre-processing image data using Keras. It provides hands-on examples of some of the most common pre-processing approaches for image data, including normalising, centering and standardising images. The article covers: 1) How to configure and use the ImageDataGenerator class (in Keras) for train, validation, and test datasets of images. 2) How to use the ImageDataGenerator to normalize pixel values when fitting and evaluating a convolutional neural network model. 3) How to use the ImageDataGenerator to center and standardize pixel values when fitting and evaluating a convolutional neural network model.
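For intuition, the three scaling operations mentioned above can be sketched in plain Python, independent of Keras. This is a minimal illustration rather than the tutorial's code: the four-pixel "image" and the helper names are hypothetical. In Keras, the corresponding ImageDataGenerator arguments are rescale, featurewise_center and featurewise_std_normalization.

```python
from statistics import fmean, pstdev

def normalize(pixels):
    """Rescale 8-bit pixel values from [0, 255] to [0, 1]."""
    return [p / 255.0 for p in pixels]

def center(pixels, mean):
    """Subtract a mean (estimated on the training set) so values are centred on zero."""
    return [p - mean for p in pixels]

def standardize(pixels, mean, std):
    """Centre the values and scale them to unit standard deviation."""
    return [(p - mean) / std for p in pixels]

# Hypothetical flattened grayscale training image (illustration only).
train_pixels = [0, 64, 128, 255]

# Statistics are estimated on the training data and then reused
# unchanged for validation and test data.
train_mean = fmean(train_pixels)
train_std = pstdev(train_pixels)

normalized = normalize(train_pixels)
centered = center(train_pixels, train_mean)
standardized = standardize(train_pixels, train_mean, train_std)
```

The point to notice is that the mean and standard deviation come from the training set only, which is why a generator has to be fitted on training data before it is used for evaluation.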
The international science journal Nature has published a very interesting article that draws attention to a key challenge in the scientific community: it calls for retiring statistical significance and using confidence intervals instead. The article argues that significant p-values are not always representative of the underlying effect, an issue which has led to overhyped claims and even to the dismissal of possibly crucial effects. There are some really great initiatives in the machine learning community that help raise the bar for quality through reproducibility, so that results can be evaluated properly. One very exciting initiative which we mentioned a few weeks ago is Papers With Code, which just released a new feature to provide GitHub badges that show SotA performance.
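As a rough illustration of reporting an interval instead of a pass/fail significance verdict, here is a minimal sketch using only the Python standard library. The data is made up for illustration (hypothetical accuracy differences between two models across ten random seeds), and the interval uses a normal approximation; for samples this small a t-distribution would usually be more appropriate.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev

def mean_confidence_interval(samples, confidence=0.95):
    """Return (mean, lower, upper) using a normal approximation."""
    n = len(samples)
    mean = fmean(samples)
    standard_error = stdev(samples) / sqrt(n)
    # Two-sided critical value, roughly 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, mean - z * standard_error, mean + z * standard_error

# Hypothetical accuracy differences (model B minus model A), one per seed.
differences = [0.012, 0.004, -0.003, 0.009, 0.015, 0.001, 0.007, -0.002, 0.011, 0.006]

mean_diff, lower, upper = mean_confidence_interval(differences)
print(f"mean difference: {mean_diff:.4f}, 95% CI: [{lower:.4f}, {upper:.4f}]")
```

Reporting the estimate together with its interval shows both the size of the effect and how uncertain it is, which a bare "significant / not significant" label hides.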
Forecasting can be an incredibly valuable analytical skill to have in your toolset. This short article provides a great introduction to the family of probability models for time series called "structural time series models" - this family of models encompasses autoregressive processes, moving averages, local linear trends, seasonality and regression. The article also provides a hands-on example using the TensorFlow Probability library, forecasting CO2 concentration using data from the Mauna Loa observatory in Hawaii.
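To make the "sum of components" idea concrete, here is a minimal plain-Python sketch that simulates a series from a local linear trend plus a seasonal component. It is not the article's code and does not use TensorFlow Probability: the function name, parameter names and default scales are all assumptions chosen for illustration. In TensorFlow Probability the analogous building blocks live in the tfp.sts module (for example LocalLinearTrend, Seasonal and Sum).

```python
import random

def simulate_structural_series(num_steps, num_seasons=12, level_scale=0.1,
                               slope_scale=0.01, noise_scale=0.2, seed=0):
    """Simulate observations = local linear trend + seasonal effect + noise."""
    rng = random.Random(seed)

    # One effect per season, shifted so that the effects sum to zero.
    raw_effects = [rng.gauss(0.0, 1.0) for _ in range(num_seasons)]
    average = sum(raw_effects) / num_seasons
    seasonal_effects = [effect - average for effect in raw_effects]

    level, slope = 0.0, 0.1  # hypothetical starting state
    series = []
    for step in range(num_steps):
        seasonal = seasonal_effects[step % num_seasons]
        series.append(level + seasonal + rng.gauss(0.0, noise_scale))
        # Local linear trend: the level moves by the current slope,
        # and both the level and the slope take small random-walk steps.
        level += slope + rng.gauss(0.0, level_scale)
        slope += rng.gauss(0.0, slope_scale)
    return series, seasonal_effects

series, seasonal_effects = simulate_structural_series(num_steps=48)
```

Fitting a structural time series model is the reverse of this simulation: given only the observed series, infer the trend, the seasonal effects and the noise scales, then roll the components forward to forecast.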
A group of Stanford researchers has released an update on their work with Snorkel MeTaL to tackle massive multi-task learning in natural language understanding. In this post, they talk about how they use Snorkel MeTaL to construct a simple model (pretrained BERT + linear task heads) and incorporate a variety of supervision signals (traditional supervision, transfer learning, multi-task learning, weak supervision, and ensembling) in a Massive Multi-Task Learning (MMTL) setting, achieving a new state-of-the-art score on the GLUE Benchmark and four of its nine component tasks (CoLA, SST-2, MRPC, STS-B).
MLOps = Featured OS Libraries
We are excited to add a new section to the MLOps library on data stream processing! Data stream processing falls under our Responsible ML Principle #4. The four featured libraries on data stream processing this week are:
- Apache Flink - Open source stream processing framework with powerful stream and batch processing capabilities.
- Faust - Streaming library built on top of Python’s asyncio library, using an async Kafka client and inspired by the Kafka Streams library.
- Kafka Streams - Kafka client library for building applications and microservices where the input and output are stored in Kafka clusters.
- Spark Streaming - Micro-batch processing for streams using the Apache Spark framework as a backend, supporting stateful exactly-once semantics.
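A common operation in stream processing is aggregating events over fixed, non-overlapping time windows (often called tumbling windows). The following is a library-independent sketch in plain Python, not the API of any of the libraries above. The function name and the (timestamp, key) event format are assumptions for illustration, and it assumes events arrive in timestamp order, which real stream processors cannot take for granted.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_seconds):
    """Yield (window_start, counts_by_key) for each closed window.

    `events` is an iterable of (timestamp, key) pairs in timestamp order.
    """
    current_window = None
    counts = defaultdict(int)
    for timestamp, key in events:
        window_start = (timestamp // window_seconds) * window_seconds
        # An event from a later window closes the current one.
        if current_window is not None and window_start != current_window:
            yield current_window, dict(counts)
            counts = defaultdict(int)
        current_window = window_start
        counts[key] += 1
    # Flush the last, still-open window when the stream ends.
    if current_window is not None:
        yield current_window, dict(counts)

# Hypothetical event stream: (timestamp in seconds, event type).
events = [(0, "click"), (3, "view"), (9, "click"),
          (10, "click"), (14, "view"), (21, "view")]

windows = list(tumbling_window_counts(events, window_seconds=10))
```

Because the function is a generator, it holds only the current window's counts in memory. Production frameworks add what this sketch leaves out, such as out-of-order events, fault tolerance and distributed state.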
We feature conferences that have core ML tracks (primarily in Europe for now) to help our community stay up to date with great events coming up.
Technical Conferences
- DataFest19 [11/03/2019] - Two-week festival of Data Innovation hosted across Scotland, UK.
- AI Conference Beijing [18/06/2019] - O'Reilly's signature applied AI conference in Asia in Beijing, China.
- Data Natives [21/11/2019] - Data conference in Berlin, Germany.
- ODSC Europe [19/11/2019] - The Open Data Science Conference in London, UK.
Business Conferences
- World Summit AI Americas [10/04/2019] - Large scale AI summit in Montreal, Canada.
  - Come join our panel on AI Ethics and Tools.
- AI Expo Global [19/04/2019] - Global conference on artificial intelligence in London, UK.
- Big Data LDN 2019 [13/11/2019] - Conference for strategy and tech on big data in London, UK.
Junior Opportunities
Mid-level Opportunities
Leadership Opportunities
© 2018 The Institute for Ethical AI & Machine Learning