Intercom 2X'd Engineering Velocity Intercom claims a 2x productivity increase with coding agents, and this podcast has some pretty interesting insights on how they've been approaching this across their org: Many organisations are aggressively trying to figure out how to unlock AI coding productivity beyond the individual and across the organisation, and it seems Intercom has figured out a way forward. In nine months they doubled their merged-PRs metric while keeping quality stable, by treating the AI workflow like an internal product: instrumenting usage with telemetry, analyzing anonymized session data, and building a shared skills repository with hooks that enforce engineering standards automatically. It is great to see that the secret is often nothing more than solid engineering practices at the foundation; the gains come less from "allowing everyone to tokenmax" and more from building the surrounding foundations for high-quality PRs: fixing flaky tests, CI, internal tools and reviews. This is the most important time to invest in CI and code-review bottlenecks, in DORA metric improvements, and in a culture where PMs, designers and engineers can all safely ship code.
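To make the "hooks that enforce engineering standards" idea concrete, here is a minimal sketch of what such a policy hook might look like: a check that runs on an agent-produced change before it is allowed through. All names and rules here are hypothetical illustrations, not Intercom's actual tooling.

```python
# Hypothetical policy hook for agent-produced changes. The file-layout
# conventions (tests/ prefix) and the size threshold are illustrative.

def check_change(changed_files, diff_stats):
    """Return a list of policy violations for a proposed change."""
    violations = []
    touched_src = [f for f in changed_files
                   if f.endswith(".py") and not f.startswith("tests/")]
    touched_tests = [f for f in changed_files if f.startswith("tests/")]
    # Standard: source changes must ship with test changes.
    if touched_src and not touched_tests:
        violations.append("source changed without accompanying tests")
    # Standard: keep PRs reviewable in size.
    if diff_stats.get("lines_added", 0) > 800:
        violations.append("change too large for a single PR; split it")
    return violations
```

The point is less the specific rules than where they live: encoding standards as automated checks means every agent session gets the same guardrails, and the telemetry on which rules fire most often tells you where the workflow needs work.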
Chinese giant Alibaba releases another impressive open-source model with Qwen3.6, which at only 27B parameters shows strong performance on coding tasks: From the reports, this release brings flagship-level agentic coding into a dense 27B open-weight model, and it is remarkable how much can be packed into such a relatively small model. This sheds the complexity of very large MoE systems while still outperforming Qwen's previous 397B-total / 17B-active open-source flagship on major coding-agent benchmarks like SWE-bench. The interesting bit is not just the scores, but the fact that this model is open sourced under Apache-2.0, supports text/image/video inputs, offers a 262K native context window extendable up to 1M tokens, and introduces "thinking preservation" for multi-turn agentic workflows.
It seems we're at a stage where deep learning is evolving from alchemy into an engineering discipline; this exciting paper lays out the scientific theory that is emerging for deep learning: This matters because a robust scientific foundation means fewer blind hyperparameter searches, more predictable scaling, better interpretability, and stronger foundations for safety. Deep learning theory is starting to look less like scattered math and more like an emerging "mechanics of learning": a physics-style framework for predicting training dynamics, representations, final weights, and model performance. The paper breaks the field down into five buckets: 1) solvable toy settings such as deep linear networks and NTKs; 2) useful limits such as infinite width/depth and lazy vs. rich feature learning; 3) empirical laws such as scaling laws and edge-of-stability behavior; 4) hyperparameter theories such as μP and learning-rate/batch-size scaling; and 5) universal phenomena, where different architectures, datasets, and training recipes converge to similar representations.
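The "empirical laws" bucket is the easiest to make concrete: neural scaling laws model loss as a power law in parameter count, roughly L(N) = (Nc / N)^alpha + L_inf. Below is a small illustration of recovering the exponent from loss measurements with a log-log linear fit; the constants and the "observations" are synthetic, chosen only to resemble published scaling-law fits.

```python
# Fit a power-law scaling exponent from (synthetic) loss-vs-size data.
import numpy as np

def power_law(n, nc, alpha, l_inf):
    # Loss as a power law in parameter count, with an irreducible floor.
    return (nc / n) ** alpha + l_inf

# Illustrative constants (not from any specific paper's fit).
nc, alpha, l_inf = 8.8e13, 0.076, 1.69
ns = np.array([1e7, 1e8, 1e9, 1e10])          # model sizes
losses = power_law(ns, nc, alpha, l_inf)       # synthetic observations

# On log axes the excess loss (L - L_inf) is a straight line with
# slope -alpha, so a degree-1 polyfit recovers the exponent.
slope, intercept = np.polyfit(np.log(ns), np.log(losses - l_inf), 1)
alpha_hat = -slope
```

This is why such laws are practically useful: fit the exponent on small runs, then extrapolate to predict the loss of a much larger run before paying for it.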
META KernelEvolve Optimizes AI Infrastructure Meta is building agentic systems that optimize the AI infrastructure underneath their large-scale machine learning models, and they have released their framework: Meta's KernelEvolve is a framework for optimizing the low-level infrastructure that determines whether large-scale models are economically viable in production. KernelEvolve sits inside Meta's Ranking Engineer Agent stack and turns kernel authoring into a closed-loop search problem across NVIDIA GPUs, AMD GPUs, MTIA chips, and CPUs, using LLM-generated candidates, retrieval-augmented hardware knowledge, tree search, profiling feedback, and automated correctness/performance evaluation. Meta reports compressing weeks of expert kernel work into hours, achieving over 60% inference throughput improvement for the Andromeda Ads model on NVIDIA GPUs and over 25% training throughput improvement for an ads model on MTIA, while supporting DSLs and backends like Triton, CuTe DSL, FlyDSL, CUDA, HIP, and MTIA C++. For production ML practitioners, the key takeaway is that the next bottleneck in model iteration may be less about model design alone and more about automating the systems layer around it: kernel generation, hardware portability, profiling, benchmarking, and continuous optimization across increasingly heterogeneous accelerator fleets.
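The closed-loop structure described above can be sketched in a few lines: propose a candidate kernel, discard it if it fails correctness checks, benchmark the survivors, and keep the fastest. The sketch below uses hypothetical stand-ins for the LLM proposer and the profiler, and omits the tree search and retrieval components; it is a minimal illustration of the loop, not Meta's implementation.

```python
# Stripped-down sketch of a propose -> verify -> benchmark -> keep loop.
import random

def closed_loop_search(propose, is_correct, benchmark, baseline,
                       iters=50, seed=0):
    """Return (best_kernel, best_latency); never worse than the baseline."""
    random.seed(seed)
    best, best_lat = baseline, benchmark(baseline)
    for _ in range(iters):
        cand = propose(best)        # e.g. an LLM mutating the current best
        if not is_correct(cand):    # automated correctness evaluation
            continue
        lat = benchmark(cand)       # profiling feedback
        if lat < best_lat:          # greedy: keep only improvements
            best, best_lat = cand, lat
    return best, best_lat
```

The evaluator is the crux: because correctness and latency are checked automatically, the proposer is free to be aggressive and wrong most of the time, and the loop still only ever accepts verified improvements.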
OpenAI has released their latest model GPT-5.5. This model is positioned less as a chatbot and more as a long-running worker for coding, research, document-heavy knowledge work, tool use, and computer operation. For production ML practitioners the main callouts are stronger agentic coding and systems reasoning, better long-context performance up to 1M tokens, improved tool reliability, and greater token efficiency while maintaining GPT-5.4-like per-token latency. The benchmarks sound impressive (although, as we know, these are never confirmed until the model is taken for a spin), with 82.7% on Terminal-Bench 2.0, 58.6% on SWE-Bench Pro, etc. Keen to see how this performs in the wild; it would be great to hear experiences from practitioners as they take it into production projects.
Upcoming MLOps Events The MLOps ecosystem continues to grow at breakneck speed, making it ever harder for us as practitioners to stay up to date with relevant developments. A fantastic way to keep on top of relevant resources is through the great community and events that the MLOps and Production ML ecosystem offers. This is the reason why we have started curating a list of upcoming events in the space, which are outlined below.
Events we are speaking at this year:
Other relevant events:
In case you missed our talks, check our recordings below:
Check out the fast-growing ecosystem of production ML tools & frameworks at the GitHub repository, which has reached over 20,000 ⭐ stars. We are currently looking for more libraries to add - if you know of any that are not listed, please let us know or feel free to add a PR. Here are a few featured open source libraries that we maintain: - KAOS - K8s Agent Orchestration Service for managing the KAOS in large-scale distributed agentic systems.
- Kompute - Blazing fast, lightweight and mobile phone-enabled GPU compute framework optimized for advanced data processing usecases.
- Production ML Tools - A curated list of tools to deploy, monitor and optimize machine learning systems at scale.
- AI Policy List - A mature list that maps the ecosystem of artificial intelligence guidelines, principles, codes of ethics, standards, regulation and beyond.
- Agentic Systems Tools - A new list that aims to map the emerging ecosystem of agentic systems with tools and frameworks for scaling this domain.
Please do support some of our open source projects by sharing, contributing or adding a star ⭐
The Institute for Ethical AI & Machine Learning is a European research centre that carries out world-class research into responsible machine learning.
You received this email because you are registered with The Institute for Ethical AI & Machine Learning's newsletter "The Machine Learning Engineer"
© 2023 The Institute for Ethical AI & Machine Learning