Subscribe to the Machine Learning Engineer Newsletter

Receive curated articles, tutorials and blog posts from experienced Machine Learning professionals.

Issue #393 🤖

Thank you for being one of the 70,000+ ML professionals and enthusiasts who receive weekly articles & tutorials on Machine Learning & MLOps 🤖 You can join the newsletter at https://bit.ly/state-of-ml-2025 ⭐

If you like the content, please support the newsletter by sharing it with your friends via ✉️ Email, 🐦 Twitter, 💼 LinkedIn and 📕 Facebook!

This week in ML Engineering:

The Future of Agents with Andrew Ng
Sebastian Raschka's Local Agent Setup
Memory in the Age of Agents
PydanticAI V2 Now Released
O'Reilly Radar Trends to Watch in 2026
Open Source ML Frameworks
Awesome AI Guidelines to check out this week
+ more 🚀

The Future of Agents with Andrew Ng

This is one of the best breakdowns of the current state of agentic systems and agentic development, by the one and only Andrew Ng! He breaks down his setup, what he's working on, and a few predictions for the future:

Andrew Ng talked about how coding agents have advanced so fast that other domains, such as product definition, legal review, design, marketing, and data access, are only now catching up. One point that really resonated is that all organisations are now in a race to sort out their data foundations, as agentic systems multiply the value of data by enabling insights that were not previously available at such speed and detail. There are new delivery bottlenecks, but industry is figuring out how to unblock them with small teams of high-context engineers who can use AI tools across adjacent functions. Andrew Ng also distinguishes incremental automation from broader process redesign, using loan underwriting as an example where the value comes from reworking the full business workflow rather than automating one review step. I definitely recommend listening to this fireside chat, it's always striking how insightful Andrew's takes are!

Sebastian Raschka's Local Agent Setup

Sebastian Raschka just dropped a full breakdown of his local LLM agent setup, and as always he has super relevant insights for any engineering practitioner - here are a few highlights:

Sebastian is clearly an advocate for open-weight models, and seems to be using Ollama heavily as the model-serving layer. When it comes to agent harnesses he tends to favour Qwen Code (the first time I've heard of it), Codex CLI and Claude Code. He really emphasises that local agent workflows depend largely on inference speed, long-context behavior, tool-call reliability, permissions, telemetry, and task-specific evaluation. And as always, an article wouldn't have Sebastian's signature without an in-depth LLM architectural piece: there was an emphasis on 30–35B Mixture-of-Experts coding models such as Qwen3.6 35B-A3B, North Mini Code, and Nemotron 3 Nano, which can be usable for routine coding-agent tasks on workstation-class hardware, although harness choice and token use materially affect performance. Hopefully this becomes a series and also starts diving into his agentic engineering workflows, as I'd definitely be keen on these!
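To make "tool-call reliability" a bit more concrete, here is a minimal sketch of how one might score it for a local model. This is not from Sebastian's article: the function name, the assumed JSON shape with "name" and "arguments" keys, and the sample outputs are all hypothetical, and real harnesses each use their own tool-call format.

```python
import json


def tool_call_reliability(raw_outputs, required_keys=("name", "arguments")):
    """Return the fraction of raw model outputs that are well-formed tool calls.

    Assumption for this sketch: a valid tool call is a JSON object that
    contains every key listed in `required_keys`.
    """
    if not raw_outputs:
        return 0.0
    valid = 0
    for raw in raw_outputs:
        try:
            call = json.loads(raw)
        except json.JSONDecodeError:
            # Output was not parseable JSON, so it counts as a failed tool call.
            continue
        if isinstance(call, dict) and all(key in call for key in required_keys):
            valid += 1
    return valid / len(raw_outputs)
```

Running the same prompt set through two harnesses or two models and comparing this ratio is one simple form of the task-specific evaluation mentioned above.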

Memory in the Age of Agents

A large consortium of universities from across America, Europe and Asia has published a comprehensive survey on the state of "Memory" in agentic systems - key highlights:

The paper's core is a structured taxonomy that separates memory by form, function and dynamics: what carries memory, why agents need it, and how it is formed. It draws an interesting distinction between 1) token-level memory, 2) parametric memory and 3) what the authors call "latent memory", and it maps memory functions into factual, experiential and working memory. It also clarifies how agent memory differs from adjacent areas such as RAG, context engineering and long-context model design; although there are some intersections, these are completely different beasts altogether. For production ML practitioners, the paper is useful because it helps standardise concrete engineering choices around memory, including persistence, retrieval quality, auditability, privacy, latency and evaluation. It is becoming clear that memory should be treated as a core system component in long-running agents, with components that support short-term and long-term memory as well as independent and shared memory.
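As a rough illustration of how the form and function axes could show up in code, here is a minimal sketch of a memory record tagged along both, plus flags for persistence and sharing. The class and field names are my own assumptions for illustration and do not come from the paper; a real system would add retrieval, privacy controls and evaluation on top.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryForm(Enum):
    """What carries the memory (the three forms named in the survey)."""
    TOKEN_LEVEL = "token-level"
    PARAMETRIC = "parametric"
    LATENT = "latent"


class MemoryFunction(Enum):
    """Why the agent needs the memory."""
    FACTUAL = "factual"
    EXPERIENTIAL = "experiential"
    WORKING = "working"


@dataclass(frozen=True)
class MemoryRecord:
    content: str
    form: MemoryForm
    function: MemoryFunction
    persistent: bool = False  # long-term (True) vs short-term (False)
    shared: bool = False      # shared across agents (True) vs independent (False)


class MemoryStore:
    """Hypothetical in-process store; a stand-in for a real memory backend."""

    def __init__(self):
        self._records = []

    def write(self, record):
        self._records.append(record)

    def query(self, form=None, function=None):
        # Filter on whichever taxonomy axes the caller specifies.
        return [
            r for r in self._records
            if (form is None or r.form is form)
            and (function is None or r.function is function)
        ]

    def end_session(self):
        # Short-term records are dropped; persistent ones survive the session.
        self._records = [r for r in self._records if r.persistent]
```

Making the axes explicit like this is mostly useful for auditability: every stored item states what kind of memory it is and how long it is meant to live.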

PydanticAI V2 Now Released

Pydantic AI v2 is out! Pretty cool to see this agent harness maturing in its capabilities - here are some of the highlights:

They added composable units that package instructions, tools, lifecycle hooks, and model settings, which can now be used as Lego blocks. They also drew on learnings from production agent systems, where the operational complexity sits outside the basic model-tool loop, and added better context control, tool loading, steering, guardrails, code execution, and instrumentation. One of the areas I am particularly excited about is the new defer_loading=True, as it lets agents expose a compact catalog and load a workflow only when needed. This is particularly game-changing for projects like the K8s Agent OS project that we maintain, as each agent may have a long list of MCP Tools and Sub-Agents, and enabling discovery on demand can help optimize context and tokens. We have already updated PydanticAI to v2 in the K8s Agent OS (KAOS) and are keen to start exploring some of the new features.
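The general idea behind deferred loading can be sketched in plain Python. To be clear, this is not the Pydantic AI API: the ToolEntry and ToolCatalog classes are hypothetical, and the defer_loading field only borrows the flag's name to show the pattern of exposing a compact catalog up front and loading the full tool on first use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolEntry:
    name: str
    summary: str                                 # short text shown in the compact catalog
    loader: Callable[[], Callable[..., object]]  # builds the real tool when called
    defer_loading: bool = True                   # hypothetical flag, named after the one above


class ToolCatalog:
    """Illustrative registry; not part of any real library."""

    def __init__(self):
        self._entries = {}
        self._loaded = {}

    def register(self, entry):
        self._entries[entry.name] = entry
        if not entry.defer_loading:
            # Eager tools are built immediately at registration time.
            self._loaded[entry.name] = entry.loader()

    def compact_view(self):
        # Only names and one-line summaries would be placed in the model context.
        return [f"{e.name}: {e.summary}" for e in self._entries.values()]

    def get(self, name):
        # Deferred tools are built on first use and cached afterwards.
        if name not in self._loaded:
            self._loaded[name] = self._entries[name].loader()
        return self._loaded[name]
```

With dozens of MCP tools or sub-agents per agent, the context only carries the output of compact_view() until a specific tool is actually requested.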

O'Reilly Radar Trends to Watch in 2026

O'Reilly published its 2026 Radar Trends this month, and it offers quite a useful cross-section of where agentic systems are heading - here are a few highlights:

The most relevant theme for production ML practitioners is the emergence of infrastructure that allows agents to provision accounts, register domains, initiate payments, obtain credentials, and deploy applications with limited human intervention. This changes the boundary of MLOps, as teams now need to reason not only about model quality and serving latency, but also about authorization, spending controls, audit trails, sandboxing, and dependency risk. There are also key insights on what is becoming a more fragmented model landscape, with general-purpose frontier models becoming more relevant in specialised contexts. The security section is equally relevant, as it covers AI-assisted vulnerability discovery, supply-chain attacks, and credential leakage from coding agents. Definitely worth checking out!
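To show what authorization, spending controls and audit trails might look like at the smallest possible scale, here is a minimal sketch of a guard that an agent runtime could consult before each side-effecting action. Everything here is an assumption for illustration: the ActionGuard class, the action names and the flat budget model are hypothetical and are not taken from the O'Reilly report.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class AuditEvent:
    agent: str
    action: str
    cost: float
    allowed: bool
    reason: str


@dataclass
class ActionGuard:
    allowed_actions: Set[str]  # authorization: explicit allow-list of actions
    budget: float              # spending control: total spend permitted
    spent: float = 0.0
    audit_log: List[AuditEvent] = field(default_factory=list)

    def authorize(self, agent, action, cost=0.0):
        """Decide whether the action may run, and record the decision either way."""
        if action not in self.allowed_actions:
            allowed, reason = False, "action not permitted"
        elif self.spent + cost > self.budget:
            allowed, reason = False, "budget exceeded"
        else:
            allowed, reason = True, "ok"
            self.spent += cost
        # Denied attempts are logged too, so the trail shows what was tried.
        self.audit_log.append(AuditEvent(agent, action, cost, allowed, reason))
        return allowed
```

In practice this kind of check would sit outside the agent process, so that a compromised or confused agent cannot rewrite its own limits.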

Upcoming MLOps Events

The MLOps ecosystem continues to grow at breakneck speed, making it ever harder for us as practitioners to stay up to date with relevant developments. A fantastic way to keep on top of relevant resources is through the great community and events that the MLOps and Production ML ecosystem offers. This is why we have started curating a list of upcoming events in the space, which are outlined below.

Events we are speaking at this year:

Signals Conference - September @ Berlin
World Summit AI Europe - September @ Amsterdam

Other relevant events:

KubeCon Europe - March @ Amsterdam
PyData Berlin - April @ Frankfurt
Databricks Summit - June @ San Francisco
World Developer Congress - July @ Berlin
EuroPython 2026 - July @ Prague
EuroSciPy 2026 - July @ Krakow
AI Infra Summit 2026 - Sept @ California
Code.Talks 2026 - Nov @ Hamburg
MLOps World 2026 - Nov @ Austin

In case you missed our talks, check our recordings below:

The State of AI in 2025 - WeAreDevelopers 2025
Prod Generative AI in 2024 - KubeCon AI Day 2025
The State of AI in 2024 - WeAreDevelopers 2024
Responsible AI Workshop Keynote - NeurIPS 2021
Practical Guide to ML Explainability - PyCon London
ML Monitoring: Outliers, Drift, XAI - PyCon Keynote
Metadata for E2E MLOps - KubeCon NA 2022
ML Performance Evaluation at Scale - KubeCon Eur 2021
Industry Strength LLMs - PyData Global 2022
ML Security Workshop Keynote - NeurIPS 2022

Open Source MLOps Tools

Check out the fast-growing ecosystem of production ML tools & frameworks at the GitHub repository, which has reached over 20,000 ⭐ GitHub stars. We are currently looking for more libraries to add - if you know of any that are not listed, please let us know or feel free to open a PR. Here are a few featured open source libraries that we maintain:

SARC - Provides wrappers for popular agentic frameworks to enable guardrails and constraints that are enforced through the flow.
KAOS - K8s Agent Orchestration Service for managing the KAOS in large-scale distributed agentic systems.
Kompute - Blazing fast, lightweight and mobile phone-enabled GPU compute framework optimized for advanced data processing use cases.
Production ML Tools - A curated list of tools to deploy, monitor and optimize machine learning systems at scale.
AI Policy List - A mature list that maps the ecosystem of artificial intelligence guidelines, principles, codes of ethics, standards, regulation and beyond.
Agentic Systems Tools - A new list that aims to map the emerging ecosystem of agentic systems with tools and frameworks for scaling this domain.

Please do support some of our open source projects by sharing, contributing or adding a star ⭐

About us

The Institute for Ethical AI & Machine Learning is a European research centre that carries out world-class research into responsible machine learning.

Check out our website

✉️ Email, 🐦 Twitter, 💼 Linkedin

You received this email because you are registered with The Institute for Ethical AI & Machine Learning's newsletter "The Machine Learning Engineer".

Unsubscribe here