DuckDB Agent Data Extension

Excited to announce a new project release! Our DuckDB extension for querying agent data is now officially part of the DuckDB Community Extensions 🚀🚀🚀 This means you can load it directly in your DuckDB session with:
INSTALL agent_data FROM community;
LOAD agent_data;
This is something I've been looking forward to for a while, as there is so much you can do with local agent data from Copilot, Claude, Codex, etc. Now you can easily ask questions such as:
-- How much have I used Claude Code recently?
SELECT date, message_count, tool_call_count
FROM read_stats()
ORDER BY date DESC
LIMIT 10;

-- Which tools does GitHub Copilot use most?
SELECT tool_name, COUNT(*) AS uses
FROM read_conversations('~/.copilot')
GROUP BY tool_name
ORDER BY uses DESC;
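Under the hood, agent sessions are typically stored as local JSONL logs. As a rough illustration of the kind of aggregation these table functions perform — note the schema below is a simplified, hypothetical one, not the exact on-disk layout of any provider:

```python
import json
from collections import Counter

# Hypothetical, simplified session log: one JSON object per line.
sample_jsonl = """\
{"role": "assistant", "tool": "read_file"}
{"role": "assistant", "tool": "bash"}
{"role": "user", "tool": null}
{"role": "assistant", "tool": "read_file"}
"""

# Count tool calls per tool name, much like the GROUP BY query above.
tool_counts = Counter(
    record["tool"]
    for line in sample_jsonl.splitlines()
    if (record := json.loads(line))["tool"] is not None
)
print(tool_counts.most_common())  # -> [('read_file', 2), ('bash', 1)]
```

The extension does this parsing and aggregation natively inside DuckDB, so you get the same result with plain SQL and no Python glue code.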
This also makes it quite simple to build interfaces for navigating agent sessions across multiple providers. For this, the repo comes with a simple Marimo example, as well as a Streamlit example, that let you play around with your local data.
Best of all, you can do this from the comfort of the proven and tested DuckDB engine without any dependencies. Besides extending to other providers (Gemini, Codex, etc.), there are also interesting avenues to explore, such as streaming and other features.
Check it out - do share feedback and thoughts!
State of Data Eng Report

This "State of Data Engineering" is one of the best reports I have seen, and the interactive charts are some of the best UX I've come across. Key insights:
* cloud data warehouses remain the default (~44%)
* lakehouse adoption continues to grow (~27%)
* architecture choices vary by org size
* individual AI tool usage is now pervasive (82% daily+)
* organizational AI maturity still lagging
Some of the top challenges highlighted:
* the biggest blockers are organizational
* data modeling stands out as a widespread pain point
* unclear ownership
* long-term maintainability issues
* the burden of firefighting
Check out the overview as well as the interactive charts - huge kudos for crafting such a great interactive experience, and for using DuckDB-WASM!
New Tabular Foundation Model

Inria has launched a new tabular foundation model! This space is one of the most exciting areas of "boring ML", as it could be transformational for key areas like risk, fraud, ops, pricing, forecasting and more. This is a hard problem, as tabular datasets are highly heterogeneous; the new foundation model TabICLv2 makes training-free in-context learning practical for real tabular workloads, and one of the things that is still hard to believe is that it's trained mostly on synthetic data (which seems to be the case for most of these models).

On the architecture side, it includes a scalable softmax attention temperature scheme to avoid attention degradation as the number of rows grows, so it can generalize to much larger tables without having to pretrain on prohibitively long sequences. It also features an improved pretraining protocol, and benchmarks across TabArena and TALENT show it surpassing various other models out of the box.

It is quite interesting to see how fast these models are evolving, with competing models and architectures appearing every couple of months - this is certainly an exciting field to keep an eye on!
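"Training-free in-context learning" means the model predicts labels for new rows directly from labelled example rows supplied at inference time, with no gradient updates. As a rough analogy for that interface (this is a toy nearest-neighbour sketch, not TabICLv2's actual transformer architecture, and all data below is made up):

```python
import math

def predict_in_context(context_rows, context_labels, query_row, k=3):
    """Predict a label for query_row using only the labelled rows
    supplied 'in context' - no training step, mirroring the
    training-free interface of tabular in-context learners."""
    # Rank context rows by Euclidean distance to the query row.
    dists = sorted(
        (math.dist(row, query_row), label)
        for row, label in zip(context_rows, context_labels)
    )
    # Majority vote among the k nearest neighbours.
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)

# Tiny illustrative table: (age, income) -> churn label.
rows = [(25, 30), (27, 32), (60, 90), (62, 95), (24, 28)]
labels = ["churn", "churn", "stay", "stay", "churn"]
print(predict_in_context(rows, labels, (26, 31)))  # -> churn
```

The appeal of foundation models like TabICLv2 is keeping exactly this fit-free calling convention while replacing the naive distance heuristic with a pretrained transformer that has learned rich priors over heterogeneous tables.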
Inference speed is becoming the new moat - the recent Opus 4.6 fast vs Codex fast comparison is a good example. What is most interesting is the different approaches that orgs are taking to get there:
1) Anthropic's fast mode seems to keep the same Opus 4.6 model but run it with much smaller batch sizes: you pay a big premium to skip queueing and batching optimizations, improving per-user latency while reducing overall hardware efficiency.
2) OpenAI's fast mode instead achieves an order-of-magnitude tokens/s jump by serving a different model (Codex-Spark) on Cerebras wafer-scale chips, whose large on-chip SRAM can keep more of the model in fast memory and avoid weight-streaming bottlenecks.
The takeaway seems to be that "fast" can mean either premium low-batch serving of the same model (speed via a scheduling/efficiency trade) or specialized hardware enabling a smaller model at extreme speed (speed via an architecture/model swap). The business question is whether higher tokens/s actually helps end-to-end developer productivity when error rates and rework dominate, or whether we'll be able to have our cake and eat it too and get both. I feel like we are seeing the ML equivalent of the CAP theorem in the making!
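To make the batching trade-off concrete, here is a toy back-of-the-envelope model (all numbers are illustrative, not vendor figures): batching amortizes the cost of reading model weights across requests, so shrinking the batch raises per-request speed but slashes aggregate throughput per device.

```python
def serving_stats(batch_size, weight_read_ms=50.0, per_request_ms=0.5):
    """Toy cost model for one decode step, which emits one token per
    request in the batch.  Assumes (hypothetically) a fixed weight-read
    cost per step plus a small per-request compute cost.
    Returns (per-request tokens/s, aggregate tokens/s per device)."""
    step_ms = weight_read_ms + per_request_ms * batch_size
    per_request_tps = 1000.0 / step_ms
    aggregate_tps = per_request_tps * batch_size
    return per_request_tps, aggregate_tps

for b in (1, 8, 64):
    per, agg = serving_stats(b)
    print(f"batch={b:3d}  per-request={per:6.1f} tok/s  aggregate={agg:7.1f} tok/s")
```

Even in this crude sketch, batch size 1 roughly maximizes per-request speed while leaving most of the hardware idle - which is why "same model, small batch" commands a premium price, and why a hardware swap that removes the weight-read bottleneck entirely is such a different strategy.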
Gemini Deep Think 3

Google has released Deep Think 3, which basically takes Deep Research to its absolute limit, with some really exciting results. Google claims some huge jumps on hard reasoning benchmarks, including 48.4% (no tools) on Humanity's Last Exam, 84.6% on ARC-AGI-2 (vs ~65% for Opus 4.6!), a Codeforces Elo of 3455, and gold-medal-level performance on IMO 2025.

From a production ML practitioner's lens, it sounds like we can treat this as "reasoning-as-a-component" that could improve complex analysis, code generation for simulation/modeling, and review/verification workflows. It is really surprising to see how fast Google is pulling ahead of all other competitors, especially as this is integrated into their entire cloud and workspace environments - it will be interesting to see how the rest reply.
Upcoming MLOps Events

The MLOps ecosystem continues to grow at break-neck speed, making it ever harder for us as practitioners to stay up to date with relevant developments. A fantastic way to keep on top of relevant resources is through the great community and events that the MLOps and Production ML ecosystem offers. This is the reason why we have started curating a list of upcoming events in the space, which are outlined below.
Events we are speaking at this year:
Other relevant events:
In case you missed our talks, check our recordings below:
Check out the fast-growing ecosystem of production ML tools & frameworks at the GitHub repository, which has reached over 20,000 GitHub stars ⭐. We are currently looking for more libraries to add - if you know of any that are not listed, please let us know or feel free to add a PR. Here are a few featured open source libraries that we maintain:

- KAOS - K8s Agent Orchestration Service for managing the KAOS in large-scale distributed agentic systems.
- Kompute - Blazing fast, lightweight and mobile-enabled GPU compute framework optimized for advanced data processing use cases.
- Production ML Tools - A curated list of tools to deploy, monitor and optimize machine learning systems at scale.
- AI Policy List - A mature list that maps the ecosystem of artificial intelligence guidelines, principles, codes of ethics, standards, regulation and beyond.
- Agentic Systems Tools - A new list that aims to map the emerging ecosystem of agentic systems, with tools and frameworks for scaling this domain.
Please do support some of our open source projects by sharing, contributing, or adding a star ⭐
The Institute for Ethical AI & Machine Learning is a European research centre that carries out world-class research into responsible machine learning.
You received this email because you are registered with The Institute for Ethical AI & Machine Learning's newsletter "The Machine Learning Engineer".
© 2023 The Institute for Ethical AI & Machine Learning