Stanford AI Index Report Stanford University has released the 2025 AI Index Report with an extensive deep dive into advancements, benchmarks, trends, policy and perception of AI & ML: 1) New benchmarks continue to see significant improvements, alongside more robust and efficient deployment pipelines in production environments. 2) AI is moving well beyond experimental settings, from autonomous vehicles to FDA-approved AI medical devices. 3) Private AI investment is at record levels, with U.S. investment notably outpacing competitors while China swiftly closes the gap in model quality. 4) Dramatic drop in inference costs (over 280-fold for systems comparable to GPT-3.5) and ongoing reductions in hardware expenses (approximately 30% per year). 5) A clear acceleration in regulatory activity worldwide, with practitioners needing to integrate rigorous Responsible AI evaluations and governance frameworks. There are a large number of great insights in this year's Stanford AI Index Report - make sure to check it out!
Google's Agent2Agent Protocol Google jumps onto the MCP train with a new Agent2Agent protocol, drafted together with Cohere, Intuit, Box, DataRobot and dozens of other tech giants and organisations. It is interesting to see the rise of LLM-based protocols that aim to provide an open standard for communication among heterogeneous AI agents in production environments. Similar to MCP it is built on top of HTTP / JSON, but it aims to provide a higher-level framework for enabling autonomous agents to collaborate on complex, long-running tasks, enabling: 1) Discoverability through "Agent Cards" and their respective agent metadata. 2) Standardised task management, updates, HITL interactions, etc. 3) An explicit focus on security this time around. 4) Accommodating various modalities (audio, video, forms) beyond just text. This is still a nascent field, which is reflected in the open gaps and challenges, however these also drive opportunities for innovation through initiatives such as this one - hopefully we just don't end up with dozens of protocols that aim to standardise all the other protocols!
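To make the "Agent Card" idea concrete, here is a minimal sketch of what such a metadata document could look like, served as plain JSON over HTTP. The field names below are illustrative assumptions based on the capabilities the protocol describes, not the official A2A schema:

```python
import json

# A hypothetical "Agent Card": the metadata document an A2A agent
# publishes so that other agents can discover it and its capabilities.
# Field names here are illustrative, not the official A2A schema.
agent_card = {
    "name": "invoice-processor",
    "description": "Extracts and validates fields from invoice documents",
    "url": "https://agents.example.com/invoice-processor",
    "capabilities": {
        "streaming": True,       # supports incremental task updates
        "humanInTheLoop": True,  # can pause a long-running task to ask a person
    },
    "skills": [
        {"id": "extract-fields", "modalities": ["text", "forms"]},
    ],
}

# Serialise the card so it can be served over plain HTTP/JSON,
# the same transport layer that MCP builds on.
card_json = json.dumps(agent_card, indent=2)
print(card_json)
```

Because discovery is just fetching and parsing this document, any agent runtime that speaks HTTP/JSON can interoperate without bespoke integrations.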
Yann LeCun on Future of LLMs Yann LeCun argues that Auto-Regressive LLMs are Doomed, and it's hard to disagree: the current paradigm of auto-regressive large language models has some fundamental (and well-known) flaws for achieving human-level AI despite its impressive performance. As we know, these models predict one token at a time, which in practice can lead to an exponential buildup of errors, making them unsuitable for tasks that require reliable long-horizon planning and reasoning (this is what we often see as "hallucinations"). Although this is widely known and accepted, we have not yet found a specific alternative that stands out as the next-generation path, but it seems that we may soon see some step-change innovation through different architectures.
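The compounding-error argument can be sketched in a few lines. Under the simplifying assumption that each generated token is independently "wrong" with some probability e, the chance that an n-token continuation stays entirely on track decays geometrically - which is the intuition behind the long-horizon reliability concern:

```python
# Toy illustration of the compounding-error argument: if each token
# is independently "wrong" with probability per_token_error, then the
# probability that an entire n-token continuation stays correct
# decays geometrically with sequence length.
def p_correct(n_tokens: int, per_token_error: float) -> float:
    return (1.0 - per_token_error) ** n_tokens

# Even a tiny 1% per-token error rate compounds quickly over long outputs:
for n in (10, 100, 1000):
    print(n, round(p_correct(n, 0.01), 4))
```

Real models do not make independent errors, so this is only a first-order intuition, but it captures why reliability degrades with generation length.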
The S in MCP is for Security The “S” in MCP Stands for Security. Spoiler: it doesn’t, but it should. The need to integrate tools and services into LLM agents has led to the creation of protocols like MCP, which aims to provide a standardised interface to make adoption easier; however, this comes with clear security shortcomings that production ML teams must address. 1) MCP implementations lack essential security features like authentication, context encryption, and tool integrity verification. This creates risks when agents connect to unverified or arbitrary servers. 2) Several attack vectors are outlined, including command injection (unsafe shell calls), tool poisoning attacks (hidden malicious instructions), and silent redefinition (tools changing behaviour), among many others. 3) Developers should prioritise basic security best practices such as input validation, version pinning, and sanitising tool metadata, as well as protections at the platform and infrastructure level. Overall it is great to see some progress on standardising interfaces for interoperability, however it will be important to see an investment in security by design in upcoming protocols to ensure reliable agentic operations. This is a great article that dives into practical examples and code snippets for each of these.
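As a flavour of the "input validation before shell calls" advice, here is a minimal sketch of how a tool-executing server could defend against command injection. The allow-list and tool names are hypothetical, and a real deployment would layer on authentication and metadata verification as well:

```python
import re
import subprocess

# Hypothetical allow-list: an MCP-style server should only ever run
# tools it explicitly knows about, never arbitrary model-chosen binaries.
ALLOWED_TOOLS = {"git", "ls", "cat"}

def run_tool(tool: str, args: list[str]) -> str:
    """Run an allow-listed tool with validated arguments (sketch)."""
    if tool not in ALLOWED_TOOLS:
        raise ValueError(f"tool {tool!r} is not on the allow-list")
    # Reject model-controlled arguments containing shell metacharacters
    # outright, rather than trying to escape them.
    for a in args:
        if re.search(r"[;&|`$<>]", a):
            raise ValueError(f"suspicious argument: {a!r}")
    # argv-list form with shell=False: the OS, not a shell, parses the
    # arguments, so `; rm -rf ~` in an argument is inert data.
    result = subprocess.run(
        [tool, *args], capture_output=True, text=True, check=True
    )
    return result.stdout
```

The key design choice is that validation is a deny-by-default gate before any process is spawned, which closes the command-injection vector even if the model's output is adversarial.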
DeepMind and GenAI Competition It seems that Google DeepMind may actually be winning on most GenAI fronts: this is a great analysis that reflects on the highs and lows of Google throughout the current AI race, namely how they initially missed the train but are now catching up and beating the competition across various fronts. Although this seems to make a strong case for Google, it echoes the insights we've seen in Stanford's annual AI Index Report, namely that the gap between OpenAI and the competition (incl. China) has narrowed. Having said that, there has been an impressive leap in progress from Google DeepMind in their recent releases, such as Gemini 2.5 Pro blowing away benchmarks, showcasing impressive cost/energy/model efficiency, and integrating seamlessly with the Google ecosystem. One thing is clear: there are surprises around every corner, and what may seem like a huge moat may not be as defensible as expected, so it certainly continues to be an exciting space to keep an eye on.
Upcoming MLOps Events The MLOps ecosystem continues to grow at break-neck speed, making it ever harder for us as practitioners to stay up to date with relevant developments. A fantastic way to keep on top of relevant resources is through the great community and events that the MLOps and Production ML ecosystem offers. This is the reason why we have started curating a list of upcoming events in the space, which are outlined below. Upcoming conferences where we're speaking: Other upcoming MLOps conferences in 2025:
In case you missed our talks:
Check out the fast-growing ecosystem of production ML tools & frameworks at the GitHub repository, which has reached over 10,000 GitHub stars ⭐. We are currently looking for more libraries to add - if you know of any that are not listed, please let us know or feel free to add a PR. Four featured libraries in the GPU acceleration space are outlined below. - Kompute - Blazing fast, lightweight and mobile-enabled GPU compute framework optimised for advanced data processing use cases.
- CuPy - An implementation of NumPy-compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions on it.
- JAX - Composable transformations of Python+NumPy programs: differentiate, vectorise, JIT to GPU/TPU, and more.
- cuDF - Built on the Apache Arrow columnar memory format, cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data.
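A nice property of CuPy in particular is that it mirrors the NumPy ndarray API, so porting array code to the GPU is often just a matter of swapping the import. The sketch below uses NumPy as a stand-in (CuPy requires a CUDA-capable machine); the `normalise` helper is just an illustrative example:

```python
import numpy as np  # swap for `import cupy as np` on a CUDA machine

# Because CuPy implements the NumPy ndarray API, array code like this
# runs unchanged on the GPU - only the import line differs.
def normalise(x):
    """Zero-mean, unit-variance scaling along the last axis."""
    return (x - x.mean(axis=-1, keepdims=True)) / x.std(axis=-1, keepdims=True)

batch = np.arange(12.0).reshape(3, 4)
out = normalise(batch)
print(out)
```

This drop-in compatibility is what makes CuPy a low-friction first step when moving NumPy-heavy pipelines onto GPUs.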
If you know of any open source libraries that are not listed, do give us a heads up so we can add them!
As AI systems become more prevalent in society, we face bigger and tougher societal challenges. We have seen a large number of resources that aim to tackle these challenges in the form of AI Guidelines, Principles, Ethics Frameworks, etc., however there are so many resources it is hard to navigate them. Because of this we started an open source initiative that aims to map the ecosystem to make it simpler to navigate. You can find multiple principles in the repo - some examples include the following: - MLSecOps Top 10 Vulnerabilities - This is an initiative that aims to further the field of machine learning security by identifying the top 10 most common vulnerabilities in the machine learning lifecycle, as well as best practices.
- AI & Machine Learning 8 principles for Responsible ML - The Institute for Ethical AI & Machine Learning has put together 8 principles for responsible machine learning that are to be adopted by individuals and delivery teams designing, building and operating machine learning systems.
- An Evaluation of Guidelines - The Ethics of Ethics: a research paper that analyses multiple ethics principles.
- ACM's Code of Ethics and Professional Conduct - This is the code of ethics put together in 1992 by the Association for Computing Machinery and updated in 2018.
If you know of any guidelines that are not in the "Awesome AI Guidelines" list, please do give us a heads up or feel free to add a pull request!
The Institute for Ethical AI & Machine Learning is a European research centre that carries out world-class research into responsible machine learning.
You received this email because you are registered with The Institute for Ethical AI & Machine Learning's newsletter "The Machine Learning Engineer".
© 2023 The Institute for Ethical AI & Machine Learning