Atlassian 2025 State of DevEx
Atlassian's 2025 DevEx report suggests that GenAI is delivering: 68% of developers save 10+ hours/week, mostly on non-coding work (search, testing, docs, and automation), which sounds optimistic, particularly given the many opposing views we've seen in recent reports. One thing is for certain: most of the inefficiencies (and opportunities) still lie in traditional drivers, such as hours lost to organisational overhead. Only ~16% of time is spent coding (!); teams that enable self-service knowledge sharing are 4.9× more effective and 4.4× more productive/adaptable. Measurement of outcomes seems to be moving towards Microsoft's SPACE framework. Internal developer platforms are now mainstream (74% use, 24% plan) and seem to help organisations across reliability, faster delivery, and lower ops costs.
|
|
---|
|
GPT-5 Attributes, Pricing + Card
Simon Willison puts together another fantastic overview, this time of GPT-5's characteristics, model card and pricing nuances: GPT-5 seems like a relative upgrade rather than an order-of-magnitude improvement (more of an improved smart model-router). It provides 272k input context and 128k output (including hidden reasoning), supporting text+image input. It is also interesting to see the anger from the community when it was announced that older models would be removed, which seems to have triggered backtracking to potentially keep some of them. Pricing seems to be the most aggressive update, at $1.25/$10 (input/output per M tokens) for GPT-5, $0.25/$2 for mini, and $0.05/$0.40 for nano. Some interesting quality-of-life features: reasoning traces are now retrievable to balance transparency vs. latency, and the system card claims fewer hallucinations and less sycophancy - let's see how this compares to the competition.
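As a rough back-of-the-envelope aid (not an official SDK snippet), the sketch below simply turns the per-million-token prices quoted above into a cost per request; the model identifiers are placeholders, and output tokens are assumed to include the hidden reasoning tokens.

```python
# Hypothetical cost sketch using the per-million-token prices quoted above
# (GPT-5: $1.25 in / $10 out; mini: $0.25 / $2; nano: $0.05 / $0.40).
PRICES_PER_M = {                 # USD per 1M tokens: (input, output)
    "gpt-5": (1.25, 10.00),
    "gpt-5-mini": (0.25, 2.00),
    "gpt-5-nano": (0.05, 0.40),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Rough cost of one request; output tokens assumed to include hidden reasoning."""
    in_price, out_price = PRICES_PER_M[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a 50k-token prompt with a 5k-token (visible + reasoning) response.
print(f"${request_cost('gpt-5', 50_000, 5_000):.4f}")   # ≈ $0.1125
```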
|
|
---|
|
Qwen Image Technical Report
Qwen-Image is now one of the strongest production-ready image-generation models: it takes an interesting approach, training with a flow-matching objective and a progressive "curriculum" (from non-text to paragraph-level prompts), which enables edits that preserve both meaning and visual fidelity. The report shows relatively strong results for general creation/editing, as well as pretty impressive text rendering (something these models normally struggle with).
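For intuition on the training objective, here is a minimal linear-path flow-matching loss as it is commonly formulated in the literature; this is a generic sketch, not Qwen-Image's actual training code, and the tiny backbone below is purely a stand-in for the real image model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVelocityNet(nn.Module):
    """Toy stand-in for a real image backbone: predicts a velocity field with the
    same shape as its input, conditioned on the timestep t."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 256), nn.SiLU(), nn.Linear(256, dim))

    def forward(self, xt, t):
        flat = xt.flatten(1)                                  # (B, dim)
        out = self.net(torch.cat([flat, t[:, None]], dim=1))  # append t as an extra feature
        return out.view_as(xt)

def flow_matching_loss(model, x1):
    """Linear-path ("rectified flow" style) flow-matching objective."""
    x0 = torch.randn_like(x1)                       # noise endpoint of the path
    t = torch.rand(x1.shape[0], device=x1.device)   # one timestep per example in [0, 1)
    t_ = t.view(-1, *([1] * (x1.dim() - 1)))        # broadcast t over channel/spatial dims
    xt = (1 - t_) * x0 + t_ * x1                    # point on the straight line from noise to data
    target_v = x1 - x0                              # the path's (constant) velocity
    return F.mse_loss(model(xt, t), target_v)

# Example: one step on a batch of 8 fake 3x32x32 "images".
model = TinyVelocityNet(3 * 32 * 32)
loss = flow_matching_loss(model, torch.randn(8, 3, 32, 32))
loss.backward()
```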
|
---|
|
Google DeepMind World Model
Text-to-3D-world models are now a thing; DeepMind releases Genie 3, providing interactive, real-time 3D world generation from text: the new model generates interactive environments at 720p and 24 fps with minute-scale visual memory. This is an interesting evolution of previous models; we recently saw a similar effort re-enacting an operating system as a 3D world. Unlike NeRFs/Gaussian Splatting it requires no explicit 3D assets, autoregressively conditioning each frame on the growing action trajectory. There are still some limitations: episodes can only last a few minutes, the direct action space is limited, multi-agent interaction fidelity is weak, geographic accuracy isn't guaranteed, and text rendering is brittle unless specified. However, this is quite an exciting space that I initially assumed was more of a fun/interesting set of prototypes, but it seems to actually be evolving towards potentially usable resources.
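To make the "autoregressively conditioning each frame on the growing action trajectory" idea concrete, here is an illustrative rollout loop; Genie 3 has no public API, so the `world_model` and `policy` callables below are hypothetical stand-ins.

```python
# Illustrative-only sketch of the autoregressive rollout loop described above:
# each new frame is conditioned on the full history of frames and actions, which
# is what gives the model its minute-scale visual memory.

def rollout(world_model, policy, first_frame, horizon=24 * 60):  # ~1 minute at 24 fps
    frames, actions = [first_frame], []
    for _ in range(horizon):
        actions.append(policy(frames[-1]))          # user/agent picks the next action
        # Condition on the growing (frames, actions) trajectory, not just the last frame.
        frames.append(world_model(frames, actions))
    return frames

# Toy stand-ins so the loop actually runs end to end.
dummy_model = lambda frames, actions: f"frame_{len(frames)}"
dummy_policy = lambda frame: "move_forward"
print(rollout(dummy_model, dummy_policy, "frame_0", horizon=3))
# ['frame_0', 'frame_1', 'frame_2', 'frame_3']
```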
|
|
---|
|
Raschka on Qwen 3 From Scratch
MoE LLMs are how we scale capability without scaling cost. A minimal, Apache-2.0 PyTorch notebook re-implements the Qwen3-30B-A3B MoE model (Coder/Instruct/Thinking) with Llama-3-style components (GQA with 32 heads and 4 KV groups, RoPE with θ=1e7 for 262k context, RMSNorm, bf16) and an MoE MLP (128 experts, top-8 routing); config: 48 layers, 2048 dim, head_dim 128, vocab 151,936, with weight tying. It loads the official HF safetensors shards and tokenizer, moving experts off "meta" to CPU to cut VRAM; despite a ~114 GB bf16 footprint, it runs on a single 80 GB A100/H100 via CPU offload. The reference favors clarity over speed (the naive "compute all experts" beats sparse dispatch here) and provides greedy, streaming generation without a KV cache (a related KV-cache notebook is ~3× faster). Treat it as a didactic baseline: for production, add KV/paged attention, quantization, fused kernels/Flash-Attn, and a deliberate offload/serving plan to hit real-time throughput and long-context stability.
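For readers who want the gist of the MoE block without opening the notebook, below is a minimal sketch of a top-k routed MoE MLP in the same naive "compute all experts, then weight" spirit the reference favours. Only num_experts=128, top_k=8 and dim=2048 come from the summary above; the expert hidden size and the plain SiLU MLP shape are illustrative rather than the notebook's exact layers.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NaiveMoE(nn.Module):
    """Top-k routed mixture-of-experts MLP, computed the naive way: run every
    expert on every token, then weight by the router's sparse softmax scores."""
    def __init__(self, dim=2048, hidden=768, num_experts=128, top_k=8):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                                        # x: (batch, seq, dim)
        logits = self.router(x)                                  # (B, T, num_experts)
        top_vals, top_idx = logits.topk(self.top_k, dim=-1)      # keep the k best experts per token
        weights = torch.zeros_like(logits)
        weights.scatter_(-1, top_idx, F.softmax(top_vals, dim=-1))        # sparse routing weights
        expert_out = torch.stack([e(x) for e in self.experts], dim=-1)    # (B, T, dim, num_experts)
        return (expert_out * weights.unsqueeze(-2)).sum(dim=-1)           # (B, T, dim)

# Tiny configuration so the example runs quickly on CPU:
moe = NaiveMoE(dim=64, hidden=32, num_experts=8, top_k=2)
print(moe(torch.randn(1, 4, 64)).shape)   # torch.Size([1, 4, 64])
```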
|
|
---|
|
Upcoming MLOps Events
The MLOps ecosystem continues to grow at break-neck speed, making it ever harder for us as practitioners to stay up to date with relevant developments. A fantastic way to keep on top of relevant resources is through the great community and events that the MLOps and Production ML ecosystem offers. This is the reason why we have started curating a list of upcoming events in the space, which are outlined below.
Upcoming conferences where we're speaking:
Other upcoming MLOps conferences in 2025:
In case you missed our talks:
|
|
---|
| |
Check out the fast-growing ecosystem of production ML tools & frameworks at the github repository which has reached over 10,000 ⭐ github stars. We are currently looking for more libraries to add - if you know of any that are not listed, please let us know or feel free to add a PR. Four featured libraries in the GPU acceleration space are outlined below (a short CuPy usage sketch follows the list).
- Kompute - Blazing fast, lightweight and mobile phone-enabled GPU compute framework optimized for advanced data processing use cases.
- CuPy - An implementation of NumPy-compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions on it.
- Jax - Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
- CuDF - Built based on the Apache Arrow columnar memory format, cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data.
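As a flavour of what "NumPy-compatible" means in practice, here is a minimal CuPy sketch; it assumes a CUDA-capable GPU and an installed cupy wheel (e.g. cupy-cuda12x), and the helper function is just an illustrative example, not part of either library.

```python
import numpy as np
import cupy as cp

def column_standardise(x):
    """Zero-mean / unit-variance per column; works for both numpy and cupy arrays."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

x_cpu = np.random.rand(100_000, 8)
x_gpu = cp.asarray(x_cpu)                 # host -> device copy
out = column_standardise(x_gpu)           # runs on the GPU via the NumPy-like API
print(cp.asnumpy(out).shape)              # device -> host copy, (100000, 8)
```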
If you know of any open source and open community events that are not listed do give us a heads up so we can add them! |
|
---|
| |
As AI systems become more prevalent in society, we face bigger and tougher societal challenges. We have seen a large number of resources that aim to tackle these challenges in the form of AI Guidelines, Principles, Ethics Frameworks, etc. However, there are so many resources that it is hard to navigate them all. Because of this we started an Open Source initiative that aims to map the ecosystem to make it simpler to navigate. You can find multiple principles in the repo - some examples include the following:
- MLSecOps Top 10 Vulnerabilities - This is an initiative that aims to further the field of machine learning security by identifying the top 10 most common vulnerabilities in the machine learning lifecycle, as well as best practices.
- AI & Machine Learning 8 principles for Responsible ML - The Institute for Ethical AI & Machine Learning has put together 8 principles for responsible machine learning that are to be adopted by individuals and delivery teams designing, building and operating machine learning systems.
- An Evaluation of Guidelines - The Ethics of Ethics: a research paper that analyses multiple ethics principles.
- ACM's Code of Ethics and Professional Conduct - This is the code of ethics put together in 1992 by the Association for Computing Machinery and updated in 2018.
If you know of any guidelines that are not in the "Awesome AI Guidelines" list, please do give us a heads up or feel free to add a pull request!
|
|
---|
| |
The Institute for Ethical AI & Machine Learning is a European research centre that carries out world-class research into responsible machine learning.
|
|
---|
|
|
You received this email because you are registered with The Institute for Ethical AI & Machine Learning's newsletter "The Machine Learning Engineer"
|
|
---|
|
© 2023 The Institute for Ethical AI & Machine Learning |
|
---|
|
|
|