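DiffusionGemma 26B Lands on the M2 Max: MLX Throughput Benchmarks and Pushing the Context Limit

Looking for a somewhat reckless way to give agents unlimited token freedom on local hardware, I first tried running DiffusionGemma 26B on the 24GB M4 Mac I had on hand. It could not sustain even a 1,000-token context and promptly ran out of memory (OOM).

After switching to an M2 Max with 96GB, could the model finally show what it is capable of? I moved to MLX (mlx-vlm 0.6.3). Along the way I hit an MXFP4 quantization bug and had to patch it by hand, but in the end I ran the full benchmark suite in 4-bit format.

This post records what I measured over the past few days with DiffusionGemma 26B on Apple Silicon: peak throughput, prompt loading cost, and the memory cost of longer contexts. These measurements will also serve as the baseline for a later cross-platform performance comparison with GH200 and GB10…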

Read more →
Portrait Generation Benchmark Q1 2026: Flux.2 vs SDXL vs Proprietary

Every quarter, we benchmark every major image generation model against real production workloads from our platform. Not synthetic tests, actual jobs from customers generating AI headshots at scale.

This quarter, we tested 8 models across 12,000 inference jobs, scoring each on quality (FID, CLIP, human eval), cost per image, and p95 latency. Here’s the full breakdown.

Why We…

Read more →
Model Showdown Round 7: Five Local Models vs. One Cloud Model on a Real Coding Task

Five local models. One frontier cloud model. The same coding task. Zero hand-holding.

Only two shipped code. One of them was the cloud model.

Part of my goal with this series is to continuously test the viability and maturity of local models. I've done it for basic agentic tasks. Today we're revisiting coding tasks.

What did we learn?

Local models are not ready — yet. At least not for…

Read more →
Vincent Bernat: Scaling Akvorado BMP RIB with sharding

To associate routing information, such as AS paths or BGP communities, with flows, Akvorado can import routes through the BGP Monitoring Protocol (BMP). As the Internet routing table contains more than 1 million routes, Akvorado needs to scale to tens of millions of routes. This has been a long-standing challenge, but I expect this issue is now fixed by using RIB sharding, a method that splits…

Read more →
ASUS Zenbook A16 – A $1,699 Qualcomm Snapdragon X2 Elite Extreme Copilot+ laptop

ASUS Zenbook A16 is one of the first Copilot+ PCs/laptops based on the Qualcomm Snapdragon X2 Elite Extreme 18-core Armv9 SoC and is now available for $1,699 on BestBuy or $1,999 on the ASUS website. The laptop features a 16-inch “3K” OLED with touchscreen, 48GB of RAM, a 1TB NVMe SSD, HDMI 2.1 video output, WiFi 7 and Bluetooth connectivity, and a few Thunderbolt and USB ports. ASUS Zenbook A16…

Read more →
Entry-level Intel Core 3 304 Wildcat Lake processor details and benchmarks surface

Intel discreetly introduced the Wildcat Lake Core Series 3 CPUs at CES 2026 with some high-level specifications, but no part numbers or exact CPU, GPU, and NPU frequencies. Some leaks reveal more details about the Wildcat Lake parts, notably the Intel Core 3 304 penta-core processor, for which we also have some benchmarks.

Intel Wildcat Lake SKUs

The data below comes from a post by Jaykihn on X.…

Read more →
IEEE Partners With Academia to Create Microcredential Programs

The rapid ascent of artificial intelligence and semiconductor manufacturing has created a paradox: Industries are booming yet they face a critical shortage of skilled workers. Demand for data center technicians, fabrication facility workers, and similar positions is growing. There aren’t enough candidates with the right skill sets to fill the in-demand jobs.

Although those technical roles…

Read more →