Brown Caught 50 Students Using ChatGPT On One Midterm. Here’s What Actually Failed.
Key Takeaways – Roberto Serrano, Brown University economist, flagged at least 50 students using ChatGPT on the midterm – ...
GLM-5.2 Benchmarks Beat GPT-5.5 at One-Sixth the Cost. Here’s Why That Matters.
TL;DR – GLM-5.2 scored 77.8 on SWE-bench Verified and 56.2 on Terminal Bench 2.0, surpassing every open-weight model...
Ford AI Quality Control Failure: Why 350 Human Inspectors Came Back
Key Takeaways – Ford’s bet-everything-on-AI quality control strategy imploded. Billions in recalls. Defects everywhere. Course correction took three...
The AI Gap Won’t Close in 2026. Here’s Why the Viral Chart Was Wrong.
TL;DR – Doubleword.ai ran all benchmarks from Artificial Analysis. Average lag between open-weight and closed-source LLMs sits at...
GPT-5.6 Sol Hits 750 Tokens/Second. What Actually Changes for You
Three hundred a month for Grok access felt steep until I did the math on AI infrastructure at...
How AI Read a Herculaneum Scroll: Vesuvius Challenge
TL;DR – Vesuvius Challenge read an unopened Herculaneum scroll...
IBM Built a 100B-Transistor Chip. Five Years Is Fine
TL;DR – IBM crammed 100 billion transistors into something smaller than your fingernail using 3D nanostack tech. – That means...
OpenAI’s Jalapeño Chip Cracked Nine Months. Here’s the Real Story.
TL;DR – OpenAI and Broadcom pushed Jalapeño from empty whiteboard to manufacturing tape-out in nine months. The fastest...
Anthropic Caught Alibaba Stealing Claude Interactions
Key Takeaways – Alibaba ran a campaign generating a large number of Claude interactions via fraudulent accounts, according...
VibeThinker-3B Matches 671B Models on Math. Here’s What That Actually Means for You
Key Takeaways – VibeThinker-3B scores 94.3 on AIME26 (97.1 with test-time scaling), matching larger models. – At roughly 6GB in...