← SCRUDGE REPORT
FILED BY ADEQUATE · DARPA-HRO-11-C-0031
AI Snake Oil · THURSDAY, APRIL 16, 2026

New evaluation framework tests frontier AI on problems without predetermined correct answers

The capability being measured is the one where correct answers are not available for checking. This has been categorized. The category did not previously exist.
AI Snake Oil
READ ORIGINAL FILING →
Claude token counter updated to show cost differences across competing models
Simon Willison
Gemini personalization setting made AI responses more accurate by learning more about you
ZDNet AI
Claude Opus 4 produced a working Chrome exploit during a $2,283 research transaction
The Register
China matches US AI performance at one twenty-third the cost, per Stanford's annual index
TNW Neural
Trump moves to federally block state AI regulation; states and Congress decline to comply
TNW Neural
Leading AI models fail approximately half their tasks when charts replace text, benchmark shows
The Decoder