The Joule Index
An Independent Benchmark · Reported May 2026

The only AI benchmark where every score is auditable.

Real May 2026 open-source bug fixes. Real published model prices. Real maintainer-merge-ready diffs. Four axes (dollars, joules, attention, accessibility) on a single chart, with full observational traces published under the same disclosure rules MLCommons established for MLPerf Power.

Attention F1 · 9 of 9 runs: 1.000 (3 real OSS tasks across 3 Dropstone tiers)
Heavy vs Fast joule premium: 7.5× (for identical engineering output)
Mean $ / task · Fast: $0.082 (DeepSeek V4-Flash via Dropstone CLI)

The finding · Reported May 2026

Across three model tiers and three real bugs, all nine runs reproduced the diff the maintainer merged.

Blankline's research team ran Dropstone CLI on three real May 2026 open-source bug fixes: tiny RSS routes in DIYgod's RSSHub (a 44,000-star aggregator) and an 8-file refactor to Mozilla's Common Voice bundler. Every run produced a pull request that matched the diff a real maintainer had merged into production.

The cheapest tier did it for $0.082 per task and 224 joules. The flagship tier did the same work for $0.857 per task and 1,693 joules. Same diff. Same merge-readiness.

On this evidence, the premium is paying for compute, not capability.
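The premiums follow directly from the per-task means quoted above. A minimal Python sketch of that arithmetic, assuming the "flagship tier" is Dropstone Heavy and the "cheapest tier" is Dropstone Fast (the variable names are illustrative and not part of any Joule Index tooling):

```python
# Mean per-task figures as quoted in the report.
# Assumption: "cheapest tier" = Dropstone Fast, "flagship tier" = Dropstone Heavy.
fast = {"usd": 0.082, "joules": 224}
heavy = {"usd": 0.857, "joules": 1693}

# Premium = flagship figure divided by cheapest-tier figure.
joule_premium = heavy["joules"] / fast["joules"]
dollar_premium = heavy["usd"] / fast["usd"]

print(f"joule premium:  {joule_premium:.2f}x")   # 7.56x
print(f"dollar premium: {dollar_premium:.2f}x")  # 10.45x
```

The joule ratio comes out near 7.56, which the headline statistic reports as 7.5×. The dollar ratio, about 10.45×, is larger than the energy ratio.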


Figure 1 · The headline image

All nine runs shipped working code. The only thing that changed was the bill.

Each marker is one (task × tier) run. Lower-left is better. Brighter cyan = cheaper tier.

[Figure 1: scatter plot on log-log axes. Axes: dollars per merge-ready PR (log scale) and joules per task (log scale). Labeled points: Claude Haiku 4.5, Claude Opus 4.7, Claude Sonnet 4.6, Dropstone Fast (current leader), Dropstone Heavy, Dropstone Pro, Gemini 3.1 Flash, Gemini 3.1 Pro.]
Source · The Joule Index · Blankline · joule.blankline.org · Reported May 2026

Verified

Auditable by default

Every score on the leaderboard carries a sanitized observational trace. Tokens, costs, joules, file diffs. Anyone can re-score.
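As one illustration of what re-scoring from a trace could look like, the sketch below recomputes a run's dollar cost from token counts and published per-million-token prices, then compares it with the reported cost. The trace schema (`input_tokens`, `output_tokens`, `reported_usd`), the price keys, and every number in the example are hypothetical placeholders; the actual Joule Index trace format is not described here.

```python
import math

def recompute_cost_usd(trace, prices):
    """Recompute one run's dollar cost from token counts and per-million-token prices."""
    total = (
        trace["input_tokens"] * prices["input_per_mtok"]
        + trace["output_tokens"] * prices["output_per_mtok"]
    )
    return total / 1_000_000

def audit(trace, prices, rel_tol=0.01):
    """Return True when the reported cost is within rel_tol of the recomputed cost."""
    recomputed = recompute_cost_usd(trace, prices)
    return math.isclose(recomputed, trace["reported_usd"], rel_tol=rel_tol)

# Placeholder values for illustration only; not taken from any published trace.
example_trace = {"input_tokens": 200_000, "output_tokens": 10_000, "reported_usd": 0.05}
example_prices = {"input_per_mtok": 0.20, "output_per_mtok": 1.00}

print(audit(example_trace, example_prices))
```

The same pattern would extend to joules or diffs: recompute the quantity from the raw trace fields and flag any run whose reported score falls outside a stated tolerance.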

Live OSS

Real bugs, last week

Tasks are real GitHub issues filed and merged within the last 30 days, so the fixes postdate the training data of any model released before then and cannot have been memorized.

Four axes

One chart, four readers

Dollars for the CFO. Joules for the climate scientist. Attention for the ML researcher. Accessibility for the median human.


The civilizational question

The cost of intelligence is the price of admission to civilization's next era.

The race for AGI is real. The questions everyone asks are "when?" and "who?" The questions almost nobody asks are "at what cost?" and "to whom is it accessible?"

Every joule of intelligence the species generates has to come from a power grid. Every dollar a procurement team spends on AI is a dollar that does not go to housing, healthcare, or science. Every architectural choice frontier labs make, whether to cache prompts, how to price tiers, or how to disclose energy, shapes the affordability of the next decade of cognitive labor.

The Joule Index is the only benchmark that holds those choices accountable in the open.
