We Backtested Buying the Favorite on a Calibrated Prediction Market — Here's Why It Loses

June 24, 2026

Here is a tempting idea. On a prediction market, teams priced around 87% to win actually win about 86% of the time. The market is well-calibrated — its prices are honest probabilities. So just buy the favorites. You'll win the overwhelming majority of your bets. How could a strategy that wins 86% of the time lose money?

We tested exactly this on real, outcome-labeled data, and the answer is that it loses — not because of bad luck, but because of a structural fact that's worth internalizing once and never forgetting: a well-calibrated market is a fairly-priced market, and a fairly-priced market is the anti-signal for a directional bet. The better the calibration, the more certain the loss.

The data

We used a cross-venue "matched book" — the same MLB games captured live on both Polymarket and Kalshi, stitched onto one timeline, with the eventual winner labeled on every row. The honest sample is small and we'll say so up front: 41 settled game-sides across about 30 games over 3 days. (The raw file has 57,757 rows, but those are the same books re-snapshotted every ~31 seconds — quoting that as your sample size is the most common way prediction-market analyses fool themselves. The real unit is the game.)
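A minimal sketch of the "real unit is the game" point, collapsing repeated snapshots into one row per game-side. The tuple layout (`game_id`, `side`, `timestamp`, `ask`, `won`) and the sample rows are hypothetical, not the actual file schema, and keeping the latest snapshot is just one reasonable choice of representative row.

```python
# Hypothetical row layout: (game_id, side, timestamp_seconds, ask, won).
# These rows are made up for illustration; they are not from the real file.
snapshots = [
    ("game-1", "home", 0, 0.61, 1),
    ("game-1", "home", 31, 0.63, 1),
    ("game-1", "home", 62, 0.66, 1),
    ("game-2", "away", 0, 0.45, 0),
    ("game-2", "away", 31, 0.41, 0),
]


def collapse_to_game_sides(rows):
    """Keep only the latest snapshot for each (game_id, side) pair."""
    latest = {}
    for game_id, side, ts, ask, won in rows:
        key = (game_id, side)
        if key not in latest or ts > latest[key][0]:
            latest[key] = (ts, ask, won)
    return latest


game_sides = collapse_to_game_sides(snapshots)
print(len(snapshots), "rows ->", len(game_sides), "game-sides")
```

Any statistic computed afterwards should use `len(game_sides)` (or the number of distinct games) as its sample size, not the raw row count.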

First we confirmed the premise. The closing prices are well-calibrated: the Brier score is about 0.136 on Polymarket and 0.133 on Kalshi — comfortably below the 0.25 you'd get from a coin flip — and favorites priced ~87% won ~86%. So the premise of the strategy is true. Now watch it not matter.
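For reference, the Brier score is the mean squared error between the forecast probability and the 0/1 outcome. The sketch below uses made-up prices and outcomes (not the study's data) to show the calculation and why a constant 50% forecast always scores 0.25.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Illustrative values only: closing prices and whether that side won.
closing_probs = [0.87, 0.62, 0.55, 0.91, 0.30]
outcomes = [1, 1, 0, 1, 0]

market = brier_score(closing_probs, outcomes)
# A coin-flip forecast is (0.5 - y)^2 = 0.25 for every row, whatever y is.
coin_flip = brier_score([0.5] * len(outcomes), outcomes)

print(f"market: {market:.3f}  coin flip: {coin_flip:.3f}")
```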

The one-line proof

Let p be the price (in probability units) and assume the market is perfectly calibrated, so the true win probability is also p. You buy one contract at the ask, hold to settlement, and receive $1 if the team wins, $0 if it loses. Your expected profit is:

E[profit] = (prob win)·($1 − ask) + (prob lose)·(−ask)
          = p·(1 − ask) − (1 − p)·ask
          = p − ask
          = −(half-spread)        ← because a calibrated ask ≈ p + half-spread

The p·ask cross terms cancel, leaving p − ask. At every price (favorite, underdog, or coin flip) the calibrated taker's expected edge is exactly minus the half-spread crossed, before a cent of platform friction. There is no p where this flips positive. Win rate and expected value have decoupled. The 86% win rate is real, but you already paid for all of it in the 87¢ ask: you're buying 13¢ of upside for 87¢-plus-spread, and 14% of the time you lose the whole stake.
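The derivation above can be checked numerically. This sketch assumes perfect calibration (true win probability equals p) and an ask of p plus a half-spread; the 1¢ half-spread is an illustrative value, not a measured one.

```python
def expected_profit(p, half_spread):
    """Expected profit per $1 contract bought at the ask and held to settlement.

    Assumes the market is perfectly calibrated, so the true win probability
    is p, and that the ask sits at p + half_spread.
    """
    ask = p + half_spread
    return p * (1 - ask) - (1 - p) * ask


HALF_SPREAD = 0.01  # illustrative assumption: 1 cent

for p in (0.13, 0.50, 0.87):
    # The result equals -HALF_SPREAD at every p, up to floating-point rounding.
    print(p, round(expected_profit(p, HALF_SPREAD), 4))
```

The price drops out of the result entirely, so an underdog, a coin flip, and a heavy favorite all carry the same expected loss.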

The backtest agrees

Across all 41 game-sides, buying the tracked side at the ask returned a gross +1.8¢ per contract, which looks like a faint pulse until you put error bars on it. The game-level bootstrap 95% confidence interval runs from −9.5¢ to +13.3¢, and a permutation test gives a p-value of 0.76, which is indistinguishable from noise. And that's before trading costs. Once realistic friction is added on top of the spread, the net is negative under either cost assumption:

Cost on top of the spread-inclusive ask    Net EV per contract
+3¢ (friendly)                             −1.2¢
+5¢ (realistic on thin books)              −3.2¢
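The game-level bootstrap interval quoted above can be sketched as follows. This is a minimal percentile bootstrap that resamples whole games, so both sides of one game always travel together. The `pnl_by_game` values are synthetic, and the function name, resample count, and seed are assumptions rather than the settings used in the study.

```python
import random


def game_bootstrap_ci(pnl_by_game, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean P&L per contract, resampling by game."""
    rng = random.Random(seed)
    games = list(pnl_by_game)
    means = []
    for _ in range(n_boot):
        # Draw games with replacement; each draw brings all of its game-sides.
        sample = rng.choices(games, k=len(games))
        pnls = [x for g in sample for x in pnl_by_game[g]]
        means.append(sum(pnls) / len(pnls))
    means.sort()
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Synthetic P&L in dollars per contract: (1 - ask) on a win, -ask on a loss.
pnl_by_game = {
    "g1": [0.13],
    "g2": [-0.87],
    "g3": [0.40, -0.42],  # both sides of the same game were tracked
    "g4": [0.02],
    "g5": [-0.55],
    "g6": [0.45],
}

lo, hi = game_bootstrap_ci(pnl_by_game)
print(f"95% CI: {lo * 100:+.1f} cents to {hi * 100:+.1f} cents")
```

Resampling individual rows instead of games would treat correlated snapshots as independent and produce an interval that is too narrow.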

"But the heavy favorites won every single time!" They did: ten of ten in the most extreme bucket. Their ask was about 0.98. You pay roughly 98¢ to win a dollar, a gross gain of about 2¢ per contract, so after 5¢ of friction that "perfect" bucket nets −2.62¢. A 100% hit rate at a 98¢ price is not an edge; it's the market doing its job.
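The bucket arithmetic above reduces to one line: net EV is hit rate minus average ask minus friction. The sketch below uses the rounded 0.98 ask from the text, so it lands near −3¢ rather than the −2.62¢ quoted. That gap is presumably just rounding in the quoted ask (an average ask near 97.6¢ would reproduce −2.62¢), but that is an inference, not something stated in the data.

```python
def net_ev_cents(hit_rate, avg_ask, friction):
    """Net expected value per $1 contract, in cents.

    hit_rate: fraction of contracts in the bucket that settled at $1
    avg_ask:  average price paid, in dollars
    friction: extra cost on top of the spread-inclusive ask, in dollars
    """
    return (hit_rate - avg_ask - friction) * 100


# A perfect 10-of-10 bucket bought at a rounded 0.98 ask with 5 cents of friction.
print(round(net_ev_cents(1.0, 0.98, 0.05), 2))

# The same formula applied to the whole sample: a gross +1.8 cents is
# hit_rate - avg_ask = 0.018, so 3 cents of friction leaves about -1.2 cents.
print(round(net_ev_cents(0.018, 0.0, 0.03), 2))
```

Put differently, a bucket only breaks even when its hit rate exceeds its average ask plus friction, which a 98¢ ask with 5¢ of friction makes impossible.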

Where the "winning" buckets came from

Every positive-looking slice we found dissolved under scrutiny, and it's worth seeing how, because these are the exact traps that make a backtest lie to you:

The production proof

We don't have to argue this in theory, because we ran the strategy with real money. Our own live trading bot takes directional positions on these markets, and on its last 115 settled trades it posted a closing-line value of −6.7¢, beat the close only 42% of the time, and ran breakeven-to-negative on ROI. That's the theorem realized in production: directional taking on a calibrated market bleeds the spread. The most useful thing our bot ever taught us is that it can't beat the market it trades.

So where does a real edge live?

If the level is right, the only places left for an edge are narrow, and we can see them closing on our own data:

The takeaway

A high win rate on a well-calibrated market is the price, not the edge. If anything, discovering that a market is beautifully calibrated is the strongest possible evidence that there's no directional trade in it. The signal that actually separates skill from luck isn't your win rate — it's closing-line value: did you get a better price than the market's final word? Win rate flatters; CLV tells the truth.
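As a rough sketch, closing-line value for a buy is the closing price minus your entry price, and the beat-the-close rate is the share of trades where that difference is positive. The trades below are made up, and which closing quote to use (mid, ask, or last trade) is a choice this sketch leaves open.

```python
def clv_cents(entry_price, closing_price):
    """CLV for a buy, in cents: positive when you paid less than the close."""
    return (closing_price - entry_price) * 100


# Illustrative (entry_price, closing_price) pairs; not real trades.
trades = [(0.60, 0.64), (0.87, 0.83), (0.45, 0.45), (0.72, 0.70)]

clvs = [clv_cents(entry, close) for entry, close in trades]
mean_clv = sum(clvs) / len(clvs)
beat_close_rate = sum(v > 0 for v in clvs) / len(clvs)

print(f"mean CLV: {mean_clv:+.2f} cents, beat close: {beat_close_rate:.0%}")
```

Neither number depends on who won the game, which is why CLV separates skill from luck in a way win rate cannot.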

See the calibration analysis and the raw data

ZenHodl publishes the full write-up plus a free, outcome-labeled Polymarket × Kalshi sample you can re-run these tests on yourself — no marketing mockups, real settled rows.
