Every prediction market trader wants to pick winners. But picking winners isn't enough to be profitable. What matters is whether your predicted probabilities are calibrated — meaning when you say 70%, the event actually happens 70% of the time.
This distinction between accuracy and calibration is the single most important concept in prediction market trading. Get it right and you have a systematic edge. Get it wrong and you'll lose money even while picking more winners than losers.
## The Accuracy Trap
Consider two models predicting NBA games:
| Model | Prediction | True Win Rate | Profitable Entry |
|---|---|---|---|
| Model A (overconfident) | Boston 90% | 72% | Below 72c only |
| Model B (calibrated) | Boston 72% | 72% | Below 72c only |
Model A looks more confident and "accurate." But if the market is pricing Boston at 85 cents, Model A says buy (90% > 85%), while Model B says pass (72% < 85%). Model A is overconfident and will lose money at 85c. Model B is correctly calibrated and avoids the trap.
The key insight: a calibrated 72% prediction is more valuable than an overconfident 90% prediction, because it correctly identifies when the market price is too high.
## Measuring Calibration: ECE
Expected Calibration Error (ECE) is the standard metric. It groups your predictions into probability buckets (50-60%, 60-70%, etc.) and measures how far the actual win rate in each bucket is from the predicted probability.
| Bucket | Predicted | Actual WR | Gap |
|---|---|---|---|
| 50-60% | 55.2% | 57.1% | 1.9pp (good) |
| 60-70% | 65.1% | 63.8% | 1.3pp (good) |
| 70-80% | 74.3% | 71.0% | 3.3pp (ok) |
| 80-90% | 84.7% | 69.2% | 15.5pp (BAD - overconfident!) |

ECE = weighted average of gaps = 5.5pp
An ECE under 5 percentage points means your model is well-calibrated. Under 2pp is excellent. Above 10pp means your edge calculations are unreliable.
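The bucketing above translates directly into code. A minimal sketch (the function name and half-open bucket edges are my own choices, not from any particular library):

```python
import numpy as np

def expected_calibration_error(probs, outcomes, bins=(0.5, 0.6, 0.7, 0.8, 0.9, 1.0)):
    """Weighted-average gap between predicted probability and actual win rate.

    probs: predicted probabilities; outcomes: 1 for a win, 0 for a loss.
    Buckets are half-open [lo, hi), weighted by how many predictions they hold.
    """
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    total_gap, n = 0.0, len(probs)
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (probs >= lo) & (probs < hi)
        if not mask.any():
            continue
        gap = abs(probs[mask].mean() - outcomes[mask].mean())
        total_gap += gap * mask.sum() / n   # weight by bucket population
    return total_gap
```

A result of 0.055 corresponds to the 5.5pp ECE in the table above.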
## Edge = Fair Value − Market Price
Once you have calibrated probabilities, edge calculation is simple: your fair value minus the market's ask price. If your model says 72 cents fair and the market asks 60 cents, you have a 12-cent edge.
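As a sketch (helper names are illustrative), note that for a calibrated model the edge and the expected profit per contract are the same number:

```python
def edge_cents(fair_value_cents: float, ask_cents: float) -> float:
    """Edge = model fair value minus the market's ask price, in cents."""
    return fair_value_cents - ask_cents

def expected_pnl_cents(fair_value_cents: float, ask_cents: float) -> float:
    """Expected profit per contract if the fair value is truly calibrated:
    win (100 - ask) with probability p, lose ask with probability (1 - p)."""
    p = fair_value_cents / 100.0
    return p * (100 - ask_cents) - (1 - p) * ask_cents
```

Algebraically, `p * (100 - ask) - (1 - p) * ask = 100p - ask`, which is exactly the edge: a 72c fair value against a 60c ask yields a 12-cent edge and 12 cents of expected profit per contract.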
But not all edges are created equal. Live trading data from 400+ bot trades reveals a counterintuitive pattern:
| Edge Size | Win Rate | Avg P&L per Trade |
|---|---|---|
| 5-12 cents | 64-67% | +4 to +8 cents |
| 12-22 cents | 58-62% | +5 to +18 cents |
| 23-58 cents | 44.7% | Negative |
The largest "edges" have the worst win rate. Why? Because a 40-cent edge usually means the model is wrong, not the market. The market has information your model doesn't — injuries, lineup changes, weather, sharp money. When your model and the market disagree by a huge amount, the market is usually right.
## Closing Line Value: The Ultimate Edge Metric
Closing Line Value (CLV) measures whether you're consistently entering at better prices than the final market price. If you buy at 60 cents and the line closes at 65 cents, you have +5c CLV. This means the market moved toward your model's fair value after you entered — confirming your edge was real.
A positive average CLV across hundreds of trades is the strongest evidence that your model has genuine predictive power. Even if individual trades lose, positive CLV means you're systematically finding mispriced contracts.
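A minimal CLV tracker might look like this (names are illustrative; for a "no"-side position the sign flips, since you want the close to move down):

```python
def clv_cents(entry_cents: float, close_cents: float, side: str = "yes") -> float:
    """Closing Line Value: positive when the market closed past your entry."""
    if side == "yes":
        return close_cents - entry_cents
    return entry_cents - close_cents

def avg_clv(trades) -> float:
    """Mean CLV over (entry, close, side) tuples. A positive average across
    many trades is evidence the model finds real mispricings."""
    return sum(clv_cents(*t) for t in trades) / len(trades)
```

Buying at 60c with a 65c close gives `clv_cents(60, 65)` = +5, matching the example above.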
## The Adverse Selection Problem
Here's why backtests lie: your model might be perfectly calibrated on a random sample of games (say, an ECE of 0.2pp), but the games where you actually trade are not random. You only trade when the model disagrees with the market. In those specific disagreements, the market might be right more often than your model.
This is called adverse selection, and it's the reason live trading performance is always worse than backtest performance. The fix is a live recalibrator — an isotonic regression layer that learns the mapping from your model's raw predictions to actual outcomes on traded games, and adjusts future predictions accordingly.
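Library implementations exist (e.g. scikit-learn's `IsotonicRegression`); purely for illustration, here is a dependency-free sketch of the pool-adjacent-violators fit that isotonic regression performs, with hypothetical helper names:

```python
from bisect import bisect_left

def pav_fit(raw_probs, outcomes):
    """Pool Adjacent Violators: fit a non-decreasing step function from
    raw model probabilities to observed hit rates on traded games."""
    pairs = sorted(zip(raw_probs, outcomes))
    blocks = []                     # each block: [sum_of_outcomes, count, rightmost_x]
    for x, y in pairs:
        blocks.append([float(y), 1, x])
        # merge backwards while adjacent block means violate monotonicity
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, n, _ = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
            blocks[-1][2] = x
    xs = [b[2] for b in blocks]          # right endpoint of each step
    ys = [b[0] / b[1] for b in blocks]   # calibrated probability per step
    return xs, ys

def recalibrate(raw_prob, xs, ys):
    """Look up the calibrated probability for a raw model output."""
    i = bisect_left(xs, raw_prob)
    return ys[min(i, len(ys) - 1)]
```

Refit periodically on traded games only, so the mapping absorbs exactly the adverse-selection gap between backtest and live performance.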
## Practical Filters That Work
Based on live trading data across 1,000+ trades, these filters consistently improve profitability:
- Period filters: Late-game trades outperform early-game. In MLB, innings 4, 6, and 7 are profitable while inning 5 loses money. In NHL, the 1st period (72.7% WR) destroys the 2nd period (52.2%).
- Toss-up exclusion: Skip contracts priced 45-55 cents. These are genuinely uncertain outcomes where fees and slippage eat any edge.
- Minimum fair probability: Requiring the model to assign at least 63-65% fair value filters out the lowest-confidence predictions that tend to be wrong.
- Spread filter: Never trade when the bid-ask spread exceeds 6-8 cents. Wide spreads signal low liquidity and high execution risk.
- Score differential: In basketball, requiring at least a 3-point lead eliminates tied/close games where the model has least conviction.
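The checklist above can be collapsed into a single pre-trade gate. A sketch using the tight end of each quoted range (thresholds and the function name are illustrative; the sport-specific period filters are omitted):

```python
from typing import Optional

def passes_filters(fair_prob: float, bid_cents: int, ask_cents: int,
                   score_diff: Optional[int] = None) -> bool:
    """Return True only if a candidate trade clears every filter."""
    if 45 <= ask_cents <= 55:           # toss-up exclusion
        return False
    if fair_prob < 0.63:                # minimum fair probability
        return False
    if ask_cents - bid_cents > 6:       # spread filter (tight end of 6-8c range)
        return False
    if score_diff is not None and abs(score_diff) < 3:   # basketball close-game filter
        return False
    return True
```

Passing `score_diff` only for basketball keeps the gate reusable across sports.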
## Building Your Edge Stack
A profitable prediction market trading system combines:
- A calibrated model (ECE < 5pp) as the probability source
- Signal filters tuned on live data, not just backtests
- A live recalibrator correcting for adverse selection
- Execution optimized for speed (sub-second signal-to-order)
- Position sizing proportional to edge confidence, not edge size
- CLV tracking to continuously validate that your edge is real
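One way to size by confidence rather than raw edge is to cap a fractional-Kelly stake and scale it by a weight derived from the edge-size table earlier (the weights, Kelly fraction, and cap below are illustrative assumptions, not the article's numbers):

```python
def confidence_weight(edge_cents: float) -> float:
    """Down-weight outsized edges: per the live-trade table, 5-12c edges
    have the best win rate, while 23c+ 'edges' usually mean the model is wrong."""
    if 5 <= edge_cents < 12:
        return 1.0
    if 12 <= edge_cents < 23:
        return 0.5
    return 0.0   # sub-5c edges don't clear fees; 23c+ are likely model errors

def position_size(fair_prob: float, ask_cents: float, bankroll: float,
                  kelly_fraction: float = 0.25, max_frac: float = 0.02) -> float:
    """Stake = capped fractional Kelly, scaled by edge confidence."""
    edge = fair_prob * 100 - ask_cents
    f_star = edge / (100 - ask_cents)   # full Kelly for a binary contract
    f = max(0.0, min(f_star * kelly_fraction, max_frac))
    return bankroll * f * confidence_weight(edge)
```

Note the deliberate inversion: a 40-cent "edge" gets a zero weight, which is exactly the sizing-by-confidence principle in the list above.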
The most common mistake is optimizing for win rate instead of calibration. A 55% win rate with calibrated probabilities will outperform a 65% win rate with overconfident probabilities, because the calibrated model knows exactly when to bet and how much.