Methodology

How the scoring works

Every prediction has two parts: the call and the confidence behind it.

Picking a winner is only half the test. The more interesting question is whether your confidence was earned. A 55% call and a 95% call should not be treated the same when they are wrong.

That is what this experiment scores.

The Brier score

The Brier score measures the squared gap between your stated confidence and the result.

Brier score = (predicted probability - actual outcome)^2

The actual outcome is simple:

1 if your prediction happened
0 if it did not

Lower is better. A perfect prediction scores 0. The worst possible score is 1: total confidence in something that did not happen.
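If you want to check a score yourself, the formula fits in a few lines. This is a minimal sketch, not the experiment's actual code: the function name brier_score, its parameters, and the input check are assumptions made for illustration.

```python
def brier_score(confidence: float, happened: bool) -> float:
    """Squared gap between a stated confidence (0.0 to 1.0) and the outcome.

    `confidence` is the probability you gave your own call.
    `happened` is True if the prediction came true, False otherwise.
    Both names are illustrative, not taken from the experiment.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    outcome = 1.0 if happened else 0.0  # 1 if it happened, 0 if it did not
    return (confidence - outcome) ** 2


# A fully confident, correct call is the perfect score.
perfect = brier_score(1.0, True)   # 0.0
# A fully confident, wrong call is the worst score.
worst = brier_score(1.0, False)    # 1.0
```

Confidence goes in as a fraction, so a 70% call is 0.70.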

Two examples

70% confident and correct

(0.70 - 1)^2 = 0.09

Good call. You were confident, and the result backed you up.

90% confident and wrong

(0.90 - 0)^2 = 0.81

That is the expensive miss. The prediction was wrong, but the real problem was the confidence.
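To see how fast the penalty grows, here is the same wrong call scored at rising confidence levels. It is a rough sketch under the formula above; the helper name brier_score and the chosen confidence levels are assumptions for illustration.

```python
def brier_score(confidence: float, happened: bool) -> float:
    # Hypothetical helper: squared gap between confidence and outcome (1 or 0).
    outcome = 1.0 if happened else 0.0
    return (confidence - outcome) ** 2


# The prediction is wrong every time; only the stated confidence changes.
wrong_calls = {
    confidence: round(brier_score(confidence, False), 4)
    for confidence in (0.55, 0.70, 0.90, 0.95)
}

for confidence, score in wrong_calls.items():
    print(f"{confidence:.0%} confident and wrong -> {score}")
# 55% confident and wrong -> 0.3025
# 70% confident and wrong -> 0.49
# 90% confident and wrong -> 0.81
# 95% confident and wrong -> 0.9025
```

A wrong 95% call costs roughly three times as much as a wrong 55% call, even though both missed.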

The Calibration Score

The Brier score is the math. The Calibration Score is the easier version to read.

Calibration Score = max(0, round(100 × (1 - average Brier score)))

Higher is better.

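A high score means your confidence generally matched reality. A low score means your confidence and reality were out of sync. For reference, saying 50% on every prediction gives a Brier score of 0.25 each time, which works out to a Calibration Score of 75.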
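The Calibration Score formula can be sketched the same way. The function name calibration_score and the list-of-pairs input are assumptions for illustration. The formula does not say how exact halves are rounded, so this sketch simply uses Python's built-in round, which rounds halves to the nearest even number.

```python
def calibration_score(predictions: list[tuple[float, bool]]) -> int:
    """Turn a set of predictions into a 0-100 score (higher is better).

    `predictions` is a list of (confidence, happened) pairs, an assumed
    input shape chosen for this example.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    # Brier score for each prediction: (confidence - outcome)^2
    briers = [
        (confidence - (1.0 if happened else 0.0)) ** 2
        for confidence, happened in predictions
    ]
    average_brier = sum(briers) / len(briers)
    # Calibration Score = max(0, round(100 x (1 - average Brier score)))
    return max(0, round(100 * (1 - average_brier)))


# The two examples from this page: 70% and correct, 90% and wrong.
two_examples = calibration_score([(0.70, True), (0.90, False)])
print(two_examples)  # 55
```

The two Brier scores are 0.09 and 0.81, which average to 0.45, so the Calibration Score is 55.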

Why this matters

Most prediction games reward the final answer. This one also grades the certainty behind it.

That matters because overconfidence shows up everywhere: sports predictions, market calls, executive decisions, AI outputs, security alerts, medical triage, policy claims, and everyday judgment.

Being wrong is normal. Being very sure and wrong is different.

The Calibration Cup turns that difference into a receipt.