Anon sent a question about Bertrand’s Paradox. The paradox is supposed to show something has gone wrong with our thinking in probability. And it has, but not in the way its proponents imagine.
There’s a video at Aeon which explains a simplified version of the paradox. Watch it if you can’t follow my written explanation. Here’s their introduction:
The unresolved probability paradox that goes to the heart of scientific objectivity
The principle of indifference states that, without any evidence, all potential outcomes should be considered equally probable. For example, if there’s a 10-horse race then, without any additional information, one should assume that each horse has a 1-in-10 chance of winning. It’s an important epistemological principle at the foundation of probability that might seem as safe and sound as it is obvious. But, as this video from Wireless Philosophy (Wi-Phi) lays out, a paradox first described by the French mathematician Joseph Bertrand in 1889 can make starting from a position of true indifference impossible. And, because probability is at the core of almost every scientific field, this paradox has rippled through science for more than a century, leaving in its wake disagreements, workarounds and, so far, no clear solution.
The simplification goes like this. Imagine a factory that makes boxes, anywhere from 1 foot to 3 feet in length. This is our evidence, from which we must deduce our probabilities. Call it “E” for short.
What is the probability a box’s length is between 1 and 2 ft?
Pr(length between 1 and 2 ft| E) = 1/2.
We get that from reasoning that the length can be anywhere from 1 to 3, 2 is halfway, and there we go. The video, and others, say the 1/2 deduction comes from an implicit premise, the “principle of indifference”. Meaning, here, that we are indifferent (whatever that might mean) to boxes of any length. So we are not conditioning on E alone, but on E augmented by the principle of indifference.
Easy enough, no? Now the twist. What is the probability a box’s area is between 1 and 4 square feet? Well, a box of length 1 ft has 1 ft^2 area. And a box 2 ft in length has 4 ft^2 area. So maybe
Pr(area between 1 and 4 ft^2| E*) = 1/2,
where the * in E indicates we’re guessing.
Alas, no. For consider the areas of boxes can be between 1 ft^2 and 9 ft^2, and the distance between 1 and 9 is 8. And 4 is only 3/8 of the way from 1 to 9. Here, in case you can’t see that:
1 2 3 4 5 6 7 8 9.
So the principle of indifference answer is
Pr(area between 1 and 4 ft^2| E) = 3/8.
Yet asking the chance a box is between 1 and 2 ft in length is identical, and must be, to asking the chance a box is between 1 and 4 ft^2 in area. But the probabilities aren’t the same.
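Both indifference answers above are nothing more than ratios of interval widths. Here is a minimal sketch of that arithmetic, assuming “indifference” means spreading the weight evenly over whichever quantity is named. The helper name uniform_fraction is mine, for illustration only; it is not from the video.

```python
from fractions import Fraction

def uniform_fraction(lo, hi, a, b):
    """Share of the interval [lo, hi] covered by [a, b], treating every
    point of [lo, hi] as equally weighted (the indifference assumption)."""
    return Fraction(b - a, hi - lo)

# Indifference applied to length: boxes run from 1 ft to 3 ft.
p_length = uniform_fraction(1, 3, 1, 2)

# Indifference applied to area: the same boxes run from 1 ft^2 to 9 ft^2.
p_area = uniform_fraction(1, 9, 1, 4)

print(p_length, p_area)  # 1/2 3/8
```

Same boxes, same question, two different numbers, depending only on which quantity was spread evenly.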
Conclusion? Probability is broken! All is subjective!
But, no.
What most don’t notice is there is yet another hidden or tacit premise that causes all to go awry. The video maker almost noticed it: came damn close. But just as he keyed in on it, he was distracted to other matters.
The problem, or tacit premise, is infinity. Not only is infinity unimaginably huge and mysterious, it is itself of different sizes. Which size are we using in the box example? Don’t know. It’s never specified. But it’s sort of assumed (another tacit premise) that it’s not the so-called counting infinity, nor is it power sets of real numbers, or still others, but the infinity of the continuum. One of many infinities, and a common one in math.
Infinity is like a teeming metropolis, in the sense that the road you ride to get there takes you to different neighborhoods. Infinity is not a point, in this sense, but a place. Length Road will lead you to the downscale 1/2 neighborhood, whereas Area Boulevard brings you to the tonier 3/8. The trick to get the same answers is to take the right road.
Now if you don’t understand that, think of it this way. No possible factory can cut an infinite number of lengths, no matter which infinity you’re using. All possible box lengths are finite and discrete in actuality. And we can certainly only measure to finite and discrete levels.
Let’s take a side-on close up of a box’s material that is three feet in length, here drawn using my masterful Gimp skills.
For whatever reasons of physics, mechanics, and materials, the factory can only cut lengths of 1 ft, 2 ft, or 3 ft. That is, the box that comes out can be measured down to some finite, discrete level, which here turns out to be 1-foot chunks. There is nothing special in this number; I could have made it thousandths of an inch, or millionths, or whatever, as we’ll see. But whole numbers are easy to work with.
We need to augment our E with knowledge of our finite discrete limitations. Call that evidence, for shorthand, F.
What is the probability a box’s length is between 1 and 2 ft?
Pr(length between 1 and 2 ft| F) = 2/3.
What is the probability a box’s area is between 1 and 4 square feet?
Pr(area between 1 and 4 ft^2| F) = 2/3.
Let’s walk through the answer using a modified picture of our box (I am available for all art awards).
The thin blue lines indicate the only possible box length cuts. We can, due to the limitations noted, only cut a length of 1 ft, or a length of 2 ft, or a length of 3 ft. Between 1 and 2 ft (inclusive) is 2 out of 3. And the probability is 2/3.
The thin red lines indicate the only possible box area cuts. We can, due to the same limitations, only cut an area of 1 ft^2, or an area of 4 ft^2, or an area of 9 ft^2. Between 1 and 4 ft^2 (inclusive) is again 2 out of 3. And the probability is again 2/3.
Both match, and there is no crisis.
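The counting can be checked in a few lines. This is only a sketch of the argument just given, assuming (as the text does) that each possible cut counts equally; count_probability is a name I made up for the illustration.

```python
from fractions import Fraction

def count_probability(values, lo, hi):
    """Fraction of the possible values that fall in [lo, hi], inclusive."""
    hits = [v for v in values if lo <= v <= hi]
    return Fraction(len(hits), len(values))

lengths = [1, 2, 3]                 # the only cuts the factory can make (F)
areas = [x * x for x in lengths]    # 1, 4, 9: areas of those same boxes

p_length = count_probability(lengths, 1, 2)
p_area = count_probability(areas, 1, 4)

print(p_length, p_area)  # 2/3 2/3
```

Because the areas are built from the very same list of boxes, the two counts cannot come apart.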
You will have noticed that neither probability is 1/2. This is crucial. Here’s a third box, which is probably more realistic upon viewing our box in a microscope.
There was nothing in our premises that said the cuts must be perfect and uniform. Real materials have flaws and inconsistencies. In this case, we can get lengths of 2 ft or 1 ft, maybe with some sanding at the end. Three ft is easy enough, too. But boxes are out for all but two lengths.
Looks like we can get a box with length 2.5 ft, with corresponding area 6.25 ft^2. And we can get a box with length 3 ft and area 9 ft^2. No other boxes look possible—without surgery.
What is the probability a box’s length is between 1 and 2 ft?
Pr(length between 1 and 2 ft| F’) = 0.
What is the probability a box’s area is between 1 and 4 square feet?
Pr(area between 1 and 4 ft^2| F’) = 0.
Neither can be done.
Well, this is a crude picture, which we can refine by sharpening our saws and perfecting our sanding. In the end, though, we will still be left, no matter what, with a box that is constructed out of finite discrete measurable parts.
That being so, there will never be an inconsistency in probabilities. This is easy to prove. Suppose, without (as they say) loss of generality, the box comes in chunks of 1/n, and that all chunks are uniform. Let’s skip writing the units, as they don’t really matter.
We can get boxes of length 1/n, area (1/n)^2, or length 2/n, area (2/n)^2, and so on, up to length 1, area 1^2 (multiplying this by a constant, like 3 to match the first example, doesn’t change anything). As long as the length and area in the probability question are in this set, there will never be a paradox. (You can’t ask for lengths .3/n or 1.7/n or whatever, because these can’t be made.)
Call our new information G.
What is the probability a box’s length is between 1/n (the minimum) and 1/2 (the middle, taking n even so that 1/2 is a length that can be made)?
Pr(length between 1/n and 1/2| G) = 1/2.
What is the probability a box’s area is between (1/n)^2 and (1/2)^2, which is the equivalent question (as it must be) in whatever units we use?
Pr(area between (1/n)^2 and (1/2)^2| G) = 1/2.
All found by simple counting.
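Here is the same counting done by machine for a few values of n. It is a sketch under the assumptions already stated: uniform chunks of size 1/n, every possible box counted equally, and n even so that 1/2 is one of the lengths that can be made. The function name matching_probabilities is hypothetical.

```python
from fractions import Fraction

def matching_probabilities(n):
    """Length and area probabilities for boxes built from chunks of size 1/n."""
    lengths = [Fraction(k, n) for k in range(1, n + 1)]   # 1/n, 2/n, ..., 1
    areas = [x * x for x in lengths]                      # areas of the same boxes
    half = Fraction(1, 2)
    # Count boxes with length between 1/n and 1/2 (1/n is the minimum).
    p_length = Fraction(sum(1 for x in lengths if x <= half), n)
    # Count boxes with area between (1/n)^2 and (1/2)^2.
    p_area = Fraction(sum(1 for a in areas if a <= half * half), n)
    return p_length, p_area

for n in (2, 10, 1000, 10**4):
    p_length, p_area = matching_probabilities(n)
    print(n, p_length, p_area)
```

For every even n both numbers come out as 1/2. For odd n the value is no longer exactly 1/2, but the length and area counts still agree with each other, which is the point.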
Now let n grow, and grow as large as you like, except don’t let it hit infinity. Let it be 10 to the 10 a million times, and raise all that to the power of 10 to the 10 a million times, and keep doing that 10 to the 10 a million times, and then multiply all this by 2.
This, you will agree, is a very large number. More than there are particles in the universe. No boxes can be made this big. But, mathematically, there is no difficulty. And there is no paradox. All the probabilities work and match.
As large as that number is, infinity is still, well, infinitely far away. To get there and save probability, you have to take a road that allows the length and area to stick together. Allow them to separate, even for an instant, and the whole thing is wrecked.
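One way to picture the two roads, offered as an illustration and not as Jaynes’s construction: let the possible boxes be equally spaced in length, or instead equally spaced in area, and in both cases just count. The equal spacing, the grid sizes, and the helper names grid and share are all my assumptions. The box sizes are the 1 ft to 3 ft of the first example.

```python
from fractions import Fraction

def grid(lo, hi, n):
    """n + 1 equally spaced points from lo to hi, as exact fractions."""
    return [lo + Fraction(k * (hi - lo), n) for k in range(n + 1)]

def share(values, lo, hi):
    """Fraction of values inside [lo, hi], by simple counting."""
    return Fraction(sum(1 for v in values if lo <= v <= hi), len(values))

for n in (8, 80, 8000):
    # Road 1: possible boxes equally spaced in length, 1 ft to 3 ft.
    lengths = grid(1, 3, n)
    by_length = share(lengths, 1, 2)
    same_boxes_by_area = share([x * x for x in lengths], 1, 4)

    # Road 2: possible boxes equally spaced in area, 1 ft^2 to 9 ft^2.
    areas = grid(1, 9, n)
    by_area = share(areas, 1, 4)

    # On any one road, length and area describe the same boxes.
    assert by_length == same_boxes_by_area
    print(n, float(by_length), float(by_area))
```

On the first road the count heads toward 1/2 as n grows; on the second it heads toward 3/8. On either road, taken alone, the length question and the area question never disagree at any finite n. The disagreement only shows up when one question is answered on one road and the other question on the other.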
Incidentally 1: Jaynes provides an elegant example of the path to infinity in his Probability Theory: The Logic of Science, and I describe this situation in more detail in my own Uncertainty.
Incidentally 2: most problems in statistics, like parameterized models, priors, and all that, suffer the same kind of travels to infinity made in Bertrand’s paradox.