You play a game flipping a fair coin. You may stop after any trial, at which point you are paid in dollars

the percentage of heads flipped. So if on the first trial you flip a head, you should stop and earn $100

because you have 100% heads. If you flip a tail then a head, you could either stop and earn $50,

or continue on, hoping the ratio will exceed 1/2. This second strategy is superior.

A paper by Medina and Zeilberger (arXiv:0907.0032v2 [math.PR]) says that it is an unsolved

problem to determine if it is better to continue or stop after you have flipped 5 heads in 8 trials: accept $62.50 or hope for more. It is easy to simulate this problem and it is clear

from even limited experimental data that it is better to continue (perhaps more than 70% chance you’ll improve over $62.50).
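For instance, here is a minimal Monte Carlo sketch of that experiment (the 1,000-flip horizon, the 100,000 trials, and the name `ever_exceeds` are arbitrary choices of mine): it estimates the chance that the running ratio ever climbs above 5/8 if you keep flipping.

```python
import random

def ever_exceeds(heads=5, flips=8, horizon=1000):
    """One play-out: from 5 heads in 8 flips, flip up to `horizon` more
    times and report whether the heads ratio ever exceeds 5/8."""
    target = heads / flips
    for _ in range(horizon):
        heads += random.random() < 0.5   # fair coin: True (1) is a head
        flips += 1
        if heads / flips > target:
            return True
    return False

trials = 100_000
hits = sum(ever_exceeds() for _ in range(trials))
print(f"estimated P(ever exceed 5/8) ~ {hits / trials:.3f}")
```

Note that this estimates the probability of ever improving on $62.50, which is not the same thing as improving the expected payoff, as the answer below shows.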

My question is basically: Why is this difficult to prove? Presumably it is not that difficult

to write out an expression for the expectation of exceeding 5/8 in terms of the cumulative binomial distribution.

(Added 5 Dec 2013.) A paper on this topic was just published:

Olle Häggström and Johan Wästlund, “Rigorous computer analysis of the Chow-Robbins game,” The American Mathematical Monthly, Vol. 120, No. 10, December 2013 (pre-journal arXiv link; JSTOR link).

From the Abstract: “In particular, we confirm that with 5 heads and 3 tails, stopping is optimal.”

**Answer**

I accept Qiaochu’s answer: “Have you tried actually doing that?” I did try, and now I can appreciate the challenge. 🙂 The paper I cited refers to another by Chow and Robbins from 1965 that has a beautiful formulation, much clearer than the cumulative binomials with which I struggled. Let me explain it, because it is really cool.

For the natural strategy I mentioned in the comments (and echoed by Raynos),

let $f(n,h,t)$ be the expected payoff if you start with $h$ heads and $t$ tails, in the game truncated at $n$ total flips. Then there is an easy recursive formulation for $f$:

$$f(n,h,t) = \max\left( \tfrac{1}{2}\, f(n,h+1,t) + \tfrac{1}{2}\, f(n,h,t+1),\; \frac{h}{h+t} \right)$$

because you have an equal chance of moving to $h+1$ heads or to $t+1$ tails on the next flip if you continue, and you get the current ratio $h/(h+t)$ if you stop.
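For the state in question, the recursion reads

$$f(n,5,3) = \max\left( \tfrac{1}{2}\, f(n,6,3) + \tfrac{1}{2}\, f(n,5,4),\; \tfrac{5}{8} \right).$$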

Now, when $h+t=n$, you need to make a “boundary” assumption. By the law of large numbers, continuing to flip forever drives the ratio to $\frac{1}{2}$, so for large $n$ the reasonable boundary value is $\max(1/2,\; h/n)$: the better of playing on indefinitely or stopping right there.

So now all you need to do is fill out the table for all $h$ and $t$ values with $h+t \le n$.

The real answer is the limit as $n \rightarrow \infty$, but using large $n$ approximates this limit.
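Here is a minimal sketch of that table-filling (backward induction); the function name and the truncation $n = 2000$ are my own choices, and the value it prints is only a finite-$n$ approximation of the true value of continuing.

```python
def continuation_value(n, H, T):
    """Fill the table backwards from the boundary h + t = n, where
    f is assumed to equal max(1/2, h/n), down to the state (H, T).
    Returns the value of continuing from (H, T)."""
    # Level s = h + t = n: the boundary assumption.
    nxt = [max(0.5, h / n) for h in range(n + 1)]
    go_at_target = None
    for s in range(n - 1, H + T - 1, -1):     # walk back one level at a time
        cur = [0.0] * (s + 1)
        for h in range(s + 1):
            # Next flip: a head moves to (h+1, t), a tail to (h, t+1).
            go = 0.5 * nxt[h + 1] + 0.5 * nxt[h]
            cur[h] = max(go, h / s if s else 0.0)
            if s == H + T and h == H:
                go_at_target = go
        nxt = cur
    return go_at_target

print(continuation_value(2000, 5, 3))   # compare with 5/8 = 0.625
```

The careful calculations cited below put the limiting value of continuing at about 0.6236, just below 0.625.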

After the Medina and Zeilberger paper was released, in fact just about three weeks ago,

a very careful calculation using the above recursive formulation was made by Julian Wiseman and reported on this web page. The conclusion is to me remarkable: “Choosing to continue lowers the expected payoff [from 0.625] to 0.62361957757.”

This is still not a proof, but the “answer” is now known.

So my “it is clear from even limited experimental data that” was completely wrong!

I am delighted to learn from my mistake.

**Attribution**

*Source: Link, Question Author: Joseph O’Rourke, Answer Author: Joseph O’Rourke*