A couple of weeks ago I hit a small math problem related to probability, and I thought it was trivial (it took me a few hours to solve, though :p ). However, I could not find a good Stack Exchange post or similar that answers my problem, so I am putting it here for the sake of my future reference, and possibly for some people who happen to hit the same problem.

Note that the "short answer" itself really is short and trivial, but the point of this post is the last section, which describes the relationship between the "short" and the "long" answers.

# Problem Definition

Let the probability that a coin shows heads in a single toss be *x* (0 < *x* < 1), and let the expected number of heads after *N* tosses be *M*. Compute *x* from *N* and *M*.

# Short Answer

The expected number of heads for **one** toss is obviously *x*. Because the expected value of the sum of two independent random variables is equal to the sum of the expected value of each variable (E[X + Y] = E[X] + E[Y]), the expected number of heads for *N* tosses is *Nx*.

Therefore,

$$M = Nx$$

and

$$x = \frac{M}{N}.$$
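As a quick sanity check, here is a minimal simulation sketch (the variable names and the concrete parameters are my own, not from the derivation) that estimates *M* empirically and recovers *x* as *M/N*:

```python
import random

random.seed(0)
N = 100         # tosses per trial
trials = 5000   # number of simulated trials
x_true = 0.3    # assumed head probability (illustrative choice)

# Average the number of heads over many N-toss trials to estimate M,
# then recover x as M / N.
total_heads = sum(
    sum(1 for _ in range(N) if random.random() < x_true)
    for _ in range(trials)
)
M = total_heads / trials
x_estimate = M / N
print(x_estimate)
```

The estimate lands very close to the assumed *x*, as the short answer predicts.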

# Long Answer (kind of)

By definition, the expected number of heads *M* can be expressed using *x* and *N* as follows:

$$M = \sum_{n=0}^{N} n P(n)$$

where $P(n)$ is the probability that exactly $n$ coins show their heads and is defined as:

$$P(n) = \binom{N}{n} x^n (1 - x)^{N - n}$$

Therefore,

$$M = \sum_{n=0}^{N} n \binom{N}{n} x^n (1 - x)^{N - n}$$

We can compute *x* from *M* and *N* by solving this equation for *x*..... but how??? (this is an *N*-th order equation in *x*!)

# What's Behind

Both the short and the long answers say correct stuff, which means one thing:

$$\sum_{n=0}^{N} n \binom{N}{n} x^n (1 - x)^{N - n} = Nx$$

And below is a proof:

$$
\begin{aligned}
\sum_{n=0}^{N} n \binom{N}{n} x^n (1 - x)^{N - n}
&= \sum_{n=1}^{N} N \binom{N - 1}{n - 1} x^n (1 - x)^{N - n} \\
&= Nx \sum_{n=1}^{N} \binom{N - 1}{n - 1} x^{n - 1} (1 - x)^{(N - 1) - (n - 1)} \\
&= Nx \sum_{m=0}^{N - 1} \binom{N - 1}{m} x^m (1 - x)^{(N - 1) - m} \\
&= Nx
\end{aligned}
$$

where $m = n - 1$ and $n \binom{N}{n} = N \binom{N - 1}{n - 1}$. Note that the $n = 0$ term is zero. The term after $Nx$ is equal to 1 as shown below:

$$\sum_{m=0}^{N - 1} \binom{N - 1}{m} x^m (1 - x)^{(N - 1) - m} = (x + (1 - x))^{N - 1} = 1$$
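The identity can also be checked exactly with rational arithmetic, so no floating-point error can sneak in (a small sketch of mine; the concrete *N* and *x* below are arbitrary choices):

```python
from fractions import Fraction
from math import comb

N = 7
x = Fraction(3, 10)  # exact rational arithmetic, so no rounding error

# Left-hand side of the identity: sum of n * C(N, n) * x^n * (1-x)^(N-n)
lhs = sum(n * comb(N, n) * x**n * (1 - x)**(N - n) for n in range(N + 1))
assert lhs == N * x  # holds exactly, not just approximately
print(lhs)  # 21/10
```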

# Let's Check Numerically

For those who may suspect the proof above, here are the values of *M* calculated by the short and the long answers for *N* = 100 (there are small differences due to numerical errors, but they are pretty much the same).

| x | M_short | M_long |
|------|---------|--------------------|
| 0.0 | 0.0 | 0.0 |
| 0.04 | 4.0 | 3.9999999999999862 |
| 0.08 | 8.0 | 8.000000000000032 |
| 0.12 | 12.0 | 11.999999999999998 |
| 0.16 | 16.0 | 15.99999999999996 |
| 0.2 | 20.0 | 20.00000000000012 |
| 0.24 | 24.0 | 23.999999999999993 |
| 0.28 | 28.000000000000004 | 28.0 |
| 0.32 | 32.0 | 31.99999999999981 |
| 0.36 | 36.0 | 35.99999999999999 |
| 0.4 | 40.0 | 39.99999999999999 |
| 0.44 | 44.0 | 44.00000000000023 |
| 0.48 | 48.0 | 47.99999999999998 |
| 0.52 | 52.0 | 52.0 |
| 0.56 | 56.00000000000001 | 56.00000000000001 |
| 0.6 | 60.0 | 59.99999999999997 |
| 0.64 | 64.0 | 64.0 |
| 0.68 | 68.0 | 68.0 |
| 0.72 | 72.0 | 71.99999999999999 |
| 0.76 | 76.0 | 75.99999999999997 |
| 0.8 | 80.0 | 80.0 |
| 0.84 | 84.0 | 84.00000000000001 |
| 0.88 | 88.0 | 88.00000000000001 |
| 0.92 | 92.0 | 92.00000000000001 |
| 0.96 | 96.0 | 96.0 |

Here is the code used to generate the table above. Note that M_long(x) does not work for a large *N* due to overflow.

```python
import scipy.special as sp

N = 100

def M_short(x):
    return N * x

def M_long(x):
    ans = 0
    for n in range(1, N + 1):
        ans += n * sp.comb(N, n) * pow(x, n) * pow(1 - x, N - n)
    return ans

print("|x|M_short|M_long|")
print("|-|-------|------|")
for i in range(0, 25):
    x = i / 25
    print("|", x, "|", M_short(x), "|", M_long(x), "|")
```
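For a large *N*, one workaround (a sketch of mine, not part of the original code) is to compute each term in log space with `math.lgamma`, which sidesteps the overflowing binomial coefficient:

```python
from math import exp, lgamma, log

def M_long_stable(x, N):
    # Handle the endpoints, where log(x) or log(1 - x) is undefined.
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return float(N)
    total = 0.0
    for n in range(1, N + 1):
        # log of the n-th term: log C(N, n) + n*log(x) + (N - n)*log(1 - x)
        log_p = (lgamma(N + 1) - lgamma(n + 1) - lgamma(N - n + 1)
                 + n * log(x) + (N - n) * log(1 - x))
        total += n * exp(log_p)
    return total

print(M_long_stable(0.3, 10000))  # close to 10000 * 0.3 = 3000
```

Each probability is exponentiated only after the huge factorials have cancelled in log space, so no intermediate value overflows.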