A couple of weeks ago I hit a small math problem related to probability, and I thought it was trivial (it took me a few hours to solve, though :p ). However, I could not find a good Stack Exchange post or similar that answers my problem, so I am putting it here for the sake of my future reference, and possibly for some people who happen to hit the same problem.

Note that the "short answer" itself really is short and trivial, but the point of this post is the last section, which describes the relationship between the "short" and the "long" answers.

# Problem Definition

Let the probability that a coin shows heads in a single toss be *x* (0 < *x* < 1), and let the expected number of heads after *N* tosses be *M*. Compute *x* from *N* and *M*.

# Short Answer

The expected number of heads for **one** toss is obviously *x*. Because the expected value of the sum of two independent random variables is equal to the sum of the expected value of each variable (E[X + Y] = E[X] + E[Y]), the expected number of heads for *N* tosses is *Nx*.

Therefore,

$$M = Nx$$

and

$$x = \frac{M}{N}.$$
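As a quick sanity check, here is a minimal simulation sketch (the variable names and the concrete parameters are my own, not from the derivation) that estimates *M* empirically and recovers *x* as *M/N*:

```python
import random

random.seed(0)
N = 100         # tosses per trial
trials = 5000   # number of simulated trials
x_true = 0.3    # assumed head probability (illustrative choice)

# Average the number of heads over many N-toss trials to estimate M,
# then recover x as M / N.
total_heads = sum(
    sum(1 for _ in range(N) if random.random() < x_true)
    for _ in range(trials)
)
M = total_heads / trials
x_estimate = M / N
print(x_estimate)
```

The estimate lands very close to the assumed *x*, as the short answer predicts.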

# Long Answer (kind of)

By definition, the expected number of heads *M* can be expressed using *x* and *N* as follows:

$$M = \sum_{n=0}^{N} n P(n)$$

where $P(n)$ is the probability that exactly $n$ coins show their heads and is defined as:

$$P(n) = \binom{N}{n} x^n (1 - x)^{N - n}$$

Therefore,

$$M = \sum_{n=0}^{N} n \binom{N}{n} x^n (1 - x)^{N - n}$$

We can compute *x* from *M* and *N* by solving this equation for *x*..... but how??? (this is an *N*-th order equation in *x*!)

# What's Behind

Both the short and the long answers say correct stuff, which means one thing:

$$\sum_{n=0}^{N} n \binom{N}{n} x^n (1 - x)^{N - n} = Nx$$

And below is a proof:

$$
\begin{aligned}
\sum_{n=0}^{N} n \binom{N}{n} x^n (1 - x)^{N - n}
&= \sum_{n=1}^{N} N \binom{N - 1}{n - 1} x^n (1 - x)^{N - n} \\
&= Nx \sum_{n=1}^{N} \binom{N - 1}{n - 1} x^{n - 1} (1 - x)^{(N - 1) - (n - 1)} \\
&= Nx \sum_{m=0}^{N - 1} \binom{N - 1}{m} x^m (1 - x)^{(N - 1) - m} \\
&= Nx
\end{aligned}
$$

where $m = n - 1$ and $n \binom{N}{n} = N \binom{N - 1}{n - 1}$. Note that the $n = 0$ term is zero. The term after $Nx$ is equal to 1 as shown below:

$$\sum_{m=0}^{N - 1} \binom{N - 1}{m} x^m (1 - x)^{(N - 1) - m} = (x + (1 - x))^{N - 1} = 1$$
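The identity can also be checked exactly with rational arithmetic, so no floating-point error can sneak in (a small sketch of mine; the concrete *N* and *x* below are arbitrary choices):

```python
from fractions import Fraction
from math import comb

N = 7
x = Fraction(3, 10)  # exact rational arithmetic, so no rounding error

# Left-hand side of the identity: sum of n * C(N, n) * x^n * (1-x)^(N-n)
lhs = sum(n * comb(N, n) * x**n * (1 - x)**(N - n) for n in range(N + 1))
assert lhs == N * x  # holds exactly, not just approximately
print(lhs)  # 21/10
```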

# Let's Check Numerically

For those who may suspect the proof above, here are the values of *M* calculated by the short and the long answers for *N* = 100 (there are small differences due to numerical errors, but they are pretty much the same).

| x | M_short | M_long |
|------|---------|--------------------|
| 0.0 | 0.0 | 0.0 |
| 0.04 | 4.0 | 3.9999999999999862 |
| 0.08 | 8.0 | 8.000000000000032 |
| 0.12 | 12.0 | 11.999999999999998 |
| 0.16 | 16.0 | 15.99999999999996 |
| 0.2 | 20.0 | 20.00000000000012 |
| 0.24 | 24.0 | 23.999999999999993 |
| 0.28 | 28.000000000000004 | 28.0 |
| 0.32 | 32.0 | 31.99999999999981 |
| 0.36 | 36.0 | 35.99999999999999 |
| 0.4 | 40.0 | 39.99999999999999 |
| 0.44 | 44.0 | 44.00000000000023 |
| 0.48 | 48.0 | 47.99999999999998 |
| 0.52 | 52.0 | 52.0 |
| 0.56 | 56.00000000000001 | 56.00000000000001 |
| 0.6 | 60.0 | 59.99999999999997 |
| 0.64 | 64.0 | 64.0 |
| 0.68 | 68.0 | 68.0 |
| 0.72 | 72.0 | 71.99999999999999 |
| 0.76 | 76.0 | 75.99999999999997 |
| 0.8 | 80.0 | 80.0 |
| 0.84 | 84.0 | 84.00000000000001 |
| 0.88 | 88.0 | 88.00000000000001 |
| 0.92 | 92.0 | 92.00000000000001 |
| 0.96 | 96.0 | 96.0 |

Here is the code used to generate the table above. Note that M_long(x) does not work for a large *N* due to overflow.

```python
import scipy.special as sp

N = 100

def M_short(x):
    return N * x

def M_long(x):
    ans = 0
    for n in range(1, N + 1):
        ans += n * sp.comb(N, n) * pow(x, n) * pow(1 - x, N - n)
    return ans

print("|x|M_short|M_long|")
print("|-|-------|------|")
for i in range(0, 25):
    x = i / 25
    print("|", x, "|", M_short(x), "|", M_long(x), "|")
```
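For a large *N*, one workaround (a sketch of mine, not part of the original code) is to compute each term in log space with `math.lgamma`, which sidesteps the overflowing binomial coefficient:

```python
from math import exp, lgamma, log

def M_long_stable(x, N):
    # Handle the endpoints, where log(x) or log(1 - x) is undefined.
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return float(N)
    total = 0.0
    for n in range(1, N + 1):
        # log of the n-th term: log C(N, n) + n*log(x) + (N - n)*log(1 - x)
        log_p = (lgamma(N + 1) - lgamma(n + 1) - lgamma(N - n + 1)
                 + n * log(x) + (N - n) * log(1 - x))
        total += n * exp(log_p)
    return total

print(M_long_stable(0.3, 10000))  # close to 10000 * 0.3 = 3000
```

Each probability is exponentiated only after the huge factorials have cancelled in log space, so no intermediate value overflows.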