is the probability of success, then the mean value, or expectation, of X is E[X]
= p, and its variance is Var[X] = p(1-p).
If an experiment involving X is repeated n times, and k successful outcomes
are recorded, then an estimate of p is given by p' = k/n, while the standard
error of p' is $\sigma_{p'} = \sqrt{p(1-p)/n}$. In practice, the sample
estimate for p, i.e., p', replaces p in the standard error formula.
For a large sample size, n > 30, with $n \cdot p > 5$ and $n \cdot (1-p) > 5$,
the sampling distribution is very nearly normal. Therefore, the
$100(1-\alpha)\%$ central two-sided confidence interval for the population
mean p is $(p' - z_{\alpha/2}\,\sigma_{p'},\; p' + z_{\alpha/2}\,\sigma_{p'})$.
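As an illustration of the large-sample interval, the following minimal Python sketch computes p', its standard error, and the normal-approximation confidence limits. The function name and the sample counts are hypothetical, and the sketch assumes that $z_{\alpha/2}$ denotes the standard normal value with upper-tail probability $\alpha/2$.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(k, n, alpha=0.05):
    """Normal-approximation confidence interval for a proportion.

    k: number of successes, n: number of trials (hypothetical helper).
    """
    p_hat = k / n                               # point estimate p' = k/n
    std_err = sqrt(p_hat * (1 - p_hat) / n)     # standard error with p' in place of p
    z = NormalDist().inv_cdf(1 - alpha / 2)     # critical value z_(alpha/2)
    return p_hat - z * std_err, p_hat + z * std_err


# Hypothetical data: 60 successes in 150 trials, so n > 30,
# n*p' = 60 > 5 and n*(1-p') = 90 > 5.
low, high = proportion_confidence_interval(60, 150)
print(f"95% CI for p: ({low:.4f}, {high:.4f})")
```

With these made-up counts, p' = 0.4 and the standard error is 0.04, so the limits lie about 0.078 on either side of 0.4.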
For a small sample (n < 30), the interval can be estimated as
$(p' - t_{n-1,\alpha/2}\,\sigma_{p'},\; p' + t_{n-1,\alpha/2}\,\sigma_{p'})$.
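The small-sample interval can be sketched in the same way. The Python standard library has no Student's t quantile function, so this hypothetical helper assumes the critical value $t_{n-1,\alpha/2}$ is supplied by the caller, for example read from a t table. The sample counts are made up for illustration.

```python
from math import sqrt


def small_sample_proportion_interval(k, n, t_crit):
    """Interval p' -/+ t_crit * sigma_p' with a caller-supplied t critical value."""
    p_hat = k / n                               # point estimate p' = k/n
    std_err = sqrt(p_hat * (1 - p_hat) / n)     # standard error with p' in place of p
    return p_hat - t_crit * std_err, p_hat + t_crit * std_err


# Hypothetical data: 8 successes in 20 trials, 95% confidence level.
# For n - 1 = 19 degrees of freedom, t_(19, 0.025) is about 2.093
# (value taken from a standard Student's t table).
low, high = small_sample_proportion_interval(8, 20, t_crit=2.093)
print(f"95% CI for p: ({low:.3f}, {high:.3f})")
```

Because the t critical value exceeds the corresponding normal value (about 1.96 at the 95% level), this interval is wider than the normal-approximation one for the same p' and n.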
Sampling distribution of differences and sums of statistics
Let $S_1$ and $S_2$ be independent statistics from two populations based on
samples of sizes $n_1$ and $n_2$, respectively. Also, let the respective means
and standard errors of the sampling distributions of those statistics be
$\mu_{S_1}$ and $\mu_{S_2}$, and $\sigma_{S_1}$ and $\sigma_{S_2}$,
respectively. The differences between the statistics from the two populations,
$S_1 - S_2$, have a sampling distribution with mean
$\mu_{S_1 - S_2} = \mu_{S_1} - \mu_{S_2}$ and standard error
$\sigma_{S_1 - S_2} = (\sigma_{S_1}^2 + \sigma_{S_2}^2)^{1/2}$. Also, the sum
of the statistics, $S_1 + S_2$, has mean
$\mu_{S_1 + S_2} = \mu_{S_1} + \mu_{S_2}$ and standard error
$\sigma_{S_1 + S_2} = (\sigma_{S_1}^2 + \sigma_{S_2}^2)^{1/2}$.
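A short Python sketch of these relations follows. The function name and the numerical values are hypothetical. The point it illustrates is that, for independent statistics, the means subtract or add while the standard errors combine in quadrature in both cases.

```python
from math import sqrt


def combine_statistics(mu_s1, sigma_s1, mu_s2, sigma_s2):
    """Mean and standard error of S1 - S2 and S1 + S2 for independent S1, S2."""
    # The standard error is the same for the difference and the sum.
    std_err = sqrt(sigma_s1**2 + sigma_s2**2)
    return {
        "difference": (mu_s1 - mu_s2, std_err),
        "sum": (mu_s1 + mu_s2, std_err),
    }


# Hypothetical sampling-distribution parameters.
result = combine_statistics(mu_s1=10.0, sigma_s1=3.0, mu_s2=6.0, sigma_s2=4.0)
print(result["difference"])   # (4.0, 5.0)
print(result["sum"])          # (16.0, 5.0)
```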
Estimators for the mean and standard deviation of the difference and sum of
the statistics $S_1$ and $S_2$ are given by:
$$\hat{\mu}_{S_1 \pm S_2} = X_1 \pm X_2, \qquad \hat{\sigma}_{S_1 \pm S_2} = \sqrt{\frac{\sigma_{S_1}^2}{n_1} + \frac{\sigma_{S_2}^2}{n_2}}$$
In these expressions, $X_1$ and $X_2$ are the values of the statistics $S_1$
and $S_2$ from samples taken from the two populations, and $\sigma_{S_1}^2$
and $\sigma_{S_2}^2$ are the variances of the populations of the statistics
$S_1$ and $S_2$ from which the samples were taken.
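The estimators can be evaluated with a few lines of Python. This is only a sketch: the function name is hypothetical, the input values are invented for illustration, and the population variances are assumed to be known.

```python
from math import sqrt


def estimate_sum_and_difference(x1, x2, var1, var2, n1, n2):
    """Estimated mean and standard deviation of S1 - S2 and S1 + S2.

    x1, x2: observed values of the statistics S1 and S2
    var1, var2: population variances (assumed known)
    n1, n2: sample sizes
    """
    # The estimated standard deviation is shared by the sum and the difference.
    std_dev = sqrt(var1 / n1 + var2 / n2)
    return {
        "difference": (x1 - x2, std_dev),
        "sum": (x1 + x2, std_dev),
    }


# Hypothetical inputs: 9/36 = 0.25 and 48/64 = 0.75, so the
# estimated standard deviation is sqrt(1.0) = 1.0.
est = estimate_sum_and_difference(x1=52.0, x2=48.5, var1=9.0, var2=48.0, n1=36, n2=64)
print(est["difference"])   # (3.5, 1.0)
print(est["sum"])          # (100.5, 1.0)
```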