Useful Formulae
Special Functions
• Factorial For positive integer $x$, $x! = x(x-1)\cdots 1$; $0! = 1$.

• Gamma Function For $\alpha > 0$ we define $\Gamma(\alpha) = \int_0^\infty x^{\alpha-1} e^{-x}\,dx$. For all $\alpha > 0$, $\Gamma(\alpha + 1) = \alpha\,\Gamma(\alpha)$. For positive integer $\alpha$, $\Gamma(\alpha) = (\alpha - 1)!$.

• Beta Function Define, for $\alpha > 0$ and $\beta > 0$, $B(\alpha, \beta) = \Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha + \beta)$.

• Binomial Coefficients For any real $r$, $\binom{r}{0} = 1$ and, for positive integer $x$,
$$\binom{r}{x} = \frac{r(r-1)\cdots(r-x+1)}{x!}.$$
For $r > x - 1$ this can be written as $\Gamma(r+1)/[\Gamma(r-x+1)\,x!]$, which in turn can be written as $r!/[(r-x)!\,x!]$ if $r$ is also an integer.
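As a quick numerical sanity check (not part of the original sheet), the gamma recursion, the beta function identity, and the real-argument binomial coefficient can all be verified with Python's standard library; the helper names `beta_fn` and `binom_real` are our own.

```python
from math import gamma, comb, factorial

# Gamma recursion: Gamma(alpha + 1) = alpha * Gamma(alpha) for alpha > 0
alpha = 2.7
assert abs(gamma(alpha + 1) - alpha * gamma(alpha)) < 1e-9

# For positive integer alpha, Gamma(alpha) = (alpha - 1)!
assert gamma(5) == factorial(4)           # 24.0 == 24

# Beta function via gamma functions: B(a, b) = Gamma(a)Gamma(b)/Gamma(a + b)
def beta_fn(a, b):
    return gamma(a) * gamma(b) / gamma(a + b)

assert abs(beta_fn(2, 3) - 1/12) < 1e-12  # B(2, 3) = 1! 2! / 4! = 1/12

# Binomial coefficient for any real r: r(r-1)...(r-x+1) / x!
def binom_real(r, x):
    num = 1.0
    for j in range(x):
        num *= (r - j)
    return num / factorial(x)

assert binom_real(7, 3) == comb(7, 3)     # agrees with math.comb for integer r
print(binom_real(0.5, 2))                 # also defined for non-integer r
```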
Basic properties of random variables
• If $X$ has a discrete distribution then $E[g(X)] = \sum_x g(x)\,P(X = x)$.

• If $X$ has a continuous distribution with density $f_X(x)$ then $E[g(X)] = \int_{-\infty}^{\infty} g(x)\,f_X(x)\,dx$.

• If $E(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2$ then $E(a + bX) = a + b\mu$ and $\mathrm{Var}(a + bX) = b^2\sigma^2$.

• $E(X + Y) = E(X) + E(Y)$ for any $X$ and $Y$.

• $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$.

• $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$.

• $\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y)$.

• If $X$ and $Y$ are independent then $\mathrm{Cov}(X, Y) = 0$.

• $\mathrm{Corr}(X, Y) = \mathrm{Cov}(X, Y)\big/\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}$.
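These identities can be checked exactly on a small discrete joint distribution (the pmf below is an arbitrary illustration, not taken from the sheet):

```python
# A small joint pmf for (X, Y), chosen arbitrarily for illustration
pmf = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}

def E(g):
    """Expectation of g(X, Y) under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

EX, EY = E(lambda x, y: x), E(lambda x, y: y)
VarX = E(lambda x, y: x * x) - EX ** 2            # Var(X) = E(X^2) - [E(X)]^2
CovXY = E(lambda x, y: x * y) - EX * EY           # Cov(X, Y) = E(XY) - E(X)E(Y)

# E(a + bX) = a + b E(X)
a, b = 3.0, 2.0
assert abs(E(lambda x, y: a + b * x) - (a + b * EX)) < 1e-12

# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
VarY = E(lambda x, y: y * y) - EY ** 2
VarSum = E(lambda x, y: (x + y) ** 2) - E(lambda x, y: x + y) ** 2
assert abs(VarSum - (VarX + VarY + 2 * CovXY)) < 1e-12
```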
Finite Populations

Let $\mathcal{X} = \{x_1, \ldots, x_N\}$ denote a population of real numbers with mean $\mu = \frac{1}{N}\sum_{j=1}^N x_j$ and (population) variance $\sigma^2 = \frac{1}{N}\sum_{j=1}^N (x_j - \mu)^2$. If $Y_1, \ldots, Y_n$ denotes a random sample taken from $\mathcal{X}$, either with or without replacement, then with $T = \sum_{i=1}^n Y_i$ we have $E(T) = n\mu$.
Sampling with replacement
If the sampling is taken with replacement then

• $Y_1, \ldots, Y_n$ are iid;

• $\mathrm{Var}(T) = n\sigma^2$;

• if all $x_j$'s are 0 or 1 then, with $p = \mu$, $T$ has a binomial distribution and for appropriate $t$, $P(T = t) = \binom{n}{t} p^t (1-p)^{n-t}$.
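The binomial case can be verified from the pmf alone (a sketch with illustrative values of $n$ and $p$, not from the sheet): the probabilities sum to 1, and $E(T) = n\mu$ and $\mathrm{Var}(T) = n\sigma^2$ hold with $\mu = p$ and $\sigma^2 = p(1-p)$ for a 0/1 population.

```python
from math import comb

# Sampling n times WITH replacement from a 0/1 population with mean p:
# T ~ Binomial(n, p)
n, p = 10, 0.3
pmf = [comb(n, t) * p**t * (1 - p)**(n - t) for t in range(n + 1)]

assert abs(sum(pmf) - 1) < 1e-12
ET = sum(t * pmf[t] for t in range(n + 1))
VarT = sum(t * t * pmf[t] for t in range(n + 1)) - ET ** 2
assert abs(ET - n * p) < 1e-9                  # E(T) = n mu
assert abs(VarT - n * p * (1 - p)) < 1e-9      # Var(T) = n sigma^2
```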
Sampling without replacement
If the sampling is taken without replacement then

• $\mathrm{Var}(T) = n\sigma^2\,\frac{N-n}{N-1}$;

• if all $x_j$'s are 0 or 1 then, with $M = N\mu$, $T$ has a hypergeometric distribution and for appropriate $t$, $P(T = t) = \binom{M}{t}\binom{N-M}{n-t}\big/\binom{N}{n}$;

• if $x_j = j$ for $j = 1, \ldots, N$ then the $i$-th smallest observation $Y_{(i)}$ satisfies

– $P(Y_{(i)} = y) = \binom{y-1}{i-1}\binom{N-y}{n-i}\big/\binom{N}{n}$ for appropriate $y$;

– $E(Y_{(i)}) = \frac{i(N+1)}{n+1}$ and $\mathrm{Var}(Y_{(i)}) = \frac{i(N+1)(n-i+1)(N-n)}{(n+1)^2(n+2)}$.
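Both without-replacement results can be checked exactly by enumerating the relevant pmfs (illustrative values of $N$, $M$, $n$, $i$ below; none come from the sheet):

```python
from math import comb

# Sampling n from a 0/1 population of size N containing M ones, WITHOUT
# replacement: T ~ Hypergeometric(N, M, n)
N, M, n = 20, 8, 5
lo, hi = max(0, n - (N - M)), min(n, M)
pmf = {t: comb(M, t) * comb(N - M, n - t) / comb(N, n) for t in range(lo, hi + 1)}

assert abs(sum(pmf.values()) - 1) < 1e-12
mu = M / N
var = mu * (1 - mu)                       # population variance of 0/1 values
ET = sum(t * p for t, p in pmf.items())
VarT = sum(t * t * p for t, p in pmf.items()) - ET ** 2
assert abs(ET - n * mu) < 1e-9                           # E(T) = n mu
assert abs(VarT - n * var * (N - n) / (N - 1)) < 1e-9    # finite-population correction

# Order statistic when the population is {1, ..., N}: check E(Y_(i)) = i(N+1)/(n+1)
i = 2
pmf_y = {y: comb(y - 1, i - 1) * comb(N - y, n - i) / comb(N, n)
         for y in range(i, N - (n - i) + 1)}
assert abs(sum(pmf_y.values()) - 1) < 1e-12
EY = sum(y * p for y, p in pmf_y.items())
assert abs(EY - i * (N + 1) / (n + 1)) < 1e-9
```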
Probability Distributions

Discrete Distributions
• Beta Binomial $X$ has a beta-binomial distribution if, for positive $\alpha$, $\beta$ and integer $n$,
$$P(X = x) = \frac{\binom{\alpha+x-1}{x}\binom{\beta+n-x-1}{n-x}}{\binom{\alpha+\beta+n-1}{n}} = \binom{n}{x}\frac{B(\alpha+x,\,\beta+n-x)}{B(\alpha,\beta)} \quad\text{for } x = 0, 1, \ldots, n;$$
then $E(X) = np$ and $\mathrm{Var}(X) = np(1-p)\,\frac{N+n}{N+1}$ where $N = \alpha + \beta$ and $p = \alpha/N$.
• Poisson $X$ has a Poisson distribution if, for $x = 0, 1, 2, \ldots$, $P(X = x) = e^{-\mu}\mu^x/x!$, in which case $E(X) = \mathrm{Var}(X) = \mu$.

• Negative Binomial $X$ has a negative binomial distribution if, for $k > 0$, $0 < p < 1$ and $x = 0, 1, 2, \ldots$, $P(X = x) = \binom{x+k-1}{x}(1-p)^x p^k$, in which case $E(X) = k(1-p)/p$ and $\mathrm{Var}(X) = k(1-p)/p^2$. For integer values of $k$, $Y = X + k$ is also called negative binomial. The geometric distribution is negative binomial with $k = 1$.
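Of these, the beta-binomial pmf is the easiest to get wrong, so here is an exact check (illustrative parameter values, not from the sheet) that it sums to 1 and has the stated mean and variance:

```python
from math import comb, gamma

def beta_fn(a, b):
    """B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return gamma(a) * gamma(b) / gamma(a + b)

# Beta-binomial: P(X = x) = C(n, x) B(alpha + x, beta + n - x) / B(alpha, beta)
alpha, beta, n = 2.0, 3.0, 10
pmf = [comb(n, x) * beta_fn(alpha + x, beta + n - x) / beta_fn(alpha, beta)
       for x in range(n + 1)]

assert abs(sum(pmf) - 1) < 1e-9
Ntot, p = alpha + beta, alpha / (alpha + beta)
EX = sum(x * px for x, px in enumerate(pmf))
VarX = sum(x * x * px for x, px in enumerate(pmf)) - EX ** 2
assert abs(EX - n * p) < 1e-9                                   # E(X) = np
assert abs(VarX - n * p * (1 - p) * (Ntot + n) / (Ntot + 1)) < 1e-9
```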
Continuous Distributions
• If $X$ has density $f_X(x)$ and $Y = \mu + \sigma X$ then $Y$ has density $f_Y(y) = \frac{1}{\sigma} f_X\!\left(\frac{y-\mu}{\sigma}\right)$.

• Uniform $X \sim U(0,1)$ means $X$ has density $f_X(x) = 1$ for $0 < x < 1$, 0 otherwise. $E(X) = \frac{1}{2}$ and $\mathrm{Var}(X) = \frac{1}{12}$. $Y \sim U(a, b)$ means that $(Y - a)/(b - a) \sim U(0,1)$.

• Normal $X \sim N(0,1)$ means $X$ has density $f_X(x) = (2\pi)^{-1/2} e^{-\frac{1}{2}x^2}$. $E(X) = 0$ and $\mathrm{Var}(X) = 1$. $Y \sim N(\mu, \sigma^2)$ means $(Y - \mu)/\sigma \sim N(0,1)$.

• Beta
– $U \sim \mathrm{Beta}(\alpha, \beta)$ means $U$ has pdf $u^{\alpha-1}(1-u)^{\beta-1}/B(\alpha, \beta)$ over the interval $(0,1)$ and 0 otherwise, where the normalising constant is the beta function.

– If $U \sim \mathrm{Beta}(\alpha, \beta)$ then for appropriate $k$, $E(U^k) = B(\alpha+k, \beta)/B(\alpha, \beta)$. In particular, $E(U) = \frac{\alpha}{\alpha+\beta}$ and $\mathrm{Var}(U) = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$.

– If $U_1, \ldots, U_n$ are independent $U(0,1)$ then the $i$-th order statistic $U_{(i)} \sim \mathrm{Beta}(i, n - i + 1)$.
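The moment formula gives the mean and variance directly, and the order-statistic result then implies $E(U_{(i)}) = i/(n+1)$; a quick numerical check (illustrative parameter values, not from the sheet):

```python
from math import gamma

def B(a, b):
    """Beta function via gamma functions."""
    return gamma(a) * gamma(b) / gamma(a + b)

# Moments of U ~ Beta(alpha, beta): E(U^k) = B(alpha + k, beta) / B(alpha, beta)
alpha, beta = 2.5, 4.0
EU = B(alpha + 1, beta) / B(alpha, beta)
EU2 = B(alpha + 2, beta) / B(alpha, beta)

assert abs(EU - alpha / (alpha + beta)) < 1e-12
VarU = EU2 - EU ** 2
assert abs(VarU - alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))) < 1e-12

# Uniform order statistic: U_(i) ~ Beta(i, n - i + 1), so E(U_(i)) = i/(n + 1)
i, n = 3, 9
assert abs(B(i + 1, n - i + 1) / B(i, n - i + 1) - i / (n + 1)) < 1e-12
```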
• Gamma

– If, for $\alpha > 0$, $X$ has density $f_X(x) = x^{\alpha-1} e^{-x}/\Gamma(\alpha)$ for $x > 0$, 0 otherwise, then we say $X$ has a gamma distribution with shape $\alpha$ and unit rate/scale, and we write $X \sim \mathrm{gamma}(\alpha, 1)$.

– If $X \sim \mathrm{gamma}(\alpha, 1)$ then $E(X^k) = \Gamma(\alpha + k)/\Gamma(\alpha)$ (for $k > -\alpha$), so $E(X) = \mathrm{Var}(X) = \alpha$.

– If $X_1 \sim \mathrm{gamma}(\alpha_1, 1)$ and $X_2 \sim \mathrm{gamma}(\alpha_2, 1)$ are independent then $X_1 + X_2 \sim \mathrm{gamma}(\alpha_1 + \alpha_2, 1)$.

– If $X \sim \mathrm{gamma}(\alpha, 1)$ then

∗ $Y = X/\lambda$ is said to be gamma with shape parameter $\alpha$ and rate parameter $\lambda$;

∗ $Z = \beta X$ is said to be gamma with shape parameter $\alpha$ and scale parameter $\beta$.
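The moment formula can be checked with `math.gamma` (illustrative shape value, not from the sheet); note it recovers $E(X) = \alpha$ and $\mathrm{Var}(X) = \alpha$ and also covers fractional moments:

```python
from math import gamma

# Moments of X ~ gamma(alpha, 1): E(X^k) = Gamma(alpha + k) / Gamma(alpha), k > -alpha
alpha = 3.2
EX = gamma(alpha + 1) / gamma(alpha)        # equals alpha by the gamma recursion
EX2 = gamma(alpha + 2) / gamma(alpha)       # equals alpha * (alpha + 1)

assert abs(EX - alpha) < 1e-9
assert abs((EX2 - EX ** 2) - alpha) < 1e-9  # Var(X) = alpha

# Fractional moments also work for k > -alpha, e.g. k = 0.5:
half_moment = gamma(alpha + 0.5) / gamma(alpha)
print(half_moment)
```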
• Exponential is the same as gamma with shape parameter 1.

Convergence in Probability
• A sequence of random variables $X_1, X_2, \ldots$ is said to converge to $\mu$ in probability if for any $\varepsilon > 0$, $P(|X_n - \mu| > \varepsilon) \to 0$ as $n \to \infty$.

• If $X_n \xrightarrow{P} c$ and a function $g(\cdot)$ is such that $\lim_{x \to c} g(x) = \ell$ then $g(X_n) \xrightarrow{P} \ell$.

• Delta Method If $X_n \xrightarrow{P} c$ and $g(\cdot)$ is differentiable at $c$ then for large $n$ we may use the approximation
$$\frac{g(X_n) - g(c)}{X_n - c} \approx g'(c).$$
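The delta-method approximation can be seen numerically: as $X_n$ gets close to $c$, the ratio approaches $g'(c)$. The choice $g(x) = e^x$, $c = 1$ below is purely illustrative, not from the sheet.

```python
from math import exp

# Delta method: [g(X_n) - g(c)] / (X_n - c) ~ g'(c) when X_n is near c.
g, c, g_prime_c = exp, 1.0, exp(1.0)    # g(x) = e^x, so g'(c) = e

for x_n in (1.1, 1.01, 1.001):          # x_n -> c, as X_n would in probability
    ratio = (g(x_n) - g(c)) / (x_n - c)
    print(x_n, ratio, abs(ratio - g_prime_c))

# The error shrinks roughly linearly in |x_n - c| (it is about g''(c)(x_n - c)/2).
```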
Inequalities
• Markov's Inequality If $X$ is a random variable only taking non-negative values then for any $c > 0$, $P(X \geq c) \leq E(X)/c$.

• Chebyshev's Inequality If $X$ has $E(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2 < \infty$ then for any $k > 0$, $P(|X - \mu| \geq k\sigma) \leq 1/k^2$.

• Cauchy-Schwarz Inequality If $X$ and $Y$ are random variables with $E(X^2) < \infty$ and $E(Y^2) < \infty$ then $[E(XY)]^2 \leq E(X^2)E(Y^2)$, with equality if and only if $Y = cX$ for some constant $c$.

• Cramér-Rao Inequality If $\ell'_\theta(y)$ denotes the derivative with respect to $\theta$ of the log-likelihood for a one-parameter family indexed by $\theta$ and $t(Y)$ is an unbiased estimator of $\theta$ then
$$\mathrm{Var}_\theta[t(Y)] \geq \frac{1}{\mathrm{Var}_\theta[\ell'_\theta(Y)]}$$
with equality if and only if $\ell'_\theta(y) = C_\theta[t(y) - \theta]$ for some $C_\theta$ not depending on $y$.
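Markov's and Chebyshev's inequalities can be verified exactly on a small discrete distribution (the pmf below is arbitrary, chosen only for illustration):

```python
from math import sqrt

# An arbitrary distribution on a few non-negative values
pmf = {0: 0.5, 1: 0.2, 4: 0.2, 10: 0.1}

EX = sum(x * p for x, p in pmf.items())
VarX = sum(x * x * p for x, p in pmf.items()) - EX ** 2
sd = sqrt(VarX)

# Markov: P(X >= c) <= E(X)/c for any c > 0
for c in (1.0, 2.0, 5.0):
    tail = sum(p for x, p in pmf.items() if x >= c)
    assert tail <= EX / c + 1e-12

# Chebyshev: P(|X - mu| >= k sigma) <= 1/k^2 for any k > 0
for k in (1.0, 1.5, 2.0):
    tail = sum(p for x, p in pmf.items() if abs(x - EX) >= k * sd)
    assert tail <= 1 / k ** 2 + 1e-12
```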