
Statistical Inference

Statistics

Definition

$T:\R^n\to \R^m$ with $T(\utilde{X})=(T_1(\utilde{X}),T_2(\utilde{X}),\ldots,T_m(\utilde{X}))$ is called a statistic of $\utilde{X}=(X_1,X_2,\ldots,X_n)$.

Usually, $m\le n$.

  • e.g. Having observed the sample $X_1,\ldots,X_{10}$ (so $n=10$), define the statistic $T(\utilde{X})=(\frac{1}{n}\sum_{i=1}^{10}X_i,\frac{1}{n}\sum_{i=1}^{10}X_i^2)$, i.e. $T: \R^{10}\to\R^2$.
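As a small illustration of a statistic as a map from $\R^{10}$ to $\R^2$, the sketch below computes the pair (sample mean, mean of squares). The function name `moment_statistic` and the sample values are hypothetical, chosen only for this example.

```python
def moment_statistic(sample):
    """Map a sample (x_1, ..., x_n) to T(x) = (mean of x_i, mean of x_i^2)."""
    n = len(sample)
    t1 = sum(sample) / n
    t2 = sum(x * x for x in sample) / n
    return (t1, t2)


# Hypothetical observed sample of size n = 10.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
t = moment_statistic(sample)
print(t)  # (5.5, 38.5)
```

Ten numbers are reduced to two, which is the sense in which usually $m\le n$.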

Sufficient Statistics

  • Idea: The sample $\utilde{X}$ contains all the information about the parameter $\theta$. We want the statistic we use to retain all of that information about $\theta$.
Definition

A statistic $T=T(\utilde{X})$ is sufficient for $\theta$ (or sufficient for $\mathscr{F}$, where $\mathscr{F}=\set{f(\cdot;\theta):\theta\in\Omega}$)

iff the conditional distribution of $\utilde{X}$ given $T(\utilde{X})$ does not depend on $\theta$.

i.e. $P_\theta(\utilde{X}\le\utilde{x}|T=t)\perp\theta$, where $\perp\theta$ means "does not depend on $\theta$".

All the information about $\theta$ is already carried by $T$, so $P_\theta(\utilde{X}\le\utilde{x}|T=t)$ does not change with $\theta$.

Remark

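Finding a sufficient statistic directly from the definition is difficult, because it takes two steps:

  1. Guess a statistic $T=T(\utilde{X})$ (by intuition).
  2. Compute $f_{\utilde{X}|T}(\utilde{x}|t)$ and check that it is free of $\theta$ (this may involve heavy computation).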
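The two steps above can be sketched for a small discrete case. The example assumes iid Bernoulli($p$) observations and the guess $T=\sum_i X_i$; the function name `conditional_dist_given_sum` is hypothetical. It enumerates all outcomes and computes the conditional distribution exactly with fractions, for two different values of $p$.

```python
from fractions import Fraction
from itertools import product


def conditional_dist_given_sum(n, p, t):
    """P_p(X = x | sum(X) = t) for X_1..X_n iid Bernoulli(p), by enumeration."""
    joint = {}
    for x in product((0, 1), repeat=n):
        s = sum(x)
        if s == t:
            # Joint pmf of the outcome x under parameter p.
            joint[x] = p**s * (1 - p) ** (n - s)
    total = sum(joint.values())  # P_p(T = t)
    return {x: prob / total for x, prob in joint.items()}


d1 = conditional_dist_given_sum(4, Fraction(1, 5), 2)
d2 = conditional_dist_given_sum(4, Fraction(7, 10), 2)
print(d1 == d2)          # True: the conditional law is the same for both p
print(set(d1.values()))  # {Fraction(1, 6)}: uniform over the 6 arrangements
```

Even in this tiny case the check requires enumerating the whole sample space, which is why a shortcut such as a factorization criterion is useful.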
Theorem

Factorization Theorem

Let $\utilde{X}=(X_1,\ldots,X_n)$ be a random sample with pdf $f(\utilde{x};\theta)$, $\theta\in\Omega\subseteq\R^r$.

A statistic $T=T(\utilde{X})=(T_1(\utilde{X}),\ldots,T_m(\utilde{X}))$ is sufficient for $\theta$ $\iff$ $\exists$ functions $g(t;\theta)\ge 0$ and $h(\utilde{x})$ such that $f(\utilde{x};\theta)=g(t;\theta)h(\utilde{x})$

$\forall \utilde{x}\in\R^n, \theta\in\Omega$, where $t=T(\utilde{x})$.

$f(\utilde{x};\theta)=g(t;\theta)h(\utilde{x})$ $\iff$ $T=T(\utilde{X})$ is sufficient for $\theta$.

$\implies$ if $T=H(u)$ with $u=u(\utilde{X})$, then

$$
\begin{align*}
f(\utilde{x};\theta)&=g(t;\theta)h(\utilde{x})\\
&=g^*(u;\theta)h(\utilde{x})
\end{align*}
$$

where $g^*(u;\theta)=g(H(u);\theta)$.

$\implies u=u(\utilde{X})$ is also sufficient for $\theta$.

And if $T^*=T^*(\utilde{X})\xleftrightarrow{1-1} T(\utilde{X})$, then $f(\utilde{x};\theta)=g^*(t^*;\theta)h(\utilde{x})$.

$\implies T^*$ is also sufficient for $\theta$.

Corollary
  1. $T$ is sufficient for $\theta$ and $T^*=H(T)$ with $H$ one-to-one $\implies$ $T^*$ is also sufficient for $\theta$.
  2. $T$ is sufficient for $\theta$, $u$ is a statistic, and $T=H(u)$ for some function $H$ $\implies$ $u$ is also sufficient for $\theta$.

EX: $X_1,\ldots,X_n\stackrel{\text{iid}}{\sim} B(1,p)$, $p\in\Omega=(0,1)$

$f(\utilde{x};p)=p^{\sum_{i=1}^n x_i}(1-p)^{n-\sum_{i=1}^n x_i}\triangleq g(t;p)h(\utilde{x}), \forall p\in(0,1)$

with $t=\sum_{i=1}^n x_i$ and $h(\utilde{x})=1$.

$\implies T=T(\utilde{X})=\sum_{i=1}^n X_i$ is sufficient for $p$.

$\implies \bar{X}=\frac{1}{n}\sum_{i=1}^n X_i$ is also sufficient for $p$.

$\implies e^{\bar{X}}$ is also sufficient for $p$.
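A quick numerical check of this factorization, as a sketch only: the helper names `bernoulli_likelihood` and `g` and the two samples are hypothetical. The joint pmf is computed observation by observation and compared with $g(t;p)=p^t(1-p)^{n-t}$, using exact fractions.

```python
from fractions import Fraction


def bernoulli_likelihood(x, p):
    """Joint pmf of iid Bernoulli(p) observations x = (x_1, ..., x_n)."""
    out = Fraction(1)
    for xi in x:
        out *= p if xi == 1 else (1 - p)
    return out


def g(t, n, p):
    """Factor that depends on the data only through t = sum(x)."""
    return p**t * (1 - p) ** (n - t)


x = (1, 0, 1, 1, 0)
y = (0, 1, 1, 0, 1)  # a different sample with the same sum
for p in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    # h(x) = 1, so the likelihood equals g(t; p) exactly.
    assert bernoulli_likelihood(x, p) == g(sum(x), len(x), p)
    # Samples sharing the same t have identical likelihoods for every p.
    assert bernoulli_likelihood(x, p) == bernoulli_likelihood(y, p)
print("factorization holds on these samples")
```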

or

$f(\utilde{x};p)=p^{x_1+\sum_{i=2}^n x_i}(1-p)^{n-(x_1+\sum_{i=2}^n x_i)}\triangleq g(t;p)h(\utilde{x}), \forall p\in(0,1)$

with $t=(t_1,t_2)=(x_1,\sum_{i=2}^n x_i)$, $g(t;p)=g(t_1,t_2;p)=p^{t_1+t_2}(1-p)^{n-(t_1+t_2)}$

$\implies T=(T_1, T_2)=(X_1,\sum_{i=2}^n X_i)$ is sufficient for $p$.


EX: $X_1,\ldots,X_n\stackrel{\text{iid}}{\sim} P(\lambda)$, $\lambda\in\Omega=(0,\infty)$

$f(\utilde{x};\lambda)=\frac{e^{-n\lambda}\lambda^{\sum_{i=1}^n x_i}}{\prod_{i=1}^n x_i!}\triangleq g(t;\lambda)h(\utilde{x}), \forall \lambda\in(0,\infty)$

with $t=\sum_{i=1}^n x_i$, $g(t;\lambda)=e^{-n\lambda}\lambda^{t}$ and $h(\utilde{x})=1/\prod_{i=1}^n x_i!$, so $T=\sum_{i=1}^n X_i$ is sufficient for $\lambda$ $\implies \bar{X}$ is also sufficient for $\lambda$.
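The Poisson factorization can be checked numerically. In this sketch (the function name `poisson_likelihood` and the two samples are hypothetical) two samples with the same sum have a likelihood ratio that stays constant as $\lambda$ varies, because only the factor $1/\prod_i x_i!$ differs between them.

```python
import math


def poisson_likelihood(x, lam):
    """Joint pmf of iid Poisson(lam) observations x = (x_1, ..., x_n)."""
    out = 1.0
    for xi in x:
        out *= math.exp(-lam) * lam**xi / math.factorial(xi)
    return out


x = (0, 3, 2)  # sum is 5, product of factorials is 12
y = (1, 1, 3)  # sum is 5, product of factorials is 6
ratios = [
    poisson_likelihood(x, lam) / poisson_likelihood(y, lam)
    for lam in (0.5, 1.0, 4.0)
]
print(ratios)  # each entry is 6/12 = 0.5 up to floating-point rounding
```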


EX: $X_1,\ldots,X_n\stackrel{\text{iid}}{\sim} N(\mu,\sigma^2)$, $(\mu,\sigma)\in\Omega=\R\times(0,\infty)$

$f(\utilde{x};\mu,\sigma^2)=(\frac{1}{\sqrt{2\pi}\sigma})^ne^{-\frac{1}{2\sigma^2}\sum_{i=1}^n(x_i-\mu)^2}$
  • $\mu=\mu_0$ known, $\theta=\sigma^2$ unknown

    $f(\utilde{x};\sigma^2)=(\frac{1}{\sqrt{2\pi}\sigma})^ne^{-\frac{1}{2\sigma^2}\sum_{i=1}^n(x_i-\mu_0)^2}\cdot 1\triangleq g(t;\sigma^2)h(\utilde{x})$

    with $t=\sum_{i=1}^n(x_i-\mu_0)^2$, so $T=\sum_{i=1}^n(X_i-\mu_0)^2$ is sufficient for $\sigma^2$.

  • $\sigma^2=\sigma_0^2$ known, $\theta=\mu$ unknown

    $$\begin{align*} f(\utilde{x};\mu)&=(\frac{1}{\sqrt{2\pi}\sigma_0})^ne^{-\frac{1}{2\sigma_0^2}\sum_{i=1}^n(x_i-\mu)^2}\\ &=\exp(-\frac{1}{2\sigma_0^2}(-2\mu\sum_{i=1}^n x_i+n\mu^2))\cdot(\frac{1}{\sqrt{2\pi}\sigma_0})^n\exp(-\frac{1}{2\sigma_0^2}\sum_{i=1}^n x_i^2)\\ &\triangleq g(t;\mu)h(\utilde{x}) \end{align*}$$

    with $t=\sum_{i=1}^nx_i$, so $T=\sum_{i=1}^n X_i$ is sufficient for $\mu$ and $\bar{X}$ is also sufficient for $\mu$.

  • $\theta=(\mu,\sigma^2)$ unknown

    $$\begin{align*} f(\utilde{x};\mu,\sigma^2)&=(\frac{1}{\sqrt{2\pi}\sigma})^ne^{-\frac{1}{2\sigma^2}\sum_{i=1}^n(x_i-\mu)^2}\\ &=(\frac{1}{\sigma})^n\exp(-\frac{n-1}{2\sigma^2}S^2-\frac{n}{2\sigma^2}(\bar{X}-\mu)^2)(\frac{1}{\sqrt{2\pi}})^n\\ &\triangleq g(t;\mu,\sigma^2)h(\utilde{x}) \end{align*}$$

    with $t=(t_1, t_2)=(\bar{X}, S^2)$ $\implies T=(\bar{X},S^2)$ is sufficient for $\theta=(\mu, \sigma^2)$.
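The second equality in the case where both parameters are unknown rests on a decomposition of the sum of squares. A short derivation, written here with lowercase $\bar{x}$ and $s^2$ for the observed values of $\bar{X}$ and $S^2$ (a notational assumption of this sketch):

```latex
\begin{align*}
\sum_{i=1}^n (x_i-\mu)^2
&=\sum_{i=1}^n \bigl((x_i-\bar{x})+(\bar{x}-\mu)\bigr)^2\\
&=\sum_{i=1}^n (x_i-\bar{x})^2
  +2(\bar{x}-\mu)\sum_{i=1}^n (x_i-\bar{x})
  +n(\bar{x}-\mu)^2\\
&=(n-1)s^2+n(\bar{x}-\mu)^2
\end{align*}
```

The cross term vanishes because $\sum_{i=1}^n (x_i-\bar{x})=0$, and $\sum_{i=1}^n (x_i-\bar{x})^2=(n-1)s^2$ by the definition of the sample variance.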

Remark

$S^2=\frac{1}{n-1}\sum_{i=1}^n (X_i-\bar{X})^2=\frac{1}{n-1}(\sum_{i=1}^n X_i^2-n\bar{X}^2)$ and $\bar{X}=\frac{1}{n}\sum_{i=1}^n X_i$

$\implies(\bar{X}, S^2)\xleftrightarrow{1-1}(\bar{X}, \sum_{i=1}^nX_i^2)\xleftrightarrow{1-1}(\sum_{i=1}^n X_i, \sum_{i=1}^n X_i^2)$, so each of these pairs is sufficient for $(\mu,\sigma^2)$.
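The one-to-one correspondence between $(\bar{X},S^2)$ and $(\sum X_i,\sum X_i^2)$ can be sketched as a pair of inverse maps. The function names and the data below are hypothetical; `statistics.mean` and `statistics.variance` from the Python standard library serve as the reference values.

```python
import math
import statistics


def mean_and_var_from_sums(n, s1, s2):
    """Recover (x_bar, S^2) from (sum x_i, sum x_i^2)."""
    xbar = s1 / n
    s_sq = (s2 - n * xbar**2) / (n - 1)
    return xbar, s_sq


def sums_from_mean_and_var(n, xbar, s_sq):
    """Inverse map: recover (sum x_i, sum x_i^2) from (x_bar, S^2)."""
    return n * xbar, (n - 1) * s_sq + n * xbar**2


data = [2.0, 4.0, 4.0, 5.0, 10.0]
n = len(data)
s1 = sum(data)                 # 25.0
s2 = sum(x * x for x in data)  # 161.0

xbar, s_sq = mean_and_var_from_sums(n, s1, s2)
print(xbar, s_sq)  # 5.0 9.0
print(sums_from_mean_and_var(n, xbar, s_sq))  # (25.0, 161.0)
```

Because each pair can be computed from the other, neither loses information relative to the other.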

EX: $X_1,\ldots,X_n\stackrel{\text{iid}}{\sim} U(0,\theta), \theta>0$

$$
\begin{align*}
f(\utilde{x};\theta)&=\prod_{i=1}^n\frac{1}{\theta}I_{(0,\theta)}(x_i)\\
&=\frac{1}{\theta^n}I(x_{(n)}< \theta)I(x_{(1)}> 0)\\
&\triangleq g(t;\theta)h(\utilde{x})
\end{align*}
$$

with $t=x_{(n)}$, $g(t;\theta)=\frac{1}{\theta^n}I(t<\theta)$ and $h(\utilde{x})=I(x_{(1)}>0)$ $\implies T=X_{(n)}$ is sufficient for $\theta$.
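A minimal sketch of why only the sample maximum matters for $U(0,\theta)$: the hypothetical function `uniform_likelihood` below evaluates the joint pdf (support taken as the open interval $(0,\theta)$), and two made-up samples with the same maximum give the same likelihood for every $\theta$ tried.

```python
def uniform_likelihood(x, theta):
    """Joint pdf of iid U(0, theta) observations, support taken as (0, theta)."""
    if min(x) <= 0 or max(x) >= theta:
        return 0.0  # some observation falls outside the support
    return theta ** (-len(x))


x = (0.2, 1.5, 0.9)
y = (1.5, 0.1, 0.1)  # a different sample with the same maximum, 1.5
for theta in (1.0, 1.6, 2.0, 5.0):
    # The likelihood is 0 when theta <= 1.5 and theta**-3 otherwise,
    # for both samples alike.
    assert uniform_likelihood(x, theta) == uniform_likelihood(y, theta)
print(uniform_likelihood(x, 2.0))  # 0.125
```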


EX: $X_1,\ldots,X_n\stackrel{\text{iid}}{\sim} U(\alpha,\beta), \alpha<\beta$

  • $\theta=\alpha$ unknown, $\beta$ known

    $\implies X_{(1)}$ is sufficient for $\alpha$.

  • $\theta=\beta$ unknown, $\alpha$ known

    $\implies X_{(n)}$ is sufficient for $\beta$.

  • $\theta=(\alpha,\beta)$ unknown

    $\implies (X_{(1)}, X_{(n)})$ is sufficient for $\theta$.
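The three cases above follow from one factorization of the joint pdf, sketched here in the same indicator notation as the $U(0,\theta)$ example (the support is taken as the open interval $(\alpha,\beta)$):

```latex
\begin{align*}
f(\utilde{x};\alpha,\beta)
&=\prod_{i=1}^n\frac{1}{\beta-\alpha}I_{(\alpha,\beta)}(x_i)\\
&=\frac{1}{(\beta-\alpha)^n}\,I(x_{(1)}>\alpha)\,I(x_{(n)}<\beta)
\end{align*}
```

When $\beta$ is known, take $g(t;\alpha)=\frac{1}{(\beta-\alpha)^n}I(t>\alpha)$ with $t=x_{(1)}$ and $h(\utilde{x})=I(x_{(n)}<\beta)$. When $\alpha$ is known, take $g(t;\beta)=\frac{1}{(\beta-\alpha)^n}I(t<\beta)$ with $t=x_{(n)}$ and $h(\utilde{x})=I(x_{(1)}>\alpha)$. When both are unknown, the whole expression is $g(t;\alpha,\beta)$ with $t=(x_{(1)},x_{(n)})$ and $h(\utilde{x})=1$.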

Minimal Sufficient Statistics

Definition

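A sufficient statistic $T^*$ is called minimal sufficient for $\theta$

iff $\forall$ sufficient statistic $T$ for $\theta$, $\exists$ a function $h$ such that $T^*=h(T)$.

A minimal sufficient statistic can be computed from any other sufficient statistic and cannot be reduced any further without losing sufficiency. In other words, it is the coarsest summary of the data that is still sufficient.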

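Danger

"Minimal" does not mean that the statistic has the smallest dimension. It means that the statistic retains the least information while still being sufficient.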
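To illustrate the relation $T^*=h(T)$ in the definition, the sketch below uses the two sufficient statistics from the Bernoulli example, $\sum_{i=1}^n X_i$ and $(X_1,\sum_{i=2}^n X_i)$. The function names `total`, `split` and `h` are hypothetical. The sum is a function of the pair, but the pair cannot be recovered from the sum, so the pair is sufficient without being minimal.

```python
def total(x):
    """T*(x) = sum of all observations."""
    return sum(x)


def split(x):
    """T(x) = (x_1, sum of the remaining observations)."""
    return (x[0], sum(x[1:]))


def h(t):
    """Reduction h with T* = h(T)."""
    return t[0] + t[1]


x = (1, 0, 1, 0)
y = (0, 1, 1, 0)

print(h(split(x)) == total(x), h(split(y)) == total(y))  # True True
print(total(x) == total(y))  # True: T* does not distinguish x from y
print(split(x), split(y))    # (1, 1) (0, 2): T does distinguish them
```

Since `split` separates two samples that `total` identifies, no function of the sum can return the pair.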

Theorem

Let $\utilde{X}\stackrel{\text{iid}}{\sim} f(\utilde{x};\theta)$, $\theta\in\Omega$. Suppose $\exists$ a statistic $T=T(\utilde{X})$ such that, for every pair of sample points $\utilde{x},\utilde{y}$, $\frac{f(\utilde{x};\theta)}{f(\utilde{y};\theta)}\perp\theta\iff T(\utilde{x})=T(\utilde{y})$,

then $T=T(\utilde{X})$ is minimal sufficient for $\theta$.
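As a worked instance of this criterion, take the Bernoulli example $X_1,\ldots,X_n\stackrel{\text{iid}}{\sim} B(1,p)$ with $p\in(0,1)$. For two sample points $\utilde{x},\utilde{y}\in\set{0,1}^n$ the likelihood ratio is:

```latex
\begin{align*}
\frac{f(\utilde{x};p)}{f(\utilde{y};p)}
&=\frac{p^{\sum_{i=1}^n x_i}(1-p)^{n-\sum_{i=1}^n x_i}}
       {p^{\sum_{i=1}^n y_i}(1-p)^{n-\sum_{i=1}^n y_i}}\\
&=\left(\frac{p}{1-p}\right)^{\sum_{i=1}^n x_i-\sum_{i=1}^n y_i}
\end{align*}
```

Since $p/(1-p)$ takes every value in $(0,\infty)$ as $p$ ranges over $(0,1)$, the ratio is free of $p$ exactly when the exponent is zero, i.e. when $\sum_{i=1}^n x_i=\sum_{i=1}^n y_i$. By the theorem, $T=\sum_{i=1}^n X_i$ is minimal sufficient for $p$.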