
Confidence Set Estimation

We have $n$ observations $\utilde{X}=(X_1, \cdots, X_n)\sim f(\utilde{x};\theta)$ with $\theta\in\Omega\subset\R^r$, $r\ge 1$, and we are interested in $\eta(\theta):\Omega\to\R^m$, $m\le r$ (usually $m=1$).

e.g. $N(\mu, \sigma^2)$, $\theta=(\mu, \sigma^2)$

$$
\begin{alignat*}{3}
\implies &\eta(\theta)=\mu&\qquad&\eta(\theta)=\sigma &\qquad& \eta(\theta)=\log|\mu|\\
&\eta(\theta)=\sigma^2&\qquad&\eta(\theta)=\frac{\mu}{\sigma} &\qquad& \cdots
\end{alignat*}
$$

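Given the data $\utilde{X}$ (a random vector), a set estimate of $\eta(\theta)$ is a subset $C(\utilde{X})$ of $\R^m$ $(=\eta(\Omega))$ such that

$$
\forall \theta \quad P_\theta\left(\eta(\theta)\in C(\utilde{X})\right)=r\in[0,1]
$$

(here $r$ is a probability, not the dimension of $\Omega$). Once the actual data $\utilde{X}=\utilde{x}$ are observed, we say that we have confidence $r$ that the unknown quantity $\eta(\theta)\in C(\utilde{x})$. We speak of confidence rather than probability because, once the data are fixed, whether $\eta(\theta)$ lies in $C(\utilde{x})$ is also fixed; we simply do not know which is the case.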


EX $X_1,\cdots, X_n\overset{\text{iid}}{\sim}N(\mu, \sigma^2_0)$, $\sigma^2_0$ known

$$
\implies\frac{\sqrt{n}(\bar{X}-\mu)}{\sigma_0}\sim N(0,1)
$$

$$
\begin{align*}
\implies \forall \mu\in\R\quad 1-\alpha&=P_\mu\left(-z_{\alpha/2}<\frac{\sqrt{n}(\bar{X}-\mu)}{\sigma_0}<z_{\alpha/2}\right)\\
&=P_\mu\left(\bar{X}-\frac{z_{\alpha/2}\sigma_0}{\sqrt{n}}<\mu<\bar{X}+\frac{z_{\alpha/2}\sigma_0}{\sqrt{n}}\right)\\
&=P_\mu\left(\mu\in\left(\bar{X}-\frac{z_{\alpha/2}\sigma_0}{\sqrt{n}}, \bar{X}+\frac{z_{\alpha/2}\sigma_0}{\sqrt{n}}\right)\right)
\end{align*}
$$

$$
\implies P_\mu(\mu\in C(\utilde{X}))=1-\alpha\quad \forall\mu\text{ where }C(\utilde{X})=\left[\bar{X}\pm\frac{z_{\alpha/2}\sigma_0}{\sqrt{n}}\right]
$$

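
The claim that $[\bar{X}\pm z_{\alpha/2}\sigma_0/\sqrt{n}]$ covers $\mu$ with probability $1-\alpha$ can be checked by simulation. The following is a minimal sketch using only the Python standard library; the function names `z_interval` and `simulated_coverage`, the seed, and the parameter values are illustrative assumptions, not part of the notes.

```python
import random
from statistics import NormalDist, fmean

def z_interval(sample, sigma0, alpha):
    """Return (lower, upper) of xbar +/- z_{alpha/2} * sigma0 / sqrt(n), sigma0 known."""
    n = len(sample)
    xbar = fmean(sample)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 quantile of N(0, 1)
    half_width = z * sigma0 / n ** 0.5
    return xbar - half_width, xbar + half_width

def simulated_coverage(mu, sigma0, n, alpha, reps=20_000, seed=0):
    """Fraction of simulated N(mu, sigma0^2) samples whose interval contains mu."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma0) for _ in range(n)]
        lower, upper = z_interval(sample, sigma0, alpha)
        hits += lower < mu < upper
    return hits / reps

# Hypothetical values; the result should be close to 1 - alpha = 0.95.
print(simulated_coverage(mu=3.0, sigma0=2.0, n=10, alpha=0.05))
```

Changing `mu` should leave the estimate near $1-\alpha$, which reflects the "$\forall\mu$" in the derivation.
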
Definition

$\utilde{X}\sim f(\utilde{x};\theta)$ where $\theta$ is the true parameter.

$$
P_\theta(\eta(\theta)\in C(\utilde{X}))
$$

is the coverage probability (cov. prob.) of $C(\utilde{X})$ for $\eta(\theta)$.

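Naturally we want the cov. prob. to be as large as possible, but by that criterion alone $C(\utilde{X})=\eta(\Omega)$ would always be the best choice. Since $\eta(\Omega)$ is useless in practice, we need a second criterion: among sets with the same cov. prob., we want $C(\utilde{X})$ to be as small as possible.

To measure this, we compute the probability that $C(\utilde{X})$ covers a wrong point, i.e.

$$
P_\theta(\eta(\theta^*)\in C(\utilde{X})) \quad \forall \theta^*\neq\theta
$$

The smaller this probability the better; we call it the false cov. prob. When $C(\utilde{X})=\eta(\Omega)$, the false cov. prob. $=1$.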
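
To make the two quantities concrete, consider a symmetric interval $[\bar{X}\pm c]$ for a normal mean with known $\sigma_0$. Standardizing $\bar{X}\sim N(\mu, \sigma_0^2/n)$ gives $P_\mu(\mu^*\in[\bar{X}\pm c])=\Phi\big(\tfrac{\sqrt{n}}{\sigma_0}(\mu^*-\mu+c)\big)-\Phi\big(\tfrac{\sqrt{n}}{\sigma_0}(\mu^*-\mu-c)\big)$. The sketch below evaluates this expression; the formula is worked out here rather than taken from the notes, and the name `cover_prob` and the numbers are hypothetical.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF

def cover_prob(mu_star, mu, c, n, sigma0):
    """P_mu(mu_star in [Xbar - c, Xbar + c]) when Xbar ~ N(mu, sigma0^2 / n).

    mu_star == mu gives the cov. prob.; mu_star != mu gives a false cov. prob.
    """
    scale = n ** 0.5 / sigma0
    return PHI(scale * (mu_star - mu + c)) - PHI(scale * (mu_star - mu - c))

# True mean mu = 0; the probability of covering mu_star drops as mu_star moves away.
for mu_star in (0.0, 0.5, 1.0, 2.0):
    print(mu_star, round(cover_prob(mu_star, mu=0.0, c=1.0, n=4, sigma0=1.0), 4))
```

Increasing `c` raises every value toward 1, including the false cov. prob., which is the tension between the two criteria.
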

Remark:

We want the cov. prob. to be as large as possible, and the probability of covering the relevant $\theta^*\neq\theta$ to be as small as possible. Which $\theta^*$ are relevant depends on the shape of the set:

  • For a two-sided interval $C(\utilde{X})=[L(\utilde{X}), U(\utilde{X})]$, all $\theta^*\neq\theta$ are relevant.
  • For a one-sided interval $C(\utilde{X})=[L(\utilde{X}), \infty)$, the relevant points are $\theta^*\neq\theta$ with $\eta(\theta^*)<\eta(\theta)$.
  • For a one-sided interval $C(\utilde{X})=(-\infty, U(\utilde{X})]$, the relevant points are $\theta^*\neq\theta$ with $\eta(\theta^*)>\eta(\theta)$.

The cov. prob. and the false cov. prob. pull in opposite directions. Maximizing the cov. prob. alone leads to $C(\utilde{X})=\eta(\Omega)$; minimizing the false cov. prob. alone leads to $C(\utilde{X})=\eta(\empty)=\empty$. Hence we need a balance between the two.

We therefore first require the cov. prob. to be at least some confidence coefficient, and then make the false cov. prob. as small as possible.

Definition

$C(\utilde{X})$ is a conf. set for $\eta(\theta)$

  1. $P_\theta(\eta(\theta)\in C(\utilde{X})), \forall\theta\in\Omega$ is the coverage probability of $C(\utilde{X})$

  2. $C(\utilde{X})$ is called a $1-\alpha$ conf. set for $\eta(\theta)$ if

    $$
    \begin{align*}
    \inf_{\theta\in\Omega}P_\theta(\eta(\theta)\in C(\utilde{X})) &= \text{ conf. coef. of }C(\utilde{X})\\
    &= 1-\alpha, \quad \alpha\in [0,1]
    \end{align*}
    $$

  3. A $1-\alpha$ conf. set $C^*(\utilde{X})$ is called a uniformly most accurate (UMA) $1-\alpha$ conf. set for $\eta(\theta)\iff$

    1. $\inf_{\theta\in\Omega}P_\theta(\eta(\theta)\in C^*(\utilde{X}))=1-\alpha$

    2. $P_\theta(\eta(\theta^*)\in C^*(\utilde{X}))\le P_\theta(\eta(\theta^*)\in C(\utilde{X}))$, $\forall\theta^*\neq\theta$ relevant,

      $\forall$ $1-\alpha$ conf. set $C(\utilde{X})$ for $\eta(\theta)$

  4. A $1-\alpha$ conf. set $C(\utilde{X})$ for $\eta(\theta)$ is unbiased $\iff 1-\alpha\ge$ relevant false cov. prob.

    i.e. $1-\alpha\ge P_\theta(\eta(\theta^*)\in C(\utilde{X}))$, $\forall\theta^*\neq\theta$ relevant

  5. A $1-\alpha$ conf. set $C^*(\utilde{X})$ for $\eta(\theta)$ is a UMAU $1-\alpha$ conf. set if $C^*(\utilde{X})$ is UMA among unbiased $1-\alpha$ conf. sets.

EX $X_1, \cdots, X_n\overset{\text{iid}}{\sim}N(\mu, \sigma^2_0)$, parameter of interest: $\mu$

recall: point est. for $\mu$: $\bar{X}$ (UMVUE, MLE, MOME, Minimax)

But $P_\mu(\bar{X}=\mu)=0, \forall\mu$ $\implies$ idea: $\mu\in[\bar{X}\pm c]=C(\utilde{X})$, $c>0$ given, with positive prob. of being correct.

$$
\begin{align*}
\text{cov. prob. of }[\bar{X}\pm c]&=P_\mu(\mu\in[\bar{X}\pm c])\\
&=P_\mu(\mu-c\le\bar{X}\le\mu+c)\\
&=P\left(\frac{\sqrt{n}(\mu-c-\mu)}{\sigma_0}\le Z\le\frac{\sqrt{n}(\mu+c-\mu)}{\sigma_0}\right)\\
&=\Phi\left(\frac{c\sqrt{n}}{\sigma_0}\right)-\Phi\left(-\frac{c\sqrt{n}}{\sigma_0}\right)\\
&=2\Phi\left(\frac{c\sqrt{n}}{\sigma_0}\right)-1\quad\forall\mu
\end{align*}
$$

$$
\implies \inf_{\mu\in\R}P_\mu(\mu\in[\bar{X}\pm c])=2\Phi\left(\frac{c\sqrt{n}}{\sigma_0}\right)-1
$$

is the conf. coef. of $[\bar{X}\pm c]$

e.g. $n=4, \sigma_0=1, c=1$

$$
\implies 2\Phi\left(\frac{1\cdot 2}{1}\right)-1=2\Phi(2)-1=2\cdot 0.9772-1=0.9544
$$

i.e. $[\bar{X}\pm 1]$ is a 95.44% conf. set for $\mu$

If we want conf. coef. $=1-\alpha$:

$$
\because 2\Phi\left(\frac{c\sqrt{n}}{\sigma_0}\right)-1=1-\alpha\implies c=\frac{\sigma_0}{\sqrt{n}}z_{\alpha/2}
$$

$\implies \left[\bar{X}\pm\frac{\sigma_0}{\sqrt{n}}z_{\alpha/2}\right]$ is a $1-\alpha$ conf. set for $\mu$
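
Both directions of this calculation (from $c$ to the conf. coef., and from $\alpha$ to $c$) are short to evaluate numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper names `conf_coef` and `half_width` are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0, 1)

def conf_coef(c, n, sigma0):
    """Conf. coef. of [Xbar +/- c]: 2 * Phi(c * sqrt(n) / sigma0) - 1."""
    return 2 * STD_NORMAL.cdf(c * n ** 0.5 / sigma0) - 1

def half_width(alpha, n, sigma0):
    """c = sigma0 / sqrt(n) * z_{alpha/2}, chosen so that conf_coef(c, n, sigma0) = 1 - alpha."""
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return sigma0 / n ** 0.5 * z

# Worked example n = 4, sigma0 = 1, c = 1.
print(round(conf_coef(c=1, n=4, sigma0=1), 4))  # 0.9545 (0.9544 with the rounded table value 0.9772)

# Half-width for a 95% interval with the same n and sigma0, fed back into conf_coef.
c = half_width(alpha=0.05, n=4, sigma0=1)
print(round(c, 4), round(conf_coef(c, n=4, sigma0=1), 4))
```

The small difference from 0.9544 comes only from rounding $\Phi(2)$ to four decimals in the hand calculation.
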

If now $\sigma^2_0$ is unknown, $\theta=(\mu, \sigma^2_0)$, $\eta(\theta)=\mu$

$$
\begin{align*}
\text{Conf. coef. of }[\bar{X}\pm c]&=\inf_{\theta\in\Omega}P_\theta(\mu\in[\bar{X}\pm c])\\
&=\inf_{\sigma_0>0}\left[2\Phi\left(\frac{c\sqrt{n}}{\sigma_0}\right)-1\right]=0
\end{align*}
$$

since $\Phi\left(\frac{c\sqrt{n}}{\sigma_0}\right)\to\Phi(0)=\frac{1}{2}$ as $\sigma_0\to\infty$.

i.e. $[\text{Good point est.}\pm c]$ with a fixed $c$ may not give a good conf. set.
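
A quick numerical look at why the infimum is 0: with $c$ and $n$ held fixed, the coverage $2\Phi(c\sqrt{n}/\sigma_0)-1$ shrinks toward 0 as $\sigma_0$ grows. This is only an illustrative sketch; the function name `cov_prob_fixed_width` and the grid of $\sigma_0$ values are assumptions.

```python
from statistics import NormalDist

def cov_prob_fixed_width(c, n, sigma):
    """Cov. prob. of [Xbar +/- c] when the true standard deviation is sigma."""
    return 2 * NormalDist().cdf(c * n ** 0.5 / sigma) - 1

# Fixed half-width c = 1 and n = 4; only the (unknown) sigma varies.
for sigma in (1, 10, 100, 1000):
    print(sigma, round(cov_prob_fixed_width(c=1, n=4, sigma=sigma), 4))
```

No single fixed $c$ can guarantee a positive conf. coef. over all $\sigma_0>0$, so the half-width has to adapt to the data when $\sigma_0$ is unknown.
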


EX $X_1, \cdots, X_n\overset{\text{iid}}{\sim}N(\mu, \sigma^2_0)$. Given $C(\utilde{X})=\left[\bar{X}-\frac{\sigma_0}{\sqrt{n}}z_\alpha, \infty\right)$

$$
\begin{align*}
P_\mu(\mu\in C(\utilde{X}))&=P_\mu\left(\mu\ge\bar{X}-\frac{\sigma_0}{\sqrt{n}}z_\alpha\right)\\
&=P_\mu\left(\frac{\sqrt{n}(\bar{X}-\mu)}{\sigma_0}\le \frac{\sqrt{n}(\mu+\frac{\sigma_0}{\sqrt{n}}z_\alpha-\mu)}{\sigma_0}\right)\\
&=P(Z\le z_\alpha)\\
&=1-\alpha
\end{align*}
$$

$\implies C(\utilde{X})$ is a $1-\alpha$ conf. set for $\mu$, i.e. $\bar{X}-\frac{\sigma_0}{\sqrt{n}}z_\alpha$ is a $1-\alpha$ conf. lower limit for $\mu$

Is it unbiased?

$$
\begin{align*}
&P_\mu(\mu^*\in C(\utilde{X}))\quad\forall\mu^*<\mu\\
=&P_\mu\left(\mu^*\ge\bar{X}-\frac{\sigma_0}{\sqrt{n}}z_\alpha\right)\\
=&P\Big(Z\le z_\alpha+\underbrace{\frac{\sqrt{n}}{\sigma_0}(\mu^*-\mu)}_{<0}\Big)\\
<&1-\alpha
\end{align*}
$$
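
The probability that the one-sided set $[\bar{X}-\frac{\sigma_0}{\sqrt{n}}z_\alpha, \infty)$ covers a point $\mu^*$ has the closed form $\Phi\big(z_\alpha+\frac{\sqrt{n}}{\sigma_0}(\mu^*-\mu)\big)$, so the unbiasedness condition can be checked numerically. A minimal sketch, with the hypothetical helper name `lower_bound_cover_prob` and illustrative parameter values:

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0, 1)

def lower_bound_cover_prob(mu_star, mu, n, sigma0, alpha):
    """P_mu(mu_star >= Xbar - sigma0 / sqrt(n) * z_alpha).

    Equals Phi(z_alpha + sqrt(n) * (mu_star - mu) / sigma0), where mu is the true mean.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)  # upper alpha quantile of N(0, 1)
    return STD_NORMAL.cdf(z_alpha + n ** 0.5 * (mu_star - mu) / sigma0)

# True mean mu = 0. mu_star = 0 gives the cov. prob.;
# mu_star < 0 are the relevant false points for a lower limit.
for mu_star in (0.0, -0.25, -0.5, -1.0):
    print(mu_star, round(lower_bound_cover_prob(mu_star, mu=0.0, n=4, sigma0=1.0, alpha=0.05), 4))
```

Every value with `mu_star < mu` falls below $1-\alpha$, in line with the inequality derived above.
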