
Inequalities

Hölder's Inequality

Lemma

Let $a>0, b>0$ and $p>1, q>1$, where $\frac{1}{p}+\frac{1}{q}=1$

$\implies \frac{1}{p}a^p+\frac{1}{q}b^q \ge ab$, where equality holds $\iff a^p=b^q$
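A quick numerical sanity check of the lemma, as a minimal Python sketch (the sampled values and exponents are arbitrary choices):

```python
import random

# Spot-check the lemma: (1/p)a^p + (1/q)b^q >= ab for a, b > 0 and
# conjugate exponents 1/p + 1/q = 1; the values here are arbitrary.
random.seed(0)
for _ in range(5):
    a, b = random.uniform(0.1, 5.0), random.uniform(0.1, 5.0)
    p = random.uniform(1.1, 4.0)
    q = p / (p - 1)                        # conjugate exponent of p
    lhs = a**p / p + b**q / q
    print(f"lhs={lhs:.4f}  ab={a*b:.4f}  holds: {lhs >= a*b}")

# Equality case: pick b with b^q = a^p, i.e. b = a^(p/q).
a, p = 2.0, 3.0
q = p / (p - 1)
b = a ** (p / q)
print(a**p / p + b**q / q, a * b)          # both sides equal 8.0
```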

$X, Y$ are r.v.'s; take

$$a=\frac{|X|}{[E(|X|^p)]^{1/p}},\qquad b=\frac{|Y|}{[E(|Y|^q)]^{1/q}} \quad \text{in the lemma}$$

$$\begin{align*} \implies & \frac{1}{p}\left(\frac{|X|}{[E(|X|^p)]^{1/p}}\right)^p+\frac{1}{q}\left(\frac{|Y|}{[E(|Y|^q)]^{1/q}}\right)^q \ge \frac{|X|}{[E(|X|^p)]^{1/p}}\cdot\frac{|Y|}{[E(|Y|^q)]^{1/q}} \\ \xRightarrow[\text{both sides}]{\text{expectation}} & 1 =\frac{1}{p}+\frac{1}{q}\ge \frac{E|XY|}{[E|X|^p]^{1/p}[E|Y|^q]^{1/q}} \end{align*}$$
Theorem

Hölder's Inequality

$p>1, q>1$ where $\frac{1}{p}+\frac{1}{q}=1$

$$E[|XY|] \le [E|X|^p]^{\frac{1}{p}}[E|Y|^q]^{\frac{1}{q}}$$

where equality holds $\iff P\left(\frac{|X|^p}{E[|X|^p]}=\frac{|Y|^q}{E[|Y|^q]}\right)=1$ (almost surely)
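A Monte Carlo check of Hölder's Inequality on simulated data (a minimal sketch; the distributions and the choice $p=3$ are arbitrary):

```python
import numpy as np

# Empirical check: E|XY| <= (E|X|^p)^(1/p) * (E|Y|^q)^(1/q).
rng = np.random.default_rng(0)
X = rng.normal(size=100_000)
Y = rng.exponential(size=100_000)

p = 3.0
q = p / (p - 1)
lhs = np.mean(np.abs(X * Y))
rhs = np.mean(np.abs(X)**p)**(1/p) * np.mean(np.abs(Y)**q)**(1/q)
print(lhs, rhs, lhs <= rhs)
```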

Taking $Y=1$ in Hölder's Inequality, we get

$$\begin{align*} & E[|X|] \le [E|X|^p]^{\frac{1}{p}} \\ \xRightarrow{|X|\to |X|^r}\ & E[|X|^r] \le [E|X|^{pr}]^{\frac{1}{p}} = [E(|X|^s)]^{\frac{r}{s}},\quad s\ge r\\ \implies\ & [E(|X|^r)]^{\frac{1}{r}}\le [E(|X|^s)]^{\frac{1}{s}},\quad s\ge r\\ \implies\ & g(r) \triangleq [E(|X|^r)]^{\frac{1}{r}} \text{ is monotonically increasing} \end{align*}$$

Therefore, the existence of a higher-order moment guarantees the existence of all lower-order moments.
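This monotonicity is easy to see empirically; a minimal sketch using sample moments of a simulated $X$ (the normal distribution is an arbitrary choice):

```python
import numpy as np

# g(r) = (E|X|^r)^(1/r) estimated from a sample; it should be
# nondecreasing in r.
rng = np.random.default_rng(0)
X = rng.normal(size=200_000)

for r in (0.5, 1, 2, 3, 4):
    g_r = np.mean(np.abs(X)**r)**(1/r)
    print(f"r={r}: g(r)={g_r:.4f}")        # values increase with r
```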

Theorem

Cauchy-Schwarz Inequality

Setting $p=q=2$ in Hölder's Inequality, we get

$$[E(|XY|)]^2\le E[X^2]E[Y^2]$$

Let $X\to X-E[X]$, $Y\to Y-E[Y]$:

$$\begin{align*} |E[(X-E[X])(Y-E[Y])]| &\le E[|(X-E[X])(Y-E[Y])|]\\ &\le [E(X-E[X])^2]^{\frac{1}{2}}[E(Y-E[Y])^2]^{\frac{1}{2}}\\ &= \sqrt{\sigma^2(X)\sigma^2(Y)}\\ &= \sigma(X)\sigma(Y) \end{align*}$$

i.e. $|Cov(X,Y)|\le\sigma(X)\sigma(Y)\iff|\rho_{X,Y}|\le 1$
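A simulated check that the correlation coefficient stays in $[-1,1]$ (a minimal sketch; the linear dependence below is an arbitrary construction):

```python
import numpy as np

# |Cov(X,Y)| <= sigma(X) sigma(Y), i.e. |rho| <= 1, on simulated data.
rng = np.random.default_rng(0)
X = rng.normal(size=100_000)
Y = 0.7 * X + rng.normal(size=100_000)     # correlated with X by construction

cov = np.mean((X - X.mean()) * (Y - Y.mean()))
rho = cov / (X.std() * Y.std())
print(f"cov={cov:.4f}  rho={rho:.4f}  |rho|<=1: {abs(rho) <= 1.0}")
```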

Theorem

Minkowski's Inequality

$$[E|X+Y|^p]^{\frac{1}{p}}\le [E|X|^p]^{\frac{1}{p}}+[E|Y|^p]^{\frac{1}{p}},\quad p\ge 1$$
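A Monte Carlo check of Minkowski's Inequality, i.e. the triangle inequality for the $L^p$ norm (a minimal sketch; $p=2.5$ and the distributions are arbitrary choices):

```python
import numpy as np

# (E|X+Y|^p)^(1/p) <= (E|X|^p)^(1/p) + (E|Y|^p)^(1/p), from samples.
rng = np.random.default_rng(0)
X = rng.normal(size=100_000)
Y = rng.uniform(-1, 1, size=100_000)

p = 2.5
norm = lambda W: np.mean(np.abs(W)**p)**(1/p)    # empirical L^p norm
print(norm(X + Y), norm(X) + norm(Y), norm(X + Y) <= norm(X) + norm(Y))
```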

Jensen's Inequality

Theorem

Jensen's Inequality

For any r.v. $X$, if $g$ is a convex function, then $E[g(X)]\ge g(E[X])$

Equality holds $\iff P(g(X)=a+bX)=1$

Proof: Let $l(x)$ be the tangent line to the graph of $g(x)$ at the point $(E[X], g(E[X]))$. Note that $E[X]$ is a constant.

i.e. $l(x)=a+bx$, s.t. $l(E[X])=a+bE[X]=g(E[X])$ and $l(x)\le g(x),\ \forall x$, since $g(x)$ is convex.

$$\begin{align*} \because\ & l(x)\le g(x)\\ \therefore\ & g(E[X])=l(E[X])=E[l(X)]\le E[g(X)] \qquad \because l \text{ is linear, } E[l(X)]=a+bE[X]=l(E[X]) \end{align*}$$
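A numerical illustration of Jensen's Inequality (a minimal sketch; the convex function $g(x)=x^2$ and the exponential distribution are arbitrary choices):

```python
import numpy as np

# E[g(X)] >= g(E[X]) for convex g; here g(x) = x^2 and X ~ Exp(scale=2).
rng = np.random.default_rng(0)
X = rng.exponential(scale=2.0, size=100_000)

g = lambda x: x**2
print(np.mean(g(X)), g(np.mean(X)))    # ~8 >= ~4, i.e. E[X^2] >= (E[X])^2
```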

Tchebycheff's Inequality

$$\begin{align*} \sigma^2 &= E[(X-\mu)^2]\\ &= E[(X-\mu)^2 I(|X-\mu|\ge\varepsilon)]+E[(X-\mu)^2 I(|X-\mu|<\varepsilon)]\\ &\ge E[(X-\mu)^2 I(|X-\mu|\ge\varepsilon)]\\ &\ge \varepsilon^2 E[I(|X-\mu|\ge\varepsilon)] \qquad \because (X-\mu)^2\ge\varepsilon^2 \text{ when } |X-\mu|\ge\varepsilon\\ &= \varepsilon^2 P(|X-\mu|\ge\varepsilon)\\ \implies\ & P(|X-\mu|\ge\varepsilon)\le\frac{\sigma^2}{\varepsilon^2} \end{align*}$$
Theorem

Tchebycheff's Inequality

Let $X$ be a r.v. with $E[X]=\mu$, $0\le\sigma^2=Var(X)<\infty$

$\implies P(|X-\mu|\ge\varepsilon)\le\frac{\sigma^2}{\varepsilon^2},\quad \forall\varepsilon>0$

Take $\varepsilon=k\sigma>0\implies P(|X-\mu|\ge k\sigma)\le\frac{\sigma^2}{k^2\sigma^2}=\frac{1}{k^2} \iff P(|X-\mu|< k\sigma)\ge 1- \frac{1}{k^2}$

For $X\sim N(\mu, \sigma^2)$

$$P(|X-\mu|\le 2\sigma)=P\left(\frac{|X-\mu|}{\sigma}\le 2\right)=P(|Z|\le 2)=0.9545\ge 0.75$$
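The comparison extends to other $k$; a minimal sketch using the exact normal probability $P(|X-\mu|\le k\sigma)=\operatorname{erf}(k/\sqrt{2})$:

```python
import math

# Exact normal coverage vs. the Tchebycheff lower bound 1 - 1/k^2.
for k in (1, 2, 3):
    exact = math.erf(k / math.sqrt(2))   # P(|Z| <= k) for Z ~ N(0,1)
    bound = 1 - 1 / k**2
    print(f"k={k}: exact={exact:.4f} >= bound={bound:.4f}")
```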
Theorem
$\sigma^2\triangleq Var(X)=0 \implies P(X=\mu)=1$

Proof:

$$\begin{align*} & \forall \varepsilon>0,\ P(|X-\mu|\ge\varepsilon)\le\frac{\sigma^2}{\varepsilon^2}=0 \qquad \because \sigma^2=0\\ \iff\ & \forall \varepsilon>0,\ P(|X-\mu|<\varepsilon)=1\\ \implies\ & \forall n=1,2,\cdots,\ P\left(|X-\mu|<\frac{1}{n}\right)=1 \end{align*}$$

Let $A_n\triangleq\set{|X-\mu|<\frac{1}{n}}$; then $A_1\supseteq A_2\supseteq\cdots$, and $\lim_{n\to\infty}A_n=\bigcap_{n=1}^\infty A_n=\set{|X-\mu|=0}$

$\implies P(X=\mu)=P(|X-\mu|=0)=P\left(\bigcap_{n=1}^\infty A_n\right)=P(\lim_{n\to\infty}A_n)=\lim_{n\to\infty}P(A_n)=1$, by continuity of probability.

Although the bound given by Tchebycheff's Inequality is loose, it cannot be improved in general, because there are examples that attain it exactly.

Given $k\ge 1$, let $X$ be such that

$$\begin{align*} & P(X=0)=1-\frac{1}{k^2}\\ & P(X=1)=\frac{1}{2k^2}\\ & P(X=-1)=\frac{1}{2k^2} \end{align*}$$

$$\begin{align*} \implies \mu=E[X] &= 0\cdot P(X=0)+1\cdot P(X=1)+(-1)\cdot P(X=-1)=0\\ \sigma^2 &= E[X^2]-E[X]^2=E[X^2]\\ &= 0^2\cdot P(X=0)+1^2\cdot P(X=1)+(-1)^2\cdot P(X=-1)=\frac{1}{k^2}\\ \implies P(|X-\mu|\ge k\sigma) &= P(|X|\ge 1) \qquad \because \mu=0,\ \sigma=\frac{1}{k}\\ &= P(X=1)+P(X=-1)\\ &= \frac{1}{k^2} \end{align*}$$
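Verifying directly that this three-point distribution attains the bound (a minimal sketch; $k=3$ is an arbitrary choice):

```python
import numpy as np

# Mass 1 - 1/k^2 at 0 and 1/(2k^2) at each of +-1; the tail probability
# P(|X - mu| >= k*sigma) should equal 1/k^2 exactly.
k = 3.0
values = np.array([0.0, 1.0, -1.0])
probs = np.array([1 - 1/k**2, 1/(2*k**2), 1/(2*k**2)])

mu = (values * probs).sum()                          # = 0
sigma = np.sqrt((values**2 * probs).sum() - mu**2)   # = 1/k
tail = probs[np.abs(values - mu) >= k * sigma].sum()
print(mu, sigma, tail, 1/k**2)                       # tail == 1/k^2
```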

For specific distributions, however, tighter bounds can be found.

e.g. $Z\sim N(0,1)$, for $k>0$

$$\begin{align*} P(|Z|\ge k) &= 2P(Z\ge k)\\ &= 2\int_k^\infty \frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}z^2}\,dz\\ &= \frac{\sqrt{2}}{\sqrt{\pi}}\int_k^\infty e^{-\frac{1}{2}z^2}\,dz\\ &\le \frac{\sqrt{2}}{\sqrt{\pi}}\int_k^\infty \frac{z}{k}e^{-\frac{1}{2}z^2}\,dz \qquad \because \frac{z}{k}\ge 1 \text{ for } z\ge k\\ &= \frac{\sqrt{2}}{\sqrt{\pi}}\frac{1}{k}e^{-\frac{k^2}{2}} \end{align*}$$

For $P(|Z|\ge 2)$, this inequality gives $P(|Z|\ge 2)\le 0.054$, while Tchebycheff's Inequality gives only $P(|Z|\ge 2)\le 0.25$; the true value is $P(|Z|\ge 2)=0.0455$.
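Computing all three numbers (a minimal sketch using $P(|Z|\ge k)=\operatorname{erfc}(k/\sqrt{2})$):

```python
import math

# Exact normal tail vs. the Gaussian tail bound vs. Tchebycheff.
for k in (1, 2, 3):
    exact = math.erfc(k / math.sqrt(2))                       # P(|Z| >= k)
    gauss = math.sqrt(2 / math.pi) / k * math.exp(-k**2 / 2)  # bound above
    print(f"k={k}: exact={exact:.4f}  Gaussian bound={gauss:.4f}  "
          f"Tchebycheff={1 / k**2:.4f}")
```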

This can be generalized further:

Lemma

Let $g$ be a non-negative function; then $\forall \varepsilon>0$,

$$P(g(X)\ge\varepsilon)\le\frac{E[g(X)]}{\varepsilon}$$

Proof:

$$\begin{align*} E[g(X)] &= E[g(X)I(g(X)\ge\varepsilon)]+E[g(X)I(g(X)<\varepsilon)]\\ &\ge E[g(X)I(g(X)\ge\varepsilon)] \qquad \because g\ge 0\\ &\ge \varepsilon E[I(g(X)\ge\varepsilon)] \qquad \because g(X)\ge\varepsilon \text{ on } \set{g(X)\ge\varepsilon}\\ &= \varepsilon P(g(X)\ge\varepsilon) \end{align*}$$
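A Monte Carlo check of this generalized (Markov-type) bound (a minimal sketch; $g(x)=|x|$ and the $t$-distribution are arbitrary choices):

```python
import numpy as np

# P(g(X) >= eps) <= E[g(X)] / eps for non-negative g; here g = |.|.
rng = np.random.default_rng(0)
X = rng.standard_t(df=5, size=100_000)

for eps in (0.5, 1.0, 2.0):
    lhs = np.mean(np.abs(X) >= eps)        # empirical P(g(X) >= eps)
    rhs = np.mean(np.abs(X)) / eps         # empirical E[g(X)] / eps
    print(f"eps={eps}: {lhs:.4f} <= {rhs:.4f}")
```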