Home OCVX2 : Tout ce que vous avez toujours voulu savoir sur l'ACP (ou pas)
Post
Cancel

OCVX2 : Tout ce que vous avez toujours voulu savoir sur l'ACP (ou pas)

Lien de la note Hackmd

On a un nuage de points:

On une infinite de directions possibles. On projette sur les axes:

En formulation mathematiques:

\[\text{proj}(x_i,\text{span}(u))=P_u(x_i)=\langle x_i,u\rangle = x_i^Tu\]

On impose $\Vert u\Vert=1$

\[P_u(x_i)=\langle x_i,u\rangle=x_i^Tu\to E(x_i,\underbrace{P_u(x_i)}_{y_i}) = \Vert x_i-y_i\Vert^{(2)}\]

D’apres Pythagore:

\[\Vert x_i\Vert^2=\Vert y_i\Vert^2+\Vert x_i-y_i\Vert\\ \Vert x_i-y_i\Vert^2=\Vert x_i\Vert^2-\Vert y_i\Vert^2\]

On peut donc reecrire:

\[\begin{aligned} \sum_{i=1}^nE(x_i,P_u(x_i))&=\sum_{i=1}^n\Vert x_i-y_i\Vert^2\\ &=\underbrace{\sum_{i=1}^n\Vert x_i\Vert^2}_{\text{constante}} - \sum_{i=1}^n\Vert y_i\Vert^2 \end{aligned}\]

Donc:

\[\begin{aligned} \text{arg}\min_{u,\Vert u\Vert=1}&=\text{arg}\min_{u,\Vert u\Vert=1}-\sum_{i=1}^n\Vert y\Vert^2\\ &=\text{arg}\max_{u,\Vert u\Vert=1}\sum_{i=1}^n\Vert y_i\Vert^2\\ &=\text{arg}\max_{u,\Vert u\Vert=1}\underbrace{\frac{1}{n}\sum_{i=1}^n\Vert y\Vert^2}\\ &\Vert y_i\Vert^2=y_i^Ty_i\underbrace{\frac{1}{n}\sum_{i=1}^ny_i^Ty_i}_{\text{var}(P_u(X))} \end{aligned}\]

On est en train de chercher:

\[\begin{aligned} \text{arg}\min_{u,\Vert u\Vert=1}-\sum_{i=1}^n\Vert y_i\Vert^2&= -\frac{1}{n}\sum_{i=1}^ny_i^Ty_i\\ &= -\frac{1}{n}\sum_{i=1}^n(x_i^Tu)^T(x_i^Tu)\\ &= -\frac{1}{n}\sum_{i=1}^nu^Tx_ix_i^Tu\\ &= -u^T(\underbrace{\frac{1}{n}\sum_{i=1}^nx_ix_i^T}_{\Sigma\text{ matrice de covariance} \\ \text{de } X=\{x_i\}^n_{i=1}\text{ centre}})u \end{aligned}\]\[\text{arg}\max_{\Vert u\Vert=1}u^T\Sigma u = \text{arg}\max\overbrace{\frac{u^T\Sigma u}{u^Tu}}^{\text{Quotien de Rayleigh}}\]

On va reecrire:

\[\text{arg}\min_{u\in\mathbb R^p}-u^T\Sigma u\\ \begin{aligned} \Vert u\Vert =1\Leftrightarrow &\Vert u\Vert^2=1\\ &\begin{cases} u^Tu=1 &f(u)-u^T\Sigma u\\ u^Tu-1=0&g(u)=u^Tu-1 \end{cases}\biggr\}\text{arg}\min_{g(u)=0} f(u) \end{aligned}\]

Le lagrangien de ce probleme est $\mathscr L(u\lambda)=f(u)+\lambda g(u)-u^T\Sigma u+\lambda(u^Tu-1)$

  • Stationnarite de $\mathscr L:\nabla_u\mathscr L(u,\lambda)=0=\nabla_u(-u^T\Sigma u)+\lambda\nabla_u(u^Tu-1)$

On calcule la differentielle $f(x+h)=f(x)+df_x(h)+O(h)$

\[\begin{aligned} f(x+h)=(x+h)^TA(x+h)=\underbrace{x^TAx}_{f(x)}+x^T\underbrace{AH+h^T}&Ax+\underbrace{h^TAh}_{O(h)}\\ x^TAh+(\underbrace{h^TAx})^T&\\ \underbrace{x^TA^Th}&\\ x^T(A+A^T)h&\to df_x(h)=x^T(A+A^T)h\\ (=2x^TAh\text{ si } &A \text{ symetrique } A=A^T) \end{aligned}\\\] \[df_x(h)=\langle\nabla_xf,h\rangle=\nabla_xf^Th\to\nabla_xf=(A+A^T)x=2Ax\\ \begin{aligned} \nabla_u(-u^T\Sigma u)=-\nabla_u(u^T\Sigma u)=-2\Sigma u\\ \nabla_u(u^Tu)=\nabla_u(u^TI_du)=2u \end{aligned}\biggr\}-2\Sigma u+2du=0\]

On se souvient qu’on cehrche $u$, $\Vert u\Vert=1$ qui maximise:

\[\begin{aligned} \underbrace{\frac{1}{n}\sum_{i=1}^n\Vert y_i\Vert^2}_{\text{var}(P_u(x))}&= \frac{1}{n}\sum_{i=1}^ny_i^Ty_i\\ &=\frac{1}{n}\sum_{i=1}^n(x_i^Tu)(x_i^Tu)\\ &=\frac{1}{n}\sum_{i=1}^nu^Tx_ix_i^Tu\\ &=u^T(\frac{1}{n}\sum_{i=1}^nx_ix_i^T)u\\ &=u^T\underbrace{\Sigma u}_{\lambda u}\\ &=\lambda \underbrace{u^Tu}_{\Vert u\Vert^2 =1}\\ &=\lambda \end{aligned}\]
This post is licensed under CC BY 4.0 by the author.