Introduction
Hyperspectral images
Acquisitions in the spectral domain (the bands) are sampled much more finely.
The image is shown in false color: the data are 3-way tensors indexed by $x$, $y$ and the spectral band. In this image, 3 bands have been mapped to RGB; this is only a partial reconstruction, but it makes visualization possible.
These are aerial images of the Mont Blanc massif.
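As a minimal sketch of the false-color idea, the snippet below picks 3 bands out of a hyperspectral cube and stretches them to $[0, 1]$ for display. The cube is random synthetic data, and the function name `false_color_composite` and the band indices are assumptions chosen for illustration only.

```python
import numpy as np

def false_color_composite(cube, band_indices):
    """Map three bands of a (rows, cols, bands) cube to an RGB image in [0, 1]."""
    rgb = cube[:, :, list(band_indices)].astype(float)
    # Stretch each selected band independently to [0, 1] for display.
    lo = rgb.min(axis=(0, 1), keepdims=True)
    hi = rgb.max(axis=(0, 1), keepdims=True)
    return (rgb - lo) / np.maximum(hi - lo, 1e-12)

# Synthetic stand-in for a real acquisition: 4 x 5 pixels, 600 bands.
rng = np.random.default_rng(0)
cube = rng.random((4, 5, 600))
rgb = false_color_composite(cube, band_indices=(450, 300, 150))
```

Any other triplet of bands gives another, equally partial, view of the same cube.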
Which physical variable are we interested in?
Here, if we want to carry out a continuous analysis of the scene, the physical variable of interest is reflectance.
If there is no transmission, reflectance is directly related to absorbance.
We want usable images, so we need a sufficient signal-to-noise ratio. There is typically a single sensor with an optical diffraction system (prism, etc.): incoming light is diffracted and reflected into different wavelengths.
An entire block cannot be captured all at once during an acquisition.
If we want 600 bands, we have to trade off spatial resolution against spectral resolution.
Such imagers have a low spatial resolution ($\sim 30m$), but a second associated sensor acquires a panchromatic band.
We obtain fairly precise information on the reflectance of the different materials: there are $\sim 600$ samples of the reflectance.
As soon as we move into the infrared, reflectance becomes higher, due to the presence of chlorophyll.
The bands on the “red edge” ($\sim 0.7\mu m$), where the reflectance spectrum of vegetation rises steeply, make it possible to discriminate certain species.
Reflectance is the physical variable of interest that we try to extract.
Example
What is the benefit of acquiring beyond the visible domain?
Let us look at the plants in different bands:
In the near IR:
The third plant looks different (it is made of plastic).
Applications
In contexts that are not necessarily related to remote sensing:
- Detection of hydrocarbons in water
Oil lying on top of water has a spectrum relatively close to that of water.
If we process an image with more acquired bands:
We extract “hidden” information.
- Monitoring and characterization of different minerals
- Biomedical
  - Detection of skin tumors
- Astronomy
  - The “Muse” telescope
- Art
  - Some artworks have transmittance properties that vary with wavelength
  - Layers invisible to the naked eye can be detected
- Non-destructive testing
  - Evolution of a fish over time
  - Early detection of the spoilage of the sample
Spectral Unmixing
A potential limitation of this kind of imaging, encountered quite often: low spatial resolution $\to$ some objects are not fully resolved.
We measure combinations of the spectra of the elements that make up the scene.
We want samples in reflectance, so a conversion from radiance is needed.
In an RGB image, each pixel is a vector with $3$ components. Here there are $600$ components, which raises issues tied to the high dimensionality of the data.
What to mine ?
A library/catalog of spectra of different materials can be used for the unmixing.
There are 2 possible ways of processing:
Spectral processing
- Information resides in the spectral signature of the pixels
- Pixels can be processed independently
- Approaches issuing from multivariate statistics and linear algebra
- Objects of interest could be of sub-pixel size
- Analysis done on the full image
Spatial processing
- Information resides in the spatial organization of pixels
- Pixels are processed together (analysis done on local parts of the image)
- Use image processing tools
- Objects of interest are fully resolved
Analysis of the spectral domain
HSI scene classification
Spectral classification
High number of features ?
When the dimensionality of the problem is high
- How does classification accuracy depend on the dimensionality (and on the amount of training data)?
- What is the computational complexity of designing the classifier?
Classification accuracy
- Bayes error depends on the number of statistically independent features
- Example: consider a binary classification problem with $p(x\vert \omega_j)\sim\mathcal N(\mu_j,\Sigma_j)$ $(j=1,2)$, $\Sigma_1=\Sigma_2=\Sigma$ and $P(\omega_1)=P(\omega_2)=0.5$:
\[P(e)=\frac{1}{\sqrt{2\pi}}\int_{r/2}^{\infty}e^{-u^2/2}\,du\]
with $r^2=(\mu_1-\mu_2)^T\Sigma^{-1}(\mu_1-\mu_2)$ the squared Mahalanobis distance
- $P(e)\searrow$ for $r\nearrow$
- In the case of conditionally independent features $\Sigma = \text{diag}(\sigma_1^2,\dots,\sigma_d^2)$
- $r^2 = \sum_{i=1}^d\left(\frac{\mu_{i,1}-\mu_{i,2}}{\sigma_i}\right)^2$
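To make the dependence on $r$ concrete, here is a small sketch. It assumes two equiprobable Gaussian classes sharing a diagonal covariance, for which the Bayes error can be written $P(e)=\frac12\,\text{erfc}\big(\frac{r}{2\sqrt 2}\big)$. The function names and the toy means are illustrative assumptions, not part of the course material.

```python
import math

def mahalanobis_sq_diag(mu1, mu2, sigmas):
    """Squared Mahalanobis distance between two means for a diagonal covariance."""
    return sum(((a - b) / s) ** 2 for a, b, s in zip(mu1, mu2, sigmas))

def bayes_error(r):
    """Bayes error for two equiprobable Gaussian classes with a shared covariance."""
    return 0.5 * math.erfc(r / (2.0 * math.sqrt(2.0)))

# Toy setting: each added feature separates the means by one standard deviation.
errors = []
for d in (1, 2, 4, 8):
    r = math.sqrt(mahalanobis_sq_diag([0.0] * d, [1.0] * d, [1.0] * d))
    errors.append(bayes_error(r))
```

In this idealized setting every independent, informative feature increases $r$, so the values in `errors` decrease with $d$. This holds only when the class parameters are known exactly rather than estimated from a finite training set.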
There are regions where the classes overlap. We can increase the dimensionality by adding a feature, but beware of the curse of dimensionality.
- Intuition fails in high dimensions
- Curse of dimensionality (Bellman, 1961): many algorithms working fine in low dimensions become intractable when the input is high-dimensional
- Generalizing correctly becomes exponentially harder as the dimensionality grows, because a fixed-size training set covers a smaller fraction of the input space
- In high dimensional space, the concept of proximity, distance or nearest neighbor may not even be qualitatively meaningful
- Similarity measures based on $l_k$ norms lose meaning in high dimensions, and the more so as $k$ grows
- The $l_1$ norm (Manhattan distance metric) is preferable to the Euclidean distance metric $(l_2)$ for high-dimensional data mining
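The loss of contrast between near and far points can be observed numerically. The sketch below assumes points drawn uniformly in $[0,1]^{\text{dim}}$ and measures the relative contrast $(d_{max}-d_{min})/d_{min}$ of their distances to the origin. The function name `relative_contrast` and the sampling choices are illustrative assumptions.

```python
import numpy as np

def relative_contrast(dim, n_points=500, order=2, seed=0):
    """(d_max - d_min) / d_min for distances from the origin to uniform random points."""
    rng = np.random.default_rng(seed)
    points = rng.random((n_points, dim))
    dists = np.linalg.norm(points, ord=order, axis=1)
    return (dists.max() - dists.min()) / dists.min()

# The contrast shrinks as the dimension grows: all points look equally far away.
contrast_l2 = {dim: relative_contrast(dim, order=2) for dim in (2, 20, 200)}
# Same experiment with the Manhattan norm, for comparison.
contrast_l1 = {dim: relative_contrast(dim, order=1) for dim in (2, 20, 200)}
```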
HSI in high dimensions
Volume of a hypersphere
- The volume of a hypersphere of radius $r$ in a $p$-dimensional space:
\[V_s(r)=\frac{2r^p}{p}\frac{\pi^{p/2}}{\Gamma(p/2)}\]
- Volume of a hypercube $[-r, r]^p$:
\[V_c(r)=(2r)^p\]
- The fraction of the volume contained in the inscribed hypersphere:
\[f_{p_1}=\frac{V_s(r)}{V_c(r)}=\frac{\pi^{p/2}}{p2^{p-1}\Gamma(p/2)}\]
- Fraction of the volume of a thin spherical shell, defined by a sphere of radius $(r-\varepsilon)$ inscribed inside a sphere of radius $r$, to the volume of the entire sphere:
\(\begin{aligned} f_{p_2}&=\frac{V_s(r)-V_s(r-\varepsilon)}{V_s(r)}\\ &=\frac{r^p-(r-\varepsilon)^p}{r^p}\\ &= 1-\biggl(1-\frac{\varepsilon}{r}\biggr)^p \end{aligned}\)
We want to look at the ratio between the volume of a sphere and that of the cube circumscribing it.
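Both fractions can be evaluated numerically. The sketch below assumes the standard hypersphere volume $V_s(r)=\frac{2r^p\pi^{p/2}}{p\,\Gamma(p/2)}$ and the hypercube volume $(2r)^p$, so the radius cancels out of the first ratio. The function names are chosen here for illustration.

```python
import math

def sphere_to_cube_fraction(p):
    """Fraction of the hypercube [-r, r]^p occupied by the inscribed hypersphere."""
    return math.pi ** (p / 2) / (p * 2 ** (p - 1) * math.gamma(p / 2))

def shell_fraction(p, eps_over_r):
    """Fraction of the sphere volume lying in the outer shell of relative thickness eps/r."""
    return 1.0 - (1.0 - eps_over_r) ** p

for p in (1, 2, 3, 10, 100):
    print(p, sphere_to_cube_fraction(p), shell_fraction(p, 0.05))
```

As $p$ grows, the first fraction tends to $0$ (the volume of the cube concentrates in its corners) and the second tends to $1$ (the volume of the sphere concentrates in its outer shell).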
- Small sample size: number of samples needed for accurate classification:
If we do not have enough samples for our estimation, the estimate will not be robust
- Curse of dimensionality !
- Computational complexity
The blessing of non-uniformity
- In most applications, examples are not spread uniformly throughout the instance space, but are concentrated on or near a lower-dimensional manifold
- Intrinsic dimensionality of the data might be difficult to estimate in real data
Dimensionality reduction
Dimension reduction aims at representing data in a reduced number of dimensions
Reasons:
- Easier data analysis
- Improved classification accuracy
- More stable representation
- Removal of redundant or irrelevant information
- Attempt to discover underlying structure by obtaining a graphical representation
Dimensionality reduction is usually obtained by feature selection or extraction
Feature selection keeps only some of the features according to a criterion, leading to a new subset of features with lower dimensionality
\[x=[x_1,x_2,x_3,x_4,\dots,x_d]^T\\ x'=A^Tx\]
with
\[A=\begin{pmatrix} \color{red}{1}&0&0&0&\dots&0\\ 0&\color{red}{0}&0&0&\dots&0\\ 0&0&\color{red}{1}&0&\dots&0\\ 0&0&0&\color{red}{0}&\dots&0\\ \vdots&\vdots&\vdots&\vdots&\ddots&\vdots\\ 0&0&0&0&\dots&\color{red}{1} \end{pmatrix}\]
Feature extraction transforms the data into a space of lower dimensionality with an arbitrary function $f$
\[x'=f(x)\quad\text{with } f:\mathbb R^d\to\mathbb R^n, n\lt d\]
Example: Color composite
The pigment in plant leaves, chlorophyll, strongly absorbs visible light (from $0.4$ to $0.7\mu m$) for use in photosynthesis. The cell structure of the leaves, on the other hand, strongly reflects near-infrared light (from $0.7$ to $1.1\mu m$). The more leaves a plant has, the more these wavelengths of light are affected, respectively.
Normalized Difference Vegetation Index (NDVI)
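The index is commonly defined as $\text{NDVI}=\frac{\text{NIR}-\text{Red}}{\text{NIR}+\text{Red}}$, which maps two bands to a single feature in $[-1, 1]$. The sketch below applies this definition to hypothetical reflectance values; the numbers are made up for illustration and are not measurements.

```python
import numpy as np

def ndvi(red, nir, eps=1e-12):
    """Normalized Difference Vegetation Index from red and near-infrared reflectance."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    # eps only guards against a division by zero on fully dark pixels.
    return (nir - red) / (nir + red + eps)

# Hypothetical reflectances for three pixels: dense vegetation, bare soil, water.
red = np.array([0.05, 0.25, 0.04])
nir = np.array([0.50, 0.30, 0.02])
index = ndvi(red, nir)
```

Vegetation absorbs red light and reflects near-infrared light, so it yields the highest index of the three pixels.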
Example
Exploratory analysis
Covariance matrix
As soon as the bands are highly correlated, we can reduce the number of dimensions while keeping part of the information.
What is the definition of the covariance matrix?
It is what lets us visualize how the bands depend on one another.
In the image above, the variables roughly between $80$ and $100$ have a relatively high correlation.
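For reference, the sample covariance of centered data $X\in\mathbb R^{b\times n}$ (one pixel per column) is $\Sigma=\frac{1}{n-1}XX^T$, and the correlation matrix is obtained by dividing each entry by the product of the two band standard deviations. The sketch below computes both on synthetic data in which all bands share one underlying signal, an assumption made only to obtain strongly correlated bands.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bands, n_pixels = 6, 2000
# Synthetic spectra: every band is a noisy multiple of one shared signal.
shared = rng.normal(size=n_pixels)
gains = np.linspace(1.0, 2.0, n_bands)[:, None]
X = gains * shared + 0.1 * rng.normal(size=(n_bands, n_pixels))

Xc = X - X.mean(axis=1, keepdims=True)   # center each band
cov = Xc @ Xc.T / (n_pixels - 1)         # (bands x bands) covariance matrix
std = np.sqrt(np.diag(cov))              # per-band standard deviation
corr = cov / np.outer(std, std)          # correlation matrix, entries in [-1, 1]
```

The results agree with `np.cov(X)` and `np.corrcoef(X)`, which treat each row as a variable by default.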
Correlation matrix
If we display the values on the diagonal of the matrix:
Feature extraction
Eigen decomposition of the covariance matrix:
\[\Sigma = \phi \Lambda \phi^T\]
with $\Lambda$ the matrix of eigenvalues (values only on the diagonal) and $\phi$ the matrix of eigenvectors
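A minimal numerical sketch of this decomposition, on synthetic data with deliberately unequal band variances (an assumption for illustration). `np.linalg.eigh` is used because the covariance matrix is symmetric; it returns eigenvalues in ascending order, so they are re-sorted here.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bands, n_pixels = 5, 1000
scales = np.array([5.0, 3.0, 1.0, 0.5, 0.1])[:, None]
X = rng.normal(size=(n_bands, n_pixels)) * scales
X = X - X.mean(axis=1, keepdims=True)
cov = X @ X.T / (n_pixels - 1)

eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]              # largest eigenvalue first
eigvals, phi = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()            # share of variance per component
reconstructed = phi @ np.diag(eigvals) @ phi.T # should reproduce cov
```

The vector `explained` indicates how many components are needed to retain a given share of the variance.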
Principal Component Analysis
Application
Denoising
Test
Let us consider the data $X\in\mathbb R^{b\times n}$, with $n$ samples of $b$ bands, centered at the origin. The matrix $\Phi=[\phi_1,\dots,\phi_d]$ is composed of $d\lt b$ eigenvectors extracted from the $b\times b$ covariance matrix $\Sigma$ of the data $X$
Which transformation would you apply to the data for denoising based on the concepts seen so far ?
- $Y=X_{[1:d,:]}$
- $Y=\Sigma X$
- $Y=\Phi X$
- $Y=\Phi^T X$
- $Y=\Phi\Phi^T X$
- $Y=\Phi^T\Phi\Phi^T X$
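One way to reason about the candidates is through their shapes: with $\Phi\in\mathbb R^{b\times d}$, $\Phi^TX$ is a $d\times n$ reduced representation, while $\Phi\Phi^TX$ projects the data onto the leading subspace and returns to the original $b\times n$ band space. The sketch below checks this reading on synthetic data; the low-rank signal and the white-noise level are assumptions made for the experiment.

```python
import numpy as np

rng = np.random.default_rng(2)
b, n, d = 50, 2000, 3
# Synthetic data: d latent sources spread over b bands, plus white noise.
mixing = rng.normal(size=(b, d))
clean = mixing @ rng.normal(size=(d, n))
clean -= clean.mean(axis=1, keepdims=True)
noisy = clean + 0.5 * rng.normal(size=(b, n))
noisy -= noisy.mean(axis=1, keepdims=True)

cov = noisy @ noisy.T / (n - 1)          # b x b covariance of the bands
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
Phi = eigvecs[:, ::-1][:, :d]            # b x d matrix of leading eigenvectors

scores = Phi.T @ noisy                   # d x n: dimensionality reduction
denoised = Phi @ scores                  # b x n: projection back to band space

err_before = np.mean((noisy - clean) ** 2)
err_after = np.mean((denoised - clean) ** 2)
```

In this setting the projection discards most of the noise, which is spread over all $b$ directions, while keeping the signal, which lives in only $d$ of them.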
Spectral Mixture Analysis
Spectral mixing
Linear mixing model
\[x=\sum_{k=1}^ma_ks_k+e=Sa+e\]
- $x$: Spectrum of a pixel
- $a$: Coefficients in the mixture (abundance)
- $S$: Spectra of the sources of the mixture (endmembers)
- $e$: Noise
Constraints:
- Sum to $1$: $\sum_{k=1}^m a_k = 1$
- Non-negativity: $a_k\ge 0$
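The model and its two constraints can be simulated in a few lines. In this sketch the endmember spectra are random numbers standing in for real library spectra, and the abundances are drawn from a Dirichlet distribution, which satisfies both constraints by construction. Both choices are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n_bands, n_endmembers = 100, 3
S = rng.random((n_bands, n_endmembers))   # columns: hypothetical endmember spectra

a = rng.dirichlet(np.ones(n_endmembers))  # abundances: non-negative, summing to 1
e = 0.01 * rng.normal(size=n_bands)       # additive noise
x = S @ a + e                             # matrix form of the mixing model

# Same pixel written as the explicit sum over endmembers.
x_sum_form = sum(a[k] * S[:, k] for k in range(n_endmembers)) + e
```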
Geometrical interpretation
Let us take a very simple case:
\[\begin{cases} x = a_1s_1 + a_2s_2\\ a_1+a_2 = 1 \end{cases}\]
From a representation point of view, if we consider the vectors $s_1$ and $s_2$, all the values of $x$ defined by the equation above (with $a_1,a_2\ge 0$) lie on the segment $s_1\leftrightarrow s_2$
Endmember determination technique
Principles:
- Endmembers are the vertices of the simplex $\to$ find extrema when projecting the data on a line
- The convex hull of the data encloses the simplex $\to$ find endmembers so as to maximize the volume
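The first principle can be sketched as follows: project all pixels on random lines and count how often each pixel is an extremum. This is only an illustration of the idea, under the strong assumptions that the data are noise-free and that each endmember appears as a pure pixel in the scene; it is not a specific published algorithm.

```python
import numpy as np

rng = np.random.default_rng(5)
n_bands, m, n_mixed = 20, 3, 500
S = rng.random((n_bands, m))                          # hypothetical endmembers
A_mixed = rng.dirichlet(np.ones(m), size=n_mixed).T   # m x n_mixed abundances
X = np.hstack([S, S @ A_mixed])                       # first m columns are pure pixels

counts = np.zeros(X.shape[1], dtype=int)
for _ in range(200):
    w = rng.normal(size=n_bands)     # random projection direction
    proj = w @ X                     # scalar projection of every pixel
    counts[np.argmax(proj)] += 1     # extrema of a linear projection
    counts[np.argmin(proj)] += 1     # are reached at vertices of the simplex

candidates = np.argsort(counts)[::-1][:m]   # most frequently extreme pixels
```

With noisy data or without pure pixels, the extreme pixels are only approximations of the endmembers.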
Abundance
- If the endmembers are available: Solve a minimization problem of the form:
- If the endmembers are not available: Use alternating minimization techniques (e.g., Non-negative matrix factorization)
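When the endmembers are known, one common form of the problem is $\min_a \frac12\Vert x-Sa\Vert_2^2$ subject to $a_k\ge 0$ and $\sum_k a_k=1$. The sketch below solves it by projected gradient descent on the simplex. The solver choice, the function names and the synthetic spectra are assumptions for illustration; dedicated constrained least-squares solvers are normally preferred in practice.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def estimate_abundances(x, S, n_iter=2000):
    """Projected gradient descent on 0.5 * ||x - S a||^2 over the simplex."""
    m = S.shape[1]
    a = np.full(m, 1.0 / m)                  # start from the uniform mixture
    step = 1.0 / np.linalg.norm(S, 2) ** 2   # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = S.T @ (S @ a - x)
        a = project_to_simplex(a - step * grad)
    return a

rng = np.random.default_rng(4)
S = rng.random((100, 3))                     # hypothetical endmember spectra
a_true = np.array([0.6, 0.3, 0.1])
x = S @ a_true + 0.01 * rng.normal(size=100)
a_hat = estimate_abundances(x, S)
```

The projection step enforces both constraints at every iteration, so `a_hat` is always a valid abundance vector.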
Hyperspectral in nature
Mantis shrimp visual system
- $12$ different types of color photoreceptors
- see in the UV, VIS and NIR spectral domains
- $3$ focal points per eye ($6$ in total, we have $2$)
- see polarized light (linear vs circular)