Scope of this course
Apply Machine Learning (ML) techniques to solve some practical Computer Vision (CV) problems
- About Computer Vision (CV)
- It should be called CV-ML, ML4CV or so…
We need some definitions:
- What is Computer Vision? What is Pattern Recognition? Shape Recognition?
- What is Machine Learning?
- How do those concepts relate to each other?
Agenda for lecture 1
- Some definitions and basic notions
- Course outline
- Introduction to Twin it!
- Pattern Matching
Some definitions
Computer Vision
Definition: The automation of visual tasks with the goal of producing results directly or indirectly usable by humans
- Input: image(s) in machine format (image acquisition is a subpart of CV)
- Output: some pieces
Example
How would you process image pixels to get those results?
Cat pictures on the Internet matter
- Some applications are direct (like the insect recognition app):
- a human reads and uses the output
- Some applications are indirect (like bank check reading):
- The output is fed to a business system
- Some applications extend what humans can naturally do
- Either by extending our range
Pattern Recognition
Definition: “The field of pattern recognition is concerned with the automatic discovery of regularities in data through the use of computer algorithms and with the use of these regularities to take actions such as classifying the data into different categories.”
Bishop, 2006
IAPR: pattern recognition, computer vision and image processing in a broad sense
Examples
- OCR
- Computer vision
- Pedestrian detection
- Computer Vision
- Credit fraud detection
- Not computer vision
$\Rightarrow$ CV$\cap$PR$\neq\emptyset$
Pattern Recognition is an inverse problem
OCR example - Why Pattern Recognition is hard
“Shapes”
Definition: A way to designate meaningful visual patterns.
Sometimes used to describe “visual percepts”
Let S and S’ be 2 shapes observed in 2 different images which happen to be similar.
Some statistics can help us make better decisions…
Idea: learn the distance threshold under which shapes can be deemed identical
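One way to make this idea concrete, as a minimal sketch: assuming we have distances between pairs of shapes together with labels saying whether each pair really shows the same shape, we can pick the threshold that makes the fewest wrong decisions. The function name `learn_threshold` and the toy values are hypothetical and only serve the illustration.

```python
import numpy as np

def learn_threshold(distances, same):
    """Return the threshold t minimizing the errors of the rule 'd <= t means same shape'."""
    distances = np.asarray(distances, dtype=float)
    same = np.asarray(same, dtype=bool)
    # Candidate thresholds: midpoints between consecutive sorted distances,
    # plus one value below the minimum and one above the maximum.
    d = np.sort(distances)
    candidates = np.concatenate(([d[0] - 1.0], (d[:-1] + d[1:]) / 2.0, [d[-1] + 1.0]))
    # For each candidate, count the decisions that disagree with the labels.
    errors = [np.sum((distances <= t) != same) for t in candidates]
    return float(candidates[int(np.argmin(errors))])

# Toy, made-up data: small distances for identical shapes, larger ones otherwise.
distances = [0.05, 0.10, 0.12, 0.40, 0.55, 0.70]
same = [True, True, True, False, False, False]
threshold = learn_threshold(distances, same)
print(threshold)
```

On real data the two groups usually overlap, so the selected threshold is a compromise between missed matches and false matches.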
Machine Learning
Many forms of Machine Learning
- Focus on inductive learning (generalize from examples)
- We will consider both supervised (a “teacher” provides labels for examples) and unsupervised (only samples)
- Focus on optimization-based learning techniques (examples are represented as numerical vectors)
Examples of optimization-based learning techniques
- Linear classifiers, SVMs
- Neural networks
(“Statistical”) Machine Learning
Learning means changing in order to be better (according to a given criterion) when a similar situation arrives. Learning IS NOT learning by heart: any computer can learn by heart; the difficulty is to generalize a behavior to a novel situation.
Quoting S. Bengio
From an engineer’s POV
Machine Learning is about building programs with tunable parameters (typically an array of floating point values) that are adjusted automatically so as to improve their behavior by adapting to previously seen data. Machine Learning can be considered a subfield of AI since those algorithms can be seen as building blocks to make computers learn.
Scikit-Learn documentation
Why is learning difficult?
Given a finite amount of training data, you have to derive a relation for an infinite domain. In fact, there is an infinite number of such relations.
Which relation is the most appropriate?
… the hidden test points…
Learning bias
It is always possible to find a model complex enough to fit all the examples. But how would this help us with new samples? Such a model is unlikely to generalize well. We need to define a family of acceptable solutions to search from: this forces the model to learn a “smoothed” representation.
So in practice we need
- Examples (data!)
- A tunable algorithm (model)
- An evaluation of the model's fit to the examples (risk, loss)
- A definition of the model search space (not too big, not too small)
- An optimization strategy
The bias/variance compromise
Small search space:
- Easier to find the best (available) solution
- But it may be far from the ideal one
Large search space:
- It is hard to find the best (available) solution
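The effect of the search-space size can be illustrated with polynomial regression, where the degree plays the role of the size of the search space. This is only a sketch under arbitrary assumptions: the sine target, the noise level and the degrees are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # Hypothetical "true" relation, unknown to the learner.
    return np.sin(2 * np.pi * x)

# A small noisy training set and a larger noisy held-out set.
x_train = np.linspace(0.0, 1.0, 12)
y_train = target(x_train) + rng.normal(scale=0.2, size=x_train.shape)
x_test = rng.uniform(0.0, 1.0, 200)
y_test = target(x_test) + rng.normal(scale=0.2, size=x_test.shape)

def mse(coeffs, x, y):
    """Mean squared error of the polynomial `coeffs` on the samples (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (1, 3, 9):
    # Least-squares fit within the family of polynomials of this degree.
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(degree, results[degree])
```

Because each family contains the previous one, the training error can only go down as the degree grows. The held-out error is the one to watch: it tells whether the extra flexibility actually helps on new samples.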
3 kinds of problems
Regression
\[x=\underbrace{\begin{pmatrix} \vdots \end{pmatrix}}_{\in\mathbb R^T}\\ y=\underbrace{\begin{pmatrix} \vdots \end{pmatrix}}_{\in\mathbb R^5}\]
Classification
\[x\in\mathbb R^5\\ y\in\mathbb R^T\]
Density estimation
\[x\in\mathbb R^5\\ \mathbb P(x)\in[0,1]\]
3 types of learning
- Supervised learning $(x,y)$
- The training data contains the desired behavior (desired class, outcome, etc.)
- Reinforcement learning $(x,\tilde y)$
- The training data contains partial targets (for instance, simply whether the machine did well or not)
- Unsupervised learning
- The training data is raw, no class or target is given
- There is often a hidden goal in that task (compression, maximum likelihood, etc.)
Model validation
More on that later
- You need to test the generalization power of your approach
- So you need data not seen during the training: a test set
- For which you know the expected output (“ground-truth”, “gold standard”, “target”,…)
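A hold-out evaluation can be sketched as follows. Everything here is an illustrative assumption: the toy data (two Gaussian blobs), the 25% split and the nearest-centroid classifier are not prescribed by the course.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-class toy data: two Gaussian blobs in 2D.
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(3.0, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

# Shuffle, then hold out 25% of the samples as a test set.
order = rng.permutation(len(X))
n_test = len(X) // 4
test_idx, train_idx = order[:n_test], order[n_test:]

# "Training": one centroid per class, computed on the training set only.
centroids = np.stack([X[train_idx][y[train_idx] == c].mean(axis=0) for c in (0, 1)])

# Prediction: label of the nearest centroid.
dists = np.linalg.norm(X[test_idx][:, None, :] - centroids[None, :, :], axis=2)
y_pred = np.argmin(dists, axis=1)

# Compare the predictions with the ground truth of the held-out samples.
test_accuracy = float(np.mean(y_pred == y[test_idx]))
print(f"test accuracy: {test_accuracy:.2f}")
```

The key point is that the test samples never take part in computing the centroids, so the accuracy estimates behavior on unseen data.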
Benefits of ML
A duck example
How can we filter out the grass and keep only the duck shape using thresholds?
Why use Machine Learning in Computer Vision?
To avoid manual knob turning: it is complex and unsafe
But beware of the Machine Learning Magic
Actual goals of this course
- Teach you that you can (and should whenever possible) optimize the parameters of your CV/PR product
- Show some simple tools to try to do it
- Address practical problems:
- describe a pattern
- look for a pattern
- match a pattern
- classify a pattern
- describe a set of patterns (an object/an image)
- retrieve an object given a query, segment objects…
- and face the unavoidable work surrounding them
Course agenda
6 “weeks” (Friday to Friday)
See the web page for the complete agenda
Weekly tests + assignments (practice sessions). No final exam
Weekly workflow should be:
- Friday, 09:30-10:00: answer the weekly quiz on Moodle (starting next Friday)
- Friday, 10:00-12:00: attend the lecture using Teams
- Friday, 14:00-17:00: Work on the practice session and join the discussion using Teams
- Before next Friday: Complete the assignment and submit your results using Moodle (for sessions 4, 5 and 6 only)
No deep learning !
- We need a course about basic techniques
- There are cases where setting up
Practice sessions: set up your dev. env.
Basically: Python with:
- Jupyter
- Numpy
- Matplotlib
- Scikit-image: RGB channel order
- Scikit-learn
- OpenCV: BGR channel order
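Scikit-image and OpenCV store color channels in opposite orders (RGB vs. BGR), which matters as soon as images move from one library to the other. A minimal NumPy-only sketch of the conversion, using a made-up two-pixel image:

```python
import numpy as np

# A tiny 1x2 "image" in RGB order: one pure red pixel, one pure blue pixel.
rgb = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the R and B channels (RGB <-> BGR).
bgr = rgb[..., ::-1]

# Applying the same reversal again restores the original order.
rgb_again = bgr[..., ::-1]

print(bgr[0, 0], rgb_again[0, 0])
```

With OpenCV available, `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` performs the same conversion.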
Why I love Scikit-Learn
Numpy-friendly
3-way documentation: User guide, API ref, Examples
Super smart API
Decomposition, level of detail, default values, consistency, etc
Introduction to Twin it!
Overview
A poster game
- $X$ bubbles, all different but
- $Y$ bubbles, which have 1 (and only 1) twin
Your goals:
- Find the pairs
Discussion (3 minutes):
- How can we decompose the problem?
- How can we make sure our solution works?
- What should we focus on?
Already done:
- Scan the poster
- Stitch the tiles
- Normalize the contrast
Underlying problems
- Isolate each bubble $\Rightarrow$ Segmentation
- We provide pre-computed results for this step
- Compare image pairs $\Rightarrow$ Matching
- We will focus on this one
- We will use Template Matching
- Identify pairs $\Rightarrow$ Calibration
Template matching
Why template matching?
A simple method which will be useful to understand
- Evaluation challenges
- The ideas behind keypoint detection (next lecture)
It can work on the Twin it! case
- Twice the same texture
- Textures are at the same scale, with no rotation or intensity change
- Only need to cope with translation (and some small noise)
Step by step: Compare 2 images
- 2 arrays of intensities
- Take the absolute difference
- Sum the differences
(Opt.) Normalize so the result belongs to $[0,1]$
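These steps can be sketched with NumPy as below. The function name `image_distance` is hypothetical, and the normalization assumes 8-bit intensities in $[0,255]$, which the notes do not state.

```python
import numpy as np

def image_distance(a, b):
    """Sum of absolute differences of two same-shaped images, scaled to [0, 1]."""
    # Cast to float first: subtracting uint8 arrays would wrap around.
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    # Sum of absolute differences (SAD) over all pixels.
    sad = np.abs(a - b).sum()
    # Normalize by the largest possible sum (assumption: intensities in [0, 255]).
    return float(sad / (a.size * 255.0))

black = np.zeros((4, 4), dtype=np.uint8)
white = np.full((4, 4), 255, dtype=np.uint8)
print(image_distance(black, black), image_distance(black, white))
```

Identical images give 0 and maximally different images give 1, so a single threshold can be compared across image sizes.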
Template Matching: Sliding comparison
- $I_1$ is a small template $T$ to match against $I_2$ (simply $I$ from now on)
- We rewrite the preceding formula to compute a map $R$ of the shape of $I$
- Each pixel $(x,y)$ of $R$ holds the value of the SSD obtained when the top-left pixel of $T$ is placed on pixel $(x,y)$ of $I$
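A naive sliding comparison can be sketched as follows. Assumptions: the function name `match_template_ssd` is hypothetical, the images are single-channel, and $R$ only covers positions where $T$ fits entirely inside $I$, so it is slightly smaller than $I$. Replacing the squared difference with an absolute difference gives the SAD variant.

```python
import numpy as np

def match_template_ssd(image, template):
    """Return R where R[y, x] is the SSD between `template` and the window
    of `image` whose top-left corner is at row y, column x."""
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    th, tw = template.shape
    ih, iw = image.shape
    # Only positions where the template fits entirely inside the image.
    result = np.empty((ih - th + 1, iw - tw + 1))
    for y in range(result.shape[0]):
        for x in range(result.shape[1]):
            window = image[y:y + th, x:x + tw]
            result[y, x] = np.sum((window - template) ** 2)
    return result

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(20, 30))
template = image[5:10, 12:18]      # cut the template out of the image itself
R = match_template_ssd(image, template)
best_y, best_x = np.unravel_index(np.argmin(R), R.shape)
print(best_y, best_x)
```

The minimum of $R$ marks the best match. The double loop is slow on large images; library implementations such as OpenCV's `matchTemplate` or scikit-image's `match_template` are the practical choice.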
Several approaches $\Leftrightarrow$ Practice session
About the denominator
Cross correlation: 2 things to know
More robust to intensity shift
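The robustness to intensity shift can be checked on a toy example. This sketch assumes the zero-mean normalized variant of cross-correlation, which may differ from the exact formula used in the lecture; the name `zncc` and the random patch are made up.

```python
import numpy as np

def zncc(a, b):
    """Zero-mean normalized cross-correlation of two same-shaped patches, in [-1, 1]."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Removing the mean cancels any constant intensity offset.
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    if denom == 0:
        return 0.0  # at least one patch is constant: correlation is undefined
    return float(np.sum(a * b) / denom)

rng = np.random.default_rng(0)
patch = rng.uniform(0, 200, size=(8, 8))
shifted = patch + 40.0                      # same texture, brighter
sad = float(np.abs(patch - shifted).sum())  # large although the texture is identical
score = zncc(patch, shifted)                # stays close to 1
print(sad, score)
```

Note that, unlike a difference-based measure, a higher score means a better match.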
Ideal goal
For each bubble, return only a matching pair, if it exists
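One possible way to reach this goal, given a matrix of distances between all bubbles, is to keep only mutual nearest neighbours whose distance is below a threshold. This is an assumption for illustration, not the method prescribed by the course; the function name `find_twins`, the distance values and the threshold are made up.

```python
import numpy as np

def find_twins(dist, threshold):
    """Return (i, j) pairs, i < j, that are mutual nearest neighbours and
    closer than `threshold`. `dist` is a symmetric (n, n) distance matrix."""
    d = np.array(dist, dtype=float)   # copy, so the caller's data is untouched
    np.fill_diagonal(d, np.inf)       # a bubble is never its own twin
    nearest = np.argmin(d, axis=1)    # best candidate for each bubble
    pairs = []
    for i, j in enumerate(nearest):
        j = int(j)
        if i < j and nearest[j] == i and d[i, j] < threshold:
            pairs.append((i, j))
    return pairs

# Made-up distances between 4 bubbles: 0 and 2 are twins, 1 and 3 are unique.
dist = [[0.0, 0.8, 0.1, 0.9],
        [0.8, 0.0, 0.7, 0.6],
        [0.1, 0.7, 0.0, 0.9],
        [0.9, 0.6, 0.9, 0.0]]
twins = find_twins(dist, threshold=0.3)
print(twins)
```

Bubbles 1 and 3 are each other's nearest neighbour, but their distance exceeds the threshold, so they are correctly reported as having no twin.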