
IML: Introduction

Link to the HackMD note

Motivation

What is learning?

It’s all about evolving

Can machines learn?

A new science with a goal and an object.

What is it good for?

According to Peter Norvig

The 3 main reasons why you may want to use Machine Learning:

  • Avoid coding numerous complex rules by hand
    • lower cost, more effective, faster reaction to changing problems
  • Optimize the parameters of your system given your own dataset
    • Better accuracy
  • Create systems for which you do not know the rules consciously (e.g. recognizing a face)
    • Greater potential

AI vs Machine Learning

  • AI is a very fuzzy concept, much like “any computer program doing something useful”
    • Think “if-then” rules
  • ML can be considered a subfield of AI, since its algorithms can be seen as building blocks that make computers learn to behave more intelligently by generalizing, rather than just storing and retrieving data items like a database system would do
  • Engineering point of view: ML is about building programs with tunable parameters (typically an array of floating-point values) that are adjusted automatically so as to improve their behavior by adapting to previously seen data (see the sketch below)
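To make this engineering view concrete, here is a minimal sketch (the toy data and hyperparameters are invented for illustration): a two-parameter linear model whose parameters are tuned by gradient descent on previously seen data.

```python
import numpy as np

# Toy data from a hypothetical relation y = 2x + 1, plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 2 * x + 1 + rng.normal(scale=0.1, size=100)

# The "program" is y_hat = w * x + b; w and b are its tunable parameters.
w, b = 0.0, 0.0
lr = 0.1  # learning rate (an arbitrary choice for this sketch)

for _ in range(200):
    y_hat = w * x + b
    # Gradients of the mean squared error with respect to w and b.
    w -= lr * 2 * np.mean((y_hat - y) * x)
    b -= lr * 2 * np.mean(y_hat - y)

print(w, b)  # should end up close to 2 and 1
```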

Machine Learning vs Deep Learning

Traditional Machine Learning

Deep Learning

AI vs ML vs DL

DL $\subset$ ML $\subset$ AI

Exercise

Machine Learning Examples

Can you list examples of projects or products involving Machine Learning?

  • Google Lens

Machine Learning Problem

Why is learning difficult?

Generalization is an ambiguous process.

Given a finite amount of training data, you have to derive a relation for an infinite domain. In fact, there is an infinite number of such relations.

How should we draw the relation?

Which relation is the most appropriate? Only the hidden test points (seen after training) can tell.
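A small sketch of this ambiguity (the points and ground truth are invented): several polynomials fit the same five training points, yet disagree on a hidden test point.

```python
import numpy as np

# Five training points sampled from a hypothetical ground truth.
x_train = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
y_train = np.sin(2 * np.pi * x_train)

x_test = 0.6  # a hidden test point, seen only after training

for degree in (1, 2, 4):  # degree 4 interpolates the 5 points exactly
    coeffs = np.polyfit(x_train, y_train, degree)  # least-squares fit
    train_err = np.abs(np.polyval(coeffs, x_train) - y_train).max()
    print(degree, train_err, np.polyval(coeffs, x_test))
    # All fit the training points reasonably well,
    # but their predictions at x_test differ.
```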

Learning bias

How to guide generalization

  • It is always possible to find a model complex enough to fit all the examples
    • Example: a polynomial of very high degree
  • But how would this help us with new samples?
    • It would not generalize well.
  • We need to define a family of acceptable solutions to search from
    • This forces the model to learn a “smoothed” representation…
    • …but it should not smooth the representation too much!

When many solutions are available for a given problem, we should select the simplest one. But what do we mean by simple? We use prior knowledge about the problem at hand to define what a simple solution is.

Example of a prior: smoothness
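One common way to encode a smoothness prior is to penalize large model coefficients. A sketch with scikit-learn's ridge regression (synthetic data, arbitrary hyperparameter values):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 1))
y = np.sin(3 * X[:, 0]) + rng.normal(scale=0.2, size=30)

# alpha penalizes large coefficients: a larger alpha yields a smoother
# (simpler) fitted curve, at the price of a worse fit on the train set.
for alpha in (1e-6, 1e-2, 1.0):
    model = make_pipeline(PolynomialFeatures(degree=10), Ridge(alpha=alpha))
    model.fit(X, y)
    print(alpha, model.score(X, y))  # training R^2 drops as alpha grows
```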

Learning as a search problem

Hypothesis space / initial, compatible (with train set), optimal, and ideal solutions

What are the sources of error?

Noise, intrinsic error

Your data is not perfect: it can have noisy or erroneous labels (“every model is wrong”). Even if an optimal underlying model exists, the observations are corrupted by noise.

(Inductive) bias, approximation error

We are exploring a restricted subset of all possible solutions. Your classifier needs to drop some information about the training set to have generalization power (simplify to generalize).

Variance, estimation error

There are many ways to explain your training dataset, and it is hard to find an optimal solution among all those possibilities. Our exploration is not very accurate: we are limited by the data we see during training.

Bias / variance compromise

  • Low bias $\Leftrightarrow$ high variance: large search set, can capture many useless details
    • overfitting
  • High bias $\Leftrightarrow$ low variance: small search set, limited exploration, solution too simple
    • underfitting
  • Solutions: regularization (penalize solutions that are too complex), early stopping (stop training when there is no more progress)… (see the sketch below)
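A sketch of the compromise (synthetic data, arbitrary degrees): as model complexity grows, the training error keeps shrinking while the test error eventually rises again.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(60, 1))
y = np.sin(3 * X[:, 0]) + rng.normal(scale=0.2, size=60)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):  # underfit -> reasonable -> overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    print(degree,
          mean_squared_error(y_tr, model.predict(X_tr)),  # keeps shrinking
          mean_squared_error(y_te, model.predict(X_te)))  # U-shaped
```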

Parameters of an ML problem

Many variations for each element

  • Protocol: supervision? feedback? how many samples for each “experience”?
  • Measure of success: error cost? convergence? …
  • Inputs (representation space): quality (noise, distribution) and nature (numerical, symbolical, mixed)
  • Solutions (hypothesis space / functions to explore): many approaches

Three kinds of ML problems

According to Samy Bengio

Regression

  • Input: samples described by several (correlated) input variables
  • Output: a quantitative variable (a scalar)

Regression, classification

  • Input: samples described by several (correlated) input variables
  • Output: a qualitative variable (a class, a category)

Regression, classification, density estimation

  • Input: samples described by several (correlated) input variables
  • Output: an estimate of the probability distribution function over the feature space
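The three kinds side by side, as a minimal scikit-learn sketch (random data for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))  # samples described by 2 input variables

# Regression: the target is a quantitative variable (a scalar).
y_reg = X @ np.array([1.0, -2.0]) + rng.normal(scale=0.1, size=100)
print(LinearRegression().fit(X, y_reg).predict(X[:3]))

# Classification: the target is a qualitative variable (a class).
y_cls = (X[:, 0] > 0).astype(int)
print(LogisticRegression(max_iter=1000).fit(X, y_cls).predict(X[:3]))

# Density estimation: no target; estimate p(x) over the feature space.
kde = KernelDensity(bandwidth=0.5).fit(X)
print(kde.score_samples(X[:3]))  # log-density at the first 3 samples
```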

Three kinds of supervision/trainings

According to LeCun and S. Bengio

  • Supervised learning: Training data contains the desired behavior — desired class, outcome, etc.
    • Medium feedback
  • Reinforcement learning: Training data contains partial targets — did the system do well or not? Is some object present in the image (without knowing its position)?
    • Weak feedback
  • Unsupervised/Self-supervised learning: Training data is raw; no class or target is given.
    • There is often a hidden goal in the task: compression, maximum likelihood, predicting parts from other parts (BERT-like)…
    • Lots of feedback
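A sketch contrasting the two extremes on the same synthetic dataset (reinforcement learning is omitted here, since it requires an interactive environment):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

X, y = make_blobs(n_samples=100, centers=3, random_state=0)

# Supervised: the training data contains the desired behavior (labels y).
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Unsupervised: raw data only; the hidden goal here is to compress the
# samples into 3 cluster assignments, without ever seeing y.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

print(clf.predict(X[:5]), km.labels_[:5])
```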

Forms of Machine Learning

According to Cornuejols and Miclet

  • Exploration-based: Generalization or specialization of rules
    • Examples: Grammatical inference, heuristic discovery for SAT solvers…
  • Optimization-based: Topic of this course.
    • Examples: linear separators and SVMs, neural networks, decision trees, Bayesian networks, HMMs…
  • Approximation-based: Data compression, analogy.
    • Examples: KNN, embedding spaces
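As an illustration of the approximation-based family, a minimal k-nearest-neighbors sketch (standard Iris data, arbitrary choice of k):

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# KNN stores the training samples and classifies by analogy: a new
# sample receives the majority label among its k nearest neighbors.
X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
print(knn.predict(X[:3]))
```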

Machine Learning Engineering

ML from an engineer point of view

Solve problems using the right tool

Some taxonomy

Simplified view of pre-2010 Machine Learning

Choosing the right tool

Why we love scikit-learn

Representing data
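For reference, scikit-learn conventionally represents a dataset as a 2-D array of shape (n_samples, n_features), plus a target vector for supervised tasks. A minimal sketch:

```python
import numpy as np

# X: one row per sample, one column per feature.
X = np.array([[5.1, 3.5],
              [4.9, 3.0],
              [6.2, 3.4]])  # 3 samples, 2 features each
# y: one target value per sample (for supervised tasks).
y = np.array([0, 0, 1])

assert X.shape == (3, 2) and y.shape == (3,)
```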


At the cross-roads of numerous fields

  • Signal processing
  • Databases, information retrieval
  • Statistics
  • Pattern Recognition
  • Optimization
  • Data science, data mining