# Learning paradigms

Compare learning paradigms in NLP.

1h20 per week, for 4 weeks

## Learning paradigms

Learn powerful representations

**Theory**: linear algebra, NMF, SVD, spectral decomposition.

- Supervised learning (linear models, LDA/QDA, naive Bayes, logistic regression, random forests, MLP, SVM, kernel methods)
- Unsupervised learning, e.g., clustering (see the ML1 and ML2 courses), PCA, ICA, t-SNE… Bag of words, TF-IDF, pLSI (document embeddings)
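
To make the bag-of-words/TF-IDF representation above concrete, here is a minimal sketch of TF-IDF weighting from scratch (the `tfidf` helper and the toy corpus are illustrative, not from the course; libraries such as scikit-learn provide production versions):

```python
import math
from collections import Counter

def tfidf(docs):
    """Compute sparse TF-IDF vectors (dicts) for a list of tokenized documents."""
    n = len(docs)
    # Document frequency: in how many documents each term appears
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    # Weight = term frequency * log inverse document frequency
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

docs = [["the", "cat", "sat"], ["the", "dog", "ran"], ["the", "cat", "ran"]]
vecs = tfidf(docs)
# "the" appears in every document, so its IDF (hence its weight) is 0
```

Note how the IDF term downweights ubiquitous words, which is exactly why TF-IDF beats raw counts for document retrieval.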

## Other learning paradigms

- Semi-supervised learning, contrastive learning (cPCA, RBM), reinforcement learning, self-supervised learning, curiosity-driven learning, few-shot learning, active learning, federated learning, online learning… Where should the effort go: the model design, the problem to solve, or the representation/task to learn?
- Generative vs. discriminative models
- Parametric vs. non-parametric models
- Other tools: optimal transport (OT), ODEs
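
To illustrate the generative vs. discriminative distinction above, here is a minimal sketch of a *generative* classifier: it models the class-conditional density $p(x \mid y)$ (here, a 1-D Gaussian per class) and classifies through Bayes' rule, whereas a discriminative model such as logistic regression would fit $p(y \mid x)$ directly. The data and helper names are illustrative assumptions:

```python
import math

def fit_gaussian(xs):
    """Maximum-likelihood mean and variance of a 1-D sample."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, var

def gaussian_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Fit one Gaussian per class (toy 1-D data, equal class priors)
class0 = [0.9, 1.0, 1.1, 1.2]
class1 = [2.9, 3.0, 3.1, 3.2]
params = [fit_gaussian(class0), fit_gaussian(class1)]

def predict(x):
    # Bayes' rule with uniform prior: argmax over class likelihoods
    scores = [gaussian_pdf(x, mu, var) for mu, var in params]
    return scores.index(max(scores))
```

Because the generative model learns $p(x \mid y)$, it can also sample new points or flag outliers, which a purely discriminative boundary cannot.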

## Why/when deep learning?

- CNN (logarithmic), RNN (linear), attention models, BERT (quadratic)
- Limits of current models (lack of intrinsic uncertainty, interpolation in latent spaces)
- Learning to repeat, reformulate, or predict a word from its context… the task influences the learned representations
- Semantic similarity: cosine, Manhattan, Kullback–Leibler, Wasserstein-1 (OT, combinatorial complexity). Information theory: Shannon (encoding) vs. Fisher (parameters)
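
Of the similarity measures above, cosine similarity is the workhorse for comparing embeddings; a minimal sketch (the toy vectors are illustrative):

```python
import math

def cosine(u, v):
    """Cosine similarity: dot product of u and v divided by their norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

cosine([1.0, 0.0], [2.0, 0.0])  # 1.0: same direction, scale ignored
cosine([1.0, 0.0], [0.0, 1.0])  # 0.0: orthogonal vectors
```

Unlike Manhattan distance, cosine is invariant to vector magnitude, which matters when document lengths vary; Wasserstein-1 (word mover's distance, cf. Kusner et al.) is finer-grained but combinatorially more expensive.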

- Can simple preprocessing + ranking solve your problem?
- Is it the solution or the problem that is wrong? Quote Einstein + Feynman.

**Use cases**:

- Deduplicate a database, build a search/recommendation API… (FAQ)
- Regulatory, media & political feedback
- Summary (models, hypotheses, limits)
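
As an example of "simple preprocessing + ranking", database deduplication can often be sketched with nothing more than Jaccard similarity on token sets; the threshold, helper names, and toy records below are illustrative assumptions, not the course's method:

```python
def jaccard(a, b):
    """Jaccard similarity between the token sets of two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def deduplicate(records, threshold=0.8):
    """Greedily keep a record only if it is not near-duplicate of a kept one."""
    kept = []
    for r in records:
        if all(jaccard(r, k) < threshold for k in kept):
            kept.append(r)
    return kept

records = ["open the pod bay doors",
           "Open the pod bay doors",     # case-only duplicate, dropped
           "what is the airspeed velocity"]
deduplicate(records)  # keeps the first and third records
```

For larger databases one would replace the quadratic pairwise scan with blocking or locality-sensitive hashing, but the ranking idea is the same.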

## From language to socio dynamics

- Behavioral psychology

**Use case**: Diversity & inclusion. Online harassment. Twitter. Amnesty.

**Use case**: Speech therapists (orthophonistes)

The general form of the **normal** probability density function is:

$$ f(x) = \frac{1}{\sigma \sqrt{2\pi} } e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} $$
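The density above is straightforward to transcribe into code; a minimal sketch (the function name is illustrative, and `scipy.stats.norm.pdf` offers the same computation):

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density f(x) = exp(-((x-mu)/sigma)^2 / 2) / (sigma * sqrt(2*pi))."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

normal_pdf(0.0)  # ≈ 0.3989, the peak of the standard normal at x = mu
```

The density is symmetric around $\mu$ and its peak height $1 / (\sigma\sqrt{2\pi})$ shrinks as $\sigma$ grows, since the total area must stay 1.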

## Quiz

### What is the parameter $\mu$?

The parameter $\mu$ is the mean or expectation of the distribution.
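Since $\mu$ is the mean, the sample average of draws from the distribution should converge to it (law of large numbers); a quick check, with illustrative values of $\mu$ and $\sigma$:

```python
import random

random.seed(0)
mu, sigma = 5.0, 2.0
# Draw many samples from N(mu, sigma^2); the sample mean estimates mu
samples = [random.gauss(mu, sigma) for _ in range(100_000)]
estimate = sum(samples) / len(samples)
```

With $10^5$ samples the standard error is $\sigma / \sqrt{n} \approx 0.006$, so the estimate lands very close to 5.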

## Reference

Michel Deudon. Learning Semantic Similarity in a Continuous Space. Advances in Neural Information Processing Systems, vol. 31, 2018.

Gabriel Peyré and Marco Cuturi. Computational Optimal Transport. arXiv:1803.00567, 2018.

Chloé Clavel. Traitement automatique du langage naturel et fouille d'opinions (Natural language processing and opinion mining).

Michalis Vazirgiannis. INF554 - Machine Learning I. 2016.

Matt Kusner et al. From Word Embeddings to Document Distances. International Conference on Machine Learning, PMLR, 2015.

Christopher Manning and Anna Goldie. CS224n. Stanford. 2000.