Machine Learning Algorithms – Summary

Characteristics of Machine Learning Algorithms

Discriminative / Generative algorithms

Many machine learning algorithms are based on probabilistic models. Discriminative algorithms learn the conditional probability distribution of y given x, p(y|x). Generative algorithms learn the joint probability distribution p(y,x) = p(y|x) * p(x), and therefore take into consideration the distribution of x.
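As a minimal sketch of the difference, the snippet below estimates both distributions from counts. The dataset, the binary variables, and the `predict` helper are hypothetical and only serve the illustration; real algorithms fit parametric forms of these distributions rather than raw count tables.

```python
from collections import Counter

# Hypothetical toy dataset of (x, y) pairs; both variables are binary.
data = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 1), (1, 0), (0, 0)]
n = len(data)

# Generative view: estimate the joint distribution p(y, x) from counts.
joint = {pair: count / n for pair, count in Counter(data).items()}

# Marginal p(x), obtained by summing the joint over y.
p_x = {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p

# Discriminative view: only the conditional p(y | x) is needed to predict.
cond = {(x, y): p / p_x[x] for (x, y), p in joint.items()}


def predict(x):
    """Return the most probable y under the estimated p(y | x)."""
    return max((0, 1), key=lambda y: cond.get((x, y), 0.0))


# The factorization p(y, x) = p(y | x) * p(x) holds for every observed pair.
for pair, p in joint.items():
    assert abs(p - cond[pair] * p_x[pair[0]]) < 1e-12
```

The conditional table is enough for prediction, while the joint table additionally tells us how likely each x is, which is the extra information a generative model carries.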

Parametric & Non-Parametric algorithms

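Parametric algorithms summarize the training data in a parameter vector of fixed size (ex. the weights of a linear or logistic regression). Neural networks, SVMs,… are treated here as non-parametric algorithms because the size of their parameter vector is not fixed in advance: depending on the richness of the data, we can choose the size of the parameter vector that we need (ex. we can add hidden units, change SVM kernels,…).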
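To make the "size of the parameter vector" concrete, here is a small sketch that counts parameters. It assumes a fully connected network with a single hidden layer and one bias per unit; the function names are made up for this example.

```python
def linear_parameter_count(n_inputs):
    """Linear regression: one weight per feature plus an intercept."""
    return n_inputs + 1


def mlp_parameter_count(n_inputs, n_hidden, n_outputs):
    """Weights and biases of a network with one fully connected hidden layer."""
    hidden = (n_inputs + 1) * n_hidden    # weights plus one bias per hidden unit
    output = (n_hidden + 1) * n_outputs   # weights plus one bias per output unit
    return hidden + output


# With 10 input features, the linear model always has 11 parameters,
# while the network grows with the number of hidden units we choose.
for n_hidden in (2, 8, 32):
    print(n_hidden, mlp_parameter_count(10, n_hidden, 1))
```

Once the number of hidden units is chosen, the count is fixed as well; the point made above is that this size is a design decision that can follow the richness of the data.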

Capacity

The capacity of a model is its ability to fit a wide variety of functions. It is controlled by parameters called hyperparameters (ex. the degree p of a polynomial predictor, the kernel choice in an SVM, the number of layers in a neural network).
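The effect of the degree p can be sketched with a small least-squares polynomial fit. The data below is hypothetical (a cubic trend without noise), and the solver is a plain normal-equations implementation written for this example, not a recommendation for production use.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (A^T A) w = A^T y."""
    n = degree + 1
    # Normal-equation matrix and right-hand side built from powers of x.
    m = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n):
                m[row][k] -= factor * m[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (b[i] - s) / m[i][i]
    return w  # w[i] is the coefficient of x**i


def training_error(xs, ys, w):
    """Mean squared error of the polynomial with coefficients w on (xs, ys)."""
    preds = [sum(c * x ** i for i, c in enumerate(w)) for x in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)


# Hypothetical data: a cubic trend sampled at 7 points.
xs = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
ys = [x ** 3 - x for x in xs]

# Training error for increasing values of the hyperparameter p.
errors = {p: training_error(xs, ys, fit_polynomial(xs, ys, p)) for p in (1, 2, 3)}
print(errors)
```

On this data, the degree-1 predictor cannot follow the cubic shape and keeps a clearly non-zero training error, while the degree-3 predictor fits the points essentially exactly. A low training error alone does not imply good generalization, which is why the degree is chosen as a hyperparameter rather than learned from the training error.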

Summary of Machine Learning Algorithms

The table below briefly describes each machine learning algorithm.

| Algorithm | Description | Characteristics |
|---|---|---|
| Linear regression | To use when Y is normally distributed. | Discriminative, Parametric |
| Logistic regression | To use when Y is Bernoulli-distributed. | Discriminative, Parametric |
| Multinomial logistic regression (softmax regression) | To use when Y is multinomially distributed. There are two formulations of the algorithm, one based on likelihood maximization and one based on cross-entropy minimization. | Discriminative, Parametric |
| Gaussian Discriminant Analysis | Supervised classification algorithm. To use when Y is Bernoulli-distributed and the conditional distribution of X given Y is multivariate Gaussian. | Generative, Parametric |
| Naive Bayes Algorithm | Supervised classification algorithm. To use when Y is Bernoulli-distributed, the conditional distribution of X given Y is Bernoulli, and the features of X are conditionally independent given Y. | Generative, Parametric |
| EM (Expectation-Maximization) | Unsupervised soft-clustering algorithm. | Generative, Parametric |
| Principal Component Analysis | Reduce the dimensionality of X. Calculate the eigenvectors of \(XX^T\). Use the k eigenvectors with the largest eigenvalues to transform the data: \(x^{(i)} := (u_1^T x^{(i)}, u_2^T x^{(i)}, \ldots, u_k^T x^{(i)})\). | |
| Factor Analysis | Reduce the dimensionality of X. To use when X is Gaussian. Transform X to a matrix Z with lower dimensionality (\(x \approx \mu + \Lambda z\)). | |
| Neural Network | Non-linear classifier. | Discriminative, Non-Parametric |
| K-Nearest Neighbor Regression | Predict y as the average of the values y1, y2, …, yk of the k nearest neighbors of x. | Discriminative, Non-Parametric |
| K-Nearest Neighbor Classifier | Predict y as the most common class among the k nearest neighbors of x. | Discriminative, Non-Parametric |
| Support Vector Machine | Find a separator that maximizes the margin between two classes. | Discriminative, Non-Parametric |
| K-means | Unsupervised hard-clustering algorithm. | Discriminative, Non-Parametric |
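The two k-nearest-neighbor entries of the table can be sketched in a few lines. This is a minimal illustration that assumes Euclidean distance and a brute-force search over the training set; the helper names and the toy training sets are hypothetical.

```python
from collections import Counter


def k_nearest(train, x, k):
    """Return the k training pairs (x_i, y_i) closest to x (Euclidean distance)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return sorted(train, key=lambda pair: sq_dist(pair[0], x))[:k]


def knn_regress(train, x, k):
    """Predict y as the average of the k nearest neighbors' values."""
    neighbors = k_nearest(train, x, k)
    return sum(y for _, y in neighbors) / len(neighbors)


def knn_classify(train, x, k):
    """Predict y as the most common class among the k nearest neighbors."""
    neighbors = k_nearest(train, x, k)
    return Counter(y for _, y in neighbors).most_common(1)[0][0]


# Hypothetical training sets: (feature tuple, target) pairs.
reg_train = [((0.0,), 1.0), ((1.0,), 2.0), ((2.0,), 3.0), ((10.0,), 20.0)]
clf_train = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
             ((5.0, 5.0), "b"), ((5.0, 6.0), "b")]

print(knn_regress(reg_train, (1.2,), k=3))
print(knn_classify(clf_train, (0.5, 0.5), k=3))
```

Both predictors keep the whole training set and have no training step, which matches the non-parametric label in the table: the "model" grows with the data.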
