
EVERYTHING YOU NEED TO KNOW ABOUT LINEAR DISCRIMINANT ANALYSIS

Dimensionality reduction techniques have become critical in machine learning because so many datasets today are high-dimensional. Linear discriminant analysis is one such technique, and an extremely popular one.

Linear Discriminant Analysis was developed as early as 1936 by Ronald A. Fisher. The original linear discriminant applied only to 2-class problems. It was only in 1948 that C.R. Rao generalized it to apply to multi-class problems.

What Is Linear Discriminant Analysis (LDA)?

Linear Discriminant Analysis is used as a dimensionality reduction technique. It is commonly used in

the pre-processing step in machine learning and pattern classification projects.

Linear Discriminant Analysis helps project a high-dimensional dataset onto a lower-dimensional space. The goal is to do this while maintaining decent separation between classes and reducing the resources and cost of computation.

The original technique that was developed was known as the Linear Discriminant or Fisher’s

Discriminant Analysis. This was a two-class technique. The multi-class version, as generalized by C.R.

Rao, was called Multiple Discriminant Analysis. Today, they are all known simply as Linear Discriminant

Analysis.


[Figure: Linear Discriminant Analysis before & after]

Linear Discriminant Analysis For Dummies: What Is Dimensionality Reduction?

To understand Linear Discriminant Analysis better, let's begin by understanding what dimensionality reduction is.

Multi-dimensional data is data with multiple features, which are often correlated with one another. Dimensionality reduction means projecting such multi-dimensional data onto just 2 or 3 dimensions.

An alternative to dimensionality reduction is plotting the data using scatter plots, box plots, histograms, and so on. We can then use these graphs to identify patterns in the raw data.

However, with charts, it is difficult for a layperson to make sense of the data that has been presented.

Moreover, if there are many features in the data, thousands of charts will need to be analyzed to

identify patterns.

Dimensionality reduction algorithms solve this problem by plotting the data in 2 or 3 dimensions. This

allows us to present the data explicitly, in a way that can be understood by a layperson.
Linear Discriminant Analysis For Dummies: How It Works

Linear Discriminant Analysis works in a simple step-by-step manner. Here are the three key steps.

(i) Calculate the separability between different classes. This is also known as the between-class variance and is defined as the distance between the means of the different classes.

Between-class variance: S_B = sum_i N_i * (m_i - m)(m_i - m)^T, where m_i is the mean of class i, m is the overall mean, and N_i is the number of samples in class i.

(ii) Calculate the within-class variance. This is the spread of each class's samples around that class's own mean.

Within-class variance: S_W = sum_i sum_{x in class i} (x - m_i)(x - m_i)^T

(iii) Construct the lower-dimensional space that maximizes the between-class variance from Step 1 and minimizes the within-class variance from Step 2. In the equation below, P is the projection onto the lower-dimensional space. This ratio is known as Fisher's criterion.

Fisher's criterion: P* = arg max_P ( |P^T S_B P| / |P^T S_W P| )
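To make these three steps concrete, here is a minimal NumPy sketch, assuming a feature matrix X of shape (n_samples, n_features) and an integer label vector y. The function name fisher_lda and all variable names are illustrative, not taken from any library.

import numpy as np

def fisher_lda(X, y, n_components=2):
    # Project X onto the axes that maximize Fisher's criterion.
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    n_features = X.shape[1]

    S_B = np.zeros((n_features, n_features))  # between-class scatter
    S_W = np.zeros((n_features, n_features))  # within-class scatter
    for c in classes:
        X_c = X[y == c]
        mean_c = X_c.mean(axis=0)
        # Step 1: separability between classes (distance of class means
        # from the overall mean, weighted by class size)
        diff = (mean_c - overall_mean).reshape(-1, 1)
        S_B += len(X_c) * diff @ diff.T
        # Step 2: spread of each class's samples around its own mean
        S_W += (X_c - mean_c).T @ (X_c - mean_c)

    # Step 3: the eigenvectors of S_W^-1 S_B with the largest eigenvalues
    # form the projection P that maximizes Fisher's criterion
    eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)
    order = np.argsort(eigvals.real)[::-1]
    P = eigvecs[:, order[:n_components]].real
    return X @ P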
Representation Of Linear Discriminant Analysis Models

The representation of Linear Discriminant Analysis models consists of the statistical properties of the

dataset. These are calculated separately for each class. For instance, for a single input variable, it is the

mean and variance of the variable for every class.

If there are multiple variables, the same statistical properties are calculated for the multivariate Gaussian distribution. This includes the means and the covariance matrix. All these properties are estimated directly from the data and go straight into the Linear Discriminant Analysis equation.

The statistical properties are estimated on the basis of certain assumptions, which simplify the estimation process. One such assumption is that every class shares the same variance (in the multivariate case, the same covariance matrix). Another assumption is that the data is Gaussian, meaning that each variable, when plotted, is shaped like a bell curve. Using these assumptions, the mean and variance of each variable are estimated for every class.
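As a rough sketch of how this estimation might look, assuming the same X and y as above; the helper name estimate_lda_parameters is hypothetical, and the single shared covariance matrix reflects the equal-variance assumption described above.

import numpy as np

def estimate_lda_parameters(X, y):
    classes = np.unique(y)
    n_samples, n_features = X.shape
    priors, means = {}, {}
    pooled_cov = np.zeros((n_features, n_features))
    for c in classes:
        X_c = X[y == c]
        priors[c] = len(X_c) / n_samples   # base rate of class c
        means[c] = X_c.mean(axis=0)        # per-class mean vector
        pooled_cov += (X_c - means[c]).T @ (X_c - means[c])
    # one covariance matrix shared by all classes (the LDA assumption)
    pooled_cov /= (n_samples - len(classes))
    return priors, means, pooled_cov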

Linear Discriminant Analysis For Dummies: How To Make Predictions

Linear Discriminant Analysis estimates the probability that a new set of inputs belongs to each class. The output class is the one with the highest probability. That is how LDA makes its prediction.

LDA uses Bayes' theorem to estimate these probabilities. If the output class is k and the input is x, here is how Bayes' theorem estimates the probability that the data belongs to class k:

P(Y=k | X=x) = (πk * fk(x)) / sum_l(πl * fl(x))

In the above equation:

πk – the prior probability. This is the base probability of class k as observed in the training data.

fk(x) – the estimated probability density of x for class k. LDA models fk(x) with a Gaussian distribution function.
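As a minimal sketch of this prediction rule, assuming the priors, means, and pooled covariance estimated earlier; predict_class is a hypothetical helper, not a library function.

from scipy.stats import multivariate_normal

def predict_class(x, priors, means, pooled_cov):
    # numerator of Bayes' theorem for each class k: pi_k * f_k(x),
    # with f_k(x) a Gaussian density sharing one covariance matrix
    scores = {c: priors[c] * multivariate_normal.pdf(x, means[c], pooled_cov)
              for c in priors}
    total = sum(scores.values())        # denominator: sum over all classes
    posteriors = {c: s / total for c, s in scores.items()}
    return max(posteriors, key=posteriors.get)  # highest-probability class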

LDA vs Other Dimensionality Reduction Techniques

Two techniques commonly considered for the same purposes as Linear Discriminant Analysis are Logistic Regression (a classifier) and PCA (Principal Component Analysis, another dimensionality reduction technique). However, Linear Discriminant Analysis has certain unique features that make it the technique of choice in many cases. Here are some differences between Linear Discriminant Analysis and the other techniques.

1. LINEAR DISCRIMINANT ANALYSIS VS PCA

(i) PCA is an unsupervised algorithm. It ignores class labels altogether and aims to find the principal

components that maximize variance in a given set of data. Linear Discriminant Analysis, on the other

hand, is a supervised algorithm that finds the linear discriminants that will represent those axes which

maximize separation between different classes.

(ii) Linear Discriminant Analysis often outperforms PCA in multi-class classification tasks when the class labels are known. In some cases, however, PCA performs better, usually when the sample size for each class is relatively small. A good example is the comparison of classification accuracies in image recognition tasks.

(iii) Many times, the two techniques are used together for dimensionality reduction: PCA is applied first, followed by LDA, as the sketch after the figure below shows.
[Figure: Linear Discriminant Analysis vs PCA]
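A minimal sketch of this combination using scikit-learn's Pipeline, assuming X and y are already defined, X has at least 10 features, and there are at least 3 classes; the component counts are illustrative.

from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# PCA first (unsupervised compression), then LDA (supervised separation)
pca_lda = Pipeline([
    ('pca', PCA(n_components=10)),
    ('lda', LinearDiscriminantAnalysis(n_components=2)),
])
X_reduced = pca_lda.fit_transform(X, y)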

2. LINEAR DISCRIMINANT ANALYSIS VS LOGISTIC REGRESSION

(i) Two-Class vs Multi-Class Problems

Logistic regression is both simple and powerful. However, it is traditionally used only for binary classification problems. While it can be extended to multi-class classification problems, this is rarely done. For multi-class classification problems, linear discriminant analysis is usually the go-to choice. In fact, even for binary classification problems, both logistic regression and linear discriminant analysis are applied at times.

(ii) Instability With Well-Separated Classes

Logistic regression can become unstable when the classes are well-separated. This is where Linear Discriminant Analysis comes in.

(iii) Instability With Few Examples

If there are just a few examples from which the parameters need to be estimated, logistic regression tends to become unstable. In this situation too, Linear Discriminant Analysis is the superior option, as it tends to stay stable even with fewer examples.
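One quick way to observe this behaviour is to fit both models on a small, well-separated synthetic dataset; the following sketch uses scikit-learn, and every dataset parameter is purely illustrative.

from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# few samples, strongly separated classes: the conditions described above
X_demo, y_demo = make_classification(n_samples=60, n_features=5,
                                     n_informative=3, n_redundant=0,
                                     class_sep=3.0, random_state=0)

for name, model in [('logistic regression', LogisticRegression()),
                    ('LDA', LinearDiscriminantAnalysis())]:
    scores = cross_val_score(model, X_demo, y_demo, cv=5)
    print(f'{name}: mean accuracy {scores.mean():.3f}')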


Linear Discriminant Analysis via Scikit Learn

Of course, you can implement Linear Discriminant Analysis step by step yourself. However, the more convenient and more commonly used way is the LinearDiscriminantAnalysis class in the scikit-learn machine learning library. Here is an example of the code used to achieve this.

import matplotlib.pyplot as plt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

# Assumes X (feature matrix), y (integer class labels 1-3) and label_dict
# (a mapping from label to class name) have been defined earlier,
# e.g. from a three-class dataset such as Iris.

# LDA: project the data onto the two most discriminative axes
sklearn_lda = LDA(n_components=2)
X_lda_sklearn = sklearn_lda.fit_transform(X, y)

def plot_scikit_lda(X, title):
    ax = plt.subplot(111)
    for label, marker, color in zip(
            range(1, 4), ('^', 's', 'o'), ('blue', 'red', 'green')):
        plt.scatter(x=X[:, 0][y == label],
                    y=X[:, 1][y == label] * -1,  # flip the figure
                    marker=marker,
                    color=color,
                    alpha=0.5,
                    label=label_dict[label])

    plt.xlabel('LD1')
    plt.ylabel('LD2')

    leg = plt.legend(loc='upper right', fancybox=True)
    leg.get_frame().set_alpha(0.5)
    plt.title(title)

    # hide axis ticks
    plt.tick_params(axis='both', which='both', bottom=False, top=False,
                    labelbottom=True, left=False, right=False, labelleft=True)

    # remove axis spines
    for side in ('top', 'right', 'bottom', 'left'):
        ax.spines[side].set_visible(False)

    plt.grid()
    plt.tight_layout()
    plt.show()

plot_scikit_lda(X_lda_sklearn, title='Default LDA via scikit-learn')


[Figure: Default LDA via scikit-learn]

Extensions & Variations Of Linear Discriminant Analysis

Due to its simplicity and ease of use, Linear Discriminant Analysis has seen many extensions and variations, all designed to improve its efficacy. Here are some common ones.


(i) Flexible Discriminant Analysis (FDA)

Regular Linear Discriminant Analysis uses only linear combinations of the inputs. Flexible Discriminant Analysis allows for non-linear combinations of the inputs, such as splines.

(ii) Quadratic Discriminant Analysis (QDA)

In Quadratic Discriminant Analysis, each class uses its own estimate of variance when there is a single input variable. In the case of multiple input variables, each class uses its own estimate of covariance.

(iii) Regularized Discriminant Analysis (RDA)

This method moderates the influence of individual variables on the model by regularizing the estimate of the variance (or covariance matrix), as the sketch below shows.
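Scikit-learn exposes two of these extensions directly: QDA as its own class, and a regularized LDA through the shrinkage option, which is in the spirit of RDA (FDA has no direct scikit-learn equivalent). A brief sketch, assuming the same X and y as before.

from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)

# QDA: a separate covariance estimate per class allows quadratic boundaries
qda = QuadraticDiscriminantAnalysis()
qda.fit(X, y)

# Regularized LDA: shrinkage moderates the covariance estimate
# (supported by the 'lsqr' and 'eigen' solvers)
rda_like = LinearDiscriminantAnalysis(solver='lsqr', shrinkage='auto')
rda_like.fit(X, y)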

Conclusion

Linear Discriminant Analysis has become very popular because it is simple and easy to understand. While other techniques like PCA and logistic regression are also widely used, there are several specific use cases in which LDA is more appropriate. A thorough knowledge of Linear Discriminant Analysis is a must for all data science and machine learning enthusiasts.

If you are also inspired by the opportunities provided by the data science landscape, enroll in our data

science master course and elevate your career as a data scientist.
