Python

Measuring Performance of Classification using Confusion Matrix

Programmingempire

In this post on Measuring Performance of Classification using Confusion Matrix, I will explain what is a confusion matrix and how to use it with a Machine Learning classifier.

Machine Learning offers a number of algorithms for classification and prediction. However, each algorithm performs differently. Hence, we must have some kind of performance measures to evaluate the machine learning classifiers. In fact, measuring the performance of different classifiers help us in choosing the most accurate algorithm for predicting the outcome.

Interestingly, there are many performance measuring tools available that we can use to determine the accuracy of a classifier, and a confusion matrix is one of them.

What is a Confusion Matrix?

Basically, the confusion matrix displays the prediction results in a matrix format consisting of rows and columns. Further, the rows indicate the actual values whereas the columns indicate the predicted values. Moreover, all the diagonal entries in the matrix represent the instances where actual values match with the predicted ones. Besides, the numbers in other cells of the matrix represent the instances where the prediction fails.

How to Create a Confusion Matrix?

Basically, a confusion matrix has following general form that is explained further with an example given next.

Confusion Matrix
Confusion Matrix

The confusion matrix for binary classification is a 2X2 matrix and represented as follows. For example, suppose there are two class labels in a dataset – True (T), and False (F). Also, the total number of instances in test data is n. Further n1 is the number of instances where a T is predicted as T, and n2 represents cases where F is predicted as F. Likewise, m1 represents instances where a T is falsely predicted as F. Similarly, m2 represents the number of instances where F is falsely predicted as T. Therefore n is the sum of n1, n2, m1, and m2 as shown in the following figure.

Confusion Matrix for Binary Classification
Confusion Matrix for Binary Classification

However, if the dataset has n classes, then the confusion matrix is nXn and we can extend in the same manner as shown in the example given next.

Benefits of Confusion Matrix

In fact, the confusion matrix serves as an important tool to determine how a machine learning classifier performs. Moreover, we can use it to compare two or more classifiers.

Python Code for Measuring Performance of Classification using Confusion Matrix

The following example shows predictions done by the MLPClassifier (Multi-layer Perceptron) available in the scikit-learn library of python. The variable predictions contains the predicted values while the variable y_test contains actual values in the test dataset.

Evidently, the example given below makes use of the Iris dataset that contains the data about the Iris flower. Further, the dataset contains three classes – setosa, versicolor, and virginica.

from sklearn.metrics import confusion_matrix

predictions = mlp.predict(X_test) 
print(confusion_matrix(y_test,predictions))

Output

Confusion Matrix
Confusion Matrix

The above Confusion Matrix indicates that the MLP Classifier has predicted correctly the seven instances of setosa species, eleven instances of versicolor species, and ten instances of virginica species of the iris dataset. However, it incorrectly predicted one instance of virginica as versicolor. Additionally, another instance of versicolor species is incorrectly predicted as virginica.

Creating Heatmap of the Confusion Matrix

After that, we can visualize the confusion matrix using the heatmap function of the seaborn library of python. The following code shows how to display a heatmap of the confusion matrix.

import seaborn as sn
import matplotlib.pyplot as p
c=confusion_matrix(y_test,predictions)
sn.heatmap(c, annot=True)
p.show()

Finally, we get the following heatmap after running the above code.

Heatmap of Confusion Matrix

Further Reading

Deep Learning Tutorial

Text Summarization Techniques

How to Implement Inheritance in Python

Find Prime Numbers in Given Range in Python

Running Instructions in an Interactive Interpreter in Python

Deep Learning Practice Exercise

Python Practice Exercise

Deep Learning Methods for Object Detection

Understanding YOLO Algorithm

What is Image Segmentation?

ImageNet and its Applications

Image Contrast Enhancement using Histogram Equalization

Transfer Learning and its Applications

Examples of OpenCV Library in Python

Examples of Tuples in Python

Python List Practice Exercise

Understanding Blockchain Concepts

Edge Detection Using OpenCV

Predicting with Time Series

Example of Multi-layer Perceptron Classifier in Python

Measuring Performance of Classification using Confusion Matrix

Artificial Neural Network (ANN) Model using Scikit-Learn

Popular Machine Learning Algorithms for Prediction

Long Short Term Memory – An Artificial Recurrent Neural Network Architecture

Python Project Ideas for Undergraduate Students

Creating Basic Charts using Plotly

Visualizing Regression Models with lmplot() and residplot() in Seaborn

Data Visualization with Pandas

A Brief Introduction of Pandas Library in Python

A Brief Tutorial on NumPy in Python

programmingempire

You may also like...