Confusion Matrix, Its Errors, and How It Helps in Cybercrime Cases
Machine learning is very useful nowadays: we train a model to perform a specific task. For such tasks, what matters most is how accurate the trained model is.
The confusion matrix helps us calculate the accuracy of a classification model, which in turn describes how well the model performs. It is one of the most important steps in evaluating a model.
What is a Confusion Matrix?
A confusion matrix is a summarized view of the predicted results against the actual results in a classification problem. It is essential for determining how well our model performs after we have trained it on some training data.
For a binary classification problem, it is a 2x2 matrix.
Here TP, TN, FN, and FP are the four cells of this matrix.
True Positive (TP): both the predicted and the actual values are 1 (True).
True Negative (TN): both the predicted and the actual values are 0 (False).
False Negative (FN): the predicted value is 0 (Negative) but the actual value is 1. The two values do not match, hence False Negative.
False Positive (FP): the predicted value is 1 (Positive) but the actual value is 0. Again the two values do not match, hence False Positive.
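To make the four cells concrete, here is a minimal sketch that counts them from two lists of labels. The function name `confusion_counts` and the label lists are made up for illustration; the only assumption is that 1 means positive and 0 means negative, as in the definitions above.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1          # predicted 1, actually 1
        elif a == 0 and p == 0:
            tn += 1          # predicted 0, actually 0
        elif a == 0 and p == 1:
            fp += 1          # predicted 1, actually 0
        elif a == 1 and p == 0:
            fn += 1          # predicted 0, actually 1
        else:
            raise ValueError("labels must be 0 or 1")
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical labels, invented only for this example
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

counts = confusion_counts(actual, predicted)

# Print the counts in the usual 2x2 layout: rows are actual, columns are predicted
print("           Predicted 0   Predicted 1")
print(f"Actual 0   TN = {counts['TN']}        FP = {counts['FP']}")
print(f"Actual 1   FN = {counts['FN']}        TP = {counts['TP']}")
```

With these sample labels there are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative.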
There are two types of error in this matrix: FN and FP.
False Negative means the actual value is true but our model predicts false, so this is one type of error (a Type II error).
False Positive means the actual value is false but our model predicts true, so this is the other type of error (a Type I error). This error is often treated as the more dangerous one, so we try to minimize it, although which error costs more depends on the problem and on which class is labeled positive.
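Metrics Derived from the Confusion Matrix
Accuracy = all correct / all = (TP + TN) / (TP + TN + FP + FN)
Misclassification rate = all incorrect / all = (FP + FN) / (TP + TN + FP + FN)
Precision = true positives / predicted positives = TP / (TP + FP)
Recall = true positives / all actual positives = TP / (TP + FN)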
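The four ratios above can be computed directly from the four cell counts. This is a small sketch, not a library API: `classification_metrics` is a hypothetical helper, and the counts passed to it are invented numbers chosen so the arithmetic is easy to follow.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, misclassification rate, precision and recall."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("the confusion matrix is empty")
    predicted_positives = tp + fp
    actual_positives = tp + fn
    return {
        "accuracy": (tp + tn) / total,
        "misclassification": (fp + fn) / total,
        # Precision and recall are undefined when their denominator is zero
        "precision": tp / predicted_positives if predicted_positives else None,
        "recall": tp / actual_positives if actual_positives else None,
    }


# Hypothetical counts for 100 samples
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Here accuracy is 85/100 = 0.85 and misclassification is 15/100 = 0.15, so the two always add up to 1. Precision is 40/45 and recall is 40/50, which shows that a model can be fairly precise while still missing a fifth of the actual positives.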
Let's take an example to understand this:
Cybercrime Use Case
Cybercrime (computer crime) is crime that involves a computer and a network. The computer may have been used in the commission of a crime, or it may be the target. Cybercrime may harm someone’s security and financial health, and it encompasses a broad range of activities. Computer fraud, for example, is any dishonest misrepresentation of fact intended to induce another person to do or refrain from doing something that causes loss.
Feature extraction is important in cyber-related ML.
A confusion matrix is needed here to see how well a classifier built on these features performs. We use the normalized confusion matrix, which produces a plot like this…
The darker the blue, the better the classifier is at predicting files for that class. This is a 7x7 matrix, so it is worth learning how to read a matrix of this size.
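A multi-class matrix like this one is built the same way as the 2x2 case, with one row and one column per class. The sketch below uses three classes instead of seven to keep it short. The class names and labels are hypothetical, and it assumes the normalization is done per row (per actual class), which is one common choice; the original plot may have been normalized differently.

```python
def confusion_matrix(actual, predicted, labels):
    """Build a matrix where rows are actual classes and columns are predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def normalize_rows(matrix):
    """Divide each row by its total so every non-empty row sums to 1."""
    normalized = []
    for row in matrix:
        row_total = sum(row)
        normalized.append([c / row_total if row_total else 0.0 for c in row])
    return normalized


# Hypothetical class names and labels, invented for this example
labels = ["benign", "trojan", "worm"]
actual = ["benign", "benign", "benign", "benign", "trojan",
          "trojan", "trojan", "worm", "worm", "worm"]
predicted = ["benign", "benign", "benign", "trojan", "trojan",
             "trojan", "worm", "worm", "worm", "benign"]

matrix = confusion_matrix(actual, predicted, labels)
normalized = normalize_rows(matrix)

for label, row in zip(labels, normalized):
    print(f"{label:>8}", " ".join(f"{value:.2f}" for value in row))
```

With row normalization, each diagonal entry is the share of that class predicted correctly (its recall), which is why a darker diagonal cell in the plot means a better result for that class.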
In the field of cybercrime, the goal is to predict cybercrime, so we use this confusion matrix to check those predictions.
Combining security analysis with data analysis methods helps us analyze and classify crimes from integrated data (which may be structured or unstructured) from India. The main advantage of this work is the test analysis report, which indicates that crimes can be classified with 99% accuracy.
In this way, we use the confusion matrix in cybercrime cases.
Thank You. :)