Every day, millions of people use banknotes to make transactions. The security of these banknotes is therefore essential for governments and banks in the fight against fraud.

Counterfeit notes can be hard to tell apart from genuine ones. The aim of this example is therefore to build a support system that helps organizations classify fraudulent notes accurately.

Data were extracted from images taken of genuine and forged banknote-like specimens. An industrial camera normally used for print inspection was used for digitization. The final images have 400 x 400 pixels.

Due to the object lens and the distance to the investigated object, gray-scale pictures with a resolution of about 660 dpi were obtained. A wavelet transform tool was used to extract features from the images.

This is a classification project, since the variable to be predicted is binary (fraudulent or legal).

The goal here is to model the probability that a banknote is fraudulent, as a function of its features.

The data file banknote_authentication.csv is the source of information for the classification problem. The number of instances (rows) in the data set is 1372, and the number of variables (columns) is 5.

The problem therefore has the following variables:

- **variance_of_wavelet_transformed**, used as input.
- **skewness_of_wavelet_transformed**, used as input.
- **curtosis_of_wavelet_transformed**, used as input.
- **entropy_of_image**, used as input.
- **class**, used as target. It can only have two values: 0 (false) or 1 (true).

The instances are divided into training, selection and testing subsets. There are 824 instances for training (60%), 274 instances for selection (20%) and 274 instances for testing (20%).
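The split sizes above can be reproduced with a short sketch. The function name `split_indices`, the random shuffling and the seed are assumptions made for illustration; the text does not say how the instances were actually assigned to each subset.

```python
import random


def split_indices(n_instances, selection_frac=0.2, testing_frac=0.2, seed=0):
    """Shuffle row indices and cut them into training, selection and testing subsets."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)  # fixed seed only for reproducibility

    # Selection and testing sizes are rounded down; training takes the remainder.
    n_selection = int(n_instances * selection_frac)
    n_testing = int(n_instances * testing_frac)
    n_training = n_instances - n_selection - n_testing

    training = indices[:n_training]
    selection = indices[n_training:n_training + n_selection]
    testing = indices[n_training + n_selection:]
    return training, selection, testing


training, selection, testing = split_indices(1372)
print(len(training), len(selection), len(testing))
```

With 1372 instances, rounding the two 20% subsets down to 274 leaves 824 instances for training, which matches the counts quoted above.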

We can calculate the data distributions and plot a pie chart with the percentage of instances for each class.

As we can see, the numbers of authentic and forged banknotes are similar.

Next, we plot a scatter chart of the counterfeit class against the wavelet-transformed variance.

In general, the higher the wavelet-transformed variance, the higher the probability of counterfeit.

The input-target correlations might indicate which factors best discriminate between authentic and false banknotes.

From the above chart, we can see that the wavelet transformed variance might be the most influential variable for this application.
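As a rough illustration of what an input-target correlation measures, the sketch below computes a Pearson correlation between one input and a binary target. The text does not state which correlation coefficient the chart uses, so Pearson is an assumption, and the sample values are made up for illustration; they are not taken from the data set.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(variance_x * variance_y)


# Hypothetical input values and binary targets, for illustration only.
example_input = [1.0, 2.0, 3.0, 4.0]
example_target = [0, 0, 1, 1]
correlation = pearson_correlation(example_input, example_target)
print(round(correlation, 3))
```

A value close to +1 or -1 suggests the input separates the two classes well on its own, while a value near 0 suggests it carries little linear information about the target.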

The second step is to configure a neural network to represent the classification function.

The next picture shows the neural network that defines the model.

The fourth step is to set the training strategy, which is composed of:

- Loss index.
- Optimization algorithm.

The loss index that we use is the weighted squared error with L2 regularization.

The learning problem can be stated as finding a neural network that minimizes the loss index. That is, we want a neural network that fits the data set (error term) and does not oscillate (regularization term).
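The two terms of the loss index can be sketched as follows. The exact formula used by the tool is not given in the text, so this is an assumed, unnormalized form: each squared error is weighted according to the class of its target, and the regularization term is the sum of squared parameters multiplied by a hypothetical `regularization_weight`.

```python
def weighted_squared_error(outputs, targets, positives_weight=1.0, negatives_weight=1.0):
    """Sum of squared errors, weighting positive and negative targets differently."""
    error = 0.0
    for output, target in zip(outputs, targets):
        weight = positives_weight if target == 1 else negatives_weight
        error += weight * (output - target) ** 2
    return error


def l2_regularization(parameters):
    """Sum of squared parameters; large weights are penalized."""
    return sum(p * p for p in parameters)


def loss_index(outputs, targets, parameters, regularization_weight=0.01):
    """Error term plus weighted regularization term (assumed form)."""
    error_term = weighted_squared_error(outputs, targets)
    regularization_term = l2_regularization(parameters)
    return error_term + regularization_weight * regularization_term


# Hypothetical outputs, targets and parameters, for illustration only.
loss = loss_index(outputs=[0.9, 0.2], targets=[1, 0], parameters=[1.0, -2.0])
print(loss)
```

Raising `regularization_weight` favors smaller parameters and therefore smoother models, at the cost of a looser fit to the data.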

Here we use the quasi-Newton method as the optimization algorithm. The training parameters, stopping criteria and training history settings are left at their default values.

The next figure shows the loss history with the quasi-Newton method. As we can see, the loss decreases until it reaches a stationary value. This is a sign of convergence.

The final training and selection errors are almost zero, which means that the neural network fits the data very well.
More specifically, **training error = 0.014 WSE** and **selection error = 0.011 WSE**.

The objective of model selection is to improve the generalization capabilities of the neural network or, in other words, to reduce the selection error.

Since the selection error that we have achieved so far is very small (0.011 WSE), we apply neither order selection nor inputs selection here.

The aim of testing analysis is to validate the generalization performance of the trained neural network.

A good way to assess the performance of a binary classification model is the ROC curve.

The area under the curve of the model is **AUC = 1**, which means that the classifier separates the two classes perfectly on the testing instances.
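To make the AUC value concrete, here is a minimal sketch that computes it as the fraction of (positive, negative) pairs in which the positive instance receives the higher score, with ties counted as one half. The function name `roc_auc` and the example scores are hypothetical; they are not the model's actual outputs.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative instance")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical scores: every positive is ranked above every negative.
perfect_auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
# Hypothetical scores: one positive is ranked below one negative.
imperfect_auc = roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
print(perfect_auc, imperfect_auc)
```

An AUC of 1 arises only when every positive instance scores higher than every negative one, which is the situation reported for the testing subset.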

In the confusion matrix, the rows represent the target classes and the columns the output classes for the testing data set. The diagonal cells show the number of cases that were correctly classified, and the off-diagonal cells show the misclassified cases. The following table contains the elements of the confusion matrix.

| | Predicted positive | Predicted negative |
|---|---|---|
| Real positive | 103 | 0 |
| Real negative | 0 | 171 |

The number of correctly classified instances is 274, and the number of misclassified instances is 0. As there are no misclassified instances, the model predicts this testing data very well.
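The usual binary classification metrics follow directly from the four cells of the confusion matrix above. The sketch below uses the counts from the table; the variable names are chosen for illustration.

```python
# Counts taken from the confusion matrix of the testing subset.
true_positives = 103
false_negatives = 0
false_positives = 0
true_negatives = 171

total = true_positives + false_negatives + false_positives + true_negatives

# Fraction of all testing instances that were classified correctly.
accuracy = (true_positives + true_negatives) / total
# Fraction of all testing instances that were misclassified.
error_rate = (false_positives + false_negatives) / total
# Fraction of real positives that were detected.
sensitivity = true_positives / (true_positives + false_negatives)
# Fraction of real negatives that were detected.
specificity = true_negatives / (true_negatives + false_positives)

print(total, accuracy, error_rate, sensitivity, specificity)
```

Because both off-diagonal cells are zero, accuracy, sensitivity and specificity are all 1 and the error rate is 0 on this testing subset.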

In the model deployment phase, the neural network is used to predict outputs for inputs that it has never seen.

For that, we can embed the mathematical expression represented by the neural network in the banknote authentication system. This expression is written below.

    scaled_wavelet_transformed_variance = (wavelet_transformed_variance-0.433735)/2.84276;
    scaled_wavelet_transformed_skewness = (wavelet_transformed_skewness-1.92235)/5.86905;
    scaled_wavelet_transformed_curtosis = (wavelet_transformed_curtosis-1.39763)/4.31003;
    scaled_image_entropy = (image_entropy+1.19166)/2.10101;

    y_1_1 = Logistic (-2.95122
            + (scaled_wavelet_transformed_variance*-3.20568)
            + (scaled_wavelet_transformed_skewness*-4.57895)
            + (scaled_wavelet_transformed_curtosis*-5.83131)
            + (scaled_image_entropy*0.125717));
    y_1_2 = Logistic (3.23366
            + (scaled_wavelet_transformed_variance*3.5863)
            + (scaled_wavelet_transformed_skewness*2.36407)
            + (scaled_wavelet_transformed_curtosis*1.0865)
            + (scaled_image_entropy*-1.0501));

    non_probabilistic_counterfeit = Logistic (3.48838 + (y_1_1*9.72432) + (y_1_2*-8.93277));
    (counterfeit) = Probability(non_probabilistic_counterfeit);

    Logistic(x){ return 1/(1+exp(-x)) }

    Probability(x){ if x < 0 return 0 else if x > 1 return 1 else return x }
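The expression above can be transcribed into Python for embedding in another system. The sketch below copies the coefficients from the expression; the function name `counterfeit_probability` and its argument names are chosen for illustration, and the example inputs are hypothetical rather than rows from the data set.

```python
import math


def logistic(x):
    """Logistic activation, as defined in the expression."""
    return 1.0 / (1.0 + math.exp(-x))


def probability(x):
    """Clamp a value to the interval [0, 1]."""
    return min(1.0, max(0.0, x))


def counterfeit_probability(variance, skewness, curtosis, entropy):
    """Evaluate the trained network on the four wavelet features of one banknote."""
    # Scaling layer: subtract the mean and divide by the standard deviation.
    scaled_variance = (variance - 0.433735) / 2.84276
    scaled_skewness = (skewness - 1.92235) / 5.86905
    scaled_curtosis = (curtosis - 1.39763) / 4.31003
    scaled_entropy = (entropy + 1.19166) / 2.10101

    # Hidden layer with two logistic neurons.
    y_1_1 = logistic(-2.95122
                     + scaled_variance * -3.20568
                     + scaled_skewness * -4.57895
                     + scaled_curtosis * -5.83131
                     + scaled_entropy * 0.125717)
    y_1_2 = logistic(3.23366
                     + scaled_variance * 3.5863
                     + scaled_skewness * 2.36407
                     + scaled_curtosis * 1.0865
                     + scaled_entropy * -1.0501)

    # Output neuron followed by the probabilistic layer.
    non_probabilistic_counterfeit = logistic(3.48838 + y_1_1 * 9.72432 + y_1_2 * -8.93277)
    return probability(non_probabilistic_counterfeit)


# Hypothetical input: every feature equal to its scaling mean.
at_mean = counterfeit_probability(0.433735, 1.92235, 1.39763, -1.19166)
# Hypothetical input: variance one standard deviation below its mean.
low_variance = counterfeit_probability(0.433735 - 2.84276, 1.92235, 1.39763, -1.19166)
print(at_mean, low_variance)
```

Since the logistic output already lies between 0 and 1, the final clamp changes nothing here; it is kept only to mirror the original expression.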

- Banknote authentication data set, UCI Machine Learning Repository.