Machine learning examples

Detect forged banknotes

Everyday millions of people use banknotes to make transactions. The security of these banknotes for governments and banks is hence an essential factor to fight froud.

Nowadays, sometimes it is too hard to spot counterfeit and genuine notes. Then, the aim of this example is to create a support system ready to help organizations to accurately classify fraudulent notes.

Data were extracted from images that were taken from genuine and forged banknote-like specimens. For digitization, an industrial camera usually used for print inspection was used. The final images have 400x 400 pixels.

Due to the object lens and distance to the investigated object gray-scale pictures with a resolution of about 660 dpi were gained. Wavelet Transform tool were used to extract features from images.


  1. Application type.
  2. Data set.
  3. Neural network.
  4. Training strategy.
  5. Model selection.
  6. Testing analysis.
  7. Model deployment.

1. Application type

This is a classification project, since the variable to be predicted is binary (fraudulent or legal).

The goal here is to model the probability that a banknote is fraudulent, as a function of its features.

2. Data set

The data file banknote_authentication.csv is the source of information for the classification problem. The number of instances (rows) in the data set is 1372, and the number of variables (columns) is 5.

In that way, this problem has the following variables:

The instances are divided into training, selection and testing subsets. There are 824 instances for training (60%), 274 instances for selection (20%) and 274 instances for testing (20%).

We can calculate the data distributions and plot a pie chart with the percentage of instances for each class.

As we can see, the number of authentic and forged banknotes are similar.

Next we plot a scatter chart with the counterfeit and the wavelet transformed variance data.

In general, the more wavelet transformed variance, the more probability of counterfeit.

The inputs-targets correlations might indicate which factors better discriminate between authentic and false banknotes.

From the above chart, we can see that the wavelet transformed variance might be the most influential variable for this application.

3. Neural network

The second step is to configure a neural network to represent the classification function.

The next picture shows the neural network that defines the model.

4. Training strategy

The fourth step is to set the training strategy, which is composed by:

The loss index that we use is the weighted squared error with L2 regularization.

The learning problem can be stated as to find a neural network which minimizes the loss index. That is, we want a neural network that fits the data set (error term) and that does not oscillate (regularization term).

We use here the quasi-Newton method as the optimization algorithm. The default training parameters, stopping criteria and training history settings are left.

The next figure shows the loss history with the quasi-Netwon method. As we can see, the loss decreases until it reaches a stationary value. This is a sign of convergence.

The final training and selection errors are almost zero, which means that the neural network fits the data very well. More specifically, training error = 0.014 WSE and selection error = 0.011 WSE.

5. Model selection

The objective of model selection is to improve the generalization capabilities of the neural network or, in other words, to reduce the selection error.

Since the selection error that we have achieved so far is very small (0.011 WSE), we neither apply Order selection nor inputs selection here.

6. Testing analysis

The aim of testing analysis is to validate the generalization performance of the trained neural network.

A good measure for the precision of a binary classification model is the ROC curve.

The area under the curve of the model is AUC = 1, which means that the classifier is predicting well all the testing instances.

In the confusion matrix, the rows represent the target classes and the columns the output classes for the testing target data set. The diagonal cells in each table show the number of cases that were correctly classified, and the off-diagonal cells show the misclassified cases. The following table contains the elements of the confusion matrix.

Predicted positive Predicted negative
Real positive 103 0
Real negative 0 171

The number of correctly classified instances is 274, and the number of misclassified instances is 0. As there are not misclassified patterns, the model is predicting this testing data very well.

7. Model deployment

In the model deployment phase, the neural network is used to predict outputs for inputs that it has never seen.

For that, we can embed the mathematical expression represented by the neural network in the banknote authentication system. This expression is written below.

scaled_wavelet_transformed_variance = (wavelet_transformed_variance-0.433735)/2.84276;
scaled_wavelet_transformed_skewness = (wavelet_transformed_skewness-1.92235)/5.86905;
scaled_wavelet_transformed_curtosis = (wavelet_transformed_curtosis-1.39763)/4.31003;
scaled_image_entropy = (image_entropy+1.19166)/2.10101;
y_1_1 = Logistic (-2.95122+ (scaled_wavelet_transformed_variance*-3.20568)+ (scaled_wavelet_transformed_skewness*-4.57895)
+ (scaled_wavelet_transformed_curtosis*-5.83131)+ (scaled_image_entropy*0.125717));
y_1_2 = Logistic (3.23366+ (scaled_wavelet_transformed_variance*3.5863)+ (scaled_wavelet_transformed_skewness*2.36407)
+ (scaled_wavelet_transformed_curtosis*1.0865)+ (scaled_image_entropy*-1.0501));
non_probabilistic_counterfeit = Logistic (3.48838+ (y_1_1*9.72432)+ (y_1_2*-8.93277));
(counterfeit) = Probability(non_probabilistic_counterfeit);

return 1/(1+exp(-x))

if x < 0
return 0
else if x > 1
return 1
return x


Related examples: