Banknote authentication

By Sergio Sanchez, Artelnics.

Every day, millions of people use banknotes to make transactions. The security of these banknotes is therefore an essential factor for governments and banks in the fight against fraud.

Nowadays, it can be very hard to tell counterfeit notes from genuine ones. The aim of this study is therefore to create a support system that helps organizations classify fraudulent notes accurately.

Contents:

  1. Data set
  2. Neural network
  3. Loss index
  4. Training strategy
  5. Testing analysis
  6. Model deployment

1. Data set

The first step is to prepare the data file, which is the source of information for the classification problem. The format of this file is a set of rows with values separated by tabs. In this binary classification problem, the target variable can only take two values: 0 (false) or 1 (true).
The following listing is a preview of the data file. The number of instances (rows) in the data set is 1372, and the number of variables (columns) is 5.

Banknote dataset picture
Banknote dataset.
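
For readers who want to reproduce this step outside Neural Designer, the following Python sketch loads such a tab-separated file with pandas. The file name and the absence of a header row are assumptions for this illustration, not part of the original data description.

    import pandas as pd

    # Column names as listed below; the file name is an assumption.
    column_names = [
        "variance_of_wavelet_transformed",
        "skewness_of_wavelet_transformed",
        "curtosis_of_wavelet_transformed",
        "entropy_of_image",
        "class",
    ]

    # Values are separated by tabs, as described above.
    data = pd.read_csv("banknote_authentication.dat", sep="\t", names=column_names)

    print(data.shape)  # expected: (1372, 5)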

The next figure shows the data set tab in Neural Designer.

Data set page screenshot
Data set page.

This problem therefore has the following variables:

  1. variance_of_wavelet_transformed, used as input.
  2. skewness_of_wavelet_transformed, used as input.
  3. curtosis_of_wavelet_transformed, used as input.
  4. entropy_of_image, used as input.
  5. class, used as target.

The data set is divided into training, selection and testing subsets. There are 824 instances for training (60.1%), 274 instances for selection (20%), 274 instances for testing (20%) and 0 unused instances (0%).
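
An equivalent split could be sketched in Python as follows; the instance counts are those given above, and the random seed is an arbitrary choice.

    import numpy as np

    rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility
    indices = rng.permutation(1372)  # shuffle all instances

    training_indices = indices[:824]       # 824 instances (60.1%)
    selection_indices = indices[824:1098]  # 274 instances (20%)
    testing_indices = indices[1098:]       # 274 instances (20%)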

Target variable distribution
Target variable distribution.

2. Neural network

The second step is to choose a neural network architecture to represent the classification function. The architecture, and with it the results of the network, depends on the number of inputs, the number of neurons in the hidden layer, and the number of outputs. In this case, the network has 4 inputs, one hidden layer with 5 neurons, and 1 output. The next picture shows the neural network that defines this model.

Neural network graph
Neural network graph.

The scaling layer section contains information about the method for scaling the input variables and the statistical values used by that method. In this example, we use the minimum and maximum method for scaling the inputs; the mean and standard deviation method would also be appropriate here.
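
The minimum and maximum method maps each input to the range [-1, 1]. A minimal Python sketch of this scaling follows; the example statistics are those of the variance input, taken from the deployment expression in section 6.

    def min_max_scale(x, x_min, x_max):
        # Maps x from [x_min, x_max] to [-1, 1], as in the
        # deployment expression at the end of this article.
        return 2 * (x - x_min) / (x_max - x_min) - 1

    # The minimum and maximum of the variance input in this data set
    # are -7.0421 and 6.8248; the raw value 3.62 is an arbitrary example.
    scaled_variance = min_max_scale(3.62, -7.0421, 6.8248)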

3. Loss index

The third step is to set the loss index, which is composed of two terms:

  • Error term.
  • Regularization term.

The next figure shows the loss index page in Neural Designer. All settings are left at their default values.

Loss index screenshot
Loss index page.

The error term is the weighted squared error.

On the other hand, the regularization term is the norm of the neural network's parameters. The weight for this term is 0.001. Regularization has two effects here:

  1. It makes the model stable, without oscillations.
  2. It avoids saturation of the logistic activation functions.

The learning problem can then be stated as finding a neural network that minimizes the loss index, i.e., a neural network that fits the data set (error term) and does not oscillate (regularization term).
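
As an illustration of this composition, a Python sketch of the loss index follows. The exact weighting used by Neural Designer's weighted squared error is not reproduced here, so plain per-instance weights stand in for it.

    import numpy as np

    def loss_index(outputs, targets, parameters, weights, reg_weight=0.001):
        # Error term: weighted squared error (illustrative weighting scheme).
        error = np.sum(weights * (outputs - targets) ** 2)
        # Regularization term: norm of the network parameters,
        # multiplied by its weight of 0.001.
        regularization = np.linalg.norm(parameters)
        return error + reg_weight * regularization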

4. Training strategy

The fourth step in solving this problem is to assign the training strategy. A general training strategy is composed of two algorithms:

  • Initialization algorithm.
  • Main algorithm.

The next figure shows the training strategy page in Neural Designer.

Training strategy page screenshot
Training strategy page.

We do not use any initialization algorithm here; the quasi-Newton method is used as the main training algorithm. We leave the default training parameters, stopping criteria, and training history settings.
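
To give a feel for what the quasi-Newton method does, the sketch below uses SciPy's BFGS implementation, a common quasi-Newton algorithm, on a stand-in loss function; the real loss index of the previous section would take the placeholder's place.

    import numpy as np
    from scipy.optimize import minimize

    def loss(parameters):
        # Placeholder for the loss index evaluated at this parameter vector.
        return np.sum(parameters ** 2)

    # A 4-5-1 network such as the one above has 31 weights and biases.
    initial_parameters = np.random.randn(31)

    result = minimize(loss, initial_parameters, method="BFGS")
    print(result.fun)  # final loss at convergence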

The next figure shows the loss history with the quasi-Newton method. As we can see, the loss decreases until it reaches a stationary value, which is a sign of convergence.

Training performance plot
Training loss.

The neural network is trained to achieve the best possible loss and good generalization properties. The next table shows the final states of the neural network, the loss index, and the training algorithm.

Performance measure table
Loss measure table.

The final loss is almost zero, which means that the neural network fits the data very well. The selection loss is also very small, which indicates that no overfitting has occurred.

5. Testing analysis

The last step is to test the generalization performance of the trained neural network. In the confusion matrix, the rows represent the target classes and the columns the output classes for the testing data set. The diagonal cells show the number of correctly classified cases, and the off-diagonal cells show the misclassified cases.

The following table contains the elements of the confusion matrix.

Confusion matrix table
Confusion matrix.

The number of correctly classified instances is 274, and the number of misclassified instances is 0. Since there are no misclassified patterns, the model predicts the testing data very well.
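
A confusion matrix like the one above can be computed with scikit-learn, as in the sketch below; the arrays are hypothetical stand-ins for the true classes and network outputs of the 274 testing instances.

    import numpy as np
    from sklearn.metrics import confusion_matrix

    # Hypothetical stand-ins for the testing targets and network outputs.
    testing_targets = np.array([0, 1, 1, 0])
    network_outputs = np.array([0.02, 0.97, 0.88, 0.01])

    predicted_classes = (network_outputs >= 0.5).astype(int)  # threshold at 0.5
    print(confusion_matrix(testing_targets, predicted_classes))
    # Rows: target classes; columns: output classes.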

6. Model deployment

The neural network is now ready to predict outputs for inputs it has never seen. The "Calculate output" task calculates the output value for a given set of input values. This task opens a dialog in which the input values are set, as shown in the next figure.

Output dialog
Output dialog.

The mathematical expression represented by the neural network is written below.


    // Scaling layer: minimum and maximum method, mapping each input to [-1, 1]
    scaled_variance = 2*(variance+7.0421)/(6.8248+7.0421)-1;
    scaled_skewness = 2*(skewness+13.7731)/(12.9516+13.7731)-1;
    scaled_kurtosis = 2*(kurtosis+5.2861)/(17.9274+5.2861)-1;
    scaled_entropy = 2*(entropy+8.5482)/(2.4495+8.5482)-1;

    // Hidden layer: five neurons with logistic activation functions
    y_1_1 = Logistic(-3.28052 + 5.03143*scaled_variance - 0.0363754*scaled_skewness - 3.85087*scaled_kurtosis + 0.00528292*scaled_entropy);
    y_1_2 = Logistic(0.743815 - 0.699382*scaled_variance - 2.24925*scaled_skewness - 0.859183*scaled_kurtosis - 0.127586*scaled_entropy);
    y_1_3 = Logistic(1.97779 + 6.38392*scaled_variance + 3.23236*scaled_skewness + 3.52077*scaled_kurtosis - 1.19071*scaled_entropy);
    y_1_4 = Logistic(-1.38079 - 2.13277*scaled_variance - 1.99074*scaled_skewness + 0.790205*scaled_kurtosis - 1.00866*scaled_entropy);
    y_1_5 = Logistic(2.53669 + 5.82326*scaled_variance + 5.22523*scaled_skewness + 6.12734*scaled_kurtosis - 0.686447*scaled_entropy);

    // Probabilistic output layer: a single logistic neuron, clamped to [0, 1]
    non_probabilistic_class = Logistic(4.87466 + 8.67167*y_1_1 + 2.83798*y_1_2 - 7.84766*y_1_3 + 3.72746*y_1_4 - 11.2728*y_1_5);
    (class) = Probability(non_probabilistic_class);

    // Logistic activation function
    Logistic(x){
        return 1/(1+exp(-x))
    }

    // Clamps its argument to the interval [0, 1]
    Probability(x){
        if x < 0
            return 0
        else if x > 1
            return 1
        else
            return x
    }
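
For deployment outside Neural Designer, the expression above translates directly into other languages. A Python version follows, with the coefficients copied verbatim from the expression; the sample input at the end is an arbitrary example.

    import math

    def logistic(x):
        return 1.0 / (1.0 + math.exp(-x))

    def probability(x):
        # Clamp the output to the interval [0, 1].
        return min(max(x, 0.0), 1.0)

    def predict(variance, skewness, kurtosis, entropy):
        # Scaling layer (minimum and maximum method).
        sv = 2*(variance + 7.0421)/(6.8248 + 7.0421) - 1
        ss = 2*(skewness + 13.7731)/(12.9516 + 13.7731) - 1
        sk = 2*(kurtosis + 5.2861)/(17.9274 + 5.2861) - 1
        se = 2*(entropy + 8.5482)/(2.4495 + 8.5482) - 1

        # Hidden layer: five logistic neurons.
        y1 = logistic(-3.28052 + 5.03143*sv - 0.0363754*ss - 3.85087*sk + 0.00528292*se)
        y2 = logistic(0.743815 - 0.699382*sv - 2.24925*ss - 0.859183*sk - 0.127586*se)
        y3 = logistic(1.97779 + 6.38392*sv + 3.23236*ss + 3.52077*sk - 1.19071*se)
        y4 = logistic(-1.38079 - 2.13277*sv - 1.99074*ss + 0.790205*sk - 1.00866*se)
        y5 = logistic(2.53669 + 5.82326*sv + 5.22523*ss + 6.12734*sk - 0.686447*se)

        # Probabilistic output layer.
        output = logistic(4.87466 + 8.67167*y1 + 2.83798*y2 - 7.84766*y3 + 3.72746*y4 - 11.2728*y5)
        return probability(output)

    # Arbitrary example input.
    print(predict(3.6, 8.7, -2.8, -0.4))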