Target donors in blood donation campaigns using Neural Designer

This study aims to predict if a person will donate blood by using a recency, frequency, monetary, and time (RFMT) marketing model.

The data used for this study was taken from the donor database of the Blood Transfusion Service Center in Hsin-Chu City, Taiwan.


  1. Application type.
  2. Data set.
  3. Neural network.
  4. Training strategy.
  5. Model selection.
  6. Testing analysis.
  7. Model deployment.

This example is solved with Neural Designer. To follow it step by step, you can use the free trial.

1. Application type

The variable to be predicted is binary (donate or not). Therefore, this is a classification project.

The goal here is to model the probability of a person donating blood, conditioned on their features.

2. Data set

The data file blood_donation.csv contains the information used to create the model. It consists of 748 rows and 5 columns. The columns represent the variables, and the rows represent the instances.

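The next list describes the variables in the data set:

  - recency: months since the last donation.
  - frequency: total number of donations.
  - monetary: total amount of blood donated, in c.c.
  - time: months since the first donation.
  - donation: whether the person donated blood (the binary target).

Of the 748 instances, we use 60% for training, 20% for selection, and 20% for testing.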
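The 60/20/20 split can be sketched in a few lines of Python. This is only an illustration: the function name `split_indices`, the random shuffling, and the rounding rule are assumptions, and Neural Designer may assign instances differently.

```python
import random

def split_indices(n_instances, fractions=(0.6, 0.2, 0.2), seed=0):
    """Randomly partition instance indices into training, selection and testing sets."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)  # assumption: a random split
    n_training = round(fractions[0] * n_instances)
    n_selection = round(fractions[1] * n_instances)
    training = indices[:n_training]
    selection = indices[n_training:n_training + n_selection]
    testing = indices[n_training + n_selection:]  # whatever remains
    return training, selection, testing

training, selection, testing = split_indices(748)
print(len(training), len(selection), len(testing))  # 449 150 149
```

Because 748 is not divisible into exact 60/20/20 parts, the subset sizes depend on how the fractions are rounded.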

We can calculate the data distributions and plot a pie chart with the percentage of instances for each class.

As we can see, the number of negative responses is much higher than the number of positive responses.

Another relevant piece of information is the correlation of each input with the target variable, which is displayed in the chart below.

3. Neural network

The third step is to choose a neural network to represent the classification function. For classification problems, it is composed of a scaling layer, perceptron layers, and a probabilistic layer.

For the scaling layer, the mean and standard deviation scaling method is set.
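As a rough illustration of mean and standard deviation scaling, the sketch below subtracts the mean of a variable and divides by its standard deviation. The sample values are made up, and the use of the population standard deviation (`pstdev`) is an assumption; the tool may use the sample standard deviation instead.

```python
from statistics import mean, pstdev

def fit_mean_std(values):
    """Return the mean and (population) standard deviation of a variable."""
    return mean(values), pstdev(values)

def scale(value, mu, sigma):
    """Mean and standard deviation scaling of a single value."""
    return (value - mu) / sigma

recency = [2, 0, 1, 4, 14, 23]  # hypothetical values, not taken from the data file
mu, sigma = fit_mean_std(recency)
scaled_recency = [scale(v, mu, sigma) for v in recency]
print(scaled_recency)
```

After scaling, the variable has zero mean and unit standard deviation, which keeps inputs with different units on a comparable range.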

We use two perceptron layers: a hidden layer with 3 neurons as a first guess, and an output layer with 1 neuron. Both layers use the logistic activation function.

Finally, we set the continuous probabilistic method for the probabilistic layer.

The next figure is a diagram for the neural network used in this example.

4. Training strategy

The fourth step is to configure the training strategy, which is composed of two terms: a loss index and an optimization algorithm.

The loss index chosen is the weighted squared error with L2 regularization.
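To give an idea of what such a loss index computes, here is a minimal sketch of one possible formulation. It is an assumption, not Neural Designer's exact definition: positive instances are weighted by the ratio of negatives to positives, the error is averaged over the total weight, and `l2_weight` is a hypothetical regularization coefficient. Values from this sketch are therefore not directly comparable with the WSE figures reported in this example.

```python
def weighted_squared_error(targets, outputs, parameters, l2_weight=0.01):
    """Class-weighted squared error plus an L2 penalty on the parameters.

    Assumes binary targets (0 or 1) with at least one positive instance.
    """
    n_positives = sum(1 for t in targets if t == 1)
    n_negatives = len(targets) - n_positives
    positives_weight = n_negatives / n_positives  # rarer class counts more
    negatives_weight = 1.0

    error = 0.0
    total_weight = 0.0
    for target, output in zip(targets, outputs):
        weight = positives_weight if target == 1 else negatives_weight
        error += weight * (output - target) ** 2
        total_weight += weight

    regularization = l2_weight * sum(p ** 2 for p in parameters)
    return error / total_weight + regularization

# Hypothetical targets, outputs and parameters
loss = weighted_squared_error([1, 0, 0, 0], [0.8, 0.1, 0.3, 0.2], [0.5, -1.0])
print(loss)
```

Weighting the positive class compensates for the imbalance between donors and non-donors, and the L2 term penalizes large parameter values.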

The chosen optimization algorithm is the quasi-Newton method. We leave the default training parameters, stopping criteria, and training history settings.

The following chart shows how the training and selection errors decrease with the epochs during the training process. The final values are training error = 0.695 WSE and selection error = 0.907 WSE.

5. Model selection

The objective of model selection is to find the network architecture with the best generalization properties, that is, the one that minimizes the error on the selection instances of the data set.

More specifically, we want to find a neural network with a selection error of less than 0.907 WSE, which is the value that we have achieved so far.

Order selection algorithms train several network architectures with different numbers of neurons and select the one with the smallest selection error.

The incremental order method starts with a small number of neurons and increases the complexity at each iteration. The following chart shows the training error (blue) and the selection error (orange) as a function of the number of neurons.
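The idea behind the incremental order method can be sketched as a simple loop. The helper `train_and_evaluate` is hypothetical: it stands for training a network with a given number of hidden neurons and returning its training and selection errors. The stand-in below returns made-up numbers so that the sketch runs on its own.

```python
def incremental_order(train_and_evaluate, max_neurons=10):
    """Try 1..max_neurons hidden neurons and keep the lowest selection error."""
    best_neurons, best_error = None, float("inf")
    history = []
    for neurons in range(1, max_neurons + 1):
        training_error, selection_error = train_and_evaluate(neurons)
        history.append((neurons, training_error, selection_error))
        if selection_error < best_error:
            best_neurons, best_error = neurons, selection_error
    return best_neurons, best_error, history

def fake_train_and_evaluate(neurons):
    """Stand-in for real training; the error values are invented."""
    training_error = 1.0 / neurons                       # keeps decreasing
    selection_error = 0.5 + 0.1 * (neurons - 3) ** 2     # lowest at 3 neurons
    return training_error, selection_error

best_neurons, best_error, history = incremental_order(fake_train_and_evaluate)
print(best_neurons, best_error)
```

The training error usually keeps falling as neurons are added, while the selection error eventually rises again, so the selection error is the one used to pick the architecture.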

The final selection error achieved is 0.902 for an optimal number of neurons of 2.

The graph above represents the architecture of the final neural network.

6. Testing analysis

The next step is to evaluate the performance of the trained neural network. The standard way to do this is to compare the outputs of the neural network against data never seen before, the testing instances.

A standard testing method is to plot a ROC curve, a graphical illustration of how well the classifier discriminates between the two different classes. The output is shown in the next figure.

A random classifier has an area under the curve of 0.5, while a perfect classifier has an area under the curve of 1. In practice, this measure should take a value between 0.5 and 1. The closer to 1, the better the classifier. In this example, the area under the curve is AUC = 0.804, which indicates good performance.
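The area under the ROC curve can also be read as the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one. The sketch below computes it that way; the targets and scores are made up for illustration and are not the testing instances of this example.

```python
def roc_auc(targets, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for t, s in zip(targets, scores) if t == 1]
    negatives = [s for t, s in zip(targets, scores) if t == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

targets = [1, 1, 0, 0, 1, 0]               # hypothetical labels
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]    # hypothetical model outputs
auc = roc_auc(targets, scores)
print(round(auc, 3))  # 0.889
```

This pairwise version is quadratic in the number of instances, which is fine for a data set of this size.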

The binary classification tests provide us with useful information about the performance of a binary classification model:

The classification accuracy takes a value of 0.861, which means that the prediction is correct in most cases.

The confusion matrix contains the true positives, false positives, false negatives, and true negatives for the donation variable:

                  Predicted positive    Predicted negative
Real positive     32                    6
Real negative     4                     30

The number of correctly classified instances is 62, and the number of misclassified instances is 10.
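These counts are enough to recompute the accuracy and a few related binary classification tests. The short script below is a sketch that assumes rows are real classes and columns are predicted classes, as in the confusion matrix above.

```python
# Counts taken from the confusion matrix
true_positives, false_negatives = 32, 6
false_positives, true_negatives = 4, 30

total = true_positives + false_negatives + false_positives + true_negatives
correct = true_positives + true_negatives

accuracy = correct / total
error_rate = 1 - accuracy
sensitivity = true_positives / (true_positives + false_negatives)  # real positives found
specificity = true_negatives / (true_negatives + false_positives)  # real negatives found

print(total, correct)          # 72 62
print(round(accuracy, 3))      # 0.861
print(round(sensitivity, 3))   # 0.842
print(round(specificity, 3))   # 0.882
```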

The cumulative gain analysis is a visual aid that shows the advantage of using a predictive model as opposed to randomness. It consists of three lines.

The baseline represents the results that would be obtained without using a model.

The positive cumulative gain shows in the y-axis the percentage of positive instances found against the percentage of the population represented in the x-axis.

Similarly, the negative cumulative gain shows the percentage of the negative instances found against the population percentage.
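A cumulative gain curve can be sketched as follows: sort the instances by predicted score, from highest to lowest, and record what fraction of a class has been found after each instance. The data are made up, and computing the negative curve with the same ordering is an assumption about how the chart is built.

```python
def cumulative_gain(targets, scores, label=1):
    """Return (population fraction, fraction of `label` instances found) points."""
    order = sorted(range(len(targets)), key=lambda i: scores[i], reverse=True)
    total = sum(1 for t in targets if t == label)
    found = 0
    curve = [(0.0, 0.0)]
    for rank, index in enumerate(order, start=1):
        if targets[index] == label:
            found += 1
        curve.append((rank / len(targets), found / total))
    return curve

targets = [1, 1, 0, 0, 1, 0]               # hypothetical labels
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]    # hypothetical model outputs
positive_gain = cumulative_gain(targets, scores, label=1)
negative_gain = cumulative_gain(targets, scores, label=0)
print(positive_gain)
```

For a useful model, the positive curve lies above the baseline (the diagonal) and the negative curve lies below it.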

7. Model deployment

Once the generalization performance of the neural network has been tested, it can be saved for future use in the so-called model deployment mode.

We can predict whether a person is going to donate blood by calculating the neural network outputs. For that, we need to set the input variables.

The mathematical expression represented by the neural network is written below. It takes the inputs recency, frequency, and time to produce the output prediction about donation; the monetary variable does not appear in the expression. For classification problems, the information is propagated in a feed-forward fashion through the scaling, perceptron, and probabilistic layers.

scaled_recency = (recency-9.50668)/8.0954;
scaled_frequency = (frequency-5.51471)/5.83931;
scaled_time = (time-34.2821)/24.3767;
y_1_1 = Logistic (-3.2852+ (scaled_recency*-3.22375)+ (scaled_frequency*3.67502)+ (scaled_time*-2.45661));
y_1_2 = Logistic (-4.08721+ (scaled_recency*-2.96105)+ (scaled_frequency*2.76006)+ (scaled_time*-3.40265));
non_probabilistic_donation = Logistic (-1.089+ (y_1_1*5.14874)+ (y_1_2*-2.1466));
donation = probability(non_probabilistic_donation);

Logistic(x) {
   return 1/(1+exp(-x))
}

probability(x) {
   if x < 0
       return 0
   else if x > 1
       return 1
   else
       return x
}

The above expression can be exported anywhere.
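As one illustration, the expression could be ported to Python as shown below. The coefficients are copied from the expression above; the function name `donation_probability` and the input values in the usage line are hypothetical.

```python
from math import exp

def logistic(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + exp(-x))

def probability(x):
    """Continuous probabilistic method: clamp the output to [0, 1]."""
    return min(1.0, max(0.0, x))

def donation_probability(recency, frequency, time):
    # Scaling layer (mean and standard deviation)
    scaled_recency = (recency - 9.50668) / 8.0954
    scaled_frequency = (frequency - 5.51471) / 5.83931
    scaled_time = (time - 34.2821) / 24.3767
    # Hidden perceptron layer with 2 neurons
    y_1_1 = logistic(-3.2852 + scaled_recency * -3.22375
                     + scaled_frequency * 3.67502 + scaled_time * -2.45661)
    y_1_2 = logistic(-4.08721 + scaled_recency * -2.96105
                     + scaled_frequency * 2.76006 + scaled_time * -3.40265)
    # Output perceptron layer followed by the probabilistic layer
    non_probabilistic_donation = logistic(-1.089 + y_1_1 * 5.14874 + y_1_2 * -2.1466)
    return probability(non_probabilistic_donation)

# Hypothetical donor: values chosen only to exercise the function
print(donation_probability(recency=2, frequency=10, time=30))
```

Since the output neuron already uses the logistic function, its value lies between 0 and 1 and the clamp in `probability` leaves it unchanged.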

