Urinary biomarkers for pancreatic cancer detection using Neural Designer

Pancreatic ductal adenocarcinoma (PDAC) is an extremely deadly type of pancreatic cancer. Once diagnosed, the five-year survival rate is less than 10%. However, if the disease is detected at an early stage, when tumors are still small and resectable, the five-year survival rate can increase to up to 70%. Unfortunately, many cases of pancreatic cancer show no symptoms until the cancer has spread throughout the body. Therefore, a diagnostic test to identify people with pancreatic cancer could be enormously helpful.

While blood has traditionally been the main source of biomarkers, urine represents a promising alternative biological fluid. It allows completely non-invasive sampling, high-volume collection, and easy repeated measurements. Currently, no useful biomarkers for earlier detection of PDAC exist; the only biomarker in clinical practice, serum CA19-9, is not specific or sensitive enough for screening purposes and is mainly used as a prognostic marker and for monitoring response to treatment.

Nevertheless, despite its invasive sample collection, plasma CA19-9 improves cancer detection when we include it in the study alongside the urine biomarkers.

Previous studies have identified a panel of 3 protein biomarkers (LYVE1, REG1A, and TFF1) in urine that showed promise in detecting resectable PDAC. In this study, we have improved this panel by substituting REG1A with REG1B.

In summary, the four important biomarkers that we will consider, all present in urine, are creatinine, LYVE1, REG1B, and TFF1. Creatinine is a waste product of muscle metabolism that is often used as an indicator of kidney function. LYVE1 is lymphatic vessel endothelial hyaluronan receptor 1, a protein that may play a role in tumor metastasis. REG1B is a protein that may be associated with pancreas regeneration, and TFF1 is trefoil factor 1, which may be related to regeneration and repair of the urinary tract.

Contents:

  1. Application type.
  2. Data set.
  3. Neural network.
  4. Training strategy.
  5. Model selection.
  6. Testing analysis.
  7. Model deployment.

This example is solved with Neural Designer. To follow it step by step, you can use the free trial.

1. Application type

This is a classification project since the variable to be predicted is categorical (no pancreatic disease, benign hepatobiliary disease, or pancreatic cancer).

The goal is to predict the presence of disease before it is diagnosed and, more specifically, to differentiate pancreatic cancer from benign pancreatic conditions and healthy controls.

2. Data set

The data set was obtained from multiple centres: Barts Pancreas Tissue Bank, University College London, University of Liverpool, Spanish National Cancer Research Center, Cambridge University Hospital, and University of Belgrade. The biomarker panel was assayed on 590 urine specimens: 183 control samples, 208 benign hepatobiliary disease samples (of which 119 were chronic pancreatitis), and 199 PDAC samples.

It is composed of four concepts:

The data file pancreatic-cancer.csv contains the information used to create the model. It consists of 590 rows and 14 columns. The columns represent the different variables of the study, while the rows represent the samples.

This data set uses the following 14 variables:

Among all these input variables, a few must be set as unused:

  - 'sample_id', which Neural Designer sets as unused automatically.
  - 'sample_origin', as it only indicates the origin of the patient samples and should not affect the final diagnosis.
  - 'stage', which only exists for individuals we already know have cancer.
  - 'patient_cohort', as it does not contribute to the final sample diagnosis.
  - 'benign_sample_diagnosis', which only specifies the complete diagnosis for those patients with a benign diagnosis.

The variable corresponding to the biomarker REG1A is not present in all the samples of the study. For that reason, we choose to set it as unused too. This decision does not degrade the model, as the biomarker REG1B improves the results.
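Outside Neural Designer, the same preprocessing can be sketched with pandas. The inline rows below are made-up illustrations (not real study samples); only the column names follow the variable list above.

```python
import io
import pandas as pd

# Illustrative rows only -- these are NOT real study samples.
csv_text = """sample_id,patient_cohort,sample_origin,age,sex,plasma_CA19_9,creatinine,LYVE1,REG1B,REG1A,TFF1,stage,benign_sample_diagnosis,diagnosis
S1,Cohort1,BPTB,61,F,11.7,1.83,0.89,52.9,,654.3,,,1
S2,Cohort1,BPTB,70,M,,0.97,2.04,94.5,,209.5,,Cholecystitis,2
S3,Cohort2,LIV,66,M,230.0,0.78,6.33,250.1,,830.2,III,,3
"""
data = pd.read_csv(io.StringIO(csv_text))

# Variables excluded from the model, for the reasons given above.
unused = ["sample_id", "sample_origin", "stage",
          "patient_cohort", "benign_sample_diagnosis", "REG1A"]

inputs = data.drop(columns=unused + ["diagnosis"])
target = data["diagnosis"]
print(list(inputs.columns))
```

After dropping the unused columns and the target, seven input variables remain: age, sex, plasma_CA19_9, creatinine, LYVE1, REG1B, and TFF1.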

Once the data set is configured, we can calculate the data distribution of the variables. The following figure depicts the number of patients who have cancer and those who do not.

The minimum frequency is 31.0169%, which corresponds to the no-pancreatic-disease diagnosis. The maximum frequency is 35.2542%, which corresponds to the benign hepatobiliary disease diagnosis. As we can see, the samples are well distributed among the three classes.
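These frequencies follow directly from the class counts given above (183 control, 208 benign, 199 PDAC). A quick check with pandas:

```python
import pandas as pd

# Class counts reported above: 183 control, 208 benign, 199 PDAC.
diagnosis = pd.Series([1] * 183 + [2] * 208 + [3] * 199)

freq = diagnosis.value_counts(normalize=True).sort_index() * 100
print(freq.round(4))   # 1: 31.0169, 2: 35.2542, 3: 33.7288
```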

To compare the accuracy and AUC (area under the curve) calculated in this study with those reported in the paper cited in the references section, we have to divide our dataset into four subsets, one per comparison case.

For all these cases, the instances are divided into training and testing subsets, each containing 50% of the samples.
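A 50/50 split like this can be sketched with a NumPy permutation (Neural Designer assigns the subsets internally; the per-case sample size below is an assumption for illustration):

```python
import numpy as np

# Hypothetical case size, e.g. the controls plus one group of PDAC samples
# (the exact per-case count is assumed here for illustration).
n_samples = 277

rng = np.random.default_rng(0)               # fixed seed for reproducibility
indices = rng.permutation(n_samples)
split = n_samples // 2
train_idx, test_idx = indices[:split], indices[split:]
print(len(train_idx), len(test_idx))
```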

2.1. Control samples vs. PDAC stages I and II

The next figure depicts the input-target correlations of all the inputs with the target. This helps us see the influence of the different inputs on the diagnosis.

The most correlated variables are the biomarkers LYVE1 and plasma_CA19_9.
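Input-target correlations of this kind can be reproduced with pandas' corrwith. The data below are synthetic, built only to show the computation (LYVE1 is constructed to track the target, creatinine is pure noise):

```python
import numpy as np
import pandas as pd

# Synthetic data: LYVE1 is built to track the target, creatinine is noise.
rng = np.random.default_rng(1)
target = pd.Series(rng.integers(0, 2, size=200))   # 0 = control, 1 = PDAC
inputs = pd.DataFrame({
    "LYVE1": target * 3.0 + rng.normal(size=200),
    "creatinine": rng.normal(size=200),
})

correlations = inputs.corrwith(target).abs().sort_values(ascending=False)
print(correlations)
```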

2.2. Control samples vs. PDAC stages III and IV

The next figure depicts the input-target correlations of all the inputs with the target. This helps us see the influence of the different inputs on the diagnosis.

The most correlated variables are the biomarkers LYVE1 and TFF1.

2.3. Benign hepatobiliary diseases vs. PDAC stages I and II

The next figure depicts the input-target correlations of all the inputs with the target. This helps us see the influence of the different inputs on the diagnosis.

The most correlated variables are the biomarkers LYVE1 and plasma_CA19_9.

2.4. Benign hepatobiliary diseases vs. PDAC stages III and IV

The next figure depicts the input-target correlations of all the inputs with the target. This helps us see the influence of the different inputs on the diagnosis.

The most correlated variables are the biomarkers LYVE1 and plasma_CA19_9.

3. Neural network

The next step is to choose a neural network to represent the classification function. We will use the same neural network configuration for all four cases in this part of the model creation. For classification problems, the network is usually composed of a scaling layer, perceptron layers, and a probabilistic layer.

We realized that including a perceptron layer contributed to overfitting the neural network. For this reason, we removed the perceptron layer.

The following figure is a diagram of the neural network used in each case of this example:

It contains a scaling layer with 7 neurons (yellow) and a probabilistic layer with 1 neuron (red).
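This architecture amounts to standardizing the seven inputs and passing them through a single logistic neuron. A minimal NumPy forward pass; the means, standard deviations, weights, and bias below are placeholders, not fitted values:

```python
import numpy as np

def forward(x, mean, std, weights, bias):
    """Scaling layer followed by a one-neuron probabilistic (logistic) layer."""
    scaled = (x - mean) / std              # scaling layer (7 neurons)
    logit = scaled @ weights + bias        # probabilistic layer (1 neuron)
    return 1.0 / (1.0 + np.exp(-logit))   # probability of the positive class

# Placeholder parameters and an illustrative patient (7 input variables).
mean = np.zeros(7)
std = np.ones(7)
weights = np.full(7, 0.01)
bias = -0.5
x = np.array([60.0, 1.0, 12.0, 1.8, 0.9, 50.0, 600.0])

probability = forward(x, mean, std, weights, bias)
print(probability)
```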

4. Training strategy

The next step is to configure the training strategy, which is applied to the neural network to obtain the best possible loss. The type of training is determined by how the adjustment of the parameters in the neural network takes place. The training strategy is composed of two concepts: a loss index and an optimization algorithm.

The loss index chosen for this problem is the mean squared error with L2 regularization. It calculates the average squared error between the outputs from the neural network and the target in the data set.

The optimization algorithm is applied to the neural network to get the best performance. Gradient descent is used here for training. With this method, the neural parameters are updated in the direction of the negative gradient of the loss function.
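A bare-bones version of this training loop, with mean squared error plus L2 regularization minimized by gradient descent, can be sketched in NumPy (the data are synthetic; Neural Designer uses its own optimized implementation):

```python
import numpy as np

# Synthetic training data (100 samples, 7 inputs), already scaled.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 7))
y = (X[:, 0] + X[:, 1] > 0).astype(float)    # synthetic binary target

w = np.zeros(7)
b = 0.0
lr, l2 = 0.1, 1e-3                           # learning rate, L2 weight

for epoch in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # network output
    error = p - y
    # Gradient of MSE + L2 penalty with respect to the parameters.
    grad_w = 2 * X.T @ (error * p * (1 - p)) / len(y) + l2 * w
    grad_b = 2 * np.mean(error * p * (1 - p))
    w -= lr * grad_w                          # step along the negative gradient
    b -= lr * grad_b

mse = np.mean((p - y) ** 2)
print(mse)
```

With zero weights the initial MSE is 0.25 (every output is 0.5); the loop drives it well below that.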

The following chart shows how the training error decreases with the epochs during the training process. As all the charts have a similar curvature, we will only show the case corresponding to control samples vs. PDAC stages I and II. The selection error is not plotted in the chart because we have not initially taken any selection samples.

Now, we calculate the training error of all the cases of this study:

6. Testing analysis

The next step is to evaluate the performance of the trained neural network by an exhaustive testing analysis. The standard way to do this is to compare the outputs of the neural network against data never seen before, the testing instances.

A common method to measure the generalization performance is the ROC curve, a visual aid for studying the discrimination capacity of the classifier. One of the parameters obtained from this chart is the area under the curve (AUC). The closer the AUC is to 1, the better the classifier.
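The AUC can also be computed directly from the ranks of the predicted scores, as the probability that a randomly chosen positive sample outranks a randomly chosen negative one. A small sketch with illustrative scores:

```python
import numpy as np

def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    labels = np.asarray(labels)
    scores = np.asarray(scores)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

labels = [0, 0, 0, 1, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.65, 0.90]  # illustrative model outputs
print(auc(labels, scores))                     # perfectly separated -> 1.0
```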

6.1. Control samples vs. PDAC stages I and II

In this case, the AUC takes a high value: AUC = 0.919.

Neural Designer computes the optimal threshold by finding the point of the ROC curve nearest to the upper-left corner. The threshold corresponding to that point is called the optimal threshold; in this case, it has a value of 0.788.
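The closest-to-corner rule described above can be sketched as follows (the labels and scores are illustrative, not the study's real outputs):

```python
import numpy as np

# Illustrative labels and model outputs (not the study's real scores).
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
scores = np.array([0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.35, 0.9])

best_threshold, best_distance = None, np.inf
for t in np.unique(scores):
    pred = scores >= t
    tpr = np.mean(pred[labels == 1])          # sensitivity
    fpr = np.mean(pred[labels == 0])          # 1 - specificity
    distance = np.hypot(fpr, 1.0 - tpr)       # distance to the (0, 1) corner
    if distance < best_distance:
        best_threshold, best_distance = t, distance
print(best_threshold)
```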

The binary classification tests and the confusion matrix give us helpful information about our predictive model's performance. Below, both are displayed for their optimal decision threshold.

                 Predicted positive   Predicted negative
Real positive    34 (39.5%)           6 (7.0%)
Real negative    9 (10.5%)            37 (43.0%)

The classification accuracy takes a high value (82.6%), which means that the prediction is suitable for a large number of cases.
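The binary classification tests follow directly from the confusion matrix above:

```python
# Confusion matrix above: control vs. PDAC stages I and II.
tp, fn = 34, 6    # real positives: predicted positive / predicted negative
fp, tn = 9, 37    # real negatives: predicted positive / predicted negative

accuracy = (tp + tn) / (tp + fn + fp + tn)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(round(accuracy, 3), round(sensitivity, 3), round(specificity, 3))
# 0.826 0.85 0.804
```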

6.2. Control samples vs. PDAC stages III and IV

In this case, the AUC takes a high value: AUC = 0.913.

The optimal threshold has a value of 0.587.

                 Predicted positive   Predicted negative
Real positive    85 (60.7%)           9 (6.4%)
Real negative    7 (5.0%)             39 (27.9%)

The classification accuracy takes a high value (88.6%), which means that the prediction is suitable for a large number of cases.

6.3. Benign hepatobiliary diseases vs. PDAC stages I and II

In this case, the AUC takes a high value: AUC = 0.920.

The optimal threshold has a value of 0.653.

                 Predicted positive   Predicted negative
Real positive    44 (46.8%)           5 (5.3%)
Real negative    11 (11.7%)           34 (36.2%)

The classification accuracy takes a high value (83.0%), which means that the prediction is suitable for a large number of cases.

6.4. Benign hepatobiliary diseases vs. PDAC stages III and IV

In this case, the AUC takes a high value: AUC = 0.848.

The optimal threshold has a value of 0.412.

                 Predicted positive   Predicted negative
Real positive    47 (52.8%)           11 (12.4%)
Real negative    8 (9.0%)             23 (25.8%)

The classification accuracy takes a high value (78.7%), which means that the prediction is suitable for a large number of cases.

As in the paper, we show a table with several sensitivity and specificity cutoffs. First, we treat the case of the control samples versus pancreatic cancer stages I and II, and stages III and IV:

Sensitivity cutoff   Specificity (Control vs. I, II)   Specificity (Control vs. III, IV)
0.8                  0.86                              0.875
0.85                 0.791                             0.854
0.9                  0.744                             0.833
0.95                 0.512                             0.771

Now we treat the case of the benign samples versus pancreatic cancer stages I and II, and III and IV:

Specificity cutoff   Sensitivity (Benign vs. I, II)   Sensitivity (Benign vs. III, IV)
0.8                  0.846                            0.676
0.85                 0.769                            0.647
0.9                  0.769                            0.618
0.95                 0.615                            0.559
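Tables like these map a cutoff on one axis of the ROC curve to the value achievable on the other. A sketch of the computation for sensitivity cutoffs, on synthetic scores:

```python
import numpy as np

# Synthetic scores: negatives centered at 0.3, positives at 0.7.
rng = np.random.default_rng(3)
labels = np.array([0] * 50 + [1] * 50)
scores = np.concatenate([rng.normal(0.3, 0.15, 50),
                         rng.normal(0.7, 0.15, 50)])

def specificity_at_sensitivity(labels, scores, cutoff):
    """Highest specificity achievable while sensitivity stays >= cutoff."""
    for t in np.sort(np.unique(scores))[::-1]:    # thresholds, high to low
        pred = scores >= t
        if np.mean(pred[labels == 1]) >= cutoff:  # sensitivity reached
            return np.mean(~pred[labels == 0])    # report the specificity
    return 0.0

for cutoff in (0.8, 0.85, 0.9, 0.95):
    print(cutoff, specificity_at_sensitivity(labels, scores, cutoff))
```

As in the tables above, demanding a higher sensitivity forces a lower threshold and therefore a lower specificity.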

7. Model deployment

Once the generalization performance of the neural network has been tested, the neural network can be saved for future use in the so-called model deployment mode.

An interesting task in the model deployment tool is to calculate outputs, which produces a set of outputs for each set of inputs applied. The outputs depend, in turn, on the values of the parameters.
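The deployed model is just a function from inputs to a probability. A sketch of what "calculate outputs" does, using placeholder parameters and, for brevity, only the four urine biomarkers (all values below are hypothetical, not the fitted values from this study):

```python
import numpy as np

# Hypothetical saved parameters: scaling statistics and logistic weights
# for creatinine, LYVE1, REG1B, and TFF1. NOT fitted values.
mean = np.array([0.8, 3.0, 100.0, 500.0])
std = np.array([0.4, 2.5, 120.0, 400.0])
weights = np.array([0.2, 1.1, 0.6, 0.4])
bias = -0.3

def predict(patient):
    """Return the model output (cancer probability) for one urine panel."""
    scaled = (patient - mean) / std
    return 1.0 / (1.0 + np.exp(-(scaled @ weights + bias)))

patient = np.array([1.2, 8.1, 300.0, 1200.0])   # illustrative measurements
risk = predict(patient)
print(round(float(risk), 3))
```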

Next, we show an example of the diagnosis for the benign hepatobiliary disease versus PDAC stages III and IV case.

The pancreatic cancer risk (stages III or IV) of that person would be high.

References:
