Parkinson's disease telemonitoring
By Sergio Sanchez, Artelnics.
Parkinson's disease is a degenerative disorder of the central nervous system. There is no cure for this disease, but medications, surgery and multidisciplinary management can provide relief from the symptoms. The stage of the disease determines what group of drugs is most useful in order to relieve the symptoms. The goal of this study is to predict the clinician's Parkinson's disease symptom score on the UPDRS scale.
This dataset is composed of a range of biomedical voice measurements from 42 people with early-stage Parkinson's disease recruited to a six-month trial of a telemonitoring device for remote symptom progression monitoring. The recordings were automatically captured in the patients' homes.
The following table depicts the names, descriptions and uses of all the variables in the data set.
In this case the number of variables is 22, and the number of instances is 5875.
There are 3525 instances for training (60%), 1175 instances for generalization (20%), 1175 instances for testing (20%) and 0 unused instances (0%).
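This split can be reproduced outside Neural Designer. The following is a minimal sketch, assuming a random 60/20/20 assignment of the 5875 instances; the function name and seed are illustrative, not part of the original study.

```python
import numpy as np

def split_instances(n_instances, seed=0):
    """Randomly assign instances to training (60%), generalization (20%)
    and testing (20%) subsets, as described above."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_instances)
    n_train = int(0.6 * n_instances)
    n_general = int(0.2 * n_instances)
    train = indices[:n_train]
    general = indices[n_train:n_train + n_general]
    test = indices[n_train + n_general:]
    return train, general, test

train, general, test = split_instances(5875)
print(len(train), len(general), len(test))  # 3525 1175 1175
```

With 5875 instances, the 60/20/20 proportions work out exactly to the counts reported above.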
The second step is to configure the neural network. For the approximation project type, it is composed of:
- Scaling layer.
- Principal components layer.
- Perceptron layers.
- Unscaling layer.
- Bounding layer.
The following figure shows the neural network page in Neural Designer.
A graphical representation of the neural network is depicted next. It contains a scaling layer, a multilayer perceptron and an unscaling layer. The number of inputs is 19, and the number of outputs is 2. The complexity, represented by the number of hidden perceptrons, is 19.
The neural network defines a function which represents the model.
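That function is the composition of the layers listed above. The following sketch shows the idea with hypothetical statistics and weights (in practice these come from the data set and from training); the 19-19-2 architecture matches the one described, with a tanh hidden layer as a common choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical statistics and weights; real values come from the data
# set and from the training process in Neural Designer.
input_means, input_stds = np.zeros(19), np.ones(19)
output_means, output_stds = np.zeros(2), np.ones(2)
W1, b1 = rng.normal(size=(19, 19)), np.zeros(19)   # hidden layer (19 perceptrons)
W2, b2 = rng.normal(size=(19, 2)), np.zeros(2)     # output layer (2 outputs)

def network(x):
    """Scaling layer -> hidden tanh layer -> linear output -> unscaling layer."""
    scaled = (x - input_means) / input_stds          # scaling layer
    hidden = np.tanh(scaled @ W1 + b1)               # multilayer perceptron
    outputs = hidden @ W2 + b2                       # linear output layer
    return outputs * output_stds + output_means      # unscaling layer

y = network(np.zeros(19))
print(y.shape)  # (2,)
```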
The third step is to select an appropriate loss index, which plays an important role in the use of a neural network. It defines the task the neural network is required to do, and provides a measure of the quality of the representation that it is required to learn. The choice of a suitable loss index depends on the particular application.
The normalized squared error is used here as objective term. It divides the squared error between the outputs from the neural network and the targets in the data set by a normalization coefficient. If the normalized squared error has a value of unity then the neural network is predicting the data 'in the mean', while a value of zero means perfect prediction of the data.
The procedure used to carry out the learning process is called training (or learning) strategy. The training strategy is applied to the neural network in order to obtain the best possible performance. The type of training is determined by the way in which the adjustment of the parameters in the neural network takes place.
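The training algorithm used below is the quasi-Newton method, which builds an approximation to the Hessian from successive gradients. As an illustration of the principle only (not Neural Designer's implementation), here is a sketch minimizing a toy quadratic loss with SciPy's BFGS, a standard quasi-Newton algorithm:

```python
import numpy as np
from scipy.optimize import minimize

# Toy quadratic loss standing in for the neural network's loss index.
TARGET_PARAMS = np.array([1.0, -2.0, 0.5])

def loss(params):
    return np.sum((params - TARGET_PARAMS) ** 2)

result = minimize(loss, x0=np.zeros(3), method="BFGS")
print(result.x)    # close to [1.0, -2.0, 0.5]
print(result.fun)  # essentially zero
```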
The following chart shows how the error decreases with the iterations during the training process. The initial value is 15.3913, and the final value after 100 iterations is 0.288922 (only the first 100 iterations are shown).
The next table shows the training results by the quasi-Newton method. They include some final states from the neural network, the loss index and the training algorithm.
Here the final parameters norm is not excessively large, the final training and generalization errors are small, and the final gradient norm is close to zero.
A standard method for testing the predictive model is to compare the outputs from the neural network against data never seen before.
The following images show the linear regression analysis for the two main outputs (motor_UPDRS and total_UPDRS).
First of all, we can see motor_UPDRS values:
The next chart illustrates the linear regression for the scaled output motor_UPDRS. The predicted values are plotted versus the actual ones as squares. The coloured line indicates the best linear fit. The grey line would indicate a perfect fit. Note that some scaled outputs fall outside the range defined by the scaled targets, and therefore they are not plotted.
For this variable, there is a good correlation, since the corresponding coefficient is high (0.919). Also, the intercept is close to 0 (-0.00862) and the slope is close to 1 (0.853). The plot also indicates that the neural network predicts motor_UPDRS well.
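These three quantities can be computed directly from the predicted and target values. A minimal sketch with NumPy, using synthetic data in place of the actual network outputs (the numbers here are illustrative, not the study's results):

```python
import numpy as np

def regression_metrics(targets, outputs):
    """Fit outputs = slope * targets + intercept and return
    (intercept, slope, correlation), as in the analysis above."""
    slope, intercept = np.polyfit(targets, outputs, deg=1)
    correlation = np.corrcoef(targets, outputs)[0, 1]
    return intercept, slope, correlation

# Synthetic example: near-perfect predictions give slope and
# correlation close to 1 and intercept close to 0.
rng = np.random.default_rng(1)
targets = rng.uniform(0.0, 1.0, size=200)
outputs = targets + rng.normal(scale=0.02, size=200)
intercept, slope, correlation = regression_metrics(targets, outputs)
print(round(slope, 2), round(correlation, 2))
```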
Secondly, we have total_UPDRS values:
The next chart illustrates the linear regression for the scaled output total_UPDRS. The predicted values are plotted versus the actual ones as squares. The coloured line indicates the best linear fit. The grey line would indicate a perfect fit. Note that some scaled outputs fall outside the range defined by the scaled targets, and therefore they are not plotted.
As before, the intercept, slope and correlation values are good. The plot also indicates that the prediction is accurate for most of the cases.
The neural network is now ready to estimate the motor_UPDRS and the total_UPDRS with satisfactory quality over the same range of data.
The "Calculate output" task calculates the output value for a given set of input values.
We can also use the mathematical expression of the neural network, which is listed next.
- The data for this problem has been taken from the UCI Machine Learning Repository.