Image analysis with neural networks lets us detect diseased trees and anticipate which ones are at risk of falling. Analyzing high-resolution images helps us decide which trees to remove and where to plant new ones.

This example uses remote sensing data for detecting diseased trees.

The data set consists of image segments, generated by segmenting the pan-sharpened image. The segments contain spectral information from the Quickbird multispectral image bands and texture information from the panchromatic (Pan) image band.

This is a classification project, since the variable to be predicted is binary (diseased region or not).

The goal here is to model the probability that a region of trees presents wilt, conditioned on the image features.

The data set comprises a data matrix in which columns represent variables and rows represent instances.

The data file tree_wilt.csv contains the information for creating the model. Here, the number of variables is 6, and the number of instances is 574.

The total number of variables is 6:

- **glcm**: Mean gray level co-occurrence matrix (GLCM) texture index.
- **green**: Mean green (G) value.
- **red**: Mean red (R) value.
- **nir**: Mean near infrared (NIR) value.
- **pan_band**: Standard deviation of the Pan band.
- **class**: Diseased trees or all other land cover.

The total number of instances is 574. They are divided into training, selection and testing subsets. The number of training instances is 346 (60%), the number of selection instances is 114 (20%) and the number of testing instances is 114 (20%).
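The split described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `split_indices`, the random shuffle and the seed are hypothetical, and the source does not say how instances were actually assigned to each subset.

```python
import random

def split_indices(n_instances, selection_ratio=0.2, testing_ratio=0.2, seed=0):
    """Shuffle instance indices and split them into three subsets.

    Selection and testing sizes are rounded down; training takes the rest.
    """
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)  # hypothetical: random assignment
    n_selection = int(n_instances * selection_ratio)
    n_testing = int(n_instances * testing_ratio)
    n_training = n_instances - n_selection - n_testing
    training = indices[:n_training]
    selection = indices[n_training:n_training + n_selection]
    testing = indices[n_training + n_selection:]
    return training, selection, testing

training, selection, testing = split_indices(574)
print(len(training), len(selection), len(testing))  # 346 114 114
```

Rounding the two smaller subsets down and giving the remainder to training reproduces the 346/114/114 sizes quoted in the text.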

The classes are heavily imbalanced: the training set in the UCI repository has few samples for the 'diseased trees' class (74) and many for the 'other land cover' class (4265).

Now we have to configure the neural network that represents the classification function.

The number of inputs is 5, and the number of outputs is 1. The complexity, represented by the number of hidden perceptrons, is 5. The probabilistic layer allows the output to be interpreted as a probability, i.e., its value always lies between 0 and 1.
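To make the architecture concrete, here is a minimal forward pass for a network with 5 inputs, 5 hidden perceptrons and 1 output. The hyperbolic tangent hidden activation, the logistic output activation, the random parameter values and the scaled input values are all assumptions for illustration; the source does not specify them.

```python
import math
import random

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Return the output of a one-hidden-layer network for one instance."""
    # Hidden layer: one tanh perceptron per row of weights (assumed activation).
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    # Output layer: logistic activation keeps the output in (0, 1).
    combination = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return 1.0 / (1.0 + math.exp(-combination))

n_inputs, n_hidden = 5, 5
rng = random.Random(0)  # hypothetical random initialization
hidden_weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
hidden_biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
output_weights = [rng.uniform(-1, 1) for _ in range(n_hidden)]
output_bias = rng.uniform(-1, 1)

# Total number of adjustable parameters: weights and biases of both layers.
n_parameters = n_hidden * n_inputs + n_hidden + n_hidden + 1

# Hypothetical scaled values for glcm, green, red, nir and pan_band.
probability = forward([0.1, -0.3, 0.5, 0.2, -0.7],
                      hidden_weights, hidden_biases, output_weights, output_bias)
print(n_parameters, probability)
```

With this layout the network has 36 parameters, and whatever their values, the output stays strictly between 0 and 1.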

The following picture shows a graph of the neural network for this example.

The loss index defines the task that the neural network is required to accomplish. The weighted squared error with L2 regularization is used here.

The learning problem can be stated as finding a neural network that minimizes the loss index, that is, a neural network that fits the data set (error term) without undesired oscillations (regularization term).
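As a rough sketch, a loss index of this kind can be written as an error term plus a regularization term. The definitions below are assumptions, not necessarily the exact formulas used to produce the results in this example: the error is averaged over instances, positives are weighted by the ratio of negatives to positives, and `regularization_weight` is a hypothetical value.

```python
def weighted_squared_error(outputs, targets, positives_weight, negatives_weight=1.0):
    """Mean squared error in which each class has its own weight."""
    total = 0.0
    for output, target in zip(outputs, targets):
        weight = positives_weight if target == 1 else negatives_weight
        total += weight * (output - target) ** 2
    return total / len(targets)

def l2_regularization(parameters):
    """Sum of squared parameters, which penalizes large weights."""
    return sum(p * p for p in parameters)

def loss_index(outputs, targets, parameters, positives_weight, regularization_weight=0.01):
    """Error term plus weighted regularization term."""
    error = weighted_squared_error(outputs, targets, positives_weight)
    return error + regularization_weight * l2_regularization(parameters)

# Toy data: one diseased instance (1) and three others (0).
targets = [1, 0, 0, 0]
outputs = [0.6, 0.1, 0.2, 0.0]
positives_weight = targets.count(0) / targets.count(1)  # 3.0, balances the classes
error = weighted_squared_error(outputs, targets, positives_weight)
loss = loss_index(outputs, targets, [0.5, -0.5], positives_weight)
print(error, loss)
```

Weighting the minority class more heavily keeps the few diseased instances from being swamped by the many instances of other land cover.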

The procedure used to carry out the learning process is called the optimization algorithm. The optimization algorithm is applied to the neural network to obtain the minimum possible loss. The type of training is determined by the way in which the parameters of the neural network are adjusted.

The quasi-Newton method is used here as the optimization algorithm in the training strategy.
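The idea behind a quasi-Newton method is to build an approximation of the inverse Hessian from successive gradients instead of computing second derivatives. The sketch below uses the BFGS update with a backtracking line search, which is one common choice; the source does not say which update formula or line search was used. The quadratic function is a hypothetical stand-in for the loss index.

```python
def bfgs(f, grad, x0, max_epochs=100, tolerance=1e-8):
    """Minimize f with the BFGS quasi-Newton method (illustrative sketch)."""
    n = len(x0)
    x = list(x0)
    # Inverse Hessian approximation, initialized to the identity matrix.
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = grad(x)
    for _ in range(max_epochs):
        if sum(gi * gi for gi in g) ** 0.5 < tolerance:
            break
        # Training direction: d = -H g.
        d = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        # Backtracking line search with the Armijo sufficient-decrease condition.
        step, fx = 1.0, f(x)
        slope = sum(gi * di for gi, di in zip(g, d))
        while (f([xi + step * di for xi, di in zip(x, d)]) > fx + 1e-4 * step * slope
               and step > 1e-12):
            step *= 0.5
        x_new = [xi + step * di for xi, di in zip(x, d)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 1e-16:  # update only when the curvature condition holds
            Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
            yHy = sum(yi * hyi for yi, hyi in zip(y, Hy))
            for i in range(n):
                for j in range(n):
                    H[i][j] += ((sy + yHy) * s[i] * s[j] / sy ** 2
                                - (Hy[i] * s[j] + s[i] * Hy[j]) / sy)
        x, g = x_new, g_new
    return x

# Hypothetical loss with its minimum at (1, -2).
def toy_loss(p):
    return (p[0] - 1.0) ** 2 + 10.0 * (p[1] + 2.0) ** 2

def toy_gradient(p):
    return [2.0 * (p[0] - 1.0), 20.0 * (p[1] + 2.0)]

solution = bfgs(toy_loss, toy_gradient, [0.0, 0.0])
print(solution)
```

For a real network the parameter vector would hold all weights and biases, and the gradient would come from backpropagation rather than a closed-form expression.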

The following chart shows how the training and selection errors decrease over the epochs of the optimization algorithm during the training process.
The final values are **training error = XXX WSE** and **selection error = XXX WSE**, respectively.

The objective of model selection is to find the network architecture with the best generalization properties, that is, the one that minimizes the error on the selection instances of the data set.

More specifically, we want to find a neural network with a selection error less than **XXX**,
which is the value that we have achieved so far.

Order selection algorithms train several network architectures with different numbers of neurons and select the one with the smallest selection error.

The incremental order method starts with a small number of neurons and increases the complexity at each iteration. The following chart shows the training error (blue) and the selection error (orange) as a function of the number of neurons.
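The incremental order loop can be sketched as follows. The `evaluate` callback, the stopping rule based on `max_failures` and the toy error curves are assumptions for illustration; in practice `evaluate` would train a network with the given number of hidden neurons and return its training and selection errors.

```python
def incremental_order(evaluate, min_neurons=1, max_neurons=10, max_failures=3):
    """Grow the hidden layer one neuron at a time and keep the best size.

    evaluate(neurons) must return (training_error, selection_error).
    The search stops early after max_failures sizes in a row without improvement.
    """
    best_neurons, best_selection_error = None, float("inf")
    failures = 0
    history = []
    for neurons in range(min_neurons, max_neurons + 1):
        training_error, selection_error = evaluate(neurons)
        history.append((neurons, training_error, selection_error))
        if selection_error < best_selection_error:
            best_neurons, best_selection_error = neurons, selection_error
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:
                break
    return best_neurons, best_selection_error, history

def toy_evaluate(neurons):
    """Hypothetical error curves: training error keeps falling,
    selection error rises again once the model starts to overfit."""
    training_error = 1.0 / neurons
    selection_error = 1.0 / neurons + 0.02 * (neurons - 1) ** 2
    return training_error, selection_error

best_neurons, best_error, history = incremental_order(toy_evaluate)
print(best_neurons, len(history))  # 3 6
```

The toy curves mimic the typical picture: the training error decreases monotonically with complexity, while the selection error has a minimum that marks the preferred architecture.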

The last step is to test the generalization performance of the trained neural network.

In the confusion matrix, the rows represent the target classes and the columns represent the output classes for the testing data set. The diagonal cells show the number of cases that were correctly classified, and the off-diagonal cells show the misclassified cases. The following table contains the elements of the confusion matrix for this application.

| | Predicted positive | Predicted negative |
|---|---|---|
| Real positive | 103 | 0 |
| Real negative | 0 | 171 |

The number of correctly classified instances is 106, and the number of misclassified instances is 8.

The following list shows the binary classification tests for this application:

- **Classification accuracy: 79.4%** (ratio of correctly classified samples).
- **Error rate: 20.6%** (ratio of misclassified samples).
- **Sensitivity: 80.4%** (percentage of actual positives classified as positive).
- **Specificity: 79.3%** (percentage of actual negatives classified as negative).
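Each of these tests is a simple ratio of confusion matrix elements. The sketch below shows the formulas; the function name is hypothetical and the counts passed to it are made-up values chosen for easy arithmetic, not the results of this application.

```python
def binary_classification_tests(true_positives, false_negatives,
                                false_positives, true_negatives):
    """Compute the standard binary classification tests from confusion counts."""
    total = true_positives + false_negatives + false_positives + true_negatives
    return {
        "accuracy": (true_positives + true_negatives) / total,
        "error_rate": (false_positives + false_negatives) / total,
        "sensitivity": true_positives / (true_positives + false_negatives),
        "specificity": true_negatives / (true_negatives + false_positives),
    }

# Hypothetical counts: 10 actual positives and 90 actual negatives.
tests = binary_classification_tests(true_positives=8, false_negatives=2,
                                    false_positives=10, true_negatives=80)
for name, value in tests.items():
    print(f"{name}: {value:.1%}")
```

On imbalanced data such as this, sensitivity and specificity are more informative than accuracy alone, because a model that always predicts the majority class can still score a high accuracy.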

The neural network is now ready to predict outputs for inputs that it has never seen.

The next table shows the input values and their corresponding output values. The input variables are GLCM_pan, mean_G, mean_R, mean_NIR and SD_pan, and the output variable is class.

The mathematical expression represented by the neural network is written below.

- The data for this problem has been taken from the UCI Machine Learning Repository.
- Johnson, B., Tateishi, R., Hoan, N., 2013. A hybrid pansharpening approach and multiscale object-based image analysis for mapping diseased pine and oak trees. International Journal of Remote Sensing, 34 (20), 6969-6982.