The aim of this example is to identify human activity from data collected by a smartphone.

Activity recognition allows, for instance, the development of health apps that track the behaviour of users and provide recommendations.

The experiments were carried out with a group of 30 volunteers aged 19 to 48. Each person performed six activities (walking, walking_upstairs, walking_downstairs, sitting, standing, laying) while wearing a smartphone on the waist.

Using the smartphone's embedded accelerometer and gyroscope, we captured 3-axial linear acceleration and 3-axial angular velocity at a constant rate of 50 Hz. The experiments were video-recorded so that the data could be labeled manually.

This is a classification project, since the variable to be predicted is categorical (walking, walking_upstairs, walking_downstairs, sitting, standing or laying).

The goal here is to model the probabilities of class membership, conditioned on the input variables.

The file activity_recognition.csv contains 10299 samples, each of them with 561 inputs and one categorical target.

The input variables include time and frequency domain signals obtained from the smartphone sensors:

- **body_acceleration**
- **gravity_acceleration**
- **body_acceleration_jerk**
- **body_angular_speed**
- **body_angular_acceleration**
- **body_acceleration_magnitude**
- **gravity_acceleration_magnitude**
- **body_acceleration_jerk_magnitude**
- **body_angular_speed_magnitude**
- **body_angular_acceleration_magnitude**

The target variable has six different classes, each of them corresponding to one of the previously mentioned activities.

- **walking**
- **walking_upstairs**
- **walking_downstairs**
- **sitting**
- **standing**
- **laying**
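A network with six outputs usually receives a categorical target as a six-element indicator vector. The sketch below shows one way to encode and decode the activity labels. The class order and the helper names `one_hot` and `decode` are assumptions made for illustration; the source does not state how the target is encoded internally.

```python
# Class order is an assumption; any fixed order works as long as it is used consistently.
CLASSES = ["walking", "walking_upstairs", "walking_downstairs",
           "sitting", "standing", "laying"]

def one_hot(label, classes=CLASSES):
    """Encode an activity name as a vector with a single 1.0 entry."""
    if label not in classes:
        raise ValueError(f"unknown activity: {label!r}")
    return [1.0 if name == label else 0.0 for name in classes]

def decode(probabilities, classes=CLASSES):
    """Return the activity whose probability is largest."""
    best = max(range(len(classes)), key=lambda i: probabilities[i])
    return classes[best]

encoded = one_hot("sitting")
predicted = decode([0.01, 0.02, 0.02, 0.90, 0.03, 0.02])
print(encoded, predicted)
```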

The instances are split at random into training (60%), selection (20%) and testing (20%) subsets.
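A random 60/20/20 split can be sketched with the standard library alone. The function name `split_indices`, the fixed seed and the rounding rule are assumptions; the source does not say how the original tool rounds the subset sizes, so the sizes produced here may differ from it by one instance.

```python
import random

def split_indices(n_samples, train=0.6, selection=0.2, seed=0):
    """Shuffle sample indices and cut them into three disjoint subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_train = round(train * n_samples)
    n_selection = round(selection * n_samples)
    train_idx = indices[:n_train]
    selection_idx = indices[n_train:n_train + n_selection]
    test_idx = indices[n_train + n_selection:]  # whatever remains
    return train_idx, selection_idx, test_idx

train_idx, selection_idx, test_idx = split_indices(10299)
print(len(train_idx), len(selection_idx), len(test_idx))
```

Working with indices rather than copies of the rows keeps the three subsets cheap to store and guarantees they do not overlap.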

By calculating the data distribution, we can see the number of instances belonging to each class in the data set.

As we can see, the number of instances belonging to each category is similar. Therefore this data set is well balanced.
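Counting the instances per class takes only a few lines. The following sketch uses a short hypothetical label list purely to show the call; in practice the list would be the target column of activity_recognition.csv.

```python
from collections import Counter

def class_distribution(labels):
    """Return {class: (count, fraction of all instances)}."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {name: (n, n / total) for name, n in sorted(counts.items())}

# Hypothetical labels for illustration only.
labels = ["walking", "sitting", "laying", "walking", "standing", "sitting"]
distribution = class_distribution(labels)
for name, (n, fraction) in distribution.items():
    print(f"{name:10s} {n:3d} {fraction:6.1%}")
```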

We can also calculate the inputs-targets correlations to see which signals better define each activity. The following chart shows the 20 variables most correlated with the activity "standing".
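One simple way to rank inputs against a single activity is to correlate each input column with a 0/1 indicator of that activity. The sketch below assumes a plain Pearson coefficient; the tool that produced the chart may use a different correlation measure. The column names and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)

def top_correlated(columns, labels, activity, k=20):
    """Rank input columns by |correlation| with the indicator of one activity."""
    indicator = [1.0 if label == activity else 0.0 for label in labels]
    scores = {name: pearson(values, indicator) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)[:k]

# Hypothetical toy data: two inputs, four samples.
columns = {
    "feature_a": [0.9, 0.8, 0.1, 0.2],
    "feature_b": [0.5, 0.4, 0.5, 0.4],
}
labels = ["standing", "standing", "walking", "sitting"]
ranking = top_correlated(columns, labels, "standing", k=2)
print(ranking)
```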

The person's activity model will be a neural network composed of:

- Scaling layer.
- Perceptron layers.
- Probabilistic layer.

The scaling layer uses the minimum and maximum scaling method.
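Minimum and maximum scaling maps each input linearly onto a fixed interval using the smallest and largest value of that input. The sketch below assumes a target interval of [-1, 1], exposed through the hypothetical parameters `low` and `high`; the interval actually used by the scaling layer is not stated in the text.

```python
def fit_min_max(rows):
    """Collect the per-column minimum and maximum from a list of samples."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def scale_min_max(row, minimums, maximums, low=-1.0, high=1.0):
    """Map each value linearly so that min -> low and max -> high."""
    scaled = []
    for x, lo, hi in zip(row, minimums, maximums):
        if hi == lo:
            scaled.append(0.5 * (low + high))  # constant column: use the midpoint
        else:
            scaled.append(low + (high - low) * (x - lo) / (hi - lo))
    return scaled

# Hypothetical toy samples with two inputs each.
rows = [[0.0, 10.0], [5.0, 20.0], [10.0, 40.0]]
minimums, maximums = fit_min_max(rows)
scaled_rows = [scale_min_max(r, minimums, maximums) for r in rows]
print(scaled_rows)
```

The minimums and maximums would normally be computed once and then reused unchanged when scaling new data.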

The number of perceptron layers is 2:

- The first perceptron layer has 561 inputs and 3 neurons.
- The second perceptron layer has 3 inputs and 6 neurons (the number of classes).

The probabilistic layer uses the softmax probabilistic method.
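The architecture described above can be sketched as a plain forward pass. Several details are assumptions: a hyperbolic tangent activation in the first perceptron layer, a linear activation in the second, and small random placeholder weights in place of trained parameters. The input vector stands in for one sample that has already been scaled.

```python
import math
import random

def perceptron_layer(inputs, weights, biases, activation):
    """weights[j][i] connects input i to neuron j."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def softmax(values):
    """Turn arbitrary scores into probabilities that sum to one."""
    shift = max(values)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def random_layer(n_inputs, n_neurons, rng):
    """Placeholder parameters; a trained model would load its own."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [rng.uniform(-0.1, 0.1) for _ in range(n_neurons)]
    return weights, biases

rng = random.Random(0)
w1, b1 = random_layer(561, 3, rng)  # first perceptron layer: 561 inputs, 3 neurons
w2, b2 = random_layer(3, 6, rng)    # second perceptron layer: 3 inputs, 6 neurons

scaled_inputs = [rng.uniform(-1.0, 1.0) for _ in range(561)]
hidden = perceptron_layer(scaled_inputs, w1, b1, math.tanh)   # assumed activation
scores = perceptron_layer(hidden, w2, b2, lambda v: v)        # assumed linear
probabilities = softmax(scores)                               # probabilistic layer

n_parameters = (sum(len(row) for row in w1) + len(b1)
                + sum(len(row) for row in w2) + len(b2))
print(probabilities, n_parameters)
```

With these layer sizes the network has 561·3 + 3 + 3·6 + 6 = 1710 adjustable parameters.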

The procedure used to carry out the learning process is called the training strategy. It is applied to the neural network to obtain the best possible performance. The type of training is determined by the way in which the parameters of the neural network are adjusted.

We set the normalized squared error with L2 regularization as the loss index.
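As a rough illustration, the normalized squared error (NSE) can be computed as the sum of squared differences between outputs and targets, divided by the sum of squared deviations of the targets from their mean. The sketch below follows that definition and adds an L2 penalty taken as the sum of squared parameters. The exact normalization and penalty form used by the original tool, as well as the regularization weight, are assumptions.

```python
def normalized_squared_error(outputs, targets):
    """Squared error divided by the squared deviation of targets from their mean."""
    n_outputs = len(targets[0])
    means = [sum(t[k] for t in targets) / len(targets) for k in range(n_outputs)]
    error = sum((o - t) ** 2
                for out, tar in zip(outputs, targets)
                for o, t in zip(out, tar))
    normalization = sum((t - m) ** 2
                        for tar in targets
                        for t, m in zip(tar, means))
    return error / normalization

def l2_penalty(parameters, weight=0.01):
    """Assumed form: weight times the sum of squared parameters."""
    return weight * sum(p * p for p in parameters)

# Hypothetical toy case: two samples, two classes, one-hot targets.
targets = [[1.0, 0.0], [0.0, 1.0]]
perfect = normalized_squared_error(targets, targets)
mean_guess = normalized_squared_error([[0.5, 0.5], [0.5, 0.5]], targets)
loss = perfect + l2_penalty([0.5, -0.5], weight=0.01)
print(perfect, mean_guess, loss)
```

Under this definition a value of 0 means a perfect fit and a value of 1 means the model does no better than always predicting the mean of the targets, which gives a scale for reading the errors reported in NSE.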

We use the quasi-Newton method as the optimization algorithm.

The following chart shows how the training and selection errors decrease with the epochs of the quasi-Newton method during the training process.

As we can see, both curves behave similarly across the iterations, which suggests that no over-fitting has appeared.
The final training and selection errors are **training error = 0.008 NSE** and **selection error = 0.048 NSE**.
Both values are small, which indicates that the neural network has good generalization capabilities.

The objective of model selection is to find the network architecture with the best generalization properties.

Since the final selection error that we have so far is very small (0.048 NSE), there is no need to use this kind of algorithm here.

Once the model is trained, we perform a testing analysis to validate its prediction capacity. For that, we use a subset of data that have not been used before, the testing instances.

The next table shows the confusion matrix for our problem. In the confusion matrix, the rows represent the real classes and the columns the predicted classes for the testing data.

| | Predicted STANDING | Predicted SITTING | Predicted LAYING | Predicted WALKING | Predicted WALKING_DOWNSTAIRS | Predicted WALKING_UPSTAIRS |
|---|---|---|---|---|---|---|
| Real STANDING | 376 (18.3%) | 18 (0.874%) | 0 | 0 | 0 | 0 |
| Real SITTING | 18 (0.874%) | 330 (16%) | 1 (0.0486%) | 0 | 0 | 0 |
| Real LAYING | 0 | 2 (0.0971%) | 402 (19.5%) | 0 | 0 | 0 |
| Real WALKING | 1 (0.0486%) | 0 | 0 | 307 (14.9%) | 0 | 0 |
| Real WALKING_DOWNSTAIRS | 0 | 0 | 0 | 0 | 279 (13.6%) | 1 (0.0486%) |
| Real WALKING_UPSTAIRS | 1 (0.0486%) | 0 | 0 | 0 | 5 (0.243%) | 318 (15.4%) |

As we can see, the model correctly predicts 2012 of the 2059 testing instances and misclassifies only 47, which corresponds to a classification accuracy of about 97.7%. Most of the errors are confusions between standing and sitting (18 in each direction).
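The totals quoted above can be checked directly from the confusion matrix. The sketch below copies the counts from the table (rows are real classes, columns are predicted classes) and also derives per-class recall and precision, which the text does not report.

```python
classes = ["standing", "sitting", "laying",
           "walking", "walking_downstairs", "walking_upstairs"]

# Counts copied from the confusion matrix: rows = real, columns = predicted.
confusion = [
    [376,  18,   0,   0,   0,   0],
    [ 18, 330,   1,   0,   0,   0],
    [  0,   2, 402,   0,   0,   0],
    [  1,   0,   0, 307,   0,   0],
    [  0,   0,   0,   0, 279,   1],
    [  1,   0,   0,   0,   5, 318],
]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(classes)))
errors = total - correct
accuracy = correct / total

recall = {}     # share of each real class that was predicted correctly
precision = {}  # share of each predicted class that was actually correct
for i, name in enumerate(classes):
    recall[name] = confusion[i][i] / sum(confusion[i])
    precision[name] = confusion[i][i] / sum(row[i] for row in confusion)

print(f"correct={correct} errors={errors} accuracy={accuracy:.4f}")
for name in classes:
    print(f"{name:20s} recall={recall[name]:.3f} precision={precision[name]:.3f}")
```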

The neural network is now ready to be used to predict the activity of new people in the so-called model deployment phase.

The file activity_recognition.py implements the mathematical expression of the neural network in Python. This piece of software can be embedded in any tool to make predictions on new data.

- UCI Machine Learning Repository Human Activity Recognition Using Smartphones Data Set.
- Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra and Jorge L. Reyes-Ortiz. A Public Domain Dataset for Human Activity Recognition Using Smartphones. 21st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN 2013. Bruges, Belgium 24-26 April 2013.