Diagnose faults in ultrasonic flowmeters

A significant challenge for engineers is detecting faults in the equipment they build. At scale, such faults can disrupt the process and lead to unacceptable malfunctions.

In this study, we analyze a data set containing the diagnostic parameters of a liquid ultrasonic flowmeter together with its state (healthy or unhealthy), in order to identify the factors most relevant to system failures.

Ultrasonic flowmeters measure the volumetric or mass flow of a fluid using ultrasound. However, they are prone to faults that degrade the meter's performance and cause errors in the flow velocity reading.

Consequently, they must be taken to an accredited flow facility for recalibration after one year of operation. However, this fixed schedule ignores the actual condition of the device: a healthy meter incurs an unnecessary expense, or worse, a meter that has already failed before the year is up keeps providing erroneous measurements.

Therefore, a new solution to this problem is needed, which provides a balance between accurate measurement and the necessary costs of recalibration.

To carry out the analysis, we use advanced analytics, specifically a neural network classifier.

Contents:

  1. Application type.
  2. Data set.
  3. Neural network.
  4. Training strategy.
  5. Model selection.
  6. Testing analysis.
  7. Model deployment.

1. Application type

This is a classification project, since the variable to be predicted is binary (healthy or unhealthy).

The goal here is to model the probability that an ultrasonic flowmeter fails, conditioned on the process characteristics.

2. Data set

The first step is to prepare the data set, which is the source of information for the classification problem.

The file fault_detection.csv contains 87 instances of 37 diagnostic parameters corresponding to an 8-path liquid ultrasonic flow meter. These 37 variables are all continuous except for the state of health.

The variables of the problem are:

Since neural networks work with numbers, the class attribute (health state of meter) has been transformed into two numerical values, 0 when the meter is faulty and 1 when it works properly.

Once the data is prepared, we carry out a descriptive analysis to identify the factors that should be taken into account during training. Throughout development, we adjust the most important settings, refining the solution until we obtain the most accurate and practical one.

First, it is important to know the ratio of negative to positive instances in the data set.

The chart shows that negative instances (40.23%) and positive instances (59.77%) are reasonably balanced. This information will later be used to design the predictive model properly.
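The percentages above are consistent with 35 faulty and 52 healthy meters out of 87. Those counts are inferred from the percentages rather than stated in the text, so the following is only a small sketch of how the class ratio could be checked:

```python
# Counts inferred from the reported percentages (an assumption, not stated in the text).
negative = 35  # faulty meters, encoded as 0
positive = 52  # healthy meters, encoded as 1


def class_ratios(negative, positive):
    """Return the percentage of negative and positive instances."""
    total = negative + positive
    return 100 * negative / total, 100 * positive / total


neg_pct, pos_pct = class_ratios(negative, positive)
print(f"negative: {neg_pct:.2f}%  positive: {pos_pct:.2f}%")
```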

The input-target correlations, shown in the next chart, measure the dependency between each input variable and the target.

Flatness ratio (0.511) and some of the gains are the variables most strongly correlated with the target. In contrast, the flow velocities in the paths contribute much less, so they are less decisive in determining the state of health of the device.

The number of inputs is too high. Thus, we set some of the input variables whose correlation with the target is below 0.1 to unused. In addition, all gain variables are manually kept as inputs so that the model takes every gain into account.
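The filtering rule described above can be sketched roughly as follows. This is a simplified illustration, not the tool's actual procedure: it assumes the Pearson correlation, compares its absolute value with the 0.1 threshold, and keeps every column whose name starts with `gain`. The helper names and the toy values are invented for the example and are not taken from fault_detection.csv.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def select_inputs(columns, target, threshold=0.1, always_keep=("gain",)):
    """Keep columns whose |correlation| with the target reaches the threshold,
    plus any column whose name starts with one of the always_keep prefixes."""
    selected = []
    for name, values in columns.items():
        forced = name.startswith(tuple(always_keep))
        if forced or abs(pearson(values, target)) >= threshold:
            selected.append(name)
    return selected


# Toy data, invented for illustration only.
target = [0, 0, 1, 1, 0, 1]
columns = {
    "flatness_ratio": [0.80, 0.81, 0.84, 0.85, 0.80, 0.83],   # follows the target
    "flow_velocity_1": [1.0, 2.0, 2.0, 1.0, 1.5, 1.5],          # uncorrelated
    "gain_1": [33.9, 34.0, 34.0, 33.9, 33.95, 33.95],           # kept regardless
}
print(select_inputs(columns, target))
```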

This data set contains 87 instances. Of these, 53 are used for training (about 60%), 17 for selection, also called generalization (about 20%), and 17 for testing (about 20%).
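A split with these sizes could be produced along the following lines. The text does not say how instances were assigned to each subset, so the random shuffle, the seed, and the function name `split_instances` are assumptions made for this sketch.

```python
import random


def split_instances(n_instances, selection_frac=0.2, testing_frac=0.2, seed=0):
    """Shuffle instance indices and split them into three disjoint subsets."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_selection = round(n_instances * selection_frac)
    n_testing = round(n_instances * testing_frac)
    n_training = n_instances - n_selection - n_testing
    training = indices[:n_training]
    selection = indices[n_training:n_training + n_selection]
    testing = indices[n_training + n_selection:]
    return training, selection, testing


training, selection, testing = split_instances(87)
print(len(training), len(selection), len(testing))  # 53 17 17
```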

3. Neural network

The next step is to choose a neural network. For classification problems, it is usually composed of a scaling layer, perceptron layers, and a probabilistic layer.

The scaling layer contains the statistics on the inputs calculated from the data file and the method for scaling the input variables.

Two perceptron layers with a logistic hidden layer and a logistic output layer are used. The neural network must have 19 inputs, since these are the inputs selected after the descriptive analysis of the data. As an initial guess, we use 3 neurons in the hidden layer.

The probabilistic layer only contains the method for interpreting the outputs as probabilities. Since the target variable can only take the values 0 or 1, we set the binary probabilistic method.
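To make the three layers concrete, here is a minimal sketch of a forward pass through such a network: scaling, a logistic hidden layer with 3 neurons, a logistic output neuron, and a binary decision. The weights are random placeholders, the 0.5 threshold is an assumed default, and all function and variable names are hypothetical.

```python
import math
import random

N_INPUTS, N_HIDDEN = 19, 3


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, centers, spreads, hidden_w, hidden_b, out_w, out_b, threshold=0.5):
    # Scaling layer: shift and rescale each input with precomputed statistics.
    scaled = [(x - c) / s for x, c, s in zip(inputs, centers, spreads)]
    # Hidden perceptron layer with logistic activation.
    hidden = [logistic(b + sum(w * x for w, x in zip(ws, scaled)))
              for ws, b in zip(hidden_w, hidden_b)]
    # Output perceptron layer with logistic activation.
    probability = logistic(out_b + sum(w * h for w, h in zip(out_w, hidden)))
    # Probabilistic layer (binary method): threshold the output.
    return probability, int(probability >= threshold)


rng = random.Random(0)  # placeholder weights, not trained values
hidden_w = [[rng.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
hidden_b = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
out_w = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
out_b = rng.uniform(-1, 1)

probability, label = forward([0.0] * N_INPUTS, [0.0] * N_INPUTS, [1.0] * N_INPUTS,
                             hidden_w, hidden_b, out_w, out_b)
print(probability, label)
```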

The next figure shows the structure of the neural network that has been set for this data set.

4. Training strategy

The next step is to select an appropriate training strategy, which defines what the neural network will learn. A general training strategy is composed of two concepts: a loss index and an optimization algorithm.

The data set is slightly unbalanced. Nevertheless, we use the normalized squared error with L2 regularization as the loss index.

Now the model is ready to be trained. As the optimization algorithm, we use the quasi-Newton method, which is the default option for this type of problem.
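As a rough illustration of this loss index, the sketch below assumes a common definition of the normalized squared error (the sum of squared errors divided by the sum of squared deviations of the targets from their mean) plus an L2 penalty on the weights. The regularization weight of 0.01 is a hypothetical value; the text does not give one.

```python
def normalized_squared_error(outputs, targets):
    """Sum of squared errors divided by the targets' squared deviation from their mean."""
    mean_target = sum(targets) / len(targets)
    sum_squared_error = sum((o - t) ** 2 for o, t in zip(outputs, targets))
    normalization = sum((t - mean_target) ** 2 for t in targets)
    return sum_squared_error / normalization


def l2_penalty(weights):
    """Sum of squared parameters."""
    return sum(w ** 2 for w in weights)


def loss_index(outputs, targets, weights, regularization_weight=0.01):
    """Error term plus weighted L2 regularization term."""
    return (normalized_squared_error(outputs, targets)
            + regularization_weight * l2_penalty(weights))


# Always predicting the mean of the targets gives a normalized error of 1.
print(normalized_squared_error([0.5, 0.5, 0.5, 0.5], [0, 1, 1, 0]))
```

With this definition, a value of 1 corresponds to a model that always predicts the mean of the targets, and 0 to a perfect fit.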

The following chart shows how the training and selection errors decrease with the epochs during the training process.

The final values are training error = 0.077 WSE and selection error = 1.07 WSE.

The selection error is far higher than the training error, which suggests that the model does not generalize well. To improve it, we perform a model selection process.

5. Model selection

The objective of model selection is to find the network architecture with the best generalization properties, that is, the one that minimizes the error on the selection instances of the data set.

More specifically, we want to find a neural network with a selection error below 1.07 WSE, the value achieved so far.

Order selection algorithms train several network architectures with different numbers of neurons and select the one with the smallest selection error.

The incremental order method starts with a small number of neurons and increases the complexity at each iteration. The following chart shows the training error (blue) and the selection error (orange) as a function of the number of neurons at each iteration. The resulting network architecture is represented below it.

After order selection, the optimal network architecture consists of a single neuron in the hidden perceptron layer. The selection error for this configuration is 0.41 WSE, which is far better than before.
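The incremental order procedure can be sketched as a simple loop. Here `evaluate` stands for a hypothetical routine that trains a network with the given number of hidden neurons and returns its training and selection errors; the toy error curves are made up for illustration and are not the results reported above.

```python
def incremental_order(evaluate, max_neurons):
    """Try 1..max_neurons hidden neurons and keep the order with the
    smallest selection error. Returns (best_order, best_selection_error)."""
    best_order, best_error = None, float("inf")
    for neurons in range(1, max_neurons + 1):
        training_error, selection_error = evaluate(neurons)
        if selection_error < best_error:
            best_order, best_error = neurons, selection_error
    return best_order, best_error


def toy_evaluate(neurons):
    """Made-up curves: training error keeps falling as neurons are added,
    while selection error grows, as happens when a model overfits."""
    training_error = 0.5 / neurons
    selection_error = 0.3 + 0.1 * neurons
    return training_error, selection_error


best_order, best_error = incremental_order(toy_evaluate, max_neurons=10)
print(best_order, best_error)
```

Selecting on the selection error rather than the training error is the point of the procedure: the training error alone would always favor the largest network.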

6. Testing analysis

Testing analysis examines the performance of our predictive model. First, we check the ratio of positive and negative results among the outputs.

As the figure above shows, the frequency of faulty meters is 47.06%, while that of meters that work properly is 52.94%. This distribution is close to that of the original data set, which is a good sign.

A good way to assess the discrimination capacity of a binary classification model is the ROC curve, which is shown below.

The parameter of interest here is the area under the curve (AUC). Its value must be higher than 0.5, and the closer to 1, the better. For this model, AUC = 0.764, which is not great but is better than random guessing.
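One way to compute the AUC is through its rank interpretation: the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below uses that formulation; the labels and scores are toy values, not the model's actual test outputs.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy example (invented values): 5 of the 6 positive/negative pairs are ranked correctly.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```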

Another valid option to test the accuracy of a binary classification model is the confusion matrix. This matrix shows the number of negative and positive predicted values versus the real ones.

                 Predicted positive   Predicted negative
Real positive    7 (41.2%)            2 (11.8%)
Real negative    2 (11.8%)            6 (35.3%)

From the confusion matrix, we can obtain binary classification tests such as the accuracy, the sensitivity, and the specificity.

According to these results, our predictive model is moderately accurate (76.5%). The success rate is 77.8% for positive cases (sensitivity) and 75% for negative cases (specificity), practically the same.
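These figures follow directly from the four cells of the confusion matrix. A short sketch of the arithmetic, using the counts from the matrix above (the function name is hypothetical):

```python
def binary_classification_tests(tp, fn, fp, tn):
    """Accuracy, sensitivity and specificity from confusion matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,   # correct predictions over all instances
        "sensitivity": tp / (tp + fn),   # correctly classified real positives
        "specificity": tn / (tn + fp),   # correctly classified real negatives
    }


# Counts taken from the confusion matrix above.
tests = binary_classification_tests(tp=7, fn=2, fp=2, tn=6)
for name, value in tests.items():
    print(f"{name}: {value:.1%}")
```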

7. Model deployment

Once we know that our model is reliable and accurate, we can use it to check the health state of a meter, given the required values for the input variables. This is called model deployment.

The predictive model also indicates which factors are most significant in determining the health state of the device. This allows us to act quickly and avoid prolonged malfunctioning of the system. It also saves significant costs when the model predicts that the system is working correctly.

The predictive model provides a mathematical expression that can be integrated into other meters with the same variables. This expression is shown below:

scaled_flatness_ratio = (flatness_ratio-0.823907)/0.0182529;
scaled_symmetry = (symmetry-1.01097)/0.00580434;
scaled_crossflow = (crossflow-0.997436)/0.00202617;
scaled_gain_1 = (gain_1-33.9365)/0.240569;
scaled_gain_2 = (gain_2-33.3957)/0.0427105;
scaled_gain_3 = (gain_3-36.6963)/0.0213074;
scaled_gain_4 = (gain_4-36.8947)/0.107135;
scaled_gain_5 = (gain_5-35.1576)/0.0429498;
scaled_gain_6 = (gain_6-35.3841)/0.0959621;
scaled_gain_7 = (gain_7-33.537)/0.347826;
scaled_gain_8 = (gain_8-31.5031)/0.103171;
scaled_gain_9 = (gain_9-34.0167)/0.226074;
scaled_gain_10 = (gain_10-33.1856)/0.101873;
scaled_gain_11 = (gain_11-36.6672)/0.0195906;
scaled_gain_12 = (gain_12-36.7895)/0.0790561;
scaled_gain_13 = (gain_13-35.9453)/0.0577769;
scaled_gain_14 = (gain_14-35.9201)/0.0449897;
scaled_gain_15 = (gain_15-34.2524)/0.372388;
scaled_gain_16 = (gain_16-32.2916)/0.615941;
y_1_1 = Logistic (-0.25737+ (scaled_flatness_ratio*1.04101)+ (scaled_symmetry*-1.83364)+ (scaled_crossflow*7.42195)+ (scaled_gain_1*-2.34869)+ (scaled_gain_2*-8.03584)+ (scaled_gain_3*9.03579)+ (scaled_gain_4*-2.94732)+ (scaled_gain_5*-2.24593)+ (scaled_gain_6*-1.15397)+ (scaled_gain_7*5.47512)+ (scaled_gain_8*-4.10548)+ (scaled_gain_9*-1.91741)+ (scaled_gain_10*-4.7171)+ (scaled_gain_11*-0.496304)+ (scaled_gain_12*1.51573)+ (scaled_gain_13*-0.862037)+ (scaled_gain_14*-7.85136)+ (scaled_gain_15*0.977022)+ (scaled_gain_16*-3.77067));
non_probabilistic_fault = Logistic (-1.3648+ (y_1_1*9.12262));
fault = binary(non_probabilistic_fault);

Logistic(x){
   return 1/(1+exp(-x))
}

// decision_threshold is a user-defined value between 0 and 1 (0.5 is a common default)
binary(x){
   if x < decision_threshold
       return 0
   else
       return 1
}

This expression can be exported elsewhere, for instance to dedicated engineering software used in industry.
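As one possible port, the expression above could be written in Python as follows. The coefficients are copied from the expression; the table layout, the function name `predict_fault`, and the default decision threshold of 0.5 are assumptions made for this sketch, since the expression leaves `decision_threshold` undefined.

```python
import math

# name: (offset, scale, weight), copied from the exported expression.
PARAMETERS = {
    "flatness_ratio": (0.823907, 0.0182529, 1.04101),
    "symmetry": (1.01097, 0.00580434, -1.83364),
    "crossflow": (0.997436, 0.00202617, 7.42195),
    "gain_1": (33.9365, 0.240569, -2.34869),
    "gain_2": (33.3957, 0.0427105, -8.03584),
    "gain_3": (36.6963, 0.0213074, 9.03579),
    "gain_4": (36.8947, 0.107135, -2.94732),
    "gain_5": (35.1576, 0.0429498, -2.24593),
    "gain_6": (35.3841, 0.0959621, -1.15397),
    "gain_7": (33.537, 0.347826, 5.47512),
    "gain_8": (31.5031, 0.103171, -4.10548),
    "gain_9": (34.0167, 0.226074, -1.91741),
    "gain_10": (33.1856, 0.101873, -4.7171),
    "gain_11": (36.6672, 0.0195906, -0.496304),
    "gain_12": (36.7895, 0.0790561, 1.51573),
    "gain_13": (35.9453, 0.0577769, -0.862037),
    "gain_14": (35.9201, 0.0449897, -7.85136),
    "gain_15": (34.2524, 0.372388, 0.977022),
    "gain_16": (32.2916, 0.615941, -3.77067),
}
HIDDEN_BIAS = -0.25737
OUTPUT_BIAS, OUTPUT_WEIGHT = -1.3648, 9.12262


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_fault(inputs, decision_threshold=0.5):
    """Evaluate the exported expression for a dict of raw input values.

    Returns (non_probabilistic_fault, fault); the 0.5 threshold is an assumed default.
    """
    activation = HIDDEN_BIAS
    for name, (offset, scale, weight) in PARAMETERS.items():
        activation += weight * (inputs[name] - offset) / scale  # scaled input times weight
    y_1_1 = logistic(activation)
    non_probabilistic_fault = logistic(OUTPUT_BIAS + OUTPUT_WEIGHT * y_1_1)
    fault = 0 if non_probabilistic_fault < decision_threshold else 1
    return non_probabilistic_fault, fault


# Example call: every input equal to its scaling offset, so all scaled inputs are zero.
reference_inputs = {name: offset for name, (offset, _, _) in PARAMETERS.items()}
output, fault = predict_fault(reference_inputs)
print(output, fault)
```

Keeping the coefficients in a single table makes it easier to check them against the exported expression line by line.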
