Diagnose faults in ultrasonic flowmeters using machine learning

Ultrasonic flow meters measure the volumetric or mass flow of a fluid using ultrasound. However, these devices can develop faults that degrade the meter's performance and cause errors in the flow velocity reading.

A solution is therefore needed that balances measurement accuracy against the cost of recalibration.

In this study, we analyze a data set containing the diagnostic parameters of a liquid ultrasonic flowmeter and its state (healthy or unhealthy) to identify the factors most relevant to failures in the system.

To carry out the analysis, we use advanced analytics, in this case a neural network classifier.


  1. Application type.
  2. Data set.
  3. Neural network.
  4. Training strategy.
  5. Model selection.
  6. Testing analysis.
  7. Model deployment.

This example is solved with Neural Designer. To follow it step by step, you can use the free trial.

1. Application type

This is a classification project, since the variable to be predicted is binary (healthy or unhealthy).

The goal here is to model the probability that an ultrasonic flowmeter fails, conditioned on the process characteristics.

2. Data set

The first step is to prepare the data set, which is the source of information for the classification problem. It is composed of a data source, variables, and instances.

The file fault_detection.csv contains 87 instances of 37 diagnostic parameters corresponding to an 8-path liquid ultrasonic flow meter. These 37 variables are all continuous except for the state of health.

The variables of the problem are:

Since neural networks work with numbers, the class attribute (health state of the meter) has been transformed into two numerical values, 0 when the meter is faulty and 1 when it works properly.

Once the data is prepared, we carry out a descriptive analysis to detect the factors that must be taken into account during training. We then adjust the most critical aspects during development, refining the solution until we obtain the most precise and convenient one.

First, it is essential to know the ratio of negative and positive instances in the data set.

The chart shows that the shares of negative instances (40.23%) and positive instances (59.77%) are of the same order, so the data set is only slightly unbalanced. This information will later be used to design the predictive model properly.
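The class ratios quoted above can be reproduced from the class counts. The following is a minimal sketch; the counts of 35 faulty and 52 healthy instances are an assumption inferred from the reported percentages and the 87 total instances, since the source does not list them.

```python
# Class counts are an assumption inferred from the reported percentages
# (35/87 = 40.23%, 52/87 = 59.77%); they are not listed in the source.
labels = [0] * 35 + [1] * 52  # 0 = faulty, 1 = healthy


def class_ratios(values):
    """Return the percentage of each class label found in `values`."""
    total = len(values)
    return {c: 100.0 * values.count(c) / total for c in sorted(set(values))}


ratios = class_ratios(labels)
print({c: round(p, 2) for c, p in ratios.items()})
```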

The inputs-targets correlations, shown in the next chart, measure the dependency between each input variable and the target.

The flatness ratio (0.511) and some of the gains are the variables most correlated with the target. In contrast, the flow velocities in the paths contribute much less, so they are less decisive when determining the state of health of the device.

The number of inputs is too high. Thus, we set to unused some of the input variables whose correlation with the target is lower than 0.1. In addition, all gain variables are manually kept as inputs, so that the model takes every gain into account.
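This kind of filtering can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses the Pearson coefficient, compares its absolute value against the 0.1 threshold, and keeps every column whose name starts with `gain`. The helper names and the toy values are hypothetical and are not taken from fault_detection.csv.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def select_inputs(columns, target, threshold=0.1, always_keep=("gain",)):
    """Keep columns whose |correlation| with the target reaches the threshold,
    plus any column whose name starts with a prefix in `always_keep`."""
    kept = []
    for name, values in columns.items():
        forced = name.startswith(always_keep)
        if forced or abs(pearson(values, target)) >= threshold:
            kept.append(name)
    return kept


# Toy data (hypothetical values, not taken from fault_detection.csv).
target = [0, 0, 0, 1, 1, 1]
columns = {
    "flatness_ratio": [0.80, 0.81, 0.80, 0.84, 0.85, 0.84],  # strongly related
    "flow_velocity_1": [1.0, 2.0, 3.0, 3.0, 2.0, 1.0],        # uncorrelated
    "gain_1": [33.9, 34.0, 33.8, 33.8, 34.0, 33.9],           # kept regardless
}
selected = select_inputs(columns, target)
print(selected)
```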

This data set contains 87 instances. Of these, 53 are used for training (about 60%), 17 for selection (about 20%), and 17 for testing (about 20%).
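A random split with these sizes can be sketched as follows. The function name, the fixed seed, and the rounding rule are assumptions made for illustration; the source does not say how the instances were assigned to each subset.

```python
import random


def split_indices(n_instances, selection_frac=0.2, testing_frac=0.2, seed=0):
    """Randomly split instance indices into training, selection (generalization)
    and testing subsets. Training receives whatever is left after rounding."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    n_selection = round(selection_frac * n_instances)
    n_testing = round(testing_frac * n_instances)
    n_training = n_instances - n_selection - n_testing
    training = indices[:n_training]
    selection = indices[n_training:n_training + n_selection]
    testing = indices[n_training + n_selection:]
    return training, selection, testing


training, selection, testing = split_indices(87)
print(len(training), len(selection), len(testing))
```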

3. Neural network

The next step is to choose a neural network. For classification problems, it is usually composed of a scaling layer, perceptron layers, and a probabilistic layer.

The scaling layer contains the statistics on the inputs calculated from the data file and the method for scaling the input variables.

Two perceptron layers are used: a hidden layer and an output layer, both with logistic activation. The neural network has 19 inputs, which are the variables selected after the descriptive analysis of the data. As an initial guess, we use three neurons in the hidden layer.

The probabilistic layer only contains the method for interpreting the outputs as probabilities. For this project, we set the binary probabilistic method, since the target variable can only take the values 1 or 0.

The following figure shows the structure of the neural network that has been set for this data set.
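To make the architecture concrete, here is a minimal sketch of a forward pass through a 19-3-1 network with logistic activations. The weights are random and serve only as an illustration; the helper names are hypothetical and do not correspond to any Neural Designer API.

```python
import math
import random


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_inputs, n_neurons, rng):
    """Each neuron holds one bias followed by one weight per input."""
    return [[rng.uniform(-1, 1) for _ in range(n_inputs + 1)]
            for _ in range(n_neurons)]


def forward_layer(layer, inputs):
    """Logistic perceptron layer: logistic(bias + sum(w_i * x_i)) per neuron."""
    return [logistic(n[0] + sum(w * x for w, x in zip(n[1:], inputs)))
            for n in layer]


rng = random.Random(0)           # random weights, for illustration only
hidden = init_layer(19, 3, rng)  # 19 scaled inputs -> 3 hidden neurons
output = init_layer(3, 1, rng)   # 3 hidden neurons -> 1 output

n_parameters = sum(len(n) for n in hidden + output)
scaled_inputs = [0.0] * 19       # an instance with every scaled input at zero
probability = forward_layer(output, forward_layer(hidden, scaled_inputs))[0]
print(n_parameters, probability)
```

With three hidden neurons, the network has (19 + 1) x 3 + (3 + 1) x 1 = 64 parameters, which is more than the 53 training instances available.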

4. Training strategy

The next step is to select an appropriate training strategy, which defines what the neural network will learn. A general training strategy is composed of two concepts: a loss index and an optimization algorithm.

The data set is slightly unbalanced. Nevertheless, we use the normalized squared error with L2 regularization as the loss index.
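As an illustration of this loss index, the sketch below computes a normalized squared error plus an L2 penalty. It assumes the common definition in which the sum of squared errors is divided by the squared deviation of the targets from their mean; the exact form of the penalty (sum of squared parameters) and the regularization weight of 0.01 are assumptions, not values from the source.

```python
def normalized_squared_error(outputs, targets):
    """Sum of squared errors divided by the squared deviation of the targets
    from their mean. A value of 1 matches always predicting the target mean."""
    mean_t = sum(targets) / len(targets)
    sse = sum((o - t) ** 2 for o, t in zip(outputs, targets))
    normalization = sum((t - mean_t) ** 2 for t in targets)
    return sse / normalization


def loss_index(outputs, targets, parameters, regularization_weight=0.01):
    """Error term plus an L2 penalty on the parameters.
    The penalty form (sum of squares) and its weight are assumptions."""
    l2 = sum(p ** 2 for p in parameters)
    return normalized_squared_error(outputs, targets) + regularization_weight * l2


targets = [0, 0, 1, 1]
baseline = normalized_squared_error([0.5] * 4, targets)  # predicting the mean
perfect = normalized_squared_error(targets, targets)     # exact predictions
print(baseline, perfect)
```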

Now, the model is ready to be trained. As the optimization algorithm, we use the default option for this type of case, the quasi-Newton method.

The following chart shows how the training and selection errors decrease with the epochs during the training process.

The final values are training error = 0.077 WSE and selection error = 1.07 WSE.

The selection error is far larger than the training error, which suggests that the network fits the training instances much better than it generalizes. To improve it, we must perform a model selection process.

5. Model selection

The objective of model selection is to find the network architecture with the best generalization properties, that is, the one that minimizes the error on the selection instances of the data set.

More specifically, we want to find a neural network with a selection error of less than 1.07 WSE, which is the value we have achieved so far.

Order selection algorithms train several network architectures with different numbers of neurons and select the one with the smallest selection error.

The incremental order method starts with a small number of neurons and increases the complexity at each iteration. The following chart shows the training error (blue) and the selection error (orange) as a function of the number of neurons set at each iteration. The new network architecture is also represented below.

After order selection, the optimal network architecture consists of a single neuron in the hidden perceptron layer. The selection error for this configuration is 0.41 WSE, which is far better than before.
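The loop behind an incremental order search can be sketched as follows. This is a simplified illustration, not Neural Designer's implementation: `evaluate` is a hypothetical callback that would train a network with a given number of hidden neurons, and `toy_evaluate` is a made-up stand-in whose errors are not the ones reported in this example.

```python
def incremental_order(evaluate, min_neurons=1, max_neurons=10):
    """Try hidden-layer sizes in increasing order and return the size with the
    smallest selection error, together with the full error history.
    `evaluate(n)` must train a network with n hidden neurons and return
    (training_error, selection_error)."""
    history = []
    best_neurons, best_selection = None, float("inf")
    for n in range(min_neurons, max_neurons + 1):
        training_error, selection_error = evaluate(n)
        history.append((n, training_error, selection_error))
        if selection_error < best_selection:
            best_neurons, best_selection = n, selection_error
    return best_neurons, best_selection, history


def toy_evaluate(n):
    """Hypothetical stand-in for real training: training error keeps falling
    with complexity while selection error rises (overfitting)."""
    return 0.5 / n, 0.3 + 0.2 * (n - 1)


best_neurons, best_selection, history = incremental_order(toy_evaluate)
print(best_neurons, best_selection)
```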

6. Testing analysis

The testing analysis examines the performance of our predictive model. First, we check the ratio of positive and negative results in the outputs.

As shown in the above figure, the model classifies 47.06% of the testing instances as faulty meters and 52.94% as meters that work correctly. This distribution is close to the one in the original data set, which is a good sign.

A good measure of the performance of a binary classification model is the ROC curve shown below.

The parameter of importance here is the area under the curve (AUC). A random classifier has an AUC of 0.5, so the value must be higher than that, and the closer to 1, the better. For this model, AUC = 0.764, which is not great but is clearly better than random.
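The AUC can be computed directly from the model scores without drawing the curve. The sketch below uses the rank interpretation: the AUC equals the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one, with ties counted as one half. The scores are hypothetical and are not the model's actual testing outputs.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive is scored above a random
    negative (ties count one half). Equivalent to the area under the ROC curve."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores, not the model's actual testing outputs.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```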

Another valid option to test the accuracy of a binary classification model is the confusion matrix. This matrix shows the number of negative and positive predicted values versus the real ones.

                 Predicted positive    Predicted negative
Real positive    7 (41.2%)             2 (11.8%)
Real negative    2 (11.8%)             6 (35.3%)

From the confusion matrix, we can obtain the following binary classification tests: accuracy (76.5%), sensitivity (77.8%), and specificity (75%).

According to these results, our predictive model is moderately accurate. The probability of success is practically the same in positive cases (sensitivity) and in negative cases (specificity).
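These figures follow directly from the four counts in the confusion matrix, as the short sketch below shows. The variable names are chosen for illustration.

```python
# Counts taken from the confusion matrix above.
true_positives, false_negatives = 7, 2
false_positives, true_negatives = 2, 6

total = true_positives + false_negatives + false_positives + true_negatives
accuracy = (true_positives + true_negatives) / total
sensitivity = true_positives / (true_positives + false_negatives)
specificity = true_negatives / (true_negatives + false_positives)
error_rate = 1.0 - accuracy

print(f"accuracy    = {accuracy:.1%}")
print(f"error rate  = {error_rate:.1%}")
print(f"sensitivity = {sensitivity:.1%}")
print(f"specificity = {specificity:.1%}")
```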

7. Model deployment

Once we know that our model is reliable and accurate, we can use it to check the health state of a meter, given the input variables' required values. This is called model deployment.

The predictive model provides us with a mathematical expression that can be integrated into other systems to monitor meters with the same variables. This mathematical expression is shown below:

scaled_flatness_ratio = (flatness_ratio-0.823907)/0.0182529;
scaled_symmetry = (symmetry-1.01097)/0.00580434;
scaled_crossflow = (crossflow-0.997436)/0.00202617;
scaled_gain_1 = (gain_1-33.9365)/0.240569;
scaled_gain_2 = (gain_2-33.3957)/0.0427105;
scaled_gain_3 = (gain_3-36.6963)/0.0213074;
scaled_gain_4 = (gain_4-36.8947)/0.107135;
scaled_gain_5 = (gain_5-35.1576)/0.0429498;
scaled_gain_6 = (gain_6-35.3841)/0.0959621;
scaled_gain_7 = (gain_7-33.537)/0.347826;
scaled_gain_8 = (gain_8-31.5031)/0.103171;
scaled_gain_9 = (gain_9-34.0167)/0.226074;
scaled_gain_10 = (gain_10-33.1856)/0.101873;
scaled_gain_11 = (gain_11-36.6672)/0.0195906;
scaled_gain_12 = (gain_12-36.7895)/0.0790561;
scaled_gain_13 = (gain_13-35.9453)/0.0577769;
scaled_gain_14 = (gain_14-35.9201)/0.0449897;
scaled_gain_15 = (gain_15-34.2524)/0.372388;
scaled_gain_16 = (gain_16-32.2916)/0.615941;
y_1_1 = Logistic (-0.25737+ (scaled_flatness_ratio*1.04101)+ (scaled_symmetry*-1.83364)+ (scaled_crossflow*7.42195)+ (scaled_gain_1*-2.34869)+ (scaled_gain_2*-8.03584)+ (scaled_gain_3*9.03579)+ (scaled_gain_4*-2.94732)+ (scaled_gain_5*-2.24593)+ (scaled_gain_6*-1.15397)+ (scaled_gain_7*5.47512)+ (scaled_gain_8*-4.10548)+ (scaled_gain_9*-1.91741)+ (scaled_gain_10*-4.7171)+ (scaled_gain_11*-0.496304)+ (scaled_gain_12*1.51573)+ (scaled_gain_13*-0.862037)+ (scaled_gain_14*-7.85136)+ (scaled_gain_15*0.977022)+ (scaled_gain_16*-3.77067));
non_probabilistic_fault = Logistic (-1.3648+ (y_1_1*9.12262));
fault = binary(non_probabilistic_fault);

Logistic(x):
   return 1/(1+exp(-x))

binary(x):
   if x < decision_threshold
       return 0
   else
       return 1

This expression can be exported elsewhere, for instance to dedicated engineering software used in the industry.
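As one possible way of exporting it, the expression above can be ported to Python. The sketch below copies the scaling constants and weights from the expression; the function names, the numerically stable form of the logistic function, and the default `decision_threshold` of 0.5 are assumptions, since the source does not state the threshold. The output keeps the source's name `fault`; under the encoding described in the data set section, a value of 1 corresponds to a meter that works properly.

```python
import math

# (offset, scale) pairs and weights copied from the expression above, in the
# order: flatness_ratio, symmetry, crossflow, gain_1 ... gain_16.
SCALING = [
    (0.823907, 0.0182529), (1.01097, 0.00580434), (0.997436, 0.00202617),
    (33.9365, 0.240569), (33.3957, 0.0427105), (36.6963, 0.0213074),
    (36.8947, 0.107135), (35.1576, 0.0429498), (35.3841, 0.0959621),
    (33.537, 0.347826), (31.5031, 0.103171), (34.0167, 0.226074),
    (33.1856, 0.101873), (36.6672, 0.0195906), (36.7895, 0.0790561),
    (35.9453, 0.0577769), (35.9201, 0.0449897), (34.2524, 0.372388),
    (32.2916, 0.615941),
]
HIDDEN_BIAS = -0.25737
HIDDEN_WEIGHTS = [
    1.04101, -1.83364, 7.42195, -2.34869, -8.03584, 9.03579, -2.94732,
    -2.24593, -1.15397, 5.47512, -4.10548, -1.91741, -4.7171, -0.496304,
    1.51573, -0.862037, -7.85136, 0.977022, -3.77067,
]
OUTPUT_BIAS, OUTPUT_WEIGHT = -1.3648, 9.12262


def logistic(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def fault_probability(inputs):
    """Evaluate the exported network for 19 raw inputs, in SCALING order."""
    if len(inputs) != len(SCALING):
        raise ValueError("expected 19 input values")
    scaled = [(x - offset) / scale for x, (offset, scale) in zip(inputs, SCALING)]
    y_1_1 = logistic(HIDDEN_BIAS + sum(w * s for w, s in zip(HIDDEN_WEIGHTS, scaled)))
    return logistic(OUTPUT_BIAS + OUTPUT_WEIGHT * y_1_1)


def fault(inputs, decision_threshold=0.5):
    """Binary output; the 0.5 threshold is an assumption, not from the source."""
    return 0 if fault_probability(inputs) < decision_threshold else 1


# Example: every input equal to its scaling offset, so all scaled inputs are 0.
at_offsets = [offset for offset, _ in SCALING]
print(fault_probability(at_offsets), fault(at_offsets))
```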

