A combined cycle power plant is composed of gas turbines, steam turbines and heat recovery steam generators. In this type of power plant, the electricity is generated by gas and steam turbines, which are combined in one cycle: the heat recovery steam generators use the exhaust heat of the gas turbines to produce steam for the steam turbines. While the vacuum is collected from and affects the steam turbine, the ambient variables affect the gas turbine performance.
The electrical energy production is influenced by environmental variables, such as temperature or humidity. The goal of this study is to determine where this kind of power plant will be most productive, in order to optimize profit.
The data set contains 9568 data points collected from a combined cycle power plant over 6 years (2006-2011), when the power plant was set to work at full load. The features are the hourly averages of the ambient temperature, ambient pressure, relative humidity and exhaust vacuum, which are used to predict the net hourly electrical energy output of the plant. The measurements were taken from various sensors located around the plant that record the ambient variables every second. The next table shows a preview of the data file:
The next figure shows the data set page in Neural Designer.
It contains four sections:
Neural Designer shows a preview of the data file and reports that it has 5 columns and 9568 rows.
The task "Report data set" shows some useful information about the data set. The next figure depicts one of the tables shown, which contains the names, units, descriptions and uses of all the variables. In this case, there are 5 variables.
The instances are divided into training, selection and testing subsets. They represent 60%, 20% and 20% of the original instances, respectively, and have been split at random.
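The random 60/20/20 split described above can be illustrated with a short Python sketch. The function name `split_instances`, the fixed seed and the rounding rule (fractions are truncated and the remainder goes to the testing subset) are assumptions made for this illustration; they are not taken from Neural Designer.

```python
import random

def split_instances(n_instances, train_frac=0.6, selection_frac=0.2, seed=0):
    """Randomly assign instance indices to training, selection and testing subsets."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)  # seed fixed only for reproducibility

    n_train = int(train_frac * n_instances)          # truncated (assumption)
    n_selection = int(selection_frac * n_instances)  # truncated (assumption)

    training = indices[:n_train]
    selection = indices[n_train:n_train + n_selection]
    testing = indices[n_train + n_selection:]        # whatever remains
    return training, selection, testing

training, selection, testing = split_instances(9568)
print(len(training), len(selection), len(testing))
```

Because 9568 is not divisible by 5, the three subsets cannot be exactly 60%, 20% and 20%; any implementation has to decide where the leftover instances go.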
The second step is to choose a network architecture to represent the approximation function. For approximation problems, it is composed of:
The next figure shows the neural network page in Neural Designer.
The scaling layer section contains information about the method for scaling the input variables and the statistics used by that method. In this example, we use the minimum and maximum method for scaling the inputs. The mean and standard deviation method would also be appropriate here.
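As a rough illustration of the two scaling methods mentioned above, the following Python sketch maps a value to the range [-1, 1] with the minimum and maximum method, which is the form that appears in the model expression at the end of this example. The function names are hypothetical, and the minimum and maximum used for AT (1.81 and 37.11) are the ones that appear in that expression.

```python
def scale_min_max(x, minimum, maximum):
    """Minimum and maximum method: map [minimum, maximum] linearly onto [-1, 1]."""
    return 2.0 * (x - minimum) / (maximum - minimum) - 1.0

def unscale_min_max(s, minimum, maximum):
    """Inverse of scale_min_max: map [-1, 1] back to the original range."""
    return 0.5 * (s + 1.0) * (maximum - minimum) + minimum

def scale_mean_std(x, mean, standard_deviation):
    """Mean and standard deviation method: zero mean and unit variance."""
    return (x - mean) / standard_deviation

# Range of the ambient temperature AT, taken from the model expression.
AT_MIN, AT_MAX = 1.81, 37.11
print(scale_min_max(AT_MIN, AT_MIN, AT_MAX))  # lower bound maps to -1.0
print(scale_min_max(AT_MAX, AT_MIN, AT_MAX))  # upper bound maps to 1.0
```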
The next image represents the neural network for this example.
The inputs of the neural network are the variables AT (ambient temperature), V (exhaust vacuum), AP (ambient pressure) and RH (relative humidity), and it has one output, PE (net hourly electrical energy output). Four hidden perceptrons have been chosen, so the neural network can be denoted as 4:4:1.
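To make the 4:4:1 notation concrete, the sketch below counts the parameters of such a network. It assumes fully connected layers with one bias per perceptron, which is consistent with the 25 coefficients in the model expression at the end of this example; the function name `count_parameters` is hypothetical.

```python
def count_parameters(layer_sizes):
    """Number of weights and biases in a fully connected feed-forward network."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weights plus one bias per perceptron
    return total

# Hidden layer: 4*4 weights + 4 biases; output layer: 4 weights + 1 bias.
print(count_parameters([4, 4, 1]))  # 25
```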
The next step is to configure the loss index, which defines what the neural network will learn. It is composed of an objective term and a regularization term. Both are left at their default settings.
The objective term is the normalized squared error. It divides the squared error between the outputs from the neural network and the targets in the data set by a normalization coefficient. If the normalized squared error has a value of one, the neural network is predicting the data 'in the mean', while a value of zero means perfect prediction of the data. This objective term does not have any parameters to set.
The neural parameters norm is used as the regularization term. It controls the complexity of the neural network by reducing the value of the parameters. The weight of this regularization term in the loss index is 0.001.
The learning problem can be stated as finding a neural network that minimizes the loss index, i.e., a neural network that fits the data set (objective) and does not oscillate (regularization).
The task "Perform training" trains the neural network in order to create the model. The training strategy can be configured in the training strategy page. In this case, all the parameters are set to their default values.
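A minimal Python sketch of the normalized squared error follows. It assumes that the normalization coefficient is the sum of squared deviations of the targets from their mean, which is consistent with the statement that predicting 'in the mean' gives a value of one. The target values are made up for illustration.

```python
def normalized_squared_error(outputs, targets):
    """Squared error divided by the squared deviation of the targets from their mean."""
    mean_target = sum(targets) / len(targets)
    squared_error = sum((o - t) ** 2 for o, t in zip(outputs, targets))
    normalization = sum((t - mean_target) ** 2 for t in targets)  # assumption
    return squared_error / normalization

targets = [430.0, 450.0, 470.0, 490.0]     # made-up values, not from the data set
mean_target = sum(targets) / len(targets)

print(normalized_squared_error([mean_target] * 4, targets))  # 1.0, predicting the mean
print(normalized_squared_error(targets, targets))            # 0.0, perfect prediction
```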
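The way the two terms combine can be sketched as follows. This assumes that the neural parameters norm is the Euclidean (L2) norm of the parameter vector and that the loss index is the objective plus the weighted regularization term; the function name `loss_index` and the example numbers are hypothetical.

```python
import math

def loss_index(objective, parameters, regularization_weight=0.001):
    """Objective term plus weighted norm of the parameters (L2 norm assumed)."""
    parameters_norm = math.sqrt(sum(p * p for p in parameters))
    return objective + regularization_weight * parameters_norm

# Made-up objective value and parameter vector, for illustration only.
print(loss_index(0.5, [3.0, 4.0]))  # 0.5 + 0.001 * 5.0
```

With a weight of 0.001 the regularization term stays small compared with the objective unless the parameters grow large, so it mainly discourages extreme parameter values.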
As we can see, the training strategy chosen is the quasi-Newton method.
Once the neural network has been designed and trained, there are several ways to test the results offered by the model. One of them is to perform a linear regression between the predicted outputs and the corresponding targets in an independent testing subset of the data. This can be done using the task "Perform linear regression analysis", which shows the next output:
The next table lists the linear regression parameters for the scaled output PE. The intercept, slope and correlation are very close to 0, 1 and 1, respectively, so the neural network predicts the testing data well.
For a perfect fit the slope would be 1 and the y-intercept would be 0. The correlation measures how strong the linear relationship is between the outputs of the neural network and the targets in the testing subset; a value of 1 indicates a complete linear relationship.
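The three quantities used in this analysis can be computed with ordinary least squares, as in the Python sketch below. Taking the targets as the independent variable and the network outputs as the dependent variable is an assumption, as are the function name and the made-up values.

```python
import math

def linear_regression_analysis(targets, outputs):
    """Return (intercept, slope, correlation) of outputs regressed on targets."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    s_tt = sum((t - mean_t) ** 2 for t in targets)
    s_oo = sum((o - mean_o) ** 2 for o in outputs)
    s_to = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))

    slope = s_to / s_tt
    intercept = mean_o - slope * mean_t
    correlation = s_to / math.sqrt(s_tt * s_oo)  # Pearson correlation coefficient
    return intercept, slope, correlation

# Made-up scaled targets; a perfect model reproduces them exactly.
targets = [-1.0, -0.5, 0.0, 0.5, 1.0]
intercept, slope, correlation = linear_regression_analysis(targets, targets)
print(intercept, slope, correlation)
```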
The neural network is now ready to predict outputs for inputs that it has never seen.
The "Calculate outputs" task will open the next dialog where the user types the input values.
The results from that task are written in the viewer.
The task "Write expression" shows the mathematical expression represented by the predictive model:
scaled_AT=2*(AT-1.81)/(37.11-1.81)-1;
scaled_V=2*(V-25.36)/(81.56-25.36)-1;
scaled_AP=2*(AP-992.89)/(1033.3-992.89)-1;
scaled_RH=2*(RH-25.56)/(100.16-25.56)-1;
y_1_1=tanh(-0.217112-0.0529272*scaled_AT-0.474756*scaled_V-0.0558401*scaled_AP-0.378804*scaled_RH);
y_1_2=tanh(0.405613+0.0475993*scaled_AT+0.049912*scaled_V-0.0700079*scaled_AP+0.111965*scaled_RH);
y_1_3=tanh(0.557505+1.21185*scaled_AT+0.0447406*scaled_V+0.270806*scaled_AP+0.0980202*scaled_RH);
y_1_4=tanh(-0.210872+0.507619*scaled_AT+0.550384*scaled_V-0.54468*scaled_AP+0.251757*scaled_RH);
scaled_PE=(0.167976-0.125366*y_1_1+0.0184911*y_1_2-0.764246*y_1_3-0.48739*y_1_4);
PE=0.5*(scaled_PE+1.0)*(495.76-420.26)+420.26;
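The expression above can be transcribed directly into Python, which makes it easy to embed the model in other software. The coefficients are copied from the expression; the function name `predict_pe` and the input values in the example call are made up for illustration.

```python
import math

def predict_pe(AT, V, AP, RH):
    """Evaluate the 4:4:1 model expression for one set of ambient variables."""
    # Scaling layer: minimum and maximum method, mapping each input to [-1, 1].
    scaled_AT = 2 * (AT - 1.81) / (37.11 - 1.81) - 1
    scaled_V = 2 * (V - 25.36) / (81.56 - 25.36) - 1
    scaled_AP = 2 * (AP - 992.89) / (1033.3 - 992.89) - 1
    scaled_RH = 2 * (RH - 25.56) / (100.16 - 25.56) - 1

    # Hidden layer: four perceptrons with hyperbolic tangent activation.
    y_1_1 = math.tanh(-0.217112 - 0.0529272 * scaled_AT - 0.474756 * scaled_V
                      - 0.0558401 * scaled_AP - 0.378804 * scaled_RH)
    y_1_2 = math.tanh(0.405613 + 0.0475993 * scaled_AT + 0.049912 * scaled_V
                      - 0.0700079 * scaled_AP + 0.111965 * scaled_RH)
    y_1_3 = math.tanh(0.557505 + 1.21185 * scaled_AT + 0.0447406 * scaled_V
                      + 0.270806 * scaled_AP + 0.0980202 * scaled_RH)
    y_1_4 = math.tanh(-0.210872 + 0.507619 * scaled_AT + 0.550384 * scaled_V
                      - 0.54468 * scaled_AP + 0.251757 * scaled_RH)

    # Output layer: linear combination of the hidden outputs.
    scaled_PE = (0.167976 - 0.125366 * y_1_1 + 0.0184911 * y_1_2
                 - 0.764246 * y_1_3 - 0.48739 * y_1_4)

    # Unscaling layer: map the scaled output back to the original range of PE.
    return 0.5 * (scaled_PE + 1.0) * (495.76 - 420.26) + 420.26

# Made-up ambient conditions, chosen inside the ranges used for scaling.
print(predict_pe(AT=20.0, V=50.0, AP=1010.0, RH=70.0))
```

Inputs outside the minimum and maximum ranges used by the scaling layer are extrapolations, so predictions for them should be treated with caution.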