Concrete is one of the most important materials in construction.

The objective is to design concrete mixes of the highest quality with given target properties. The result must be a product that meets the specifications at reduced cost by using the exact mix proportions.

Compressive strength is one of the most important properties of concrete. It is measured by breaking cylindrical concrete specimens in a compression-testing machine.

A set of compressive strength tests has been performed in the laboratory for 425 concrete specimens with different ingredients.

The concrete compressive strength is a highly nonlinear function of age and ingredients. The objective is to model the compressive strength from these components.

This is an approximation project, since the variable to be predicted is continuous (compressive strength).

The basic goal here is to model the compressive strength, as a function of the concrete components.

The first step is to prepare the data set, which is the source of information for the approximation problem. It is composed of:

- Data file.
- Variables information.
- Instances information.

The data file concrete_properties.csv contains 8 columns and 425 rows.

The next listing shows the variables in the data set and their use:

- **cement**, in kg/m3.
- **blast_furnace_slag**, in kg/m3.
- **fly_ash**, in kg/m3.
- **water**, in kg/m3.
- **superplasticizer**, in kg/m3.
- **coarse_aggregate**, in kg/m3.
- **fine_aggregate**, in kg/m3.
- **compressive_strength**, in MPa.

The instances are divided into training, selection, and testing subsets. They represent 60%, 20%, and 20% of the original instances, respectively, and are split at random.
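The 60/20/20 random split described above can be sketched as follows. This is a minimal NumPy version; the helper name and seed are illustrative, not part of the original tooling:

```python
import numpy as np

def split_indices(n_instances, seed=0):
    """Randomly split instance indices into 60% training,
    20% selection (validation) and 20% testing subsets."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_instances)
    n_train = int(0.6 * n_instances)
    n_selection = int(0.2 * n_instances)
    training = indices[:n_train]
    selection = indices[n_train:n_train + n_selection]
    testing = indices[n_train + n_selection:]
    return training, selection, testing

# 425 instances, as in the concrete_properties.csv data set
training, selection, testing = split_indices(425)
```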

Once all the data set information has been set, we are ready to perform some analytics, to check the quality of the data.

For instance, we can calculate the data distribution. The next figure depicts the histogram for the target variable.

As we can see, the compressive strength follows an approximately normal distribution.

The next figure depicts the inputs-targets correlations. This might help us to see the influence of the different inputs on the compressive strength of the concrete.

The above chart shows that the amount of cement has the greatest impact on the compressive strength.
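Input-target correlations like those in the chart can be computed with pandas. Since the real data file is not bundled here, the sketch below builds a small synthetic stand-in with the same column names; only the general pattern (positive correlation with cement, negative with water) is meant to be illustrative:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for concrete_properties.csv (real file not included here)
rng = np.random.default_rng(1)
n = 425
cement = rng.uniform(100, 500, n)
water = rng.uniform(120, 250, n)
data = pd.DataFrame({
    "cement": cement,
    "water": water,
    # strength rises with cement and falls with water in this toy model
    "compressive_strength": 0.1 * cement - 0.05 * water + rng.normal(0, 2, n),
})

# Correlation of each input with the target variable
correlations = data.corr()["compressive_strength"].drop("compressive_strength")
print(correlations.sort_values(ascending=False))
```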

We can also plot a scatter chart with the compressive strength versus the cement amount.

In general, the more cement the more compressive strength. However, the compressive strength depends on all the inputs at the same time.

The second step is to configure the neural network. For approximation projects, a neural network is usually composed of:

- Scaling layer.
- Perceptron layers.
- Unscaling layer.

The scaling layer contains the statistics on the inputs calculated from the data file and the method for scaling the input variables. Here the minimum and maximum scaling method has been set. Nevertheless, the mean and standard deviation method would produce very similar results.
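The minimum-maximum method maps each variable to the range [-1, 1]. A minimal sketch of the scaling and its inverse (the function names are illustrative):

```python
import numpy as np

def minmax_scale(x, x_min, x_max):
    """Scale a value to [-1, 1] using the minimum-maximum method."""
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0

def minmax_unscale(y_scaled, y_min, y_max):
    """Inverse transform: map a value in [-1, 1] back to its original range."""
    return 0.5 * (y_scaled + 1.0) * (y_max - y_min) + y_min
```

The unscaling formula is the same one that appears in the exported model expression at the end of this article.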

Here, two perceptron layers are added to the neural network. This number of layers is enough for most applications. The first layer has 7 inputs and 3 neurons; the second layer has 3 inputs and 1 neuron. The hyperbolic tangent and linear functions have been set as the activation functions for the first and second layers, respectively. These are the default values we will be using as a first guess.
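The forward pass of this 7-3-1 architecture can be sketched directly in NumPy (weights here are random placeholders; the real values come from training):

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    hidden = np.tanh(x @ W1 + b1)    # first perceptron layer, 3 tanh neurons
    return hidden @ W2 + b2          # second perceptron layer, 1 linear neuron

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(7, 3)), np.zeros(3)   # 7 scaled inputs -> 3 neurons
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)   # 3 neurons -> 1 output
output = forward(rng.normal(size=(10, 7)), W1, b1, W2, b2)
```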

The unscaling layer transforms the normalized values from the neural network into original outputs. Here the minimum and maximum unscaling method will also be used.

The figure above shows the resulting network architecture.

The third step is to select an appropriate training strategy, which is composed of two components:

- A loss index.
- An optimization algorithm.

The loss index chosen is the normalized squared error with L2 regularization.

The normalized squared error divides the squared error between the outputs from the neural network and the targets in the data set by a normalization coefficient. If the normalized squared error has a value of 1 then the neural network is predicting the data 'in the mean', while a value of zero means perfect prediction of the data. This error term does not have any parameters to set.
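The normalized squared error can be written in a few lines; the two limiting cases described above (a value of 1 for predicting the mean, 0 for a perfect prediction) follow directly from the definition:

```python
import numpy as np

def normalized_squared_error(outputs, targets):
    """Squared error between outputs and targets, divided by a
    normalization coefficient (the squared deviation of the targets
    from their mean). NSE = 1 -> predicting 'in the mean';
    NSE = 0 -> perfect prediction."""
    return np.sum((outputs - targets) ** 2) / np.sum((targets - targets.mean()) ** 2)
```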

The L2 regularization is applied to control the complexity of the neural network by reducing the value of the parameters. A weak regularization weight is applied here.

The learning problem can be stated as finding a neural network that minimizes the loss index, that is, one that fits the data set (error term) without oscillating (regularization term).

The next step in solving this problem is to assign the optimization algorithm. We use the quasi-Newton method here.
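A quasi-Newton optimization of this loss index can be sketched with SciPy's BFGS implementation on a small synthetic data set. This is only an illustration of the idea, not the original software's training loop; the parameter packing and the regularization weight of 1e-3 are assumptions:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(100, 7))          # 7 scaled inputs
t = np.tanh(X @ rng.normal(size=(7, 1))).ravel()   # synthetic targets

n_in, n_hidden = 7, 3
sizes = [n_in * n_hidden, n_hidden, n_hidden, 1]   # W1, b1, W2, b2

def unpack(theta):
    """Slice the flat parameter vector into layer weights and biases."""
    i = np.cumsum(sizes)
    W1 = theta[:i[0]].reshape(n_in, n_hidden)
    b1 = theta[i[0]:i[1]]
    W2 = theta[i[1]:i[2]].reshape(n_hidden, 1)
    b2 = theta[i[2]:]
    return W1, b1, W2, b2

def loss_index(theta, weight=1e-3):
    """Normalized squared error plus a weak L2 regularization term."""
    W1, b1, W2, b2 = unpack(theta)
    outputs = (np.tanh(X @ W1 + b1) @ W2 + b2).ravel()
    nse = np.sum((outputs - t) ** 2) / np.sum((t - t.mean()) ** 2)
    return nse + weight * np.sum(theta ** 2)

theta0 = rng.normal(scale=0.1, size=sum(sizes))
result = minimize(loss_index, theta0, method="BFGS")  # quasi-Newton step
```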

The neural network is trained to obtain the best possible performance. The next table shows the training history.

The final training and selection errors are **training error = 0.153 NSE** and **selection error = 0.24 NSE**, respectively.
In the next section we will try to improve the generalization performance by reducing the selection error.

The best generalization is achieved by using a model whose complexity is the most appropriate to produce an adequate fit of the data. Order selection is responsible for finding the optimal number of perceptrons. The algorithm selected for this purpose is the incremental order method.

The next image shows the result after the process. The blue line symbolizes the training error, and the orange line represents the selection error.

As we can see in the picture above, the method starts with a small number of neurons (order) and increases the complexity at each iteration. The algorithm selects the order with the minimum selection loss; for orders greater than this, the selection error increases due to overfitting, since the model becomes too complex.
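The incremental order method can be sketched as a loop that grows the hidden layer and keeps the order with the lowest selection error. The sketch below uses scikit-learn's `MLPRegressor` on synthetic data as a stand-in for the original tool:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(300, 7))
y = np.tanh(X @ rng.normal(size=7))

# 60/20 training/selection split (testing subset omitted for brevity)
X_train, y_train = X[:180], y[:180]
X_sel, y_sel = X[180:240], y[180:240]

selection_errors = {}
for order in range(1, 8):  # incremental order: grow the hidden layer
    net = MLPRegressor(hidden_layer_sizes=(order,), activation="tanh",
                       solver="lbfgs", max_iter=2000, random_state=0)
    net.fit(X_train, y_train)
    residuals = net.predict(X_sel) - y_sel
    selection_errors[order] = (np.sum(residuals ** 2)
                               / np.sum((y_sel - y_sel.mean()) ** 2))

# Keep the order with the minimum selection error
best_order = min(selection_errors, key=selection_errors.get)
```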

After the Order selection, we have achieved a selection error of **0.211839 NSE**.

The figure above represents the final network architecture.

A standard method for testing the prediction capabilities of an approximation model is to compare the outputs from the neural network against an independent set of data.

The next plot shows the predicted compressive strength values versus the actual ones.

As we can see, both values are very similar for the entire range of data.
The coefficient of determination is **R2 = 0.861**, which indicates that the model
has a reliable prediction capability.
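The coefficient of determination compares the residual error against the variance of the targets; note that with this definition it equals one minus the normalized squared error used during training:

```python
import numpy as np

def r_squared(outputs, targets):
    """Coefficient of determination between predictions and actual targets.
    1.0 -> perfect prediction; 0.0 -> no better than predicting the mean."""
    ss_res = np.sum((targets - outputs) ** 2)
    ss_tot = np.sum((targets - targets.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```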

It is also convenient to explore the errors made by the neural network on single testing instances. In this example, some outliers are removed to achieve the best possible performance. The mean error is 5.53%, with a standard deviation of 3.69%, which is a good value for this kind of application.

Once we know that the neural network can predict the compressive strength accurately, we can move to the model deployment phase to design concretes with desired properties.

It is very useful to see how the outputs vary as a function of a single input while all the others are fixed. Directional outputs plot the neural network outputs through some reference point.

The next list shows the reference point for the plots.

- **cement**: 265 kg/m3.
- **blast_furnace_slag**: 86 kg/m3.
- **fly_ash**: 62 kg/m3.
- **water**: 183 kg/m3.
- **superplasticizer**: 7 kg/m3.
- **coarse_aggregate**: 956 kg/m3.
- **fine_aggregate**: 764 kg/m3.
- **compressive_strength**: MPa.

The following plot shows how the compressive strength varies with the cement amount for that reference point.

The next listing is the mathematical expression represented by the predictive model.

```
scaled_cement = (cement-265.444)/104.67
scaled_blast_furnace_slag = (blast_furnace_slag-86.2852)/87.8265
scaled_fly_ash = (fly_ash-62.7953)/66.2277
scaled_water = (water-183.06)/19.3286
scaled_superplasticizer = (superplasticizer-6.99576)/5.39228
scaled_coarse_aggregate = (coarse_aggregate-956.059)/83.8016
scaled_fine_aggregate = (fine_aggregate-764.377)/73.1205

y_1_1 = tanh(0.00514643 + (scaled_cement*-0.252051) + (scaled_blast_furnace_slag*0.219995) + (scaled_fly_ash*0.21738) + (scaled_water*0.409424) + (scaled_superplasticizer*-0.582195) + (scaled_coarse_aggregate*0.228467) + (scaled_fine_aggregate*-0.0963871))
y_1_2 = tanh(-0.66057 + (scaled_cement*-0.381739) + (scaled_blast_furnace_slag*-0.303248) + (scaled_fly_ash*-0.153626) + (scaled_water*-0.29632) + (scaled_superplasticizer*-0.0808802) + (scaled_coarse_aggregate*-0.305395) + (scaled_fine_aggregate*-0.202458))
y_1_3 = tanh(-0.0348251 + (scaled_cement*0.105028) + (scaled_blast_furnace_slag*-0.518262) + (scaled_fly_ash*-0.54546) + (scaled_water*0.0903926) + (scaled_superplasticizer*1.03298) + (scaled_coarse_aggregate*0.0226592) + (scaled_fine_aggregate*0.247202))

scaled_compressive_strength = -1.25097 + (y_1_1*-1.11323) + (y_1_2*-1.90226) + (y_1_3*-0.770388)

compressive_strength = 0.5*(scaled_compressive_strength+1.0)*(81.75-8.54) + 8.54
```
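The exported expression can be transcribed directly into a Python function, evaluated here at the reference point listed earlier (the function name is ours; the coefficients are those of the listing):

```python
import numpy as np

def predict_compressive_strength(cement, blast_furnace_slag, fly_ash, water,
                                 superplasticizer, coarse_aggregate,
                                 fine_aggregate):
    """Direct transcription of the exported model expression (MPa)."""
    s_cement = (cement - 265.444) / 104.67
    s_slag = (blast_furnace_slag - 86.2852) / 87.8265
    s_ash = (fly_ash - 62.7953) / 66.2277
    s_water = (water - 183.06) / 19.3286
    s_super = (superplasticizer - 6.99576) / 5.39228
    s_coarse = (coarse_aggregate - 956.059) / 83.8016
    s_fine = (fine_aggregate - 764.377) / 73.1205
    y_1_1 = np.tanh(0.00514643 - 0.252051*s_cement + 0.219995*s_slag
                    + 0.21738*s_ash + 0.409424*s_water - 0.582195*s_super
                    + 0.228467*s_coarse - 0.0963871*s_fine)
    y_1_2 = np.tanh(-0.66057 - 0.381739*s_cement - 0.303248*s_slag
                    - 0.153626*s_ash - 0.29632*s_water - 0.0808802*s_super
                    - 0.305395*s_coarse - 0.202458*s_fine)
    y_1_3 = np.tanh(-0.0348251 + 0.105028*s_cement - 0.518262*s_slag
                    - 0.54546*s_ash + 0.0903926*s_water + 1.03298*s_super
                    + 0.0226592*s_coarse + 0.247202*s_fine)
    scaled = -1.25097 - 1.11323*y_1_1 - 1.90226*y_1_2 - 0.770388*y_1_3
    return 0.5 * (scaled + 1.0) * (81.75 - 8.54) + 8.54

# Evaluate at the reference point used for the directional outputs
strength = predict_compressive_strength(265, 86, 62, 183, 7, 956, 764)
```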

The above formula can be exported to the software tool required by the customer.

The purpose of improving the quality of concrete was to help construction companies obtain the best product suited to their needs at minimum cost. We have used a neural network, trained on 425 concrete specimens, to predict the compressive strength as a function of the constituent materials and their proportions.

- I-Cheng Yeh, "Modeling of strength of high performance concrete using artificial neural networks", Cement and Concrete Research, Vol. 28, No. 12, pp. 1797-1808 (1998).