In this post, we formulate the model selection problem and describe the algorithms most used in practice.
Input selection algorithms are responsible for finding the optimal subset of inputs.
Input selection is a way to improve the quality of the predictions. It consists of extracting the subset of inputs that have the most influence on a particular process, whether physical, biological, social, or of another kind.
The growing inputs method starts by calculating every input’s correlation with every output in the neural network.
It then takes the most correlated input and keeps adding the next most correlated variables, one at a time, until the selection error starts to increase.
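The growing inputs procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a linear least-squares fit stands in for the neural network, the selection error is taken to be the mean squared error on a held-out selection split, and the names `selection_error` and `growing_inputs` are hypothetical helpers, not part of any library.

```python
import numpy as np

def selection_error(X_train, y_train, X_sel, y_sel, cols):
    """Fit least squares (assumed stand-in for the network) and return MSE on the selection split."""
    A_train = np.column_stack([X_train[:, cols], np.ones(len(X_train))])
    A_sel = np.column_stack([X_sel[:, cols], np.ones(len(X_sel))])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.mean((A_sel @ coef - y_sel) ** 2))

def growing_inputs(X_train, y_train, X_sel, y_sel):
    """Add inputs in order of decreasing correlation until the selection error stops decreasing."""
    n_inputs = X_train.shape[1]
    # Absolute correlation of every input with the (single) output.
    corr = [abs(np.corrcoef(X_train[:, j], y_train)[0, 1]) for j in range(n_inputs)]
    order = np.argsort(corr)[::-1]  # most correlated first
    selected, best = [], np.inf
    for j in order:
        candidate = selected + [int(j)]
        err = selection_error(X_train, y_train, X_sel, y_sel, candidate)
        if err >= best:
            break  # the selection error no longer improves: stop growing
        selected, best = candidate, err
    return selected, best

# Synthetic example: only inputs 0 and 2 drive the output.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=300)
selected, error = growing_inputs(X[:200], y[:200], X[200:], y[200:])
print(selected, error)
```

In this toy data set the two informative inputs have the largest correlations, so they enter the subset first. Whether any of the remaining inputs is added afterwards depends on the noise in the selection split.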
The pruning inputs method also starts by calculating the correlation between every input and every output in the neural network.
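A pruning procedure might look like the sketch below. The text above only states how the method starts, so the rest is our assumption: we mirror the growing method by starting from all inputs and removing the least correlated one at a time until the selection error starts to increase. As before, a linear least-squares fit is an assumed stand-in for the neural network, and `selection_error` and `pruning_inputs` are hypothetical names.

```python
import numpy as np

def selection_error(X_train, y_train, X_sel, y_sel, cols):
    """Fit least squares (assumed stand-in for the network) and return MSE on the selection split."""
    A_train = np.column_stack([X_train[:, cols], np.ones(len(X_train))])
    A_sel = np.column_stack([X_sel[:, cols], np.ones(len(X_sel))])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.mean((A_sel @ coef - y_sel) ** 2))

def pruning_inputs(X_train, y_train, X_sel, y_sel):
    """Assumed rule: drop the least correlated inputs until the selection error increases."""
    n_inputs = X_train.shape[1]
    corr = [abs(np.corrcoef(X_train[:, j], y_train)[0, 1]) for j in range(n_inputs)]
    order = np.argsort(corr)  # least correlated first
    selected = list(range(n_inputs))
    best = selection_error(X_train, y_train, X_sel, y_sel, selected)
    for j in order:
        if len(selected) == 1:
            break  # always keep at least one input
        candidate = [c for c in selected if c != j]
        err = selection_error(X_train, y_train, X_sel, y_sel, candidate)
        if err > best:
            break  # removing this input hurts: stop pruning
        selected, best = candidate, err
    return selected, best

# Synthetic example: only inputs 0 and 2 drive the output.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=300)
selected, error = pruning_inputs(X[:200], y[:200], X[200:], y[200:])
print(selected, error)
```

Under this rule the informative inputs are the last candidates for removal, and dropping either of them raises the selection error sharply, so both stay in the final subset.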
A different class of input selection methods is the genetic algorithm.
This is a stochastic method based on the mechanics of natural genetics and biological evolution.
The implemented genetic algorithm offers several methods for each of its operators: fitness assignment, selection, crossover, and mutation.
The following figure shows a simplified flow diagram of the genetic algorithm.
The genetic algorithm starts with a population of different subsets of variables.
In every generation, the fitness of every individual in the population is computed as the selection error for that subset of inputs.
Then, the method evolves the population in three steps: it selects some individuals as parents of the new population, applies crossover to the selected individuals, and mutates the offspring generated during the crossover.
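The loop described above can be sketched as follows. This is a minimal illustration rather than the implementation discussed in this post. Each individual is a boolean mask over the inputs, and the operator choices (keeping the fitter half as parents, uniform crossover, bit-flip mutation) are our assumptions, picked for brevity. A linear least-squares fit is an assumed stand-in for the neural network, and all function and parameter names are hypothetical.

```python
import numpy as np

def selection_error(X_train, y_train, X_sel, y_sel, mask):
    """Fit least squares (assumed stand-in for the network) and return MSE on the selection split."""
    A_train = np.column_stack([X_train[:, mask], np.ones(len(X_train))])
    A_sel = np.column_stack([X_sel[:, mask], np.ones(len(X_sel))])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.mean((A_sel @ coef - y_sel) ** 2))

def genetic_input_selection(X_train, y_train, X_sel, y_sel,
                            population_size=20, generations=15,
                            mutation_rate=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n_inputs = X_train.shape[1]
    # Initial population: random subsets of inputs encoded as boolean masks.
    population = rng.random((population_size, n_inputs)) < 0.5
    best_mask, best_error = None, np.inf
    for _ in range(generations):
        # Fitness assignment: the selection error of each subset (lower is better).
        errors = np.array([selection_error(X_train, y_train, X_sel, y_sel, m)
                           for m in population])
        i = int(np.argmin(errors))
        if errors[i] < best_error:
            best_mask, best_error = population[i].copy(), float(errors[i])
        # Selection: keep the fitter half of the population as parents.
        parents = population[np.argsort(errors)[: population_size // 2]]
        # Crossover: each offspring takes every bit from one of two random parents.
        mothers = parents[rng.integers(len(parents), size=population_size)]
        fathers = parents[rng.integers(len(parents), size=population_size)]
        take_mother = rng.random((population_size, n_inputs)) < 0.5
        offspring = np.where(take_mother, mothers, fathers)
        # Mutation: flip each bit with a small probability.
        flips = rng.random(offspring.shape) < mutation_rate
        population = offspring ^ flips
    return best_mask, best_error

# Synthetic example: only inputs 0 and 2 drive the output.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=300)
best_mask, best_error = genetic_input_selection(X[:200], y[:200], X[200:], y[200:])
print(best_mask, best_error)
```

The sketch returns the best subset seen in any generation, so a good individual is never lost to mutation in the final answer. Each generation evaluates the whole population, which means one model fit per individual.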
However, model selection algorithms are computationally very expensive, since every candidate subset requires training and evaluating a model, so their main drawback is running time.
Neural Designer includes an advanced model selection framework capable of representing very complex data sets.
This system provides high added value to data scientists, giving them results that were previously very hard to obtain.