{"id":3539,"date":"2025-11-25T11:12:57","date_gmt":"2025-11-25T10:12:57","guid":{"rendered":"https:\/\/neuraldesigner.com\/learning\/training-strategy\/"},"modified":"2026-02-11T09:01:00","modified_gmt":"2026-02-11T08:01:00","slug":"training-strategy","status":"publish","type":"learning","link":"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/","title":{"rendered":"Machine learning: Training strategy &#8211; tutorial"},"content":{"rendered":"<p data-start=\"186\" data-end=\"277\">The machine learning training strategy is the method that drives the learning process of a neural network.<\/p>\n<p data-start=\"385\" data-end=\"485\">It searches for the parameter values that best adapt the neural network to the data set.<\/p>\n<p>A general training strategy combines two key ideas:<\/p>\n<section>\n<ul>\n<li><b><a href=\"#LossIndex\">1. Loss index<\/a><\/b><\/li>\n<li><b><a href=\"#OptimizationAlgorithm\">2. Optimization algorithm<\/a><\/b><\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/www.neuraldesigner.com\/images\/training_strategy.svg\" width=\"52\" height=\"52\" \/><\/p>\n<\/section>\n<section>\n<h2>1. 
Loss index<\/h2>\n<p>The loss index defines the task for the neural network and measures how well it learns.<\/p>\n<p>The choice of a suitable loss index depends on the application.<\/p>\n<p>When setting a loss index, we must define two terms: an <a href=\"#ErrorTerm\">error term<\/a> and a <a href=\"#RegularizationTerm\">regularization term<\/a>.<\/p>\n<p>$$\\text{loss index} = \\text{error term} + \\text{regularization term}$$<\/p>\n<h3>Error term<\/h3>\n<p>The error is the most important term in the loss index.<\/p>\n<p>It measures how well the <a href=\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/neural-network\">neural network<\/a>\u00a0fits the\u00a0<a href=\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/data-set\">data set<\/a>.<\/p>\n<p>We can calculate errors on the <a href=\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/data-set#TrainingSamples\">training<\/a>,\u00a0<a href=\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/data-set#SelectionSamples\">selection<\/a>, and <a href=\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/data-set#TestingSamples\">testing<\/a>\u00a0samples.<\/p>\n<p>Next, we describe the most important errors used in machine learning:<\/p>\n<ul>\n<li><a href=\"#MeanSquaredError\">Mean squared error<\/a>.<\/li>\n<li><a href=\"#NormalizedSquaredError\">Normalized squared error<\/a>.<\/li>\n<li><a href=\"#WeightedSquaredError\">Weighted squared error<\/a>.<\/li>\n<li><a href=\"#CrossEntropyError\">Cross-entropy error<\/a>.<\/li>\n<li><a href=\"#MinkowskiError\">Minkowski error<\/a>.<\/li>\n<\/ul>\n<p>We calculate the gradient of all these errors using the backpropagation algorithm.<\/p>\n<h4>Mean squared error (MSE)<\/h4>\n<p style=\"max-width: 100%;\">The mean squared error is the average of the squared differences between the neural network outputs and the dataset&#8217;s targets.<\/p>\n<p>$$\\text{mean squared error} = \\frac{\\sum (\\text{outputs} - \\text{targets})^2}{\\text{samples 
number}}$$<\/p>\n<h4>Normalized squared error (NSE)<\/h4>\n<\/section>\n<p>The normalized squared error is the squared difference between outputs and targets divided by a normalization coefficient.<\/p>\n<p>A value of zero means the network predicts the data perfectly.<\/p>\n<p>Conversely, a value of one means the network only predicts the average of the data.<\/p>\n<section>$$\\text{normalized squared error} = \\frac{\\sum (\\text{outputs} - \\text{targets})^2}{\\text{normalization coefficient}}$$<\/section>\n<section><\/section>\n<section>The normalized squared error is the default error term for approximation problems.<\/p>\n<h4>Weighted squared error (WSE)<\/h4>\n<\/section>\n<p>The weighted squared error is used in binary classification with unbalanced data.<\/p>\n<p>This happens when the numbers of positive and negative samples are very different.<\/p>\n<p>It assigns weights so that positives and negatives contribute equally to the error.<\/p>\n<section><\/section>\n<section>$$\\text{weighted squared error} =<br \/>\n\\text{positives weight} \\cdot \\sum (\\text{outputs} - \\text{positive targets})^2<br \/>\n+ \\text{negatives weight} \\cdot \\sum (\\text{outputs} - \\text{negative targets})^2$$<\/section>\n<section><\/section>\n<section>\n<h4>Cross-entropy error<\/h4>\n<p>The cross-entropy error is used in both binary and multi-class classification problems.<\/p>\n<p>It penalizes the network heavily when it assigns a high probability to the wrong class.<\/p>\n<p>For binary classification, the cross-entropy error is<\/p>\n<\/section>\n<p>$$\\text{cross-entropy error} =<br \/>\n- \\sum \\Big( \\text{targets} \\cdot \\log(\\text{outputs}) + (1 - \\text{targets}) \\cdot \\log(1 - \\text{outputs}) \\Big)$$<\/p>\n<section>A perfect model has a cross-entropy error of zero.<\/p>\n<h4>Minkowski error (ME)<\/h4>\n<\/section>\n<p data-start=\"53\" data-end=\"109\">The previous errors can be very sensitive to outliers.<\/p>\n<p data-start=\"111\" data-end=\"179\">In such cases, the Minkowski 
error provides better generalization.<\/p>\n<p data-start=\"181\" data-end=\"279\">It raises the absolute differences between outputs and targets to a power called the Minkowski parameter.<\/p>\n<p data-start=\"281\" data-end=\"346\">This parameter ranges from 1 to 2, with a default value of 1.5.<\/p>\n<section>$$\\text{Minkowski error} =<br \/>\n\\frac{\\sum \\Big\\lvert \\text{outputs} - \\text{targets} \\Big\\rvert^{\\text{Minkowski parameter}}}{\\text{samples number}}$$<\/p>\n<h3>Regularization term<\/h3>\n<p data-start=\"102\" data-end=\"196\">A model is regular when small changes in the inputs cause only small changes in the outputs.<\/p>\n<p data-start=\"198\" data-end=\"285\">If the model is not regular, it may overfit the training data and fail to generalize.<\/p>\n<p data-start=\"287\" data-end=\"351\">To avoid this, we add a regularization term to the loss index.<\/p>\n<p data-start=\"287\" data-end=\"351\">This term keeps the network\u2019s weights and biases small, making the model simpler and smoother.<\/p>\n<p>The main types of regularization are:<\/p>\n<ul>\n<li><a href=\"#L1Regularization\">L1 regularization<\/a>.<\/li>\n<li><a href=\"#L2Regularization\">L2 regularization<\/a>.<\/li>\n<\/ul>\n<p>We can easily compute the gradient of these regularization terms.<\/p>\n<h3>L1 regularization<\/h3>\n<p>L1 regularization adds up the absolute values of all the network\u2019s parameters.<\/p>\n<p>It drives the parameters to small values, which makes the model simpler and less likely to overfit.<\/p>\n<p>$$\\text{L1 regularization} =<br \/>\n\\text{regularization weight} \\cdot \\sum \\lvert \\text{parameters} \\rvert$$<\/p>\n<h3>L2 regularization<\/h3>\n<p>L2 regularization is the sum of the squared values of all the network\u2019s parameters.<\/p>\n<p>It also drives the parameters to small values, making the model more regular.<\/p>\n<p>$$\\text{L2 regularization} =<br \/>\n\\text{regularization weight} \\cdot \\sum \\text{parameters}^2$$<\/p>\n<p>As we can see, a 
parameter controls the weight of the regularization term.<\/p>\n<p data-start=\"118\" data-end=\"177\">We decrease the weight if the model is too smooth and increase it if the model oscillates too much.<\/p>\n<h3>Loss function<\/h3>\n<p data-start=\"55\" data-end=\"128\">The loss index depends on the neural network function and the data set.<\/p>\n<p data-start=\"130\" data-end=\"218\">We can imagine it as a surface in many dimensions, with the parameters as coordinates.<\/p>\n<p data-start=\"220\" data-end=\"254\">The following figure shows this idea.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.neuraldesigner.com\/images\/loss_function.svg\" alt=\"Loss function\" width=\"500\" \/><\/p>\n<p>&nbsp;<\/p>\n<\/section>\n<p>Training a neural network consists of finding the parameter values that minimize the loss index.<\/p>\n<section>\n<h2>2. Optimization algorithm<\/h2>\n<p>An optimization algorithm is the method used to adjust the parameters of a neural network to minimize the loss index.<\/p>\n<p>Training starts with random parameters and iteratively improves them until reaching a minimum of the loss index.<\/p>\n<\/section>\n<p data-start=\"487\" data-end=\"524\">The following figure shows this process.<\/p>\n<section><img decoding=\"async\" src=\"https:\/\/www.neuraldesigner.com\/images\/training_process.svg\" alt=\"Training process\" width=\"425\" \/><\/p>\n<p data-start=\"51\" data-end=\"117\">The optimization algorithm stops when a chosen condition is met.<\/p>\n<p data-start=\"119\" data-end=\"154\">Standard stopping criteria include:<\/p>\n<ul>\n<li data-start=\"255\" data-end=\"297\">The maximum number of epochs is reached.<\/li>\n<li data-start=\"300\" data-end=\"341\">The maximum computing time is exceeded.<\/li>\n<li data-start=\"344\" data-end=\"395\">The selection error increases for several epochs.<\/li>\n<\/ul>\n<p data-start=\"53\" data-end=\"134\">The optimization algorithm decides how to adjust the neural network parameters.<\/p>\n<p 
data-start=\"136\" data-end=\"212\">Different algorithms have different computational and memory requirements, and no one is best for all problems.<\/p>\n<\/section>\n<section>\n<p data-start=\"263\" data-end=\"323\">Next, we describe the most common optimization algorithms.<\/p>\n<\/section>\n<section>\n<ul>\n<li><a href=\"#GradientDescent\">Gradient descent<\/a>.<\/li>\n<li><a href=\"#QuasiNewtonMethod\">Newton&#8217;s method<\/a>.<\/li>\n<li><a href=\"#QuasiNewtonMethod\">Quasi-Newton method<\/a>.<\/li>\n<li><a href=\"#LevenbergMarquardtAlgorithm\">Levenberg-Marquardt algorithm<\/a>.<\/li>\n<li><a href=\"#StochasticGradientDescent\">Stochastic gradient descent<\/a>.<\/li>\n<li><a href=\"#AdaptativeLinearMomentum\">Adaptive linear momentum<\/a>.<\/li>\n<\/ul>\n<h3>Gradient descent (GD)<\/h3>\n<p>The simplest optimization algorithm is gradient descent.<\/p>\n<p>It updates the parameters each epoch in the direction of the negative gradient of the loss index.<\/p>\n<p>A factor called the learning rate controls the change of parameters.<\/p>\n<p>$$\\text{New parameters} =<br \/>\n\\text{parameters} &#8211; \\text{loss gradient} \\cdot \\text{learning rate}$$<\/p>\n<p>The main drawback of gradient descent is that it can converge very slowly.<\/p>\n<h3>Newton method (NM)<\/h3>\n<p>Newton\u2019s method uses the Hessian matrix, which contains all the second derivatives of the\u00a0<a href=\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy#LossIndex\">loss function<\/a>.<\/p>\n<p data-start=\"327\" data-end=\"476\">Unlike gradient descent, which only relies on first derivatives, Newton\u2019s method uses curvature information to find better training directions.<\/p>\n<p>$$\\text{New parameters} =<br \/>\n\\text{parameters} &#8211; \\text{loss Hessian}^{-1} \\cdot \\text{loss gradient} \\cdot \\text{learning rate}$$<\/p>\n<p>The main drawback of Newton\u2019s method is the high computational cost of calculating the Hessian 
matrix.<\/p>\n<h3>Quasi-Newton method (QNM)<\/h3>\n<\/section>\n<p>Newton\u2019s method uses the Hessian matrix of second derivatives to set the training direction.<\/p>\n<p>This gives high accuracy but is very expensive to compute.<\/p>\n<p>Quasi-Newton methods avoid this by approximating the inverse Hessian using only gradient information.<\/p>\n<p>Line minimization algorithms adjust the learning rate at each epoch.<\/p>\n<section>$$\\text{New parameters} =<br \/>\n\\text{parameters} - \\text{inverse Hessian approximation} \\cdot \\text{gradient} \\cdot \\text{learning rate}$$<\/section>\n<p>The quasi-Newton method makes training faster than gradient descent and less costly than Newton\u2019s method.<\/p>\n<section>Therefore, it is the default algorithm for small and medium neural networks.<\/p>\n<h3>Levenberg-Marquardt algorithm (LM)<\/h3>\n<p>The Levenberg\u2013Marquardt algorithm is another optimizer that achieves near second-order speed without computing the Hessian matrix.<\/p>\n<p>This method applies only when the loss index is a sum of squares, such as the mean squared error or the normalized squared error.<\/p>\n<p>It requires the gradient and the Jacobian matrix of the loss index.<\/p>\n<p>$$\\text{New parameters} =<br \/>\n\\text{parameters} - \\text{damping parameter} \\cdot \\text{Jacobian} \\cdot \\text{gradient}$$<\/p>\n<p>The Levenberg-Marquardt algorithm is very fast but requires a lot of memory, so it is recommended only for small networks.<\/p>\n<h3>Stochastic gradient descent (SGD)<\/h3>\n<\/section>\n<p>Stochastic gradient descent works differently from the previous algorithms.<\/p>\n<p>It updates the parameters several times in each epoch using small batches of data.<\/p>\n<section>$$\\text{New parameters} =<br \/>\n\\text{parameters} - \\text{batch gradient} \\cdot \\text{learning rate} + \\text{momentum} \\cdot \\text{previous update}$$<\/section>\n<section><\/section>\n<section>Stochastic gradient descent makes training faster on large data 
sets. Its drawback is that the updates are noisy and may lead to slower convergence.<\/p>\n<h3>Adaptive moment estimation (Adam)<\/h3>\n<\/section>\n<p>The Adam algorithm is similar to stochastic gradient descent but uses a more advanced way to calculate the training direction.<\/p>\n<p>It also adapts the learning rate for each parameter, which usually makes convergence faster.<\/p>\n<section>$$\\text{New parameters} =<br \/>\n\\text{parameters} - \\frac{\\text{gradient exponential average}}{\\sqrt{\\text{squared gradient exponential average}}} \\cdot \\text{learning rate}$$<\/section>\n<section><\/section>\n<section>Adam is the default optimization algorithm for large neural networks.<\/p>\n<h3>Performance considerations<\/h3>\n<p>For small datasets (about 10 variables and 10,000 samples), the <a href=\"#LevenbergMarquardtAlgorithm\">Levenberg-Marquardt algorithm<\/a> is recommended for its speed and precision.<\/p>\n<p>With medium-sized problems, the <a href=\"#QuasiNewtonMethod\">quasi-Newton method<\/a> works\u00a0well.<\/p>\n<p>For\u00a0<span style=\"box-sizing: border-box; margin: 0px; padding: 0px;\">large data sets (about 1,000 variables and 1,000,000 samples), <\/span><a href=\"#AdaptativeLinearMomentum\">adaptive moment estimation (Adam)<\/a>\u00a0is the best choice.<\/p>\n<p>The article <span style=\"box-sizing: border-box; margin: 0px; padding: 0px;\"><a href=\"https:\/\/www.neuraldesigner.com\/blog\/5_algorithms_to_train_a_neural_network\" target=\"_blank\" rel=\"noopener\">&#8220;5 Algorithms to Train a Neural Network&#8221;<\/a><\/span>\u00a0in the <a href=\"https:\/\/www.neuraldesigner.com\/blog\">Neural Designer blog<\/a> contains more information about this subject.<\/p>\n<\/section>\n<section><a style=\"float: left;\" href=\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/neural-network\">\u21d0 Neural Network<\/a><br \/>\n<a style=\"float: right;\" href=\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/model-selection\">Model Selection 
\u21d2<\/a><\/section>\n","protected":false},"author":122,"featured_media":1428,"template":"","categories":[30],"tags":[36],"class_list":["post-3539","learning","type-learning","status-publish","has-post-thumbnail","hentry","category-tutorials","tag-tutorials"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Machine learning: Training strategy - tutorial<\/title>\n<meta name=\"description\" content=\"This tutorial shows the main training strategies used by neural networks to learn.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Neural networks tutorial: Training strategy\" \/>\n<meta property=\"og:description\" content=\"This tutorial shows the main training strategies used by neural networks to learn.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/\" \/>\n<meta property=\"og:site_name\" content=\"Neural Designer\" \/>\n<meta property=\"article:modified_time\" content=\"2026-02-11T08:01:00+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:title\" content=\"Neural networks tutorial: Training strategy\" \/>\n<meta name=\"twitter:description\" content=\"This tutorial shows the main training strategies used by neural networks to learn.\" \/>\n<meta name=\"twitter:site\" content=\"@NeuralDesigner\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"7 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/\",\"url\":\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/\",\"name\":\"Machine learning: Training strategy - tutorial\",\"isPartOf\":{\"@id\":\"https:\/\/www.neuraldesigner.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.neuraldesigner.com\/wp-content\/uploads\/2023\/06\/training_process.svg\",\"datePublished\":\"2025-11-25T10:12:57+00:00\",\"dateModified\":\"2026-02-11T08:01:00+00:00\",\"description\":\"This tutorial shows the main training strategies used by neural networks to 
learn.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/#primaryimage\",\"url\":\"https:\/\/www.neuraldesigner.com\/wp-content\/uploads\/2023\/06\/training_process.svg\",\"contentUrl\":\"https:\/\/www.neuraldesigner.com\/wp-content\/uploads\/2023\/06\/training_process.svg\",\"width\":13341,\"height\":8113},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.neuraldesigner.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Learning\",\"item\":\"https:\/\/www.neuraldesigner.com\/learning\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Machine learning: Training strategy &#8211; tutorial\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.neuraldesigner.com\/#website\",\"url\":\"https:\/\/www.neuraldesigner.com\/\",\"name\":\"Neural Designer\",\"description\":\"Explanable AI Platform\",\"publisher\":{\"@id\":\"https:\/\/www.neuraldesigner.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.neuraldesigner.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.neuraldesigner.com\/#organization\",\"name\":\"Neural 
Designer\",\"url\":\"https:\/\/www.neuraldesigner.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.neuraldesigner.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.neuraldesigner.com\/wp-content\/uploads\/2023\/05\/logo-neural-1.png\",\"contentUrl\":\"https:\/\/www.neuraldesigner.com\/wp-content\/uploads\/2023\/05\/logo-neural-1.png\",\"width\":1024,\"height\":223,\"caption\":\"Neural Designer\"},\"image\":{\"@id\":\"https:\/\/www.neuraldesigner.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/NeuralDesigner\",\"https:\/\/es.linkedin.com\/showcase\/neuraldesigner\/\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Machine learning: Training strategy - tutorial","description":"This tutorial shows the main training strategies used by neural networks to learn.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/","og_locale":"en_US","og_type":"article","og_title":"Neural networks tutorial: Training strategy","og_description":"This tutorial shows the main training strategies used by neural networks to learn.","og_url":"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/","og_site_name":"Neural Designer","article_modified_time":"2026-02-11T08:01:00+00:00","twitter_card":"summary_large_image","twitter_title":"Neural networks tutorial: Training strategy","twitter_description":"This tutorial shows the main training strategies used by neural networks to learn.","twitter_site":"@NeuralDesigner","twitter_misc":{"Est. 
reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/","url":"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/","name":"Machine learning: Training strategy - tutorial","isPartOf":{"@id":"https:\/\/www.neuraldesigner.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/#primaryimage"},"image":{"@id":"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/#primaryimage"},"thumbnailUrl":"https:\/\/www.neuraldesigner.com\/wp-content\/uploads\/2023\/06\/training_process.svg","datePublished":"2025-11-25T10:12:57+00:00","dateModified":"2026-02-11T08:01:00+00:00","description":"This tutorial shows the main training strategies used by neural networks to learn.","breadcrumb":{"@id":"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/#primaryimage","url":"https:\/\/www.neuraldesigner.com\/wp-content\/uploads\/2023\/06\/training_process.svg","contentUrl":"https:\/\/www.neuraldesigner.com\/wp-content\/uploads\/2023\/06\/training_process.svg","width":13341,"height":8113},{"@type":"BreadcrumbList","@id":"https:\/\/www.neuraldesigner.com\/learning\/tutorials\/training-strategy\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.neuraldesigner.com\/"},{"@type":"ListItem","position":2,"name":"Learning","item":"https:\/\/www.neuraldesigner.com\/learning\/"},{"@type":"ListItem","position":3,"name":"Machine learning: Training strategy &#8211; 
tutorial"}]},{"@type":"WebSite","@id":"https:\/\/www.neuraldesigner.com\/#website","url":"https:\/\/www.neuraldesigner.com\/","name":"Neural Designer","description":"Explanable AI Platform","publisher":{"@id":"https:\/\/www.neuraldesigner.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.neuraldesigner.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.neuraldesigner.com\/#organization","name":"Neural Designer","url":"https:\/\/www.neuraldesigner.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.neuraldesigner.com\/#\/schema\/logo\/image\/","url":"https:\/\/www.neuraldesigner.com\/wp-content\/uploads\/2023\/05\/logo-neural-1.png","contentUrl":"https:\/\/www.neuraldesigner.com\/wp-content\/uploads\/2023\/05\/logo-neural-1.png","width":1024,"height":223,"caption":"Neural 
Designer"},"image":{"@id":"https:\/\/www.neuraldesigner.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/NeuralDesigner","https:\/\/es.linkedin.com\/showcase\/neuraldesigner\/"]}]}},"_links":{"self":[{"href":"https:\/\/www.neuraldesigner.com\/api\/wp\/v2\/learning\/3539","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.neuraldesigner.com\/api\/wp\/v2\/learning"}],"about":[{"href":"https:\/\/www.neuraldesigner.com\/api\/wp\/v2\/types\/learning"}],"author":[{"embeddable":true,"href":"https:\/\/www.neuraldesigner.com\/api\/wp\/v2\/users\/122"}],"version-history":[{"count":3,"href":"https:\/\/www.neuraldesigner.com\/api\/wp\/v2\/learning\/3539\/revisions"}],"predecessor-version":[{"id":21818,"href":"https:\/\/www.neuraldesigner.com\/api\/wp\/v2\/learning\/3539\/revisions\/21818"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.neuraldesigner.com\/api\/wp\/v2\/media\/1428"}],"wp:attachment":[{"href":"https:\/\/www.neuraldesigner.com\/api\/wp\/v2\/media?parent=3539"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.neuraldesigner.com\/api\/wp\/v2\/categories?post=3539"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.neuraldesigner.com\/api\/wp\/v2\/tags?post=3539"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}