In this example, we build a machine learning model to predict employee churn and help companies reduce staff turnover.

Thank you for reading this post, don't forget to subscribe!

Employee attrition is one of the biggest challenges for human resources.

It is costly, since keeping an employee is far less expensive than hiring and training a new one.

The goal is to predict who might leave, when they might leave, and why they might leave.

With accurate predictions, organizations can take action early, address the leading causes of turnover, and improve employee retention.

This example is solved with Neural Designer. To follow it step by step, you can use the free trial.

Contents

  1. Application type.
  2. Data set.
  3. Neural network.
  4. Training strategy.
  5. Model selection.
  6. Testing analysis.
  7. Model deployment.

1. Application type

This is a classification project since the variable to be predicted is binary (attrition or not).

The goal here is to model the probability of attrition, conditioned on the employee features.

2. Data set

Data source

The data file employee_attrition.csv contains quantitative and qualitative information about a sample of employees at the company.

The data set contains about 1,500 employees (or instances).

For each, around 35 personal, professional, and socio-economic attributes (or variables) are selected.

Variables

More specifically, the variables of this example are the following.

Demographics

  • Age
  • Gender: Male, Female
  • Marital status: Single, Divorced, Married
  • Over 18: True or False

Job Information

  • Department: Sales, Research & Development, Human Resources
  • Job role: Sales Executive, Research Scientist, Laboratory Technician, Manufacturing Director, Healthcare Representative, Manager, Sales Representative, Research Director, Human Resources
  • Job level: 1–5
  • Job involvement: 1–4
  • Years at company
  • Years in current role
  • Years since last promotion
  • Years with current manager

Compensation

  • Daily rate
  • Hourly rate
  • Monthly income
  • Monthly rate
  • Percent salary hike
  • Stock option level: 0–3

Performance & Engagement

  • Performance rating
  • Job satisfaction: 1–4
  • Environment satisfaction: 1–4
  • Relationship satisfaction: 1–4
  • Work-life balance: 1–4
  • Training times last year

Career History

  • Number of companies worked
  • Total working years
  • Business travel: Non-travel (0), Rarely (1), Frequently (2)
  • Distance from home

Education

  • Education: 1–5
  • Education field: Life Sciences, Human Resources, Medical, Marketing, Technical Degree, Other

Administrative

  • Employee count
  • Employee number
  • Standard hours
  • Over time: True or False

Target Variable

  • Attrition: Loyal or Attrition

We have 48 input variables, which contain the characteristics of every employee.On the other hand, we have 1 target variable, the variable “Attrition” mentioned before.There are 3 constant variables (“EmployeeCount”, “Over18”, and “StandardHours”).They will be set as unused variables for the analysis since they do not provide any valuable information.

Variables distribution

Before starting the predictive analysis, it is important to know the distributions of the variables.

The following pie chart shows the ratio of negative and positive instances.

The chart above shows that the data is unbalanced, i.e., the number of negative instances (1233) is much larger than that of positive instances (237).

We use this information later to design the predictive model properly.

Input-target correlations

The input-target correlations analyze the dependencies between each input variable and the target.

As we can see, the input variables that have more importance in the attrition are “over time” (0.246), “total working years” (0.223), and “years at company” (0.196).

3. Neural network

The neural network takes all the employees’ attributes and transforms them into a probability of attrition.

For that purpose, we use a neural network composed of a scaling layer with 48 neurons, a perceptron layer with 3 neurons, and a probabilistic layer with 1 neuron.

4. Training strategy

The next step is to select an appropriate training strategy that defines what the neural network will learn.

A general training strategy is composed of two concepts:

  • A loss index.
  • An optimization algorithm.

Loss index

As mentioned earlier, the dataset is unbalanced.

To address this, we use the weighted squared error as the error method.

This assigns a weight of 5.20 to positive instances and 1 to negative ones.

With this adjustment, the total weight of positives equals that of negatives.

Optimization algorithm

We use the quasi-Newton method as the optimization algorithm.

Now, the model is ready to be trained. The following chart shows how the training and selection errors decrease with the epochs of the optimization algorithm.

The final errors are 0.285 WSE for training and 0.931 WSE for validation.

5. Model selection

The objective of model selection is to find the network architecture with the best generalization properties, that is, the one that minimizes the error on the selected instances of the data set.

More specifically, we aim to develop a neural network with a selection error of less than 0.931 WSE, the current best value we have achieved.

Order selection algorithms train several network architectures with a different number of neurons and select the one with the smallest selection error.

The incremental order method starts with a small number of neurons and increases the complexity at each iteration. The following chart shows the training error (blue) and the selection error (orange) as a function of the number of neurons.

As we can see, the optimal number of neurons in the hidden layer is 1, resulting in an order selection error of 0.614 WSE, which is far better than the previous one.

6. Testing analysis

The testing analysis assesses the model’s quality to determine its readiness for use in the production phase, i.e., in a real-world situation.

The way to test the model is to compare the trained neural network’s outputs against the real targets for a data set that has been used neither for training nor selection, the testing subset.

For that purpose, we use some testing methods commonly used in binary classification problems.

ROC curve

The ROC curve measures the discrimination capacity of the classifier between positive and negative instances.

The ROC curve should pass through the upper left corner for a perfect classifier.

The following chart shows the ROC curve of our problem.

In this case, the area takes the value of 0.804, which confirms what we saw in the ROC chart, that the model predicts attrition with high accuracy.

Confusion matrix

For classification models with a binary target variable, constructing the confusion matrix is also a good task to test the model. Below, this table is displayed.

Predicted positivePredicted negative
Real positive35 (11.9%)16 (5.44%)
Real negative47 (16%)196 (66.7%)

Binary classification metrics

The following list depicts the binary classification tests. They are calculated from the values of the confusion matrix.

  • Classification accuracy: 78.6% (ratio of correctly classified samples).
  • Error rate: 21.4% (ratio of misclassified samples).
  • Sensitivity: 68.6% (percentage of actual positives classified as positive).
  • Specificity: 80.6% (percentage of actual negatives classified as negative).

In general, these binary classification tests show a good performance of the predictive model.

Nevertheless, it is essential to highlight that this model has greater specificity than sensitivity, showing that it works better when detecting negative instances accurately.

7. Model deployment

Once we know that the model can accurately predict employee attrition, it can be used to evaluate employee satisfaction with the company. This is called model deployment.

The model takes the form of a function that takes an employee’s inputs and provides the predicted output.

Using the predictive model, we can simulate different scenarios and find the most significant factors for the attrition of a given employee.

This information allows the company to act on those variables.

Conclusions

Predicting employee churn helps organizations reduce the high costs of staff turnover.

By analyzing key factors such as job satisfaction, years at the company, and compensation, machine learning models can identify employees at risk of leaving.

With this knowledge, HR teams can take proactive measures to improve retention and strengthen organizational stability.

Related posts