Integrating generative AI and machine learning classifiers for solving heterogenous MCGDM: a case of employee churn prediction

A combination of real and imitative experts is utilized to address the MCGDM problem and rank the alternatives through a series of stages in a heterogeneous environment. The proposed technique comprises four main stages, data collection, generative AI, Multiple Criteria Group Decision Making (MCGDM), and machine learning for predicting employee churn, as illustrated in Fig. 1. In the first stage, data collection, relevant employee data is gathered from the actual dataset, which serves as the real expert, and the important features needed for the employee churn prediction process are selected. The second stage, generative AI, utilizes ChatGPT-4’s capabilities to create multiple virtual expert profiles with diverse fields and experiences to simulate the decision-making process of real experts. Key steps in this stage include selecting relevant criteria for evaluating employee churn, performing pairwise comparisons to determine the relative importance of these criteria, and evaluating employees against the selected criteria for each virtual expert. These evaluations are then used to generate datasets for subsequent analysis. In the third stage, MCGDM techniques are employed to solve the churn prediction problem: AHP is used to determine the weight of each selected criterion, followed by TOPSIS to rank the alternatives based on the weighted criteria. Employees are then classified into categories (A, B, or C) reflecting their likelihood of churn. The output of this stage provides the data for training and testing the machine learning algorithms in the subsequent stage. Finally, Stage 4 focuses on ML classifiers, where various machine learning models are trained and tested on the datasets to function as experts for predicting employee turnover. The performance of these models is thoroughly evaluated to ensure prediction accuracy.

Fig. 1 Flow Chart of the Proposed Methodology.

Data collection and preprocessing

The proposal is based on the IBM Watson dataset, published in 2017 by the Smarter Workforce Institute at IBM, which contains approximately 19,479 employee records and 31 data attributes with no missing values. This structured, tabular dataset is specifically designed for employee churn prediction, helping to determine whether an employee is likely to leave the company based on various factors. The dataset includes a range of work-related attributes, such as job performance, role within the company, job involvement, monthly income, overtime status, and years at the company, as well as personal factors, including age, education level, marital status, and work-life balance. Additionally, it incorporates employee feedback on job satisfaction, workplace environment, and relationships within the company, making it a comprehensive resource for churn analysis. The dataset’s features are categorized into numerical, ordinal, and categorical types, each playing a crucial role in churn prediction. Numerical features include Age, Number of Companies Worked, Hourly Rate, and Monthly Income, among others. Ordinal features include Education Level (ranging from 1 to 5), Job Satisfaction (ranging from 1 to 4), and Work-Life Balance (ranging from 1 to 4). Categorical features include Gender (Male/Female), MaritalStatus (Single, Married, Divorced), and Department (HR, Sales, or R&D). Given its depth and reliability, the dataset is treated as the real expert in the proposed technique, serving as a foundational source for model training and evaluation.

Data preprocessing is a vital step that ensures the dataset is clean, structured, and ready for machine learning models. The process begins with checking for missing values to confirm that the dataset is complete. Next, duplicate records are detected and removed to eliminate bias and redundancy, improving the model’s learning efficiency. Ordinal features are then validated and mapped into appropriate numerical values for accurate representation. Following this, all data is standardized using the Saaty scale, ensuring consistency across features, as machine learning models perform optimally when numerical attributes are on the same scale. This transformation enhances the dataset’s suitability for ranking alternatives in the MCGDM problem. Finally, the dataset is split into two parts (80:20 ratio), with 80% allocated for training and 20% for testing the machine learning classifiers, facilitating robust model evaluation and improving predictive performance.
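As a concrete illustration, the sketch below (in Python, using pandas and scikit-learn) walks through these preprocessing steps under stated assumptions: the file name, the example ordinal mapping, and the interpretation of the Saaty-scale standardization as a linear rescaling onto the 1-9 range are illustrative, not the authors’ exact implementation.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Minimal sketch of the preprocessing pipeline described above.
# The file name and the ordinal mapping are illustrative assumptions;
# the Saaty-scale step is shown as a linear rescaling onto 1-9.

df = pd.read_csv("ibm_hr_employee_data.csv")  # hypothetical file name

# 1. Confirm completeness and remove duplicate records
assert df.isnull().sum().sum() == 0
df = df.drop_duplicates()

# 2. Validate/map ordinal features to numeric codes (example mapping)
df["BusinessTravel"] = df["BusinessTravel"].map(
    {"Non-Travel": 0, "Travel_Rarely": 1, "Travel_Frequently": 2}
)

# 3. Rescale numeric attributes onto the 1-9 Saaty scale
numeric_cols = df.select_dtypes("number").columns
for col in numeric_cols:
    lo, hi = df[col].min(), df[col].max()
    if hi > lo:
        df[col] = 1 + 8 * (df[col] - lo) / (hi - lo)

# 4. 80:20 split for training and testing the ML classifiers
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)
```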

Feature selection

The proposed technique employs a multi-stage feature selection approach that combines the capabilities of Generative AI (ChatGPT-4) with MCDM techniques to identify the most significant features for employee churn prediction. Initially, ChatGPT-4 was used to generate multiple virtual experts, including HR specialists and data analysts, who analyzed the IBM Watson dataset to identify the key attributes influencing employee churn. Each expert contributed insights based on their domain knowledge, ensuring that feature selection was data-driven rather than assumption-based, which enhances objectivity and reduces human bias. By analyzing the IBM Watson dataset, the AI model identified 15 key features with the highest impact on employee churn, including Job Satisfaction, Monthly Income, OverTime, Years at Company, Work-Life Balance, and Environment Satisfaction. Because employee evaluation is a heterogeneous MCGDM problem, experts with varied expertise and backgrounds employ distinct evaluation criteria and methods to assess the alternatives. Each expert determines the most suitable criteria based on their experience, ensuring a comprehensive and balanced evaluation. Once the key features were selected for each expert, the AHP method was applied to prioritize them by performing pairwise comparisons, assigning relative importance scores, and verifying logical consistency through Consistency Ratio (CR) calculations. The primary steps of the feature selection process are illustrated in Fig. 1, while Sect. 2.3 provides a detailed explanation of the feature selection methodology based on Generative AI, and Sect. 2.4 elaborates on the criteria weighting method.

Generative AI

In this research, a newly released feature from OpenAI, which enables the creation of custom GPT models, is utilized. This feature allows the model to be tailored with specific instructions, foundational documentation to be provided, and defined limitations to be set, optimizing the model for particular tasks.

Generative AI, represented by custom GPTs, is employed to generate a set of imitative experts based on the basic information about employees in the provided dataset. In contrast, the insights and evaluations of the real expert are derived directly from the data within the dataset.

Firstly, a custom ChatGPT model named “Expert Guide” was created to act as the primary decision-maker for the key steps in building the AHP model. These steps include identifying the optimal number of virtual experts to be generated and their areas of expertise for solving the problem, as well as identifying the most relevant criteria from the dataset for the real expert. The “Expert Guide” can also be used to conduct the pairwise comparisons required by AHP to determine the criteria weights.

The “Expert Guide” GPT was consulted to determine the optimal number of experts required for solving the churn prediction problem and recommended seven experts from various relevant domains. This number was suggested because it balances the need for diverse perspectives with the practicality of managing inputs and synthesizing the results.

The recommended experts and their respective fields are summarized as follows:

Expert 1: Behavioral Data Scientist.

Expert 2: Workforce Analytics Specialist.

Expert 3: Senior HR Consultant.

Expert 4: Labor Economist.

Expert 5: Employee Engagement Strategist.

Expert 6: Employee Relations Specialist.

Expert 7: Data Scientist and Machine Learning Expert.

By incorporating expert knowledge across these areas, a well-rounded and insightful AHP model will be built for solving the employee churn prediction problem.

The “Expert Guide” recommendation emphasizes the importance of the Behavioral Data Scientist and Senior HR Consultant, as their expertise is crucial for understanding the reasons behind employee turnover and designing effective interventions. Significant importance is also given to the Workforce Analytics Specialist and Employee Engagement Strategist, who offer vital data-driven insights and strategies that can directly impact churn outcomes. Meanwhile, the Labor Economist, Employee Relations Specialist, and Data Scientist play valuable and supportive roles. Their contributions, while essential, are more context-specific, helping to fine-tune the solutions developed by the primary experts.

After determining the number of virtual experts needed to address the problem, custom GPT personas were created based on the virtual profiles provided by the “Expert Guide.” Each expert focuses on a unique area related to churn prediction and employee retention. Expert 1 specializes in data science to analyze employee behavior patterns, identifying early predictors of churn and proactively addressing potential risk factors. Expert 2 develops and applies performance and engagement metrics specifically for churn prediction, offering actionable insights to reduce turnover. Expert 3 advises on HR practices that mitigate turnover risk, using churn analysis insights to enhance retention strategies. Expert 4 examines economic trends and labor market dynamics affecting turnover rates, helping to study churn within broader labor market conditions. Expert 5 identifies and addresses factors related to engagement and job satisfaction that directly impact employee retention. Expert 6 intervenes in potential churn cases by managing the relationship between employees and the organization, handling conflicts, and ensuring workplace regularity through effective communication and policies. Expert 7 utilizes machine learning and predictive models to analyze historical data, identifying patterns and factors that signal turnover risk. Each expert contributes a distinct set of insights and approaches to build a comprehensive strategy for predicting and mitigating employee churn. As a result of generating the seven GPTs, the problem now has eight experts: one real expert based on the information given in the dataset and seven imitative experts created through ChatGPT’s capabilities.

After that, the “Expert Guide” is tasked with identifying the most effective criteria, considering the information about all employees in the dataset, to be utilized by the real expert. A set of criteria with the greatest impact on employee turnover is selected according to the dataset. The selected criteria are EmpEnvironmentSatisfaction, EmpJobInvolvement, EmpJobSatisfaction, RelationshipSatisfaction, and EmpWorkLifeBalance. For simplicity, these criteria are referred to as Work Environment, Engagement Levels, Job Satisfaction, Peer Interactions, and Work-Life Balance.

Once the seven experts are generated, each of them is asked to propose a set of evaluation criteria for employee churn prediction according to their specific field of expertise. Each imitative expert suggested five primary criteria, ultimately resulting in a total of 15 unique top-level criteria once overlapping suggestions were merged. The criteria suggested by all experts are summarized in Table 1. The distribution of decision-makers’ evaluations across the 15 criteria (C1 to C15) is shown in Fig. 2. Each slice of the pie chart represents the percentage of votes attributed to each criterion. C7, C8, and C9 received the highest percentage of votes, indicating that these criteria are the most significant in the decision-making process. In contrast, C3, C4, C6, C11, C12, C13, and C14 each received 2.5% of the total votes, making them less significant than the others.

Table 1 Criteria selected for each expert.
Fig. 2 Criteria Voting Percentages.
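As a rough check on these percentages, the snippet below assumes that all eight decision makers (the seven virtual experts plus the real expert) contribute five criterion votes each, i.e. 40 votes in total, so a criterion proposed by a single expert receives 2.5% of the votes; this interpretation of the vote count is an assumption.

```python
# Back-of-the-envelope check of the voting percentages, assuming eight
# decision makers (seven virtual experts plus the real expert) each cast
# five criterion votes, for 40 votes in total.

total_votes = 8 * 5                        # assumed 40 votes overall
single_vote_share = 1 / total_votes * 100  # share of a criterion chosen by one expert
print(f"{single_vote_share:.1f}%")         # 2.5%
```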

Following this, each expert is asked to conduct pairwise comparisons of their selected criteria to calculate the criteria weights using the AHP technique.

As the final step of the generative AI stage, the experts are asked to evaluate the employees in the provided dataset using the Saaty scale, ranging from 1 to 9. Each expert assesses the criteria as a human expert would, drawing on their background and experience to provide a comprehensive evaluation.
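For illustration, the sketch below shows how an equivalent evaluation request could be issued programmatically through the OpenAI Python SDK; the persona text, prompt format, model name, and response parsing are assumptions, since the study builds its custom GPTs through the ChatGPT interface rather than the API.

```python
from openai import OpenAI

# Illustrative sketch only: the paper configures custom GPTs in the ChatGPT
# interface, whereas this snippet shows the equivalent idea via the API.
# The persona text, prompt wording, and model name are assumptions.

client = OpenAI()  # reads OPENAI_API_KEY from the environment

persona = (
    "You are a Behavioral Data Scientist acting as a decision maker. "
    "Evaluate each employee against your five criteria using the Saaty "
    "scale (integers 1-9) and reply as comma-separated values, one employee per line."
)

employee_rows = [
    "Age=34, MonthlyIncome=5200, OverTime=Yes, YearsAtCompany=3, JobSatisfaction=2",
    "Age=45, MonthlyIncome=9800, OverTime=No, YearsAtCompany=12, JobSatisfaction=4",
]

response = client.chat.completions.create(
    model="gpt-4",  # assumed model identifier
    messages=[
        {"role": "system", "content": persona},
        {"role": "user", "content": "\n".join(employee_rows)},
    ],
)

# Parse the returned Saaty-scale scores into one integer vector per employee
evaluations = [
    [int(v) for v in line.split(",")]
    for line in response.choices[0].message.content.strip().splitlines()
]
print(evaluations)
```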

MCGDM techniques

AHP and TOPSIS are two widely used techniques for solving MCGDM problems. In this stage, AHP is applied to calculate the criteria weights for each decision maker. These weights are then passed to TOPSIS, which ranks the employees in the dataset based on their likelihood of turnover.

The integration of AHP and TOPSIS provides a structured, transparent, and efficient framework for ranking employees based on their likelihood of churn. AHP determines the importance of different churn factors through a hierarchical structure and pairwise comparisons, ensuring that qualitative factors are systematically evaluated. Additionally, the Consistency Ratio test enhances the reliability of feature selection by maintaining logical consistency in expert judgments. Once AHP assigns criteria weights, TOPSIS ranks employees based on their proximity to an ideal retention scenario and distance from a worst-case churn scenario, offering clear, data-driven predictions using Euclidean distance calculations. TOPSIS considers all criteria simultaneously, providing a balanced ranking and ensuring computational efficiency, making it ideal for large-scale HR analytics.

By combining AHP’s structured weight assignment with TOPSIS’s efficient ranking system, this approach merges subjective decision-making with objective ranking, enabling interpretable and data-driven employee retention strategies. Furthermore, integrating Generative AI (GAI) with AHP-TOPSIS automates expert evaluations, enhances ranking accuracy, and enables real-time churn predictions. This combination minimizes human bias, improves decision-making, and ensures a scalable, data-driven workforce management strategy, allowing organizations to proactively identify churn risks and implement effective retention strategies.

AHP technique

After generating decision matrices for each expert using generative AI, the problem now consists of eight distinct decision matrices: seven obtained from the virtual experts and one from the original dataset, which serves as the real expert. The complete AHP hierarchy is displayed in Fig. 3. Each expert GPT is tasked with conducting pairwise comparisons of its selected criteria to calculate the criteria weights using the AHP technique. Additionally, the “Expert Guide” is asked to perform a pairwise comparison of the criteria selected from the original dataset, which is assigned to the real expert. The AHP technique is applied to calculate the criteria weights based on the pairwise comparisons provided by the experts. Considering the overlaps between the criteria, the final weight for each criterion is determined using Eqs. (1) and (2).
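The sketch below illustrates how one expert’s pairwise comparison matrix can be turned into criteria weights with a consistency check, using the principal-eigenvector variant of AHP; the 5×5 matrix and the choice of the eigenvector method (rather than, say, the geometric-mean method) are assumptions for illustration only.

```python
import numpy as np

# Sketch of AHP weight derivation for a single expert, using the
# principal-eigenvector method and Saaty's consistency check.
# The 5x5 pairwise comparison matrix below is purely illustrative.

A = np.array([
    [1,   3,   5,   3,   7],
    [1/3, 1,   3,   1,   5],
    [1/5, 1/3, 1,   1/3, 3],
    [1/3, 1,   3,   1,   5],
    [1/7, 1/5, 1/3, 1/5, 1],
])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)              # principal eigenvalue
weights = np.abs(eigvecs[:, k].real)
weights /= weights.sum()                 # normalized criteria weights

# Consistency Ratio: CR = CI / RI, with Saaty's random index for n = 5
n = A.shape[0]
lambda_max = eigvals.real[k]
CI = (lambda_max - n) / (n - 1)
RI = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32}[n]
CR = CI / RI

print("weights:", np.round(weights, 3))
print("CR:", round(CR, 3))  # judgments are acceptably consistent if CR < 0.10
```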

Consider a multi-criteria group decision-making problem with \(m\) alternatives, \(n\) criteria, and \(p\) decision makers. Let \(DM=\left\{{DM}_{1},{DM}_{2},\dots,{DM}_{p}\right\}\) be the set of \(p\) decision makers, \(A=\left\{{A}_{1},{A}_{2},\dots,{A}_{m}\right\}\) the set of \(m\) alternatives, and \(C=\left\{{C}_{1},{C}_{2},\dots,{C}_{n}\right\}\) the set of \(n\) criteria. Let \(W_{ij}\) denote the weight assigned by decision maker \(i\) to criterion \(j\).

Fig. 3 The Complete AHP Hierarchy Tree.

1. Calculate the weighted sum for each criterion by Eq. (1):

$${S_j}=\sum\limits_{{i=1}}^{p} {{W_{ij}} * {N_{ij}}}$$

(1)

Where Sj is the weighted sum for criterion Cj. Wij is the weight assigned by decision maker i to criterion j. Nij is the number of decision makers evaluating criterion j.

2. Calculate the final weight for each criterion by Eq. (2):

$${W_j}=\frac{{{S_j}}}{{\sum\nolimits_{{j=1}}^{n} {{S_j}} }}$$

(2)

Where Wj is the final weight for criterion Cj.
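A minimal numerical sketch of Eqs. (1) and (2) is given below; the per-expert weight table is illustrative, and criteria that an expert did not select are marked with zero weights (the count of non-zero entries is used as the number of evaluating decision makers).

```python
import numpy as np

# Sketch of Eqs. (1) and (2): aggregate per-expert criterion weights into
# final weights. Rows = decision makers, columns = criteria; zeros mark
# criteria a given expert did not select. The numbers are illustrative.

W = np.array([
    [0.40, 0.30, 0.00, 0.30],   # expert 1
    [0.00, 0.50, 0.25, 0.25],   # expert 2
    [0.35, 0.00, 0.40, 0.25],   # expert 3
])

N = (W > 0).sum(axis=0)         # number of experts evaluating each criterion
S = (W * N).sum(axis=0)         # Eq. (1): weighted sum per criterion
final_weights = S / S.sum()     # Eq. (2): normalized final weights

print(np.round(final_weights, 3))
```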

The weights determined by each expert along with final criteria weights are presented in Table 2. The criteria weights represent the significance of the features used in conjunction with the machine learning models. The importance of these criteria or features is displayed in Fig. 4.

Table 2 Criteria weights.
Fig. 4 Importance of the selected criteria (features).

TOPSIS

The problem is now formulated as an MCGDM problem involving eight experts, 15 criteria, and a set of alternatives (employees). The weights for all criteria have been determined using the AHP technique. The primary role of TOPSIS is to rank employees in order to identify those most likely to leave the organization. TOPSIS seeks the alternative that is as close as possible to the positive-ideal solution while being as far as possible from the negative-ideal solution, which makes it a powerful and intuitive method for solving MCGDM problems. The main steps of TOPSIS are summarized as follows:

1. Construction of the decision matrices based on the decision makers’ assessments.

Each decision maker (DM) creates their own decision matrix based on their expertise and perspective, using the generative AI framework as explained earlier. Each matrix includes the evaluation of all alternatives (employees) against the selected criteria, with values representing the performance or suitability of each alternative relative to each criterion.

2. Construction of the aggregated decision matrix using the geometric mean operator:

The aggregated decision matrix \(D=[d_{ij}]\) is calculated using Eq. (3):

$$D=\left[ {d_{ij}} \right]={\left( {\prod\limits_{k=1}^{{n_{ij}}} {X_{ijk}} } \right)^{1/{n_{ij}}}}$$

(3)

Where [dij] is the aggregated value for alternative i under criterion j. Xijk is the evaluation provided by expert k for alternative i under criterion j. nij is the number of experts who evaluated alternative i under criterion j.

3. Normalization of the aggregated decision matrix:

The normalized matrix \([d_{ij}^{N}]\) is computed as in Eq. (4):

$$\left[ {{d_{ij}}^{N}} \right]=\frac{{{d_{ij}}}}{{\sqrt {\sum\nolimits_{{i=1}}^{m} {{d_{ij}}^{2}} } }}$$

(4)

4. Construction of the weighted normalized decision matrix:

The weighted normalized decision matrix \([d_{ij}^{*}]\) is calculated as in Eq. (5):

$$\left[ {{d_{ij}}^{*}} \right]=\left[ {{d_{ij}}^{N}} \right] * {w_j}$$

(5)

5. Identification of the positive-ideal solution (PIS) and the negative-ideal solution (NIS) using Eq. (6):

The PIS \((d^{+})\) is the best possible solution, while the NIS \((d^{-})\) is the worst possible solution. They are obtained by:

$$\begin{gathered} {d^+}=\max \left( {d_{ij}^{*}} \right){\text{ for benefit criteria, }}\min \left( {d_{ij}^{*}} \right){\text{ for cost criteria}} \hfill \\ {d^-}=\min \left( {d_{ij}^{*}} \right){\text{ for benefit criteria, }}\max \left( {d_{ij}^{*}} \right){\text{ for cost criteria}} \hfill \\ \end{gathered}$$

(6)

6. Calculation of the distance measures:

For each alternative, the distance to the PIS \(\left(S_{i}^{+}\right)\) is measured by Eq. (7), and the distance to the NIS \(\left(S_{i}^{-}\right)\) by Eq. (8), based on the Euclidean distance measure.

$${S_i}^{+}=\sqrt {\sum\limits_{j=1}^{n} {{\left( {d_{ij}^{*} - {d^+}} \right)}^2}} $$

(7)

$${S_i}^{-}=\sqrt {\sum\limits_{j=1}^{n} {{\left( {d_{ij}^{*} - {d^-}} \right)}^2}} $$

(8)

7. Calculation of the relative closeness coefficient to the ideal solution by Eq. (9):

$$C{C_i}=\frac{{S_i}^{-}}{{S_i}^{+}+{S_i}^{-}}{\text{ where }}0 \leqslant C{C_i} \leqslant 1$$

(9)

8. Ranking the alternatives.

Alternatives are ranked based on their closeness coefficient value, with employees ranked higher being more likely to leave the organization.
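The sketch below condenses steps 1-8 (Eqs. 3-9) into a small numerical example with two experts, four employees, and three criteria; the scores and weights are illustrative, and all criteria are treated as benefit criteria.

```python
import numpy as np

# Condensed sketch of the TOPSIS steps (Eqs. 3-9) for a toy problem with
# 4 employees, 3 criteria, and 2 experts. All data are illustrative and
# every criterion is treated as a benefit criterion.

X = np.array([                      # expert decision matrices (Saaty-scale scores)
    [[7, 5, 3], [4, 6, 8], [9, 2, 5], [3, 7, 6]],   # expert 1
    [[6, 5, 4], [5, 7, 7], [8, 3, 5], [4, 6, 6]],   # expert 2
], dtype=float)
w = np.array([0.5, 0.3, 0.2])       # criteria weights from AHP

# Eq. (3): aggregate the experts' matrices with the geometric mean
D = np.prod(X, axis=0) ** (1.0 / X.shape[0])

# Eq. (4): vector normalization of the aggregated matrix
D_norm = D / np.sqrt((D ** 2).sum(axis=0))

# Eq. (5): weighted normalized matrix
V = D_norm * w

# Eq. (6): positive- and negative-ideal solutions (benefit criteria only)
d_pos, d_neg = V.max(axis=0), V.min(axis=0)

# Eqs. (7)-(8): Euclidean distances to PIS and NIS
S_pos = np.sqrt(((V - d_pos) ** 2).sum(axis=1))
S_neg = np.sqrt(((V - d_neg) ** 2).sum(axis=1))

# Eq. (9): closeness coefficient; higher values indicate higher churn risk
CC = S_neg / (S_pos + S_neg)
ranking = np.argsort(-CC)           # employees ranked by likelihood of leaving
print(np.round(CC, 3), ranking)
```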

After ranking the employees, they are classified into three categories based on the quartiles derived from the box plot shown in Fig. 5, with the first category containing those with the greatest likelihood of turnover. Table 3 presents the classification of employee categories. The first category, Class A, contains the highest-ranked employees, identified by a closeness coefficient greater than the third-quartile value (75th percentile, Q3 = 0.64732) obtained from the box plot; this group represents employees at high risk of turnover based on the churn prediction. Class C represents the group at lowest risk of leaving the organization, with closeness coefficients below the first quartile (25th percentile, Q1 = 0.45300). Lastly, Class B comprises those at moderate risk, with closeness coefficients between the first and third quartiles (Q1 and Q3).
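A minimal sketch of this quartile-based categorization is shown below, using the Q1 and Q3 values reported above; the closeness coefficients in the example are illustrative.

```python
import numpy as np

# Sketch of the quartile-based categorization described above, using the
# Q1 and Q3 values reported in the text. `CC` stands for the closeness
# coefficients produced by TOPSIS (illustrative values shown).

CC = np.array([0.71, 0.52, 0.40, 0.66, 0.48])
Q1, Q3 = 0.45300, 0.64732           # 25th and 75th percentiles from the box plot

labels = np.where(CC > Q3, "A",                 # Class A: high churn risk
         np.where(CC < Q1, "C", "B"))           # Class C: low risk, Class B: moderate
print(labels)
```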

Table 3 Employee categories.
Fig. 5 Box Plot of Ranked Employees.

Machine learning classification

After obtaining the employee rankings and classification categories for churn prediction using TOPSIS, the next step is to apply various machine learning algorithms to the datasets to validate the effectiveness of using ML for solving this problem.

ML classification is based on four key concepts: classes, features, training, and testing. The classes represent the target categories the model aims to predict; for the churn prediction problem, there are three classes: A, B, and C. Features are the criteria used to make predictions about the target, and in this case study, 15 different criteria are considered. The datasets provided by the decision makers are used in the training process, where the classification algorithm learns from labeled data by identifying patterns and relationships between the features and their corresponding classes. After training, the model is evaluated on a testing dataset to assess its accuracy in predicting the correct class.

In this approach, nine different types of supervised learning classifiers are used to categorize employees into one of the three categories, mirroring the TOPSIS classification. The training process relies on evaluations obtained from both the real dataset and the datasets generated by the imitative experts. The primary goal of applying ML classifiers to the churn prediction problem is to accurately predict potential turnover and deliver actionable insights that help minimize losses.
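A minimal sketch of this train-and-evaluate cycle is given below, reusing the `train_df`/`test_df` split from the preprocessing sketch; the label column name `ChurnClass` (holding the TOPSIS categories A, B, and C) and the classifier shown are assumptions for illustration.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report

# Sketch of the training/testing cycle: `train_df` and `test_df` come from
# the 80:20 split above, the selected criteria are the features, and the
# TOPSIS category (A/B/C) is the target. `ChurnClass` is a hypothetical
# column name for that target.

feature_cols = [c for c in train_df.columns if c != "ChurnClass"]
X_train, y_train = train_df[feature_cols], train_df["ChurnClass"]
X_test, y_test = test_df[feature_cols], test_df["ChurnClass"]

clf = RandomForestClassifier(n_estimators=150, min_samples_split=5, random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

print("accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
```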

ML classifiers’ parameters

Nine different classifiers are applied to the churn prediction problem to categorize employees into three groups, identifying those most likely to leave the organization.

These classifiers are AdaBoost, Gradient Boosting (CatBoost), Random Forest, Logistic Regression, KNN, Neural Network, SVM, Decision Tree, and Naïve Bayes.

In the proposed methodology, the working parameters of the ML classifiers are as follows. AdaBoost uses 60 estimators with decision trees as the base estimator, the SAMME.R algorithm for classification, and a linear loss function for regression tasks. Gradient Boosting (CatBoost) employs 170 trees, a learning rate of 0.3, a maximum tree depth of 15, replicable training, and a regularization strength of 3. Random Forest runs with 150 trees, without limits on the number of features or tree depth; it stops splitting nodes with 5 instances, and replicable training is not supported. Logistic Regression takes a simple approach with Ridge (L2) regularization, a C value of 1, and no class weights, ensuring balanced predictions. KNN keeps things basic with 5 neighbors, the Euclidean distance metric, and uniform weighting for classification. The Neural Network model is configured with two hidden layers of 50 neurons each; it uses the tanh activation function with the Adam solver, allowing up to 500 iterations with some replicability in training. For SVM, a radial basis function (RBF) kernel with a C value of 1.0 and a tight numerical tolerance of 0.001 is used; it is limited to 100 iterations, ensuring a balance between precision and computational efficiency. The Decision Tree is pruned to keep at least 2 instances in leaves and 5 in internal nodes, with a maximum depth of 100, and splitting stops when 95% of the instances in a node belong to the same class. All these models have been fine-tuned to maximize precision, efficiency, and accuracy, making them highly effective for churn prediction problems.
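The sketch below is a best-effort translation of these settings into scikit-learn and CatBoost constructors; several options (e.g. replicable training, the 95%-majority stopping rule, the SAMME.R variant in recent scikit-learn releases) have no direct equivalent and are approximated or omitted, so this should be read as an assumption-laden approximation rather than the authors’ exact configuration. It reuses `X_train`, `y_train`, `X_test`, and `y_test` from the earlier training sketch.

```python
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from catboost import CatBoostClassifier

# Approximate mapping of the reported hyperparameters onto common library
# constructors. Options without a direct equivalent are omitted.

classifiers = {
    "AdaBoost": AdaBoostClassifier(n_estimators=60),  # SAMME.R was the historical default
    "CatBoost": CatBoostClassifier(iterations=170, learning_rate=0.3,
                                   depth=15, l2_leaf_reg=3, verbose=False),
    "Random Forest": RandomForestClassifier(n_estimators=150, min_samples_split=5),
    "Logistic Regression": LogisticRegression(penalty="l2", C=1.0),
    "KNN": KNeighborsClassifier(n_neighbors=5, metric="euclidean", weights="uniform"),
    "Neural Network": MLPClassifier(hidden_layer_sizes=(50, 50), activation="tanh",
                                    solver="adam", max_iter=500),
    "SVM": SVC(kernel="rbf", C=1.0, tol=0.001, max_iter=100),
    "Decision Tree": DecisionTreeClassifier(min_samples_leaf=2, min_samples_split=5,
                                            max_depth=100),
    "Naive Bayes": GaussianNB(),
}

# Train each classifier and report its accuracy on the held-out test split
for name, model in classifiers.items():
    model.fit(X_train, y_train)
    print(f"{name}: {model.score(X_test, y_test):.3f}")
```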
