The analysis of generative adversarial network in sports education based on deep learning

GAN and SeqGAN

GAN

The GAN functions as a structural framework rather than a specific network model16. The schematic representation of this network is delineated in Fig. 1.

Fig. 1. The overall structure of the GAN framework.

In Fig. 1, the architecture of the GAN is depicted, consisting primarily of two components: the “Generator” and the “Discriminator”17. Within this configuration, the generator’s sole responsibility is to produce data conforming to the authentic sample distribution, without formal constraints18. While the generator can function independently of a discriminator, the training objectives then diverge: when operating autonomously, the generator requires a manually defined loss function. Such hand-crafted loss functions, however, exhibit limited representational capacity and tend to be narrowly focused, so the model may converge to outcomes that fall short of expectations, particularly in text-processing tasks. GAN offers a partial remedy to this issue. The discriminator within this framework acts as a dynamically self-updating objective function: it seeks to minimize both the probability of a generated sample being classified as authentic and the probability of a genuine sample being deemed false. Under adversarial training, updates to the generator are contingent upon the discriminator’s outcomes, with the aim of minimizing the distribution disparity between the generated and authentic samples19.

Within the GAN framework, the generator is conceptualized as a function \(G(\cdot)\). The sample distribution produced by \(G(\cdot)\) is denoted as \(p_g(x)\), while the distribution of real samples is denoted as \(p_{data}(x)\). The discriminator is abstracted as a function \(D(\cdot)\), whose output is a scalar representing a binary outcome. Given sufficient data sampling, \(D(\cdot)\) can implicitly characterize the gap between \(p_g(x)\) and \(p_{data}(x)\) within the sample space20.

The overarching optimization objective of GAN is to determine a generating network G*, as illustrated in Eq. (1):

$$G^* = \arg\min_G \max_D V(G,D)$$

(1)

The function \(V(G,D)\) serves as a metric for the discriminator’s discriminative ability, where \(\max_D V(G,D)\) measures the distinction between the real distribution \(p_{data}(x)\) and the generated distribution \(p_g(x)\), as expressed in Eq. (2):

$$V(G,D) = \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{x \sim p_g(x)}\left[\log\left(1 - D(x)\right)\right]$$

(2)

The solution to the GAN is bifurcated into two sequential phases. First, \(D^*\) is determined to maximize the discriminator’s objective; then \(G^*\) is sought to minimize the gap between \(p_g(x)\) and \(p_{data}(x)\).

If the discriminator function is continuous, the relationship between the probability density function and the expectation allows Eq. (2) to be transformed into Eq. (3):

$$V(G,D) = \int_x \left[ p_{data}(x)\log D(x) + p_g(x)\log\left(1 - D(x)\right) \right] dx$$

(3)

With the generator held constant, the discriminator attains its maximum where the derivative of the integrand in Eq. (3) vanishes. The resulting optimal discriminator is given in Eq. (4):

$$D^*(x) = \frac{p_{data}(x)}{p_{data}(x) + p_g(x)}$$

(4)
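For clarity, Eq. (4) follows from a pointwise maximization of the integrand in Eq. (3). For fixed \(x\), writing \(a = p_{data}(x)\) and \(b = p_g(x)\), the integrand is \(f(D) = a\log D + b\log(1-D)\), and setting its derivative to zero yields:

$$f'(D) = \frac{a}{D} - \frac{b}{1-D} = 0 \;\Rightarrow\; a(1-D) = bD \;\Rightarrow\; D^* = \frac{a}{a+b} = \frac{p_{data}(x)}{p_{data}(x)+p_g(x)}$$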

The solution derived from Eq. (4) is substituted into \(\max_D V(G,D)\), yielding the expression delineated in Eq. (5):

$$V(G,D^*) = -\log 4 + 2 \cdot JSD\left(p_{data}(x) \,\|\, p_g(x)\right)$$

(5)

In Eq. (5), when the generated distribution \(p_g(x)\) is entirely disjoint from the real distribution \(p_{data}(x)\), the Jensen-Shannon divergence (JSD) is constant and equals \(\log 2\). When the two distributions are identical, the JSD is zero, signifying that the generator has reached its optimum.

Furthermore, minimizing the divergence between the real distribution and the generated distribution amounts to solving for \(G^*\). Define:

$$\max_D V(G,D) = L(G)$$

(6)

Then:

$$G^* = \arg\min_G L(G)$$

(7)

Subsequently, the gradient descent method is employed to find the optimal generator \(G^*\) that satisfies Eq. (7).
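As an illustration of this alternating optimization, a minimal PyTorch sketch of one GAN training step is given below. The networks G and D, the optimizers, and z_dim are illustrative assumptions rather than the configuration used in this study; D is assumed to output probabilities in (0, 1).

```python
import torch
import torch.nn as nn

# Minimal sketch of one alternating GAN update (cf. Eqs. (1)-(2)).
bce = nn.BCELoss()

def gan_step(G, D, opt_G, opt_D, real_batch, z_dim=64):
    batch_size = real_batch.size(0)
    ones = torch.ones(batch_size, 1)    # labels for real samples
    zeros = torch.zeros(batch_size, 1)  # labels for generated samples

    # Discriminator step: maximize log D(x) + log(1 - D(G(z)))
    z = torch.randn(batch_size, z_dim)
    fake = G(z).detach()                # block gradients into G
    loss_D = bce(D(real_batch), ones) + bce(D(fake), zeros)
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # Generator step: the widely used non-saturating variant,
    # maximizing log D(G(z)) instead of minimizing log(1 - D(G(z)))
    z = torch.randn(batch_size, z_dim)
    loss_G = bce(D(G(z)), ones)
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_D.item(), loss_G.item()
```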

SeqGAN

The inherent structure of the GAN exhibits specific limitations. Conceived primarily for the generation of continuous, real-valued data, the native GAN architecture encounters challenges when tasked with directly generating discrete sequences. Notably, it struggles to propagate gradient updates from the discriminative network back to the generative network21,22.

As a remedy, the SeqGAN model is employed for generating college students’ mental health quality evaluation texts23. This model conceptualizes sequence generation as a decision-making process and incorporates reinforcement learning. Within this framework, the generator network assumes the role of a reinforcement-learning agent, the generated discrete sequence is construed as the current state, and the action taken determines the next generated token24,25. Notably, the generator performs gradient optimization only on its policy, relying on the discriminator’s scores for the generated sequence rather than directly computing the policy outcomes. The architectural configuration of the SeqGAN model is depicted in Fig. 2.

Fig. 2. The architectural configuration of the SeqGAN model.

In Fig. 2, the fundamental structure of SeqGAN aligns with the original GAN framework, where both real and generated data are amalgamated for discriminator training. Subsequent to receiving feedback from the discriminator, the generator undergoes updates26. The pivotal components of this structure entail reward computation and strategy gradient updating:

The primary objective of the generator is to maximize the cumulative rewards over the entire generative sequence, as expressed in Eq. (8):

$$J(\theta) = \mathbb{E}\left[R_T \mid s_0, \theta\right] = \sum_{y_1 \in Y} G_\theta\left(y_1 \mid s_0\right) \cdot Q_{D_\varphi}^{G_\theta}\left(s_0, y_1\right)$$

(8)

Within Eq. (8), \(R_T\) signifies the reward for a complete sequence, and \(Q_{D_\varphi}^{G_\theta}(s,a)\) represents the action-value function. The REINFORCE algorithm is employed to estimate the action-value function, using the discriminator’s estimated probability as the reward, as delineated in Eq. (9):

$$Q_{D_\varphi}^{G_\theta}\left(s = Y_{1:T-1}, a = y_T\right) = D_\varphi\left(Y_{1:T}\right)$$

(9)

However, the discriminator can only yield a valid output once the sequence is complete. Since sentence generation seeks to optimize long-term returns, a roll-out strategy based on Monte Carlo search is used to sample the remaining \(T-t\) tokens, and this process is repeated27.
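Concretely, in the standard SeqGAN formulation, the roll-out policy \(G_\beta\) completes an unfinished prefix \(Y_{1:t}\) a total of \(N\) times, and the action value is estimated by the average discriminator score over the completed sequences:

$$Q_{D_\varphi}^{G_\theta}\left(s = Y_{1:t-1}, a = y_t\right) = \begin{cases} \frac{1}{N}\sum_{n=1}^{N} D_\varphi\left(Y_{1:T}^{n}\right), \; Y_{1:T}^{n} \in \mathrm{MC}^{G_\beta}\left(Y_{1:t}; N\right), & t < T \\ D_\varphi\left(Y_{1:t}\right), & t = T \end{cases}$$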

Text generation of college students’ mental health quality evaluation based on SeqGAN

Text generation guided by abstract semantics

In order to achieve targeted text generation for psychological quality evaluation, this study introduces a new variable \(h\) and a hidden variable \(h_0\) within each network. \(h_0\) is associated with the specific attribute that the sentence aims to control, while \(h\) governs the other attributes. Both the generator and discriminator are equipped with an encoder E to update the distribution of \(h_0\)28,29. The constructed model is presented in Fig. 3.

Fig. 3. Process of the training algorithm.

In Fig. 3, the text generation process guided by abstract semantics unfolds through five distinct steps (a high-level sketch of the pipeline follows the list):

1) Data Processing: Involves conventional text preprocessing, encompassing tasks such as text content segmentation, extraction of corresponding indicators, word segmentation, removal of noise words, and establishment of a dictionary.

2) Initialization: The generator, discriminator, and data loader are initialized based on several hyperparameters derived from preprocessing, such as the dictionary size and word index list. Additionally, a corresponding metric object is instantiated.

3) Generator Pre-training: Pre-training the generator is of significant importance, since the discriminator drives parameter adjustments during adversarial training and adequate pre-training ensures model stability. The direction of the generator’s pre-training updates depends solely on the dataset, and maximum likelihood estimation is employed to supervise the training process.

4) Discriminator Pre-training: Supervision signals provided by a pre-trained discriminator more effectively aid the generator in making adjustments. Discriminator pre-training is supervised: both real text and text generated from random semantic input are fed in and labeled as true or false.

5) Adversarial Training: In the adversarial process, the generator functions as a reinforcement-learning agent, using the scores assigned by the discriminator as rewards and updating itself via policy gradients.
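As a high-level illustration of how these five steps fit together, the following Python sketch outlines the pipeline. Every component it receives (preprocessing routine, network constructors, training routines) is a hypothetical placeholder supplied by the caller, not the authors’ actual implementation; only the control flow is shown.

```python
def train_pipeline(preprocess, make_generator, make_discriminator,
                   pretrain_g, pretrain_d, adversarial_step,
                   corpus_path, rounds=100):
    """Skeleton of the five-step procedure in Fig. 3.

    All callables are hypothetical placeholders for the components
    described in steps 1)-5); only the control flow is illustrated.
    """
    vocab, loader = preprocess(corpus_path)   # 1) data processing
    G = make_generator(len(vocab))            # 2) initialization
    D = make_discriminator(len(vocab))
    pretrain_g(G, loader)                     # 3) MLE pre-training of G
    pretrain_d(D, G, loader)                  # 4) supervised pre-training of D
    for _ in range(rounds):                   # 5) adversarial training:
        adversarial_step(G, D, loader)        #    policy-gradient updates
    return G, D
```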

Discriminator structure

The primary task of the discriminator is to ascertain whether a given sentence is real data. Throughout the supervised training process, the data encoded with the true label encompasses both the textual content and the abstract semantics.

The SeqGAN model employs a Convolutional Neural Network (CNN) as the discriminator, a choice validated for its effectiveness in text classification30. Further details regarding the discriminator’s training process are provided in Table 1.

Table 1 Pseudocode for discriminator.
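Since the pseudocode in Table 1 is not reproduced here, a minimal PyTorch sketch of a TextCNN-style binary discriminator of the kind described (convolutions of several widths over word embeddings, max-pooling, and a sigmoid output for the probability of being real) is shown below. All dimensions and filter sizes are illustrative assumptions, not the configuration from Table 5.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CNNDiscriminator(nn.Module):
    """Minimal TextCNN-style binary discriminator (illustrative sizes)."""
    def __init__(self, vocab_size, emb_dim=64, num_filters=100,
                 filter_sizes=(2, 3, 4)):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.convs = nn.ModuleList(
            [nn.Conv2d(1, num_filters, (k, emb_dim)) for k in filter_sizes])
        self.fc = nn.Linear(num_filters * len(filter_sizes), 1)

    def forward(self, tokens):                        # tokens: (B, T)
        x = self.emb(tokens).unsqueeze(1)             # (B, 1, T, E)
        feats = [F.relu(conv(x)).squeeze(3) for conv in self.convs]
        pooled = [F.max_pool1d(f, f.size(2)).squeeze(2) for f in feats]
        return torch.sigmoid(self.fc(torch.cat(pooled, dim=1)))  # P(real)
```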

Generator structure

The primary objective of the generator is to generate a sentence with a high reward value, maximizing the likelihood of being deemed authentic by the discriminator, as articulated in Eq. (8).

In order to address common challenges such as vanishing and exploding gradients during backpropagation, the generator adopts the Long Short-Term Memory (LSTM) structure31. The model iteratively applies an update function g to map the word-embedding representations of the input sequence \(\{x_1, x_2, \dots, x_T\}\) to the hidden vectors \(\{h_1, h_2, \dots, h_T\}\). The function g is formally expressed in Eq. (10):

$$h_t = g\left(h_{t-1}, x_t\right)$$

(10)

The word probability distribution produced by the softmax output layer z, which maps the hidden state to the output, is delineated in Eq. (11):

$$p\left(y_t \mid x_1, x_2, \dots, x_t\right) = z\left(h_t\right) = \mathrm{softmax}\left(c + V h_t\right)$$

(11)

Within Eq. (11), the parameter c denotes the bias term, and V signifies the weight matrix.
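As an illustration of Eqs. (10) and (11), a minimal PyTorch sketch of such a generator follows: an LSTM cell implements the update g, and a linear layer plus softmax implements z. The embedding and hidden dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LSTMGenerator(nn.Module):
    """Minimal generator: LSTM update g (Eq. 10) + softmax layer z (Eq. 11)."""
    def __init__(self, vocab_size, emb_dim=32, hidden_dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.cell = nn.LSTMCell(emb_dim, hidden_dim)   # g(h_{t-1}, x_t)
        self.out = nn.Linear(hidden_dim, vocab_size)   # c + V h_t

    def forward(self, tokens):                         # tokens: (B, T)
        batch = tokens.size(0)
        h = torch.zeros(batch, self.cell.hidden_size)
        c = torch.zeros(batch, self.cell.hidden_size)
        probs = []
        for t in range(tokens.size(1)):
            x_t = self.emb(tokens[:, t])               # embedding of x_t
            h, c = self.cell(x_t, (h, c))              # Eq. (10)
            probs.append(torch.softmax(self.out(h), dim=-1))  # Eq. (11)
        return torch.stack(probs, dim=1)   # p(y_t | x_1..x_t), (B, T, V)
```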

The training process of the generator is bifurcated into two stages: pre-training and adversarial training.

In the supervised pre-training phase, the abstract-semantic encoding is integrated into each step’s real-data token. This integration reinforces the association between tokens and their corresponding abstract semantics, helping the model quickly determine the generation direction in subsequent stages. The generator’s output is a selection probability, and the loss is calculated according to Eq. (12), where \(p_{data}\) is a one-hot distribution:

$$L = -\sum_{t=1}^{T} p_{data}\left(y_t\right) \cdot \log G_\theta\left(y_t \mid Y_{1:t-1}\right)$$

(12)

During unsupervised adversarial training, the input token at each step is encoded with the abstract semantics, a process converse to supervised pre-training. The objective is to leverage the abstract semantics to identify more consistent tokens. Penalties are additionally imposed based on rewards. As depicted in Eq. (13), when the real data and the generation probability remain constant, higher rewards result in smaller losses, whereas lower rewards correspond to greater losses:

$$L = -\sum_{t=1}^{T} p_{data}\left(y_t\right) \cdot \log G_\theta\left(y_t \mid Y_{1:t-1}\right) \cdot Q_{D_\varphi}^{G_\theta}\left(Y_{1:t-1}, y_t\right)$$

(13)
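Under these definitions, both losses reduce to a few lines of code. The sketch below assumes `probs` is the generator’s per-step output distribution from Eq. (11), `targets` holds the chosen token ids (the one-hot \(p_{data}\) simply selects them), and `rewards` holds the action values \(Q\) from Eq. (9); these names are illustrative assumptions.

```python
import torch

def mle_loss(probs, targets):
    """Pre-training loss, Eq. (12): cross-entropy against one-hot p_data.
    probs: (B, T, V) generator output; targets: (B, T) token ids."""
    picked = probs.gather(2, targets.unsqueeze(2)).squeeze(2)  # G_theta(y_t)
    return -torch.log(picked + 1e-8).sum(dim=1).mean()

def policy_gradient_loss(probs, targets, rewards):
    """Adversarial loss, Eq. (13): log-probability of chosen tokens
    weighted by the discriminator-based action values (rewards: (B, T))."""
    picked = probs.gather(2, targets.unsqueeze(2)).squeeze(2)
    return -(torch.log(picked + 1e-8) * rewards).sum(dim=1).mean()
```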

Calculation of reward

The primary function of the reward is to incentivize the generator to produce distributions resembling the real text used in discriminator training, particularly during generator updates. When a generated sentence closely aligns with the characteristics of real text, its associated reward increases32.

The reward, in this context, is a function that assesses the action value at each step of a given policy π. Employing Monte Carlo sampling, this function generates multiple simulated trajectories starting from the current state, using a roll-out generator \(G_\beta\) identical to \(G_\theta\); multiple distinct trajectories may exist under each action. To mitigate variance and obtain a more precise evaluation of action values, the generator is executed N times from the current state to the end of the sequence, yielding a batch of output samples. These trajectories are then used to compute the average return, thereby estimating the value associated with each action.

The detailed calculation process of the reward is outlined in Table 2.

Table 2 Pseudocode for reward calculation.
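As the pseudocode in Table 2 is not reproduced here, a minimal sketch of the N-rollout averaging described above follows. `rollout` is a hypothetical stand-in for the roll-out generator \(G_\beta\) completing each prefix to full length, and `discriminator` returns the probability of a sequence being real.

```python
import torch

def mc_rewards(prefixes, rollout, discriminator, n_rollouts=16):
    """Estimate the action value of each prefix by averaging the
    discriminator's scores over N completed rollouts (cf. Eq. (9))."""
    scores = []
    for _ in range(n_rollouts):
        full_seq = rollout(prefixes)           # complete Y_{1:t} to Y_{1:T}
        with torch.no_grad():
            scores.append(discriminator(full_seq))   # P(real), (B, 1)
    return torch.stack(scores, dim=0).mean(dim=0)    # average return
```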

Experimental design

Selection of data sets

The dataset originates from the mental health quality assessment data covering all high school and university sports disciplines within a specific province, city, and region during 2020–2021. Certain inconsistencies were identified in the original data, including instances of insufficient content length and heterogeneous descriptors such as “excellent,” “good,” “A,” “B,” or simplistic four-character phrases. Following preprocessing steps such as deduplication and the removal of empty and excessively brief texts, a curated subset of 12,087 records, comprising approximately 800,000 words in total, was selected. On average, each student’s evaluation content spans 52 words. The dataset encompasses three pieces of basic information: city, gender, and sports event. Additionally, it incorporates four objective achievement metrics: cognition (RZ), emotion (QX), will (YZ), and adaptation (SY).
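A sketch of the deduplication and length filtering described above is given below, assuming the evaluations sit in a pandas DataFrame with a hypothetical text column named `evaluation`; the dataset’s actual field names and length threshold are not reproduced here.

```python
import pandas as pd

MIN_LEN = 10  # illustrative threshold for "excessively brief" text

def clean_evaluations(df: pd.DataFrame) -> pd.DataFrame:
    """Deduplicate and drop empty or overly short evaluation texts."""
    df = df.dropna(subset=["evaluation"])
    df = df.drop_duplicates(subset="evaluation")
    df = df[df["evaluation"].str.len() >= MIN_LEN]
    return df.reset_index(drop=True)
```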

The detailed structure of the data items is elucidated in Table 3.

Table 3 Examples of data items.

Experimental environment and parameter settings

The experimental environment and parameter settings are shown in Tables 4 and 5. Table 4 enumerates the environmental configuration, encompassing the processor, GPU, memory, operating system, programming language, and deep learning framework; this detailing is provided to ensure the stability and reproducibility of the experimental setup. Table 5 presents the hyperparameter settings employed during the experiments, including the optimizer, word embedding dimensions, hidden vector dimensions, sentence length, and batch size, as well as the parameters for LSTM generator pre-training, CNN discriminator pre-training, and adversarial training. The selection of these parameters is informed by prior research experience, aiming to guarantee the effectiveness and stability of the experiments.

Table 4 Environment configuration.
Table 5 Hyperparameter setting.
