1 Introduction

Today, the use of computers and other information technologies (IT) in education is inevitable. Moreover, during crises such as the COVID-19 pandemic, IT becomes even more important than in ordinary times [1]. Thanks to IT, traditional education methods have evolved into concepts such as web-based education, computer-based education, and distance education. Computers, tablets, smartphones, and many other IT devices are used by the majority of people, which has led to the development of individualized education systems. Individualized educational environments offer users learning activities that are independent of time, place and other people. Educational games are one of the best examples of such environments. Games are as effective as other educational tools in terms of achievement [2]. The growth and availability of IT have driven a shift toward online gaming [3]. COVID-19 led millions of people to spend more time at home, which made online games more popular than before [4]. Games are popular among people of all ages and are an inevitable part of social and cultural life because they provide enjoyment, involvement, adrenaline and social interaction. The most recent growing trend is game-based learning (GBL) through online games [5]. GBL is widely used in education [6] across many areas [7], such as mathematics education [8], English language education [9], and science education [10]. GBL has a positive effect on learning [11], and traditional e-learning environments benefit greatly from games because games include educational goals, rules and restrictions [12]. GBL helps learners develop effective learning, critical thinking and problem-solving skills, achievement, cognitive enhancement and intensive motivation [13].
Moreover, GBL is effective in both formal and informal educational settings and improves analysis, synthesis and evaluation skills [14]. It also helps learners solve daily life problems [15]. Traditional games have features such as fun, rules, goals, interactions, results, feedback, conflict, challenge, and creativity [16]. Educational GBL environments share these features but also aim to teach the player certain subjects by combining learning content with game elements. Educational games aim to teach something while also being entertaining. They increase interest, concentration and motivation [17], and thus the player's interest in learning and satisfaction. They therefore provide a flow experience, a phenomenon in which people enjoy and concentrate on an activity without any disruption from outside [18]. Moreover, GBL environments increase student achievement [19] and engagement, which helps learning [20] and makes GBL an effective learning environment [21].

The contribution of game-based learning to education should not be underestimated, but students differ in many ways. Students' preferences and experiences during the game affect the game environment [22]. One of the most important features of web-based learning environments is that they engage learners. Game-based learning methods can enhance standard learning environments by providing individual learning and motivation. If the challenge is higher than the skill of the player, anxiety will occur, and if the opposite happens, boredom might arise. As the skill increases, the challenge must get harder [23]. Student-centered game-based learning has emerged to provide motivation, independence, responsibility and flexibility. Thus, different and adaptive game levels can be generated automatically, but this can be a time-consuming and costly process if it is based on conventional game generation [24]. Game structure, presentation, rules, events, scenarios, objects and players are all important [25] for achieving the best game level. Moreover, gaining attention, informing learning objectives, recalling prior learning, presenting learning content, providing learning guidance, eliciting performance, providing feedback and evaluating performance are other important aspects of game level generation [26]. The process of generating game levels requires speed, accuracy, convenience, and diversity. Random, constructive and search-based generators are three types of procedural content generation (PCG) methods. Density, aesthetics, balance, suitability, symmetry and reachability are important in search-based level generators [27]. Moreover, content generation reduces human workload and provides specific types of content. Level design combines challenge, competition and interaction, all of which make games enjoyable [28]. The literature identifies no single best type of PCG method or algorithm.

In the proposed study, in line with the essentials of PCG, the game environment adapts to the preferences of the player. For this purpose, bidirectional long short-term memory (BiLSTM), a popular deep learning method, is used together with a hybrid of the fuzzy analytic hierarchy process and a genetic algorithm (FAHP-GA). The game starts from several performance criteria defined per time span: question difficulty (Pa), the difficulty level of the questions that the player must answer correctly in the span; obstacle count (Pb), the number of obstacles generated in the span that the player must pass without crashing or being trapped; and coin count (Pc), the number of coins shown in the span for the player to collect. In subsequent game levels, these criteria are recalculated dynamically from the data of previously played levels, according to the player's preferences. Thus, the game environment changes constantly.

The rest of the paper is organized as follows. Section 2 reviews related work on procedural content generation. Section 3 describes the methods used in the proposed system. Section 4 explains the proposed approach to game level generation. Section 5 details the application and evaluation of the proposed system on an educational game. The final section presents the conclusion of the paper.

2 Related work

Procedural content generation is the production of game content by algorithms with limited or no human intervention [29]. Methods for developing game content include constructive, search-based, and machine learning-based techniques [30]. Constructive techniques are fast and effective but are not suitable for complex content. Search-based methods score content according to a fitness function and are the most commonly used approach in game content development. Zafar et al. [27] produced general levels for 2D games using the FI2POP genetic algorithm, with an emphasis on the aesthetics and difficulty of the game. Symmetrical, dense, and accessible game levels were successfully produced in 5 different games. Game level generation provides variety, reduces development time and cost, augments human creativity and enables adaptivity [31]. Typically, rules [32], levels [33] and maps [34] are generated. Search-based PCG uses direct evaluation functions (preferences of the player), simulation-based evaluation functions (agent based) and interactive evaluation functions (eye, speech quality) [35]. Moreover, machine learning [36] and deep learning methods are also available for level generation [37,38,39]. A multi-faceted surrogate model for search-based PCG used a deep learning architecture trained on a large corpus of randomly generated levels, classes and simulations. This system adjusts human designs to the desired gameplay environment [40]. Answer set programming [41] and the Quality Diversity algorithm [42] are among other PCG methods. Data-driven PCG used GA and support vector machines to automatically generate adaptive educational game content. Players achieved greater performance when playing content tailored to their capabilities than when playing uncustomized content [43].
Experience-driven PCG personalizes the player experience through affective and cognitive modeling with real-time adjustment of content [44]. Satisfying player requirements and preferences is important for effective and meaningful game level generation [45].

3 Methods

Game level and content generation is a time-consuming and costly process. There are many game level generation methods, but the most popular one is procedural content generation. This section explains the artificial intelligence and decision-making methods used in the study, namely BiLSTM, FAHP and GA.

3.1 Fuzzy AHP and genetic algorithm

Multi-criteria decision making (MCDM) methods are used for many different purposes to solve complex problems while considering user-defined preferences and criteria [46,47,48]. The Analytic Hierarchy Process (AHP), the most popular MCDM method, is easy to use, flexible and effective because it relies on basic mathematical expressions [49], and it can handle both quantitative and qualitative criteria by changing the priority values in the comparison matrix to obtain more sensitive results. However, AHP cannot sufficiently capture imprecise and uncertain user preferences, and it is criticized [50] for using crisp numbers in its calculations [51]. Sometimes, using exact (crisp) numbers for comparison gives results that are difficult to assess [52]. Therefore, the Fuzzy Analytic Hierarchy Process (FAHP) is needed to handle unforeseeable and imprecise user preferences.

The fuzzy AHP method uses fuzzy numbers to compare alternatives and criteria, which accommodates imprecise judgments and the vagueness of human choices [53]. In daily life, uncertain judgments are used more often than certain ones. FAHP represents users’ choices with triangular fuzzy numbers (TFNs) rather than the crisp values that AHP uses [54]. Because these judgments are based on perception, FAHP expresses user decisions better than AHP [55]. Thus, FAHP can be used effectively in many different areas, such as supplier selection [56] and test sheet preparation [57]. Chang [58] modified traditional FAHP with the Extent Analysis Method, which makes it easy to use among other MCDM methods. In many studies, FAHP has been combined with a genetic algorithm (GA) as a hybrid method to satisfy unforeseeable and imprecise user preferences.

The genetic algorithm is a popular evolutionary algorithm that solves difficult and complex problems by operating on encoded solution parameters [59]. GA selects the best solution candidates and keeps them alive in the population of the solution space [60]. Its basic steps are defining the initial population, evaluating the suitability of the chromosomes in the population with a fitness function, selecting the most suitable chromosomes, and mutating chromosomes. These steps are repeated until the best solution is found or a stop condition is met [61]. GA searches for and optimizes solutions in a solution space with defined mathematical models and functions, and it tries to avoid being trapped in local maxima or minima. Therefore, MCDM methods are combined with and enhanced by GA to find the best solutions to problems [62]. In many studies, the AHP method includes GA as a tool [63,64,65]. Conversely, in some studies, the weights of each criterion for the fitness evaluation of chromosomes in the GA are calculated with AHP [57, 66,67,68]. In this study, unlike the aforementioned ones, a hybrid FAHP-GA model is used for procedural game level generation, and the criteria and sub-criteria values are changed dynamically, automatically and intelligently by the BiLSTM method. The adaptively calculated FAHP weights are used for the fitness evaluation of chromosomes in the GA to generate the best game level for the player’s preference parameters.

In this study, the hybrid FAHP-GA module defines the game dynamics of a new level: question difficulty (Pa), the difficulty level of the questions that the player must answer correctly in the span (MC1); obstacle count (Pb), the number of obstacles generated in the span that the player must pass without crashing or being trapped (MC2); and coin count (Pc), the number of coins shown in the span for the player to collect (MC3). These properties are the main criteria of the game level generation problem. Each main criterion also has sub-criteria, which are shown in the judgment hierarchy of the proposed FAHP-GA method in Table 1.

Table 1 Sub-criteria of proposed FAHP-GA method

The proposed method uses Chang’s FAHP method (first six steps) which is expressed as follows [58]:

Step 1: The qualitative and quantitative criteria and sub-criteria are determined based on the game level properties. Then, pairwise comparisons of the criteria and sub-criteria are made using TFNs instead of the crisp numbers used in AHP (Table 2) [68]. The TFNs were adapted from Kahraman et al. [53] and Chan et al. [69].

Table 2 Crisp and TFN value scale for the proposed study [68]

Step 2: Let \(X = \left\{ {x_{1} ,x_{2} , \ldots ,x_{n} } \right\}\) be an object set, and \(U = \left\{ {u_{1} ,u_{2} , \ldots ,u_{m} } \right\}\) be a goal set. Extent analysis is performed for each object with respect to each goal in turn. Thus, m extent analysis values are obtained for each object:

\(\tilde{M}_{{gi}}^{1} ,\tilde{M}_{{gi}}^{2} , \ldots ,\tilde{M}_{{gi}}^{m}\), where all the \(\tilde{M}_{{gi}}^{j}\) \(\left( {i = 1,~2, \ldots ,n\;{\text{and}}\;j = 1,~2,~ \ldots ,m} \right)\) are TFNs. The steps of extent analysis are defined as follows:

Step 3: The value of fuzzy synthetic extent with respect to the ith object is described as:

$$ S_{i} = \sum\limits_{{j = 1}}^{m} {M_{{gi}}^{j} } \otimes \left[ {\sum\limits_{{i = 1}}^{n} \sum\limits_{{j = 1}}^{m} M_{{gi}}^{j} } \right]^{{ - 1}} $$
(1)

To obtain \(\sum\nolimits_{{j = 1}}^{m} M_{{gi}}^{j}\), perform the fuzzy addition of the m extent analysis values for a particular matrix such that:

$$ \sum\limits_{{j = 1}}^{m} M_{{gi}}^{j} = \left( {\sum\limits_{{j = 1}}^{m} l_{j} ,\sum\limits_{{j = 1}}^{m} m_{j} ,\sum\limits_{{j = 1}}^{m} u_{j} } \right) $$
(2)

and to obtain \(\left[ {\sum\nolimits_{{i = 1}}^{n} \sum\nolimits_{{j = 1}}^{m} M_{{gi}}^{j} } \right]^{{ - 1}}\), perform the fuzzy addition of the \(M_{{gi}}^{j} \left( {j = 1,~2, \ldots ,m} \right)\) values such that:

$$ \sum\limits_{{i = 1}}^{n} \sum\limits_{{j = 1}}^{m} M_{{gi}}^{j} = \left( {\sum\limits_{{i = 1}}^{n} l_{i} ,\sum\limits_{{i = 1}}^{n} m_{i} ,\sum\limits_{{i = 1}}^{n} u_{i} } \right) $$
(3)

and then compute the inverse of the vector above, such that:

$$ \left[ {\mathop \sum \limits_{{i = 1}}^{n} \mathop \sum \limits_{{j = 1}}^{m} M_{{gi}}^{j} } \right]^{{ - 1}} = \left( {\frac{1}{{\mathop \sum \nolimits_{{i = 1}}^{n} u_{i} }},\frac{1}{{\mathop \sum \nolimits_{{i = 1}}^{n} m_{i} }},\frac{1}{{\mathop \sum \nolimits_{{i = 1}}^{n} l_{i} }}} \right) $$
(4)
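
As an illustration of Eqs. (1) to (4), the following minimal Python sketch computes the fuzzy synthetic extents of a small TFN comparison matrix. It is not the implementation used in the study: the function name `fuzzy_synthetic_extents` and the 3 × 3 example matrix are hypothetical, and TFNs are assumed to be stored as (l, m, u) tuples.

```python
from typing import List, Tuple

TFN = Tuple[float, float, float]  # (l, m, u): lower, modal and upper value


def fuzzy_synthetic_extents(matrix: List[List[TFN]]) -> List[TFN]:
    """Return the fuzzy synthetic extent S_i of every row (Eqs. 1-4)."""
    # Eq. (2): fuzzy addition of the m extent analysis values in each row.
    row_sums = [
        (sum(t[0] for t in row), sum(t[1] for t in row), sum(t[2] for t in row))
        for row in matrix
    ]
    # Eq. (3): fuzzy addition over all rows.
    total_l = sum(r[0] for r in row_sums)
    total_m = sum(r[1] for r in row_sums)
    total_u = sum(r[2] for r in row_sums)
    # Eq. (4) inverts the total as (1/u, 1/m, 1/l); Eq. (1) multiplies
    # each row sum by that inverse component-wise.
    return [(l / total_u, m / total_m, u / total_l) for l, m, u in row_sums]


# Hypothetical pairwise comparison matrix for three criteria (assumption).
EXAMPLE_MATRIX = [
    [(1, 1, 1), (2, 3, 4), (4, 5, 6)],
    [(1 / 4, 1 / 3, 1 / 2), (1, 1, 1), (2, 3, 4)],
    [(1 / 6, 1 / 5, 1 / 4), (1 / 4, 1 / 3, 1 / 2), (1, 1, 1)],
]

for i, s in enumerate(fuzzy_synthetic_extents(EXAMPLE_MATRIX), start=1):
    print(f"S_{i} = ({s[0]:.3f}, {s[1]:.3f}, {s[2]:.3f})")
```

Because the inverse in Eq. (4) swaps the lower and upper bounds, each resulting extent remains a valid TFN with l ≤ m ≤ u.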

Step 4: As \(\tilde{M}_{1} = \left( {l_{1} ,m_{1} ,u_{1} } \right)\) and \(\tilde{M}_{2} = \left( {l_{2} ,m_{2} ,u_{2} } \right)\) are two TFNs, the degree of possibility of \(\tilde{M}_{2} \ge \tilde{M}_{1}\) is described as:

$$ V\left( {\tilde{M}_{2} \ge \tilde{M}_{1} } \right) = \sup _{{y \ge x}} \min \left( {\mu _{{\tilde{M}_{1} }} \left( x \right),\mu _{{\tilde{M}_{2} }} \left( y \right)} \right) $$
(5)

and can be equally stated as follows:

$$ V\left( {\tilde{M}_{2} \ge \tilde{M}_{1} } \right) = {\text{hgt}}\left( {\tilde{M}_{1} \cap \tilde{M}_{2} } \right) = \mu _{{\tilde{M}_{2} }} \left( d \right) = \left\{ {\begin{array}{*{20}l} {1,} \hfill & {{\text{if}}\;m_{2} \ge m_{1} } \hfill \\ {0,} \hfill & {{\text{if}}\;l_{1} \ge u_{2} } \hfill \\ {\frac{{l_{1} - u_{2} }}{{\left( {m_{2} - u_{2} } \right) - \left( {m_{1} - l_{1} } \right)}},} \hfill & {{\text{otherwise}}} \hfill \\ \end{array} } \right. $$
(6)

where d is the ordinate of the highest intersection point D between \(\mu _{{\tilde{M}_{1} }}\) and \(\mu _{{\tilde{M}_{2} }}\), as illustrated in Fig. 1.

Fig. 1 Intersection point "d" between two fuzzy numbers M1 and M2 [58]
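
As a worked instance of Eq. (6), assume two hypothetical TFNs that are not taken from the study, \(\tilde{M}_{1} = \left( {2,3,4} \right)\) and \(\tilde{M}_{2} = \left( {1,2,3} \right)\). Since \(m_{2} < m_{1}\) and \(l_{1} < u_{2}\), the third case of Eq. (6) applies:

$$ V\left( {\tilde{M}_{2} \ge \tilde{M}_{1} } \right) = \frac{{l_{1} - u_{2} }}{{\left( {m_{2} - u_{2} } \right) - \left( {m_{1} - l_{1} } \right)}} = \frac{{2 - 3}}{{\left( {2 - 3} \right) - \left( {3 - 2} \right)}} = \frac{{ - 1}}{{ - 2}} = 0.5 $$

This matches the geometric reading: the rising side of \(\tilde{M}_{1}\) and the falling side of \(\tilde{M}_{2}\) cross at x = 2.5, where both membership values equal 0.5. In the reverse direction, \(V\left( {\tilde{M}_{1} \ge \tilde{M}_{2} } \right) = 1\) because \(m_{1} \ge m_{2}\).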

Step 5: The degree of possibility for a convex fuzzy number to be greater than k convex fuzzy numbers \(M_{i} \left( {i = 1,~2, \ldots ,k} \right)\) can be expressed as:

$$ V\left( {M \ge M_{1} ,M_{2} , \ldots M_{k} } \right) = V\left[ {\left( {M \ge M_{1} } \right)\;{\text{and}}\;\left( {M \ge M_{2} } \right)\;{\text{and}}\; \ldots \;{\text{and}}\;\left( {M \ge M_{k} } \right)} \right] = \min \,V\left( {M \ge M_{i} } \right),\;i = 1,~2,~3, \ldots ,k $$
(7)

Assume that \(d^{'} \left( {A_{i} } \right) = \min V\left( {S_{i} \ge S_{k} } \right)\) for \(k = 1,~2, \ldots ,n;\;\;k \ne i\).

Then the weight vector is given by: \(W^{'} = \left( {d^{'} \left( {A_{1} } \right),d^{'} \left( {A_{2} } \right), \ldots ,d^{'} \left( {A_{n} } \right)} \right)^{{\text{T}}}\) where \(A_{i} \left( {i = 1,~2, \ldots ,n} \right)\) are n elements.

Step 6: Normalized weight vectors are defined as:

$$ W = \left( {d\left( {A_{1} } \right),d\left( {A_{2} } \right), \ldots d\left( {A_{n} } \right)} \right)^{{\text{T}}} $$
(8)

where W is a vector of non-fuzzy numbers whose elements are used as the weights of the corresponding genes of the chromosomes in the GA fitness function.
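
Steps 4 to 6 can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in the study: the names `possibility_degree` and `fahp_weights` and the three example synthetic extents are hypothetical, and TFNs are assumed to be (l, m, u) tuples.

```python
def possibility_degree(m2, m1):
    """V(M2 >= M1) for TFNs given as (l, m, u), following Eq. (6)."""
    l1, mid1, _ = m1
    _, mid2, u2 = m2
    if mid2 >= mid1:
        return 1.0
    if l1 >= u2:
        return 0.0
    return (l1 - u2) / ((mid2 - u2) - (mid1 - l1))


def fahp_weights(extents):
    """Normalized crisp weights W from the synthetic extents S_i (Eqs. 7-8)."""
    # d'(A_i): minimum possibility that S_i dominates every other extent.
    d_prime = [
        min(possibility_degree(s_i, s_k)
            for k, s_k in enumerate(extents) if k != i)
        for i, s_i in enumerate(extents)
    ]
    # Normalization turns W' into W so that the weights sum to one.
    total = sum(d_prime)
    return [d / total for d in d_prime]


# Hypothetical synthetic extents for three criteria (assumption).
EXAMPLE_EXTENTS = [(0.40, 0.60, 0.90), (0.15, 0.30, 0.50), (0.05, 0.10, 0.20)]
weights = fahp_weights(EXAMPLE_EXTENTS)
print([round(w, 3) for w in weights])
```

In this example the third criterion receives a zero weight because its extent does not overlap with that of the first criterion, an outcome that the extent analysis method permits.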

Step 7: Then, the game level properties (Pa, Pb, Pc) are encoded as genes in the chromosomes, and the fitness function is evaluated using the related weight in W for each gene (Fig. 2).

Fig. 2 Encoded game level properties as chromosome

SLP denotes the expected properties of the following game level, which will be generated according to the previous level’s difficulty. CLP1, CLP2, …, CLPn are the candidate game level properties in the solution space of the problem. Each gene corresponds to a game level property. SLPG1 to SLPGm are the genes (criteria) of the searched game level property chromosome, where m is the gene (criterion) count in the chromosome. CLPiG1 to CLPiGm are the genes (criteria) of the ith candidate game level property chromosome, where n is the number of candidates in the solution search area and m is the gene count in a chromosome (i = 1, 2, …, n; j = 1, 2, …, m). Each gene is multiplied by its related weight in the W matrix as the quotient of precedence. SCM is the related sub-criteria synthetic value in the comparison matrix for the game level properties (genes). Thus, the fitness function is expressed as:

$$ F\left( x \right) = \mathop \sum \limits_{{i = 1}}^{n} W_{i} {\text{SCM}}_{{{\text{cs}}}} $$
(9)

where c is the index of \({\text{CLP}}_{i} {\text{G}}_{j}\) in the corresponding candidate sub-criteria comparison matrix (SCMj), and s is the index of SLPGj in the searched sub-criteria comparison matrix (SCMj). Tables 3, 4 and 5 show the sub-criteria comparison matrices SCM1 to SCMm, where m is the gene (criterion) count. These comparison matrices are fixed in the proposed system.

Table 3 Question difficulty criterion (MC1) comparison matrix SCM1
Table 4 Obstacle count criterion (MC2) comparison matrix SCM2
Table 5 Coin count criterion (MC3) comparison matrix SCM3
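
The fitness evaluation of Eq. (9) can be sketched as below. Several points are assumptions made only for illustration: the sum is read as running over the genes of one chromosome, each sub-criteria comparison matrix is treated as a crisp 3 × 3 matrix indexed by [candidate sub-criterion][searched sub-criterion], and the matrix values and weights are hypothetical rather than those of Tables 3, 4 and 5.

```python
def fitness(candidate, searched, weights, scms):
    """F(x) as the weighted sum of SCM_j[c][s] over all genes (Eq. 9).

    candidate, searched: sub-criterion indices, one per gene (Pa, Pb, Pc).
    weights: FAHP weight of each gene.
    scms: one sub-criteria comparison matrix per gene.
    """
    return sum(
        w * scm[c][s]
        for w, scm, c, s in zip(weights, scms, candidate, searched)
    )


# Hypothetical matrix: 1.0 when the candidate sub-criterion equals the
# searched one, smaller values the further apart the two are.
SCM = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.5], [0.2, 0.5, 1.0]]
WEIGHTS = [0.5, 0.3, 0.2]  # hypothetical FAHP weights for Pa, Pb, Pc
SEARCHED = (2, 1, 0)       # hypothetical target: high Pa, medium Pb, low Pc

print(fitness((2, 1, 0), SEARCHED, WEIGHTS, [SCM] * 3))
print(fitness((0, 1, 0), SEARCHED, WEIGHTS, [SCM] * 3))
```

Under these assumptions, a candidate that matches the searched properties on every gene scores highest, and a mismatch on a heavily weighted gene costs more than one on a lightly weighted gene.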

Step 8: Each chromosome in the solution population is evaluated with the fitness function F(x) to determine its suitability. Then, the best chromosomes, those that maximize F(x), are carried over to the new population according to the election rate. At the same time, new chromosomes are generated and mutated for the new population. In this study, the population size was 100, the mutation rate was 0.25, the election rate was 0.15, and the maximum generation count was 200. The GA ends when the stop condition is met or the maximum generation count is reached. The chromosomes of the last population, that is, the game level properties, are listed as the best solutions and sorted by their fitness values from largest to smallest.
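
The loop of Step 8 can be sketched with the reported parameters (population size 100, mutation rate 0.25, election rate 0.15, at most 200 generations). The rest is assumed for illustration: the function `run_ga`, the uniform crossover used to create new chromosomes, the optional `stop_fitness` threshold, and the toy fitness function are hypothetical and not taken from the study.

```python
import random


def run_ga(fitness, gene_sizes, pop_size=100, mutation_rate=0.25,
           election_rate=0.15, max_generations=200, stop_fitness=None, seed=0):
    """Return the final population sorted from highest to lowest fitness."""
    rng = random.Random(seed)

    def random_chromosome():
        return tuple(rng.randrange(n) for n in gene_sizes)

    population = [random_chromosome() for _ in range(pop_size)]
    n_elite = max(1, int(pop_size * election_rate))
    for _ in range(max_generations):
        population.sort(key=fitness, reverse=True)
        if stop_fitness is not None and fitness(population[0]) >= stop_fitness:
            break  # stop condition met
        elite = population[:n_elite]  # best chromosomes survive unchanged
        children = []
        while len(children) < pop_size - n_elite:
            a, b = rng.choice(elite), rng.choice(elite)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            if rng.random() < mutation_rate:
                i = rng.randrange(len(child))
                child[i] = rng.randrange(gene_sizes[i])  # mutate one gene
            children.append(tuple(child))
        population = elite + children
    return sorted(population, key=fitness, reverse=True)


# Toy fitness (assumption): number of genes that match a searched chromosome.
TARGET = (2, 1, 0)


def toy_fitness(chromosome):
    return sum(a == b for a, b in zip(chromosome, TARGET))


ranked = run_ga(toy_fitness, gene_sizes=(3, 3, 3), stop_fitness=3)
print(ranked[0], toy_fitness(ranked[0]))
```

Keeping the elected chromosomes unchanged means the best fitness never decreases between generations, while mutation keeps the search from stalling on a single candidate.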

3.2 Long short-term memory

Deep learning algorithms are used in many areas such as image processing, classification and natural language processing [70]. These methods differ from classical Artificial Neural Networks (ANN) in various ways, such as the number of layers. Recurrent Neural Networks (RNN) are well-known deep learning algorithms [71] that can process input sequences such as time series, but they may suffer from vanishing or exploding gradients [72]. These gradient problems hamper learning, so RNNs cannot always capture the correct relations within sequences. For this reason, LSTM, a special RNN variant, was developed [73]. However, LSTM can fail in sequential tasks such as time series because it processes the data in only one direction [74]. For this reason, BiLSTMs, which consist of two LSTMs (one in the backward and one in the forward direction), were developed to process the input sequence over all time steps in both directions (Fig. 3). The first LSTM processes the input sequence from backward to forward, while the second one does the opposite [75]. Both LSTMs operate on a copy of the same input sequence [76]. Therefore, the BiLSTM system can learn the problem faster and more effectively.

Fig. 3 BiLSTM schema [75]
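
To make the two-direction idea concrete, the following pure-Python sketch runs an untrained BiLSTM forward pass over a short sequence and concatenates the forward and backward hidden states at each time step. It only illustrates the data flow: the weights are random, no training or output layer is included, and the helper names, hidden size and example inputs are hypothetical. A real model would normally be built with a deep learning library.

```python
import math
import random


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_params(rng, input_dim, hidden):
    """Random weights for one LSTM: W (input), U (recurrent) and bias b."""
    rows = 4 * hidden  # input, forget and output gates plus candidate cell
    W = [[rng.uniform(-0.5, 0.5) for _ in range(input_dim)] for _ in range(rows)]
    U = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(rows)]
    b = [0.0] * rows
    return W, U, b


def lstm_pass(xs, params, hidden):
    """Run one LSTM over the sequence xs and return every hidden state."""
    W, U, b = params
    h, c = [0.0] * hidden, [0.0] * hidden
    states = []
    for x in xs:
        z = [
            sum(w * v for w, v in zip(W[r], x))
            + sum(u * v for u, v in zip(U[r], h))
            + b[r]
            for r in range(4 * hidden)
        ]
        i = [_sigmoid(v) for v in z[:hidden]]
        f = [_sigmoid(v) for v in z[hidden:2 * hidden]]
        o = [_sigmoid(v) for v in z[2 * hidden:3 * hidden]]
        g = [math.tanh(v) for v in z[3 * hidden:]]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        states.append(h)
    return states


def bilstm(xs, fwd_params, bwd_params, hidden):
    """Concatenate forward and backward hidden states at every time step."""
    forward = lstm_pass(xs, fwd_params, hidden)
    backward = lstm_pass(xs[::-1], bwd_params, hidden)[::-1]
    return [hf + hb for hf, hb in zip(forward, backward)]


rng = random.Random(0)
HIDDEN = 4
fwd, bwd = init_params(rng, 3, HIDDEN), init_params(rng, 3, HIDDEN)
# Hypothetical (Pa, Pb, Pc) proportions for four consecutive time spans.
spans = [[0.8, 0.6, 0.5], [0.7, 0.7, 0.4], [0.9, 0.5, 0.6], [0.6, 0.8, 0.7]]
features = bilstm(spans, fwd, bwd, HIDDEN)
print(len(features), len(features[0]))
```

At the first time step, the forward half of the output has seen only the first input, while the backward half already summarizes the whole sequence, which is the property that distinguishes a BiLSTM from a one-directional LSTM.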

3.3 Evaluation metrics

In this study, a hybrid FAHP-GA model is used for procedural game level generation, with criteria and sub-criteria values that change dynamically, automatically and intelligently. The adaptively calculated FAHP weights are used for the fitness evaluation of chromosomes in the GA to generate the best game level for the player’s preference parameters. The quality of the generated game levels must also be evaluated, because the value of a procedural content generation algorithm depends largely on the quality of the content it generates [77]. Many computational metrics exist, mostly for 2D games, whereas the educational game developed in this study is a 3D game for which 20 levels were generated. Suitable computational metrics were therefore chosen for evaluation as follows [78]:

  • Leniency is the player’s challenge in a level, calculated as the sum of the lenience values of all the objects divided by the level length. It is then normalized into the range [0–1]; a high leniency value means a more challenging game level.

  • Density is the distribution of the objects on the same axes in the game level, calculated as the sum of the objects on the defined axes and normalized into the range [0–1]. A higher value means a denser level.

  • Negative space is the percentage of empty space that the player can use to escape or for other purposes. Higher values mean a more enjoyable and aesthetically pleasing game level.

  • Balance measures how the objects are well distributed.

  • Reachability measures the proportion of elements which are reachable by the player in the level.
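
Three of the metrics above can be sketched on a simplified level representation. Everything specific here is an assumption for illustration: the level is flattened to a 2D character grid, the per-object lenience values are hypothetical, the level length is taken as the number of columns, and density is interpreted as the largest number of objects sharing one column divided by the number of rows. The normalization of leniency into [0–1] across levels is omitted.

```python
# Hypothetical lenience value per object type: obstacles ('#') and questions
# ('?') raise the challenge, coins ('C') lower it. '.' marks an empty cell.
LENIENCE = {"#": 1.0, "?": 0.5, "C": -0.5}


def negative_space(level):
    """Fraction of empty cells in the level."""
    cells = "".join(level)
    return cells.count(".") / len(cells)


def raw_leniency(level):
    """Sum of the lenience values of all objects divided by the level length."""
    total = sum(LENIENCE.get(cell, 0.0) for row in level for cell in row)
    return total / len(level[0])


def density(level):
    """Largest share of occupied cells found in any single column."""
    rows = len(level)
    column_counts = [
        sum(1 for row in level if row[col] != ".")
        for col in range(len(level[0]))
    ]
    return max(column_counts) / rows


# Hypothetical level with 3 rows and 8 columns.
LEVEL = [
    "..#..C..",
    ".?..#...",
    "....C.#.",
]

print(negative_space(LEVEL), raw_leniency(LEVEL), density(LEVEL))
```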

Furthermore, the performance of the proposed FAHP-GA method is compared with that of the AHP and FAHP methods. Statistical methods and the Pearson correlation were used to analyze the results. The performance of the BiLSTM is also compared with other prediction methods such as LSTM, ANN and several regression methods.

4 Proposed procedural game level generation system

Procedural game content generation methods mostly rely on the direct evaluation of player preferences. The challenge goal of the game must be kept uncertain by producing variable difficulty levels, multiple level goals and randomness [45]. Thus, in this study, the BiLSTM-based hybrid FAHP-GA algorithm generates variable levels with multiple goals from the player’s preferences, which are obtained from the previous game level. In the first level, the player starts the game with predefined main criteria preferences, which are medium for the Pa, Pb and Pc parameters. Each generated game level has the same duration, which is 30 min. At the end of each 20 s time period (span) in the level, the Pa proportion (correct answer count/total question count in the time period), the Pb proportion (obstacles avoided without crashing or being trapped/total obstacles in the time period), the Pc proportion (collected coins/total coins in the time period), and the total score are calculated to produce the dataset. At the end of the level, the criteria proportions (input values) and the total score (output value) are given to the BiLSTM for training. Then, in the following game level, the Pa, Pb and Pc proportions are obtained again at the end of each 20 s span. The current proportions are used to predict the subsequent total score. If the predicted total score is greater than the real total score of the previous level, the game level is easy. Thus, questions should be made more difficult, obstacles should be increased, and the coin count should be reduced for the following game level. If the predicted total score is smaller than the real total score of the previous level, the game level is hard. Thus, questions should be made less difficult, obstacles should be decreased, and the coin count should be increased for the following game level.
The values of the Pa, Pb and Pc game properties (sub-criteria) are defined by the interval into which the difference between the current proportion and the related proportion in the previous level falls (Table 6). The corresponding interval of a game property is calculated with (current proportion − previous level proportion)/(current proportion + previous level proportion). To sum up, the answered question proportion, avoided obstacle proportion and collected coin proportion are used as inputs for the BiLSTM, and the total level score is used as its output. From the input proportion values of the current level, the total score is predicted and compared with the previous level’s total score, which determines whether the current level is difficult or easy.

Table 6 Values of the sub-criteria according to the intervals of the difference
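
The adaptation rule described in this section can be sketched as follows. The function names are hypothetical, and two details are assumptions that the text does not specify: the behavior when the predicted and previous scores are equal, and the guard against a zero denominator. The interval boundaries of Table 6 are not reproduced here.

```python
def normalized_difference(current, previous):
    """(current - previous) / (current + previous) for one criterion proportion."""
    denominator = current + previous
    if denominator == 0:
        return 0.0  # assumption: no change when both proportions are zero
    return (current - previous) / denominator


def next_level_adjustment(predicted_score, previous_score):
    """Direction in which Pa, Pb and Pc should move for the following level."""
    if predicted_score > previous_score:  # the level is easy: make it harder
        return {"question_difficulty": "increase",
                "obstacle_count": "increase",
                "coin_count": "decrease"}
    if predicted_score < previous_score:  # the level is hard: make it easier
        return {"question_difficulty": "decrease",
                "obstacle_count": "decrease",
                "coin_count": "increase"}
    # Assumption: keep the current settings when the scores are equal.
    return {"question_difficulty": "keep",
            "obstacle_count": "keep",
            "coin_count": "keep"}


# Hypothetical values: a Pa proportion rising from 0.6 to 0.8, and a predicted
# score of 1300 against a previous real score of 1260.
print(normalized_difference(0.8, 0.6))
print(next_level_adjustment(1300, 1260))
```

The normalized difference always lies in [−1, 1] for non-negative proportions, which is what allows it to be mapped onto a fixed set of intervals.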

Once the sub-criteria values (input values for FAHP-GA) are determined from the intervals, which are defined according to the total score difference between the current level and the previous level, the FAHP-GA method generates the next level according to the defined criteria values. The GA ends when the stop condition is met or the maximum generation count is reached. The chromosomes of the last population (output values of FAHP-GA), that is, the game level property parameters (Pa, Pb, and Pc), are listed as the best solutions and sorted by their fitness values from largest to smallest.

5 Application and evaluation

5.1 Game genre and scenario

The proposed hybrid method was applied in an educational game developed for this study. The game aims to train users, either consciously or unconsciously, by asking questions according to the progress and level of the user, as well as by having them perform certain tasks and learn new information. The game is a task-based adventure, puzzle and role-playing game. The player's progress depends on performance. At the end of each level, the collected coins (awards), the number of correctly answered questions, the number of obstacles avoided without crashing or being trapped, and the total level score are calculated. For example, the obstacles in the game are reduced if the player spends a lot of time, while the number and difficulty of the obstacles may increase for fast-moving players. Similarly, when the player answers most questions correctly, the difficulty level of new questions may increase. Depending on such criteria, the total score obtained from the game can vary, and the game environment can be shaped according to the player's preferences.

5.2 Application and evaluation on game

At the beginning of the game, the player can adjust the Pa criterion (MC1), the Pb criterion (MC2) and the Pc criterion (MC3) through the game preferences menu. For example, when the player wants to play a difficult level, the MC1, MC2 and MC3 criteria values can be set to very difficult, many and many, respectively. The main criteria comparison matrix for this example is given in Table 7.

Table 7 Main criteria comparison matrix as TFNs

Each candidate game level is evaluated on its properties using the FAHP-GA hybrid model, and the game levels with the highest GA fitness values are then listed as the result. The FAHP produces weights for each criterion in order to evaluate chromosome fitness in the GA (Fig. 4). The calculated criteria weights for each generated game level are shown in Fig. 5.

Fig. 4 The fitness values of the generated game level properties according to the FAHP-GA hybrid model

Fig. 5 The FAHP-GA criteria weights for each game level

The performance of the proposed FAHP-GA algorithm is compared with the AHP and FAHP methods under the same criteria, based on the distribution of the game level suitability points. Descriptive statistics were computed for these algorithms. The standard deviation is 2.084 for FAHP-GA, 1.972 for FAHP, and 1.926 for AHP. A higher standard deviation indicates that the generated game levels are more diverse; on this measure, the FAHP-GA algorithm performed better. In addition, the correlations between AHP, FAHP and FAHP-GA were analyzed with the Pearson correlation coefficient. The calculated correlation is 0.638 between AHP and FAHP-GA, 0.724 between FAHP and FAHP-GA, and 0.847 between AHP and FAHP. These coefficients show that FAHP-GA differs from the other algorithms. Also, the search for the best game level has to be fast and accurate, and GA accelerates this search compared with FAHP or AHP alone, which is why FAHP and GA are hybridized in this study.
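
The two statistics used in this comparison can be computed as in the sketch below. The suitability points are hypothetical placeholders, not the data of the study, and the sample (n − 1) form of the standard deviation is an assumption, since the text does not state which form was used.

```python
import math


def sample_std(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical suitability points of the same five levels under two methods.
method_a = [3.1, 5.4, 2.2, 7.8, 6.0]
method_b = [3.5, 4.9, 2.9, 6.1, 6.6]

print(round(sample_std(method_a), 3), round(sample_std(method_b), 3))
print(round(pearson(method_a, method_b), 3))
```

A coefficient close to 1 would mean that two methods rank the levels almost identically, so lower coefficients against FAHP-GA indicate that it produces a different distribution of suitability points.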

In this study, 20 levels were generated and tested with computational metrics, namely leniency, density, negative space, balance, and reachability (Fig. 6). The results showed that the generated levels are generally balanced and reachable. Leniency (in other words, challenge) changes between levels because the proposed game level generator is adaptive and dynamic; to provide challenge, the negative space and density of the levels also change throughout the developed educational game (Fig. 7).

Fig. 6 Computational metrics results for generated levels

Fig. 7 Diversity of the level (obstacles) in the educational case study game: (a) low density, high negative distance; (b) high density, low negative distance

Computational metrics alone may not be enough. Therefore, to examine the effectiveness and quality of the generated game levels, the players evaluated the game in terms of difficulty (easy, medium, difficult) for leniency (challenge), visual aesthetics (low, medium, high) for balance and density, and enjoyment (low, medium, high) for negative space and reachability [77]. The participants were 18 secondary school students (10 male, 8 female) who played the game (Table 8). The results show that 61.38% of the generated game levels were rated difficult, 71.66% were rated as having high visual aesthetics, and 63.05% were rated highly enjoyable.

Table 8 Comparison of the generated game levels according to the 18 students

The performance of the BiLSTM is compared with other prediction methods, namely LSTM, ANN, Support Vector Regression (SVR), Decision Tree Regression (DTR), and Random Forest Regression (RFR). These methods were applied to the same data with optimized parameters in order to predict whether the game level is difficult or not in the corresponding time span for game level 1, which has an actual score of 1260 (Table 9). BiLSTM gives the best result, with the minimum error rate among these prediction methods.

Table 9 Comparison of the BiLSTM results with other methods

Based on the evaluation results, the proposed study makes the following contributions:

  • Procedural game level generation is a costly and time-consuming problem. With the proposed system, adaptive game levels can be generated dynamically and automatically according to the player preferences with the help of artificial intelligence methods.

  • BiLSTM makes fuzzy AHP-GA dynamic and adaptive for game level generation, and this is the first study for this purpose. Also, the fuzzification of AHP reflects player preferences more suitably.

  • Deep learning, fuzzy decision-making and genetic algorithm provide an effective and interesting perspective for procedural game level generation problem. The use of hybrid systems is likely to be more popular in the future.

There are also some limitations, which are presented here as suggestions for further studies by interested researchers:

  • Application parameters of the proposed methods in the system may be optimized in detail for more improvements.

  • This study was evaluated on an educational game developed for this purpose, but further comparison and evaluation may be done with other game genres.

With the proposed BiLSTM-based dynamic FAHP-GA hybrid model applied to the developed game, game levels can be generated in a fast, stable, balanced and adaptive way. Furthermore, by using GA, the search space of a problem is explored quickly to find the peak values without being trapped in local minima or maxima. For this reason, GA is combined with many methods in the literature to produce the best, objective, reliable and robust solutions to complex problems.

6 Conclusion

Increasing demand for games raises the need for content. Generating and evaluating procedural game levels is a difficult and time-consuming process. Also, game levels must be dynamic and reshaped according to the players’ preferences. The proposed method is the first study that combines player preferences with FAHP-GA. Moreover, criteria and sub-criteria values are changed dynamically, adaptively and automatically according to the player’s preferences and performance in the game with the help of the BiLSTM method. AHP can calculate preferences for game level generation with numerical values, but it cannot fully capture unforeseeable and uncertain human choices. Hence, to handle these issues, the FAHP method, which uses fuzzy set theory with TFNs, is employed and enhanced with GA to obtain more reliable and better-performing results. The weights are calculated by the FAHP as the synthetic values of the pairwise comparisons and are used for fitness function evaluations in the GA. Thus, game levels are generated automatically and intelligently according to the player’s performance and preferences in the game. In future studies, not only game levels but also whole games may be generated by adding other artificial intelligence methods.