Introduction

Benefiting from the revolution of SOA (service-oriented architecture) [13] and cloud technology, microservice techniques have been developed and applied in many applications [26]. Compared with microservices, traditional single applications have many limitations, especially when there are large numbers of users in different regions with different business needs. A single application means that one WAR package contains all the functions of the project. The scalability, fault tolerance, stability, and robustness of single applications are clearly insufficient for the rapid development of the mobile internet, which serves large-scale distributed concurrent users. Therefore, research on microservice-based applications has attracted wide attention [14, 26]. Microservices decompose a traditional single application horizontally or vertically into independent microservices to meet different business and functional requirements. Each decomposed service can run multiple instances individually.

Microservices are flexible, scalable, and highly available. (a) Flexibility means that each service instance can be flexibly deployed on multiple servers or cloud resource centres, provided that sufficient network communication bandwidth and computing and storage resources are available [10]; (b) scalability is the ability to start and shut down instances according to users' concurrency needs, and these operations should be applicable to a dedicated service module. For example, in an e-commerce microservice system, concurrent user payment requests may suddenly increase at a certain time while other services (such as the user registration service and the ERP service) remain unchanged; in that case, the system should only expand the payment service capacity rather than the entire application. Therefore, the flexible extension mechanism of microservices can provide users with highly concurrent services while saving resources. (c) High availability is another typical feature of microservices. The resources required by a microservice instance are relatively small, and each microservice can start multiple instances at the same time, so multi-instance hot standby can be applied to certain key services to improve the stability of the entire application. In summary, owing to the above characteristics, microservices are an effective approach when there are large numbers of users, access devices, and concurrent requests [1, 11] in mobile internet and internet-of-things applications.

Many large-scale network application companies and enterprises (e.g., Amazon [27], IBM [6], eBay [24], Twitter [19], Alibaba [21], Tencent [35]) are gradually adopting microservice-based architectures. Microservices can provide powerful and flexible expansion in many applications. However, some problems still need to be solved urgently. One of the most critical is the microservice deployment and startup optimization problem: microservices at multiple locations need to be started to provide services for a large number of users distributed over a wide area. How and where should the instances of these microservices be started under the constraints of hardware resources, service delays, and service reliability? This is the core problem of service centre configuration and scheduling. The difficulty of this problem lies in the following: (a) many factors (computing or storage resource usage rates, service efficiency, and load-balancing indicators) in the various servers and service centres constrain each other, and it is difficult to achieve an optimal solution on all targets in all centres; (b) the computational complexity is high, and it is difficult to find the optimal solution in polynomial time using traditional optimization algorithms; (c) the startup and service requests of microservice instances are dynamic to a certain extent, and the solution strategy needs to be computed quickly to meet actual needs.

The contributions of this paper are as follows: (a) a multi-objective optimization model is constructed for the microservices deployment and startup strategy. (b) A knowledge-driven evolutionary algorithm (Multi-Generations Reference NSGA-III, MGR-NSGA-III) is proposed to choose the elite solutions in each evolutionary generation by considering the lineages of the population members. (c) A number of multi-objective algorithms are compared on real datasets for solving the microservices deployment and startup problem.

To the best of the authors' knowledge, this is the first study to model the microservices deployment and startup problem as a multi-objective optimization problem on the server side. The model is evaluated by applying a microservice system to typical datasets from New York and Tokyo. The simulation experiments show that the solution is feasible and supports the deployment and startup of microservices.

The remainder of this paper is organized as follows. Section “Related works” provides a brief review of work related to the multi-objective optimization of microservices deployment and startup strategies. Section “Problem definition and multi-objective model” proposes a multi-objective optimization model of the problem. The MGR-NSGA-III algorithm is proposed in Section “Multi-generations reference NSGA-III”. Section “Experiment analysis” presents a comparative analysis and summary of the experiments.

Related works

Research on service computing mainly focuses on the traditional web service composition problem [3, 15]. The traditional solution is to provide users with the most appropriate service combination strategy from the perspective of optimizing service-quality indicators, such as service performance indicators, energy indicators, and service security indicators [32]. The survey in [32] provides a good review of service computing; however, it does not discuss the application of algorithms in service computing, and it lacks experimental results.

In the cloud service environment, the service deployment strategy problem consists of the following aspects. The quality of service (QoS) of composite services plays an important role in the deployment value chain and partner collaboration in the manufacturing field [29]. QoS is an evaluation system used to evaluate the performance of services, including comprehensive evaluation of service availability, accessibility, reliability, legality, security, success rate, access volume, response time, etc. Cloud services were adopted to improve the service quality in the intelligent manufacturing field [16], where a synthetic genetic algorithm was used to solve the service deployment and composition problem. For the QoS-based service composition problem, a service composition solution based on the multi-constraint optimal path problem model was proposed [12] by optimizing the correlation between various QoS factors. Experimental results show that this solution achieves good QoS performance in cloud computing. However, when the number of available services in the service pool increases, these solutions become very inefficient. As cloud manufacturing services are complementary and interact with each other, it is hard to select a service composition group. The swarm differential evolution algorithm [34] and the hybrid grey wolf group optimization algorithm [4] were therefore introduced into the cloud manufacturing large-scale service deployment problem.

Optimal service deployment and combination strategies have been used to extend the lifespan of the internet of things (IoT). A game and evolutionary theory-based method was proposed [18] to solve the optimal service composition strategy of IoT. The life cycle of IoT services can be extended by seeking a load-balancing optimal service arrangement strategy for multiple devices. In addition, a heuristic strategy was used to schedule user requests and services in IoT applications [22] to achieve a supply and demand equilibrium between users and service providers in heterogeneous environments.

Microservices are applied to provide users with high-quality and high-performance services while minimizing the cost of hosting user services. The quality and performance of microservices are affected by several major factors [17], including the flexibility of the deployment infrastructure, load balancing, changing service providers, infrastructure reservations, memory size, etc. Low cost means that service providers achieve high resource usage rates. Under the FaaS (function as a service) architectural concept, an event-driven and scalable service environment was proposed [2] to determine which service instances need to be started and how. However, this strategy is based on function calls rather than resource requirements, and the policy is implemented on the server, so the FaaS platform needs to be expanded to respond to user requests. In this way, microservices can provide users with more scalable services. A limitation, however, is that the expansion of FaaS is neither efficient nor transferable: the server needs to be restarted and redeployed every time the physical resources are expanded. One solution is a composite service instance startup strategy in which the server dynamically starts and stops different service instances according to the global user requirements; thus, a combined optimization strategy for the entire server can be obtained without restarting the physical server. In [9], the microservices allocation strategy in a heterogeneous microservices environment was studied, and an efficient real-time dynamic microservices scheduling strategy was proposed that enables users to schedule their requests after the microservices have been deployed.

The services/microservices deployment and composition scheduling strategies mentioned above can adapt to different QoS indicators. The advantage of microservices is that they can produce adaptive deployment strategies for different requirements. However, most microservice startup strategies are not efficient, because they adopt a fixed total amount [1, 2, 14] and determine a fixed number of microservice instances to start based on the pre-estimated resources in the resource centre. The resource usage rate and load balance rate of these traditional methods are low. Therefore, a new microservices deployment and startup method is needed.

Multi-objective evolutionary algorithms (MOEAs) have proved their effectiveness in solving multi-objective optimization problems [16]. MOEAs mainly include multi-objective genetic algorithms (NSGA, NSGA-II, NSGA-III, SPEA-II, etc.) [7, 23], multi-objective optimization based on decomposition [28, 31], multi-objective optimization based on artificial immune systems [25, 34], and other methods and applications [5, 33]. Their population-based evolutionary nature makes it possible to obtain solutions for multiple objectives in a single run. Therefore, MOEAs can be used to solve the microservices deployment and startup problem.

Problem definition and multi-objective model

The microservices deployment and startup problem is illustrated in Fig. 1. Assume that several resource centres (RCs) are distributed in space, and each one has two kinds of resources: computation and storage resources. The startup of each microservice instance consumes a certain amount of resources. At the same time, a large number of service users are distributed in the same space, and each user requires one or more microservice instances to be served. The problem of microservice deployment and startup is to seek the optimal number of running instances of each microservice in each resource centre. With this optimal strategy, the resources can be used efficiently, and the load balance among the RCs can be improved.

Fig. 1 The illustration of microservices deployment and startup problem

According to the needs of microservices deployment, the optimal model should be constructed based on a typical multi-objective optimization model. Before the model is constructed, some definitions are provided as follows:

Definition 1

Resource centres (RC). A resource centre is an entity that provides an environment for the deployment and startup of microservice instances. It contains computation and storage resources, and it can be distributed in different locations.

Assuming that there are N RCs in the space, \( {\text{RC}} = \left\{ {{\text{rc}}_{1} ,{\text{rc}}_{2} , \ldots ,{\text{rc}}_{N} } \right\} \), each resource centre contains two types of information (resource and location), which can be described as \( {\text{rc}}_{i} = \left( {{\text{RRes}}\left( {r\_{\text{com}}_{i} ,r\_{\text{sto}}_{i} } \right),{\text{RLoc}}\left( {r\_{\text{lat}}_{i} ,r\_{\text{lon}}_{i} } \right)} \right) \), where \( {\text{RRes}}\left( {r\_{\text{com}}_{i} ,r\_{\text{sto}}_{i} } \right) \) represents the existing computation resources \( r\_{\text{com}}_{i} \) and storage resources \( r\_{\text{sto}}_{i} \) of the i-th RC. To facilitate the model solution, the computation and storage resources are all normalized to a specific real value. \( {\text{RLoc}}\left( {r\_{\text{lat}}_{i} ,r\_{\text{lon}}_{i} } \right) \) gives the latitude and longitude of the i-th RC location.

Definition 2

Microservice collection (Microservices, MS). In this paper, the microservice collection refers to the independent functional services that are separated from a single large application. It can be described as follows: \( {\text{MS}} = \left\{ {s_{1} ,s_{2} , \ldots ,s_{K} } \right\} \) represents K independent microservices separated from a single application. \( s_{i} = {\text{SRes}}\left( {r\_{\text{com}}_{i} ,r\_{\text{sto}}_{i} } \right),i \in \left[ {1,K} \right] \) represents the computation and storage resources consumed by the i-th microservice when one of its instances is deployed and started up.

Definition 3

User collection (Users, U). User collection refers to the consumers of microservices. Each user may consume a single microservice or a sequence of several microservices. Users are independent individuals.

It is assumed that there are M users in the space, described as \( {\text{Users}} = \left\{ {u_{1} ,u_{2} , \ldots ,u_{M} } \right\} \). Each user contains two types of information (service request information and location information): \( u_{i} = \left( {{\text{UReq}}\left( {S_{i} \left( {s_{p} , \ldots ,s_{k} } \right)} \right),{\text{ULoc}}\left( {u\_{\text{lat}}_{i} ,u\_{\text{lon}}_{i} } \right)} \right) \), \( i \in \left[ {1,M} \right],0 < p < k < K \), where \( {\text{UReq}}\left( {S_{i} \left( {s_{p} , \ldots ,s_{k} } \right)} \right) \) is the current service requirement of the i-th user, and \( {\text{ULoc}}\left( {u\_{\text{lat}}_{i} ,u\_{\text{lon}}_{i} } \right) \) is the current location of the i-th user.

Definition 4

Microservice instances deployment and startup strategy (MDSS). For N resource centres \( {\text{RC}} = \left\{ {{\text{rc}}_{1} ,{\text{rc}}_{2} , \ldots ,{\text{rc}}_{N} } \right\} \) in the space, the deployment and startup strategy of the microservice instances \( {\text{MS}} = \left\{ {s_{1} ,s_{2} , \ldots ,s_{K} } \right\} \) is expressed as \( {\text{Strategy}}\left( {{\text{RC}},{\text{MS}}} \right) \). The mathematical expression is given in Eq. (1):

$$ {\text{Strategy}}\left( {{\text{RC}},{\text{MS}}} \right) = \left[ {{\text{rc}}_{1} \left( {x_{11} , \ldots ,x_{1K} } \right),{\text{rc}}_{2} \left( {x_{21} , \ldots ,x_{2K} } \right), \ldots ,{\text{rc}}_{N} \left( {x_{N1} , \ldots ,x_{NK} } \right)} \right]^{T} , $$
(1)

where \( {\text{rc}}_{1} \left( {x_{11} , \ldots ,x_{1K} } \right) \) represents the MDSS in resource centre 1, and \( x_{11} \) indicates the number of started instances of microservice 1 in resource centre 1.
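To make Definitions 1–4 concrete, the following Python sketch shows one possible in-memory representation of the entities and of a strategy matrix; the class and field names are our own illustration, not part of the paper's model.

```python
from dataclasses import dataclass
from typing import List

import numpy as np


@dataclass
class ResourceCentre:
    """rc_i: computation/storage capacity and location (Definition 1)."""
    com: float           # normalized computation resources r_com_i
    sto: float           # normalized storage resources r_sto_i
    lat: float           # latitude  r_lat_i
    lon: float           # longitude r_lon_i


@dataclass
class Microservice:
    """s_j: resources consumed per started instance (Definition 2)."""
    com: float           # computation cost of one instance
    sto: float           # storage cost of one instance
    max_users: int       # mu_j, maximum users one instance can serve


@dataclass
class User:
    """u_m: requested microservice indices and location (Definition 3)."""
    requests: List[int]  # indices into the microservice list
    lat: float
    lon: float


def random_strategy(n_centres: int, n_services: int, max_inst: int = 15) -> np.ndarray:
    """A deployment/startup strategy (Definition 4): x[i, j] is the number of
    instances of microservice j started in resource centre i."""
    return np.random.randint(1, max_inst + 1, size=(n_centres, n_services))
```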

The multi-objective optimization model for this problem takes the general form given in Eq. (2).

It is assumed that there are N resource centres \( {\text{RC}} = \left\{ {{\text{rc}}_{1} ,{\text{rc}}_{2} , \ldots ,{\text{rc}}_{N} } \right\} \) in space \( {\mathbb{R}} \), K functionally independent microservices \( {\text{MS}} = \left\{ {s_{1} ,s_{2} , \ldots ,s_{K} } \right\} \), and M users \( {\text{Users}} = \left\{ {u_{1} ,u_{2} , \ldots ,u_{M} } \right\} \), and that the decision vector \( x = {\text{Strategy}}\left( {\text{RC,MS}} \right) \) is the independent variable.

$$ \left\{ \begin{aligned} & \mathop {\min }\limits_{x \in \delta } F\left( x \right) = \min \left( {f_{1} \left( x \right),f_{2} \left( x \right), \ldots ,f_{L} \left( x \right)} \right)^{T} \\ & {\text{s}}.{\text{t}}.\;\; g_{i} \left( x \right) \le 0\quad \left( {i = 1,2, \ldots ,m} \right), \\ & \qquad\; h_{j} \left( x \right) = 0\quad \left( {j = 1,2, \ldots ,n} \right) \\ \end{aligned} \right., $$
(2)

where \( F\left( x \right) \) is the objective function vector of this model, and \( g_{i} \left( x \right) \) and \( h_{j} \left( x \right) \) represent the inequality and equality constraints, respectively. L is the number of objective functions, \( m \) is the number of inequality constraints, and \( n \) is the number of equality constraints in the model. The model has five objective functions: the computation and storage resource idle rates, the real idle rate, and the computation and storage load balancing ratios. The five objective functions are defined as follows:

Objective functions

The first objective function calculates the computation resource idle rate, which is defined as \( f_{1} \left( x \right) \).

$$ f_{1} \left( x \right) = 1 - \frac{1}{N}\sum\limits_{i = 1}^{N} {\frac{{{\text{rc}}_{i} \left( {x_{i1} , \ldots ,x_{iK} } \right) \cdot {\text{SRes}}\left( {r\_{\text{com}}_{1} , \ldots ,r\_{\text{com}}_{K} } \right)^{T} }}{{{\text{RRes}}\left( {r\_{\text{com}}_{i} } \right)}}} , $$

where \( {\text{rc}}_{i} \left( {x_{i1} , \ldots ,x_{iK} } \right) \) is the vector of the numbers of instances of each microservice running at the i-th resource centre.

\( {\text{SRes}}\left( {r\_{\text{com}}_{1} , \ldots ,r\_{\text{com}}_{K} } \right) \) represents the vector of computational resources consumed by one instance of each microservice, and \( {\text{RRes}}\left( {r\_{\text{com}}_{i} } \right) \) represents the existing computational resource capacity of the i-th resource centre.

Similarly, the second objective function, the storage resource idle rate, is defined as \( f_{2} \left( x \right) \).

$$ f_{2} \left( x \right) = 1 - \frac{1}{N}\sum\limits_{i = 1}^{N} {\frac{{{\text{rc}}_{i} \left( {x_{i1} , \ldots ,x_{iK} } \right) \cdot {\text{SRes}}\left( {r\_{\text{sto}}_{1} , \ldots ,r\_{\text{sto}}_{K} } \right)^{T} }}{{{\text{RRes}}\left( {r\_{\text{sto}}_{i} } \right)}}} . $$

The third objective function is the real idle rate of the microservices. In general, a single microservice instance can serve several service requests once it has been started up. However, when the number of service requests exceeds the maximum load of a single microservice instance, extra instances need to be started. The maximum number of service users that a single instance of each microservice can handle is represented as \( {\text{MSMaxUsers}} = \left[ {mu_{1} ,mu_{2} , \ldots ,mu_{K} } \right] \). It is expected that the load capacity of each microservice instance should be fully utilized.

The real idle rate for microservices is defined as \( f_{3} \left( x \right) \).

$$ f_{3} \left( x \right) = 1 - \frac{1}{KN}\sum\limits_{i = 1}^{N} {\sum\limits_{j = 1}^{K} {\frac{{{\text{RealUserNum}}\left( {s_{j} ,{\text{rc}}_{i} } \right)}}{{{\text{ServAbility}}\left( {x_{ij} ,mu_{j} } \right)}}} } , $$
$$ {\text{ServAbility}}\left( {x_{ij} ,mu_{j} } \right) = \left\{ {\begin{array}{*{20}l} {x_{ij} mu_{j} ,} & {{\text{if}}\quad x_{ij} \ne 0} \\ {1,} & {{\text{if}}\quad x_{ij} = 0} \\ \end{array} } \right., $$

where \( {\text{RealUserNum}}\left( {s_{j} ,{\text{rc}}_{i} } \right) \) is the actual number of users of microservice \( s_{j} \) at resource centre \( {\text{rc}}_{i} \). Its calculation, which results from the service discovery strategy, is described in detail in Section “Multi-generations reference NSGA-III”.

Under this service discovery strategy, a feasible solution must satisfy the constraints (see the constraint subsection below), and every user can obtain at least one available microservice. At the same time, each microservice instance started up in a resource centre records the number of users it serves, which gives \( {\text{RealUserNum}}\left( {s_{j} ,{\text{rc}}_{i} } \right) \).

The fourth objective function is the computation load balancing ratio among the resource centres. It is undesirable for some resource centres to run at high load for a long time while other resource centres are idle.

The load balancing rate is calculated as \( f_{4} \left( x \right) \).

$$ \begin{aligned} f_{4} \left( x \right) & = \sum\limits_{i = 1}^{N} {\left( {{\text{ComUsage}}\left( {{\text{rc}}_{i} } \right) - \overline{{{\text{ComUsage}}\left( {\text{RC}} \right)}} } \right)^{2} } , \\ {\text{ComUsage}}\left( {{\text{rc}}_{i} } \right) & = \frac{{{\text{rc}}_{i} \left( {x_{i1} , \ldots ,x_{iK} } \right) \cdot {\text{SRes}}\left( {r\_{\text{com}}_{1} , \ldots ,r\_{\text{com}}_{K} } \right)^{T} }}{{{\text{RRes}}\left( {r\_{\text{com}}_{i} } \right)}}, \\ \overline{{{\text{ComUsage}}\left( {\text{RC}} \right)}} & = \frac{{\sum\limits_{i = 1}^{N} {{\text{rc}}_{i} \left( {x_{i1} , \ldots ,x_{iK} } \right) \cdot {\text{SRes}}\left( {r\_{\text{com}}_{1} , \ldots ,r\_{\text{com}}_{K} } \right)^{T} } }}{{\sum\limits_{i = 1}^{N} {{\text{RRes}}\left( {r\_{\text{com}}_{i} } \right)} }}. \\ \end{aligned} $$

The fifth objective function is the storage load balancing ratio among the resource centres. The storage load balancing ratio measures the degree of storage balance across the resource centres. Generally, because data synchronization tasks exist between resource centres, a resource centre with a large amount of stored data may need to synchronize data to another one. Such synchronization often occupies a large amount of communication resources and might cause service congestion. Therefore, to reduce the data synchronization overhead, the storage load balancing of the resource centres should be considered.

The storage load balancing ratio is computed as \( f_{5} \left( x \right) \).

$$ \begin{aligned} f_{5} \left( x \right) & = \sum\limits_{i = 1}^{N} {\left( {{\text{StoUsage}}\left( {{\text{rc}}_{i} } \right) - \overline{{{\text{StoUsage}}\left( {\text{RC}} \right)}} } \right)^{2} } , \\ {\text{StoUsage}}\left( {{\text{rc}}_{i} } \right) & = \frac{{{\text{rc}}_{i} \left( {x_{i1} , \ldots ,x_{iK} } \right) \cdot {\text{SRes}}\left( {r\_{\text{sto}}_{1} , \ldots ,r\_{\text{sto}}_{K} } \right)^{T} }}{{{\text{RRes}}\left( {r\_{\text{sto}}_{i} } \right)}}, \\ \overline{{{\text{StoUsage}}\left( {\text{RC}} \right)}} & = \frac{{\sum\limits_{i = 1}^{N} {{\text{rc}}_{i} \left( {x_{i1} , \ldots ,x_{iK} } \right) \cdot {\text{SRes}}\left( {r\_{\text{sto}}_{1} , \ldots ,r\_{\text{sto}}_{K} } \right)^{T} } }}{{\sum\limits_{i = 1}^{N} {{\text{RRes}}\left( {r\_{\text{sto}}_{i} } \right)} }}. \\ \end{aligned} $$
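As an illustration of the five objectives, the sketch below evaluates f1–f5 for a strategy matrix, assuming the resource data are held in NumPy arrays; the function and argument names are ours, and f3 is read in the same "one minus average utilization" form as f1 and f2.

```python
import numpy as np

def objectives(x, rc_com, rc_sto, ms_com, ms_sto, mu, real_users):
    """Evaluate f1..f5 for a strategy matrix x of shape (N, K).

    rc_com, rc_sto : (N,) capacities of each resource centre
    ms_com, ms_sto : (K,) per-instance resource costs of each microservice
    mu             : (K,) max users per instance (MSMaxUsers)
    real_users     : (N, K) RealUserNum(s_j, rc_i) obtained from the matching step
    """
    N, K = x.shape
    com_usage = (x @ ms_com) / rc_com            # ComUsage(rc_i) for every centre
    sto_usage = (x @ ms_sto) / rc_sto            # StoUsage(rc_i)

    f1 = 1.0 - com_usage.mean()                  # computation resource idle rate
    f2 = 1.0 - sto_usage.mean()                  # storage resource idle rate

    serv_ability = np.where(x != 0, x * mu, 1)   # ServAbility(x_ij, mu_j)
    f3 = 1.0 - (real_users / serv_ability).sum() / (N * K)   # real idle rate

    mean_com = (x @ ms_com).sum() / rc_com.sum()
    mean_sto = (x @ ms_sto).sum() / rc_sto.sum()
    f4 = ((com_usage - mean_com) ** 2).sum()     # computation load balancing ratio
    f5 = ((sto_usage - mean_sto) ** 2).sum()     # storage load balancing ratio
    return np.array([f1, f2, f3, f4, f5])
```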

Constraints

There are four constraints in the model. They are described as follows:

The first constraint is the completeness constraint of the service, \( g_{1} \left( x \right) \). Every microservice must have at least one instance started in every RC. The mathematical expression of the constraint is \( g_{1} \left( x \right): x_{ij} \ge 1,1 \le i \le N,1 \le j \le K \), where N is the number of resource centres and K is the number of microservices decomposed from the single application.

The second constraint is the total computation resource constraint \( g_{2} \left( x \right) \). Because the computation resources in each RC are limited, the amount of computing resources occupied by the started-up service instances cannot exceed the total amount available in that RC.

$$ \begin{aligned} g_{2} \left( x \right):{\text{rc}}_{i} \left( {x_{i1} , \ldots ,x_{iK} } \right) \cdot {\text{SRes}}\left( {r\_{\text{com}}_{1} , \ldots ,r\_{\text{com}}_{K} } \right)^{T} \hfill \\ \, < {\text{RRes}}\left( {r\_{\text{com}}_{i} } \right),1 \le i \le N. \hfill \\ \end{aligned} $$

The third constraint is the storage resource constraint \( g_{3} \left( x \right) \). The amount of storage resources occupied by the started-up service instances cannot exceed the total amount in each RC.

$$ \begin{aligned} g_{3} \left( x \right):{\text{rc}}_{i} \left( {x_{i1} , \ldots ,x_{iK} } \right) \cdot {\text{SRes}}\left( {r\_{\text{sto}}_{1} , \ldots ,r\_{\text{sto}}_{K} } \right)^{T} \hfill \\ \, < {\text{RRes}}\left( {r\_{\text{sto}}_{i} } \right),1 \le i \le N. \hfill \\ \end{aligned} $$

The fourth constraint is the service capacity constraint \( g_{4} \left( x \right) \). The total capacity of all started-up instances of each microservice should be no smaller than the total user demand for that microservice. The expression is as follows:

$$ g_{4} \left( x \right):\sum\limits_{i = 1}^{M} {{\text{UserReq}}\left( {u_{i} ,s_{j} } \right)} \le \sum\limits_{i = 1}^{N} {x_{ij} mu_{j} } ,1 \le j \le K. $$

\( {\text{UserReq}}\left( {u_{i} ,s_{j} } \right) \) indicates whether the i-th user has requested the j-th microservice; it returns 1 if so and 0 if not. \( mu_{j} \) indicates the maximum number of users that one instance of the j-th microservice can serve.
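A corresponding feasibility check for g1–g4 could look as follows; this is an illustrative sketch with our own names, not the paper's code.

```python
import numpy as np

def feasible(x, rc_com, rc_sto, ms_com, ms_sto, mu, user_requests):
    """Check the four constraints g1..g4 for a strategy matrix x of shape (N, K).

    user_requests : (M, K) 0/1 matrix of UserReq(u_i, s_j)
    """
    g1 = np.all(x >= 1)                          # every microservice started in every RC
    g2 = np.all(x @ ms_com < rc_com)             # computation capacity per centre
    g3 = np.all(x @ ms_sto < rc_sto)             # storage capacity per centre
    demand = user_requests.sum(axis=0)           # total requests per microservice
    g4 = np.all(demand <= (x * mu).sum(axis=0))  # enough service capacity overall
    return bool(g1 and g2 and g3 and g4)
```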

Optimization problem model

In summary, the microservice deployment and startup strategy model based on multi-objective optimization is described as follows:

Assuming that the MS set is \( O = \left\{ {o_{1} ,o_{2} , \ldots ,o_{K} } \right\} \), for the solution-space decision vector \( x \in \varOmega \),

$$ x = \left[ {{\text{rc}}_{1} \left( {x_{11} ,x_{12} , \ldots ,x_{1K} } \right),{\text{rc}}_{2} \left( {x_{21} ,x_{22} , \ldots ,x_{2K} } \right), \ldots ,{\text{rc}}_{N} \left( {x_{N1} ,x_{N2} , \ldots ,x_{NK} } \right)} \right]^{T} , $$

the model is

$$ \left\{ \begin{aligned} & \mathop {\min }\limits_{x \in \delta } F\left( x \right) = \min \left( {f_{1} \left( x \right),f_{2} \left( x \right), \ldots ,f_{5} \left( x \right)} \right)^{T} , \\ & {\text{s}}.{\text{t}}.\;\; g_{1} \left( x \right):x_{ij} \ge 1,\quad 1 \le i \le N,1 \le j \le K, \\ & \qquad g_{2} \left( x \right):{\text{rc}}_{i} \left( {x_{i1} , \ldots ,x_{iK} } \right) \cdot {\text{SRes}}\left( {r\_{\text{com}}_{1} , \ldots ,r\_{\text{com}}_{K} } \right)^{T} < {\text{RRes}}\left( {r\_{\text{com}}_{i} } \right),\quad 1 \le i \le N, \\ & \qquad g_{3} \left( x \right):{\text{rc}}_{i} \left( {x_{i1} , \ldots ,x_{iK} } \right) \cdot {\text{SRes}}\left( {r\_{\text{sto}}_{1} , \ldots ,r\_{\text{sto}}_{K} } \right)^{T} < {\text{RRes}}\left( {r\_{\text{sto}}_{i} } \right),\quad 1 \le i \le N, \\ & \qquad g_{4} \left( x \right):\sum\limits_{i = 1}^{M} {{\text{UserReq}}\left( {u_{i} ,s_{j} } \right)} \le \sum\limits_{i = 1}^{N} {x_{ij} mu_{j} } ,\quad 1 \le j \le K. \\ \end{aligned} \right. $$

From the description of the model, the problem is a typical NP-hard problem that is difficult to solve exactly in polynomial time with traditional optimization methods. Therefore, it is necessary to turn to evolutionary computation methods.

Multi-generations reference NSGA-III

In this paper, multi-generations reference NSGA-III (MGR-NSGA-III) is proposed to solve the problem. Compared with other multi-objective optimization algorithms such as NSGA-III, SPEA-II, etc., MGR-NSGA-III can quickly converge to the optimal solution and effectively find the Pareto frontier.

The non-dominated sorting genetic algorithm (NSGA) and its variants are among the most classic algorithms for solving multi-objective optimization problems [20]. NSGA is a multi-objective optimization algorithm that optimizes multiple targets based on the Pareto frontier. The NSGA-II algorithm [8], also known as the non-dominated sorting genetic algorithm with elite strategy, retains elite solutions as the next generation's parents according to the dominance relationships among the offspring. NSGA-III [7], also known as the reference-point non-dominated sorting genetic algorithm and based on the NSGA-II computational framework, further improves the elite retention strategy: it uses a reference-point strategy to maintain population diversity and improves the computational efficiency of selecting multiple elite solutions within a non-dominated layer.

MGR-NSGA-III uses a knowledge-driven population selection method (named the learning elite population selection method), designed on top of NSGA-III's non-dominated levels. The method records the dominance levels of historical generations. It selects elite population members with a highly pure genetic lineage as the driving knowledge and uses them to produce the next generation. Descendants inherited from ancestors with high non-dominated ranks are more likely to be selected. Thus, population selection is based not only on the current generation's dominance but also on knowledge about the lineage of the population. Mutation and crossover guided by this lineage knowledge are better suited to solving the multi-objective optimization problem.

Main process

Once the microservices and users are allocated in the space, the system must decide how and where the microservice instances start up and how users access them. The main process for handling this problem is shown in Fig. 2. First, the microservice allocation vectors are initialized in the RCs, including the vector of actually started microservices, the remaining computation and storage capacity vectors, and the user access vector. Second, the optimal deployment and startup solution is calculated by MGR-NSGA-III (the solving details are explained in the next section), and the microservice instances are started according to this solution. Then, users and microservices are matched: for every user, the nearest service is found by location; if the service is available, the matching of this user is completed; if not, the search continues with the next nearest service location until all users have found their services, at which point the matching process stops. Finally, the RCs evaluate the variation of user requests to decide whether the optimal policy needs to be recomputed by MGR-NSGA-III.

Fig. 2 Main process framework of microservices deployment and startup problem

MGR-NSGA-III

1. Algorithm's solving process

MGR-NSGA-III integrates the evolutionary mechanism of NSGA-III with lineage information. Rather than considering only the dominance relationships and reference-point distances between population members, the elite selection strategy also computes the lineage of each member and compares the historical ranks of its ancestors. When members are otherwise equivalent, those whose ancestors had higher non-dominated ranks are selected.

Algorithm 1 (pseudocode figure)

The input of the algorithm is the previous generation's population \( P_{t} \) (for the first generation, the population is randomly initialized) and the reference point set of the population (calculated as in [7]). First, an offspring set \( Q_{t} \) and a lineage collection \( {\text{Lineage}}_{t} \) are initialized. Then, crossover and mutation of the population are performed, and the lineages are stored in the offspring; see lines 1–2 of Algorithm 1. The detailed calculation process \( {\text{Offspring}}\_{\text{Lineage}}\left( {P_{t} } \right) \) is given in Algorithm 2. The parents \( P_{t} \) and offspring \( Z_{t} \) are merged into \( {\text{Joint}}P_{t + 1} \); then, non-dominated sorting of \( {\text{Joint}}P_{t + 1} \) yields a result set \( \left( {R_{1} ,R_{2} , \ldots } \right) \), and the resulting rank codes are stored as the lineage collection \( {\text{Lineage}}_{t + 1} \) of the current population; see lines 4–5 of Algorithm 1. Second, the solutions of the first i non-dominated levels are selected until the number of solutions of the first i + 1 levels exceeds the maximum population size. Then, in addition to the solutions of the first i levels, some solutions are selected from the (i + 1)-th level according to their reference-point distance and lineage rank list until the population size equals the maximum population size; for this selection process, see lines 12–25 of Algorithm 1. The ideal point, extreme points, and reference points are calculated to obtain the reference vectors [7]. The function \( {\text{ConsHyper}}(Q_{t} ,{\text{ExtremPoints}}_{t + 1} ) \) is defined in [7]: the combined population members \( Q_{t} \) and \( {\text{ExtremPoints}}_{t + 1} \) are projected onto a hyper-plane, and a clustering operation is performed on the hyper-plane to select the desired number of clusters. Finally, the solutions that are near the reference vectors and whose ancestors have a high lineage rank are chosen as candidates for the next generation's population. The details of this selection process are given in Algorithm 3.
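To make the flow of Algorithm 1 concrete, the following Python sketch outlines one generation of MGR-NSGA-III as described above. It is illustrative only: the helper functions nd_sort, associate, and offspring_lineage (standing in for non-dominated sorting, NSGA-III reference-point niching, and Algorithm 2) and all names are our own assumptions, not the authors' implementation.

```python
def mgr_nsga3_generation(parents, lineages, ref_points, pop_size,
                         nd_sort, associate, offspring_lineage):
    """One generation of MGR-NSGA-III (sketch of Algorithm 1).

    parents           : list of chromosomes P_t
    lineages          : list of lineage rank lists, one per parent
    nd_sort(pop)      : returns fronts [R1, R2, ...] as lists of indices
    associate(...)    : NSGA-III reference-point association with lineage tie-break
    offspring_lineage : lineage-aware crossover/mutation (Algorithm 2)
    """
    # 1. Variation: create offspring and propagate the parents' lineages.
    offspring, off_lineages = offspring_lineage(parents, lineages)

    # 2. Merge parents and offspring, then non-dominated sort the joint set.
    joint = parents + offspring
    joint_lin = lineages + off_lineages
    fronts = nd_sort(joint)

    # 3. Record this generation's rank in every member's lineage.
    for rank, front in enumerate(fronts, start=1):
        for idx in front:
            joint_lin[idx] = joint_lin[idx] + [rank]

    # 4. Fill the next population front by front.
    next_idx = []
    for front in fronts:
        if len(next_idx) + len(front) <= pop_size:
            next_idx.extend(front)
        else:
            # Last admitted front: break ties by reference-point distance and by
            # the average historical lineage rank (Algorithm 3).
            remaining = pop_size - len(next_idx)
            chosen = associate(joint, joint_lin, front, ref_points, remaining)
            next_idx.extend(chosen)
            break

    return [joint[i] for i in next_idx], [joint_lin[i] for i in next_idx]
```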

2. Encoding

This paper employs a binary encoding method. The length of the encoding is determined by the maximum number of instances of each microservice that a single resource centre can initiate. This paper assumes that each resource centre can initiate up to 16 instances of a microservice, so the number of initiated instances of one microservice in a resource centre can be expressed by a four-digit binary code. The total length of the binary code is therefore \( 4 \times N \times K \).
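Under these assumptions, encoding and decoding could be sketched as follows; the helper names are ours, and each four-bit field is read as an integer in [0, 15].

```python
import numpy as np

BITS = 4  # four binary digits per (resource centre, microservice) pair

def decode(chrom, n_centres, n_services):
    """Decode a binary chromosome of length 4*N*K into an N x K strategy matrix."""
    bits = np.asarray(chrom, dtype=int).reshape(n_centres, n_services, BITS)
    weights = 2 ** np.arange(BITS - 1, -1, -1)   # most significant bit first
    return bits @ weights                         # integer instance counts

def encode(x):
    """Encode an N x K integer strategy matrix (values 0..15) into a flat bit string."""
    x = np.asarray(x, dtype=int)
    n_centres, n_services = x.shape
    bits = (x[..., None] >> np.arange(BITS - 1, -1, -1)) & 1
    return bits.reshape(n_centres * n_services * BITS)
```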

3. Crossover and mutation operation

Based on the traditional crossover and mutation strategy, a lineage crossover is proposed. According to the non-dominated sorting results of the parents, the ancestral lineage is aggregated into the offspring during the crossover process. At the same time, to keep the lineage fresh, a decay mechanism that discards the most distant ancestral lineage entries is adopted. The calculation of the lineage crossover and mutation process is given in Algorithm 2.

Algorithm 2 (pseudocode figure)

Algorithm 2 first initializes the result list \( O_{t} \) and the lineage list \( {\text{Lineage}}_{t} \). Then, all population members are traversed, and each pair of adjacent individuals is crossed using the traditional strategy; see lines 2–4. To ensure that the length of the lineage rank lists does not grow exponentially with the number of evolutionary generations, a reduction strategy is introduced. A maximum length of the lineage rank list is defined as \( {\text{MaxLineageNum}} \); if the lineage rank list of a parent exceeds this threshold, the \( {\text{MaxLineageNum/2}} \) lineage rank values of the most distant ancestors are removed, from farthest to nearest. The freshness of the lineage is also guaranteed in this way; see lines 5–8. Then, the contents of the parents' lineage rank lists are merged into the offspring. Finally, the offspring are mutated with a given probability; see lines 9–11.
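The sketch below illustrates the lineage-aware variation of Algorithm 2 under the stated reduction rule; uniform crossover and bit-flip mutation are our stand-ins for the paper's "traditional strategy", and all names are ours.

```python
import random

MAX_LINEAGE_NUM = 6   # maximum length of a lineage rank list (MaxLineageNum)

def truncate(lineage):
    """Drop the oldest half of the lineage when it grows beyond the limit."""
    if len(lineage) > MAX_LINEAGE_NUM:
        return lineage[MAX_LINEAGE_NUM // 2:]
    return lineage

def lineage_crossover(p1, p2, lin1, lin2, p_mut=0.01):
    """Cross two parents and merge their (truncated) lineage rank lists."""
    # Uniform crossover on the bit strings (our stand-in for the traditional strategy).
    child = [a if random.random() < 0.5 else b for a, b in zip(p1, p2)]
    # Bit-flip mutation with a small probability.
    child = [1 - g if random.random() < p_mut else g for g in child]
    # The offspring inherits the merged lineage of both parents.
    child_lineage = sorted(truncate(lin1) + truncate(lin2))
    return child, child_lineage
```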

4. Selection operation

In this paper, the knowledge-driven population selection strategy uses the lineage calculation of the candidate set, which is based on the NSGA-III reference point distance calculation. The calculation process is as follows:

When selecting the optimal object based on the reference point in the 20th row of Algorithm 1, it is necessary to sort the selection according to the lineage. The specific process is as in Algorithm 3.

Algorithm 3 (pseudocode figure)

Algorithm 3 aims to find the best population members in the \( f_{l} \) rank. When selecting among non-dominated solutions, the lineage factor is considered in the elite retention strategy. First, the set of population members nearest to the reference point \( {\text{rp}} \) is found. If this set is non-empty, there are two cases. First, if the number of members nearest to \( {\text{rp}} \) that are already in \( Q_{t} \) is 0, the member nearest to \( {\text{rp}} \) is selected; second, if the number of members nearest to \( {\text{rp}} \) in \( Q_{t} \) is larger than 0, the member with the best historical lineage is selected according to the average lineage rank.
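The lineage tie-break at the heart of Algorithm 3 can be sketched as follows; the use of the average lineage rank as the sorting key follows the description above, but the code and names are our own illustrative reading, not the paper's implementation.

```python
def select_by_lineage(candidates, lineages):
    """Pick the candidate with the best (lowest) average lineage rank.

    candidates : indices of members closest to the same reference point
    lineages   : lineage rank list of every population member
    """
    def avg_rank(idx):
        lin = lineages[idx]
        return sum(lin) / len(lin) if lin else float("inf")
    return min(candidates, key=avg_rank)
```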

Algorithm 4 (pseudocode figure)

An example of the lineage rank list process is presented in Fig. 3. Assume that two population members with lineage rank lists [1, 1, 1, 1, 2] and [1, 1, 2, 2, 3, 3, 4] need to be crossed, and \( {\text{MaxLineageNum}} \) is 6. First, the current rank of each member is calculated and appended to its lineage rank list; assume these ranks are 3 and 1. The lineage rank lists of the two members are then updated to [1, 1, 1, 1, 2, 3] and [1, 1, 1, 2, 2, 3, 3, 4]. Because of the limit on the lineage length, the latest halves of the parents' lineage rank lists are used for the crossover, which are [1, 1, 3] and [1, 1, 2, 3]. Finally, after crossing each rank element, the offspring's lineage [1, 1, 1, 1, 2, 3, 3] is obtained, as shown in Fig. 3.

Fig. 3 How offspring lineage is inherited from parents

5. Algorithm for matching users and microservice providers

When the best deployment strategy is found, some matching algorithms between users and microservice providers should be applied to validate the strategy in the real service environment.

The user service policy mentioned in the previous section adopts a distance-based service selection algorithm, as shown in Algorithm 4. First, for each user, the nearest available microservice is searched for according to its location; then, the microservice is checked to determine whether it has reached its maximum service load. If not, the user's request is sent to this service; otherwise, the next nearest one by distance is searched for. This is repeated until all users have discovered available microservices.

Details of the calculation process are given in Algorithm 4. The first step is to initialize the actual microservice startup quantity vector \( {\text{realMicroServicesStart}} \) of each resource centre, the remaining resource vector \( {\text{remMicSerAbility}} \) of each resource centre, and the list of users served by each resource centre \( {\text{userServiced}} \); see lines 2–6. Then, each user in \( {\text{uLs}} \) is traversed to find the most suitable resource centre service. For each service request in the user's service request list, the access identifier of each resource centre is initialized to 0; see lines 8–9. The nearest resource service centre that satisfies the user's location condition is searched for recursively. If it is found, the search continues with the next service; if not, 'false' is returned; see lines 10–13. Finally, 'true' is returned when services for all users have been found in the resource centres.

The implementation of the function \( {\text{findNeaSerbyLo}}\left( {\text{mSer,user,cLs}} \right) \) is shown in Algorithm 5. The basic process consists of several steps. First, the nearest resource centre that has not yet been visited (identified by \( {\text{CenterFoundFlag}} \)) is found based on the user's location, and its number \( {\text{neLoCenter}} \) is obtained from the return value of \( {\text{findNeSerbyLo}}\left( {{\text{user}} . {\text{location}}} \right) \). If it returns -1, all resource centres have already been visited, there is no solution, and 'false' is returned; see lines 2–3. Second, it is determined whether there is an available service in this resource centre that matches the user's needs. If the number of available services is greater than or equal to 1, the user is added to this centre's service user list, and the list of available services in this resource centre is updated with this service; see lines 3–6. If not, the access identifier \( {\text{CenterFoundFlag}} \) of the resource centre is set to 1, and the most suitable service for the user is searched for recursively; see lines 9–11.
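A minimal sketch of the distance-based matching described in Algorithms 4 and 5 is given below. It assumes the User and ResourceCentre structures sketched earlier, approximates distance with a plain Euclidean distance on latitude/longitude, and flattens the recursive search into a loop; all names are ours, not the paper's.

```python
import math

def match_users(users, centres, capacity):
    """Assign every requested microservice of every user to the nearest
    resource centre that still has spare capacity (spirit of Algorithms 4-5).

    users    : list of User objects with .requests, .lat, .lon
    centres  : list of ResourceCentre objects with .lat, .lon
    capacity : N x K list of lists of remaining user slots (x_ij * mu_j initially)
    Returns (real_user_num, ok), where real_user_num[i][j] counts served users.
    """
    n, k = len(centres), len(capacity[0])
    real_user_num = [[0] * k for _ in range(n)]

    def dist(u, c):
        return math.hypot(u.lat - c.lat, u.lon - c.lon)

    for user in users:
        for svc in user.requests:
            # Resource centres ordered by distance to this user.
            order = sorted(range(n), key=lambda i: dist(user, centres[i]))
            for i in order:
                if capacity[i][svc] > 0:          # service still has free load
                    capacity[i][svc] -= 1
                    real_user_num[i][svc] += 1
                    break
            else:
                return real_user_num, False       # no available service found
    return real_user_num, True
```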

Algorithm 5 (pseudocode figure)
6. Algorithm complexity analysis

The main process of the microservices deployment strategy comprises two parts: MGR-NSGA-III and the matching of users with services. The complexity of MGR-NSGA-III is as follows. Algorithm 1 shows the whole process of MGR-NSGA-III; Algorithm 2 and Algorithm 3 are computed once per generation within MGR-NSGA-III. It is assumed that T is the number of objective functions, Q is the population size, G is the maximum allowed lineage length of each population member, and H is the number of reference points. The computational time complexity of inserting the lineages of the 2Q population members in lines 2–8 of Algorithm 1 is \( O\left( {Q\log^{T - 2} Q} \right) \) in the extreme case (i.e., every lineage requires computational choices and every lineage has the maximum allowed length). Each reference-point computation has complexity \( O\left( {T \cdot Q \cdot H} \right) \), and each crossover and merging of the lineages takes \( O\left( {Q \cdot G^{2} } \right) \). The complexity of the lineage-based selection in each generation is \( O\left( {Q \cdot G\log^{T - 2} Q} \right) \), so the computational complexity of each generation of MGR-NSGA-III is \( O\left( {Q \cdot G\log^{T - 2} Q} \right) + O\left( {Q \cdot G^{2} } \right) + O\left( {T \cdot Q \cdot H} \right) \). The complexity of Algorithm 4 is \( O\left( N \right) + O\left( {M \cdot K} \right) \) (this assumes that each user requests all K microservices, which is actually an upper bound). The essence of Algorithm 5 is a recursive calculation with a complexity of \( O\left( {M^{2} } \right) \), during which insertion and selection of lineages is required. Because Algorithm 4 and Algorithm 5 are used to evaluate the population in every generation of Algorithm 1, the complexity of the whole strategy is \( {\text{MaxIterations}} \cdot \left( {O\left( {Q \cdot G\log^{T - 2} Q} \right) + O\left( {Q \cdot G^{2} } \right) + O\left( {T \cdot Q \cdot H} \right) + O\left( {M^{2} } \right) + O\left( {M \cdot K} \right)} \right) \).

Experiment analysis

Datasets

Two real datasets were used in the experiment. One is the New York social-network user check-in dataset, which is used to simulate users distributed in space requesting services from different RCs at the same time. The other is the Tokyo social-network user check-in dataset [30]. The microservice dataset is a typical data application service system in which a single application is divided into ten microservices: reporting, administrator management, GIS, message transmission, data quality censoring, data collection, comprehensive data maintenance, settlement, data visualization display, and permission control.

Algorithms

NSGA-III and SPEA-II are chosen as competitor algorithms. Among multi-objective optimization algorithms, SPEA-II and NSGA-III are two representative elite algorithms and have become standard baselines for comparing the performance of multi-objective optimization evolutionary algorithms. MGR-NSGA-III is developed from NSGA-III; thus, to demonstrate its effectiveness, NSGA-III is chosen. SPEA-II is a classic multi-objective optimizer that is often chosen as a baseline competitor. Besides, SPEA-II can also be used to solve the microservices deployment problem. Considering all of the above, SPEA-II and NSGA-III were selected for comparison with MGR-NSGA-III.

Parameters

In this paper, MGR-NSGA-III is proposed to solve the microservices deployment and startup strategy problem. The experimental parameters were as follows:

(i) There are five resource centres in New York and Tokyo that are dedicated to the operation of the data service microservice system;

(ii) Each resource centre has 300 units of computing resources and 300 units of storage resources;

(iii) Starting a microservice instance requires 20 units of computing resources and 10 units of storage resources, and each instance can serve 10 users at the same time (a small sketch of the resulting problem scale follows this list).
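As a rough illustration (not from the paper), the parameters above imply the following problem scale; the variable names are ours.

```python
N, K = 5, 10                 # five resource centres, ten microservices
RC_COM = RC_STO = 300.0      # resource units per centre (item ii)
MS_COM, MS_STO, MU = 20.0, 10.0, 10   # per-instance cost and capacity (item iii)

# Upper bound on the total number of instances one centre can host,
# considering computation resources alone:
max_per_centre = int(RC_COM // MS_COM)      # 300 / 20 = 15 instances
chromosome_length = 4 * N * K               # 4-bit encoding => 200 bits per solution
print(max_per_centre, chromosome_length)
```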

The values of the regular parameters of MGR-NSGA-III are shown in Table 1.

Table 1 Values of regular parameters in different algorithms

The main configuration of the computer used in the experiments is as follows: Intel(R) Core(TM) i7-5600U CPU at 2.6 GHz with 8 GB of RAM.

Each algorithm was run 20 times, and the average running time was taken. Figure 4 shows the running times of the different algorithms on the two datasets. As the figure shows, solving the problem with SPEA-II costs much more time than with NSGA-III and MGR-NSGA-III, especially when the number of iterations is larger. In addition, MGR-NSGA-III requires a slightly longer computation time than NSGA-III on both datasets. MGR-NSGA-III improves the selection and retention strategy of the elite solutions by retaining those elites with a better lineage. Compared with SPEA-II and NSGA-III, these selection and retention calculations require a certain amount of resources, but they produce better solutions, so MGR-NSGA-III has greater advantages in indicators such as the computation resource idle rate, storage resource idle rate, and actual microservice idle rate.

Fig. 4 Calculation time comparison of different iterations of different algorithms

From the optimal frontier solutions of these algorithms, several sets of solutions were selected for comparison against the traditional microservice deployment strategy without multi-objective optimization. The results are shown in Fig. 5.

Figure 5 shows the deployment policy with a fixed, evenly distributed number of started service instances under the traditional combined microservice strategy, after 20 runs of each algorithm. Its resource idle rate is relatively high, and the actual usage rate of the services is low. Compared with NSGA-III, the computation resource idle rate, storage resource idle rate, and actual idle rate of the traditional microservices deployment strategy were 13.21%, 5.2%, and 16.67% higher, respectively. Compared with MGR-NSGA-III, the computation and storage resource idle rates of the traditional deployment and startup strategy were 60.51% and 2.2% higher, respectively, while the actual idle rate of microservices was 4.62% higher. In summary, the microservices deployment strategy based on multi-objective optimization has better service performance.

Fig. 5 Comparison of objective function under different microservice scheduling strategies

Furthermore, different multi-objective evolutionary algorithms perform differently. The proposed MGR-NSGA-III is compared with the classic methods as follows.

It can be seen from Figs. 6, 7 and 8 that the stability of the three multi-objective optimization algorithms is good under different evolutionary generation conditions. The range of the optimal solutions became consistent after 200 generations. Regarding the variation of the distribution of the solutions, the five objective functions of MGR-NSGA-III remained basically unchanged before the 100th generation, while the distribution of f(3) converged considerably at the 150th generation: the interquartile range changed from 0.32–0.6 to 0.23–0.52. The convergence speed of MGR-NSGA-III was faster than that of SPEA-II and NSGA-III. For NSGA-III and SPEA-II, there was no significant change in the value distribution range of each objective, and the final converged values were slightly worse than those of MGR-NSGA-III; their interquartile ranges finally converged to between 0.3 and 0.6.

Fig. 6 Box chart of different generations for MGR-NSGA-III

Fig. 7 Box chart of different generations for NSGA-III

Fig. 8 Box chart of different generations for SPEA-II

Definition 5

Dominating number (DN). Assuming population set A and population set B, the dominating number of A to B is defined as the following equation:

$$ {\text{DN}} = \sum\limits_{i = 1}^{\left| A \right|} {\sum\limits_{j = 1}^{\left| B \right|} {{\text{Dom}}\left( {A_{i} ,B_{j} } \right)} } ,\quad {\text{Dom}}\left( {A_{i} ,B_{j} } \right) = 1\;\;{\text{if}}\;B_{j} \;{\text{is completely dominated by}}\;A_{i} ,\;{\text{and otherwise}}\;{\text{Dom}}\left( {A_{i} ,B_{j} } \right) = 0. $$

Definition 6

Dominating absolutely number (DAN). Assuming population set A and population set B, the dominating absolutely number of A to B is defined as the following equation:

$$ {\text{DAN}} = \sum\limits_{i = 1}^{\left| A \right|} {{\text{Dom}}\left( {A_{i} ,B} \right)} ,\quad {\text{Dom}}\left( {A_{i} ,B} \right) = 1\;\;{\text{if}}\;\exists B_{j} \in B\;{\text{such that}}\;A_{i} \;{\text{completely dominates}}\;B_{j} ,\;{\text{and otherwise}}\;{\text{Dom}}\left( {A_{i} ,B} \right) = 0. $$

DN and DAN express the dominance relationship between two population sets. The convergence of the different algorithms was also compared using DN and DAN.
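A small sketch of how DN and DAN could be computed from two sets of objective vectors is given below (minimization is assumed, and all names are ours).

```python
import numpy as np

def dominates(a, b):
    """a Pareto-dominates b (minimization): no worse in all objectives, better in at least one."""
    return bool(np.all(a <= b) and np.any(a < b))

def dn_dan(A, B):
    """Dominating number (DN) and dominating absolutely number (DAN) of A over B.

    A, B : arrays of shape (|A|, L) and (|B|, L) of objective values.
    DN counts dominated (A_i, B_j) pairs; DAN counts members of A that
    dominate at least one member of B.
    """
    dn = sum(dominates(a, b) for a in A for b in B)
    dan = sum(any(dominates(a, b) for b in B) for a in A)
    return dn, dan
```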

It can be seen from Fig. 9a that the convergence rate of the MGR-NSGA-III solutions was higher. When the evolutionary calculation reached 50 generations, the NSGA-III and SPEA-II solutions were dominated by MGR-NSGA-III with DN values of 6270 and 23,134, respectively. In addition, after 100 generations, the DN of the MGR-NSGA-III solutions against SPEA-II was significantly high (DN of 30,657). It can also be seen from Fig. 9b that after 50 generations, the DAN of MGR-NSGA-III reached 577 and 729 compared with NSGA-III and SPEA-II, respectively; afterwards, the DAN reached 607 and 1,039. Therefore, MGR-NSGA-III is clearly more efficient in calculating the optimal strategy for running microservice instances.

Fig. 9 Comparison of MGR-NSGA-III with NSGA-III and SPEA-II solutions in dominance

Figures 10 and 11 show the numbers of dominating solutions between MGR-NSGA-III and NSGA-III or SPEA-II. From Fig. 10, it can be seen that for the 50/100/150 generations the maximum DN of the population is more than 30, and the maximum DN at 150 generations is even higher than 60 in Fig. 10d. Using MGR-NSGA-III, the number of dominated population members can be increased. In addition, the MGR-NSGA-III algorithm has a good convergence rate within 200 generations. As shown in Fig. 11, the dominated number can exceed 120/100 at the 50th/100th generation compared with SPEA-II.

Fig. 10 Comparison of MGR-NSGA-III with NSGA-III in dominating solutions for different generations

Fig. 11 Comparison of MGR-NSGA-III with SPEA-II in dominating solutions for different generations

A comparison between the different multi-objective-optimization-based microservice deployment and startup strategies was also made. As the red circles in Fig. 12 show, the number of started-up microservices with NSGA-III and SPEA-II is extremely high (as in Fig. 12a) or low (as in Fig. 12b) in different resource centres, indicating poor load balancing. Conversely, the startup load with MGR-NSGA-III is relatively balanced, as in Fig. 12c, d.

Fig. 12 Real deployment and startup results of microservices in different evolutionary algorithms

The hypervolume values of the different algorithms are also compared. The hypervolume [5] is popularly used to evaluate the convergence and distribution of the solutions of MOEAs.
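Exact hypervolume computation is available in standard MOEA toolkits; as an illustration only, the following Monte-Carlo sketch estimates the hypervolume of a solution set for a given reference point (the sampling approach and all names are our own, not the paper's).

```python
import numpy as np

def hypervolume_mc(F, ref_point, n_samples=20_000, seed=0):
    """Monte-Carlo estimate of the hypervolume dominated by solution set F
    (minimization) with respect to ref_point.

    F         : array of shape (n_solutions, L) of objective values
    ref_point : array of shape (L,), worse than every solution in F
    """
    rng = np.random.default_rng(seed)
    F = np.asarray(F, dtype=float)
    ref = np.asarray(ref_point, dtype=float)
    ideal = F.min(axis=0)
    # Sample uniformly inside the box [ideal, ref].
    samples = rng.uniform(ideal, ref, size=(n_samples, F.shape[1]))
    # A sample is dominated if some solution is <= it in every objective.
    dominated = np.any(np.all(F[None, :, :] <= samples[:, None, :], axis=2), axis=1)
    return dominated.mean() * np.prod(ref - ideal)
```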

NSGA-III, SPEA-II, and MGR-NSGA-III were used to solve the microservice deployment optimization problem, and the calculations were performed 20 times. It can be seen in Fig. 13 that MGR-NSGA-III, represented by the blue part, is slightly better than the other two algorithms; in particular, in the 4th and 5th runs, its hypervolume value is significantly higher than that of the other two algorithms. Moreover, the average hypervolume values of the three algorithms over the 20 runs are shown in Table 2; MGR-NSGA-III obtains the highest average hypervolume compared with SPEA-II and NSGA-III.

Fig. 13 Comparison of different algorithms in hypervolume

Table 2 Average value of hypervolume in different algorithms

Conclusion

Microservices split a complex application into a number of sub-services with well-defined boundaries. The distributed deployment of these sub-services to different service centres provides services for users with the advantages of flexibility, scalability, and high availability. However, a series of problems has arisen in the use of microservices, and the optimization of microservice deployment and startup is one of the key challenges.

This paper introduces a knowledge-driven evolutionary algorithm for the deployment and startup problem of microservices for the first time. MGR-NSGA-III improves NSGA-III by considering the lineage knowledge of each generation's population. First, a microservices deployment and startup problem model based on multi-objective optimization was constructed, and the objective functions and constraints of the problem were defined. Second, NSGA-III was improved with knowledge-driven selection to solve the microservice deployment and startup problem. Finally, several experiments were presented to evaluate the performance of the different methods, giving a comprehensive evaluation of the algorithms' time efficiency, convergence degree, and calculation effect. In conclusion, MGR-NSGA-III works well on the microservice deployment and startup problem.

In future work, the heterogeneity of microservices and resource centres needs to be considered, and users' service requirements should be studied further. It is also necessary to further improve the multi-objective algorithm to meet various users' demands.