1 Introduction

The construction, operation, and maintenance of civil engineering and building structures account not only for very large costs, but also for major negative environmental and social impacts, owing to the tremendous material consumption and the health and safety issues often associated with construction activities.

Today, it is becoming increasingly important to take various objectives into account to improve the buildability and performance of structures and to reduce their economic, environmental, and social impacts in a life cycle perspective. Nevertheless, in the early structural design process, when the possibility to influence the outcome is the greatest, the traditional practice is to focus solely on the lowest initial costs, see, e.g., International Federation for Structural Concrete (2009), Mathern et al. (2013), and World Economic Forum (2016). Cost optimization is often conducted by means of a point-based incremental improvement strategy, relying on trial and error and the designer's intuition, see, e.g., Parrish (2009) and Maher (1987). This approach overlooks potentially better design solutions and neglects relevant objectives other than cost.

A better alternative is to take many different aspects into account simultaneously in the design process, in order to find the most efficient design solution with regard to all relevant objectives. Finding the most appropriate design for the set of objectives considered requires modern design methods that make it possible to explore many design choices and evaluate them against these objectives.

Multi-objective optimization concerns the problem of simultaneously maximizing the utility values of multiple, typically conflicting, objective functions (Tamaki et al. 1996). For engineering applications, this setting is generally more common than the single-objective setting, as many objectives usually have to be balanced (Nakayama 2005), e.g., the cost and performance of a construction. Because the different objectives typically conflict, no single best solution exists; instead, the challenge becomes to retrieve a set of solutions providing a human decision maker with a set of diverse objective trade-offs (Viennet et al. 1996). Additionally, in real-world optimization applications, constraints are generally imposed on the candidate solutions, which may render individual solutions infeasible despite exhibiting favorable objective values (Deb et al. 2001). In structural design, such constraints take the form of requirements defined by structural norms and standards such as the Eurocodes (European Committee for Standardization 2005). Appropriate structural and load models are required to determine the design effects of actions and verify the associated design constraints in ultimate limit states (ULS) and serviceability limit states (SLS).

Mathematically, we are interested in solving the problem:

$$ \min_{\boldsymbol{x}} \left(f_{1}(\boldsymbol{x}), \dots, f_{m}(\boldsymbol{x})\right) $$
(1a)
$$ \text{s.t.} \quad g_{j}(\boldsymbol{x}) \leq 0, \quad j = 1, \dots, n, $$
(1b)
$$ \boldsymbol{x} \in \mathbb{R}^{D}, $$
(1c)

where \(f_{i}\), \(i = 1,\dots,m\), are objective functions, \(g_{j}\), \(j = 1,\dots,n\), are constraint functions, and the variable \(\boldsymbol{x}\) is a D-dimensional real-valued input. Given a set of multi-objective solutions retrieved by some kind of search procedure, the quality of individual solutions is often classified in terms of Pareto optimality. Again assuming minimization of m objectives subject to n constraints according to (1a), a feasible solution \(\boldsymbol{x}^{*}\) is said to be Pareto optimal, or equivalently, non-dominated, if there does not exist another feasible \(\boldsymbol{x}\) such that \(f_{i}(\boldsymbol{x}) \leq f_{i}(\boldsymbol{x}^{*})\) for all \(i \in \{1,2,\dots,m\}\) and \(f_{j}(\boldsymbol{x}) < f_{j}(\boldsymbol{x}^{*})\) for at least one j (Miettinen 2012).
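This dominance test translates directly into code; the following minimal sketch (our own helper functions, not tied to any particular library) applies to the minimization setting:

```python
import numpy as np

def dominates(f_a, f_b):
    """True if objective vector f_a Pareto-dominates f_b under minimization:
    no worse in every objective and strictly better in at least one."""
    f_a, f_b = np.asarray(f_a), np.asarray(f_b)
    return bool(np.all(f_a <= f_b) and np.any(f_a < f_b))

def non_dominated(F):
    """Indices of the Pareto-optimal (non-dominated) rows of F."""
    return [i for i, f in enumerate(F)
            if not any(dominates(g, f) for j, g in enumerate(F) if j != i)]

# Example: the middle point is dominated by the first one.
print(non_dominated([[1.0, 2.0], [2.0, 3.0], [3.0, 1.0]]))  # -> [0, 2]
```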

Structural optimization problems for reinforced concrete (RC) structures are characterized by their evaluation expensiveness. The verification of the constraints, in particular, is often associated with relatively expensive computations, involving structural analysis with 2D or 3D finite element (FE) models and possibly a large number of load combinations (e.g., in bridge design). Computing high-quality solutions in a feasible amount of time therefore requires an optimization process that is highly efficient in terms of constraint evaluations; hence, the number of applicable algorithms is severely limited.

Bayesian optimization has emerged as a capable approach to optimizing expensive functions by iteratively constructing a probabilistic surrogate model of the underlying target function, and has seen many successful applications (Snoek et al. 2012; Calandra et al. 2016; Imani and Ghoreishi 2020). An acquisition function translates the surrogate model into a metric reflecting the potential for improvement, which in turn guides the optimizer toward a promising new query point. Bayesian methods are designed to be efficient in terms of function evaluations, but in turn require extensive computational resources on the modelling side, which needs to be balanced against the computational time of the underlying function. Bayesian optimization has also been extended to constrained problems (Bernardo et al. 2011; Snoek 2013; Gelbart et al. 2014) and to multi-objective settings (Zuluaga et al. 2013).
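As a minimal, self-contained illustration of this loop, the sketch below optimizes a toy one-dimensional function with expected improvement, using scikit-learn's Gaussian process as a generic stand-in; nothing in it is specific to the algorithm examined later in this paper:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def expected_improvement(X_cand, gp, f_best):
    """EI for minimization under a GP surrogate."""
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-12)
    z = (f_best - mu) / sigma
    return (f_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def f(x):  # toy stand-in for an expensive black-box function
    return np.sin(3 * x) + 0.5 * x**2

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(5, 1))            # initial design
y = f(X).ravel()
for _ in range(20):                            # one expensive evaluation per iteration
    gp = GaussianProcessRegressor(Matern(nu=1.5), normalize_y=True).fit(X, y)
    X_cand = rng.uniform(-2, 2, size=(512, 1)) # cheap candidate pool for the acquisition
    x_next = X_cand[np.argmax(expected_improvement(X_cand, gp, y.min()))]
    X, y = np.vstack([X, x_next]), np.append(y, f(x_next))
print("best value found:", y.min())
```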

Structural optimization of RC structures has been examined previously. Various evolutionary algorithms have been studied (Jahjouh et al. 2013; Mergos and Mantoglou 2020), however without explicitly dealing with the expensiveness aspect of the problem. To deal with expensiveness, Penadés-Plà et al. (2019) suggest fitting a constraint-weighted surrogate model to prior evaluations sampled from a Latin hypercube, after which the surrogate is optimized in order to select the final design. In contrast to this approach, Bayesian optimization constructs the surrogate models sequentially, utilizing information from prior evaluations at each iteration while guiding the search procedure, which should decrease the sensitivity to poor initialization.

The aim of this paper is to study the applicability and efficiency of a state-of-the-art constrained Bayesian optimization algorithm on a generic structural design case. The benchmark problem considered consists of structurally optimizing a doubly reinforced concrete beam over a number of design parameters (i.e., dimensions, reinforcement layout, material type) with respect to objective functions covering economic, environmental, and social factors, as well as buildability and performance aspects of the design options, while fulfilling structural design constraints according to design codes. Two other optimization algorithms are applied for comparison: the widely used NSGA-II (Deb et al. 2002), a genetic algorithm specifically targeting multi-objective problems, and a slightly modified random search algorithm.

2 Method

2.1 Design case definition

The structural design case considered in this work is that of an RC beam, as illustrated in Fig. 1. The beam is simply supported with an effective span length (\(L_{\text{span}}\)) of 14 m. It is designed for two uniformly distributed loads: a characteristic permanent load \(g_{\text{k}} = 12\) kN/m \(+\, g_{\text{sw}}\), where \(g_{\text{sw}}\) corresponds to the self-weight of the beam per unit length (assuming densities of 24 kN/m³ and 78.5 kN/m³ for concrete and steel, respectively), and a characteristic variable load \(q_{\text{k}} = 18\) kN/m. This design case was chosen for its simplicity and because the structural design process for an RC beam constitutes the basis of many more complex common structural engineering cases (e.g., design of parts of buildings' structural systems, of RC beam bridges, of RC waling beams for support of sheet pile walls). The design parameters considered and their possible values, detailed in Table 1, generate a design space consisting of more than 186 million possible configurations.

Fig. 1: Drawing of the RC beam considered as design case with indication of the index of the design parameters (see Table 1)

Table 1 Set of design parameters and possible values

2.2 Structural design

The structural analysis was conducted in accordance with Eurocode 2 (European Committee for Standardization 2004), considering bending and shear in ULS and deflection control in SLS. Bending resistance is verified using the parabola-rectangle diagram for concrete in compression and the design stress-strain diagram with an inclined top branch for reinforcing steel in tension and compression. The resulting design constraints are described in Table 2. Crack control and anchorage of longitudinal reinforcement are not included in this study.

Table 2 Set of design constraints

2.3 Evaluation of sustainability, performance, and buildability

The objective functions taken into account in the evaluation of a design choice are described in Table 3. These were carefully chosen to address the aim of this work, while being relevant to building or civil engineering design cases regardless of their complexity. All objective functions are to be minimized in the optimization process. Unit values used in the evaluation of the different objectives are described in Appendix 1.

Table 3 Set of objective functions considered

The first objective function, f1(x), corresponds to the total production cost, C, which is estimated as the sum of the cost of materials, Cm, and the cost of labor, Cw, for the related construction works:

$$f_{1}(\boldsymbol{x}) := C(\boldsymbol{x}) = C_{\text{m}}(\boldsymbol{x})+C_{\text{w}}(\boldsymbol{x}). $$
(2)

The material cost, Cm, is equal to the unit price, pm,i, of each material i (i.e., concrete, steel, and formwork) times the respective material quantity, Qm,i. That is:

$$C_{\text{m}}(\boldsymbol{x}) = \sum\limits_{i} p_{\text{m,i}} \cdot Q_{\text{m,i}}(\boldsymbol{x}), $$
(3)

and the cost of construction works, Cw, is equal to the unit labor cost, pw, times the time required for construction work, Tw:

$$C_{\text{w}}(\boldsymbol{x}) = p_{\text{w}} \cdot T_{\text{w}}(\boldsymbol{x}). $$
(4)

The time of construction works, Tw, which also corresponds to the third objective function, f3(x), is calculated according to (5). It is the sum of the unit time for the construction activity associated with each material times the respective material quantity:

$$f_{3}(\boldsymbol{x}) := T_{\text{w}}(\boldsymbol{x}) = \sum\limits_{i} t_{\text{m,i}} \cdot Q_{\text{m,i}}(\boldsymbol{x}), $$
(5)

where tm,i refers to the unit construction time for the activities related to each material.
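Equations (2)-(5) amount to simple inner products between unit values and material quantities. The sketch below illustrates this; all numerical values are illustrative placeholders, the actual unit values being given in Appendix 1:

```python
# Sketch of objectives f1 (total cost) and f3 (construction time) from (2)-(5).
# Unit prices, unit times, and the quantity take-off are hypothetical placeholders.
UNIT_PRICE = {"concrete": 95.0, "steel": 0.9, "formwork": 50.0}   # p_m,i per unit quantity
UNIT_TIME  = {"concrete": 0.35, "steel": 0.012, "formwork": 0.45} # t_m,i per unit quantity
LABOR_RATE = 40.0                                                 # p_w per unit time

def objectives_f1_f3(quantities):
    """quantities: material quantities Q_m,i(x) for a given design x,
    e.g. {'concrete': m3, 'steel': kg, 'formwork': m2}."""
    material_cost = sum(UNIT_PRICE[i] * q for i, q in quantities.items())  # C_m, eq. (3)
    work_time     = sum(UNIT_TIME[i]  * q for i, q in quantities.items())  # T_w = f3, eq. (5)
    labor_cost    = LABOR_RATE * work_time                                 # C_w, eq. (4)
    return material_cost + labor_cost, work_time                           # f1 = C, eq. (2)

f1, f3 = objectives_f1_f3({"concrete": 4.2, "steel": 610.0, "formwork": 33.6})
```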

In addition to the construction time, another objective function related to buildability is considered: the height of the beam, f4(x). This choice is also motivated by the interest, in this work, in using an objective function that is directly equal to one of the design parameters and strongly conflicts with some of the others.

The second objective function, f2(x), corresponds to the aggregated life cycle analysis (LCA) result for the 16 impact categories using the normalization and weighting factors defined in the European Product Environmental Footprint (Sala et al. 2017, 2018) considering the life cycle stages A1–A3 (material production) and A4 (transportation to the site) (European Committee for Standardization 2019). Impact categories in both the environmental dimension (e.g., climate change, ozone depletion, acidification) and the social dimension (e.g., water use, human health) are covered.

Finally, the fifth objective function, f5(x), is equal to the maximum deflection of the beam at mid-span, dmax. Although the maximum deflection is limited by design codes, which results in a design constraint, it is interesting to include it as an objective as well, as there may be reasons, when it comes to performance or safety, to optimize design options beyond the design requirements set in the codes.

2.4 Formalizing the problem

In this paper, we consider the problem of structurally optimizing a RC beam over the 8 design parameters (\(x_{1}, \dots, x_{8}\)) specified in Table 1 with respect to the 5 objective functions (\(f_{1}, \dots, f_{5}\)) described in Table 3. Furthermore, each design candidate must satisfy three (g1, g2, g3) of the four constraints in Table 2 during the structural analysis procedure described in Section 2.2. The remaining constraint g4, directly corresponding to objective f5, is left out for implementation reasons.

As the design parameters are either discrete or categorical (but of ordinal type), each parameter value is represented by a non-negative integer index into its discrete domain. The allowed input space in (1c) hence becomes \(\boldsymbol{x} \in {\mathscr{B}}^{8} \subset \mathbb{Z}_{\geq 0}^{8}\), where \({\mathscr{B}}^{8}\) is an 8-dimensional integer-valued hypercube, with the length of each side given by the number of allowed values per design parameter in Table 1.
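A minimal sketch of this index encoding is given below; the domains shown are hypothetical placeholders for the actual ones listed in Table 1:

```python
# Each design parameter takes a non-negative integer index into its list of
# allowed values. The lists below are illustrative placeholders (see Table 1).
DOMAINS = [
    [300, 350, 400, 450],        # x1: e.g. beam width values [mm] (hypothetical)
    list(range(500, 1600, 50)),  # x2: e.g. beam height values [mm] (hypothetical)
    # ... one list per design parameter, 8 in total
]

def decode(x_index):
    """Map an integer index vector in B^8 to the physical parameter values."""
    return [domain[i] for domain, i in zip(DOMAINS, x_index)]

print(decode([1, 3]))  # -> [350, 650] for the placeholder domains above
```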

For the objectives, all functions are to be minimized and no modifications of the problem specified in (1a) are needed, with m = 5.

As previously stated, the three constraint functions g1, g2, and g3 in Table 2 are considered. Furthermore, the routine for evaluating the beam constraints imposes a conditional structure on the constraint evaluations, where g2 and g3 may only be evaluated if g1 is satisfied.

With these modifications, the problem in (1a) is restated as:

$$ \min_{\boldsymbol{x}} \left(f_{1}(\boldsymbol{x}), \dots, f_{5}(\boldsymbol{x})\right) $$
(6a)
$$ \text{s.t.} \quad g_{j}(\boldsymbol{x}) \leq 0, \quad j = 1, \dots, 3, $$
(6b)
$$ \boldsymbol{x} \in \mathcal{B}^{8} \subset \mathbb{Z}_{\geq 0}^{8}, $$
(6c)

which defines the optimization problem that is examined. The constraint functions gj in (6b) are assumed to be prohibitively expensive to evaluate due to underlying numerical computations. In contrast, the objective functions in (6a) are assumed to have known analytical expressions that allow for cheap evaluation.

In order to determine the quality of a solution set in multi-objective problems, the hypervolume indicator is commonly employed.

Given a finite set of t objective points, \(\mathcal{F} = \{\boldsymbol{f}^{(1)},\dots, \boldsymbol{f}^{(t)}\} \subset \mathbb{R}^{m}\), the hypervolume indicator (Zitzler and Thiele 1999), denoted HV, measures the volume of the subspace dominated by \(\mathcal{F}\), given a reference point \(\boldsymbol{r} \in \mathbb{R}^{m}\) bounding it from above:

$$ \text{HV}(\mathcal{F}, \boldsymbol{r}) := \lambda_{m}\left(\bigcup_{\boldsymbol{f} \in \mathcal{F}}[\boldsymbol{f}, \boldsymbol{r}]\right), $$
(7)

where \(\lambda_{m}\) is the Lebesgue measure on \(\mathbb{R}^{m}\).
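As a toy example, the hypervolume of a small two-objective solution set can be computed with the pygmo library (Biscani and Izzo 2019), the implementation also used later in this work:

```python
import pygmo as pg

# Objective vectors (minimization) and a reference point bounding them from above.
F = [[0.2, 0.7], [0.5, 0.4], [0.8, 0.1]]
r = [1.0, 1.0]

hv = pg.hypervolume(F).compute(r)  # Lebesgue measure of the region dominated by F, eq. (7)
print(hv)                          # 0.45 for this toy front
```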

Typically, the objective values are normalized to \([0,1]^{m}\), so that all objective functions are emphasized equally when working with the hypervolume indicator. Furthermore, the reference point \(\boldsymbol{r} = \boldsymbol{1}_{m}\) can then be used. This, however, implies that the minimum and maximum values of each unconstrained objective function are known beforehand, which may be reasonable considering that the evaluations of the objective functions are cheap. If this is the case, objective values can be normalized by:

$$ \hat{f}_{i}(\boldsymbol{x}) := \frac{f_{i}(\boldsymbol{x}) - f_{i*}}{f_{i}^{*} - f_{i*}}, \quad i = 1,\dots,m, $$
(8)

where \(f_{i}^{*} = \max_{\boldsymbol{x} \in {\mathscr{B}}^{8}} f_{i}(\boldsymbol{x})\) and \(f_{i*} = \min_{\boldsymbol{x} \in {\mathscr{B}}^{8}} f_{i}(\boldsymbol{x})\). If these values are not known beforehand, the normalized objective values may be recomputed from (8) whenever a new minimum or maximum is encountered.
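A small sketch of this normalization with running updates of the constants is given below (a hypothetical helper class, not part of any of the libraries used in this work):

```python
import numpy as np

class ObjectiveNormalizer:
    """Min-max normalization of eq. (8), tracking running minima/maxima for the
    case where f_i* and f_{i*} are not known beforehand."""
    def __init__(self, m):
        self.f_min = np.full(m,  np.inf)
        self.f_max = np.full(m, -np.inf)

    def update(self, f):
        """Update the per-objective extrema as a new evaluation arrives."""
        self.f_min = np.minimum(self.f_min, f)
        self.f_max = np.maximum(self.f_max, f)

    def normalize(self, f):
        """Map an objective vector to [0,1]^m via eq. (8)."""
        return (np.asarray(f) - self.f_min) / (self.f_max - self.f_min)
```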

2.5 Examined algorithms

Three different optimization algorithms are employed to solve the problem defined in (6), as introduced here. Additional details on the implementation are provided in Appendix 2.

2.5.1 Bayesian algorithm

Gelbart et al. (2014) proposed a Bayesian optimization approach to expensive optimization under unknown constraints. To solve a single-objective optimization problem with n unknown constraints, the authors suggested using a separate Gaussian process (see Rasmussen 2004) surrogate model for the objective and for each constraint, while employing a constraint-weighted acquisition function of the form:

$$a(\boldsymbol{x}) := \text{EI}(\boldsymbol{x})\prod\limits_{j=1}^{n} \text{Pr}(-g_{j}^{\text{GP}} (\boldsymbol{x}) \geq 0), $$
(9)

where EI is the standard expected improvement acquisition function, Pr denotes the probability of constraint satisfaction under the surrogate, and \(g_{j}^{\text{GP}}\) is the Gaussian process surrogate model of constraint function gj. In this work, this method is expanded upon to handle the multi-objective setting with cheap objective evaluations.

Considering the cheapness of the objective functions, EI in (9) may be replaced with the actual improvement I, i.e., I(x) = fbest − f(x), as there is no need to construct surrogates for non-expensive functions.

We now wish to extend (9) to the multi-objective setting. By using the hypervolume indicator definition as stated in (7), the hypervolume improvement may be defined as:

$$\text{HVI}(\boldsymbol{f}(\boldsymbol{x}); \mathcal{F}, \boldsymbol{r}) := \text{HV}(\{ \boldsymbol{f}(\boldsymbol{x}) \} \cup \mathcal{F}, \boldsymbol{r}) - \text{HV}(\mathcal{F}, \boldsymbol{r}). $$
(10)

Finally, the acquisition function of (9) is restated as:

$$ a(\boldsymbol{x}) := \text{HVI}(\boldsymbol{f}(\boldsymbol{x}); \mathcal{F}, \boldsymbol{r}) \prod\limits_{j=1}^{n} \text{Pr}(-g_{j}^{\text{GP}}(\boldsymbol{x}) \geq 0), $$
(11)

now adapted to the setting of cheap and multiple objectives, as in problem (6). A flowchart of the proposed Bayesian algorithm is presented in Fig. 2.

Fig. 2: Flowchart of the proposed Bayesian algorithm
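The acquisition function in (11) can be sketched as follows, assuming normalized objectives, a current front \(\mathcal{F}\), and constraint surrogates exposing a scikit-learn-style predict returning the predictive mean and standard deviation (the implementation in this work uses pyGPGO):

```python
import numpy as np
import pygmo as pg
from scipy.stats import norm

def acquisition(x, objectives, front, r, constraint_gps):
    """Sketch of eq. (11): hypervolume improvement of the cheap, exactly evaluated
    (normalized) objective vector, weighted by the GP probability of feasibility of
    each expensive constraint. Assumes objective vectors lie in the normalized unit
    cube so that r = 1_m bounds them from above."""
    f = objectives(x)                                  # cheap, exact evaluation, cf. eq. (8)
    hv_old = pg.hypervolume(front).compute(r)
    hv_new = pg.hypervolume(front + [list(f)]).compute(r)
    hvi = hv_new - hv_old                              # eq. (10); zero if f is dominated
    p_feasible = 1.0
    for gp in constraint_gps:                          # Pr(g_j(x) <= 0) under each surrogate
        mu, sigma = gp.predict(np.atleast_2d(x), return_std=True)
        p_feasible *= norm.cdf(-mu[0] / max(sigma[0], 1e-12))
    return hvi * p_feasible
```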

A common procedure to optimize the acquisition function in Bayesian optimization is to start from a set of randomly initialized points and thereafter apply local search. However, a by-product of the definition of HVI (10) is that, as soon as a decent estimate of the Pareto front is found, the HVI equals zero for many x. This risks trapping the search algorithm in such flat regions. In order to alleviate this problem, we implement a pattern search (Torczon 1997) algorithm including a tie-breaking strategy for inputs with zero HVI (see Appendix 2 for details).
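The following is a loose sketch of such a pattern search on the integer lattice; the actual polling pattern and tie-breaking rule are those described in Appendix 2 and may differ, and the tiebreak argument here is a hypothetical scoring function (e.g., predicted probability of feasibility):

```python
import numpy as np

def pattern_search(acq, x0, domain_sizes, budget=5000, tiebreak=None, step0=4):
    """Coordinate pattern search in the spirit of Torczon (1997), maximizing acq
    on the integer lattice. `tiebreak` optionally scores inputs whose acquisition
    values tie (e.g., in all-zero HVI regions)."""
    x, best = np.array(x0), acq(x0)
    step, evals = step0, 1
    while step >= 1 and evals < budget:
        improved = False
        for d in range(len(x)):                 # poll +/- step along each coordinate
            for sign in (+1, -1):
                y = x.copy()
                y[d] = np.clip(y[d] + sign * step, 0, domain_sizes[d] - 1)
                a = acq(y); evals += 1
                better = a > best or (a == best and tiebreak is not None
                                      and tiebreak(y) > tiebreak(x))
                if better:
                    x, best, improved = y, a, True
        if not improved:
            step //= 2                          # contract the pattern on failure
    return x
```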

2.5.2 NSGA-II algorithm

As a benchmark to the Bayesian algorithm, NSGA-II (Deb et al. 2002), a standard genetic algorithm often employed in multi-objective optimization, is included in the study.

In the NSGA-II algorithm, each solution candidate (i.e., input) is viewed as an individual with a set of properties (a specific design choice), which may be manipulated by mathematical operators representing biological phenomena such as selection, reproduction (crossover), and mutation.

The performance of the NSGA-II algorithm has been shown to depend strongly on its hyperparameter configuration (Andersson et al. 2015), such as the population size, crossover rate, and mutation rate parameters, implying that some kind of tuning procedure should be performed in practical applications.

2.5.3 Random search algorithm

Finally, a randomized search algorithm is also included as a baseline. Random search optimization works by randomly sampling the input space and evaluating the solutions in sequence. Random search is considered a naive strategy, as no information from the previous search history is utilized in selecting the next evaluation points. It has, however, been shown that random search generally compares favorably to grid search (Bergstra and Bengio 2012), i.e., sweeping parameters over a manually specified input subset, especially in situations where only a small number of dimensions has a strong influence on the objective function(s).

As the Bayesian algorithm incorporates mechanisms to avoid evaluating inputs with non-positive HVI, the random search algorithm is slightly modified to avoid wasting evaluations on redundant inputs. For each new evaluation, the objective values of a randomly sampled input are evaluated. If the set of objective values does not yield a positive improvement, the input is re-sampled and the process repeats. Once an input with positive improvement in the objectives is found, the constraint functions are evaluated.
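A sketch of this modified random search is given below; objectives, constraints, hvi, and sample are placeholder callables wrapping the beam evaluation routines:

```python
def modified_random_search(objectives, constraints, hvi, sample, budget):
    """Cheap objectives are screened first; the expensive constraints are evaluated
    only for inputs whose objective vector yields a positive hypervolume
    improvement (eq. (10)) against the current front."""
    archive = []
    for _ in range(budget):                  # one expensive constraint evaluation per loop
        while True:
            x = sample()                     # uniform draw from the input domain (6c)
            f = objectives(x)                # cheap evaluation
            if hvi(f) > 0:                   # skip redundant (dominated) inputs
                break
        g = constraints(x)                   # expensive evaluation
        if all(gj <= 0 for gj in g):
            archive.append((x, f))
    return archive
```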

2.5.4 Experimental setup

The three algorithms are examined by running 25 independent optimization runs with an allowed budget of 1000 expensive constraint evaluations.

For the Bayesian algorithm, Gaussian process surrogates were used for the constraint function modeling, using the Matérn 3/2 kernel (Rasmussen 2004) with a single length-scale parameter. This kernel is a sensible default given the assumption that no prior information exists about the constraint functions being modeled. The hyperparameters of the Gaussian process are determined by maximum likelihood estimation, optimized by the L-BFGS-B algorithm (Byrd et al. 1995). Input parameter bounds are normalized to the \([0,1]^{8}\) hypercube. For constraint evaluations where g1 is not fulfilled, implying that g2 and g3 are not available at this input, the corresponding surrogate models are not updated.
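A sketch of the surrogate setup, with scikit-learn as a stand-in for the pyGPGO implementation used in this work (scikit-learn also defaults to L-BFGS-B for the maximum likelihood fit):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def fit_constraint_gp(X, g, domain_sizes):
    """X: integer-encoded inputs, one row per evaluated design; g: observed values
    of one constraint. Only evaluations where the constraint is defined (e.g. g2, g3
    when g1 holds) should be included."""
    X01 = X / (np.asarray(domain_sizes) - 1)  # map indices to the [0,1]^8 hypercube
    # Matern(nu=1.5) is the Matérn 3/2 kernel with a single isotropic length scale;
    # its hyperparameters are fitted by maximum likelihood (L-BFGS-B by default).
    gp = GaussianProcessRegressor(kernel=Matern(nu=1.5), normalize_y=True)
    return gp.fit(X01, g)
```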

The simple structure of the objective functions allows the minimum and maximum values of each objective to be discovered by any suitable optimization procedure, after which the normalization strategy of (8) is employed.

In each iteration, the acquisition function in (11) is optimized by the pattern search algorithm including the tie-breaking strategy described in Appendix 2, using a budget of 5000 acquisition function evaluations.

For the NSGA-II algorithm, a tuning procedure is first performed in an attempt to mitigate the risk of choosing poor hyperparameters, varying the population size, crossover rate, and mutation rate according to the values stated in Table 4. Each of the 27 configurations is run 10 times with seeded initialization. From these results, the highest performing configuration in terms of final hypervolume is selected and then evaluated alongside the Bayesian algorithm and the random search algorithm. For inputs where constraint evaluations are undefined, that is, g2 and g3 at inputs where g1 is not fulfilled, the NSGA-II algorithm is provided with the value 0.0 for both constraint functions. This is motivated by the internal sorting procedure included in NSGA-II, where multiple constraints are summed to quantify the degree of infeasibility of a solution; a zero value hence avoids shifting this measure in any direction.

Table 4 Examined NSGA-II hyperparameter values
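A sketch of this setup using the Platypus library is given below; the domain sizes and evaluation functions are placeholder stand-ins for the beam routines, and the tuned crossover and mutation rates, which Platypus configures through a variator object, are omitted:

```python
from platypus import NSGAII, Problem, Integer

DOMAIN_SIZES = [4, 22, 3, 8, 5, 6, 7, 9]          # hypothetical; see Table 1

def structural_g(x, j):                            # stand-in for the expensive analysis
    return -1.0

def evaluate(x):
    f = [float(sum(x))] * 5                        # stand-in for the 5 cheap objectives
    g1 = structural_g(x, 1)
    if g1 > 0:                                     # g1 violated: g2, g3 are undefined,
        return f, [g1, 0.0, 0.0]                   # so the neutral value 0.0 is supplied
    return f, [g1, structural_g(x, 2), structural_g(x, 3)]

problem = Problem(8, 5, 3)                         # 8 variables, 5 objectives, 3 constraints
problem.types[:] = [Integer(0, n - 1) for n in DOMAIN_SIZES]
problem.constraints[:] = "<=0"                     # feasibility means g_j(x) <= 0
problem.function = evaluate
NSGAII(problem, population_size=50).run(1000)      # budget of 1000 evaluations
```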

For the random search algorithm, no modifications are needed.

In each of the 25 runs, the algorithms are provided with 100 initial evaluations, including both objective and constraint functions, with inputs generated by sampling from a seeded uniform distribution over the input domain defined in (6c). The initialization inputs are shared by all algorithms in each individual run, with the exception of NSGA-II configurations with the population size set to 50, which use only the first 50 initialization inputs.

All algorithms are implemented in Python 3. The hypervolume computations are carried out using the pygmo-library (Biscani and Izzo 2019). For the constraint surrogate models, the Bayesian algorithm uses the default implementations provided in the pyGPGO-library (Jiménez and Ginebra 2017). The NSGA-II algorithm uses the default implementation provided in the Platypus-library (Hadka 2015).

3 Results and discussion

The performance of the three examined algorithms has been compared in terms of rate of improvement and final-evaluation normalized hypervolume.

As the reference point, the all-ones vector \(\boldsymbol{r} = \boldsymbol{1}_{m}\) is used, which implies that the hypervolume is bounded to [0,1], with larger values indicating higher quality solution sets.

In the tuning procedure for the NSGA-II algorithm, a configuration with population size 50, crossover rate 0.6, and mutation rate 0.1 was seen to exhibit the best results in terms of final hypervolume. Hereafter, NSGA-II refers to this configuration, which is compared alongside the other two algorithms.

For each algorithm, Fig. 3 shows the 25-run hypervolume mean as a function of constraint evaluation, while Table 5 lists the mean hypervolume obtained after the final evaluation.

Fig. 3: Normalized hypervolume improvement as a function of constraint evaluation for the three examined algorithms, averaged over 25 runs with seeded initialization. Solid lines indicate the average, and upper/lower limits of the shaded areas represent ±1 standard deviation from the average. For the Bayesian and the random search algorithms, each run is initialized with 100 inputs generated by seeded randomized sampling (marked by the dashed black line). For the NSGA-II algorithm, only the first 50 inputs per seed (marked by the gray dashed line) are used due to its population-size hyperparameter being set to 50

Table 5 Mean normalized hypervolume across 25 runs per algorithm at the final constraint evaluation

As shown in Fig. 3, the best performance is by far achieved by the Bayesian algorithm, followed by the NSGA-II algorithm. After the random initialization phase, the Bayesian algorithm quickly improves its hypervolume. Already at constraint evaluation 150, the Bayesian algorithm has surpassed the end-of-run results of both the NSGA-II and random search algorithms. From around evaluation 400, the Bayesian algorithm improves only at a very small rate, barely noticeable on the linearly scaled plot. It should also be noted that the Bayesian algorithm exhibits the lowest variance across runs among the algorithms, which is relevant to practical applications in the expensive setting, where multiple runs generally cannot be afforded. Using the end-of-run final hypervolume means listed in Table 5, the Bayesian algorithm yields a 72.8% larger volume than the random search algorithm.

In Fig. 3, the NSGA-II algorithm is seen to distinctly outperform the random search algorithm from approximately constraint evaluation 700 onwards, displaying no overlap with random search within ±1 standard deviation. As displayed in Table 5, it achieves an approximately 39.7% increase in final hypervolume with respect to the random search result, however with a larger variance compared to the Bayesian algorithm. It should be pointed out that the NSGA-II algorithm was subjected to a hyperparameter tuning procedure prior to the algorithm evaluation, which is not desirable for practical applications in expensive constraint evaluation settings.

Finally, the random search algorithm exhibits a comparatively low rate of improvement and displays the largest variance of all algorithms at the final constraint evaluation.

In order to demonstrate the applicability of the proposed approach in a more practical sense, the results are analyzed in more detail for the two objective functions subjectively deemed the most important: the production cost f1 and the aggregated LCA result f2. Figure 4 displays the complete set of solutions belonging to the Pareto front spanning the final hypervolume, projected onto the cost/LCA subspace, for the best run per algorithm. In Fig. 4, the solutions along the estimated 2D dominating front of the Bayesian algorithm can be seen to dominate the solutions of the other two algorithms.

Fig. 4: Set of algorithm solutions projected to the objective dimensions f1 and f2, for the best (highest final hypervolume indicator) run per algorithm. Opaque diamond markers represent solutions on the estimated 2D Pareto front while faded cross markers represent the remaining solutions belonging to the estimated 5D front

It can also be observed that the whole set of Pareto solutions (i.e., the estimated 5D Pareto front) obtained by the Bayesian algorithm appears to cluster in a better region (lower left corner in Fig. 4) of the subspace, compared with the solutions of the NSGA-II and random search algorithms, which show more spread. This, together with the previous observations on the respective performance of the algorithms, indicates that the Bayesian algorithm is capable of finding higher quality trade-offs among the five objectives.

A more detailed view of the solutions on the estimated 2D Pareto fronts of Fig. 4 is shown in Fig. 5.

Fig. 5: Scatter plot of the estimated 2D Pareto front after projection to the f1 and f2 subspace, for the best (highest final hypervolume indicator) run per algorithm. Three particular solutions yielded by the Bayesian algorithm are annotated with the solution input variables x1-x8 and objectives f1-f5

Among the three annotated solution points produced by the Bayesian algorithm, representing the attained trade-offs between objectives f1 and f2, we notice that the values of the input parameters x5-x8 remain fixed; variability in this subspace hence seems to originate only from the parameters x1-x4. Also, considering the objectives f3-f5, we observe some variability among the solutions, in that the objectives f3 and f4 seem negatively (positively) correlated with f2 (f1), while f5 exhibits the opposite behavior.

However, as evident in Fig. 4, many additional solutions are found in the near vicinity of the annotated 2D dominating solutions, and these may offer more desirable trade-offs in the full 5D objective space at an apparently very low penalty in the (f1, f2) subspace. In a practical setting, where we assume that more than two objectives are of interest, alternative visualization techniques, e.g., parallel-coordinate plots (Inselberg 1985), could be employed.

Such visualization of the results, preferably including interactive features, is particularly important in multi-objective optimization, as it can provide valuable insights for a designer or a decision maker when selecting among different designs.

One advantage of employing Bayesian algorithms in general is that prior domain knowledge may be incorporated directly by specifying the hyperparameters. For instance, if the order of differentiability of the target function (here, a constraint) is known, this prior information could guide the selection of the Gaussian process kernel. It should be noted that, in this work, no prior knowledge or tuning procedure was used in the selection of the Gaussian process hyperparameters. Another advantage is that, once the optimization is completed, one callable probabilistic surrogate model per constraint has been constructed over the input domain. This provides the possibility to retrieve estimates of both constraint values and corresponding uncertainties for new inputs, providing information about constraint behavior and unexplored regions.

In order to tackle the multi-objective setting, we have used a normalized hypervolume approach, which has the advantage that the objective importance does not need to be quantified beforehand. Furthermore, the iterative nature of the proposed Bayesian algorithm allows the HVI factor in (11) to be modified during optimization (e.g., by rescaling the objective normalization constants, which govern the relative importance of each objective in terms of hypervolume contribution). This allows a decision maker to influence the search procedure dynamically according to his or her preferences, without requiring rework from the designer.

The good performance attained by the proposed Bayesian algorithm in this work, together with the capabilities demonstrated in previous works, leads us to believe that this approach has a large potential to generalize to more complex problem settings. For instance, problems involving FE computations (e.g., a 3D model of a bridge) could be expected to exhibit similar properties to those of the RC beam problem, while further pronouncing evaluation expensiveness.

The proposed algorithm's sensitivity to increasing problem dimensionality remains to be explored. While increased dimensionality is expected to increase problem difficulty for virtually any algorithm, Bayesian methods are, in general, especially sensitive to the curse of dimensionality. Previous works have tackled this issue by learning subspace projections that reduce the effective dimensionality (Moriconi et al. 2019), or by decreasing time complexity through sparse approximation methods, in turn allowing for a larger number of evaluations (Quiñonero-Candela and Rasmussen 2005). The potential benefits of pursuing such approaches would depend on the actual problem considered.

The approach of this paper to estimate the entire Pareto front will become intractable at some point as the number of objectives increases. In such a case, one may have to resort to actively incorporating the decision makers into the process to learn their preferences online; see, e.g., Astudillo and Frazier (2020). However, in most use cases for structural design, the actual number of objectives is usually fairly low.

In the explored case of RC beam optimization, the imposed constraints included the number of bar layers and the bending and shear resistances. Naturally, some correlation can be expected among these constraints. A further improvement on the Bayesian algorithm proposed in this paper could hence be to include correlation among the surrogate models, as has been previously suggested (Shah and Ghahramani 2016).

Wu et al. (2019) propose a method for multi-fidelity Bayesian optimization. The multi-fidelity setting has interesting parallels to structural optimization in that FE simulations of constraints could be evaluated using a variable mesh resolution, hence introducing multiple noise levels of the constraint functions; this should be explored in future work.

Finally, in the problem setting at hand, all input parameters were of ordinal type. In practice, design cases could of course include categorical parameters (e.g., material types or types of cross sections, such as I-beam or T-beam), or even conditional ones, such as dimensions dependent on the choice of cross section. Bayesian optimization has previously been successfully extended to such input domains in the context of hyperparameter optimization (Sjöberg et al. 2019), and we expect those results to be transferable.

4 Conclusions

In this paper, a state-of-the-art constrained Bayesian optimization algorithm has been adapted to a RC beam optimization problem incorporating multiple objectives. In contrast to previous approaches, the Bayesian algorithm proposed in this work explicitly exploits the fact that the objective functions are cheap to evaluate while the constraint functions, which involve expensive numerical computations, are not; these are common characteristics of structural engineering design problems.

The efficiency and applicability of the Bayesian algorithm were studied on a benchmark problem consisting of optimizing a RC beam over eight design parameters with respect to five objective functions covering economic, environmental and social, buildability, and performance aspects. Additionally, design constraints were imposed to ensure that the configuration of the beam could be built and that it satisfied the required bending and shear capacities according to structural design codes. It was shown that the Bayesian algorithm could effectively find high-quality estimated Pareto sets for the considered benchmark problem using a very limited number of constraint function evaluations, and that it outperformed the NSGA-II and random search algorithms by a large margin.

The good performance exhibited by the Bayesian algorithm opens up possibilities for its application to more complex cases in the field of structural engineering. The limited number of function evaluations required is particularly interesting for enabling structural calculations with finite element analysis software, whose computational cost usually limits the number of possible evaluations and hinders its application in structural design optimization problems.