1 Introduction

A bioeconomy is “an economy where the basic building blocks for materials, chemicals and energy are derived from renewable biological resources” [1]. The need and desire for a transition from a fossil-based economy to a bioeconomy is well established at the European [2, 3], Nordic [4], and Norwegian level [5, 6]. Overall sustainability and a holistic view of resource optimization across sectors are emphasized by a Nordic Innovation Report [7]: “bioeconomic innovations target resource-efficient use of valuable bioresources”. Given this emphasis on the sustainable use of biological resources, it is likely that land use management will become a very important issue in the future. In this context, it is important to keep in mind that plant-based resources are linked to a specific geographic location. This location may vary in a range of aspects, affecting for example the quality, growth rate, renewal rate, and accessibility of the resources. Additionally, time can also create variability, all of which affects resource exploitation.

This paper deals with the question of how to locate and design multisectoral industrial clusters in, and closely linked to, the bioeconomy. The goal is to maximize the cluster’s collective efficiency while at the same time providing economic, environmental, and social benefits, which are expected and desired outcomes of the bioeconomic transition. This offers a cross-sector perspective on renewable resources, in line with both national and international strategies.

The concept of industrial clusters [8] has its roots in the field of Industrial Ecology. This concept has a holistic view of the industrial system, and emphasis on the biological and material flows within and outside the system as basic characteristics. In this paper, we define a cluster as a composition of industrial facilities that are geographically located in the same area or within a defined close distance from each other. A bioeconomic cluster is in turn a cluster in which the facilities involved mainly use renewable biological resources. Padmore and Gibson [9] developed quantitative indicators to evaluate clusters based on the type of industries, markets, and resources present in and/or around such locations. In this work, we mainly follow Raymond and Cohen-Rosenthal [8] in their main argument that the better use of resources will contribute to the bioeconomic transition by reducing emissions and waste, while increasing profitability.

The literature has the common assumption that ideally, transportation (a) within and (b) outside a cluster should be minimal. From the definition we adopted, “(a)” follows logically, whereas for “(b),” it is important that the cluster is located near places where biological, human, and energy resources are available. The volume, quality, composition, and access of biological resources vary across space, and the transport of biomass involves equally varying cost and time. Furthermore, “when focusing on innovation in the area of bioeconomy, it must be taken into account that biobased raw materials are a highly local issue, and best competence on understanding the local raw materials and ecosystems is often local” [7].

Based on the above, how to best locate a cluster for access to resources and external markets is a geographical site selection problem where the decision involves explicit spatial alternatives, the location of clusters. A combination of methods is proposed to capture both the geographic and economic factors involved. The problem to be solved can be formulated as a search for the spatial unit(s) that maximize(s) the overall benefit from using renewable resources (supply, availability, sustainability, usability, and access to markets for biomass and related resources), with minimum transport costs and environmental side effects.

In this paper, we have used a combination of Geographic Information Systems (GIS), Multicriteria Decision Analysis (MCDA) and Operations Research (OR) methods to select the most suitable location and composition of bioeconomic industrial clusters within a geographic region. These tools have frequently been employed separately. What this study aims to contribute to the field is a combined methodology that will improve decision-making in the context of bioeconomic cluster planning. This combination adds valuable pieces compared with what can be achieved by each of the methods independently. We illustrate the method with a case study on forest biomass, given its importance to Norway’s economy, wide distribution, political interest, and range of uses.

2 Theoretical Framework

2.1 GIS-MCDA

GIS has long been serving as a tool for spatial decision support systems [10, 11], with its strong capabilities of acquiring, storing, and manipulating spatial data. A decision analysis process where multiple criteria are evaluated for one or multiple objectives is referred to as MCDA. Malczewski [12] states that “MCDA provides a rich collection of techniques and procedures for structuring decision problems, and designing, evaluating and prioritizing alternative decisions.” According to Allain et al. [13], “the unaddressed issue of landscape-level MCDAs is strikingly the question of distribution and heterogeneity”, i.e., how benefits are distributed in space, time, and between social actors. The integration of GIS and MCDA strengthens the handling of criteria that have a spatial dimension through GIS and the handling of values or preferences through MCDA [14]. The combination of the two terms meets the unaddressed issue forwarded by Allain et al. [13]. A GIS-based MCDA (GIS-MCDA) can be defined as a process that transforms and combines geographical information and value judgments, to obtain information for decision-making [12].

The spatiality is explicit in site selection problems. Examples include selecting sites for a new industry [15], for specific agricultural land use [16], solar farms [17], a biogas plant [18], or a new park [19]. In these studies, GIS-MCDA proved useful for selecting suitable sites. For bioeconomic cluster site selection, aspects of cost-benefit and sustainability analyses are also crucial, since the decision relates to economic activities. MCDA lacks explicit rules of cost and benefit comparison [20]; it is thus necessary to include a method complementing that aspect of the analysis.

2.2 Operations Research in Bioeconomics and in Clusters

Operations Research methods are a useful way to include cost-benefit comparison in the analysis. According to Bjørndal et al. [21], “OR methods for bioeconomics” is indeed its own field of study, describing methods applied to agriculture, fishing, forestry, and mining. Despite rather different time scales of the production systems, objectives, and resources, there are commonalities, specifically, the need for multicriteria approaches, and explicit consideration of the environment.

Designing eco-industrial clusters is a topic receiving increasing attention. Boix et al. [22] state that most research studies focus on the optimal design of an industrial cluster network while taking into account water, energy, and materials separately. Linear and non-linear programming are used to analyze and decide on optimal design, e.g., a multiobjective linear programming approach for forest management [23], optimal exploitation of fish considering environmental factors [24], and cost-efficiency linking of farming nutrients, livestock, land use, and water quality [25]. Other relevant works on industrial clusters are summarized in Table 1.

Table 1 Literature on OR methods in industrial clusters

In Hauknes [36], a process for identifying and defining clusters in an aggregated Input-Output (I-O) network based on resource exchange between industries is suggested, where the full I-O table is reduced by winnowing out weak links and small sectors. In Lindberg et al. [37], the authors perform an I-O analysis of Nordic countries and regions to understand the national and regional impacts of developments within the bioeconomy. Lindberg et al. [37] also look at value-added multipliers and employment effects of the bioeconomy by applying national and regional I-O models of the economy/production. A study more closely related to our work is found in Yazan et al. [38], who study the concept of industrial symbiosis in industrial areas, which share characteristics with our bioeconomic industrial clusters. Environmental concerns and I-O models have been successfully merged, which has led to the development of what is now called environmentally extended input-output (EEIO) analysis, on which a significant body of literature already exists [39].

2.3 Multidisciplinary Approaches

A combination of methodologies for a bioeconomic cluster site selection problem can draw from the strengths of both GIS-MCDA and OR, which complement each other well. There are several examples of GIS-MCDA used together with special linear programming models to solve specific problems. For example, Bryan and Crossman [40] use GIS-MCDA and a minimum set algorithm to achieve a more holistic approach to regional resource management than spatial models or linear programming models can achieve alone. In Orsi et al. [41], a mixed integer linear programming (MILP) approach was used to maximize areas for reforestation given constraints on budget and several other aspects. Suitability maps generated through a combination of ecological criteria were given as input to the MILP model. Also, Zhang et al. [42] propose an integrated GIS, simulation, and optimization tool for location and operation of biofuel plants. The GIS model chose potential locations, the simulation tool analyzed flows and proposed feasible networks and their costs, and the optimization model selected among these the network which optimized flows and costs. In our study, the use of GIS in the analysis of potential bioeconomic cluster locations is extended to explicit GIS-MCDA.

3 Method

In this section, we present a methodology to solve the bioeconomic cluster location challenge, designed to be both case-independent and as comprehensive as possible. First, potentially suitable sites are identified through the coupling of GIS and MCDA. The GIS-MCDA takes into account biological as well as more abstract resources. The latter might be resources less suitable for optimization models, such as human resources. The OR method, based on inputs such as location and volume of resources, transport infrastructure, and economic I-O industry relationships, finds the optimal sites from among the suitable ones. These optimal sites give the highest total profit for the whole region under consideration, once costs and incomes are included. Figure 1 shows a flow diagram illustrating the coupling of methods.

Fig. 1 Steps followed in linking the GIS-MCDA and the OR models: (a) aerial photograph of the study area, (b) data on resources, (c) value scaling, weighting, and combining of resource data, (d) selected alternatives based on high suitability score, for input to (e) the OR model, providing (f) the final optimal cluster locations

3.1 GIS-MCDA

The GIS-MCDA procedures applied in this study are theoretically explained by Eastman [43] and reviewed by Malczewski and Rinner [14]. In GIS-MCDA, a number of specific objectives are commonly defined. Multiple alternatives are evaluated with respect to the defined objectives based on a set of criteria. The criteria are measured with a set of attributes, that is, factors that directly or indirectly affect the objective and constraints.

3.1.1 Alternatives

Let Li be a potential cluster location with i ∈ (1, 2, 3, …, n). Each Li is characterized by a two-dimensional location vector defining the center of the spatial unit supported by an area, i.e., Li defines a spatial unit such as an administrative division (e.g., municipalities) or, in our case, equally sized and uniformly spaced square grid cells.

For each location Li, there are M attributes and K constraints parameterizing it in relation to the objectives. Also, for each Li, the value of each attribute Aj, j ∈ (1, 2, 3, …, M) and each constraint Ck, k ∈ (1, 2, 3, …, K) is evaluated. Attributes Aj take values in different scales and ranges, while a constraint Ck might only take binary values (0 or 1).

3.1.2 Attribute Scaling

The original values of the attributes often inherently differ from each other both in terms of their measurement units and ranges of values. These differences can be problematic during the aggregation or combination of the attributes for the final evaluation of the alternatives. To ensure comparability, the attributes are standardized to the same measuring scale, intervals, and ranges. The intervals [0,1] and [0,100] are often preferred for simplicity. This standardization of attributes is one of the critical parts of the MCDA as outlined in several key studies [14, 43, 44].

There are many scaling techniques used to standardize the raw data. Very often value functions that relate the raw values of the attributes to the standardized scale are used. The value function measures the preference or worth or desirability of the alternative with respect to the criterion [14], for example, the desirability of a given site or a municipality with respect to forest biomass supply. The value function can be linear or exponential, depending on the sensitivity of the preference to the changes in the value of the criteria. Some of the preferences aim at maximizing attribute values while others aim at minimizing them; in this case, biomass supply can be an example of the criterion to be maximized, and the distance to the nearest road can be an example of the criterion to be minimized.

For the jth criterion to be minimized, the value function can be approximated as

$$ V\left(A_{ij}\right)={\left[\frac{\max_i\left\{A_{ij}\right\}-A_{ij}}{\max_i\left\{A_{ij}\right\}-\min_i\left\{A_{ij}\right\}}\right]}^{\rho } $$
(1)

while the value function for maximization can be approximated as

$$ V\left(A_{ij}\right)={\left[\frac{A_{ij}-\min_i\left\{A_{ij}\right\}}{\max_i\left\{A_{ij}\right\}-\min_i\left\{A_{ij}\right\}}\right]}^{\rho } $$
(2)

The denominator in both equations is the range of the data values of the jth criterion. Although this range can be obtained from the empirical data, it can also be set according to the ideal maximum and minimum utilities. For example, if the distance to the closest road is to be minimized and if any resource further away than 10 km from a road is not usable, the distance of 10 km corresponds with the value of maximum distance and the distance of 0 km with the minimum distance irrespective of whether such a distance is empirically encountered or not. The ρ in (1) and (2) is a positive parameter. Depending on its value, the function can be linear (ρ = 1), convex (ρ > 1), or concave (0 < ρ < 1). According to Ligmann-Zielinska and Jankowski [45], ρ can also be interpreted as the decision maker’s perception of risk associated with a decision outcome. A concave function is an indication of risk aversion and a convex function shows a risk-affinity strategy. In this study, we used a linear value function, which implies a neutral attitude to risk.
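The scaling in Eqs. (1) and (2) can be illustrated with a short Python sketch. It is not the code used in the study; the function name `value_function`, the optional `lo`/`hi` arguments for an ideal range, the clamping of values outside that range, and the sample numbers are all assumptions made for illustration.

```python
def value_function(values, maximize=True, rho=1.0, lo=None, hi=None):
    """Scale raw attribute values to [0, 1] following Eqs. (1) and (2).

    lo/hi default to the empirical minimum and maximum; passing them
    explicitly corresponds to using an ideal range instead.
    """
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    if hi == lo:
        raise ValueError("attribute has zero range and cannot be scaled")
    scaled = []
    for a in values:
        a = min(max(a, lo), hi)  # assumption: clamp values outside the ideal range
        if maximize:
            frac = (a - lo) / (hi - lo)  # Eq. (2)
        else:
            frac = (hi - a) / (hi - lo)  # Eq. (1)
        scaled.append(frac ** rho)
    return scaled


# Hypothetical values for three grid cells
biomass = [120.0, 300.0, 480.0]  # to be maximized
road_km = [0.0, 2.5, 12.0]       # to be minimized, ideal range 0 to 10 km

v_biomass = value_function(biomass, maximize=True)
v_road = value_function(road_km, maximize=False, lo=0.0, hi=10.0)
```

With ρ = 1 both functions are linear, as in the case study; passing `rho=2.0` or `rho=0.5` gives the convex and concave variants discussed above.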

3.1.3 Attribute Weighting

A weight is assigned to an attribute as a measure of the attribute’s importance relative to the other attributes under consideration. A number of systematic methods of assigning weights to attributes are available [14]. The weights can be assigned either in a spatially explicit or implicit way. They can therefore be local or global depending on how the assignment of the weights varies with space. Some examples of the weighting methods are the ranking method [46, 47], rating method [46], pairwise comparison, entropy-based, and proximity adjusted [17]. No clear recommendation on appropriateness exists for particular applications; however, some of them are used more often than others mainly due to their simplicity.

For bioeconomic applications, we applied the rating method in our analysis because its assigning mechanism and interpretation are straightforward. In the rating method, the decision maker estimates a weight on the basis of a chosen scale, for example 0 to 1, or 0 to 100. Each attribute then gets an assigned weight score wj. Finally, the normalized weight score Wj is computed as

$$ W_j=\frac{w_j}{\sum \limits_{l=1}^{m}w_l}. $$
(3)

3.1.4 Criteria Combination Method

In the final step of GIS-MCDA, decision-making is based on the combined value of data and information about the alternatives and the decision makers’ preferences. The criteria are combined in order to get the final rank or value of the alternative with regard to the objectives. Several combination rules are discussed in the literature, for example, the differences between Boolean statements of suitability, the weighted linear combination (WLC) and the ordered weighted average as discussed by Eastman [43], a review of applications using the approaches weighted summation (i.e., WLC), ideal/reference point, and outranking methods by Malczewski [12], and an overview of the WLC, ideal point methods, the analytic hierarchy process, and outranking methods by Malczewski and Rinner [14]. Both Malczewski [12] and Malczewski and Rinner [14] find the WLC and related procedures to be the most popular combination methods, among the different combination rules assessed. The weighted linear combination is found to be relevant and easy to understand and implement for the problem in this study.

In the WLC technique, the score Si of alternative Li is equal to the weighted sum of the standardized values of all attributes multiplied by the product of all constraints,

$$ {S}_i=\left(\sum \limits_{j=1}^m{V}_{ij}{W}_j\right)\left(\prod \limits_{k=1}^K{C}_{ik}\right) $$
(4)

where Vij is the standardized value of the attribute j for the alternative i, Wj is the normalized weight of attribute j, Cik is the binary value (0 or 1) of the constraint k for the alternative i. If there are no constraints, or if the value of all the constraints is 1, the weighted sum of the attributes is not affected as the product of the constraints becomes 1. Notice that alternative i is disqualified if at least one constraint is 0.
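As a minimal sketch, the weight normalization of Eq. (3) and the WLC score of Eq. (4) can be written in a few lines of Python. The function names and the sample ratings, values, and constraints below are hypothetical and only serve to show how a single zero constraint disqualifies an alternative.

```python
def normalize_weights(raw_weights):
    """Eq. (3): divide each rating by the sum of all ratings."""
    total = sum(raw_weights)
    if total <= 0:
        raise ValueError("ratings must sum to a positive number")
    return [w / total for w in raw_weights]


def wlc_score(values, weights, constraints=()):
    """Eq. (4): weighted sum of standardized values times the product of constraints."""
    weighted_sum = sum(v * w for v, w in zip(values, weights))
    feasible = 1
    for c in constraints:  # each constraint is 0 or 1
        feasible *= c
    return weighted_sum * feasible


# Hypothetical ratings on a 0 to 100 scale for three attributes
weights = normalize_weights([50, 30, 20])

site_a = wlc_score([1.0, 0.5, 0.0], weights, constraints=[1])
site_b = wlc_score([1.0, 1.0, 1.0], weights, constraints=[1, 0])
```

Here `site_b` has the best attribute values but violates one constraint, so its score is zero regardless of the weights.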

3.1.5 Selection of Alternatives

The selection of alternatives can be made based on a number of strategies [14]. In summary, comparison of the alternatives is made either against an ideal or against each other so that alternatives with the best score may be selected. In the present case, as no ideal condition is set, the alternatives are compared with each other based on their scores. In the end, the alternatives with the highest scores are to be used in the OR step of the method as potential cluster locations.

3.2 OR Method

To arrive at the best overall alternative(s), we apply an economic optimization model where the inputs, outputs, interactions, and cost-benefit of the alternatives proposed by the GIS-MCDA are accounted for, as shown in Fig. 1.

3.2.1 Model Overview and Decisions

First, the set of potential cluster locations from the GIS-MCDA model is passed to the optimization model, together with production inputs, outputs, and interactions between clusters, as well as the costs and benefits of the alternatives.

The MILP model incorporates elements of dynamic facility location problems [48], in combination with a supply-chain network design problem where investments and availability and costs of resources and transport costs can influence each other. It is important to emphasize that the model makes all these decisions simultaneously, solving the set of equations aimed at maximizing profit for the entire network.

Some of the decisions made by the model are:

  • Which cluster(s) to establish (investment/location), each a yes/no decision

  • What is the mix of facilities within these clusters, each a yes/no decision

  • Where to harvest resources, and how much

  • The flow and quantities of resources/products from their place of origin/production to clusters/consumption.

3.2.2 Mathematical Model Description

The mathematical optimization model, with its main equations and assumptions, is summarized in this section. The problem is represented by a graph, a combination of arcs representing transport links between nodes, with a minimum transport distance associated with them, and nodes representing geographical locations generated by GIS-MCDA. These nodes can have natural resources, industrial clusters, or consumption centers. An (industrial) cluster is defined as a set of industrial facilities, one per sector, with the capacity to produce one “main” product and a number of by-products. Facility inclusion and size within a cluster are defined by the optimization routine. The model determines each facility’s size as a decision variable, with all facilities in all possible clusters having the same maximum limit. For this case, all nodes can build all types of facilities in any size below the maximum capacity. A cluster can potentially have one or all of the sectors represented.

Decisions in the model are taken based on an objective function Obj that maximizes total benefit (income) Bt minus total costs Ct over the planning horizon, which is divided into periods t ∈ T,

$$ Obj\left(\bullet \right)=\sum \limits_{t\in T}\left({B}_t-{C}_t\right). $$
(5)

Sales Bt account for all sales of raw products from harvest nodes and secondary products from cluster nodes. The aggregated costs Ct consist of an expression including:

  • Investment cost of each cluster

  • Investment cost of each industrial facility in each cluster

  • Costs of harvesting resources

  • Costs of transporting resources between nodes

  • Operational costs for each facility

Below, we summarize some of the main constraints in the optimization model.

The volume of the raw material p that can be harvested in a resource node i in period t, \( {h}_{i,p,t} \), is limited by its estimated availability, \( {R}_{i,p,t} \), which is based on the annual volume of production,

$$ {h}_{i,p,t}\le {R}_{i,p,t}. $$
(6)

All harvested resources are transported to a cluster, and all product volumes going into a cluster, \( {c}_{j,p,t}^{in} \), must be transported either from harvesting nodes or from other clusters. The volume of product p transported between nodes i and j is represented by \( {x}_{i,j,p,t} \), which, depending on the nature of the origin node i and the product type p, might be fixed to 0 in the model (as is the case for raw products sent from a cluster node, or processed products sent from a harvest node),

$$ {h}_{i,p,t}=\sum \limits_j{x}_{i,j,p,t}, $$
(7)

$$ {c}_{j,p,t}^{in}=\sum \limits_i{x}_{i,j,p,t}. $$
(8)

Transport links which are patently sub-optimal are removed during data preprocessing. This is especially important as it reduces the running times and memory usage of the model; while it might not be crucial in all cases, larger, realistic studies likely benefit from reduced computational loads.

Cluster and facility investments can only take place once in each node. In general, we assume clusters are operational immediately after investment. A cluster must either previously exist or be established by the model to allow the transport of materials to/from the associated node, and for allowing facilities inside the cluster. Since by definition, each facility has a “main” product, its capacity is constrained in relation only to the main product. By-product production is always in relation to this capacity. Cost for the establishment of each cluster includes both an investment cost for the cluster itself (infrastructure, permits, space) and a cost to establish each of the potential industrial facilities individually. The operational costs are tied to the level of production of each facility in a linear manner. This is a simplification which arises from the objective of the model which is to define and localize clusters in a larger area, and not to determine the optimal operations of said clusters.

Inside the cluster, materials are converted into new products, and products and by-products are exchanged between the facilities inside the cluster, between clusters, and eventually flow out of the cluster for sale. Unused by-products are allowed. The conversion of products inside an industry facility, \( {Y}_{f,p,q} \), is modeled using economic I-O modeling, described in Section 3.3. The modeling describes how much of each product q is needed (in total money worth) to produce product p. One (arguably more desirable) alternative to this would be to have realistic production functions for each specific facility in the cluster. While more refined, this is technically demanding for a demonstrative study like the one described here.

Product flows (\( {p}^{\mathrm{in}} \), \( {p}^{\mathrm{out}} \)) into and out of a facility f located in cluster location i can be described as

$$ {p}_{i,f,q,p,t}^{\mathrm{out}}={Y}_{f,p,q}\bullet {p}_{i,f,p,t}^{\mathrm{in}}. $$
(9)

Finished products are sent out of the cluster to customers in the market. Deliveries of products are bound by minimum and maximum demand per customer.

The expressions above, together with the parameter values provided by the other tools, are implemented in FICO-Mosel (footnote 1). The solved model provides the values of all decision variables (locations, facilities, transportation, etc.) and the optimal objective function result.
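To make the structure of the location decision concrete, the following Python sketch solves a deliberately tiny, hypothetical instance by brute-force enumeration instead of a MILP solver. It is not the FICO-Mosel model used in the study: it assumes a single period, a single product, no facility mix, and no capacity limits, and all names and numbers are invented. Under these assumptions, once the set of opened clusters is fixed, each resource node simply ships its full availability to the nearest open cluster whenever the unit margin is positive, which respects Eqs. (6) and (7).

```python
from itertools import combinations


def best_clusters(avail, dist, invest, price, harvest_cost, transport_cost):
    """Enumerate every subset of candidate sites and return the most profitable one.

    avail[i]   : availability R_i at resource node i (upper bound on harvest)
    dist[i][j] : distance in km from resource node i to candidate site j
    invest[j]  : investment cost of opening a cluster at site j
    Returns (profit, opened_sites, flows) with flows[(i, j)] = shipped volume.
    """
    sites = range(len(invest))
    best = (0.0, (), {})  # opening nothing yields zero profit
    for k in range(1, len(invest) + 1):
        for opened in combinations(sites, k):
            profit = -sum(invest[j] for j in opened)
            flows = {}
            for i, r in enumerate(avail):
                j = min(opened, key=lambda s: dist[i][s])  # nearest open cluster
                margin = price - harvest_cost - transport_cost * dist[i][j]
                if margin > 0:  # harvest only if it pays off
                    flows[(i, j)] = r
                    profit += margin * r
            if profit > best[0]:
                best = (profit, opened, flows)
    return best


# Hypothetical data: three resource nodes, two candidate cluster sites
profit, opened, flows = best_clusters(
    avail=[100, 80, 60],
    dist=[[5, 40], [30, 10], [45, 5]],
    invest=[2000, 2500],
    price=50,
    harvest_cost=20,
    transport_cost=0.5,
)
```

In this instance a single cluster at site 0 is preferred over opening both sites, because the transport savings of the second cluster do not cover its investment cost. Enumeration grows exponentially with the number of candidate sites, which is one reason a MILP formulation and solver are used for realistic cases.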

3.3 Economic Input-Output Analysis

To provide the conversion rates necessary for our case study, we have used a Norwegian I-O table (footnote 2). The I-O tables are derived based on the assumption of a fixed product sale structure, and they show flows from each industry to other industries and to final users. As described above, this is but one way to estimate production conversion rates, and a realistic case study would ideally have access to more precise technical data.

An I-O table is a macroeconomic model based on a generic top-down view of the economy. It represents the whole supply chain at a nation-wide level, along with its sectoral production and consumption patterns. The sectors related to the bioeconomy, which are our focus, are identified using the classifications in Mikkelsen [49], which shows all the NACE codes (footnote 3) either belonging to the bioeconomy, or intrinsically related to it.

An I-O model records the flows of products and services from each industrial sector, i.e., a “producer” to each of the other sectors, i.e., “consumers”, and records them in a table or matrix representing the entire economy of a country. Because sectors can be aggregated by matrix operations, we can easily transform the comprehensive I-O table for Norway into one in which the bioeconomic sectors are aggregated, while the rest is represented as a single entry. Table 2 shows the bioeconomic sectors and related A64 level codes, aggregated according to NACE3 codes.

Table 2 Aggregated bioeconomic sectors
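The aggregation of sectors mentioned above amounts to summing the rows and columns of the flow table that belong to the same aggregated sector. The sketch below illustrates this in plain Python; the four sectors, their grouping, and the monetary flows are hypothetical and do not come from the Norwegian I-O table.

```python
def aggregate_flows(flows, groups):
    """Sum an I-O flow table over groups of sectors.

    flows[i][j] : monetary flow from sector i to sector j
    groups[i]   : index of the aggregated sector that sector i belongs to
    """
    n_groups = max(groups) + 1
    aggregated = [[0.0] * n_groups for _ in range(n_groups)]
    for i, row in enumerate(flows):
        for j, flow in enumerate(row):
            aggregated[groups[i]][groups[j]] += flow
    return aggregated


# Hypothetical sectors: forestry, wood products, retail, finance.
# The two non-bioeconomic sectors are merged into a single "rest" entry.
groups = [0, 1, 2, 2]
flows = [
    [10, 40, 5, 5],
    [2, 20, 30, 8],
    [1, 3, 50, 20],
    [4, 6, 25, 60],
]
aggregated = aggregate_flows(flows, groups)
```

The total of all flows is unchanged by the aggregation, which gives a quick consistency check on the grouping.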

In the general I-O notation, the total output, o, needed (directly and indirectly) to satisfy a given final demand vector, δ, is described as

$$ o={\left(I-D\right)}^{-1}\delta , $$
(10)

where I is the identity matrix, and D = [dij] describes the products i required by industry j to produce one unit of monetary output. The mix of inputs includes raw materials, machinery, energy, goods, and services. Vector o represents the total output of each sector. For each product, it equals the sum of the volume consumed by other industries, Do, and the volume consumed by the final demand agents, δ, so that o = Do + δ. The expression \( {\left(I-D\right)}^{-1} \) is commonly known as the Leontief inverse matrix. Matrix (I − D) is generally non-singular, though aggregation can alter this property; the case in which no inverse exists, and its consequences, are however outside the scope of this work.
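A small numerical example may help to read Eq. (10). The Python sketch below solves (I − D)o = δ by Gauss-Jordan elimination for a hypothetical two-sector economy; the coefficient matrix and the final demand are invented for illustration, and the function name `total_output` is an assumption.

```python
def total_output(D, delta):
    """Solve (I - D) o = delta for o by Gauss-Jordan elimination with pivoting."""
    n = len(D)
    # Augmented matrix [I - D | delta]
    A = [
        [(1.0 if i == j else 0.0) - D[i][j] for j in range(n)] + [delta[i]]
        for i in range(n)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("(I - D) is singular; no Leontief inverse exists")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


# Hypothetical two-sector economy
D = [[0.2, 0.3],
     [0.1, 0.4]]
delta = [100.0, 50.0]

o = total_output(D, delta)
# Intermediate consumption D o; adding delta should reproduce o
intermediate = [sum(D[i][j] * o[j] for j in range(2)) for i in range(2)]
```

For these numbers the total output is roughly 166.7 and 111.1 monetary units, of which about 66.7 and 61.1 are consumed by the industries themselves and the remainder goes to final demand.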

Finally, because I-O models represent flows of money, not volumes, we combined the I-O conversion matrix with absolute production figures from estimates for each grid cell. In this way, we defined a conversion rate which allows us to convert between physical and monetary flows. This consideration is important for example regarding transportation in the OR model, e.g., values of forest and fishing products can be the same, but the cost for transporting their volumes is likely different. The information on biomass production which is combined with I-O monetary flows can be found in Falk-Andersson et al. [50].

4 Case Study and Results

As our case study area, we have chosen the county of Østfold in southeastern Norway (Fig. 2). Østfold covers 4182 km², of which 64% is forested and 19% is agricultural area. For the purpose of this study, the area is divided into 67 spatial units by overlaying 10 × 10 km grid cells [51]. These spatial units are the alternatives for cluster locations.

Fig. 2 The case area is the county of Østfold, with a 10 × 10-km grid structure delivering 67 alternative locations

While forest biomass is only one example of a resource important in the bioeconomy, we found that its wide distribution, policy focus, and range of uses made it interesting for a first case study, testing the combination of GIS-MCDA and OR optimization modeling focused on potential location of bioeconomic industrial clusters.

4.1 GIS-MCDA Steps

For the construction of the GIS-MCDA model, explicit spatial data is required on biomass, infrastructure, and human resources (Table 3). Four groups of criteria are considered:

  • Biomass: Supply of forest biomass based on annual potential growth. The spatial distribution and potential growth rate are obtained from the Norwegian Land Resource map (AR5). This information is combined with statistics on estimated annual growth from Statistics Norway. The total volumes in a spatial unit and in its neighboring units are good indicators of the availability of biomass.

  • Transport infrastructure: A sizeable cost when utilizing forest biomass, both regarding accessibility of resources but also transport of products and by-products between industry and market. Transport is assessed according to the proximity to different road types, based on road network data; proximity to harbors and train stations is also considered, as are potential terminals of transport exchange.

  • Logistics infrastructure: Usage of resources depends on factors like power, communication, water, and sewer systems. We used population and size of city zones within the spatial units as indicators of the logistic infrastructure in that unit.

  • Human resources: The development of bioeconomic clusters requires knowledge, e.g., academic, industrial, and R & D, as well as a workforce and physical and financial markets. We have used available data on academic and research institutions, and on population, to generate distance-based access to human resources.

Table 3 Data, sources, and processing per 10 × 10-km spatial unit

The data that represent the resources in these groups were collected from different sources (Table 3) and aggregated to the 10 × 10-km spatial units. The mean distances to the different road types, harbors, train stations, and education and knowledge facilities were all calculated in the same manner. Each spatial unit was divided into 100 × 100 m sub-units for the purpose of precision, and the minimum Euclidean distance from the center of each sub-unit to the objects was calculated. Then, an average distance was computed for the original 10 × 10 km spatial unit.
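The distance aggregation described above can be sketched as follows. This is an illustration rather than the raster workflow used in the study: it assumes that the objects are given as point coordinates in a projected (metric) coordinate system and that distances are measured from the center of each sub-unit to the nearest object. The function name and the coordinates are hypothetical.

```python
import math


def mean_min_distance(x0, y0, objects, unit=10000.0, sub=100.0):
    """Average, over all sub-units, of the distance to the nearest object.

    x0, y0  : lower-left corner of the spatial unit (meters)
    objects : list of (x, y) point coordinates (meters)
    unit    : side length of the spatial unit (10 km by default)
    sub     : side length of a sub-unit (100 m by default)
    """
    if not objects:
        raise ValueError("at least one object is required")
    n = int(round(unit / sub))  # number of sub-units along each side
    total = 0.0
    for row in range(n):
        for col in range(n):
            cx = x0 + (col + 0.5) * sub  # sub-unit center
            cy = y0 + (row + 0.5) * sub
            total += min(math.hypot(cx - ox, cy - oy) for ox, oy in objects)
    return total / (n * n)


# Hypothetical harbor locations, one inside and one outside the grid cell
harbors = [(2500.0, 5000.0), (12000.0, 3000.0)]
mean_harbor_distance = mean_min_distance(0.0, 0.0, harbors)
```

The resulting mean distance per spatial unit is the raw attribute value that is subsequently scaled with Eq. (1).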

Map layers were assembled in raster format for analysis in Python 2.7, with NumPy and GDAL packages. Each map layer represents one attribute (Table 4), while each grid cell represents a spatial unit (the alternative locations). Attributes were weighted as described in Table 4. The only model constraint was that a grid cell should not contain more than 50% of environmentally protected areas, resulting in one of the 67 cells being left out of the analysis.

Table 4 Fifteen criteria, represented as map layers of attributes, belonging to four groups. Both attributes and groups are weighted

The weights were decided by the authors based on their expertise. Weights were set at two levels: attributes and groups. The attributes in each group were weighted relative to the other attributes in that group, e.g., importance within the transport group (Table 4). The groups were weighted relative to the other groups, e.g., transport against biomass. To address ambiguity about the relative importance of the groups, each group was assigned a weight range around a base weight, bounded by a minimum and a maximum weight (Table 4). Each group weight was then drawn randomly from its range in a Monte Carlo approach to compute the suitability scores. The random selection of the weight values and the computation of the suitability scores were run 1000 times to simulate weight values estimated by 1000 individuals. The mean, the standard deviation, and the coefficient of variation of the suitability scores were subsequently computed for each site.
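The Monte Carlo weighting can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the study: group scores are assumed to be already computed per site and normalized to [0, 1], group weights are drawn from a uniform distribution within their range, the drawn weights are rescaled to sum to 1, and the suitability score is a weighted sum. The function name and data layout are hypothetical.

```python
import random
import statistics

def monte_carlo_suitability(group_scores, weight_ranges, runs=1000, seed=0):
    """group_scores  : {site: {group: score}}, scores assumed in [0, 1]
    weight_ranges : {group: (min_weight, max_weight)}
    Returns {site: (mean, standard deviation, coefficient of variation)}."""
    rng = random.Random(seed)
    samples = {site: [] for site in group_scores}
    for _ in range(runs):
        # Draw one weight per group; the uniform draw is an assumption.
        raw = {g: rng.uniform(lo, hi) for g, (lo, hi) in weight_ranges.items()}
        total = sum(raw.values())
        # Rescale so the group weights sum to 1 (also an assumption).
        weights = {g: w / total for g, w in raw.items()}
        for site, scores in group_scores.items():
            samples[site].append(sum(weights[g] * scores[g] for g in weights))
    summary = {}
    for site, values in samples.items():
        mean = statistics.mean(values)
        sd = statistics.stdev(values)
        cv = sd / mean if mean else float("nan")  # CV is undefined for mean 0
        summary[site] = (mean, sd, cv)
    return summary
```

A high coefficient of variation for a site indicates that its score depends strongly on which group weights are drawn.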

Model specifications used in the analysis are listed in Table 5. The use of generic specifications is justified because the goal of the GIS-MCDA analysis was to refine the total number of alternatives into a smaller list of potential candidates, and because the scale of the analysis is regional, i.e., representing resources on 10 × 10-km spatial units.

Table 5 GIS-MCDA method specifications used for the case study

The outputs from the analysis were suitability maps with scores on relative importance for each cell, for each combination of main group weightings. A single suitability map was processed based on the mean value of these maps (Fig. 3a). Additionally, the coefficient of variation of the estimated suitability scores is presented in Fig. 3b as a measure of the uncertainty of the estimation of the mean. Candidate alternatives for the OR model were selected from that suitability map based on a natural break in the scores, which allowed seven candidates (ca. 10%) to be subjected to further analysis using optimization. Figure 4 displays the main group contributions to the score of these seven alternatives.
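How a natural break was identified is not specified here. One simple way to make such a cut reproducible is to rank the cells by mean score and cut at the largest drop between consecutive scores. The sketch below implements this largest-gap heuristic, which is an assumption for illustration and may differ from the procedure actually used (e.g., visual inspection or Jenks classification); the function name is hypothetical.

```python
def select_by_largest_gap(mean_scores, max_candidates=10):
    """mean_scores: {cell_id: mean suitability score}.
    Returns the top-ranked cells above the largest score drop found
    within the first `max_candidates` + 1 ranked cells."""
    ranked = sorted(mean_scores.items(), key=lambda kv: kv[1], reverse=True)
    limit = min(max_candidates, len(ranked) - 1)
    if limit < 1:
        return [cell for cell, _ in ranked]  # nothing to cut
    # gaps[k - 1] is the drop between rank k and rank k + 1 (1-based ranks)
    gaps = [ranked[k - 1][1] - ranked[k][1] for k in range(1, limit + 1)]
    cut = gaps.index(max(gaps)) + 1
    return [cell for cell, _ in ranked[:cut]]
```

The `max_candidates` bound keeps the candidate list short enough for the subsequent optimization step.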

Fig. 3

a Mean of scores ranked from 1 to 67 (i.e., number of alternatives) from running the model 1000 times. Each time the scores were computed based on weights that were randomly chosen within the specified interval for each criteria group. The attribute weights were kept constant. The seven candidate alternatives for OR are marked with black borders (alternatives 23, 24, 47, 48, 52, 53, and 54 from Fig. 2). b Coefficient of variation (CV) for each square

Fig. 4

The seven selected suitability alternatives, ordered from highest to lowest score (from grid cell 24 to 52), with percentage contribution to score from criteria groups

4.2 Optimization

The OR model received the proposed locations from the GIS-MCDA and ran the MILP optimization routine on them. The model also had access to resource availability data, which defined how much each harvesting node might produce. Alongside the proposed locations and geographical harvesting data, data on distances between the 67 spatial units were factored in. These distances were computed as the shortest path along the existing road network. Distance thus refers to transporting materials between the cluster, harvest, and consumption nodes in the cheapest way. The paths, and the associated costs per resource type, were calculated a priori and used as input. Transport costs, like production costs, were estimated with a simplistic approach in which the cost is a constant multiple of the distance.
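The a priori distance and cost calculation can be sketched with Dijkstra's algorithm on a road graph whose nodes are spatial units. The graph layout, the function names, and the cost rate are illustrative assumptions; the routing tool actually used is not stated here.

```python
import heapq

def shortest_distances(graph, source):
    """Dijkstra's algorithm.
    graph: {node: [(neighbor, road_km), ...]} with comparable node ids.
    Returns {node: shortest road distance from source in km}."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbor, km in graph.get(node, []):
            nd = d + km
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(queue, (nd, neighbor))
    return dist

def transport_cost(distance_km, cost_per_km):
    """Linear transport cost; cost_per_km is a hypothetical per-resource rate."""
    return cost_per_km * distance_km
```

Running `shortest_distances` once per spatial unit yields the full distance matrix that the optimization model takes as input.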

Note that the case presented considers only one period of operations. The model, as formulated above, allows for further periods where values change over time (e.g., increased production to satisfy increasing demand over time), which may be relevant in other cases.

Cluster establishment and composition are the main strategic decisions. Harvesting, transport, and production are the three main operational decisions provided by the model.

Demand for end-products can be modeled in different ways, which likely affects the results. To illustrate this, we present two different ways of considering demand. In instance one, local demand is not explicitly specified: products produced in the cluster are assumed to be sold, and the location and size of demand are disregarded. In instance two, demand nodes are specified in several locations, with demand volumes proportional to the population of each node. As with the conversion rates, we use the data in the I-O tables to calculate localized demand, for lack of better sources. In both instances, demand must be met, and meeting it is the only way the system may obtain revenues.
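The population-proportional allocation in instance two can be written as a short sketch. It assumes that a total demand per product has already been derived (in the paper, from the I-O tables, a step not reproduced here); the function name and arguments are hypothetical.

```python
def allocate_demand(total_demand, population):
    """Split a total demand volume across demand nodes in proportion
    to their population.

    total_demand : total volume to distribute (assumed given)
    population   : {node: number of inhabitants}
    """
    total_pop = sum(population.values())
    if total_pop <= 0:
        raise ValueError("total population must be positive")
    return {node: total_demand * pop / total_pop
            for node, pop in population.items()}
```

By construction the node demands sum to the total, so the allocation only changes where demand is located, not how large it is.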

Table 6 summarizes the results of these two instances. For each node selected for the establishment of a new cluster, we list the facilities built there. For the sake of brevity, we omit quantitative data on capacity, resource production, or transport. Location 24 is arguably a strong candidate for a cluster, regardless of how demand is modeled. All selected candidates are concentrated in two zones: around locations 23 and 24, and around locations 47, 48, 52, and 53. There is a good mix of different types of facilities represented. We also see that agriculture and marine industries may be part of a cluster based on forestry. Figure 5 shows the flow of forest resources into the selected cluster nodes for instance 2.

Table 6 Results for the run of two instances for the OR model using GIS-MCDA input
Fig. 5

Flow of forest biomass to selected cluster locations for instance 2

The cluster locations selected by the OR model in both instances include two clusters which have a relatively large share of human resources and infrastructure, and one or two clusters with a high share of biomass resources. In most cases, the selected location with a high share of biomass also includes the industry sector “Processing of forestry” (industry sector/A64 code R16–17), which is a sector that requires a relatively large amount of forest biomass compared with, for instance, “Processing of Agriculture and Maritime Resources” or “Biotech.”

5 Discussion

GIS-MCDA and OR represent two different methodologies with different strengths and weaknesses. In the study described here, we have combined the two and applied them to a case study. The combination of these two methods has a variety of benefits. One very important advantage of starting the analysis with a GIS-MCDA is the ability of this methodology to compile a wide range of data of different formats and domains. As long as the data have a geographical component, this component can be used as a “common denominator” in their analysis. In our case, the ability of GIS to handle locational and proximity analysis of different kinds of resources adds an important aspect to MCDA and OR modeling. In GIS-MCDA, we can include large spatial datasets and process them according to defined criteria prior to running the OR model. In this way, the alternatives are already ranked by suitability, which reduces the computational burden on the OR model. The OR model, on the other hand, performs a complete cost-benefit analysis, which GIS-MCDA lacks, taking into account pre-defined criteria for the harvesting of resources, e.g., keeping annual harvesting below annual production. Therefore, this combination of methods enables us to analyze suitability, profitability, and sustainability of a spatial unit for a bioeconomic cluster in an integrated process. Given the outlined grand challenges [52] and the transition to a bioeconomy [53], we believe that the focus on the spatiality of renewable biological resources, their use, and transportation is likely to become even more important in the future.

In combining the two methods, it is important to keep in mind that there is no universally set threshold separating the candidates that the GIS-MCDA passes on from those it excludes. Thus, which candidates to include in the OR step can be considered subjective. In the case presented here, we selected the candidates based on natural breaks in the average score values. However, one might consider using constraints such as the amount of biological resources accessible to the spatial unit. Further, the candidates selected by GIS-MCDA may result in no feasible solution for the OR model. This could be assessed by broadening the input from the GIS-MCDA to the OR model for comparison. However, we consider encountering such a problem unlikely, as GIS-MCDA itself has already incorporated multiple criteria in the candidate selection in an objective approach.

We have made many simplifications to this case study, which merit consideration as they affect the quality of the results. Still, as the methodology is designed to be flexible, we consider our approach relevant to a wide range of applications. For example, the number of resources used, size of geographic region, types of industry, and products can easily be extended. While this case focused on development of new clusters, the methodology can also be used to evaluate potential localization problems where existing infrastructure is in place. Further, it is also possible to apply the method to analyze industries and clusters different from the bioeconomic industries.

The uncertainty in aggregating spatial data from different sources is an unavoidable problem in the present case, as national data are not available at the same scale for all the different variables. Combining data from sources with different scales adds to the uncertainty of the result, as the scale differences introduce operational error that propagates into the results. It is well known in GIS that objects might have different properties and forms at different scales, a phenomenon referred to as the Modifiable Areal Unit Problem (MAUP) [54]. For example, if the origin of the 10 × 10-km grid had been moved, the results of aggregating the data would be different. However, when working on a study area of larger extent, e.g., Norway, the trends between different regions of the study area would probably still show similar patterns.
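The effect of the grid origin on aggregated values can be demonstrated with a toy example. The sketch below counts point features per grid cell for an arbitrary origin; the coordinates used to call it are invented for illustration and do not represent data from the case study.

```python
from collections import Counter

def aggregate_to_grid(points, cell_size, origin=(0.0, 0.0)):
    """Count point features per grid cell.

    points    : iterable of (x, y) coordinates in meters
    cell_size : side length of a square grid cell in meters
    origin    : lower-left anchor of the grid (shifting it moves all cells)
    Returns a Counter keyed by (column, row)."""
    counts = Counter()
    for x, y in points:
        col = int((x - origin[0]) // cell_size)
        row = int((y - origin[1]) // cell_size)
        counts[(col, row)] += 1
    return counts
```

Two points at x = 9000 m and x = 11000 m fall into different 10-km cells with the origin at (0, 0) but into the same cell when the origin is shifted to (-5000, 0), so the aggregated counts differ although the underlying data are unchanged.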

Regarding the MCDA, it is important to note that the weights chosen are only used to exemplify the method, and in a realistic application, more expert input would be used. Nevertheless, the methodology used here is clearly explained and can be used as a starting point when no expert input is available. Developing methods for finding appropriate weights to use depending on the case to be analyzed is therefore encouraged in future studies.

For the OR model to produce realistic results, it is necessary to have good input on costs along the value chain, like more detailed transportation costs. Further, note that the market in the test case was simply modeled based on the population around the cluster; ideally, this should be improved with real demand functions. Additionally, while we have omitted international sales in this model, this would indeed be interesting to look at for industries with high international trade interests, such as biotechnology.

We have used the Norwegian I-O table to provide conversion rates in our facilities, but further studies should replace this I-O method with other approaches to obtaining conversion matrices, such as recruiting production experts for each of the facilities in question and modeling realistic production functions. This would greatly improve the quality of the results, but it is beyond the scope of this work.

The GIS-MCDA managed to differentiate between the alternatives and suggest candidates for the optimization models. The OR model then provided the best alternative cluster locations and configurations to consider in a decision-making process. We consider this highly useful and see a range of potential uses for this type of analysis. One such example is scenario analysis, where different input parameters can be varied to assess the effects of differing situations, such as testing how reducing fossil-based transport as much as possible affects the choices made by the model.

Our results demonstrate that the integration of different methods can indeed lead to an objective assessment of optimal cluster location. The test was simplified through our choice of focusing on only one resource and only one Norwegian county. An obvious next step would be to test the framework on a more realistic case, one that is larger with respect to size of the region, number of resource types, and number of NACE industrial categories involved.

A pending challenge is to integrate not just existing industries and products, but also obtain good predictions on which new industries will develop and what new products existing industries will provide in the future. This could also be combined with the inclusion of risk/uncertainty measures in the model framework. How exactly to model this or similar issues, while crucial for a complete analysis, seems difficult without the aid of experts on innovation and industrial processes, however.

In our case study, we have three types of boundaries: natural breaks, national/political breaks, and a continuous boundary. We believe that potential edge effects [55] arise only on the continuous boundary. As this represents a small proportion of our case study boundary, and our next aim is to work on the entire country, we decided not to focus on this aspect in this methodological exploration.

There are clearly several other factors that influence the development of bioeconomic clusters and that are not accounted for in the models, such as local involvement and participation, organizational challenges, biotechnological innovations (new uses and products), willingness to invest, property structure (including possible land use conflicts), legal rights, policy and incentives, capital and funding, and existing industry. All of these are areas of opportunity and improvement for future work.

6 Conclusions

When the aim is to select optimal locations and designs for bioeconomic industrial clusters, we conclude that a combination of GIS, MCDA, and OR methods provides a promising approach. This conclusion is based on our demonstration that location matters, as resource availability, accessibility, and usability vary across space.

In our opinion, the spatial components of the transition to a sustainable bioeconomy should receive more attention. In particular, this requires a focus on transport needs from resource location to industry and on resource flows within clusters. In any decision regarding the use of bioeconomy resources, economic benefits are a key aspect. Our multidisciplinary approach combining these methods ensures that both spatial and economic aspects are considered.

We present a first test of how these different methodologies could be combined. The framework and method can easily be adapted to larger regions and types of products and contexts other than bioeconomic ones.

The collection and presentation of resource inventories in maps together with identified optimal sites can effectively communicate the geographical context. This can contribute to better, knowledge-based decision-making and information exchange between industry and government. Efficiency of resource usage, economic benefits, and environmental impacts need to be emphasized for a successful transition to a sustainable bioeconomy.

In the future, we foresee that analyses widening the biological resources included will be important. Also, extending the analysis in terms of sustainability, e.g., through the use of scenarios and environmentally extended input-output analysis, is a relevant and interesting possible next step.