Abstract

Evaluating the performance of solvers (e.g., computational programs), known as the solver benchmarking problem, has become a topic of intense study, and various approaches have been discussed in the literature. Such a variety of approaches exists because a benchmarking problem is essentially a multicriteria problem. In particular, a multicriteria decision-making problem corresponds naturally to each benchmarking problem and vice versa. In this study, to solve the solver benchmarking problem, we apply a ranking-theory method recently proposed for solving multicriteria decision-making problems. The benchmarking problem of differential evolution algorithms was considered as a case study to illustrate the ability of the proposed method. This problem was solved using ranking methods with different areas of origin. The comparisons revealed that the proposed method is competitive and can be successfully used to solve benchmarking problems and obtain relevant engineering decisions. This study can help practitioners and researchers use multicriteria decision-making approaches for benchmarking problems in different areas, particularly software benchmarking.

1. Introduction

Recently, evaluating the performance of solvers (e.g., computer programs), that is, the problem of solver benchmarking, has attracted significant attention from scientists. Currently, most benchmarking tests produce tables that present the performance of each solver on each problem according to a specified evaluation metric (e.g., the central processing unit (CPU) time or the number of function evaluations) and use various statistical tests to draw conclusions. Thus, the selection of the benchmarking method currently depends on the subjective tastes and preferences of individual researchers: the components of the benchmarking process (the solver set, the problem set, the metric for performance assessment, and the statistical tools for data processing) are all chosen individually according to the researcher’s preferences. For example, the performance profile method, which is currently the most popular and widely used method in practice (see [1]), is based on a comparative analysis of empirical probability distribution functions obtained in numerical experiments with different solvers.

In this study, we consider the benchmarking process based on the viewpoint that emphasizes natural relations between problems and solvers, as determined by their evaluation tables (see [2]). Specifically, we present data for benchmarking in the form of a so-called benchmarking context, that is, a triple $(S, P, c)$, where $S$ and $P$ are sets of solvers and problems, respectively, and $c: S \times P \to \mathbb{R}_{+}$ is an assessment function (a performance evaluation metric). Throughout the paper, the sets of solvers and problems are assumed to be finite. This concept is quite general and emphasizes that problems, solvers, and assessment functions must be considered closely related and not independent.

The benchmarking procedure presented in this study is described as follows. The data encapsulated by the given benchmarking context $(S, P, c)$ are used to build the corresponding multicriteria decision-making (MCDM) problem $\langle A, C \rangle$, where $A$ is a set of alternatives and $C$ is a set of criteria. Hence, we define a decision matrix as a matrix whose elements exhibit the performance of different alternatives (i.e., solvers) concerning various criteria (i.e., problems) through the assessment function. Thus, the investigation of a benchmarking problem is reduced to an MCDM problem. Moreover, for each MCDM problem, a corresponding benchmarking context can be presented. The rationale for such a consideration is that a vast array of different approaches for MCDM problems can be used for benchmarking problem analysis. In particular, such a multicriteria formulation allows the consideration of Pareto-optimal alternatives (i.e., solvers) as “good” solvers.

The next innovation presented in this study is that a recently proposed technique (see [3]) is used to solve the MCDM problem corresponding to a benchmarking problem. To clarify the essence of the new technique used in this study, recall that the multicriteria formulation is a typical starting point for theoretical and practical analyses of decision-making problems. Correspondingly, based on the fundamental concept of Pareto optimality, several methods and computational procedures have been developed to solve MCDM problems (see, e.g., the overviews in [4–8] and, more recently, [9–11]). However, unlike single-objective optimization, a characteristic feature of Pareto optimality is that the set of Pareto-optimal alternatives is typically large. In addition, all these Pareto-optimal alternatives must be considered mathematically equal (equally “good”). Correspondingly, the problem of choosing a specific Pareto-optimal alternative for implementation arises because the final decision must usually be unique. Hence, additional factors must be considered to aid decision-makers in selecting specific or more favorable alternatives from the set of Pareto-optimal solutions.

Therefore, we build a special score matrix for the MCDM problem, which allows us to construct the corresponding ranking of the alternatives [3]. The score matrix can be built in different ways, but we use the simplest and most natural one: the score matrix records how many times one alternative is better than another according to the criteria. Hence, the proposed approach may yield an “objective” ranking method and provide an “accurate” ranking of the alternatives for MCDM. Correspondingly, the best-ranked alternative from the Pareto set is declared a “true” solution to the MCDM problem. The approach presented in this study for solving MCDM problems is useful when no decision-making authority is available or when the relative importance of various criteria has not been previously evaluated.

Finally, we demonstrate the possibilities of the proposed method in a case study based on the computational and experimental results for benchmarking differential evolution (DE) algorithms presented by Sala et al. [12]. Specifically, we benchmark nine DE algorithms on a set of 50 test problems, using the random-sampling-equivalent expected run time ($\mathrm{ERT}_{\mathrm{RSE}}$) performance metric. By conducting a numerical investigation, we demonstrate that the solution results of the MCDM problem obtained using the methods proposed in this study are quite competitive.

1.1. Contributions

This paper makes the following main contributions:
(1) The concept of the benchmarking context is introduced according to [2], and it is confirmed that a one-to-one correspondence exists between the set of benchmarking contexts and the set of MCDM problems.
(2) The ranking-theory approach is proposed for solving MCDM problems corresponding to a given benchmarking context [3].
(3) The approach proposed in this article is tested on a known literature dataset for benchmarking DE algorithms (see [12]), and the possibility of effectively solving benchmarking problems is fully confirmed.

1.2. Related Literature

Without claiming to be a complete review, we present a brief overview of the literature on the benchmarking problem in the context of optimization problems. Generally, the consideration of a benchmarking problem is motivated by various reasons, such as selecting the best solver (algorithm, software, etc.) for some class of problems, testing proposed novel solvers, and evaluating solver performance for different option settings. For example, early contributions to the benchmarking of optimization algorithms are considered in [13]. The results achieved at an early stage in the development of the subject can be judged from the work of the following researchers: Nash and Nocedal [14], Billups et al. [15], Conn et al. [16], Sandu et al. [17], Mittelmann [18], Vanderbei and Shanno [19], and Bondarenko et al. [20].

The beginning of a new stage of development is associated with the research of Dolan and Moré [21], in which the performance profile comparison technique was proposed. This technique is now prevalent (but see, e.g., Gould and Scott [22]). Along with the performance profile comparison method, other more direct approaches have also been used in modern research. An impression of current research in the area under consideration can be obtained from the following examples: Moles et al. [23], Mittelmann [24], Benson et al. [25], Kämpf et al. [26], Foster et al. [27], Rios and Sahinidis [28], Weise et al. [29], Sala et al. [12], and Cheshmi et al. [30]. A critical overview of the current state of the subject area was provided by Beiranvand et al. [1].

To conclude this brief overview, we note that this study focuses on benchmarking solvers for optimization problems only. However, the concept of benchmarking has a much broader context (see, e.g., https://en.wikipedia.org/wiki/Benchmarking). The approach proposed in this article is quite general and can also be applied in other areas, but we do not consider this possibility here.

1.3. Notation

Throughout the article, the following general notation is used: $\mathbb{N}$ is the set of natural numbers, and for a natural number $n$, we denote the $n$-dimensional vector space by $\mathbb{R}^{n}$, and $x_{i}$ is the $i$th component of a vector $x \in \mathbb{R}^{n}$. If not otherwise mentioned, we identify a finite set $A$ with the set $\{1, \ldots, |A|\}$, where $|A|$ is the cardinality of the set $A$. We also introduce the following notation for special vectors and sets: for any $n \in \mathbb{N}$, $\mathbf{1}_{n} = (1, \ldots, 1)^{T} \in \mathbb{R}^{n}$, and $\mathbb{R}^{n}_{+}$ is the positive orthant. By necessity, we also identify an $m \times n$ matrix with the corresponding map from $\{1, \ldots, m\} \times \{1, \ldots, n\}$ to $\mathbb{R}$. For a matrix $M$, we denote its transpose by $M^{T}$.

1.4. Outline

The remainder of this paper is structured as follows. In Section 2, all necessary theoretical preliminaries regarding the MCDM problem (Section 2.1) and ranking-theory methods for solving MCDM problems are presented (Section 2.2). Section 3 introduces the concept of benchmarking contexts, and its relationship with the MCDM problem is discussed. In Section 4, the case-study problem of DE algorithm benchmarking is investigated numerically. Finally, the conclusions are presented in Section 5.

2. Methodology

2.1. Multicriteria Decision-Making Problems

We use the following notation from the general theory of multicriteria optimization [31]. We consider the MCDM problem $\langle A, C \rangle$, where $A = \{a_{1}, \ldots, a_{m}\}$ is a set of alternatives and $C = \{c_{1}, \ldots, c_{n}\}$ is a set of criteria, that is, each $c_{j}: A \to \mathbb{R}$. Hence, we introduce the following decision matrix:

$$X = (x_{ij})_{m \times n}, \qquad x_{ij} = c_{j}(a_{i}),$$

where $x_{ij}$ is the performance measure of alternative $a_{i}$ on criterion $c_{j}$. Without loss of generality, we assume that the lower value is preferable for each criterion (i.e., each criterion is nonbeneficial; see [32]), and the goal of the decision-making procedure is to minimize all criteria simultaneously. Furthermore, $A$ is the set of admissible alternatives, and the map $c = (c_{1}, \ldots, c_{n}): A \to \mathbb{R}^{n}$ is the criterion map (correspondingly, $c(A) \subset \mathbb{R}^{n}$ is the set of admissible values of the criteria). A point $y^{*} = (y^{*}_{1}, \ldots, y^{*}_{n})$, where $y^{*}_{j} = \min_{a \in A} c_{j}(a)$, is called the ideal point. An ideal point is considered attainable if an alternative $a^{*} \in A$ exists such that $c(a^{*}) = y^{*}$. The following concepts are also associated with the criterion map and the set of alternatives. An alternative $a^{\circ} \in A$ is Pareto-optimal (efficient) if no $a \in A$ exists such that $c_{j}(a) \le c_{j}(a^{\circ})$ for all $j$ and $c_{j}(a) < c_{j}(a^{\circ})$ for some $j$. The set of all efficient alternatives is denoted by $A_{e}$ and is called the Pareto set. Correspondingly, $c(A_{e})$ is called the efficient front.
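To make the Pareto-optimality definition above concrete, the following minimal Python sketch identifies the efficient alternatives of a small decision matrix with nonbeneficial (minimized) criteria; the function name and the toy data are illustrative only.

```python
import numpy as np

def pareto_optimal(X):
    """Indices of Pareto-optimal rows of decision matrix X
    (rows = alternatives, columns = nonbeneficial criteria to be minimized)."""
    m = X.shape[0]
    efficient = []
    for i in range(m):
        dominated = False
        for j in range(m):
            if j == i:
                continue
            # a_j dominates a_i if it is no worse on every criterion
            # and strictly better on at least one of them
            if np.all(X[j] <= X[i]) and np.any(X[j] < X[i]):
                dominated = True
                break
        if not dominated:
            efficient.append(i)
    return efficient

# Toy example: 4 alternatives, 3 criteria
X = np.array([[1.0, 5.0, 3.0],
              [2.0, 4.0, 2.0],
              [3.0, 6.0, 4.0],
              [1.0, 5.0, 3.0]])
print(pareto_optimal(X))  # [0, 1, 3]: the third alternative is dominated
```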

Pareto optimality is an appropriate concept for solutions of MCDM problems in general. However, the set of Pareto-optimal alternatives is typically large, and all alternatives from $A_{e}$ must be considered “equally good solutions.” However, the final decision must be unique. Hence, additional factors must be considered to aid in selecting specific or more favorable alternatives from the set $A_{e}$. We cannot provide a detailed analysis of these methods; however, interested readers can become acquainted with them through the overviews [4–8]. Furthermore, we consider only the method proposed by Gogodze [3], without diminishing the value of more classical methods.

2.2. Ranking Methods and Their Applications to MCDM Problems

This section provides a brief overview of the basic concepts of ranking theory (see, e.g., [33] for further details) and presents the necessary formal definitions. For a natural number $n$, a matrix $S = (s_{ij})_{n \times n}$ is a score matrix if $s_{ij} \ge 0$ and $s_{ii} = 0$, and the pair $(N, S)$, with $N = \{1, \ldots, n\}$, is a ranking problem. We assume (conditionally) that the elements of $N$ are athletes (or sports teams) who compete in matches between themselves. Moreover, $(i, j)$ denotes a joint match for each pair of athletes $i, j \in N$, and we interpret entry $s_{ij}$ of matrix $S$ as the total score of athlete $i$ against athlete $j$ in match $(i, j)$. In addition, athlete $i$ scored against athlete $j$ in match $(i, j)$ if $s_{ij} > 0$, and athlete $i$ has beaten athlete $j$ in match $(i, j)$ if $s_{ij} > s_{ji}$. Based on the introduced notation, the derived quantities used by the ranking methods below (such as the numbers of wins, losses, and matches played of each athlete) can be defined.

A weak order on the set $N$ is a transitive and complete relation on $N$. A ranking method is a map that assigns to any given ranking problem $(N, S)$ a weak order $\succeq_{S}$ on the set $N$. Any vector $r \in \mathbb{R}^{n}$ can be considered a rating vector for the elements of $N$, in the sense that each $r_{i}$, $i \in N$, can be interpreted as a measure of the performance of player $i$. For the ranking problem $(N, S)$, a ranking method is induced by the rating vector $r$ if $i \succeq_{S} j$ (i.e., $i$ ranks weakly above $j$) if and only if $r_{i} \ge r_{j}$.

For illustrative purposes, we consider only a few of the many ranking methods discussed in the literature. All of these methods are induced by their corresponding rating vectors. The considered ranking methods originate from different areas, such as athlete/team ranking in sports, citation indices, and website ranking. Hence, all of them reflect some (as a rule, intuitive) human experience regarding the solution concept of the ranking problem. A brief overview of the ranking methods used in this article is provided in the Appendix.

We can unite all the information described above and demonstrate that, for any MCDM problem, we can construct the necessary matrices (e.g., the score matrix $S$ and the matrices derived from it) and, therefore, apply a suitable ranking method to the MCDM problem solution. To simplify the perception of the constructions described below, we use sports terminology. We assume that $\langle A, C \rangle$ is an MCDM problem (see Section 2.1) with a set of alternatives $A = \{a_{1}, \ldots, a_{m}\}$ and a set of nonbeneficial criteria $C = \{c_{1}, \ldots, c_{n}\}$ and that the decision-making goal is to minimize the criteria simultaneously. We imagine, for constructing matrix $S$, that the number of athletes is $m$ and that they are competing in an $n$-athlon (i.e., each match includes competitions in $n$ different disciplines). For illustrative purposes, we introduce the simplest method for score calculation:

$$s_{ij} = \sum_{k=1}^{n} \delta_{k}(i, j), \qquad \delta_{k}(i, j) = \begin{cases} 1, & x_{ik} < x_{jk}, \\ 0, & \text{otherwise.} \end{cases}$$

Thus, for criterion $c_{k}$, the equality $\delta_{k}(i, j) = 1$ means that $x_{ik} < x_{jk}$, and the alternative $a_{i}$ (i.e., athlete $i$) receives one point (i.e., athlete $i$ wins the competition in discipline $k$). Correspondingly, $s_{ij}$ indicates the total number of wins of athlete $i$ in match $(i, j)$. Thus, $0 \le s_{ij} + s_{ji} \le n$. An alternative $a_{i}$ (athlete $i$) has defeated an alternative $a_{j}$ (athlete $j$) if $s_{ij} > s_{ji}$. In addition, the result of match $(i, j)$ is $s_{ij}$ wins by athlete $i$ (losses of athlete $j$), $s_{ji}$ wins of athlete $j$ (losses of athlete $i$), and the number of draws is $n - s_{ij} - s_{ji}$. The constructed matrix $S = (s_{ij})_{m \times m}$ is the score matrix for the set of alternatives $A$.
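The following short Python sketch implements this simplest score-calculation rule for a toy decision matrix (the data and the function name are illustrative); it counts, for every ordered pair of alternatives, on how many criteria the first strictly outperforms the second.

```python
import numpy as np

def score_matrix(X):
    """Score matrix S for a decision matrix X with nonbeneficial criteria:
    s_ij = number of criteria on which alternative i is strictly better
    (i.e., smaller) than alternative j; the diagonal is zero."""
    m = X.shape[0]
    S = np.zeros((m, m))
    for i in range(m):
        for j in range(m):
            if i != j:
                S[i, j] = np.sum(X[i] < X[j])
    return S

X = np.array([[1.0, 5.0, 3.0],
              [2.0, 4.0, 2.0],
              [3.0, 6.0, 4.0]])
S = score_matrix(X)
print(S)
# For the first two alternatives: s_12 = 1, s_21 = 2,
# and n - s_12 - s_21 = 0 criteria end in a draw.
```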

Thus, we can define an auxiliary 0–1 matrix $D = (d_{ij})_{m \times m}$ that records, for each ordered pair of athletes, the outcome of the corresponding match (i.e., who defeated whom according to the score matrix $S$).

Furthermore, using matrix $D$ and a well-known transformation (normalizing its rows and replacing zero rows by a reference probability vector $v$, usually the uniform vector $v = \frac{1}{m}\mathbf{1}_{m}$), we can construct a (row) stochastic matrix $\Pi$.

The introduced matrix $D$ can be interpreted as an adjacency matrix of a directed graph $G$ (associated with the MCDM problem $\langle A, C \rangle$), called the adjacency matrix for the MCDM problem $\langle A, C \rangle$. Correspondingly, matrix $\Pi$ can be interpreted as a transition probability matrix for the Markov chain determined by the graph $G$. Moreover, from the score matrix we can also construct a reciprocal matrix of pairwise comparisons $T = (t_{ij})_{m \times m}$ for the MCDM problem $\langle A, C \rangle$ (i.e., a matrix with $t_{ij} > 0$, $t_{ii} = 1$, and $t_{ij} t_{ji} = 1$; see Section A.6 in the Appendix).
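Since the exact transformations are not reproduced here, the sketch below shows one conventional way to obtain these objects in Python: a defeat (adjacency) matrix $D$, a PageRank-style row-stochastic matrix $\Pi$ with an assumed damping factor, and a reciprocal comparison matrix $T$ built from Laplace-smoothed score ratios. All constructions and parameter values are illustrative assumptions, not necessarily those used in [3].

```python
import numpy as np

def defeat_matrix(S):
    """0-1 adjacency matrix: d_ij = 1 if alternative i defeated j (s_ij > s_ji)."""
    return (S > S.T).astype(float)

def stochastic_matrix(D, alpha=0.85):
    """PageRank-style row-stochastic matrix: rows of D are normalized,
    zero rows are replaced by the uniform vector v, and the result is mixed
    with v using a damping factor alpha (an assumed, conventional value)."""
    m = D.shape[0]
    v = np.full(m, 1.0 / m)
    row_sums = D.sum(axis=1, keepdims=True)
    P = np.where(row_sums > 0, D / np.where(row_sums == 0, 1.0, row_sums), v)
    return alpha * P + (1.0 - alpha) * v

def reciprocal_matrix(S):
    """Reciprocal pairwise-comparison matrix from smoothed score ratios:
    t_ij = (s_ij + 1) / (s_ji + 1), so t_ij * t_ji = 1 and t_ii = 1."""
    return (S + 1.0) / (S.T + 1.0)

S = np.array([[0.0, 2.0, 3.0],
              [1.0, 0.0, 2.0],
              [0.0, 1.0, 0.0]])
D = defeat_matrix(S)
print(stochastic_matrix(D))
print(reciprocal_matrix(S))
```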

Subject to the facts presented in this section, the following procedure for solving the MCDM problem under consideration, $\langle A, C \rangle$, can be formulated:
(i) For the MCDM problem $\langle A, C \rangle$, the score matrix $S$ is constructed.
(ii) Using the score matrix $S$, the alternatives from the set $A$ are ranked using a ranking method $\mathcal{R}$ (see, e.g., the Appendix).
(iii) The alternative from the Pareto set $A_{e}$ ranked best by method $\mathcal{R}$ is declared the solution of the considered MCDM problem $\langle A, C \rangle$.
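A compact, self-contained Python sketch of this three-step procedure is given below; it uses the score-method rating (row sums of the score matrix, which induces the same ranking as the average score) as the ranking method, and the toy decision matrix is illustrative.

```python
import numpy as np

def solve_mcdm(X):
    """Sketch of the three-step procedure: (i) build the score matrix,
    (ii) rank the alternatives with the score-method rating (row sums),
    (iii) return the best-ranked Pareto-optimal alternative.
    All criteria are assumed nonbeneficial (to be minimized)."""
    m = X.shape[0]
    S = np.array([[np.sum(X[i] < X[j]) if i != j else 0
                   for j in range(m)] for i in range(m)], dtype=float)
    rating = S.sum(axis=1)                        # score-method rating vector
    pareto = [i for i in range(m)
              if not any(np.all(X[j] <= X[i]) and np.any(X[j] < X[i])
                         for j in range(m) if j != i)]
    best = max(pareto, key=lambda i: rating[i])   # best-ranked efficient alternative
    return best, rating, pareto

X = np.array([[1.0, 5.0, 3.0],
              [2.0, 4.0, 2.0],
              [3.0, 6.0, 4.0]])
best, rating, pareto = solve_mcdm(X)
print(best, rating, pareto)  # 1 [4. 5. 0.] [0, 1]
```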

3. Benchmarking Problem

We consider a set of problems $P$, a set of solvers $S$, and a function $c: S \times P \to \mathbb{R}$, the assessment function (performance metric). The terms “solver,” “problem,” and “assessment function” are used conditionally only to simplify interpretation, although this is not generally necessary (and, as we observe below, can even lead to terminological inconsistency). Furthermore, we assume for definiteness that the high and low values of $c$ correspond to the worst and best cases, respectively, and for convenience, we interpret $c(s, p)$ as the cost of solving the problem $p$ by the solver $s$. Moreover, the following interpretations are assumed:
(i) Solver $s_{1}$ solves problem $p$ better than solver $s_{2}$ if $c(s_{1}, p) < c(s_{2}, p)$.
(ii) Problem $p_{1}$ is easier for solver $s$ than problem $p_{2}$ if $c(s, p_{1}) < c(s, p_{2})$.

Thus, we can introduce the following definition, which is sufficient for many real-world applications.

Definition 1. A triple $(S, P, c)$ is a (solver) benchmarking context if and only if $S$ and $P$ are finite sets (called a set of solvers and a set of problems, respectively), $c: S \times P \to \mathbb{R}$ is a function (called the assessment function, or performance evaluation metric), and the following assumptions hold:
(B1) the sets $S$ and $P$ have sizes $|S| = m$ and $|P| = n$ for some $m, n \in \mathbb{N}$;
(B2) $c(s, p) \ge 0$ for all $s \in S$ and $p \in P$.

The presented concept is quite general and, as mentioned, emphasizes that the set of solvers, the set of problems, and the assessment function must be considered closely related objects for the benchmarking goal and not independently. Assumption (B1) establishes that the sets $S$ and $P$ have sizes $m$ and $n$, respectively, and Assumption (B2) establishes the nonnegativity of the assessment function. Moreover, because the sets $S$ and $P$ are finite, Condition (B2) does not limit the generality of our considerations. Generally, the selection of the benchmarking context components is based on the research questions motivated by the benchmarking analysis goal. However, the choice of the sets $S$ and $P$ is often a disputable issue in the practice of certain applications. In contrast, the situation is relatively straightforward in choosing the assessment function $c$, at least in computer science (see, e.g., [34]). For example, the following indicators are often used in this case: running time (e.g., the CPU time [35]), reliability (i.e., the solver’s ability to successfully solve several problems, such as the success rate [36]), and others. Moreover, the case when the assessment $c$ is a vector-valued mapping (i.e., it is a multiple criterion) can also be considered, but we do not delve into this issue here. Next, we consider the benchmarking context as given and introduce the following definition:

Definition 2. For a given (solver) benchmarking context $(S, P, c)$, we define the function $c^{*}: P \times S \to \mathbb{R}$ as follows: $c^{*}(p, s) = c(s, p)$. We call $c^{*}$ the adjoint (to $c$) assessment function, and $(P, S, c^{*})$ the adjoint to $(S, P, c)$ benchmarking context, or the problem benchmarking context (corresponding to the solver benchmarking context $(S, P, c)$).
Definition 2 is easily validated as correct (i.e., $c^{*}$ is an assessment function in the sense of Definition 1). A terminological inconsistency appears, as noted above: in the benchmarking context $(P, S, c^{*})$, the set of “solvers” is the set $P$, which is the set of problems in the sense of the benchmarking context $(S, P, c)$. We hope that this does not create any problems in understanding the text below.
We assume now that a benchmarking context $(S, P, c)$ is given and build a corresponding MCDM problem $\langle A, C \rangle$ as follows: $A = S$ is the set of alternatives, and $C = \{c_{p}\}_{p \in P}$, with $c_{p}(s) = c(s, p)$, is the set of criteria. Hence, we define the decision matrix as the matrix whose elements exhibit the performance of different alternatives (i.e., solvers) with respect to various criteria (i.e., problems) through the assessment function; by Property (B2), this matrix is nonnegative. Conversely, we assume that $\langle A, C \rangle$, where $A = \{a_{1}, \ldots, a_{m}\}$ and $C = \{c_{1}, \ldots, c_{n}\}$, is a given MCDM problem such that $c_{j}(a_{i}) \ge 0$ for all $i$ and $j$. Hence, for $S = A$, $P = C$, and $c(a, c_{j}) = c_{j}(a)$, the triple $(S, P, c)$ is a benchmarking context corresponding to the MCDM problem $\langle A, C \rangle$. The correspondences described above are one-to-one and reciprocal. Thus, we have proved that the following proposition holds.
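As a small illustration of this correspondence, the Python sketch below converts a toy benchmarking context (sets of solvers and problems plus a nonnegative cost table; all names and values are hypothetical) into the decision matrix of the associated MCDM problem and forms the adjoint (problem) context simply by transposition.

```python
import numpy as np

# A toy benchmarking context: sets of solvers and problems and a nonnegative
# assessment (cost) function c(s, p), stored here as a table. All names and
# values are hypothetical and serve only to illustrate the correspondence.
solvers = ["S1", "S2", "S3"]
problems = ["P1", "P2"]
cost = {("S1", "P1"): 10.0, ("S1", "P2"): 4.0,
        ("S2", "P1"): 12.0, ("S2", "P2"): 3.0,
        ("S3", "P1"): 11.0, ("S3", "P2"): 5.0}

# Decision matrix of the associated MCDM problem:
# alternatives = solvers (rows), criteria = problems (columns).
X = np.array([[cost[(s, p)] for p in problems] for s in solvers])

# Adjoint (problem) benchmarking context: c*(p, s) = c(s, p), so the
# decision matrix of the adjoint MCDM problem is simply the transpose.
X_adjoint = X.T

print(X)
print(X_adjoint)
```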

Proposition 1. A one-to-one correspondence exists between benchmarking contexts and MCDM problems with nonnegative criteria.
To summarize the results of this section and achieve greater clarity of presentation, we formulate the proposed approach to solving benchmarking problems in algorithmic form. We assume that the considered benchmarking problem has already been formalized as a benchmarking context $(S, P, c)$, where $S$ is a set of solvers, $P$ is a set of problems, and $c$ is an assessment function. The flowchart of the algorithm is presented in Figure 1. All elements of the Pareto set $A_{e}$ are considered equally “good” solvers (in the sense of Pareto optimality). However, the ranking allows a more detailed classification that identifies the “best of the good,” the “worst of the good,” and the intermediate “good” solvers.

4. Case Study: Benchmarking Differential Evolution Algorithms

4.1. Data

In this section, we focus on an illustrative example of the approach proposed in Section 2. Our consideration is based on the results of numerical experiments borrowed from Sala et al. [12], where nine DE optimization algorithms and 25 test functions were considered (see Table 1 for a short description of the test functions). The set of solvers in the benchmarking problem under consideration is the set of the following algorithms:
(i) S1-DE: rand/1/bin differential evolution [37],
(ii) S2-DE2: best/2/bin differential evolution [38],
(iii) S3-jDE: self-adapting differential evolution [39],
(iv) S4-JADE: adaptive differential evolution [40],
(v) S5-SaDE: strategy adaptation differential evolution [41],
(vi) S6-epsDE: ensemble parameter differential evolution [42],
(vii) S7-CoDe: composite trial vector strategy differential evolution [43],
(viii) S8-SQG: stochastic quasigradient search [44], and
(ix) S9-SQG-DE: stochastic quasigradient-based differential evolution [12].

Thus, $S = \{S1, \ldots, S9\}$ is the set of solvers. The set of problems $P = \{P1, \ldots, P50\}$ comprises 50 problems, and each problem is defined by the dimension indicator $D \in \{30, 50\}$ and by the test function type $f_{1}, \ldots, f_{25}$, as listed in Table 2.

A description of the assessment function used by Sala et al. [12] is as follows. First, the expected running time (ERT), a widely used performance metric for optimization algorithms, is defined as follows:

$$\mathrm{ERT}(f_{t}) = \overline{\mathrm{FE}}_{\mathrm{succ}} + \frac{1 - p_{s}}{p_{s}}\, \mathrm{FE}_{\max}, \qquad p_{s} = \frac{n_{\mathrm{succ}}}{n_{\mathrm{runs}}},$$

where $f_{t}$ indicates a reference threshold value, $\overline{\mathrm{FE}}_{\mathrm{succ}}$ is the average number of function evaluations required to reach an objective value better than $f_{t}$ (i.e., over the successful runs), $\mathrm{FE}_{\max}$ denotes the maximum number of function evaluations per optimization run, $n_{\mathrm{succ}}$ represents the number of successful runs, $n_{\mathrm{runs}}$ is the total number of runs, and $p_{s}$ denotes the so-named success rate [45]. The ERT is interpreted as the expected number of function evaluations of an algorithm to reach an objective function threshold for the first time. A threshold or success criterion is required for the performance measure. However, unlike conventional optimization problems (where the criterion is usually related to reaching the value of the known global optimum within a specified tolerance), the probability of coming close to the global optimum is negligible for difficult optimization problems, and a more acceptable alternative success criterion is required. Moreover, all compared algorithms must meet the success criterion a few times to compare qualitative performance using the ERT for difficult optimization problems. Correspondingly, Sala et al. [12] used the success criterion of reaching the target value corresponding to the expected value of the best objective function value obtained from uniform random sampling (1000 samples). Next, the estimation of this expected objective value for test function $f$ is based on 100 repetitions. Finally, the ERT with respect to this objective function value limit is referred to as $\mathrm{ERT}_{\mathrm{RSE}}$ for test function $f$. The dataset of $\mathrm{ERT}_{\mathrm{RSE}}$ estimations [12] for the above-described solvers and problems is presented in Table 3.
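For concreteness, a minimal Python sketch of the ERT estimator in the additive form written above is given below; the function name and the run data are made up for the example.

```python
import numpy as np

def expected_running_time(fe_successful, n_runs, fe_max):
    """ERT estimate in the additive form above: the mean number of function
    evaluations of the successful runs plus a penalty for unsuccessful runs,
    ERT = mean(FE_succ) + ((1 - p_s) / p_s) * FE_max, with success rate
    p_s = n_succ / n_runs. Returns infinity if no run was successful."""
    n_succ = len(fe_successful)
    if n_succ == 0:
        return np.inf
    p_s = n_succ / n_runs
    return np.mean(fe_successful) + (1.0 - p_s) / p_s * fe_max

# 5 runs in total, 3 of which reached the target threshold
print(expected_running_time([1200, 1500, 900], n_runs=5, fe_max=10000))
```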

Thus, the benchmarking context $(S, P, c)$, where $S = \{S1, \ldots, S9\}$, $P = \{P1, \ldots, P50\}$, and $c = \mathrm{ERT}_{\mathrm{RSE}}$ is the assessment function, is fully defined. Hence, as described in Section 3, the MCDM problem associated with the benchmarking problem under consideration is fully defined, with the set of alternatives $A = S$, the set of (nonbeneficial) criteria $C = P$, and a primary decision matrix obtained by transposing the matrix presented in Table 3 (for writing convenience, Table 3 presents the transposed primary decision matrix). Hence, the MCDM problem associated with the benchmarking context $(S, P, c)$ (i.e., the solver benchmarking problem) is fully defined. The adjoint benchmarking context $(P, S, c^{*})$ is analogously defined, where the assessment function $c^{*}$ is obtained based on the decision matrix that is the transpose of the decision matrix defined above. Hence, the MCDM problem associated with this benchmarking context (i.e., the benchmarking problem for the problems) is also fully defined.

4.2. Calculation Results

In this section, we present a brief description of the calculation results. All calculations related to the case study were performed in the MATLAB environment on standard equipment (a laptop with a 2.59 GHz processor, 8 GB RAM, and a 64-bit operating system) and required only a few seconds (4.87 s for the solver benchmarking and 5.04 s for the problem benchmarking) to compute all considered rankings, without special code optimization measures. First, we consider the solver benchmarking problem and explain the construction of the normalized decision matrix by transforming the primary dataset (see, e.g., [32]).

For the primary decision matrix $X$, we define the normalized decision matrix $\bar{X} = (\bar{x}_{ij})$, where each entry is obtained by rescaling the corresponding column (criterion) of $X$. For the solver benchmarking problem, we consider all criteria to be nonbeneficial (i.e., to be minimized): a solver is considered better if it solves a given problem at a lower cost (i.e., a lower $\mathrm{ERT}_{\mathrm{RSE}}$ value).
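The exact normalization used in the original computations is not reproduced here; the Python sketch below uses linear max normalization, one common choice for nonbeneficial criteria, as an illustrative stand-in.

```python
import numpy as np

def normalize_columns(X):
    """Linear max normalization: each column of X is divided by its maximum,
    so all entries lie in (0, 1] and smaller values remain preferable."""
    return X / X.max(axis=0, keepdims=True)

X = np.array([[10.0, 4.0],
              [12.0, 3.0],
              [11.0, 5.0]])
print(normalize_columns(X))
```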

To illustrate this, we present the score matrix for the solver benchmarking problem in Table 4. Table 5 presents the obtained ranks for the solver benchmarking problem.

Analogously, we consider the problem benchmarking, but define the normalized decision matrix from the corresponding primary decision matrix $X^{*} = X^{T}$ (the transpose of the solver decision matrix) in the same column-wise manner. For this benchmarking problem, we also assume that all criteria are nonbeneficial (i.e., to be minimized). Again, a problem is better (i.e., easier) for a given solver if it is solved at a lower cost by this solver. Table 6 presents the ranks for the problem benchmarking (the score matrix for the problem benchmarking is not presented).

4.3. Discussion

As Table 5 indicates, the results of solver ranking using the considered methods are quite similar. This observation is confirmed quantitatively by the Spearman correlations between the ranks, presented in Table 7, which lists the correlations of the solver ranks for the considered rankings. As Table 7 demonstrates, the ranks are strongly correlated with each other. Analogously, Table 8 reflects the interrelation between the ranks for the problem benchmarking; these ranks are also strongly correlated with each other.
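For readers who wish to reproduce this kind of comparison, the Python sketch below computes the Spearman rank correlation between two rank vectors; the rank values shown are hypothetical and are not those reported in Table 7.

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical rank vectors produced by two ranking methods for nine solvers
# (illustrative numbers only, not the ranks reported in Table 7).
ranks_method_a = np.array([1, 9, 4, 3, 5, 6, 2, 8, 7])
ranks_method_b = np.array([1, 8, 5, 3, 4, 6, 2, 9, 7])

rho, pval = spearmanr(ranks_method_a, ranks_method_b)
print(f"Spearman correlation: {rho:.3f} (p-value {pval:.3g})")
```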

Regarding the results of the correlation analysis, the observed similarity of the ranking results appears very intriguing, given that these methods have completely different areas of origin and underlying ideas (see the corresponding scholium in the Appendix). It is also interesting to consider the Pareto optimization results (see the solvers and problems marked in gray in Tables 5 and 6, respectively). In particular, from Table 5, all considered solvers are Pareto-optimal (i.e., they are considered “equally good” in the considered benchmarking context). We believe that this is due to the large (compared with the number of solvers) number of problems (i.e., too many criteria exist in the corresponding MCDM problem); accordingly, each solver is good in “its own way.” However, ranking methods enable the establishment of an appropriate hierarchy among the solvers. Analogously, Table 6 demonstrates that the Pareto-optimal problems are allocated to different groups or clusters, with similar problems belonging to the same clusters. Ranking methods also make it possible to establish an appropriate hierarchy among the problems.

Summarizing the results of the case-study investigation, we conclude the following:
(i) The results of the calculations (Table 5) confirm that the SQG-DE algorithm (solver S9) is the best in the considered benchmarking context (for comparison, see [12]), and this conclusion holds for all rankings used in this study, despite their quite different natures. Moreover, the worst results are shown by DE2 (solver S2) according to all considered ranking methods except Neustadt’s method, and by DE (solver S1) according to Neustadt’s ranking method.
(ii) Unlike Sala et al. [12], where an analysis of the problems was not carried out, our calculations also indicate (Table 6) that the best problems in the considered benchmarking context (in the sense of a lower value of the considered metric) are the shifted sphere function in Dimension 50 (problem P26) and the rotated hybrid composition function in Dimension 30 (problem P25). Accordingly, the worst one is the shifted rotated expanded Schaffer’s function 6 in Dimension 30 (problem P14), and the next worst ones are the shifted rotated Ackley’s function in Dimension 50 (problem P33) (by the Colley ranking method) and the shifted rotated expanded Schaffer’s function 6 in Dimension 50 (problem P39) (for all other considered ranking methods).

We stress that these results were obtained using only the ranking-theory methods without an analysis of any statistical indicators of the assessment function values, as currently practiced (see, e.g., the related literature overview in the Introduction section).

5. Conclusions

In this study, we presented a new MCDM technique for solving decision-making problems for benchmarking. Our investigation was based on the concept of a benchmarking context, presented in detail, and the observation that a benchmarking problem is an MCDM problem. Correspondingly, to solve benchmarking problems successfully, an extensive array of MCDM methods can be used. We also presented a new approach to the MCDM problem solution based on the ranking-theory methods. The corresponding ranks are obtained by constructing a special score matrix. We emphasize that this method defines the appropriate ranks directly from the decision matrix and does not use preliminary assessments conducted by external experts or other methods. Therefore, the technique presented in this study is useful when the relative importance of various criteria has not been evaluated in advance. As a case study, the benchmarking problem of DE algorithms was considered based on the data presented by Sala et al. [12]. A detailed numerical investigation was conducted using various ranking methods. Moreover, these ranks were also correspondingly compared for solvers and problems. The results demonstrate that the method presented in this study is competitive and generates relevant solutions.

Referring to the analysis presented in this study, we conclude the following:
(i) The results of applying MCDM methods to aid benchmarking problem solutions based on the proposed approach are encouraging.
(ii) The proposed approach provides a constructive view of the benchmarking problem solution, identifying the “best” and “worst” cases and ordering all intermediate cases.
(iii) The proposed approach is easily implementable because of its simplicity and flexibility. Moreover, the approach is sufficiently general and can be successfully used to investigate benchmarking problems in other application areas.

However, this study has limitations because we provide a tool for benchmarking only in the case in which the benchmarking context is given (i.e., when the sets of solvers (problems), problems (solvers), and performance metrics are given). Issues regarding the selection of the benchmarking context components remain unresolved: the literature does not contain clear and direct recommendations regarding the correct selection of solvers, problems, and performance metrics. Hence, further investigation in this direction will be helpful.

Appendix

A. Ranking Methods

Let us assume that the ranking problem $(N, S)$ is given and consider, as examples, the following rating methods for it (see the notation and concepts in Section 2.2).

A.1. Score Method

The rating vector $r^{\mathrm{Sc}}$ for the score method is defined as the average score, that is, $r^{\mathrm{Sc}}_{i}$ is the average of the scores $s_{ij}$ obtained by athlete $i$ over all matches played, and the ranking defined by the rating vector $r^{\mathrm{Sc}}$ will be called the score rank.
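A minimal Python sketch of this rating (assuming every pair of athletes meets exactly once, so each athlete plays m − 1 matches) is:

```python
import numpy as np

def score_rating(S):
    """Score-method rating: the average of each athlete's scores over the
    m - 1 opponents; any positive rescaling of the row sums of S induces
    the same ranking."""
    m = S.shape[0]
    return S.sum(axis=1) / (m - 1)

S = np.array([[0.0, 2.0, 3.0],
              [1.0, 0.0, 2.0],
              [0.0, 1.0, 0.0]])
print(score_rating(S))  # athlete 1 ranks above athlete 2, who ranks above athlete 3
```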

A.2. Neustadt’s Method

Neustadt’s rating vector $r^{\mathrm{N}}$ is a Sonneborn–Berger-type rating: the score obtained against each opponent is weighted by that opponent’s own (score-method) rating, so that wins over strong opponents contribute more, and the ranking defined by the rating vector $r^{\mathrm{N}}$ will be called the Neustadt rank.

A.3. Buchholz’s Method

Buchholz’s rating vector $r^{\mathrm{B}}$ (the score, Neustadt, and Buchholz methods were used in the practice of chess tournaments and go back to the investigations of H. Neustadt, E. Zermelo, and B. Buchholz; for more details, see [33]) combines an athlete’s own score with the scores achieved by the opponents met, so that the strength of the opposition is also reflected, and the ranking defined by the rating vector $r^{\mathrm{B}}$ will be called the Buchholz rank.

A.4. Colley Method

We describe the Colley method (Colley, W., Colley’s college football ranking method, https://www.colleyrankings.com/, accessed 12.08.2020) as follows: let $n$ be the number of athletes/teams, $w_{i}$ the number of wins for athlete/team $i$, $l_{i}$ the number of losses for athlete/team $i$, $t_{i}$ the total number of games played by athlete/team $i$, and $n_{ij}$ the number of times athletes/teams $i$ and $j$ played each other, $i, j = 1, \ldots, n$. From these data, the following objects can be introduced: the Colley matrix $C = (C_{ij})_{n \times n}$ and the Colley vector $b = (b_{i})$, where

$$C_{ii} = 2 + t_{i}, \qquad C_{ij} = -n_{ij} \ (i \ne j), \qquad b_{i} = 1 + \frac{w_{i} - l_{i}}{2}.$$

Now, using the score matrix $S$, we define the quantities $w_{i}$, $l_{i}$, and $t_{i}$ as follows: for each athlete $i$, $w_{i}$ counts the opponents beaten ($s_{ij} > s_{ji}$), $l_{i}$ counts the opponents lost to ($s_{ij} < s_{ji}$), drawn matches ($s_{ij} = s_{ji}$) are counted as half a win and half a loss, and $t_{i}$ is the number of matches played; obviously, $w_{i} + l_{i} = t_{i}$. The Colley rating vector $r^{\mathrm{C}}$ is obtained as the solution of the equation $C r^{\mathrm{C}} = b$, and the ranking defined by the rating vector $r^{\mathrm{C}}$ is called the Colley rank.
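A short Python sketch of the Colley rating under these conventions (each pair assumed to have met exactly once, draws counted as half a win and half a loss) is given below; the score matrix is a toy example.

```python
import numpy as np

def colley_rating(S):
    """Colley rating sketch: wins and losses are read off the score matrix
    (athlete i beats j when s_ij > s_ji; a draw counts as half a win and
    half a loss), each pair is assumed to have met exactly once, and the
    rating solves the Colley system C r = b."""
    m = S.shape[0]
    wins = np.zeros(m)
    losses = np.zeros(m)
    for i in range(m):
        for j in range(m):
            if i == j:
                continue
            if S[i, j] > S[j, i]:
                wins[i] += 1.0
            elif S[i, j] < S[j, i]:
                losses[i] += 1.0
            else:
                wins[i] += 0.5
                losses[i] += 0.5
    games = wins + losses                                   # here t_i = m - 1
    C = np.diag(2.0 + games) - (np.ones((m, m)) - np.eye(m))
    b = 1.0 + (wins - losses) / 2.0
    return np.linalg.solve(C, b)

S = np.array([[0.0, 2.0, 3.0],
              [1.0, 0.0, 2.0],
              [0.0, 1.0, 0.0]])
print(colley_rating(S))
```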

A.5. Keener Method

We describe the Keener method [46] as follows: let $n$ be the number of athletes/teams and $S$ be the corresponding score matrix. The Keener matrix $K = (k_{ij})_{n \times n}$ is obtained from the score matrix by a smoothing transformation of the relative scores, so that $k_{ij}$ reflects the (smoothed) share of the points scored by athlete $i$ in match $(i, j)$.

Correspondingly, the rating vector $r^{\mathrm{K}}$ for the Keener method is obtained as a (nonnegative, normalized) solution of the eigenvalue problem $K r^{\mathrm{K}} = \lambda_{\max} r^{\mathrm{K}}$ (i.e., as the Perron eigenvector of $K$), and the ranking defined by the rating vector $r^{\mathrm{K}}$ is called the Keener rank.
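The Python sketch below computes such a rating by power iteration; the Laplace-smoothed construction of the Keener matrix is an assumption (the original method may additionally apply Keener's skewing function), and the score matrix is a toy example.

```python
import numpy as np

def keener_matrix(S):
    """Assumed Laplace-smoothed construction of the Keener matrix,
    k_ij = (s_ij + 1) / (s_ij + s_ji + 2); Keener's skewing function,
    which the original method may additionally apply, is omitted here."""
    return (S + 1.0) / (S + S.T + 2.0)

def perron_eigenvector(K, iters=1000, tol=1e-12):
    """Power iteration for the principal (Perron) eigenvector of a
    nonnegative matrix K; the normalized eigenvector serves as the rating."""
    r = np.full(K.shape[0], 1.0 / K.shape[0])
    for _ in range(iters):
        r_new = K @ r
        r_new /= r_new.sum()
        if np.linalg.norm(r_new - r, 1) < tol:
            break
        r = r_new
    return r

S = np.array([[0.0, 2.0, 3.0],
              [1.0, 0.0, 2.0],
              [0.0, 1.0, 0.0]])
print(perron_eigenvector(keener_matrix(S)))
```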

A.6. Analytical Hierarchy Process

The analytical hierarchy process (AHP) is a well-known decision-making method [47]. Many modifications of this method exist, but we restrict ourselves to considering only two of them: the AHP Perron–Frobenius version (AHPPF) and the AHP geometric mean version (AHPGM), which are briefly described below. A main problem related to the AHP is the inconsistency problem (of a pairwise comparison matrix). We will not discuss this problem here because of its technical nature; therefore, we consider the AHP only as a procedure for constructing a rating vector. Let us assume again that $n$ is the number of athletes/teams, which should be ranked based on the score matrix $S$. We also assume that the score matrix allows the construction of a matrix $T = (t_{ij})_{n \times n}$, which is a reciprocal matrix of pairwise comparisons. Recall that a matrix $T$ is called a reciprocal matrix of pairwise comparisons if it has the following properties: $t_{ij} > 0$, $t_{ii} = 1$, and $t_{ji} = 1/t_{ij}$. Note also that, for a positive reciprocal matrix $T$, its principal eigenvalue $\lambda_{\max}$ has the following properties: $\lambda_{\max} \ge n$, and if $\lambda_{\max} > n$, we have an inconsistency problem. The AHPPF rating vector $r^{\mathrm{PF}}$ is defined as the solution of the eigenvalue problem $T r^{\mathrm{PF}} = \lambda_{\max} r^{\mathrm{PF}}$, with the principal eigenvalue $\lambda_{\max}$, and the corresponding ranking is called the AHPPF rank. On the other hand, the AHPGM rating vector $r^{\mathrm{GM}}$ is defined as follows:

$$r^{\mathrm{GM}}_{i} = \Bigl(\prod_{j=1}^{n} t_{ij}\Bigr)^{1/n}, \qquad i = 1, \ldots, n,$$

and the corresponding ranking will be called the AHPGM rank.
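A minimal Python sketch of both AHP variants is given below; the reciprocal comparison matrix is built from a toy score matrix via the smoothed ratio t_ij = (s_ij + 1)/(s_ji + 1), which is an illustrative choice rather than the construction used in the main text's computations.

```python
import numpy as np

def ahp_gm_rating(T):
    """AHPGM rating: the geometric mean of each row of the reciprocal
    pairwise-comparison matrix T, normalized to sum to one."""
    r = np.prod(T, axis=1) ** (1.0 / T.shape[0])
    return r / r.sum()

def ahp_pf_rating(T):
    """AHPPF rating: the principal (Perron) eigenvector of T."""
    vals, vecs = np.linalg.eig(T)
    k = np.argmax(vals.real)
    r = np.abs(vecs[:, k].real)
    return r / r.sum()

# A reciprocal comparison matrix built from a toy score matrix via the
# smoothed ratio t_ij = (s_ij + 1) / (s_ji + 1) (an illustrative choice).
S = np.array([[0.0, 2.0, 3.0],
              [1.0, 0.0, 2.0],
              [0.0, 1.0, 0.0]])
T = (S + 1.0) / (S.T + 1.0)
print(ahp_gm_rating(T))
print(ahp_pf_rating(T))
```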

Data Availability

The data of Sala et al. [12] were used to support this study.

Conflicts of Interest

The authors declare no conflicts of interest regarding this article.