Optimizing the Allocation of Trials to Sub-regions in Multi-environment Crop Variety Testing

Journal of Agricultural, Biological and Environmental Statistics

Abstract

New crop varieties are extensively tested in multi-environment trials in order to obtain a solid empirical basis for recommendations to farmers. When the target population of environments is large and heterogeneous, a division into sub-regions is often advantageous. When designing such trials, the question arises how to allocate trials to the different sub-regions. We consider a solution to this problem assuming a linear mixed model. We propose an analytical approach for computation of optimal designs for best linear unbiased prediction of genotype effects and their pairwise linear contrasts and illustrate the obtained results by a real data example from Indian nation-wide maize variety trials. It is shown that, except in simple cases such as a compound symmetry model, the optimal allocation depends on the variance–covariance structure for genotypic effects nested within sub-regions.

References

  • Atlin G, Baker RJ, McRae KB, Lu X (2000) Selection response in subdivided target regions. Crop Sci 40:7–13

  • Bueno Filho JSDS, Gilmour S (2003) Planning incomplete block experiments when treatments are genetically related. Biometrics 59:375–381

  • Bueno Filho JSDS, Gilmour S (2007) Block designs for random treatment effects. J Stat Plan Inference 137:1446–1451

  • Butler DG, Smith AB, Cullis BR (2014) On the design of field experiments with correlated treatment effects. J Agric Biol Environ Stat 19:539–555

  • Caliński T, Czajka S, Kaczmarek Z, Krajewski P, Pilarczyk W (2009) Analysing the genotype by environment interactions under a randomization-derived mixed model. J Agric Biol Environ Stat 14:224–241

  • Cochran WG, Cox GM (1957) Experimental designs. Wiley, New York

  • Cullis BR, Smith A, Coombes N (2006) On the design of early generation variety trials with correlated data. J Agric Biol Environ Stat 11:381–393

  • Cullis BR, Smith A, Cocks NA, Butler DG (2020) The design of early-stage plant breeding trials using genetic relatedness. J Agric Biol Environ Stat 25:553–578

  • Entholzner M, Benda N, Schmelter T, Schwabe R (2005) A note on designs for estimating population parameters. Biom Lett 42:25–41

  • Fedorov V, Jones B (2005) The design of multicentre trials. Stat Methods Med Res 14:205–248

  • Gladitz J, Pilz J (1982) Construction of optimal designs in random coefficient regression models. Mathematische Operationsforschung und Statistik, Series Statistics 13:371–385

  • González-Barrios P, Díaz-García L, Gutiérrez L (2019) Mega-environmental design: using genotype \(\times \) environment interaction to optimize resources for cultivar testing. Crop Sci 59:1899–1915

  • Harman R, Filová L (2016) Package 'OptimalDesign'. https://cran.r-project.org/web/packages/OptimalDesign/index.html. Accessed 20 Nov 2020

  • Harman R, Prus M (2018) Computing optimal experimental designs with respect to a compound Bayes Risk criterion. Stat Probab Lett 137:135–141

  • Henderson CR (1975) Best linear unbiased estimation and prediction under a selection model. Biometrics 31:423–477

  • Heslot N, Feoktistov V (2020) Optimization of selective phenotyping and population design for genomic prediction. J Agric Biol Environ Stat 25:579–600

  • Isik F, Holland J, Maltecca C (2017) Genetic data analysis for plant and animal breeding. Springer, New York

  • Jiang J, Lahiri P (2006) Estimation of finite population domain means: a model-assisted empirical best prediction approach. J Am Stat Assoc 101:301–311

  • John JA, Williams ER (1995) Cyclic and computer generated designs. Chapman and Hall, London

  • Kackar RN, Harville DA (1984) Approximations for standard errors of estimators of fixed and random effects in mixed linear models. J Am Stat Assoc 79:853–862

  • Kiefer J (1974) General equivalence theory for optimum designs (approximate theory). Ann Stat 2:849–879

  • Kleinknecht K, Möhring J, Singh K, Zaidi P, Atlin G, Piepho H-P (2013) Comparison of the performance of BLUE and BLUP for zoned Indian maize data. Crop Sci 53:1384–1391

  • McLean RA, Sanders WL (1988) Approximating degrees of freedom for standard errors in mixed linear models. In: Proceedings of the statistical computing section. American Statistical Association, Alexandria, VA, pp 50–59

  • Mead R, Gilmour S, Mead A (2012) Statistical principles for the design of experiments. Cambridge Univ. Press, Cambridge

  • Piepho H-P (1997) Analyzing genotype-environment data by mixed models with multiplicative effects. Biometrics 53:761–766

  • Piepho H-P, Möhring J (2005) Best linear unbiased prediction of cultivar effects for subdivided target regions. Crop Sci 45:1151–1159

  • Prus M (2019) Optimal designs in multiple group random coefficient regression models. Test 4:5. https://doi.org/10.1007/s11749-019-00654-6

  • Prus M, Schwabe R (2016) Optimal designs for the prediction of individual parameters in hierarchical models. J R Stat Soc Ser B 78:175–191

  • Pukelsheim F, Rieder S (1992) Efficient rounding of approximate designs. Biometrika 79:763–770

  • Searle SR, Casella G, McCulloch CE (1992) Variance components. Wiley, New York

  • Snedecor GW, Cochran WG (1967) Statistical methods. Iowa State University Press, Ames

  • Speed TP, Williams ER, Patterson HD (1985) A note on the analysis of resolvable block designs. J R Stat Soc B 47:357–361

  • Torabi M, Jiang J (2020) Estimation of mean squared prediction error of empirically spatial predictor of small area means under a linear mixed model. J Stat Plan Inference 208:82–93

  • Williams E, John JA, Whitaker D (2014) Construction of more flexible and efficient p-rep designs. Aust N Z J Stat 56:89–96

Acknowledgements

This research was partially supported by Grant SCHW 531/16 of the German Research Foundation (DFG). The authors are grateful to Waqas Malik (University of Hohenheim) for determining the areas of the five breeding zones for maize in India based on a digitized map. The authors thank three referees, the Associate Editor and the Editor-in-Chief for helpful comments which improved the presentation of the results.

Author information

Corresponding author

Correspondence to Maryna Prus.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary material 1 (txt 2 KB)

Appendices

Proof of Lemmas 1 and 2

To make use of the theoretical results that are available in the literature (see, e.g., Henderson 1975) for the prediction of random parameters, we will represent the model (1) as a particular case of the general LMM

$$\begin{aligned} \mathbf {Y}=\mathbf {X} {\varvec{\beta }} + \mathbf {Z} {\varvec{\zeta }} + {\varvec{\epsilon }} \end{aligned}$$
(29)

with design matrices \(\mathbf {X}\) and \(\mathbf {Z}\) for the fixed effects and the random effects, respectively. In (29), \({\varvec{\beta }}\) denotes the fixed effects and \({\varvec{\zeta }}\) the random effects. The random effects and the observational errors \({\varvec{\epsilon }}\) are assumed to have zero mean and positive definite covariance matrices \(\text{ Cov }\,({\varvec{\zeta }})=\mathbf {G}\) and \(\text{ Cov }\,({\varvec{\epsilon }})=\mathbf {R}\), respectively, and to be uncorrelated with each other.

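To write model (1) in the form (29), we stack the observations step by step: first into vectors of length L for each combination of sub-region i, location j and genotype k, then over the locations j within a sub-region, then over the sub-regions i, and finally over the genotypes k: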

$$\begin{aligned}&\mathbf {Y}_{ijk}=\mathbb {1}_L\,\mu _i+\mathbb {1}_L\,\alpha _{ik}+\mathbb {1}_L\,\lambda _{ij}+\mathbb {1}_L\,\gamma _{ijk}+\mathbf {b}_{ij}+{\varvec{\varepsilon }}_{ijk},\\&\quad i=1, \dots , P,\quad k=1, \dots , K,\quad j=1, \dots , J_i, \end{aligned}$$

where \(\mathbf {b}_{ij}=(b_{ij1}, \dots , b_{ijL})^\top \).

$$\begin{aligned} \mathbf {Y}_{ik}= & {} \mathbb {1}_{LJ_i}\,\mu _i+\mathbb {1}_{LJ_i}\,\alpha _{ik}+(\mathbb {I}_{J_i}\otimes \mathbb {1}_L)\,{\varvec{\lambda }}_{i}\\&+\,(\mathbb {I}_{J_i}\otimes \mathbb {1}_L)\,{\varvec{\gamma }}_{ik}+\mathbf {b}_{i}+{\varvec{\varepsilon }}_{ik},\\&\qquad i=1, \dots , P,\quad k=1, \dots , K, \end{aligned}$$

where \({\varvec{\lambda }}_i=(\lambda _{i1}, \dots , \lambda _{iJ_i})^\top \) and \({\varvec{\gamma }}_{ik}=(\gamma _{i1k}, \dots , \gamma _{iJ_ik})^\top \).

$$\begin{aligned} \mathbf {Y}_{k}=\mathbf {F}{\varvec{\mu }}+\mathbf {F}{\varvec{\alpha }}_{k}+\mathbf {H}{\varvec{\lambda }}+\mathbf {H}{\varvec{\gamma }}_k+\mathbf {b}+{\varvec{\varepsilon }}_{k},\quad k=1, \dots , K, \end{aligned}$$

where \(\mathbf {H}=(\mathbb {I}_J\otimes \mathbb {1}_L)\), \({\varvec{\mu }}=(\mu _1, \dots , \mu _P)^\top \), \({\varvec{\lambda }}=({\varvec{\lambda }}_1^\top , \dots , {\varvec{\lambda }}_P^\top )^\top \) and \({\varvec{\gamma }}_k=({\varvec{\gamma }}_{1k}^\top , \dots , {\varvec{\gamma }}_{Pk}^\top )^\top \).

$$\begin{aligned} \mathbf {Y}= & {} (\mathbb {1}_K\otimes \mathbf {F}){\varvec{\mu }}+(\mathbb {I}_K\otimes \mathbf {F}){\varvec{\alpha }}+(\mathbb {1}_K\otimes \mathbf {H}){\varvec{\lambda }}\\&+\,(\mathbb {I}_K\otimes \mathbf {H}){\varvec{\gamma }}+(\mathbb {1}_K\otimes \mathbb {I}_{LJ})\mathbf {b}+{\varvec{\varepsilon }}, \end{aligned}$$

where \({\varvec{\gamma }}=({\varvec{\gamma }}_1^\top , \dots , {\varvec{\gamma }}_K^\top )^\top \).
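The stacking over genotypes can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes that \(\mathbf {F}\) is the block-diagonal matrix with blocks \(\mathbb {1}_{LJ_i}\) (which is what the per-sub-region equation implies), uses hypothetical sizes and parameter values, and keeps only the \({\varvec{\mu }}\) and \({\varvec{\alpha }}\) terms.

```python
import numpy as np

# Hypothetical sizes: locations per sub-region J_i, vector length L, genotypes K
J_i, L, K = [2, 1], 2, 3
P, J = len(J_i), sum(J_i)

# Assumption: F is block-diagonal with i-th block 1_{L*J_i}
F = np.zeros((L * J, P))
start = 0
for i, n_loc in enumerate(J_i):
    F[start:start + L * n_loc, i] = 1.0
    start += L * n_loc

mu = np.array([10.0, 20.0])                            # hypothetical sub-region means
alpha = np.arange(K * P, dtype=float).reshape(K, P)    # row k holds alpha_k

# Genotype-wise form: Y_k = F mu + F alpha_k, concatenated over k
per_genotype = np.concatenate([F @ mu + F @ alpha[k] for k in range(K)])

# Stacked form: Y = (1_K kron F) mu + (I_K kron F) alpha
X = np.kron(np.ones((K, 1)), F)
Z = np.kron(np.eye(K), F)
stacked = X @ mu + Z @ alpha.ravel()

print(np.allclose(per_genotype, stacked))
```

The Kronecker factor \(\mathbb {1}_K\) repeats a term that is shared by all genotypes, whereas \(\mathbb {I}_K\) assigns each genotype its own copy of the parameter vector.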

The latter equation may alternatively be written as

$$\begin{aligned} \mathbf {Y}=(\mathbb {1}_K\otimes \mathbf {F}){\varvec{\mu }}+(\mathbb {I}_K\otimes \mathbf {F}){\varvec{\alpha }}+\tilde{{\varvec{\varepsilon }}}, \end{aligned}$$
(30)

where \(\tilde{{\varvec{\varepsilon }}}:=(\mathbb {1}_K\otimes \mathbf {H}){\varvec{\lambda }}+(\mathbb {I}_K\otimes \mathbf {H}){\varvec{\gamma }}+(\mathbb {1}_K\otimes \mathbb {I}_{LJ})\mathbf {b}+{\varvec{\varepsilon }}\). Model (30) is of form (29) with \(\mathbf {X}=(\mathbb {1}_K\otimes \mathbf {F})\), \(\mathbf {Z}=(\mathbb {I}_K\otimes \mathbf {F})\), \(\mathbf {G}=\mathrm {Cov}({\varvec{\alpha }})=\sigma ^2\mathbb {I}_K\otimes \mathbf {D}\) and

$$\begin{aligned} \mathbf {R}= & {} \mathrm {Cov}(\tilde{{\varvec{\varepsilon }}})=\sigma ^2((v_1\mathbb {1}_K\mathbb {1}_K^\top +v_2\mathbb {I}_K)\otimes \mathbb {I}_J\otimes (\mathbb {1}_L\mathbb {1}_L^\top )\\&+\,v_3(\mathbb {1}_K\mathbb {1}_K^\top )\otimes \mathbb {I}_{LJ}+\mathbb {I}_{LJK}). \end{aligned}$$
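The form of \(\mathbf {R}\) can be verified term by term. The following sketch assumes, as the expression for \(\mathbf {R}\) suggests (the definitions from the main text are not repeated in this appendix), that \({\varvec{\lambda }}\), \({\varvec{\gamma }}\), \(\mathbf {b}\) and \({\varvec{\varepsilon }}\) are mutually uncorrelated with \(\mathrm {Cov}({\varvec{\lambda }})=\sigma ^2v_1\mathbb {I}_J\), \(\mathrm {Cov}({\varvec{\gamma }})=\sigma ^2v_2\mathbb {I}_{JK}\), \(\mathrm {Cov}(\mathbf {b})=\sigma ^2v_3\mathbb {I}_{LJ}\) and \(\mathrm {Cov}({\varvec{\varepsilon }})=\sigma ^2\mathbb {I}_{LJK}\). With the mixed-product property \((\mathbf {A}\otimes \mathbf {B})(\mathbf {C}\otimes \mathbf {D})=(\mathbf {A}\mathbf {C})\otimes (\mathbf {B}\mathbf {D})\) one obtains

$$\begin{aligned} \mathbf {H}\mathbf {H}^\top&=\mathbb {I}_J\otimes (\mathbb {1}_L\mathbb {1}_L^\top ),\\ (\mathbb {1}_K\otimes \mathbf {H})\,\mathrm {Cov}({\varvec{\lambda }})\,(\mathbb {1}_K\otimes \mathbf {H})^\top&=\sigma ^2v_1\,(\mathbb {1}_K\mathbb {1}_K^\top )\otimes \mathbb {I}_J\otimes (\mathbb {1}_L\mathbb {1}_L^\top ),\\ (\mathbb {I}_K\otimes \mathbf {H})\,\mathrm {Cov}({\varvec{\gamma }})\,(\mathbb {I}_K\otimes \mathbf {H})^\top&=\sigma ^2v_2\,\mathbb {I}_K\otimes \mathbb {I}_J\otimes (\mathbb {1}_L\mathbb {1}_L^\top ),\\ (\mathbb {1}_K\otimes \mathbb {I}_{LJ})\,\mathrm {Cov}(\mathbf {b})\,(\mathbb {1}_K\otimes \mathbb {I}_{LJ})^\top&=\sigma ^2v_3\,(\mathbb {1}_K\mathbb {1}_K^\top )\otimes \mathbb {I}_{LJ}. \end{aligned}$$

Adding \(\mathrm {Cov}({\varvec{\varepsilon }})=\sigma ^2\mathbb {I}_{LJK}\) and collecting the first two contributions, which share the factor \(\mathbb {I}_J\otimes (\mathbb {1}_L\mathbb {1}_L^\top )\), gives the stated expression for \(\mathbf {R}\).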

According to Henderson (1975), the BLUP of the random effects \({\varvec{\zeta }}\) (which corresponds to \({\varvec{\alpha }}\) in our model (30)) is given by

$$\begin{aligned} \hat{{\varvec{\zeta }}}= & {} \left( \mathbf {Z}^\top \mathbf {R}^{-1}\mathbf {Z}+\mathbf {G}^{-1} -\mathbf {Z}^\top \mathbf {R}^{-1}\mathbf {X} (\mathbf {X}^\top \mathbf {R}^{-1}\mathbf {X})^{-}\mathbf {X}^\top \mathbf {R}^{-1}\mathbf {Z}\right) ^{-1}\nonumber \\&\cdot \left( \mathbf {Z}^\top \mathbf {R}^{-1} -\mathbf {Z}^\top \mathbf {R}^{-1}\mathbf {X} (\mathbf {X}^\top \mathbf {R}^{-1}\mathbf {X})^{-}\mathbf {X}^\top \mathbf {R}^{-1}\right) \mathbf {Y}. \end{aligned}$$
(31)

Using this formula, we obtain the BLUP for the genotype effects \({\varvec{\alpha }}\), which results in formula (2). The MSE matrix of the BLUP of the random effects \({\varvec{\zeta }}\) is given by

$$\begin{aligned} \mathrm {Cov}(\hat{{\varvec{\zeta }}}-{\varvec{\zeta }})=\left( \mathbf {Z}^\top \mathbf {R}^{-1}\mathbf {Z} +\mathbf {G}^{-1}-\mathbf {Z}^\top \mathbf {R}^{-1}\mathbf {X} (\mathbf {X}^\top \mathbf {R}^{-1}\mathbf {X})^{-}\mathbf {X}^\top \mathbf {R}^{-1}\mathbf {Z}\right) ^{-1}, \end{aligned}$$
(32)

where \(\mathbf {A}^{-}\) denotes a generalized inverse of \(\mathbf {A}\). By this formula, we obtain MSE matrix (4). Then, using the relation \({\varvec{\theta }}^{k,k'}={\varvec{\alpha }}_k-{\varvec{\alpha }}_{k'}=((\mathbf {e}_k-\mathbf {e}_{k'})^\top \otimes \mathbb {I}_P)\,{\varvec{\alpha }}\) between the genotype effects and their pairwise contrasts, we obtain formulae (3) and (5).
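Formulae (31) and (32) and the contrast relation can be illustrated numerically. The sketch below is not the authors' implementation: it uses a generic small LMM with randomly generated (hypothetical) \(\mathbf {X}\), \(\mathbf {Z}\), \(\mathbf {G}\), \(\mathbf {R}\) and \(\mathbf {Y}\), takes the Moore-Penrose inverse as the generalized inverse, and compares (31) with the well-known equivalent form \(\mathbf {G}\mathbf {Z}^\top \mathbf {V}^{-1}(\mathbf {Y}-\mathbf {X}\hat{{\varvec{\beta }}})\), where \(\mathbf {V}=\mathbf {Z}\mathbf {G}\mathbf {Z}^\top +\mathbf {R}\) and \(\hat{{\varvec{\beta }}}\) is the generalized least squares estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
K, P = 2, 2                 # hypothetical: genotypes and sub-regions
n, p, q = 12, 2, K * P      # observations, fixed effects, random effects

X = rng.normal(size=(n, p))
Z = rng.normal(size=(n, q))
A = rng.normal(size=(q, q))
G = A @ A.T + np.eye(q)     # positive definite Cov(zeta)
B = rng.normal(size=(n, n))
R = B @ B.T + np.eye(n)     # positive definite Cov(epsilon)
Y = rng.normal(size=n)

Ri = np.linalg.inv(R)
XRX_ginv = np.linalg.pinv(X.T @ Ri @ X)   # one choice of generalized inverse

# MSE matrix (32): inverse of the matrix in the first factor of (31)
M = np.linalg.inv(
    Z.T @ Ri @ Z + np.linalg.inv(G) - Z.T @ Ri @ X @ XRX_ginv @ X.T @ Ri @ Z
)
# BLUP (31)
zeta_hat = M @ (Z.T @ Ri - Z.T @ Ri @ X @ XRX_ginv @ X.T @ Ri) @ Y

# Equivalent form based on V = Z G Z' + R and the GLS estimator of beta
V = Z @ G @ Z.T + R
Vi = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ Y)
zeta_alt = G @ Z.T @ Vi @ (Y - X @ beta_gls)
print(np.allclose(zeta_hat, zeta_alt))

# Pairwise contrast of the two stacked effect vectors: C = (e_k - e_k')' kron I_P
e = np.zeros((1, K))
e[0, 0], e[0, 1] = 1.0, -1.0
C = np.kron(e, np.eye(P))
theta_hat = C @ zeta_hat        # predicted contrast
mse_contrast = C @ M @ C.T      # its MSE matrix
```

Because the contrast is a linear function of the effects, its predictor and MSE matrix follow from those of the effects by pre- and post-multiplication with the contrast matrix.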

Table 7 Optimal numbers of locations per sub-region and efficiency of balanced design compared to optimal designs with respect to the standard A-criterion in the FA model (late maturity) for different values of the total number of locations J and the error variance \(\sigma ^2\).
Table 8 Optimal numbers of locations per sub-region and efficiency of balanced design compared to optimal designs with respect to the standard A-criterion in the FA model (medium maturity) for different values of the total number of locations J and the error variance \(\sigma ^2\).

Sensitivity Analysis

1.1 Standard A-Criterion

We take the values of the covariance matrix \(\mathbf {V}\) from Tables 3, 4 and 5 in Kleinknecht et al. (2013) for late, medium and early maturity. Tables 7, 8 and 9 summarize the results for optimal designs with respect to the standard A-criterion in the FA model for late, medium and early maturity, respectively.
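The kind of computation behind such tables can be sketched by brute force for a small problem. This is an illustration under stated assumptions, not the analytical approach of the paper: the criterion is taken to be the sum over all genotype pairs of the trace of the contrast MSE matrix derived from (32), the efficiency of a design is taken to be the ratio of the optimal criterion value to that of the design, \(\mathbf {F}\) is assumed block-diagonal with blocks \(\mathbb {1}_{LJ_i}\), and the matrix `D`, the variance ratios `v`, `sigma2` and all sizes are hypothetical values rather than the maize data.

```python
import itertools
import numpy as np

def mse_alpha(J_i, D, v, sigma2, K, L):
    """MSE matrix (32) of the BLUP of alpha for the allocation J_i."""
    P, J = len(J_i), sum(J_i)
    F = np.zeros((L * J, P))            # assumed block-diagonal structure
    start = 0
    for i, n_loc in enumerate(J_i):
        F[start:start + L * n_loc, i] = 1.0
        start += L * n_loc
    X = np.kron(np.ones((K, 1)), F)
    Z = np.kron(np.eye(K), F)
    G = sigma2 * np.kron(np.eye(K), D)
    v1, v2, v3 = v
    ones_K, ones_L = np.ones((K, K)), np.ones((L, L))
    R = sigma2 * (
        np.kron(np.kron(v1 * ones_K + v2 * np.eye(K), np.eye(J)), ones_L)
        + v3 * np.kron(ones_K, np.eye(L * J))
        + np.eye(L * J * K)
    )
    Ri = np.linalg.inv(R)
    ZRX = Z.T @ Ri @ X
    XRX_ginv = np.linalg.pinv(X.T @ Ri @ X)
    return np.linalg.inv(Z.T @ Ri @ Z + np.linalg.inv(G) - ZRX @ XRX_ginv @ ZRX.T)

def a_criterion(J_i, D, v, sigma2, K, L):
    """Assumed criterion: sum over genotype pairs of the trace of the contrast MSE."""
    M = mse_alpha(J_i, D, v, sigma2, K, L)
    P = len(J_i)
    total = 0.0
    for k, kp in itertools.combinations(range(K), 2):
        e = np.zeros((1, K))
        e[0, k], e[0, kp] = 1.0, -1.0
        C = np.kron(e, np.eye(P))
        total += np.trace(C @ M @ C.T)
    return total

# Hypothetical inputs
D = np.array([[1.0, 0.8, 0.2],
              [0.8, 1.5, 0.3],
              [0.2, 0.3, 0.7]])
v, sigma2, K, L, J = (0.5, 0.5, 0.2), 1.0, 3, 2, 9
P = D.shape[0]

# All allocations with at least one location per sub-region
candidates = [a for a in itertools.product(range(1, J + 1), repeat=P) if sum(a) == J]
crit = {a: a_criterion(a, D, v, sigma2, K, L) for a in candidates}
best = min(crit, key=crit.get)
balanced = (J // P,) * P
eff = crit[best] / crit[balanced]
print("optimal allocation:", best, "efficiency of balanced design:", round(eff, 4))
```

Full enumeration is only feasible for small \(J\) and \(P\); it serves here to make the notions of optimal allocation and efficiency concrete.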

Table 9 Optimal numbers of locations per sub-region and efficiency of balanced design compared to optimal designs with respect to the standard A-criterion in the FA model (early maturity) for different values of the total number of locations J and the error variance \(\sigma ^2\).
Table 10 Optimal numbers of locations per sub-region and efficiencies of balanced and weighted designs compared to optimal designs with respect to the weighted A-criterion for the FA model (late maturity) for different values of the total number of locations J and the error variance \(\sigma ^2\).
Table 11 Optimal numbers of locations per sub-region and efficiencies of balanced and weighted designs compared to optimal designs with respect to the weighted A-criterion for the FA model (medium maturity) for different values of the total number of locations J and the error variance \(\sigma ^2\).
Table 12 Optimal numbers of locations per sub-region and efficiencies of balanced and weighted designs compared to optimal designs with respect to the weighted A-criterion for the FA model (early maturity) for different values of the total number of locations J and the error variance \(\sigma ^2\).
Fig. 1 Efficiencies \(\mathrm {Eff}_{a,P}\) (dark dots), \(\mathrm {Eff}_{a,\ell }\) (big light dots) and \(\mathrm {Eff}_{e,P}\) (small light dots) as a function of the total number of locations J for the weighted A-criterion in the CS model for late (left panel), medium (middle panel) and early (right panel) maturity for \(\sigma ^2=50\) (first row), \(\sigma ^2=200\) (second row) and \(\sigma ^2=400\) (third row)

1.2 Weighted A-Criterion

Tables 10, 11 and 12 summarize the results for optimal designs with respect to the weighted A-criterion in the FA model for late, medium and early maturity, respectively.

Figure 1 shows the efficiencies of balanced and weighted designs relative to optimal approximate and exact designs (\(\mathrm {Eff}_{a,P}\), \(\mathrm {Eff}_{a,\ell }\) and \(\mathrm {Eff}_{e,P}\) as in Sect. 4) as a function of the total number of locations J for the weighted A-criterion in the CS model. For J, we considered all multiples of 5 between 15 and 200. The error variance was fixed in turn at \(\sigma ^2=50\), \(\sigma ^2=200\) and \(\sigma ^2=400\).

Cite this article

Prus, M., Piepho, HP. Optimizing the Allocation of Trials to Sub-regions in Multi-environment Crop Variety Testing. JABES 26, 267–288 (2021). https://doi.org/10.1007/s13253-020-00426-y

  • DOI: https://doi.org/10.1007/s13253-020-00426-y
