
An updated paradigm for evaluating measurement invariance incorporating common method variance and its assessment

  • Methodological Paper
Journal of the Academy of Marketing Science

Abstract

Measurement invariance is necessary before any substantive cross-national comparisons can be made. The statistical workhorse for conducting measurement invariance analyses is the multigroup confirmatory factor analysis model. This model works well if only a few items exhibit differential item functioning, but it cannot capture, model, and control for measurement bias that affects all items; that is, it cannot account for common method variance. The presence of common method variance in cross-national data leads to poorly fitting models, which in turn often yield biased, if not incorrect, results. We introduce a procedure to analyze and control for common method variance in one's data, based on a series of factor analysis models with a random intercept. The modeling framework yields constructs and factor scores free of method effects. We use marker variables to support the validity of interpreting the random intercept as a method factor. An empirical application dealing with material values in Spain, the UK, and Brazil is provided. We compare results with those obtained for the standard multigroup confirmatory factor analysis model.


Fig. 1


Notes

  1. https://www.andiamo.co.uk/resources/expansion-and-contraction-factors/

  2. The cross-national measurement equivalence and validity of “objective” data such as scanner data, company data, and government statistics is not guaranteed either, although this issue is often ignored in international (and domestic) research. Retail scanner data can, for example, suffer from cross-national differences in coverage of the population (rural vs. urbanized areas) and of distribution channels (e.g., mom-and-pop stores, hard discounters). Company data can be manipulated, and reporting regimes differ between countries. Official statistics can be highly unreliable (one early landmark study is Chambliss and Nagasawa 1969) and possibly manipulated (this lies at the heart of recent crises in the Eurozone). Even the assessment of a basic measure like GDP is fraught with problems, including differences in the hidden (unreported) economy (e.g., Schneider et al. 2010). Psychometric models can also be used for validity assessment of objective data, such as GDP. An interesting application is El Aziz et al. (2019), who use a Multiple Indicators Multiple Causes model (Jöreskog and Goldberger 1975) to estimate the size of the shadow economy of nine countries in the Middle East and North Africa. However, problems with the equivalence of supposedly objective data are outside the scope of this article.

  3. Since the models are nested, one could also use the χ2 difference test. However, with large samples the difference in χ2 is almost always significant, so it is not particularly useful for invariance testing.

  4. Hulland et al. (2018) recommend .50 as the cutoff for salient loadings, but this is based on the standard CFA model, in which method variance often upwardly biases the factor loadings (Baumgartner and Steenkamp 2006a, b). In our experience, a cutoff of .50 is also too stringent in cross-national research.

  5. This means that by definition, sociodemographics are not marker variables, and should not be used as such.

  6. We have encountered a few instances in which the correlation with none of the marker items exceeds this cutoff. In our experience, the prime culprit is the use of suboptimal marker variables—the chosen markers are not conceptually unrelated to the construct and/or are not similar to the other survey items in content and format. If the marker variable cannot be changed, we suggest investigating whether a more precise measurement model (adding correlated errors, not necessarily invariant across countries) solves the problem.

  7. These models can also be estimated using any other SEM package, such as LISREL, EQS, and Amos; they can also be estimated using R packages such as lavaan. However, Mplus has emerged as the most powerful software program for psychometric analyses. Excellent non-technical books introducing Mplus to applied users are Byrne (2012) and Wang and Wang (2020). Both books include multi-group testing.

  8. Statistical significance says relatively little given the large sample sizes.

  9. Metric invariance applies to unstandardized factor loadings. Due to cross-national differences in item and factor variances, standardized factor loadings can still differ. Our interest here is in comparing the magnitude of substantive and method loadings within countries, for which we need to use standardized factor loadings, since only in Spain are the substantive and method variances equal (constrained to one).

  10. Based on the invariant unstandardized parameter estimates.

  11. Convergent evidence for this explanation comes from a regression of the social desirability marker on age. In all three countries, socially desirable responding increased with age—for every 10 years of age, the predicted rating increases by between .16 and .35 scale points. Older people score higher on this marker, and the marker exhibits a strong negative correlation with the method factor, which explains why older people score lower on the method factor. This is additional evidence that the random intercept captures method variance.

  12. A supplementary regression validated this finding. Women score on average between .33 and .60 scale points higher on the social desirability marker than men. Additionally, women score between .13 and .49 points lower on “Blue,” which is positively related to the method factor; both findings thus point in the same direction as the SDR marker.

  13. Note that plus or minus is arbitrary. It makes no difference to give all negatively (positively)-worded items a starting value of +1 (−1).

  14. Mplus does not report the unrotated solution. To obtain a solution similar to the unrotated 2-factor solution, one needs to use a 2-factor bi-factor model (Jennrich and Bentler 2011). Principal component analysis is advised for short scales (6 items or fewer).

  15. It is not necessary to split the set into positively versus negatively loading items. If there are very few negatively loading items on the (m + 1)th factor, also assign the value of −1 to items with small positive loadings. The purpose here is simply to direct the program towards groups of items that exhibit different method loadings on average. Recall that this is merely a device to help with model convergence. It will not affect model fit.

  16. The fact that there is an equal number of positive and negative starting values is a coincidence. There is no need for an equal number of positive and negative starting values. Note that it makes no difference which group of items is assigned +1 versus −1.

References

  • Alden, D. L., Steenkamp, J.-B. E. M., & Batra, R. (2006). Consumer attitudes toward marketplace globalization: Structure, antecedents, and consequences. International Journal of Research in Marketing, 23(September), 227–239.

  • Baumgartner, H., & Steenkamp, J.-B. E. M. (1998). Multi-group latent variable models for varying numbers of items and factors with cross-national and longitudinal applications. Marketing Letters, 9(1), 21–35.

  • Baumgartner, H., & Steenkamp, J.-B. E. M. (2001). Response styles in marketing research: A cross-national investigation. Journal of Marketing Research, 38(May), 143–156.

  • Baumgartner, H., & Steenkamp, J.-B. E. M. (2006a). An extended paradigm for measurement analysis applicable to panel data. Journal of Marketing Research, 43(August), 431–442.

  • Baumgartner, H., & Steenkamp, J.-B. E. M. (2006b). Response biases in marketing research. In R. Grover & M. Vriens (Eds.), Handbook of marketing research (pp. 95–109). Thousand Oaks, CA: Sage.

  • Baumgartner, H., & Weijters, B. (2015). Response biases in cross-cultural measurement. In S. Ng & A. Y. Lee (Eds.), Handbook of culture and consumer behavior (pp. 150–180). Oxford: Oxford University Press.

  • Bearden, W. O., Netemeyer, R. G., & Haws, K. L. (2011). Handbook of marketing scales (3rd ed.). Thousand Oaks: Sage.

  • Berry, J. W. (1969). On cross-cultural comparability. International Journal of Psychology, 4(2), 119–128.

  • Byrne, B. M. (2012). Structural equation modeling with Mplus. New York: Routledge.

  • Byrne, B. M., Shavelson, R. J., & Muthén, B. (1989). Testing for the equivalence of factor covariance and mean structures: The issue of partial measurement invariance. Psychological Bulletin, 105(3), 456–466.

  • Chambliss, W., & Nagasawa, R. H. (1969). On the validity of official statistics: A comparative study of white, black, and Japanese high-school boys. Journal of Research in Crime and Delinquency, 6(1), 71–77.

  • Chen, F. F. (2007). Sensitivity of goodness of fit indexes to lack of measurement invariance. Structural Equation Modeling, 14(3), 464–504.

  • Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9(2), 233–255.

  • De Jong, M. G., & Steenkamp, J. B. E. M. (2010). Finite mixture multilevel multidimensional ordinal IRT models for large scale cross-cultural research. Psychometrika, 75(March), 3–32.

  • De Jong, M., Steenkamp, J.-B. E. M., & Fox, J.-P. (2007). Relaxing measurement invariance in cross-national consumer research using a hierarchical IRT model. Journal of Consumer Research, 34(August), 260–278.

  • De Jong, M., Steenkamp, J.-B. E. M., Fox, J.-P., & Baumgartner, H. (2008). Using item response theory to measure extreme response style in marketing research: A global investigation. Journal of Marketing Research, 45(February), 104–115.

  • Dong, B., Zhou, S., & Taylor, C. R. (2008). Factors that influence multinational corporations’ control of their operations in foreign markets: An empirical investigation. Journal of International Marketing, 16(1), 98–119.

  • El Aziz, M., Magdy, M. A., & Zaki, I. M. (2019). Estimating the size of the shadow economy in nine MENA countries during the period 2000 to 2017 using the MIMIC model. Open Access Library Journal, 6, e5508.

  • Griffin, M., Babin, B. J., & Christensen, F. (2004). A cross-cultural investigation of the materialism construct: Assessing the Richins and Dawson’s materialism scale in Denmark, France and Russia. Journal of Business Research, 57, 893–900.

  • Hays, R. D., Hayashi, T., & Stewart, A. L. (1989). A five-item measure of socially desirable response set. Educational and Psychological Measurement, 49, 629–636.

  • He, Y., Merz, M. A., & Alden, D. L. (2008). Diffusion of measurement invariance assessment in cross-national empirical marketing research: Perspectives from the literature and a survey of researchers. Journal of International Marketing, 16(2), 64–83.

  • Heinberg, M., Erkan Ozkaya, H., & Taube, M. (2016). A brand built on sand: Is acquiring a local brand in an emerging market an ill-advised strategy for foreign companies? Journal of the Academy of Marketing Science, 44, 586–607.

  • Heinberg, M., Katsikeas, C. S., Erkan Ozkaya, H., & Taube, M. (2020). How nostalgic brand positioning shapes brand equity: Differences between emerging and developed markets. Journal of the Academy of Marketing Science, 48(September), 869–890.

  • Hoppner, J. J., Griffith, D. A., & White, R. C. (2015). Reciprocity in relationship marketing: A cross-cultural examination of the effects of equivalence and immediacy on relationship quality and satisfaction with performance. Journal of International Marketing, 23(4), 64–83.

  • Horn, J. L., & McArdle, J. J. (1992). A practical and theoretical guide to measurement invariance in aging research. Experimental Aging Research, 18(3), 117–144.

  • Hu, L.-T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55.

  • Hulland, J., Baumgartner, H., & Smith, K. M. (2018). Marketing survey research best practices: Evidence and recommendations from a review of JAMS articles. Journal of the Academy of Marketing Science, 46, 92–108.

  • Jennrich, R. I., & Bentler, P. M. (2011). Exploratory bi-factor analysis. Psychometrika, 76(4), 537–549.

  • De Jong, M. G., Fox, J.-P., & Steenkamp, J.-B. E. M. (2015). Quantifying under- and over-reporting in surveys through a dual questioning technique design. Journal of Marketing Research, 52(December), 737–753.

  • Jöreskog, K. G., & Goldberger, A. S. (1975). Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association, 70(351), 631.

  • Lindell, M. K., & Whitney, D. J. (2001). Accounting for common method variance in cross-sectional research designs. Journal of Applied Psychology, 86(1), 114–121.

  • Lipovčan, L. K., Prizmić-Larsen, Z., & Brkljačić, T. (2015). Materialism, affective states, and life satisfaction: Case of Croatia. Springer Plus, 4, 699.

  • MacKenzie, S. B., & Podsakoff, P. M. (2012). Common method bias in marketing: Causes, mechanisms, and procedural remedies. Journal of Retailing, 88(4), 542–555.

  • Maydeu-Olivares, A. (2017a). Maximum likelihood estimation of structural equation models for continuous data: Standard errors and goodness of fit. Structural Equation Modeling, 24(3), 383–394.

  • Maydeu-Olivares, A. (2017b). Assessing the size of model misfit in structural equation models. Psychometrika, 82(3), 533–558.

  • Maydeu-Olivares, A., & Coffman, D. L. (2006). Random intercept item factor analysis. Psychological Methods, 11(4), 344–362.

  • McDonald, R. P. (1985). Factor analysis and related methods. Hillsdale: Erlbaum.

  • Müller, A., Smits, D. J. M., Claes, L., Gefeller, O., Hinz, A., & de Zwaan, M. (2013). The German version of the material values scale. GMS Psycho-Social-Medicine, 10, 1–9.

  • Muthén, L. K., & Muthén, B. O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 9(4), 599–620.

  • Muthén, L. K., & Muthén, B. (2017). MPLUS 8 [computer program]. Los Angeles: Muthén & Muthén.

  • Paulhus, D. L. (1991). Measurement and control of response bias. In J. P. Robinson, P. R. Shaver, & L. S. Wrightsman (Eds.), Measures of personality and social psychological attitudes (pp. 17–59). San Diego: Academic Press.

  • Podsakoff, P. M., MacKenzie, S. B., Lee, J.-Y., & Podsakoff, N. P. (2003). Common method biases in behavioral research: A critical review of the literature and recommended remedies. Journal of Applied Psychology, 88(October), 879–903.

  • Podsakoff, P. M., MacKenzie, S. B., & Podsakoff, N. P. (2012). Sources of method bias in social science research and recommendations on how to control it. Annual Review of Psychology, 63(1), 539–569.

  • Richins, M. L. (2004). The material values scale: Measurement properties and development of a short form. Journal of Consumer Research, 31(March), 209–219.

  • Richins, M. L., & Dawson, S. (1992). A consumer values orientation for materialism and its measurement: Scale development and validation. Journal of Consumer Research, 19(December), 303–316.

  • Ruvio, A., Somer, E., & Rindfleisch, A. (2014). When bad gets worse: The amplifying effect of materialism on traumatic stress and maladaptive consumption. Journal of the Academy of Marketing Science, 42, 90–101.

  • Schneider, F., Buehn, A., & Montenegro, C. E. (2010). New estimates for the shadow economies all over the world. International Economic Journal, 24(4), 443–461.

  • Sharma, P. (2010). Measuring personal cultural orientations: Scale development and validation. Journal of the Academy of Marketing Science, 38, 787–806.

  • Simmering, M. J., Fuller, C. M., Richardson, H. A., Ocal, Y., & Atinc, G. M. (2015). Marker variable choice, reporting, and interpretation in the detection of common method variance: A review and demonstration. Organizational Research Methods, 18(3), 473–511.

  • Skrondal, A., & Rabe-Hesketh, S. (2004). Generalized latent variable modeling: Multilevel, longitudinal, and structural equation models. Boca Raton: CRC Press.

  • Steenkamp, J.-B. E. M. (2005). Moving out of the U.S. silo: A call to arms for conducting international marketing research. Journal of Marketing, 69(October), 6–8.

  • Steenkamp, J.-B. E. M., & Baumgartner, H. (1998). Assessing measurement invariance in cross-national consumer research. Journal of Consumer Research, 25(June), 78–90.

  • Steenkamp, J.-B. E. M., & De Jong, M. G. (2010). A global investigation into the constellation of consumer attitudes toward global and local products. Journal of Marketing, 74(November), 18–40.

  • Steenkamp, J.-B. E. M., & Geyskens, I. (2014). Manufacturer and retailer strategies to impact store brand share: Global integration, local adaptation, and worldwide learning. Marketing Science, 33(January–February), 6–26.

  • Steenkamp, J.-B. E. M., & Ter Hofstede, F. (2002). International market segmentation: Issues and perspectives. International Journal of Research in Marketing, 19(September), 185–213.

  • Steenkamp, J.-B. E. M., De Jong, M. G., & Baumgartner, H. (2010). Socially desirable response tendencies in survey research. Journal of Marketing Research, 47(April), 199–214.

  • Strizhakova, Y., & Coulter, R. A. (2013). The “green” side of materialism in emerging BRIC and developed markets: The moderating role of global cultural identity. International Journal of Research in Marketing, 30(1), 69–82.

  • Swoboda, B., Puchert, C., & Morschett, D. (2016). Explaining the differing effects of corporate reputation across nations: A multilevel analysis. Journal of the Academy of Marketing Science, 44, 454–473.

  • Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3(1), 4–70.

  • Wang, J., & Wang, X. (2020). Structural equation modeling: Applications using Mplus (2nd ed.). New York: Wiley.

  • Zielke, S., & Komor, M. (2015). Cross-national differences in price-role orientation and their impact on retail markets. Journal of the Academy of Marketing Science, 43, 159–180.


Acknowledgements

We thank AiMark for providing the data, and thank the editor, John Hulland, and the three anonymous reviewers for their constructive suggestions, which have significantly improved the paper.


Corresponding author

Correspondence to Jan-Benedict E.M. Steenkamp.

Additional information

John Hulland served as Editor for this article.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Mplus code for models estimated in paper

Words in italics preceded by an exclamation mark are explanatory comments.

(The Mplus code is reproduced as a figure in the original article and is not shown here.)
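Because the code appears only as an image in the published version, the following is a minimal, hypothetical sketch of what a multigroup random-intercept CFA specification in Mplus can look like. The file, item, and group names are invented for illustration; this is not the authors' actual code.

```
! Hypothetical sketch, NOT the authors' code: multigroup CFA with a
! random-intercept (method) factor M that loads on every item.
TITLE:     Random-intercept CFA sketch;
DATA:      FILE = data.dat;            ! hypothetical data file
VARIABLE:  NAMES = country y1-y6;      ! hypothetical variables
           GROUPING = country (1 = g1  2 = g2  3 = g3);
MODEL:
  F BY y1-y6;                ! substantive construct
  M BY y1*1 y2*1 y3*1        ! method factor on ALL items;
       y4*-1 y5*-1 y6*-1;    ! * frees a loading and sets its start value
  M@1;                       ! fix method variance for identification
  M WITH F@0;                ! method factor orthogonal to the construct
```

In the actual analyses one would add the cross-group invariance constraints described in the paper; this fragment only shows the basic model setup.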

Appendix 2: Starting values for random-intercept factor

In our experience, model convergence is often improved if the researcher gives a positive (+1) or negative (−1) starting value for each random-intercept loading. One common scenario is a uni- or multidimensional measurement instrument which contains positively and negatively worded items. In this case, a straightforward approach is to give all positively worded items a starting value of +1 and all negatively worded items a starting value of −1.Footnote 13
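To illustrate with hypothetical item names (pos1–pos3 positively worded, neg1–neg2 negatively worded), such starting values can be supplied directly in the Mplus BY statement for the method factor, where * frees a loading and the number after it is the starting value:

```
! Hypothetical item names, for illustration only
M BY pos1*1 pos2*1 pos3*1    ! positively worded items start at +1
     neg1*-1 neg2*-1;        ! negatively worded items start at -1
```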

A second common scenario is a unidimensional measurement instrument, or a multidimensional instrument whose dimensions are strongly correlated, in which all items are positively worded. In that case, we recommend applying exploratory factor analysis or principal component analysis to the data, extracting two factors, and examining the unrotated solution.Footnote 14 The first factor will contain high loadings for all items, while the second factor will capture differences between groups of items.
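Since Mplus does not report the unrotated solution (footnote 14), a comparable two-factor solution can be requested via an exploratory bi-factor rotation. A sketch with hypothetical variable names, assuming the BI-GEOMIN rotation option:

```
! Hypothetical sketch: exploratory bi-factor analysis with a general
! factor plus one specific factor (cf. Jennrich and Bentler 2011)
VARIABLE:  NAMES = y1-y9;
ANALYSIS:  TYPE = EFA 2 2;
           ROTATION = BI-GEOMIN;
```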

A third common scenario is one where the researcher has instruments for m (m > 1) distinct constructs that exhibit low factor correlations, and all items are positively worded. In this case, we recommend exploratory factor analysis with orthogonal rotation, extracting m + 1 factors. If the factor structure is decent, m factors will represent the m constructs and the extra factor (not necessarily the (m + 1)th factor in the output) will model excess covariation. Identify positive and negative item loadings on that factor and use these to set starting values to +1 and −1, respectively.Footnote 15
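For instance, with m = 2 constructs and hypothetical variable names, the (m + 1)-factor solution with orthogonal rotation can be requested in Mplus as:

```
! Hypothetical sketch: extract m + 1 = 3 orthogonally rotated factors
VARIABLE:  NAMES = y1-y12;
ANALYSIS:  TYPE = EFA 3 3;
           ROTATION = VARIMAX;
```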

Note that a positive or negative starting value does not mean that the actual estimated loading is positive or negative. What it does indicate is that the researcher expects that the group of items that receive positive starting values likely has different method loadings (in direction or magnitude) compared to the group of items that receive negative starting values. Incorporating this information in the starting values improves the likelihood of convergence.

The application reported in this article falls within the second scenario. We analyzed the MVS scale with exploratory bi-factor analysis, extracting two factors, to determine the direction of the starting values for the method loadings. In all three countries, we observed different signs on the second factor for MVS1, 2, 3, 6, and 7 versus MVS5, 7, 8, and 9. To illustrate, Table 13 provides the bi-factor loadings for the UK. Hence, we gave a starting value of +1 to the first set of items and a starting value of −1 to the second set.Footnote 16 This was indeed necessary, as the RICFA model in which all starting values are set at +1 did not converge.

Table 13 Bi-factor analysis on MVS items in the UK


About this article


Cite this article

Steenkamp, J.-B. E. M., & Maydeu-Olivares, A. An updated paradigm for evaluating measurement invariance incorporating common method variance and its assessment. Journal of the Academy of Marketing Science, 49, 5–29 (2021). https://doi.org/10.1007/s11747-020-00745-z
