Visual Model Fit Estimation in Scatterplots and Distribution of Attention
Influence of Slope and Noise Level
Abstract
Scatterplots are ubiquitous data graphs and can be used to depict how well data fit a quantitative theory. We investigated which information viewers use when they estimate this fit. In Experiment 1 (N = 25), we tested the influence of slope and noise on the perceived fit between a linear model and data points. Additionally, eye tracking was used to analyze the deployment of attention. Visual fit estimation might mimic one of two statistical measures: If participants were influenced by noise only, their subjective judgment would resemble root mean square error. If slope was also relevant, their judgment would resemble variance explained. While the influence of noise on estimated fit was stronger, we also found an influence of slope. As most fixations fell into the center of the scatterplot, in Experiment 2 (N = 51) we tested whether the location of noise affects judgment. Indeed, high noise influenced the judgment of fit more strongly if it was located in the middle of the scatterplot. Visual fit estimates seem to be driven by the center of the scatterplot and to mimic variance explained.
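The contrast between the two statistical measures can be made concrete with a small simulation. The following is a minimal sketch, not the authors' stimulus-generation procedure: it assumes data of the form y = slope * x + Gaussian noise with x drawn uniformly from [0, 1], and the function name `fit_statistics` and all parameter values are hypothetical. It scores the generating line against the simulated points and shows that root mean square error tracks the noise level only, whereas variance explained changes with both slope and noise.

```python
import math
import random


def fit_statistics(slope, noise_sd, n=2000, seed=0):
    """Simulate y = slope * x + noise and score the generating line.

    Returns (rmse, r_squared). The data model and all defaults are
    illustrative assumptions, not taken from the experiments.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [slope * x + rng.gauss(0.0, noise_sd) for x in xs]

    # Residuals relative to the known generating line (no refitting).
    residuals = [y - slope * x for x, y in zip(xs, ys)]
    ss_res = sum(r * r for r in residuals)

    mean_y = sum(ys) / n
    ss_tot = sum((y - mean_y) ** 2 for y in ys)

    rmse = math.sqrt(ss_res / n)          # depends on the noise only
    r_squared = 1.0 - ss_res / ss_tot     # depends on slope and noise
    return rmse, r_squared


if __name__ == "__main__":
    for slope in (0.5, 2.0):
        for noise_sd in (0.1, 0.3):
            rmse, r2 = fit_statistics(slope, noise_sd)
            print(f"slope={slope:.1f} noise_sd={noise_sd:.1f} "
                  f"RMSE={rmse:.3f} R^2={r2:.3f}")
```

Under these assumptions, a judgment that ignores slope would behave like the RMSE column, while a judgment that rates steeper lines as better fitting at equal noise would behave like the R^2 column.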