Introduction

What makes a scene pleasant or unpleasant to look at? Surely a large part of the answer reflects not the scene itself but how it interacts with the perceiver in idiosyncratic ways, driven by both cultural conventions and personal history. For example, you may like an arrangement of objects in a scene because it reminds you of a particular setting from your childhood. But another part of the answer may have more to do with intrinsic properties of the scenes themselves, in a way that reflects more universal aspects of visual processing. Indeed, some such aesthetic properties may even be extracted automatically as a part of perception itself, such that almost every percept we have is more or less pleasing, to some degree. Here we focus on an especially robust aesthetic regularity of this type: the ‘inward bias’.

The inward bias in aesthetics and perception

When a framed image contains a figure (such as a face in profile), people tend to prefer compositions in which the figure faces inward, i.e., toward the center of the frame, with more space in front of the figure than behind it (as in Fig. 1a vs. 1b). This inward bias is extremely pervasive, and it occurs across wide variations in types of scenes, including both single objects (Palmer et al., 2008) and arrangements of multiple objects (Leyssen et al., 2012); both static images and dynamic movies (Bode et al., 2016); both horizontal frames (Palmer et al., 2008) and vertical (and circular) frames (Chen & Scholl, 2014; Sammartino & Palmer, 2012); and with both meaningless geometric shapes (Chen & Scholl, 2014; Guidi & Palmer, 2015) and meaningful images of people (Chen et al., 2018), animals (Bertamini et al., 2011), and even man-made artifacts (Palmer & Langlois, 2017). These effects occur with the real-world composition of elements in artificial frames in both photography (Gardner et al., 2008; Palmer et al., 2008) and painting (Bertamini et al., 2011), but they presumably also reflect how we choose to view scenes through the “frame” of our own field of view.

Fig. 1
Examples of the inward bias: In aesthetic preferences, observers prefer compositions in which objects in a frame are seen as facing inward (a) rather than outward (b). In perception, observers tend to see the interpretation of an ambiguous figure that makes it face inward (here the rabbit rather than the duck) (c)

One hint that the inward bias arises as a part of perception itself is that it may also influence other aspects of our visual experience. When a figure with an ambiguous orientation is placed off-center in a frame — for example, the duck/rabbit figure in Fig. 1c — people tend to perceive the figures as facing inward in both their initial and most dominant percepts (Chen & Scholl, 2014). All of these factors — the apparent universality, the wide generalization across contexts, and the deeper links with other themes from the study of vision — make the inward bias an especially appealing case study of ‘aesthetic perception’.

Centers versus borders

What underlying factors might explain the inward bias? In an operationalized sense, there seem to be only two categories of possible explanations — those that posit some special role for the center of the image, and those that posit some special role for the borders of the frame.

The ‘affordance-space’ account is an example of the former category: according to this view (Sammartino & Palmer, 2012; for a general discussion of the role of center in aesthetics, see Arnheim, 1974, 1988), people prefer compositions in which what is centered is not the object itself, but rather the affordance space that surrounds it — that space that “reflects the extent and/or importance of functions that take place in that region around the object” (Sammartino & Palmer, 2012, p. 876). In this view, the prioritized role of the center is often made explicit; for example, it is proposed that “people prefer pictures in which the affordance space, rather than the physical extent of the object, is centered within the frame” (Palmer & Langlois, 2017, p. 817, emphasis added). This affordance space is typically asymmetric, in that it extends farther from the functional side of the object (typically its front) than from its other side. For example, the affordance space of a chair extends further from its front than from its back, since the front is more functionally important. Critically, the affordance space itself is defined only in terms of the object(s), regardless of the frame — and so, for example, this view must predict that an object would be preferred in the same absolute location in a narrow frame or in a wide frame (as in Fig. 2), as long as those frames shared the same center point.

Fig. 2
An example of the manipulation of frame width as employed in the present study: do observers’ preferred placements of the silhouetted figure depend on how wide the frame is?

In contrast, the ‘looking-into-the-future’ account emphasizes the role of the frame’s borders, without any special role for the center. According to this view (Chen et al., 2018), people prefer compositions in which they would be able to witness predicted future events. In an important sense, this view suggests that the ‘inward bias’ is really a ‘not-outward bias,’ since the relevant aesthetic regularity is that people dislike perspectives on scenes that would result in such events taking place out of view. If a cat is likely to move in the direction that it is facing, for example, then it would stay in view in Fig. 1a, but would immediately leave our field of view in Fig. 1b. Critically, this looking-into-the-future account never references the frame’s center at all, but it depends very much on the absolute locations of the frame’s borders, since the predicted movements of a figure facing outward near the border of a narrow frame (as in Fig. 2a) would be more likely to be missed than would those of a figure in that same position within a wider frame (as in Fig. 2b).

The current study: Varying the frame width

Per the descriptions above, center-based accounts (such as the affordance-space view) and border-based accounts (such as the looking-into-the-future view) make contrasting predictions about what should happen when manipulating a simple property that to our knowledge has never before been systematically explored with the inward bias: the width of the frame itself. In particular, center-based accounts would predict no difference in preferred placements for profiles such as those in Fig. 2a and b (since the centers are both equated), while border-based accounts would predict different placements (since the borders are much nearer in Fig. 2a than in Fig. 2b). The looking-into-the-future account, in particular, would predict preferred placements in which there is roughly the same ratio of space in front of versus behind the profile for both of the frame widths depicted in Fig. 2 (perhaps reflecting the relative likelihood of the person moving forwards vs. backwards). [Footnote 1] Here we test these predictions directly.
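The contrast between the two classes of predictions can be made concrete with a small sketch. The function names, the 60-px offset, and the 70% front share below are hypothetical values chosen only for illustration; the frame and silhouette widths are those reported for Experiment 1. This is one simple way to formalize the two accounts, not a model taken from either theory.

```python
FIGURE_WIDTH_PX = 112  # silhouette width reported in the Method


def center_based_offset(object_offset_px: float, frame_width_px: float) -> float:
    """Center-based prediction: the preferred offset from the frame's center
    is a property of the object alone, so the frame width is ignored."""
    return object_offset_px


def constant_ratio_offset(front_share: float, frame_width_px: float,
                          figure_width_px: float = FIGURE_WIDTH_PX) -> float:
    """Border-based (constant-ratio) prediction.

    front_share is a hypothetical fraction of the free horizontal space
    that lies in front of the figure (0.5 would mean a centered figure).
    Returns how far the figure is shifted backward from the frame's
    center, in px, so that the requested share of space lies in front.
    """
    free_space = frame_width_px - figure_width_px
    space_in_front = front_share * free_space
    space_behind = free_space - space_in_front
    return (space_in_front - space_behind) / 2


for width in (400, 700):  # Narrow and Wide frames of Experiment 1
    print(width,
          center_based_offset(60.0, width),    # same offset for both widths
          constant_ratio_offset(0.7, width))   # offset grows with the width
```

Under the center-based function the predicted offset is identical for both frames, whereas under the constant-ratio function it grows in proportion to the free space in the frame.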

Experiment 1: Wide versus narrow frames

Observers each viewed a single framed face (in profile) on an empty background (as in Fig. 2), and they adjusted its position in the image in order to maximize the image’s aesthetic appeal (a task adapted from Palmer et al., 2008; see also Chen et al., 2018). Unbeknownst to the observers, half of them saw the face in a narrow frame, and the other half saw it in a wide frame — where these two frames shared the same central position on the display.

Method

Participants

200 observers (127 females, 72 males, one other) in the USA were recruited through Amazon Mechanical Turk (MTurk). This sample size was arbitrarily chosen before data collection began and was fixed to be identical across both experiments reported here. In practice, it gave us an average of 97.42% power to detect the effect sizes of the two key wide-versus-narrow-frame effects reported here. Each observer completed a single trial in a 1- to 3-min session in exchange for monetary compensation, with worker ID screening ensuring that individuals could not participate more than once. (For a discussion of this pool’s nature and reliability, see Crump et al., 2013. All observers had at least an 80% MTurk task approval rate and had previously completed at least 50 MTurk tasks.)

Apparatus

Observers completed the experiment via a custom webpage (written in a combination of HTML, JavaScript, CSS, and PHP), which could be loaded in a web browser on their own laptop or desktop computers. Because our experiment required a relatively large display and the use of a computer mouse to make responses, mobile devices such as phones and tablet computers were explicitly disallowed, with attempts to access the experiment from such a device immediately terminating the experiment. Since the website was rendered on observers’ own devices, viewing distance, screen size, and display resolutions could vary substantially, and it is impossible to report sizes in visual angles. We instead report stimulus dimensions using pixel (px) values.

Stimuli and procedure

Each observer viewed an image, centered in their browser window with a uniform gray background (HEX color #B4C0CA), that consisted of a black silhouette of a woman’s profile (112 px × 224 px) embedded in the center of a black rectangular frame with a white background (as depicted in Fig. 2). The woman was either Left-Facing or Right-Facing (implemented by simply mirroring the same figure), and the frame was either Narrow (400 px × 300 px) or Wide (700 px × 300 px). Each of these four images (two facing directions × two frame widths) was viewed by 50 unique observers.
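These dimensions determine how far the silhouette can move before touching a border. The sketch below assumes that the "largest possible displacement from the center" used later in the Results is half of the difference between the frame width and the silhouette width. That definition is an inference from the reported numbers (it reproduces the scaled means in Experiment 1), not something stated explicitly in the text, and the function names are hypothetical.

```python
SILHOUETTE_WIDTH_PX = 112  # width of the woman's profile


def max_displacement(frame_width_px: float,
                     figure_width_px: float = SILHOUETTE_WIDTH_PX) -> float:
    """Largest horizontal shift (px) from the frame's center that keeps
    the whole figure inside the frame (assumed definition)."""
    return (frame_width_px - figure_width_px) / 2


def scaled_displacement(displacement_px: float, frame_width_px: float,
                        figure_width_px: float = SILHOUETTE_WIDTH_PX) -> float:
    """Displacement expressed as a percentage of the largest possible one."""
    return 100 * displacement_px / max_displacement(frame_width_px, figure_width_px)


# Mean displacements reported for Experiment 1 (Narrow: 55.0 px, Wide: 109.4 px)
print(round(scaled_displacement(55.0, 400), 1))
print(round(scaled_displacement(109.4, 700), 1))
```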

Observers used their mouse cursor to adjust the horizontal position of the woman in the frame until they thought that “the whole composition looks the most visually pleasing,” and then they clicked a button to submit their responses.

Before the single experimental trial, observers completed an identical practice trial with a symmetrical cartoon “strawberry” image (100 px × 133 px) instead of the woman’s silhouette. After finishing the practice trial, observers were asked about the task to ensure that they understood it. (They answered “What is your task in this experiment?” by selecting one of the four following options: “To position the object in the frame in the way that you think is least visually pleasing”; “To position the object in the center of the frame”; “To position the object in the frame in the way that you think is most visually pleasing”; “Select this option if you do not remember the instructions”.) If the question was answered incorrectly, the instructions and the practice trial were repeated.

After submitting their response, observers were asked a series of debriefing questions, and some were replaced due to their responses to these questions and for other reasons — for indicating that they had not understood the task (n = 9); for taking more than two attempts to correctly answer the post-practice question (n = 14); for failing to provide an MTurk worker ID (n = 2); and for technical problems (n = 1). In total, 20 unique observers (some of whom triggered multiple criteria) were excluded and replaced without ever analyzing their data.

Results

Figure 3a depicts the individual placements (the colored lines) and their averages (the position of the silhouette), separately for the two frame widths and collapsing across the two facing directions (with the Left-Facing trials simply mirrored in these visualizations). (A two-way ANOVA confirmed that there was neither a main effect nor any interaction effect involving facing direction, all ps > .3. As such, all analyses reported below also collapsed across the two facing directions.) These same data are then depicted again in Fig. 3b, but now scaled to the same frame width. Inspection of these figures suggests three salient patterns: First, there was a robust inward bias in all conditions. (This is clear where the silhouettes are all to the left of their frames’ centers.) Second, the ratio of space in front of versus behind the silhouette was roughly the same across frame widths. (This is clear from Fig. 3a where the silhouette is further to the left in the Wide frame, compared to the Narrow frame. And it is clear from Fig. 3b where these silhouettes are roughly aligned.) Third, the responses as a whole seemed to split into two groups: one group simply centered the figure (perhaps because that was the easiest way to respond, given that the silhouette always began in the center) and the other group exhibited the typical inward bias. (This is clear from the salient thick stripes of colored lines at the center of each frame, and from the greater concentrations of lines on the left half vs. the right half of each frame.)

Fig. 3
Individual and average preferred placements of the profiled face from each condition and experiment (with the Left-Facing responses mirrored so that all data points are represented). Each colored line represents the preferred placement of an individual observer (such that thicker stripes represent more observers), and the silhouette itself is placed at the average preferred position for each frame width. (a) The results of Experiment 1, with the Wide and Narrow frames presented at the ratio of their actual widths across observers. (b) These same results, but now scaled to an equal frame width. (c) The results of Experiment 2, with the Wide and Narrow frames presented at the ratio of their actual widths across observers. (d) These same results, but now scaled to an equal frame width

These impressions were confirmed by the following analyses. First, four one-sample t-tests (one for each of the conditions, each comparing the mean displacements to 0) all showed a robust inward bias (all ts > 4.50, df = 49, ps < .001, ds > 0.65). Second, the absolute displacements (in px) were farther from the center in the Wide frame (M = 109.4, SD = 128.7) compared to the Narrow frame (M = 55.0, SD = 76.3; t(160.96) = 3.64, p < .001, d = 0.51). However, this difference disappeared (as in Fig. 3b) when the displacements were scaled as a percentage of the largest possible displacement from the center (Narrow: M = 38.2%, SD = 53.0%; Wide: M = 37.2%, SD = 43.8%; t(191.18) = 0.14, p = .886, d = 0.02). Finally, to quantify the observed two response clusters, we fit Gaussian mixture models to the data, assuming the responses were drawn from two normal distributions with the same standard deviation but different means, with one of the means fixed at 0. [Footnote 2] We depict the Right-Facing data in Fig. 4a, and scaled to the same width in Fig. 4b. (We do not depict the Left-Facing data, though it was quantitatively similar.) Inspection of Fig. 4 suggests the same pattern reported earlier: Even when considering only the off-center cluster, the silhouette was placed farther from the center in the Wide frame. (This is apparent in Fig. 4a where the leftmost peak is further to the left for the Wide frame than for the Narrow frame. And this same pattern is apparent in Fig. 4b where the two leftmost peaks are aligned.) This was confirmed by the model estimates (Narrow: Mleft = 104.9 [52.4%], Mright = 101.4 [50.7%], SD = 49.6, with 40.7% of the responses coming from the center distribution; Wide: Mleft = 206.2 [58.9%], Mright = 201.5 [57.6%], SD = 71.6, with 44.0% of the responses coming from the center distribution).
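The Wide-versus-Narrow comparison can be approximately reconstructed from the summary statistics alone. The fractional degrees of freedom suggest an unequal-variance (Welch) t-test, so the sketch below assumes that test, assumes 100 observers per frame width (two facing directions × 50 observers), and assumes that d was computed from the root mean square of the two standard deviations. The function name is hypothetical, and small discrepancies from the reported values are expected because the inputs are rounded.

```python
import math


def welch_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Welch's t, Welch-Satterthwaite df, and Cohen's d computed from
    group means, standard deviations, and sample sizes."""
    v1 = sd1 ** 2 / n1  # squared standard error of mean 1
    v2 = sd2 ** 2 / n2  # squared standard error of mean 2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Standardize by the root mean square of the two SDs (assumed convention)
    d = (m1 - m2) / math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return t, df, d


# Absolute displacements (px) in Experiment 1: Wide vs. Narrow
t, df, d = welch_from_summary(109.4, 128.7, 100, 55.0, 76.3, 100)
print(f"t({df:.2f}) = {t:.2f}, d = {d:.2f}")
```

Computing the p value would additionally require the t distribution, which is not part of the Python standard library and is therefore omitted here.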

Fig. 4
The same individual preferred placements from Fig. 3, but with the silhouettes replaced by the estimated density plots from the Gaussian Mixture model (as described in the main text)
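The mixture model described in the main text (two normal distributions with a shared standard deviation, one mean fixed at 0) can be fit with a few lines of expectation-maximization. The sketch below is one possible way to fit such a model and is not necessarily the procedure used for the reported estimates. All names are hypothetical, the simulated data are synthetic rather than the actual responses, and displacements are coded so that positive values mean an inward shift.

```python
import math
import random


def fit_center_offset_mixture(xs, n_iter=200):
    """EM for a two-component normal mixture with a shared SD, where the
    'center' component has its mean fixed at 0.
    Returns (weight of center component, off-center mean, shared SD).
    Assumes the data are not all identical and have a non-zero mean."""
    n = len(xs)
    w = 0.5                                  # initial center weight
    mu = sum(xs) / n                         # initial off-center mean
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)
    for _ in range(n_iter):
        # E-step: probability that each response came from the center
        # component. With a shared SD the normalizing constants cancel,
        # leaving a logistic function of the log-odds.
        resp = []
        for x in xs:
            log_odds = (math.log(w / (1 - w))
                        + ((x - mu) ** 2 - x ** 2) / (2 * sigma ** 2))
            log_odds = max(-700.0, min(700.0, log_odds))  # avoid overflow
            resp.append(1 / (1 + math.exp(-log_odds)))
        # M-step: update the weight, the free mean, and the shared SD
        w = min(max(sum(resp) / n, 1e-6), 1 - 1e-6)
        off_weight = max(sum(1 - r for r in resp), 1e-12)
        mu = sum((1 - r) * x for r, x in zip(resp, xs)) / off_weight
        sigma = math.sqrt(sum(r * x ** 2 + (1 - r) * (x - mu) ** 2
                              for r, x in zip(resp, xs)) / n)
    return w, mu, sigma


def simulate_responses(n_center, n_offset, offset_mean, sd, seed=1):
    """Synthetic displacements: a centered cluster plus an off-center one."""
    rng = random.Random(seed)
    return ([rng.gauss(0, sd) for _ in range(n_center)]
            + [rng.gauss(offset_mean, sd) for _ in range(n_offset)])


w, mu, sigma = fit_center_offset_mixture(simulate_responses(400, 600, 150, 20))
print(w, mu, sigma)
```

Fixing one mean at 0 captures the observers who simply left the silhouette where it started, so the free mean estimates the preferred placement of only those observers who showed an inward bias.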

Discussion

The empirical upshot of the results can be stated very simply: the frame widths mattered. And the theoretical upshot is then equally clear: these results are consistent with border-based accounts of the inward bias, such as the looking-into-the-future account, and they are inconsistent with center-based accounts, such as the affordance-space account.

Experiment 2: Even wider frames

A central part of the logic of this study is that the affordance-space view predicts no role for frame width when the object remains the same (since the affordance space itself is a function only of the object, and the resulting aesthetic preference is then to center that space in any bounding frame). However, this logic could break down for especially narrow frames — for example, if the centered affordance space would cause the figure to actually extend beyond one of the frame borders. (Or even if the figure did not technically overlap a border, perhaps there is an independent aversion to placing an object too near to it.) These do not strike us as especially plausible scenarios, given the results depicted in Fig. 3. First, if the mean position in the Narrow frame were to be equated to the mean position in the Wide frame, it would still be well within the frame boundaries. Second, even if those individual responses in the Wide frame that were beyond the border of the Narrow frame were excluded, the difference would still hold. Nevertheless, to ensure that neither of these possibilities could explain our results in practice, this second experiment replicated Experiment 1, but now using a new pair of Narrow and Wide frames, both of which were wider than their counterparts in Experiment 1 — and both of which obviated these concerns.

Method

This experiment was identical to Experiment 1, except as noted here. 200 new observers (108 females and 92 males) participated, with this sample size chosen to exactly match that of Experiment 1. We tested a Narrow frame with a width of 670 px and a Wide frame with a width of 1,000 px. [Footnote 3]

Forty-eight observers were excluded and replaced (some by triggering multiple criteria) — for indicating that they had not understood the task (n = 14); for taking more than two attempts to correctly answer the post-practice question (n = 18); for failing to provide an MTurk worker ID (n = 8); for technical problems (n = 2); for taking longer than 3 min to respond (n = 1); and for not having a large-enough browser window (n = 15).

Results

The individual placements and their averages are depicted (in the same manner as in Experiment 1) in Fig. 3c and d. (A two-way ANOVA again confirmed that there was neither a main effect nor any interaction effect involving facing direction, all ps > .6, and so again all analyses reported below collapsed across the two facing directions.) Inspection of these figures suggests three salient patterns: First, there was again a robust inward bias in all conditions. (This is clear in the figure where the silhouettes are again all to the left of their frames’ centers.) Second, the frame width again influenced the preferred placements — but unlike Experiment 1, the ratio of space in front of versus behind the silhouette does not appear the same across frame widths. (This is clear from Fig. 3: although the silhouette is again further to the left in the Wide frame compared to the Narrow frame in Fig. 3c — to a greater degree than in Experiment 1 — these silhouettes are not aligned in Fig. 3d.) Third, the responses as a whole again seemed to split into the same two center and inward-bias groups. (This is clear again from the salient thick stripes of colored lines at the center of each frame of Fig. 3c and d, and with the greater concentrations of lines on the left half vs. the right half of each of these frames.)

These impressions were confirmed by the following analyses. First, four one-sample t-tests (one for each of the conditions, each comparing the mean displacements to 0) all showed a robust inward bias (all ts > 2.85, df = 49, ps < .007, ds > 0.40). Second, the absolute displacements (in px) were farther from the center in the Wide frame (M = 193.7, SD = 189.4) compared to the Narrow frame (M = 68.0, SD = 144.3; t(184.98) = 5.28, p < .001, d = 0.75). In contrast to Experiment 1, however, this difference was still present when the displacements were scaled as a percentage of the largest possible displacement from the center (Narrow: M = 24.4%, SD = 51.7%; Wide: M = 43.6%, SD = 42.6%; t(191.07) = 2.87, p = .005, d = 0.41). Finally, to quantify the observed two response clusters, we fit Gaussian mixture models to the data as in Experiment 1. We again depict the Right-Facing data in Fig. 4c and d. Inspection of these figures suggests the same pattern reported earlier: Even when considering only the off-center cluster, the silhouette was still placed farther from the center in the Wide frame. (This is apparent in Fig. 4c where the leftmost peak — which is really the only one visible in this graph — is further to the left for the Wide frame than for the Narrow frame.) This was confirmed by the model estimates (Narrow: Mleft = 130.1 [38.8%], Mright = 117.0 [34.9%], SD = 127.9, with 41.9% of the responses coming from the center distribution; Wide: Mleft = 292.6 [58.5%], Mright = 260.8 [52.2%], SD = 136.1, with 28.8% of the responses coming from the center distribution).

Discussion

The key result from Experiment 1 was simply that the frame widths made a difference, and that result was just as true (if not more so) in this experiment with the wider frames. This rules out the possibility that the Wide/Narrow difference is due to some interaction in which centering the putative affordance space results in a preferred placement that is too close to, or even overlaps, one of the borders. And as such, these results are again consistent with a border-based account of the inward bias, but not with a center-based account (although the particular Wide/Narrow ratio difference observed here was neither the one most naturally predicted by the looking-into-the-future view, nor the one empirically observed in Experiment 1).

General discussion

Like many previous studies, the current project revealed a robust inward bias in aesthetic perception: when viewing a profiled face within a frame, observers preferred compositions in which the face was positioned off-center — so that it was effectively looking toward the center of the display, with more empty space in front of it than behind it. Yet the core conclusion of this study was that this intuitive way of describing the inward bias — as being due somehow to the center of the frame — is fundamentally incorrect, or at least importantly incomplete. This is because the magnitude of the inward bias depended not just on the position of the frame’s center, but on the extent of the frame itself — a seemingly simple variable that has not previously been systematically explored. In particular, wider frames led observers to prefer placements of the face further from the center. These results are inconsistent with any theoretical account of the inward bias — such as the affordance-space view — that references only the frame’s center, with no role for its borders. The center-based accounts, in other words, cannot hold.

Saving the center?

Of course, any point within a frame can always be described by reference to either the center or the borders, and in that sense a possible role for the display center will always be unfalsifiable. However, this truism does nothing to blunt the stark explanatory asymmetry revealed by our results. In particular, the data presented here can be explained by appeal to the frame borders alone, without any role for the center. But in contrast, these data cannot be explained by appeal to the center alone, without any role for the frame’s borders. For this reason, it would be insufficient simply to conclude from the current study that both the borders and the center play a role: the former clearly must, while the latter may not. And this stands in stark contrast to the “affordance space” view, which often explicitly references the fundamental role of the center (as in the quote above from Palmer & Langlois, 2017), but which to our knowledge has never mentioned any role for the frame borders, per se.

Looking into the future: Dynamic representations in static scenes?

The other option, then, is that the inward bias is at least in part — and perhaps primarily — due somehow to the frame’s borders. And the looking-into-the-future view explains why this might be so: viewers have a relative aesthetic aversion to placements in which the predicted future actions of an object are likely to take it beyond the frame — where that frame might be the borders of a painting or a photograph, or it might be the border of one’s field of view. This view readily accommodates the fact that the locations of the borders (beyond the center) matter, and it readily predicts that observers might prefer a ratio of distance-in-front to distance-behind that is either relatively stable (as in Experiment 2) or even constant (as in Experiment 1).

The looking-into-the-future perspective thus interprets the inward bias not as an isolated aesthetic curiosity, but rather as an adaptive feature that helps us to orient our visual perspectives in ways that maximize our chances for witnessing future events. In this way, the inward bias may constitute yet another example of how perception involves not just characterizing the way the world is now, but also predicting how the world may soon unfold. This perspective has previously been explored in contexts ranging from extrapolation in phenomena such as representational momentum (e.g., Freyd & Finke, 1984; for a review, see Hubbard, 2005) and the flash-lag effect (e.g., Nijhawan, 1994; for a review, see Hubbard, 2014) to the notion that visual representations even of static scenes involve inferences about how those scenes were dynamically produced, in terms of their causal history (Chen & Scholl, 2016). In this context, the current results suggest that even our aesthetic experiences of static scenes may intrinsically incorporate dynamic predictions of future events.

Open Practices Statement

The Online Supplementary Material for this paper contains the materials and raw data for each experiment, and can be found at https://doi.org/10.17605/OSF.IO/27VNP.

Author Note

BJS was supported by ONR MURI #N00014-16-1-2007.