Introduction

Learning and attention are linked: as humans learn they change how they attend to available information, becoming more efficient in accessing information relevant for completing tasks (Chen, Meier, Blair, Watson, & Wood, 2013; Henderson, Williams, & Falk, 2005; McColeman et al., 2014; Rehder & Hoffman, 2005). For example, novice readers double-fixate on words, reread sections, and skip important words more frequently than experts (Rayner, 1998). Researchers have documented skill-related changes in basic oculomotor measures such as fixation duration, saccade length, and number of fixations. Important skill-related differences are observed in oculomotor behaviors across a diverse set of experts, from artists to athletes to radiologists (Brunyé et al., 2014; Holt & Beilock, 2006; Rayner, 1998).

Shifting gaze is not the only way to acquire information: for example, when reading a book, a page turn is required to reveal new information (Wolberg & Schipper, 2010). Visual information is also accessed through digital interfaces, such as scrolling through websites, or clicking on a drop-down menu. Beyond oculomotor activity, the manipulation of the user interface becomes the means through which information access occurs (Byrne, Anderson, Douglass, & Matessa, 1999). In these cases, the observer samples the environment to extract relevant information from the interface.

The theoretical question that drives our research is: to what extent does information sampling via digital interfaces undergo the same learning-related changes as visual information sampling via gaze? Should we be treating all information sampling, whether through eye movements or through manual manipulation of a computer interface, as a unified phenomenon? Gottlieb (2018) considers information access as part of an active sensing framework: information is gathered to support higher-level decisions. In active sensing, information is gathered to minimize uncertainty and maximize reward, and though the oculomotor system is one active sensing tool, any action that samples the environment might be considered part of that system.

Attentional differences between novices and experts have been demonstrated in several dynamic, real-world tasks. Similar patterns of change with increased expertise exist on many of the standard measures – experts demonstrate greater saccade amplitude (Charness, Reingold, Pomplun, & Stampe, 2001; Gegenfurtner, Lehtinen, & Säljö, 2011), more fixations to relevant areas (Brunyé et al., 2014; Bertrand & Thullier, 2009; Charness et al., 2001; Gegenfurtner et al., 2011), and shorter fixations (Gegenfurtner et al., 2011) across a range of domains such as sports (Bertrand & Thullier, 2009; Naito, Kato, & Fukuda, 2004), driving (Borowsky, Shinar, & Oron-Gilad, 2010; Gómez-Valadés, Luis, Reina, Sabido, & Moreno, 2013), flying (Ziv, 2016), and professional image-viewing (Brunyé et al., 2014; Gegenfurtner et al., 2011). These parallels suggest that the patterns of eye movement-mediated attention allocation found in simple laboratory tasks apply to the less controlled activities performed in everyday life.

To date, work pertaining to active sensing is predominately about eye movements (e.g., Glaholt & Reingold, 2011; Gottlieb, Hayhoe, Hikosaka, & Rangel, 2014; Orquin & Loose, 2013). However, in controlled laboratory environments, some researchers have investigated eye-based and mouse-based information access in concert. In learning studies where task-relevant features were accessible only by moving the mouse to uncover a mask (Meier & Blair, 2012; Wood, Fry, & Blair, 2010), participants’ mouse movements paralleled eye movements in similar tasks (McColeman et al., 2014). Therefore, eye and mouse movements share qualitative properties. Beyond these studies, we find little work to tie findings from laboratory studies of learning and attention to complex naturalistic tasks, or to map findings from oculomotor studies to studies using computer interfaces. The present work seeks to understand information access in complex tasks that rely on a digital interface to mediate attention.

The present study investigates learning-related changes in mouse/keyboard-mediated information access by studying the visual sampling patterns of players at different skill levels in the real-time strategy video game StarCraft 2. We predict that StarCraft 2 players exhibit skill-related changes in information access behaviors using the mouse and keyboard that parallel known skill-driven changes found in oculomotor-based information access patterns: active sensing extends beyond the eye.

StarCraft 2 is a strategy game in which players manage a military campaign against opponents. There are three important factors of StarCraft 2 that make it suitable for the current research. First, players use the interface to move their display to different portions of the environment (see Fig. 1). Players view one portion of the environment, perform actions on the units/structures at that location, and shift their view elsewhere. This pattern bears a strong resemblance to eye-gaze fixations, and they reflect the spatial and temporal sampling of information (Hayhoe, Shrivastava, Mruczek, & Pelz, 2003). Second, actions that a player makes in the game, including keyboard commands and mouse movements that shift their view to new sources of information, are stored in replay files, and are available to researchers. These data make it possible to calculate the information access measures of interest with a high degree of certainty – allowing us to investigate performance of a naturalistic task non-invasively, but with the precision usually found only in the laboratory. Finally, StarCraft 2 is a complex game and a domain of genuine expertise. As in traditional sports, there are full-time professional players who compete for substantial prize pools (e.g., US$700,000 for the 2018 StarCraft 2 World Championship Series Global Tournament; 2018 WCS Global Finals, n.d.). These three factors: vision-like mouse and keyboard mediated information access, detailed and non-invasive measurements, and significant task complexity, make StarCraft 2 an excellent domain to explore learning-related attentional change.

Fig. 1.

Two typical screens in StarCraft 2: (top) front line battle and (bottom) production. (Top) The small red rectangle shows a map (called the mini-map) of the total playing area while the large red rectangle is the main display, which provides a detailed interactive view of the selected area (small white polygon) of the mini map. The player can interact with allied objects (pink, can also be off-screen objects referenced by hotkeys) to give specific action orders, including, but not limited to, building, training, attack, and movement, on the interface area at the bottom (pink). Players use troops (white) to eliminate each other by destroying all enemy structures. (Bottom) Players gather resources (green) using workers (blue) to build structures (yellow) and train troops (white)

The first goal of this work is to uncover whether the way learning influences information access in the computer-mediated domain of StarCraft 2 is consistent with research focused on eye tracking. We consider three common measures: fixation duration, fixation count, and fixation amplitude. We predict patterns similar to existing work using eye tracking. Learning is associated with an increase in the efficiency of information access. In experimental studies of category learning, the number of fixations decreases, as does overall fixation duration (e.g., Chen et al., 2013; McColeman et al., 2014). Expert-to-novice comparisons in a variety of domains, such as radiology and various sports, reveal similar results (for review, see Gegenfurtner et al., 2011). While the overall number of fixations was negatively correlated with expertise (Bertrand & Thullier, 2009; Liu, Gale, & Song, 2007; Piras, Pierantozzi, & Squatrito, 2014), experts make more fixations to relevant stimuli, and fewer to irrelevant stimuli (Cooper, Gale, Darker, Toms, & Saada, 2009; North, Williams, Hodges, Ward, & Ericsson, 2009). Fixation durations to task-irrelevant features commonly decrease with expertise as well (Bellenkes, Wickens, & Kramer, 1997; Cooper et al., 2009; Kato & Fukuda, 2002). Experts in tool-based sports, such as tennis and baseball, show an overall decrease in the number of fixations they make, suggesting that experts are able to extract more information from fewer fixations (Mann, Williams, Ward, & Janelle, 2007).

The second goal is to investigate types of efficiency that are not directly related to eye-movement measures, but that are nonetheless relevant for understanding expert information access. We define “efficiency” as the best use of available resources to complete a task. In StarCraft 2, it is the best use of the display, environment, and body to gather task-relevant information where necessary. Efficiency in StarCraft 2 is mediated by built-in tools, such as hotkeys (player-assigned shortcuts to access buildings or units) and the mini-map (see Fig. 1). Experts are better at gathering information and are more efficient at performing tasks to improve their performance: they transition from information-guided actions to automatic actions that no longer require visual information to execute. This occurs in many domains; experts become so adept at their task that they can perform without watching what they are doing. Expert golfers do not fixate on the ball whilst putting (Naito et al., 2004), despite the ball’s task relevance. Expert efficiency includes cases where experts do not even need to sample required information in the first place, relying instead on peripheral processing (Kato & Fukuda, 2002; Piras et al., 2014), non-visual motor feedback (Crump & Logan, 2010), and prior knowledge (Shinoda, Hayhoe, & Shrivastava, 2001).

Methods

Participants and game records

Participants (mean age 22 years; SD = 4 years) had approximately 545 h of StarCraft 2 experience, and were mostly from the USA and Canada (Thompson, Blair, Henry, & Chen, 2013; see that paper for details of exclusion criteria).

Players were recruited from online communities to fill out a survey and submit a StarCraft 2 game replay file. A replay is a computer file from which a time-stamped record of all player actions can be recovered. Players watch their own replays and professional games to improve their gameplay. We use the replay file to uncover if, and how, players of different skills access information differently. Of the over 7,000 people who filled out the survey, 3,317 useable replays were obtained. Replays were filtered for ranked games involving only one player against a single human opponent to ensure comparability. Submissions were disqualified if they could not be parsed, replay files were not submitted, or the player did not complete the survey. Respondents’ within-game identifications were used to check that each player was included only once. Five players each submitted two games; these ten games were excluded from analysis, leaving 3,307 analyzable game samples. From the seven possible leagues (i.e., skill level 1–7), where one league increase from "1" (bronze) indicates an ordinal improvement from novice players, data was obtained from 167, 347, 548, 803, 800, 610, and 32 individuals from each league, respectively. The leagues are treated categorically and we conducted primarily nonparametric analyses to test omnibus hypotheses throughout this paper to alleviate concern about unequal sample sizes. To parse the data, we took the submissions and created a time-coded list of player-initiated actions (mouse movements, mouse buttons, and keyboard events) using SC2Gears (Belicza, 2010; for additional details see Thompson et al., 2013) resulting in a list of more than 1.5 million screen pauses, many including multiple training actions, building structures, and attack commands. 
A screen pause during which the player performs at least one action is called a perception action cycle (PAC); a screen pause with no action, which amounts to a simple check of one part of the environment, is called a screen fixation.

The resolution of our latency measurements is determined, in part, by network connectivity in the game. In order to allow players to compete online seamlessly, the game engine synchronizes performance roughly every 45–90 ms. Within the window between synchronizations, the order of actions produced by a player is preserved, but the timestamps associated with these actions will be tagged as equal. This means that our latency measurements are sampled ~22 Hz at most. Importantly, our variables are based on aggregates of this raw data. Depending on the measure, these aggregations will be based on hundreds or thousands of actions. Sampling rate will therefore not be visible in latency measures aggregated using means. However, when we aggregate latency information with medians on account of skewed data, the sampling rate will become visible as a band of clustered performance timings (see Fig. 3 for an example of such a clustered sample; for further discussion of sampling rate in replay files see Thompson, 2018).
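The banding described above can be illustrated with a minimal sketch. It assumes, for illustration only, that every timestamp is snapped to the start of a 45 ms synchronization window (the lower bound quoted above); the latencies are hypothetical values, not replay data, and `quantize` is a name introduced here.

```python
import statistics

SYNC_MS = 45  # assumed synchronization window (lower bound quoted in the text)

def quantize(latency_ms, step=SYNC_MS):
    """Snap a latency to the start of its synchronization window (an assumption)."""
    return (latency_ms // step) * step

# Hypothetical raw latencies in ms for two players; not real replay data.
player_a = [212, 230, 251, 275, 298]
player_b = [236, 259, 266, 320, 330]

recorded_a = [quantize(x) for x in player_a]  # [180, 225, 225, 270, 270]
recorded_b = [quantize(x) for x in player_b]  # [225, 225, 225, 315, 315]

# With an odd number of observations the median is one of the recorded values,
# so it always lands on a multiple of the window: players cluster into bands.
median_a = statistics.median(recorded_a)
median_b = statistics.median(recorded_b)

# The mean averages over the grid and usually falls between the bands.
mean_a = statistics.mean(recorded_a)
mean_b = statistics.mean(recorded_b)
```

In this toy case both players share a median of 225 ms even though their means differ (234 ms and 261 ms), which is the clustering pattern expected for median-based measures.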

Measures of information access in StarCraft 2

Our primary research questions focus on the extent to which information access in StarCraft 2, with its digital interface, is influenced by learning in ways that are similar to how eye movements are influenced by learning. We address these questions with two categories of measures. First, we use measures such as fixation duration and fixation amplitude. Although some consideration must be given to the best way to characterize these in a digital interface, these measures are broadly analogous to measures of eye movements. Second, we use measures of how players use the digital interface in StarCraft 2, such as the use of hotkeys to select units. These measures are less directly relatable to eye movements, but, nonetheless, seem relevant for a complete understanding of how information access changes with learning.

Screen fixations

Screen moves, an important component for understanding information access in StarCraft 2, were processed as follows: for any given moment during a game, the replay data includes co-ordinate information about where a player’s screen is relative to the larger game environment. Screen fixations in StarCraft 2 share the same general properties as gaze fixations do in an eye-tracking study. Screen view location data were subjected to a modified version of the Salvucci-Goldberg identity dispersion algorithm (Salvucci & Goldberg, 2000), which was originally used to identify eye fixations. In this application, the algorithm identifies screen fixations by flagging screen locations that are sufficiently close together in space (six map co-ordinate units) for a sufficiently long time (20 game time stamps, or 226 ms; see Thompson et al., 2013, for details). Using this method, screen moves, which access information from new portions of the game environment and afford new interactions with game units at those locations, are grouped into fixations. Similar to eye fixations, screen fixations target a specific location for a specific duration.
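As a rough illustration of dispersion-based fixation identification, the following sketch implements the standard dispersion-threshold procedure described by Salvucci and Goldberg (2000), not the modified version used in this study, whose details are given in Thompson et al. (2013). The function names, the `(timestamp, x, y)` sample format, and the example data are assumptions; the thresholds are the six map units and 20 game time stamps quoted above.

```python
def dispersion(points):
    """Spread of a window: (max x - min x) + (max y - min y)."""
    xs = [p[1] for p in points]
    ys = [p[2] for p in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def screen_fixations(samples, max_dispersion=6.0, min_duration=20):
    """Group (timestamp, x, y) samples, sorted by time, into fixations.

    Returns a list of (start, end, centroid_x, centroid_y) tuples.
    """
    fixations = []
    i, n = 0, len(samples)
    while i < n:
        # Grow the initial window until it spans the minimum duration.
        j = i
        while j < n and samples[j][0] - samples[i][0] < min_duration:
            j += 1
        if j >= n:
            break  # remaining samples are too short to form a fixation
        if dispersion(samples[i:j + 1]) <= max_dispersion:
            # Extend the window while the points stay close together.
            while j + 1 < n and dispersion(samples[i:j + 2]) <= max_dispersion:
                j += 1
            window = samples[i:j + 1]
            cx = sum(p[1] for p in window) / len(window)
            cy = sum(p[2] for p in window) / len(window)
            fixations.append((window[0][0], window[-1][0], cx, cy))
            i = j + 1
        else:
            i += 1  # slide the window forward by one sample
    return fixations

# Hypothetical screen positions: two stable views followed by a jump.
samples = [(0, 10, 10), (8, 10, 11), (16, 11, 10), (24, 10, 10),
           (32, 50, 50), (40, 50, 51), (48, 51, 50), (56, 50, 50),
           (64, 90, 90)]
found = screen_fixations(samples)
```

On this toy input the sketch reports two fixations, one spanning time stamps 0 to 24 and one spanning 32 to 56; the final sample is discarded because it does not last long enough.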

The anatomy of screen fixations

In laboratory studies of category learning, for example, there are typically a number of fixations that occur before a category response is given. This means that fixation durations reflect only the time participants are accessing the information. In StarCraft 2 things are slightly different: When the screen is moved, players typically make one or more actions using their mouse or keyboard before shifting their screen elsewhere. Players are often shifting their screen, not only to look but also to interact with units at various locations. In order to disambiguate fixations where no action is taken from fixations containing actions, we distinguish screen fixations from perception action cycles (PACs). A PAC begins with a screen movement, just like a fixation, but differs in that at least one action is performed before the screen is moved again (Fig. 2).

Fig. 2

The basic components of a Perception Action Cycle (PAC). The PAC begins and ends with a screen move. During the time in which the screen is stable, the player may perform actions. The time between the initial screen move and the first action is the First Action Latency. The average time between the subsequent actions is the Between Action Latency

The PAC is further broken down for the analyses throughout this paper (Fig. 2). We call the delay between the screen movement and the first action in the PAC the First Action Latency (FAL). FAL has been used as an analogue of reaction time (Thompson, Blair, & Henrey, 2014). It differs from traditional reaction time measures in that the properties of the stimulus may be anticipated by the observer before the stimulus is shown. If there is more than one action in a PAC, the latencies between the subsequent actions provide a Between Action Latency (BAL). The purpose of this measure is to distinguish the effect of new information (expected to exert more influence on FAL) from simple motor execution speed (influencing BAL). FAL reflects the duration of a more complex behavior, which requires adapting to a new visual stimulus and initiating action, while BAL often reflects the motor processes associated with continuing a pre-planned action sequence. BAL is derived from the mean difference between the action latencies in a PAC. A player’s BAL reflects the mean of these inter-action intervals (BAL1..n, Fig. 2). Prior work has shown that BAL is faster than FAL (Thompson, McColeman, Stepanova, & Blair, 2017), but BAL has yet to be examined across skill levels. To more succinctly examine the difference between FAL and BAL, we use New View Cost (NVC). This is obtained by subtracting average BAL from FAL. While there is no guarantee that NVC measures the duration of a particular cognitive process (e.g., perceptual processing time), we study NVC to get a sense of whether the skill-related latency reductions in FALs can outstrip the changes in BAL. By looking at these PAC components in isolation, we can gather a better understanding of the influence of expertise on a player’s ability to effectively act on information viewed in the new visual environment.
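The three latency measures can be made concrete with a small sketch for a single PAC. The function name and the example timestamps are hypothetical, and times are taken to be in milliseconds already; how values are aggregated across PACs and players is not shown here.

```python
def pac_latencies(screen_move_ms, action_times_ms):
    """Return (FAL, BAL, NVC) in ms for one PAC.

    BAL and NVC are None when the PAC contains a single action,
    because there is no between-action interval to average.
    """
    if not action_times_ms:
        raise ValueError("a PAC contains at least one action")
    # FAL: delay between the screen move and the first action.
    fal = action_times_ms[0] - screen_move_ms
    # Intervals between each pair of consecutive actions.
    gaps = [b - a for a, b in zip(action_times_ms, action_times_ms[1:])]
    if not gaps:
        return fal, None, None
    # BAL: mean of the between-action intervals.
    bal = sum(gaps) / len(gaps)
    # NVC: FAL minus average BAL.
    return fal, bal, fal - bal

# Hypothetical PAC: screen move at 0 ms, actions at 600, 900, 1100, 1500 ms.
fal, bal, nvc = pac_latencies(0, [600, 900, 1100, 1500])
```

For this hypothetical PAC the first action arrives 600 ms after the screen move, the later actions are 300 ms apart on average, and the new view therefore costs an extra 300 ms.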

Screen fixation rate

In laboratory studies of learning, a common measure is the number of fixations per trial, which tends to decrease with learning. Outside the lab environment, however, experiences are seldom broken up into comparable chunks; people do not read, drive, or play StarCraft 2 in trials. In order to compare lab-based tasks and dynamic real-world tasks, we re-examined fixation count in the eye-tracking data as a function of time, rather than trials. While earlier work reported the number of fixations per trial over the course of learning, here we report the number of fixations per minute (this fixation rate measure from the eye-tracking data is unpublished from Meier & Blair, 2012; available publicly at http://summit.sfu.ca/item/12715). The resulting fixations per minute measure is a better parallel to our StarCraft 2 fixations per minute measure. In category learning, a typical trial begins with stimulus presentation, then a response and then the presentation of feedback. Rather than trials, we looked at fixations for the first 2 min of stimulus presentation versus the last 2 min of stimulus presentation during a category-learning experiment, to provide a fairer comparison between eye tracking and StarCraft 2.

Screen movement amplitude

Screen movement amplitude can indicate how information is gathered from the environment. It is measured as the distance between consecutive fixations. To calculate this, we take the X, Y co-ordinates for the fixations within each game (screen position, Fig. 2) and calculate the mean distance between consecutive fixations. These distances are normalized against the distance between the two players’ initial positions to control for potential differences in map size.
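A minimal sketch of this calculation follows. It assumes Euclidean distance in map co-ordinates, which the text does not specify, and uses hypothetical positions; `normalized_amplitude` is a name introduced here for illustration.

```python
import math

def normalized_amplitude(fixations, start_a, start_b):
    """Mean distance between consecutive fixations, divided by the
    distance between the two players' starting positions."""
    steps = [math.dist(p, q) for p, q in zip(fixations, fixations[1:])]
    if not steps:
        raise ValueError("need at least two fixations")
    baseline = math.dist(start_a, start_b)
    return (sum(steps) / len(steps)) / baseline

# Hypothetical game: three fixations, starting positions 110 units apart.
amplitude = normalized_amplitude([(0, 0), (3, 4), (3, 10)], (0, 0), (0, 110))
```

Here the two screen moves cover 5 and 6 map units, so the mean move of 5.5 units is 5% of the distance between the starting positions.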

Hotkey use

One of the main findings in information access is that the more people learn, the less time they spend fixating irrelevant information. In the context of a digital user interface, we might expect that experts will not shift their screen unnecessarily. There are several tools in the StarCraft 2 interface that allow players to complete tasks without shifting the screen to particular locations. The first method is to use hotkeys. A hotkey is a shortcut used in StarCraft 2 to group a set of individual units or make a single unit/building easier to access, much like using CTRL + C to copy something to a PC clipboard. Buildings and units are linked to the ten available hotkeys by the player, and these assignments can be updated throughout the game. Hotkeys are efficient because they eliminate the need to visually search for and select a unit or building. If the hotkeyed unit is not on-screen at the time, selecting with a hotkey allows the player to potentially avoid shifting the screen at all (see Supplementary Materials S67 for additional details).

Offscreen production

Another method of assessing whether participants are making fewer unneeded fixations is to look at offscreen production. Offscreen production is defined as any unit training that occurs without the source building in view (see Supplementary Materials, Data Administration S11 for additional details). In this situation the participant has avoided using the visual interface to complete the task, and instead relied on a combination of keyboard commands.

Mini-map use

The mini-map shows a small, top down view of the game world (see Fig. 1, small red square). By using the mini-map to issue commands, players avoid the need to move their main view screen to the desired location, thus saving themselves an unnecessary fixation. We use three measures to assess the extent to which players use the mini-map. The first is the use of special abilities, the second is attacks issued using the mini-map, and the third is right clicks – used to deploy a unit or group to the clicked location (see Supplementary Materials 10 for additional details).

General analysis strategy

Our primary goal is to discover whether information sampling via digital interfaces undergoes the same learning-related changes as gaze-based sampling, consistent with a general information sampling framework (Gottlieb, 2018). If digital interfaces produce either no learning-related changes or changes in the opposite direction from what we find in eye-tracking studies, we would interpret that as evidence against such a framework. So, our primary concern is determining whether measures of information sampling increase or decrease in more skilled groups. Because learning-related changes tend to be fairly overt, particularly so in the present case where skill groups differ by roughly 100 h of practice or more, we expect to confirm that changes are in the appropriate direction by inspection of graphs of the data. We employ non-parametric tests of the omnibus hypotheses throughout this paper to confirm that any learning-related changes are unlikely due to chance. These tests are robust to sample-size differences as well as floor or ceiling effects that are observed in some of our later measures.

We opted to use an analysis strategy that can be applied to all measures in the paper. While multilevel modeling may be desirable to describe learning trajectories, our measures are not all nested in the same manner; in the interest of maintaining structural consistency for the measures’ analyses, we use the nonparametric tests throughout the main text, with the same factors and the same reporting. We observe main effects in nearly all of our measures.

While precise changes across specific skill levels are not of primary theoretical concern for the present work, some researchers may find such analyses useful; we include pair-wise comparisons of all skill levels in the Supplementary Materials.
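For readers unfamiliar with rank-based omnibus tests, the sketch below computes the Kruskal-Wallis H statistic, the test named in the Results, in plain Python with the usual tie correction. The effect size shown, (H - k + 1) / (n - k), is one common eta-squared estimate for this test; that it matches the effect sizes reported in this paper is an assumption, not something the text states. The groups are hypothetical toy data, and the p-value (H referred to a chi-square distribution with k - 1 degrees of freedom) is omitted.

```python
from itertools import chain

def average_ranks(values):
    """Rank values 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Tie-corrected H for a list of groups (assumes not all values are equal)."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    return h / correction

def eta_squared_h(h, k, n):
    """Assumed effect-size formula: (H - k + 1) / (n - k)."""
    return (h - k + 1) / (n - k)

# Hypothetical toy data for three skill groups (not from the study).
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = kruskal_wallis_h(groups)
eta = eta_squared_h(h, k=len(groups), n=sum(len(g) for g in groups))
```

Because the test operates on ranks rather than raw values, it makes no assumption of normality within each league.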

Results

We first report measures of StarCraft 2 information access that are most directly analogous to existing eye-movement measures. These include: screen fixation durations (those that do or do not contain actions within), fixation rate, and fixation amplitudes. Next, we address measures of how players use the digital interface to further optimize their access to information. These include hotkey use, off-screen production, and mini-map use.

Measures analogous to eye tracking

Fixation durations

The allocation of attention in StarCraft 2 is measured by screen fixations. We find that screen fixations are generally shorter than PACs, attributable to the fact that actions take time to perform. Fixations without actions are rare, and account for very little of the total game time (Thompson et al., 2017) because it is more useful in StarCraft 2 to do something than to do nothing. Figure 3A shows these data; the mean of players’ median fixation durations decreases by league, starting at 511 ms for league 1 and ending at 451 ms for league 7. A non-parametric Kruskal-Wallis test indicates that these differences in fixation durations by league are statistically significant (χ2 = 15.637, df = 6, p = 0.016, ηp2 = 0.003). Tukey’s Honestly Significant Difference (HSD) tests comparing each pair of leagues are reported for all measures in the Supplementary Materials.

Fig. 3

Duration of fixations, and Perception Action Cycles (PACs) (fixations containing actions). (A) Median duration of fixations wherein players did not perform an action by league. The shaded region shows the density, and each circle shows one game’s median fixation duration. An increase in one league is an increase in skill level. We observe that fixation durations only noticeably decrease when comparing novice players in league 1 to much better players in league 6. This shows that attentional allocation not leading to an action does not become more efficient. These appear discrete because, to ensure accurate play online, the game synchronizes information between the two players roughly every 45 ms (see the Methods section for additional information). This produces prominent bands in A. (B) Median duration of fixations that included actions across the seven leagues. PACs are faster for better players

Perception action cycle durations

With PACs, we find a trend similar to fixation durations. Figure 3B shows these data; the pattern of decreasing median durations with skill is clearly visible. League 1 median durations are 2,598 ms (mean = 2,772 ms) and league 7 median durations are 1,265 ms (mean = 1,307 ms). Again, statistical tests confirm that changes with league are significant (χ2 = 1018.4, df = 6, p < .001, ηp2 = 0.307). Note that the scale of the measures differs: PACs are longer than simple fixations, but both appear to shorten as skill increases. Whether an action occurs after the screen pauses or not, better players keep their screens focused in one place for less time.

Actions within perception action cycles

Extracting information from the environment is crucial in deciding how one acts upon it. FAL (first action latency: the delay before the first action following a screen movement), BAL (between action latency: average latencies between later actions in the PAC), and NVC (new view cost: FAL minus BAL) address these information access and action-taking behaviors. The duration difference between the time-stamp attributed to the screen move and the first action in a PAC is the FAL in game time units. The game units are then converted to real time (milliseconds) for easier interpretation. Figure 4A shows first action data; there is a visible decrease in these latencies as skill increases. The mean FAL time is 1,061 ms for league 1 and decreases to 430 ms in league 7 of our sample. A Kruskal-Wallis test confirms the significance of these skill-related changes (χ2 = 1483.9, df = 6, p < .001, ηp2 = 0.448). Actions do not instantaneously occur with a perceptual change (otherwise FALs would be 0), but the latency decreases with increasing expertise (Fig. 4). Earlier reports find first actions are faster for younger players (Thompson et al., 2014) and that the FAL is a better predictor of skill for earlier leagues (Thompson et al., 2013) (see also Thompson et al., 2017, for additional FAL analysis). One interpretation for rapid FALs in experts is that better players require less time to gather information from a new perceptual environment to make a decision. There is a possibility, however, that the decreasing FAL with league is a consequence of better players exhibiting rapid motor execution.

Fig. 4.

Durations in milliseconds of various components within perception action cycles. (A) First Action Latency (FAL), or the delay between a screen move and the first action, is observed to decrease as a matter of league advancement. (B) Between Action Latency (BAL), or the time difference between actions in a PAC becomes shorter as league increases. (C) New View Cost (NVC) (defined as the difference between the average BAL and the FAL) tracks whether FAL changes over and above the improvements to motor execution reflected in the cognitively simpler actions reflected in BAL. Banding appears prominent in (A) due to resolution issues discussed in Fig. 3. Bands are less prominent in BAL (B) and NVC (C), as means are involved in their calculation

We also analyze BAL, which reflects the speed of motor execution after the first action. Earlier work finds that the BAL is faster than the FAL overall (Thompson et al., 2017). Figure 4B shows these data; it appears that better players are faster overall, in that their actions within the PAC are executed more quickly. Mean BAL decreases with league, starting at 733 ms at league 1 and ending at 290 ms in league 7. A Kruskal-Wallis test confirms that league is a significant factor (χ2 = 1251.6, df = 6, p < .001, ηp2 = 0.382).

Both BAL and FAL are faster at higher levels of skill. An important question, therefore, is whether better players are also faster decision makers and perceivers. To begin to probe this question we use New View Cost (NVC). Since the first action in a PAC requires both a motor action and the processing of a change in visual information, it is expected to take longer than the later actions made at the same screen position. The motivating concept for reporting NVC is that screen movements are cognitively and perceptually expensive. By subtracting mean BAL from FAL, we obtain NVC – the additional latency attributable to a shift in visual stimuli. An NVC of 0 means the latency of the action following the screen move would be indistinguishable from later actions within the fixation. Figure 4C shows that NVC tends to be positive, which is expected, given the additional cognitive demands imposed by shifting one’s screen, perceiving the new environment, and deciding to act. The differences in NVC seem visible only in later leagues; nevertheless, a Kruskal-Wallis test confirms that the effect of league is significant (χ2 = 326.83, df = 6, p < .001) but weak (ηp2 = 0.097).

That the NVC only appears to decrease in the higher leagues is contrary to typical learning curves where improvement is most noticeable early on. In order to probe this atypical skill development pattern, we compared league 1 players with those of other leagues in a set of pairwise comparisons with a family-wise type I error rate of 0.05 (which, after Bonferroni correction, corresponds to a per-test alpha of 0.008). There are no discernible differences between players of the lower leagues: not between league 1 and league 2 players (T(233) = -0.230, p = 0.818, D = .0252), league 1 and league 3 players (T(208) = 0.028, p = 0.977, D = .003), or league 1 and league 4 players (T(184) = 1.515, p = 0.134, D = 0.195). For better players, however, there are distinct differences in NVC; the mean NVC for league 1 was 274 ms while the mean NVC for league 7 was 136 ms. Differences from league 1 are observable from league 5 onward, where NVC is lower than for league 1 players (T(178) = 3.194, p = 0.002, D = .453). Comparing league 1 and league 6 shows a lower NVC for the higher league (T(179) = 5.160, p < 0.001, D = 0.714), as does a comparison between league 1 and 7 (T(192) = 7.384, p < 0.001, D = .700). A likely explanation for these data is that FALs continue to show learning effects even after BALs reach floor. The changes in NVC may reflect a kind of perceptual expertise that only emerges once players have become extremely efficient in their attentional allocation.
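The Bonferroni arithmetic behind the per-test alpha can be checked with a short sketch. The p-values are those reported in the paragraph above; entering "p < 0.001" as 0.001 is an assumption that treats the reported bound as the value.

```python
FAMILY_ALPHA = 0.05

# Reported p-values for league 1 versus each other league.
# "p < 0.001" is entered as its upper bound, 0.001 (an assumption).
reported_p = {2: 0.818, 3: 0.977, 4: 0.134, 5: 0.002, 6: 0.001, 7: 0.001}

# Bonferroni: divide the family-wise alpha by the number of comparisons.
per_test_alpha = FAMILY_ALPHA / len(reported_p)

# Leagues whose difference from league 1 survives the correction.
significant = sorted(lg for lg, p in reported_p.items() if p < per_test_alpha)
```

With six comparisons the per-test alpha is 0.05 / 6, or about 0.008, and only the comparisons with leagues 5, 6, and 7 fall below it.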

Screen fixation rate

As discussed in the Methods section, fixation count is a good oculomotor measure in the context of trials within a category-learning task, but it is not useful for continuous behavior, where there are no trials over which to establish a fixation count. Here we use fixation rate, defined as the number of fixations per minute, which is a better measure for a wider variety of tasks. In StarCraft 2, players are mostly within the range of 20–40 fixations per minute, as seen in Fig. 5. In the StarCraft 2 data the mean number of screen pauses (fixations and PACs) per minute increases with league: higher leagues make more fixations and PACs per minute than lower leagues (χ2 = 511.84, df = 6, p < .001, ηp2 = 0.152), and the most experienced StarCraft 2 players display a higher fixation rate than the novices (W = 44300, p < .001, r = 0.22). The mean fixation rate per league started at 23 fixations per minute for league 1, and ended at 37 fixations per minute for league 7.
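As a minimal sketch of the rate definition, the function below converts a count of screen pauses and a game duration into fixations per minute. The function name and the example game (345 screen pauses in 15 minutes) are hypothetical and chosen only for illustration.

```python
def fixation_rate_per_minute(n_fixations, duration_seconds):
    """Return fixations per minute for one game or recording."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return n_fixations * 60.0 / duration_seconds


# Hypothetical game: 345 screen pauses over 15 minutes of play.
rate = fixation_rate_per_minute(n_fixations=345, duration_seconds=15 * 60)
print(f"{rate:.1f} fixations per minute")
```

Because the rate is normalized by time, it can be computed for continuous play and for trial-based recordings alike.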

Fig. 5.

Fixation rate (fixations with and without actions per minute) in StarCraft 2 and a category-learning eye-tracking experiment. (A) Fixation rate (fixations per minute) in StarCraft 2 during game play. (B) Fixation rate (fixations per minute) in eye-tracking data during stimulus presentation. Fixation rates increase by league. Fixation rates increase in category learning, within the span of a single hour of learning, but are an order of magnitude different from fixations in StarCraft 2

How does this screen-movement fixation rate compare to oculomotor fixation rates in other kinds of tasks, such as category learning? In Fig. 5 we show fixation rate for an eye-tracking-based category-learning study (an unpublished measure from Meier & Blair, 2012). To capture increasing proficiency, we plot the fixation rate for the first 2 min of stimulus presentation during the experiment (excluding time spent receiving feedback and time spent viewing the fixation cross) and the last 2 min, once the participants have mastered the task. As seen in Fig. 5, fixation rates are roughly between 100 and 200 fixations per minute. The fixation rate is higher in the last 2 min than in the first 2 min, meaning that participants make more fixations per unit of time toward the end of the ~1-h-long experiment (W = 2180, p < 0.001, r = 1.14). While additional work needs to be done to document fixation rates in other situations to contextualize the rates found here, the present data indicate that fixation rates increase with learning in both eye-movement and digital interface scenarios.

Between-fixation amplitude

Similar to eye-saccade amplitude, screen saccade amplitude can indicate how an observer gathers information from the environment. Figure 6 shows the distance traveled from screen fixation to screen fixation by league. Generally, it seems that better players make larger amplitude screen saccades than novices; the mean between-fixation amplitude was 0.158 for league 1 and 0.232 for league 7. The impact of league on between-fixation amplitude was confirmed with a Kruskal-Wallis test (χ2 = 336.82, df = 6, p < .001, ηp2 = 0.101). Such longer amplitude saccades, when observed in eye tracking, are associated with a greater perceptual advantage in experts (Charness et al., 2001). Given that a similar saccade amplitude advantage was observed in experts in a computer-mediated visuospatial task, this finding offers another opportunity to generalize concepts of information access beyond the eye.
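One way to picture the amplitude measure is as the distance between the centers of consecutive screen fixations. The sketch below assumes Euclidean distance and assumes that coordinates are already normalized to the range 0 to 1; the normalization actually used for the reported values is not specified here, so both choices, along with all names and sample coordinates, are hypothetical.

```python
from math import hypot
from statistics import mean


def between_fixation_amplitudes(fixation_centers):
    """Distances between consecutive fixation centers (normalized units)."""
    return [
        hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(fixation_centers, fixation_centers[1:])
    ]


# Hypothetical normalized fixation centers for three consecutive fixations.
centers = [(0.10, 0.10), (0.40, 0.50), (0.40, 0.20)]

amplitudes = between_fixation_amplitudes(centers)
mean_amplitude = mean(amplitudes)
print(f"mean between-fixation amplitude = {mean_amplitude:.3f}")
```

A game's mean amplitude would then be averaged within each league to produce a summary like the one plotted by league.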

Fig. 6

Between-Fixation Amplitude. Distance between fixations (screen saccade amplitude) by league. This normalized distance metric shows that better players make relatively large screen movements compared to novice players

Use of the digital interface to optimize information access

Hotkey use

The use of hotkeys allows players to select their units without those units being on screen. Like typing without looking at one’s keyboard, this speeds performance by minimizing unnecessary perceptual processing. As shown in Fig. 7A, the ratio of hotkey-selects to other select actions visibly increases by league. The ratio starts at 0.43 for league 1 and ends at 1.65 for league 7. A Kruskal-Wallis test confirms that league is a significant factor (χ2 = 615.44, df = 6, p < 0.001, ηp2 = 0.185). Six players were excluded from this analysis because they used no hotkey-select actions. Better players appear to use more efficient means to select game units (via hotkey-select) than the slower manual selection method. One element of expertise is using the tools available to most efficiently complete the task, and reducing the reliance on individual selection is indeed evident in StarCraft 2 experts. How best to employ interface options to do a job is an important question for human-computer interaction. These data suggest that hotkey use is a valuable metric to record as a proxy for users’ competence.
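The ratio and the exclusion rule described above can be sketched as follows. The record layout, field names, and counts are hypothetical; only the definition (hotkey-selects divided by other select actions) and the exclusion of players with no hotkey-selects come from the text.

```python
def hotkey_select_ratio(n_hotkey_selects, n_other_selects):
    """Ratio of hotkey-selects to other select actions; None if undefined."""
    if n_other_selects == 0:
        return None
    return n_hotkey_selects / n_other_selects


# Hypothetical per-player action counts.
games = [
    {"player": "a", "hotkey_selects": 43, "other_selects": 100},
    {"player": "b", "hotkey_selects": 0, "other_selects": 80},
    {"player": "c", "hotkey_selects": 165, "other_selects": 100},
]

# Mirror the exclusion in the text: drop players with no hotkey-selects.
included = [g for g in games if g["hotkey_selects"] > 0]

ratios = {
    g["player"]: hotkey_select_ratio(g["hotkey_selects"], g["other_selects"])
    for g in included
}
print(ratios)
```

A ratio above 1 means that a player selects units by hotkey more often than by any other selection method.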

Fig. 7

(A) Hotkey use. The ratio of hotkey to selection actions increases with league. Using more hotkeys is one way to use the interface more efficiently, and the increase in hotkeys relative to selects suggests better players more effectively use the interface to choose game units. (B) Off-screen production by league. While novices produce most of their units while looking onscreen, as players become better at the game, they initiate unit production while their screen is on another part of the map

Off-screen production

Off-screen production, like the use of hotkeys, reflects instances of players taking actions without first orienting their main screen; it is a method of reducing unnecessary visual processing enabled by the digital interface. Figure 7B shows the percentage of production that is off-screen. Predictably, as player league increases, production shifts away from on-screen training; that is, the percentage of on-screen production decreases and the percentage of off-screen production increases, starting at a mean of 21% in league 1 and ending at 44% in league 7. A Kruskal-Wallis test confirms that league is a significant factor in use of off-screen production (χ2 = 162.31, df = 6, p < .001, ηp2 = 0.047). This supports the idea that more successful players (in terms of league) have learned to become more efficient, especially with relatively simple tasks like training units. Experienced StarCraft 2 players do not need to look at the location of these simple actions in order to perform them effectively. The less time players spend looking at their production sites, the more time they can allocate to other aspects of the game.
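A minimal sketch of how off-screen production might be scored is given below. It assumes that each production command is logged with the map position of the producing building and the viewport rectangle at that moment; this event format, the rectangle test, and the sample values are hypothetical illustrations, not the procedure used to generate Fig. 7B.

```python
def is_on_screen(point, viewport):
    """True if a map point lies inside the viewport rectangle."""
    x, y = point
    left, top, width, height = viewport
    return left <= x <= left + width and top <= y <= top + height


def percent_offscreen(production_events):
    """Percentage of production commands issued with the building off screen."""
    if not production_events:
        raise ValueError("no production events")
    n_off = sum(
        1 for building_xy, viewport in production_events
        if not is_on_screen(building_xy, viewport)
    )
    return 100.0 * n_off / len(production_events)


# Hypothetical events: (building position, viewport as left, top, width, height).
events = [
    ((10, 10), (0, 0, 30, 20)),    # building visible
    ((50, 40), (0, 0, 30, 20)),    # building off screen
    ((12, 8), (5, 5, 30, 20)),     # building visible
    ((90, 90), (60, 60, 30, 20)),  # building below the viewport
]

pct_off = percent_offscreen(events)
print(f"{pct_off:.0f}% of production was off-screen")
```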

Mini-map use

Another opportunity to leverage the game’s available tools to minimize shifts in the view screen is to use the mini-map to issue commands. We looked at three kinds of commands that can be performed on the mini-map, shown in the panels of Fig. 8. Special abilities performed on the mini-map are rare and used to perform a small set of actions to improve play (see Supplementary Materials S5, S7). Performing them without looking directly at the unit is a challenging pursuit, and so it is expected that this may be one measure in which experts are more easily identified. For the data shown in Fig. 8A, contrary to expectations, we found little evidence (χ2 = 14.91, df = 6, p = 0.021, ηp2 = 0.003) to suggest that the use of special abilities via the mini-map separates novices from better players. This may be partly attributed to the rarity of these actions, given that most games (n = 2613, or 79% of the sample) had none at all. It also might be that use of abilities requires finer targeting, which is difficult to achieve given the small size of the mini-map, and so it is simply not as efficient as we expected compared to other strategies that might be used by more advanced players.

Fig. 8

Mini-map actions by league. (A) Shows the use of special abilities through the mini-map interface. (B) Shows the attacks deployed by selecting a location using the mini-map. (C) Shows the right-clicks made to the mini-map to direct units to move

Players may also use the mini-map to perform an aggressive action. Attacking on the mini-map can allow a player to deploy units without moving their screen, which in turn saves the player from having to return to their original point of interest. Figure 8B shows the number of mini-map attacks used by league.

The mean number of mini-map attacks per minute increased by league, starting at 0.15/min in league 1 and ending at 1.97/min in league 7. A Kruskal-Wallis test confirms that league is a significant factor in use of mini-map attacks (χ2 = 366.66, df = 6, p < 0.001, ηp2 = 0.110). Expert players, then, make more mini-map attacks than their novice counterparts.

In addition to mini-map attacks, mini-map right clicks are used by experienced players to move units and/or groups around the screen. Figure 8C shows the number of mini-map right clicks by league; clearly visible is an increase by league. The mean number of mini-map right clicks per minute increased by league, starting at 1.22/min in league 1 and ending at 3.46/min in league 7. A Kruskal-Wallis test confirms that league is a significant factor in use of mini-map right clicks (χ2 = 176.17, df = 6, p < 0.001, ηp2 = 0.052). Once more, our stronger players possessed a more efficient toolkit. Not only did they move through the game environment faster, but they also used shortcuts to improve their in-game efficiency.
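The effect sizes reported alongside the Kruskal-Wallis tests can be related to the test statistic with a short sketch. It uses one common conversion from the H statistic to eta-squared, (H - k + 1) / (n - k), for k groups and n observations. Whether this exact formula produced the reported values is an assumption, and the sample size below is a rough back-calculation from the statement that 2613 games are 79% of the sample, not a reported figure.

```python
def eta_squared_from_h(h_statistic, n_groups, n_observations):
    """Eta-squared estimate from a Kruskal-Wallis H: (H - k + 1) / (n - k)."""
    if n_observations <= n_groups:
        raise ValueError("need more observations than groups")
    return (h_statistic - n_groups + 1) / (n_observations - n_groups)


N_LEAGUES = 7          # df = 6 in the reported tests implies 7 groups
APPROX_N_GAMES = 3308  # rough estimate (2613 / 0.79); not a reported value

# H statistic reported for the effect of league on NVC.
eta_sq_nvc = eta_squared_from_h(326.83, N_LEAGUES, APPROX_N_GAMES)
print(f"eta-squared for NVC = {eta_sq_nvc:.3f}")
```

With these rough inputs the sketch returns a value near 0.097 for the NVC test. Small differences from other reported values would be expected if the true sample size or the formula differs from what is assumed here.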

Overall, we found properties of StarCraft 2 fixations that qualitatively mirror properties of traditional eye fixations. Our findings thus support using cognitive theories of oculomotor activity to predict general visuospatial information access. Additionally, we found more domain-specific properties of player behavior that offer general lessons about how efficiency is valued and developed in human-computer interaction, especially when that interaction affords improvements over time.

Discussion

Learning and information sampling co-evolve. Information needs change during the learning process as goals become more sophisticated, thereby impacting how information is sampled. Consistent with theories of active sensing that posit adaptive sampling policies to maximize reward and minimize uncertainty (Gottlieb, 2018), we observe an evolution of information gathering as a result of the learning process. Eye movements become more efficient as an individual learns. In StarCraft 2, screen-based information access movements also change with skill.

Players manipulate their viewpoint in a StarCraft 2 game using a keyboard and mouse to sample information. Whether an observer moves their eyes, their head, their whole body, or a digital window to access new information, they are engaging in visuospatial selection.

Eye movements and screen movements are broadly analogous. Changing the contents of a digital display and fixating on novel visual features invoke the same goal: to access information of interest. As a complex environment, StarCraft 2 differs in some respects from traditional eye-tracking studies in that the need to perform both screen and eye movements is urgent and persistent (Glass, Maddox, & Love, 2013). Unlike lab tasks, which can be arbitrarily separated into distinct trials, game play is dynamic and ongoing. Nevertheless, game play situates rigorously measured information sampling in a naturalistic context.

The same set of hierarchically organized, expected information gain and reward principles are invoked independently of how information is accessed (Chelazzi et al., 2014). Eye and mouse movements have also been experimentally observed to share common modes of operation (Chen, Anderson, & Sohn, 2001; McColeman et al., 2014). However, the musculature required to execute a new eye movement differs from the musculature required to execute a mouse-based screen movement (Orlov & Apraksin, 2015). These physiological differences, and the fact that eye movements occur on a much shorter time scale than computer interface manipulation, may affect how we interpret these measures, such as whether we consider the sampled information as foveated. However, we find that in both cases learning outcomes remain the same: increasing expertise corresponds to shorter fixation durations.

Oculomotor information sampling and other types of spatial information sampling share similar neurological representations. The cortical maps representing planned actions of both kinds are rooted in the posterior parietal cortex (PPC). Activity in the lateral intraparietal sulcus (LIP) of the PPC corresponds to focused oculomotor attention, suggesting that the LIP contributes to the execution of spatially selective attention (Arcizet, Mirpour, Foster, & Bisley, 2017). The LIP encodes priority maps, which act as topographical representations of information in space. While the superior colliculus alone is predictive of covert attention shifts (Herman, Katz, & Krauzlis, 2018), parietal contributions to attention are based in comparing possible actions in a common "currency" between modalities (Sugrue, Corrado, & Newsome, 2005). Measures like New View Cost report principles of visuospatial cognition that can be compared by parietal maps that integrate modality-specific expectations. Along with other regions in the parietal lobe, there is evidence that the LIP is broadly implicated in information access (Bisley & Goldberg, 2010; Fecteau & Munoz, 2006).

A domain such as StarCraft 2 offers an opportunity to model, track, and predict individuals’ growth and learning trajectories. While the present data are cross-sectional, the methodology employed here could be extended to track individual players as they advance through leagues, improving their strategies, and, if all goes well, fine-tuning their spatial information sampling behavior to approximate peak efficiency. Tracking individual differences and modeling players who do advance in ranks in contrast to those who do not would offer compelling insight into the relative impact of spatial sampling in the development of expertise. Even though a cross-sectional dataset such as this does not tell the whole story about learning-related changes in information access, we still observe that better players are simply faster at accessing information, just as learned observers are more efficient with their eye movements in analogous tasks.

With increasing skill comes increasing efficiency. Fixation durations, perception-action cycles, and the actions within them all increase in speed. Efficient information access in traditional categorization tasks (e.g., Blair, Watson, Walshe, & Maj, 2009; Rehder & Hoffman, 2005) is characterized by more fixations deployed to relevant features. This effect is amplified if a relatively resource-intensive mouse movement is used to access the information instead of an eye movement (Meier & Blair, 2012). In StarCraft 2, skill significantly influences how the game interface is used. More sophisticated use of game features such as hotkeys and the mini-map, which increases with skill level, acts as a proxy measure for a more strategically focused interaction with the digital environment.

In a computer interface, both oculomotor observation of displays and the manipulation of those displays happen in tandem. The challenge of integrating these two modalities in game data is that usually we do not have information on the finely detailed activity of the eyes. Conversely, studies that attempt to measure learning and perception with eye tracking provide a wealth of detail on eye movements but limit participant interaction with the environment. Furthermore, the time scale of a laboratory study cannot match thousands of hours of practice typical of any real-world complex learning process. Given that we do not have access to what a player is actually foveating during a screen fixation, we do not know whether the effects seen in a laboratory learning experiment with relatively controlled stimuli apply to a dynamic display where the relevant stimuli may be moving. However, we do know from the current work that similar patterns of information sampling occur both in the lab and in the game. We identified a possible opportunity to generalize theories of oculomotor attention beyond the eye.

Our work provides converging evidence with the burgeoning study of eye movements during gaming. Söderberg, Khalid, Rayees, Dahlman, and Falkmer (2014) examined the degree to which experience in the military would affect gaze patterns during a combat video game. Military personnel exhibited different visual search strategies than civilians, but their fixation durations were similar. In the realm of virtual reality, a study in eye-hand coordination patterns (Chen & Tsai, 2015) also looked at differences between children and adults in a movement-based game, finding that children had shorter fixation durations than their adult counterparts, as well as longer latencies between the hand and eye movements. Castaneda, Sidhu, Azose, and Swanson (2016) found that experts were more efficient with their attention; they were less likely to fixate on the same area of interest twice in a row. Commonly used static elements of the heads-up-display (HUD) were fixated on less by experts, who instead spent more time monitoring dynamic elements of the HUD that would give them task-relevant information.

It is important to emphasize that while generalization of oculomotor attention theories to other sampling modalities seems warranted, there are indications that the cost of accessing information is an important component of a complete general theory. Studies of category learning have used both mouse-driven and eye-tracking-driven information access, allowing us to compare the relative expense of information access in two modalities. In the former case visual features used to classify a fictitious micro-organism were masked until the participant placed the cursor over the features, but in the latter case the features were available with a simple eye movement (McColeman et al., 2014; Meier & Blair, 2012). While no study has directly compared them, from an inspection of the data (e.g., Figs. 9C and 10C from McColeman et al., 2014) it appears that attentional optimization proceeds more rapidly in mouse-driven information access than in eye-movement studies. This supports the idea that the increased cost of accessing information leads to more efficient behavior (Meier & Blair, 2012). Also in keeping with the hypothesis that increased motor costs beget efficiency is the observed increase in the consistency of the allocation of attention (e.g., Figs. 9D and 10D in McColeman et al., 2014).

More direct evidence that manipulating information access could impact learning trajectories comes from the aforementioned mouse-driven study. That study had two conditions, one in which the selected visual feature was revealed immediately (low-information access cost), and another in which there was a 3-s delay between the mouse cursor selection and the unveiling of the information (high-information access cost). Increasing the temporal cost of accessing information decreased the chance of fixating irrelevant information (McColeman et al., 2014). Furthermore, the additional access cost increased the chance that participants fixated the informative features in the most efficient order (Meier & Blair, 2012). Thus, early evidence suggests that temporal cost changes in information access, via delayed appearance or changes in the physical movements required, will change information access patterns, not simply speed them up or slow them down.

One final obstacle to generalizing theories of attention beyond the eye is that in the above laboratory studies information is accessed either by mouse or by eye, but in many complex tasks, such as StarCraft 2, these two modalities work in concert. For example, there may be contexts where a learned pattern of eye movements must be preceded by a different mode of information access (e.g., a head-turn, hand-movement, or screen-shift) if they are to be useful. Given the lack of data from such situations, we have little basis for predictions about how these multi-modal information access strategies develop.

In the present work we show that learning affects information access behavior independently of sampling modality. This finding supports the idea of information sampling as a phenomenon not just of the oculomotor system, but of visual cognition generally. To generate the present findings, we relied on gaming logs that are records of complex human behaviors spanning thousands of hours of practice. We found that, with StarCraft 2, these logs can store psychologically and physiologically interesting data, albeit in a relatively coarse form. Applying broad theories of visual cognition to interpret data of this kind could open up new avenues of inquiry.

Open Practices Statement

The data and materials for the study are available on our public GitHub: https://github.com/SFU-Cognitive-Science-Lab/DigitEyes