Introduction

Despite the success of Combination Antiretroviral Therapy (cART) in suppressing viral load in persons living with HIV (PWH), >40% of these individuals exhibit neurocognitive deficits, collectively termed HIV-associated Neurocognitive Disorders (HAND) (Heaton et al., 2010). HAND-related deficits are typically mild in severity, although symptoms worsen over time in even functionally asymptomatic cases (Grant et al., 2014). This decline can impact patients’ day-to-day functioning (Heaton et al., 2004; Woods et al., 2017), potentially increasing viral risk by reducing medication adherence (Albert et al., 1999; Hinkin et al., 2004) and/or increasing risk-taking behavior (Gomez et al., 2017). Delineating mechanisms underlying HAND therefore remains a top priority. HAND most consistently affects learning and executive function, as identified by laboratory tasks requiring flexible responding (e.g., the Wisconsin Card Sorting Task; Heaton et al., 2011). These domains can be readily assessed in animals (Young & Markou, 2015). For example, the “executive” process of reversal learning—i.e., the ascertainment of changing reward contingencies and the appropriate modification of behavior—can be tested in humans and rodents via minimally modified cross-species translatable paradigms (Gilmour et al., 2013).

The HIV transgenic (HIVtg) rat, which constitutively expresses seven of the nine genes that comprise the viral genome (Reid et al., 2001), provides a model of HIV as it exists in the cART era—a non-replicative infection that chronically produces neuroinflammatory and cytotoxic agents (Vigorito et al., 2015). The HIVtg rat demonstrates impairments in learning initial task reward contingencies and in subsequent reversal learning (Lashomb et al., 2009; McLaurin et al., 2019; Moran et al., 2014); however, the applicability of these findings to clinical study is limited by two key aspects of experimental design. First, these studies utilized between-session reversal schedules, wherein rats were trained on initial behaviors over several testing sessions and then subsequently assessed across several more sessions following reversal of task contingencies. This design contrasts with those used in clinical study, in which initial and reversal learning are typically evaluated within individual testing sessions (i.e., within-session reversal) (Waltz & Gold, 2007). Second, assessments of reversal learning in the HIVtg rat have previously only utilized deterministic reward contingencies (McLaurin et al., 2019; Moran et al., 2014), wherein correct responses to experimental stimuli invariably yielded a reward and incorrect responses invariably did not. Although such tasks are useful as basic cognitive assessments, they are not representative of day-to-day problem-solving situations, which seldom offer options with 100% predictable outcomes. A more accurate readout of subjects’ “real world” problem-solving abilities may be provided by probabilistic learning paradigms, in which correct and incorrect responses offer reward rates that are merely statistically favorable or unfavorable (Amitai et al., 2014). The HIVtg rat has yet to be assessed in a within-session probabilistic reversal learning paradigm.

In the Probabilistic Reversal Learning Task (PRLT), subjects use response feedback to ascertain the probabilistic reward schedules of arbitrarily designated target (rich) and non-target (lean) stimuli. Stimulus reward schedules and target/non-target designations switch after every 8 consecutive target responses, and subjects must recognize and adapt to these reversals. The primary outcome measures of the PRLT provide metrics of subjects’ ability to determine initial task contingencies and detect reversals (Bari et al., 2010). In addition to cognitive flexibility however, a key contributor to performance of the PRLT is subjects’ motivation to maximize reward. Effortful motivation can be assayed by the Progressive Ratio Breakpoint Task (PRBT), in which subjects must expend progressively more effort across trials to earn a fixed reward. HIV genotype previously decreased responding in a PRBT-like task, albeit in non-food-deprived rats (Bertrand et al., 2018). Given potential neural heterogeneity within the HIVtg line (McLaurin, Li, et al., 2018), it is prudent to explicitly assay the motivational phenotype of individual cohorts of HIVtg rats, especially during food deprivation. Furthermore, given the physical component of operant task performance, previous reports of reduced spontaneous locomotion in the HIVtg rat necessitate the characterization of the present cohort’s baseline exploratory behavior (Casas et al., 2018; June et al., 2009; Moran et al., 2013; Reid et al., 2016). The Behavioral Pattern Monitor (BPM) provides a multivariate readout of rodents’ unconditioned activity patterns (Young et al., 2016), which could then be used to gauge any contribution of motor abnormalities to operant performance.

Here, the cognitive and motivational phenotypes of the HIVtg rat were assessed using the PRLT and the PRBT, and unconditioned exploratory behavior was measured by the BPM. Given previously reported sex × diagnosis effects in PWH (Maki et al., 2018; Martin et al., 2011), both male and female rats were included in the present study. It was predicted that regardless of sex, HIVtg rats would exhibit impairments in both initial probabilistic reward learning and subsequent reversal learning, even after taking into account any concurrent alterations in motivation and/or exploration.

Materials and Methods

Animals

The operant study utilized male and female HIV transgenic (HIVtg) Fischer-344 rats and wildtype (WT) controls (HIVtg: 8 per sex; males: 209–265 g; females: 150–185 g; Envigo; Indianapolis, IN) (WT: 8 per sex; males: 250–320 g; females: 170–210 g; Envigo; Indianapolis, IN). Given the possibility of sporadic HIV transgene insertion producing an unstable genetic baseline in non-transgenic littermates (Bertrand et al., 2018), non-littermate Fischer-344 rats were used as controls in this study. These 32 rats were later assessed in the Behavioral Pattern Monitor (BPM), as were an additional 16 HIVtg and 16 WT rats of the same age that had not undergone operant training. Rats were housed in pairs in clear plastic enclosures and maintained in a climate-controlled room under a 12-hour light/dark schedule (7:00 AM–7:00 PM dark). Operant training commenced at ~10 weeks of age. Operant-trained rats were maintained at ~90% of their free-feeding body weight. Water was available ad libitum, except during training and testing. Non-operant trained rats were not food restricted at any time, and operant-trained rats were not food restricted when tested in the BPM. Training and testing occurred during the dark portion of rats’ light/dark schedules. Rats were maintained in a dedicated animal facility compliant with all federal and state requirements and approved by the American Association for Accreditation of Laboratory Animal Care.

Apparatus

Training and testing were conducted in 9-choice operant chambers housed in ventilated, sound-attenuating cabinets (Med Associates Inc., St. Albans, VT, and Lafayette Instrument Company, Lafayette, IN; previously described in Roberts et al., 2019). Chambers contained five evenly spaced stimulus presentation/response apertures arranged laterally across the rear wall, each of which housed a single LED light. Infrared beams inside each aperture detected nosepoke responses. Liquid reinforcement (strawberry Nesquik® plus non-fat milk, 40 μL) was delivered into a magazine on the opposite wall. The magazine contained an LED light that signaled reward delivery and an infrared beam that detected reward collection. A single house light was mounted on the ceiling of each chamber. Stimulus outputs and response inputs were managed by a SmartCtrl Package (8-In/16-Out) with additional interfacing by MED-PC for Windows (Med Associates Inc., St. Albans, VT) using custom programming.

Training

Rats were first conditioned to associate magazine illumination with food reward via a 20-min FI15 training module, in which 40 μL of strawberry Nesquik® was delivered into the illuminated magazine on a 15-second fixed-interval schedule. Once responding reliably (60 reward collections; ~3 days), rats were trained in a 30-min FR1 operant paradigm that rewarded single nosepokes to any of five illuminated stimulus apertures. In order to prevent the development of strong side biases, apertures were disabled (i.e., did not reward nosepokes) following 5 consecutive responses and were only reactivated after 2 responses were made to other apertures. Nevertheless, overall side bias (i.e., proportion of pokes made in holes on the preferred side versus the non-preferred side) was tracked across training, and bias relative to the initial location of the PRLT target stimulus (see below) was later used as a covariate for primary analyses. Training continued until all rats had made ≥70 responses/day for 2 consecutive days. In order to prevent overtraining and maintain stability of responding, rats that reached this criterion before the rest of the cohort were moved to reduced training schedules (2 days/week). All rats completed a further 2 consecutive days of FR1 training before initiation of testing to confirm stability of responding.
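The anti-side-bias rule used during FR1 training can be illustrated with a minimal Python sketch. This is not the MED-PC program used in the study; the class name `Fr1BiasGuard` and its parameters are hypothetical. It assumes that the fifth consecutive response is still rewarded before the aperture is disabled, and that nosepokes into a disabled aperture continue to count toward the run.

```python
class Fr1BiasGuard:
    """Tracks which apertures are disabled under the FR1 anti-bias rule (illustrative only)."""

    def __init__(self, max_consecutive=5, reactivate_after=2):
        self.max_consecutive = max_consecutive    # consecutive responses that disable an aperture
        self.reactivate_after = reactivate_after  # responses elsewhere needed to re-enable it
        self.last = None                          # aperture of the most recent response
        self.run = 0                              # length of the current same-aperture run
        self.disabled = {}                        # aperture -> responses elsewhere still required

    def respond(self, aperture):
        """Register a nosepoke and return True if it is rewarded."""
        # A response to one aperture counts toward reactivating every other disabled aperture.
        for other in list(self.disabled):
            if other != aperture:
                self.disabled[other] -= 1
                if self.disabled[other] <= 0:
                    del self.disabled[other]
        # Update the run of consecutive responses to the same aperture.
        if aperture == self.last:
            self.run += 1
        else:
            self.last, self.run = aperture, 1
        rewarded = aperture not in self.disabled
        # Assumption: the response that completes the run is rewarded, then the aperture is disabled.
        if rewarded and self.run >= self.max_consecutive:
            self.disabled[aperture] = self.reactivate_after
        return rewarded
```

Under these assumptions, a sixth consecutive nosepoke into the same aperture goes unrewarded, and that aperture pays out again only after two responses have been made elsewhere.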

Probabilistic Reversal Learning Task (PRLT)

As reported previously (Roberts et al., 2019), the Probabilistic Reversal Learning Task (PRLT; Fig. 1a) presented rats with two illuminated stimulus apertures. One stimulus, arbitrarily designated as the “target,” rewarded 80% of nosepoke responses (40 μL strawberry Nesquik®) and punished 20% of responses (4-s timeout plus house light illumination). The other stimulus, the “non-target,” offered the inverse reward/punishment schedule. The target/non-target designations and reward schedules of the two stimuli were reversed after every 8 consecutive target responses across the 1-hour testing session. Initial target location was counterbalanced across operant boxes and rats. Response windows were unlimited. Primary outcome variables provided metrics for both initial probabilistic reward learning and reversal learning. Secondary measures included metrics of motoric impulsivity, processing speed, perseverative behavior, and reward and punishment sensitivity. Variables are defined in Table 1.
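The reversal rule described above can be sketched in Python as follows. This is an illustrative sketch rather than the task program itself: the function name `run_prlt` is hypothetical, the session is modeled as a fixed list of choices rather than a 1-hour window, and "trials to first criterion" is assumed to count trials up to and including the eighth consecutive target response (the definition in Table 1 may differ).

```python
import random

def run_prlt(choices, rng, p_target=0.8, criterion=8):
    """Replay a sequence of aperture choices (0 or 1) through the PRLT reversal rule.

    Returns (outcomes, switches, trials_to_first_criterion).
    """
    target = 0               # initial target side (counterbalanced in the study)
    streak = 0               # current run of consecutive target responses
    switches = 0             # completed reversals of reward contingency
    first_criterion = None   # trial on which the first criterion was met
    outcomes = []
    for trial, choice in enumerate(choices, start=1):
        is_target = (choice == target)
        # Target rewards 80% of responses; the non-target offers the inverse schedule.
        p_reward = p_target if is_target else 1 - p_target
        rewarded = rng.random() < p_reward
        outcomes.append((choice, is_target, rewarded))
        streak = streak + 1 if is_target else 0
        if streak == criterion:
            if first_criterion is None:
                first_criterion = trial
            target = 1 - target   # designations and reward schedules reverse
            switches += 1
            streak = 0
    return outcomes, switches, first_criterion

rng = random.Random(0)
outcomes, switches, first = run_prlt([0] * 8 + [1] * 8, rng)
print(switches, first)  # 2 8 (only the reward outcomes depend on the seed)
```

Note that the criterion depends on choices alone, not on whether those choices happened to be rewarded, so a subject can reach criterion despite receiving occasional punishment on the target.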

Fig. 1

Experimental design schematics. In the Probabilistic Reversal Learning Task (PRLT) (a), rats were presented with two physically identical illuminated stimulus apertures, one of which was arbitrarily designated as the target, and the other as the non-target. A nosepoke to the target aperture resulted in reward 80% of the time, and punishment 20% of the time. Reward/punishment probabilities following a nosepoke to the non-target were opposite those offered by the target (20/80). Following 8 consecutive target responses, the target/non-target designations and reward/punishment schedules of the two apertures switched. A further 8 consecutive target responses resulted in another such switch. This procedure continued for the duration of the 1-hour testing session. Reward/punishment probabilities are represented in the schematic as P(reward) / P(punishment). In the Progressive Ratio Breakpoint Task (PRBT) (b), rats were presented with a single illuminated stimulus aperture. The requisite number of nosepokes to earn a reward increased by trial, as indicated. Rats’ “breakpoints” were the number of such trials completed within the 1-hour testing session. During assessment in the Behavioral Pattern Monitor (BPM) (c), rats were placed in an enclosed chamber and allowed to explore freely for 1 hour. Rat position from moment to moment was tracked using a grid of infrared photobeams, disruptions of which were recorded by microcomputer. A second set of longitudinal photobeams (not pictured) were positioned at a height such that the beams would be broken when the rat reared on its hind limbs. Photobeams originated from LEDs mounted on a metal frame surrounding the behavioral arena. The beams then passed through black Plexiglas walls, crossed the arena, and were detected by phototransistors mounted on the opposite sides of the metal frame. The BPM chamber also contained 11 photobeam-monitored holes, which recorded investigatory nosepokes. Chambers were illuminated by red light. 
The height of the chamber walls is not to scale. Green stars represent photobeam breaks.

Table 1 Description of primary and secondary outcome variables of the PRLT

Progressive Ratio Breakpoint Task

The Progressive Ratio Breakpoint Task (PRBT; Fig. 1b) utilized only the central stimulus aperture, to which the requisite number of nosepokes to earn a single fixed reward increased as a function of trial number. The primary outcome variable was the “breakpoint”—the total number of trials completed within the session. The testing session was terminated after either: a) passage of 1 hour; or b) five uninterrupted minutes of inactivity. Response and reward collection latencies were also recorded.
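The breakpoint computation and the two termination rules can be sketched in Python as below. The sketch is illustrative only: the function name `prbt_breakpoint` is hypothetical, the ratio schedule must be supplied by the caller because the actual response requirements are given only in Fig. 1b, and "inactivity" is assumed to mean the absence of nosepokes.

```python
def prbt_breakpoint(poke_times, ratios, session_s=3600.0, timeout_s=300.0):
    """Count trials completed from nosepoke timestamps (seconds from session start).

    ratios[i] is the number of nosepokes required on trial i (caller-supplied, hypothetical).
    """
    completed = 0        # trials completed so far (the breakpoint)
    pokes_in_trial = 0   # nosepokes accumulated toward the current requirement
    last_time = 0.0      # time of the previous nosepoke (session start initially)
    for t in sorted(poke_times):
        # The session ends after 1 hour or after 5 min without a response.
        if t >= session_s or t - last_time >= timeout_s:
            break
        last_time = t
        if completed >= len(ratios):
            break  # the supplied schedule is exhausted
        pokes_in_trial += 1
        if pokes_in_trial == ratios[completed]:
            completed += 1
            pokes_in_trial = 0
    return completed
```

For example, with a hypothetical schedule of `[1, 2, 4]`, seven evenly spaced nosepokes complete three trials, whereas a pause of five minutes or more after the third nosepoke ends the session with a breakpoint of 2.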

Behavioral Pattern Monitor

The Behavioral Pattern Monitor (BPM; Fig. 1c) comprised a 30 × 60 cm arena traversed by 2 arrays of infrared photobeams. The first array, a 9 × 17 beam grid at a height of 1 cm from the floor, monitored rats’ X-Y coordinates while a second set of 16 longitudinal beams at a height of 11.5 cm detected rearing behavior. Eleven photobeam-monitored holes were positioned around the arena, which rats could investigate via nosepoke. Chambers were enclosed by 40-cm-high black Plexiglas walls that allowed the passage of photobeams, but appeared opaque to the rats. Chambers were illuminated by 7.5-W red light bulbs and housed within ventilated, sound-attenuating cabinets. Photobeams were sampled via microcomputer at 55-msec intervals. The BPM assessed 3 dimensions of spontaneous behavior: general activity levels, exploration, and locomotor path patterns (Young et al., 2016). Variables are defined in Table 2. Sessions lasted 60 min.

Table 2 Description of outcome variables of the BPM

Statistical Analyses

Outcome variables of FI15 (reward collections, days to acquisition) and FR1 training (total rewards, days to acquisition, and side biases across the last 3 days of training) were analyzed by genotype and sex via two-way ANOVAs. Primary and secondary outcome variables of the PRLT and PRBT were similarly analyzed using sex and genotype as between-subjects factors. Significant interactive effects were analyzed further via independent samples t-tests. Given individual differences in FR1 training, primary outcome variables of the PRLT (trials to first criterion and switches) were also analyzed via two-way ANCOVAs incorporating the following covariates: days to FR1 acquisition, total completed FR1 trials, and FR1 side bias relative to initial target location. Sample sizes varied across analyses as some animals failed to generate data for certain variables; for example, subjects that failed to attain the first criterion for reversal of reward contingency could not produce values for the “trials to first criterion” measure. Sample sizes for each analysis are therefore reported individually in the Results section. Outcome variables of the BPM were analyzed via four-factor ANOVAs, with sex, genotype, and cohort (operant-trained versus non-operant-trained) as between-subjects factors and 20-min session bin as a within-subjects factor; consistent with prior studies, center time was analyzed using 10-min session bin as a within-subjects factor for greater temporal resolution. Significant and trend-level (p < 0.10) interactive effects were analyzed further via follow-up ANOVAs. All data were analyzed using SPSS 24.0 (Chicago, IL).

Results

FI15 and FR1 Training

No main or interactive effects of genotype were observed on either reward collection or days to acquisition during FI15 training [F’s < 0.60, n.s.]. Male rats generally required fewer days to reach criterion than female rats [F(1,21) = 4.4; p < 0.05] and made more individual excursions to the magazine to collect rewards [F(1,28) = 10.1; p < 0.01]. Two rats failed to reach criterion for acquisition of FR1 within 20 days and were excluded from analyses of FR1 acquisition and side bias; however, these rats were included when analyzing total number of completed FR1 trials. An additional rat was excluded from side bias analyses (>2 standard deviations from mean). Final sample sizes were 15 WT and 15 HIVtg rats for FR1 acquisition, 16 WT and 16 HIVtg rats for total FR1 trials, and 15 WT and 14 HIVtg rats for side bias analyses. Non-significant trends towards main effects of genotype [F(1,26) = 3.1, p = 0.091] and sex [F(1,26) = 3.6, p = 0.068], but not sex × genotype interaction [F(1,26) = 2.3, n.s.], were observed on days to FR1 acquisition (Fig. 2a). No main or interactive effects of sex or genotype were observed on completed FR1 trials [F’s < 2.7, n.s.; Fig. 2b] or side bias, either overall or relative to the initial PRLT target location [F’s < 0.9, n.s.] [Overall: WT females: 5.07 ± 1.50; WT males: 5.64 ± 1.41; HIVtg females: 3.43 ± 1.41; HIVtg males: 4.73 ± 1.62] [Relative to target: WT females: 5.07 ± 1.52; WT males: 5.64 ± 1.42; HIVtg females: 3.16 ± 1.42; HIVtg males: 4.73 ± 1.64]. While all groups developed side preferences, previous observations in our laboratory indicate that even rats demonstrating strong and persistent side biases during training (values ≥3) are capable of flexible responding during later phases of study. Specifically, while training on a simple discrimination task at approximately chance-level accuracy, individual rats developed strong side biases that spontaneously neutralized without any concurrent change in overall task performance (Roberts, 2018). Given these observations, as well as our precautions against FR1 overtraining (see Methods), the present cohort’s preference values did not likely translate to inflexible responding in the PRLT (although care was taken to control for this behavior during PRLT analysis).

Fig. 2

HIV genotype subtly reduces FR1 acquisition and significantly enhances probabilistic learning, but does not affect reversal learning. a Non-significant trends of sex and genotype, but not sex × genotype interaction, were observed on FR1 acquisition, with both HIVtg rats and male rats requiring more sessions to reach criterion. b Neither genotype nor sex affected overall trial completion across FR1 training. c HIVtg rats required fewer trials than WT rats to attain the first criterion for reversal of reward contingencies in the PRLT (i.e., 8 consecutive target responses). d No main or interactive effects of sex or genotype were observed on total number of switches completed within the PRLT session. Data presented as mean ± standard error of the mean. Presented means are not adjusted by covariates. *p < 0.05; #p < 0.10

Primary Outcome Measures of the PRLT

The two rats that had failed to acquire FR1 were excluded from all PRLT analyses. Three more rats failed to attain the first criterion for reward contingency reversal and were excluded from analysis of that variable. Two WT rats (a male and a female) were also excluded from this analysis as outliers (>2 standard deviations greater than the mean). Two of the three rats that failed to reach the first criterion were also excluded from analysis of switches data on grounds of low overall activity (<30 total responses across entire session). Final sample sizes were 12 WT and 13 HIVtg rats for the trials to first criterion analysis and 14 WT and 14 HIVtg rats for the switches analysis.

HIVtg rats required fewer trials to reach the first criterion than controls [F(1,21) = 4.7, p < 0.05; Fig. 2c]. This effect was significant even when using FR1 side bias (relative to starting target location) as a covariate [F(1,20) = 4.4, p < 0.05], and persisted as a near-significant trend when adjusting for days to FR1 acquisition [F(1,20) = 4.2, p = 0.054] and total completed FR1 trials [F(1,20) = 3.9, p = 0.062]. Neither days to FR1 acquisition [F(1,20) = 0.04, n.s.] nor total FR1 trials [F(1,20) = 0.05, n.s.] were significant covariates, however, indicating that, while the main effect of genotype was reduced to a near-significant trend level after controlling for these factors, initial probabilistic learning was independent from FR1 performance. No main or interactive effects of sex were observed on this measure following any of these analyses [F’s < 1.2, n.s.]. No main or interactive effects of genotype or sex were observed on switches either before [F’s(1,24) < 2.1, n.s.; Fig. 2d] or after incorporating the above covariates [F’s < 1.8, n.s.].

Secondary Outcome Measures of the PRLT

The same four rats that were excluded from the switches analysis were excluded from analyses of secondary variables. One additional rat was excluded from analysis of response latencies as an outlier (>2 standard deviations from mean). Another 10 rats failed to attain the second criterion (i.e., completion of reversal 1) and were therefore excluded from analyses of variables pertaining to the second block of testing. Final sample sizes were 14 WT and 13 HIVtg rats for analyses of response latencies, 14 WT and 14 HIVtg rats for other variables that were not contingent upon completion of the second block, and 7 WT and 11 HIVtg rats for variables pertaining specifically to the second block (reversal 1).

No main or interactive effects of genotype [F(1,14) = 2.9, n.s.] or sex [F < 1, n.s.] were observed on trials for reversal 1 [F < 1, n.s.; Fig. 3a]. HIVtg rats completed fewer trials within the entire testing session than WT controls [F(1,24) = 8.6, p < 0.01; Fig. 3b], regardless of sex [F < 1, n.s.]; female rats completed non-significantly more trials than males [F(1,24) = 3.4, p = 0.08]. No main or interactive effects of genotype or sex were observed on % premature responses [F’s < 1, n.s.; Fig. 3c]. HIVtg rats made more reward perseverative responses [F(1,24) = 10.1, p < 0.01; Fig. 3e] and punish perseverative responses [F(1,24) = 39.0, p < 0.001; Fig. 3f] than controls relative to rewards delivered and punished selections, respectively (values normalized as per Table 1). Males made more punish perseverative responses than females [F(1,24) = 5.0, p < 0.05]. Sex did not affect reward perseverative responses [F(1,24) = 1.5, n.s.], and no sex × genotype interactions were observed on either perseverative response measure [F’s < 2, n.s.]. Main effects of genotype [F(1,24) = 41.0, p < 0.001] and sex [F(1,24) = 17.0, p < 0.001], but not sex × genotype interaction [F < 1.8, n.s.], were observed on timeout responses as well, with HIVtg rats making more such responses relative to total punished selections than controls, and males making more than females (Fig. 3d). Relative to controls, HIVtg rats demonstrated longer latencies to respond to target [F(1,23) = 11.0, p < 0.01; Fig. 3g] and non-target stimuli [F(1,23) = 4.3, p = 0.050; Fig. 3h] and to collect rewards [F(1,24) = 6.8, p < 0.05; Fig. 3i]. Males demonstrated longer target [F(1,23) = 14.6, p < 0.001; Fig. 3g], non-target [F(1,23) = 5.9, p < 0.05; Fig. 3h], and reward latencies [F(1,24) = 5.2, p < 0.05; Fig. 3i] than females. A sex × genotype interaction was observed on mean target latency only [F(1,23) = 4.5, p < 0.05], wherein HIV genotype increased response latency in male rats only [t(11) = −3.0, p < 0.05; Fig. 3g].

Fig. 3

Secondary outcomes for the PRLT reveal evidence for a speed-accuracy trade-off in HIVtg rats. a No main or interactive effects of sex or genotype were observed on the number of trials required to attain criterion for the second reversal of reward contingencies. b HIVtg rats completed fewer trials within the entire session than WT rats. c No main or interactive effects of sex or genotype were observed on the percentage of total trials terminated by premature responses. HIVtg rats made more reward (d) and punish perseverative responses (e) than WT rats when these values were normalized to, respectively, the total number of rewards delivered and the total number of punished selections (i.e., nosepokes to 1 of the 2 lit stimulus apertures that resulted in a timeout); values represent the mean number of such responses per situation. f HIVtg rats and male rats made more timeout responses than WT and female rats, respectively, normalized to total number of punished selections; values represent the mean number of timeout responses made following each punished selection. HIVtg rats demonstrated longer latencies than WT rats to respond to target (g) and non-target (h) stimuli, and were also slower to collect rewards (i). Males demonstrated longer latencies to respond to target (g) and non-target stimuli (h) and to collect rewards (i). g The main effects of genotype and sex on mean target latency were driven by HIVtg males. Latencies reported in centiseconds. Data presented as mean ± standard error of the mean. *p < 0.05; **p < 0.01; ***p < 0.001

No main or interactive effects of sex or genotype were observed on overall win-stay behavior across the entire session [F’s < 1, n.s.; Table 3, “Total”], although HIV genotype did non-significantly decrease overall lose-shift behavior [F(1,24) = 3.0, p = 0.094]. No main or interactive effects of sex were observed on overall lose-shift behavior across the session [F’s < 1, n.s.], although males exhibited more target lose-shift behavior than females [F(1,24) = 7.5, p < 0.05]; no main or interactive effects of genotype were observed on this latter measure [F’s < 1, n.s.]. Non-target win-stay behavior was greater amongst HIVtg rats than WT rats across the session [F(1,24) = 7.1, p < 0.05], with no difference within or between sexes [F’s < 1, n.s.].

Table 3 Win-stay & lose-shift metrics for the PRLT

No main or interactive effects of genotype were observed on win-stay or lose-shift metrics within the first block of testing [F’s < 2.6, n.s.; Table 3, “Criterion 1”]. Males exhibited non-significantly greater overall win-stay behavior than females within this block [F(1,14) = 3.3, p = 0.091], as well as greater target lose-shift ratios [F(1,14) = 8.6, p < 0.05]. Sex did not exert main or interactive effects on any other measure within this testing block or after the first reversal (i.e., during the second block of testing) [F’s < 1.2, n.s.; Table 3, “Reversal 1”]. The only main effect of genotype after the first reversal was on target win-stay behavior [F(1,14) = 5.1, p < 0.05], with HIVtg rats returning higher values than controls.
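For readers unfamiliar with win-stay and lose-shift ratios, the following Python sketch shows one common way of computing them from a trial sequence. It assumes the conventional definitions (the proportion of rewarded trials followed by the same choice, and the proportion of punished trials followed by a change of choice); the exact definitions used here are those given in the tables, which may differ. The function name `win_stay_lose_shift` is hypothetical, and block boundaries are ignored.

```python
def win_stay_lose_shift(trials):
    """Compute overall win-stay and lose-shift proportions.

    trials: list of (choice, rewarded) tuples in session order.
    Returns (win_stay, lose_shift); a value is None when it is undefined.
    """
    wins = stays = losses = shifts = 0
    # Pair each trial with the one that follows it; the final trial has no successor.
    for (choice, rewarded), (next_choice, _) in zip(trials, trials[1:]):
        if rewarded:
            wins += 1
            stays += next_choice == choice    # repeated the rewarded choice
        else:
            losses += 1
            shifts += next_choice != choice   # abandoned the punished choice
    win_stay = stays / wins if wins else None
    lose_shift = shifts / losses if losses else None
    return win_stay, lose_shift
```

Target- and non-target-specific ratios, as reported above, could be obtained under the same assumptions by restricting the input to trials on which the relevant stimulus was chosen, together with their successors.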

PRBT

The same four rats that were excluded from PRLT switches analysis were excluded from PRBT analysis, as were an additional 6 rats which failed to respond at all during the PRBT (breakpoint = 0, reasons unknown). Final sample sizes were 11 WT and 12 HIVtg rats for breakpoint and response latency analysis, and 11 WT and 11 HIVtg rats for reward latency analysis. No main or interactive effects of sex or genotype were observed on breakpoint [F’s < 0.7, n.s.; Fig. 4a] or on response or reward latency [F’s < 2.5, n.s.; Fig. 4b, c]. A trend towards sex × genotype interaction was observed on response latency [F(1,19) = 3.35, p = 0.083], follow-up analysis of which revealed that WT females tended to respond more quickly than WT males [t(9) = −1.9, p = 0.087].

Fig. 4

HIV genotype did not affect effortful motivation in the PRBT. No main or interactive effects of sex or genotype were observed on breakpoint (a) or response (b) or reward latency (c), although a near-significant trend of sex × genotype interaction revealed a tendency for WT females to respond more quickly than WT males. Data presented as mean ± standard error of the mean. #p < 0.10

BPM

Testing bin affected all measures of activity and exploration in the BPM (F’s > 4.5, p’s < 0.05), with activity/behavior decreasing across 20-min bins. Bin did not interact with genotype on any measure [F’s < 1.7, n.s.], indicating equal rates of habituation across genotypes. No main effects of cohort were observed on any measure, nor were any interactions with sex or genotype [F’s < 2.2, n.s.]; however, non-operant-trained rats did exhibit non-significantly higher values for spatial d [F(1,55) = 3.4, p = 0.071]. Cohort interacted with bin on counts [F(2,110) = 5.8, p < 0.01], transitions [F(2,110) = 4.6, p < 0.05], and distance traveled [F(2,110) = 8.1, p < 0.01], with operant-trained rats exhibiting more counts during the first bin [F(1,61) = 4.1, p < 0.05] and greater distance traveled during the first [F(1,61) = 8.0, p < 0.01] and second [F(1,61) = 4.3, p < 0.05] bins. Operant-trained rats tended to complete fewer transitions during the last bin [F(1,61) = 3.4, p = 0.069].

No main effects of genotype or sex were observed on counts, transitions, distance traveled, or nosepoking [F’s < 1, n.s.; Fig. 5a–d], although non-significant sex × genotype interactions were observed on each of these measures. Post hoc analysis of a trend-level interaction on counts was inconclusive [interaction: F(1,55) = 3.0, p = 0.087; post hoc: F’s < 2.8, n.s.], although similar analyses revealed that: 1) HIVtg females tended to complete more transitions than HIVtg males [interaction: F(1,55) = 3.4, p = 0.072; post hoc: F(1,31) = 3.8, p = 0.062; Fig. 5b]; 2) HIVtg males tended to travel shorter distances than WT males [interaction: F(1,55) = 3.9, p = 0.053; post hoc: F(1,31) = 3.3, p = 0.079; Fig. 5c]; and 3) HIVtg females completed more nosepokes than HIVtg males [interaction: F(1,55) = 3.5, p = 0.067; post hoc: F(1,31) = 5.4, p < 0.05; Fig. 5d]. HIVtg rats reared more times within the 60-min session than controls [F(1,55) = 5.0, p < 0.05; Fig. 5e], and females reared more times than males [F(1,55) = 10.5, p < 0.01].

Fig. 5

HIV genotype alone affects only rearing behavior in the BPM. No main or interactive effects of sex or genotype were observed on the number of distinct behaviors performed during the session (a). Male HIVtg rats tended to make fewer transitions than female HIVtg rats (b) and to travel shorter distances than male WT rats (c). Post hoc analysis of a near-significant trend of sex × genotype interaction revealed that HIVtg females made more nosepokes than HIVtg males (d). HIVtg rats exhibited more rearing behavior than WT rats, and females reared more frequently than males (e). HIVtg rats tended to spend less time in the center than WT rats across the entire session, with no interaction with sex or bin (f). Data presented as mean ± standard error of the mean. #p < 0.10; *p < 0.05; **p < 0.01; +Significant (p < 0.05) post hoc analysis of near-significant (p < 0.10) sex × genotype interaction

Overall, HIVtg rats tended to spend less of their time in the center of the chamber than controls, although this effect failed to attain significance [F(1,50) = 3.5, p = 0.068; Fig. 5f]. Genotype did not interact with sex or 10-min bin on this measure [F’s < 1], and no main or interactive effects of sex were observed [F’s < 1].

Discussion

Contrary to our hypotheses, HIV transgenic (HIVtg) rats exhibited intact reversal learning and superior initial probabilistic learning relative to wildtype (WT) controls, as measured by the within-session Probabilistic Reversal Learning Task (PRLT). Specifically, HIVtg rats completed the same number of “switches” (reversals of reward contingency) as controls (Fig. 2d) and required fewer trials to ascertain original reward contingencies (Fig. 2c); however, they tended to require more days to reach criterion for FR1 acquisition (Fig. 2a). These effects were not likely driven by altered effortful motivation or locomotor patterns, as no significant effects of genotype were observed in the Progressive Ratio Breakpoint Task (PRBT; Fig. 4a) or on relevant measures of the Behavioral Pattern Monitor (BPM; Fig. 5a–c). The present findings suggest a nuanced cognitive profile for HIVtg rats, whereby they may excel in within-session (if not between-session; Moran et al., 2014) reversal learning in operant discrimination tasks.

The non-significant trend towards slower FR1 acquisition by HIVtg rats (Fig. 2a) suggests a between-session learning deficit consistent with earlier reports, although HIVtg and WT rats completed similar numbers of trials across FR1 training (Fig. 2b). Furthermore, despite the absence of a significant sex × genotype interaction, the trend of genotype on days to FR1 criterion was likely driven by female WT rats, which required an average of 5 fewer days to attain criterion than the other groups. Therefore, any between-session learning deficit (albeit non-significant) was likely sex-specific and did not translate to reduced trial completion. The absence of a significant early learning deficit was critical because, consistent with previous characterization of the HIVtg line, the rats used in the present study presented with cataracts (Reid et al., 2001). While these rats likely had some degree of visual deficiency, their relatively intact acquisition of FR1 (as well as of the subsequent PRLT and PRBT) demonstrates that this impairment was not sufficiently severe to hinder performance of operant tasks providing distinct visual cues.

In addition to reaching the first criterion more quickly than controls and completing the same number of switches, HIVtg rats exhibited longer response latencies (Fig. 3g, h) and completed fewer overall trials (Fig. 3b) than their WT counterparts. These behaviors may reflect a “speed-accuracy trade-off” strategy, wherein longer decision times enabled more accurate choices at the expense of trial completion rate (Abraham et al., 2004). Indeed, HIVtg rats did not exhibit slowed response latencies during the single-stimulus PRBT (Fig. 4b), suggesting that this behavior was specific to decision-making situations. Furthermore, the absence of genotype effects on breakpoint in the PRBT (Fig. 4a) and on activity in the BPM (Fig. 5a–c) indicates that the reduction in trial completion was not likely due to any deficit in motivation or movement, but rather to increased time expenditure during specific task-related behaviors (e.g., longer decision times pursuant to a speed-accuracy trade-off). Another likely contributor to the low trial completion rate and optimized rule acquisition/switch completion of HIVtg rats was a high level of perseverative nosepoking activity following stimulus selection (Fig. 3d–f). HIVtg rats were slower than controls to disengage from nosepoking following both rewarded and punished selections, thereby delaying excursion to the magazine for reward collection (Fig. 3i) or post-timeout trial initiation. While this perseveration may have limited trial completion by protracting individual trial durations, these superfluous nosepokes into rewarded or punished apertures following feedback delivery may have served to strengthen the association between given stimuli and their outcomes. Despite these behaviors, however, enhanced reward contingency acquisition by HIVtg rats was observed during the first testing block only (Figs. 2c and 3a), indicating that only initial probabilistic learning was significantly facilitated.

This speed-accuracy trade-off strategy of the HIVtg rats may have been utilized in unequal measure by the two sexes. Overall, males exhibited slower response latencies and higher levels of punish perseverative and timeout responding than females, regardless of genotype (Fig. 3e–i). More importantly, HIVtg males took longer to respond to target stimuli than HIVtg females, which did not differ from controls (Fig. 3g). Meanwhile, HIVtg females exhibited the same reduction in completed trials as the males. The cause of this reduction is less evident in HIVtg females than in HIVtg males, whose prolonged response times would likely have limited the number of trials completed. Sex × genotype/diagnosis interactions in HIVtg rats and PWH typically describe more severe cognitive impairments in females than in males (Maki et al., 2018; Martin et al., 2011; McLaurin et al., 2017), although only male HIVtg rats exhibited impaired acquisition of a between-session reversal of a signal detection task (McLaurin et al., 2019). The implications of these previous findings are unclear. Critically, no main or interactive effects of sex were observed on the primary measures of the PRLT (Fig. 2c, d), indicating that, by possibly different means, both male and female HIVtg rats demonstrated superior probabilistic learning and normal reversal learning relative to controls.

Only non-significant and/or incidental differences in win-stay/lose-shift behavior were observed between genotypes (Table 3), indicating that HIVtg rats did not exhibit appreciably different levels of reward or punishment reactivity or of model-based or model-free response behavior. In model-based response strategies, decisions are made by weighing options’ statistical probabilities of reward, whereas model-free strategies non-specifically favor recently rewarded stimuli over recently punished stimuli (Groman et al., 2019; Voon et al., 2015). Given that the PRLT is characterized by the occasional delivery of “misleading” punishment or reward (Ragland et al., 2012), subjects would benefit from a predominantly model-based response strategy resistant to spurious feedback. Such a strategy would be characterized by relatively high target win-stay and non-target lose-shift ratios (Amitai et al., 2014), as opposed to non-specific elevations to overall win-stay and lose-shift behavior. Given the general absence of genotype effects on these measures, HIVtg rats’ PRLT performance cannot be explained by an enhanced ability to generate probability projections.
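As an illustration of how the win-stay/lose-shift measures discussed above can be separated by stimulus type, the following Python sketch computes target and non-target ratios from a sequence of trials. The `Trial` record, the function name, and the exact ratio definitions are assumptions made for this example; they are not taken from the analysis code of the present study or from Amitai et al. (2014).

```python
import math
from typing import Dict, List, NamedTuple


class Trial(NamedTuple):
    aperture: str    # physical aperture chosen, e.g. "left" or "right" (hypothetical labels)
    is_target: bool  # True if the chosen aperture was the high-probability option
    rewarded: bool   # True if rewarded, False if punished (timeout)


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when no qualifying trials exist."""
    return numerator / denominator if denominator else math.nan


def win_stay_lose_shift(trials: List[Trial]) -> Dict[str, float]:
    """Compute win-stay and lose-shift ratios split by target/non-target choice.

    A 'stay' is choosing the same aperture on the next trial; a 'shift' is
    choosing the other aperture. The last trial has no successor and is skipped.
    """
    # counts[(is_target, rewarded)] = [stays after a win or shifts after a loss, total]
    counts = {(t, r): [0, 0] for t in (True, False) for r in (True, False)}
    for prev, nxt in zip(trials, trials[1:]):
        stayed = nxt.aperture == prev.aperture
        cell = counts[(prev.is_target, prev.rewarded)]
        cell[1] += 1
        # After a win we count stays; after a loss we count shifts.
        if stayed == prev.rewarded:
            cell[0] += 1
    return {
        "target_win_stay": _ratio(*counts[(True, True)]),
        "nontarget_win_stay": _ratio(*counts[(False, True)]),
        "target_lose_shift": _ratio(*counts[(True, False)]),
        "nontarget_lose_shift": _ratio(*counts[(False, False)]),
    }


# Invented example session (not data from the study).
example = [
    Trial("left", True, True),     # win on target, then stays
    Trial("left", True, False),    # misleading loss on target, then stays (no shift)
    Trial("left", True, True),     # win on target, then shifts
    Trial("right", False, False),  # loss on non-target, then shifts
    Trial("left", True, True),     # final trial, no successor
]
ratios = win_stay_lose_shift(example)
```

Under these assumed definitions, a subject that ignores the occasional misleading punishment on the target would show a low target lose-shift ratio alongside high target win-stay and non-target lose-shift ratios.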

Despite the absence of genotype effects on switches or acquisition of the second criterion, the conclusion that HIVtg rats were unimpaired in reversal learning is limited by the testing schedule. For example, it is possible that these rats were in fact subtly impaired in reversal learning and were only able to complete the same number of switches as controls because they took fewer trials (and ostensibly less time) to initiate the first reversal (longer individual trial durations notwithstanding). Such a hypothesis is difficult to assess statistically given that few rats of either genotype completed more than one block of reversal learning (although it should be noted that 11 HIVtg rats completed this first reversal vs. only 7 WT rats). Furthermore, rats typically require several sessions to reach asymptotic performance in the PRLT (Bari et al., 2010), thus raising the possibility of floor effects in the present data set. The single-session study design employed herein was chosen for two reasons: 1) it reproduces the design used in clinical studies (Waltz & Gold, 2007); and 2) it better recreates “real world” situations requiring spontaneous behavioral flexibility than would a trained (i.e., multiple-session) reversal learning paradigm. While it may be conservatively concluded that HIVtg rats did not exhibit impaired reversal learning insofar as it could be measured by a single session of the PRLT, future studies should address the above limitations by including additional testing points. Such studies should take care to consider the first session individually, as well as in combination with subsequent testing.

The HIVtg rats’ proficiency in the PRLT contrasts with the majority of extant literature, which describes genotype-mediated impairment in a range of cognitive tasks (reviewed in Vigorito et al., 2015). Of greatest relevance to the PRLT is the report that HIVtg rats required more sessions for acquisition of discrimination and reversal learning tasks than controls, and exhibited higher rates of attrition in the process (>75%; Moran et al., 2014). The divergence between past and present reports most likely results from the specifications of our respective reversal learning tasks. For example, the PRLT set a less stringent criterion for task acquisition than that enforced in the previous study, i.e., 8 consecutive target responses within a session versus 3 consecutive sessions of >70% accuracy (Moran et al., 2014). Since acquisition of criterion in the PRLT was not determined by cumulative accuracy across entire testing sessions, it was more “forgiving” of early errors and less contingent upon maintained performance. A more critical point of difference is that while the PRLT introduced reversals within individual testing sessions, the previous study was conducted using two discrete tasks administered during separate phases of study (i.e., between-session reversal) (Moran et al., 2014). Between-session assessments of reversal learning contain a long-term memory component not present in within-session assessments (Amitai et al., 2013); given that the HIVtg rat displays deficits in long-term memory (McLaurin, Booze, & Mactutus, 2018; Moran et al., 2013; Vigorito et al., 2013), it is possible that previously reported HIVtg reversal learning was hindered by a reduced ability to recall response outcomes from prior sessions. This deficit would likely have been compounded by the task’s cumulative accuracy criterion; impaired long-term memory may have caused high rates of error early in the session that, when averaged with subsequent trial performance, may have prevented overall session accuracy from reaching 70%. Therefore, given the present findings, it is possible that the HIVtg rat has little to no impairment in operant reversal learning, provided that task requirements are not contingent upon events that had transpired during past sessions.
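The contrast between the two acquisition criteria can be made concrete with a short Python sketch. It compares a consecutive-response criterion (8 target responses in a row) with a cumulative accuracy threshold (>70%) on a single invented session. The function names and the example data are assumptions for illustration only, and the sketch evaluates one session rather than the 3 consecutive sessions required in the previous study.

```python
import math
from typing import List, Optional


def trials_to_run_criterion(target_choices: List[bool], run_length: int = 8) -> Optional[int]:
    """Return the 1-based trial on which `run_length` consecutive target
    responses is first completed, or None if the criterion is never met."""
    run = 0
    for trial_number, chose_target in enumerate(target_choices, start=1):
        run = run + 1 if chose_target else 0  # any non-target response resets the run
        if run == run_length:
            return trial_number
    return None


def session_accuracy(target_choices: List[bool]) -> float:
    """Return the proportion of target responses across the whole session."""
    if not target_choices:
        return math.nan
    return sum(target_choices) / len(target_choices)


# Invented session: six early errors followed by eight correct responses.
session = [False] * 6 + [True] * 8

criterion_trial = trials_to_run_criterion(session)  # run criterion met on trial 14
accuracy = session_accuracy(session)                # 8/14, roughly 0.57
meets_cumulative = accuracy > 0.70                  # False for this session
```

In this hypothetical session the consecutive-response criterion is satisfied despite the early errors, whereas cumulative accuracy remains below 70%, which is the sense in which the PRLT criterion is more forgiving.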

Of considerable importance, also, in the comparison of past and present findings is the use of probabilistic reward contingencies by the PRLT, versus the deterministic contingencies employed during previous study (Moran et al., 2014). The critical point of difference between these two schedules is the introduction of prediction error; probabilistic learning tasks frequently deliver spurious rewards and/or punishments, whereas no misleading response feedback is provided in deterministic tasks (Ragland et al., 2012). Not surprisingly, probabilistic learning recruits additional brain areas that are not strictly necessary for ascertaining deterministic contingencies—e.g., the nucleus accumbens shell and orbitofrontal cortex. Inactivation of either of these structures impairs both initial and reversal learning in a trained version of the PRLT, but exerts either delayed (Dalton et al., 2014) or no effect (Dalton et al., 2016) on performance of an analogous deterministic learning task. Given the absence of histological analysis in the present study, it is unclear whether the involvement of either of these areas in the PRLT could have in any way mitigated the cognitive deficits of the HIVtg rat. Nevertheless, the apparent conditionality of these deficits warrants study of those areas that mediate the subtler aspects of PRLT performance.
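The difference between deterministic and probabilistic schedules can be sketched as a small Python simulation. The reward probabilities used below (0.8 for the target and 0.2 for the non-target) are hypothetical values chosen for illustration, since the task parameters are not restated in this section; the function names are likewise assumptions.

```python
import random
from typing import List


def deliver_feedback(chose_target: bool, p_target: float, p_nontarget: float,
                     rng: random.Random) -> bool:
    """Return True (reward) or False (punishment) for a single response."""
    p_reward = p_target if chose_target else p_nontarget
    return rng.random() < p_reward


def misleading_fraction(choices: List[bool], p_target: float, p_nontarget: float,
                        seed: int = 0) -> float:
    """Return the fraction of trials with misleading feedback, i.e. a punished
    target response or a rewarded non-target response."""
    rng = random.Random(seed)
    misleading = 0
    for chose_target in choices:
        rewarded = deliver_feedback(chose_target, p_target, p_nontarget, rng)
        if rewarded != chose_target:
            misleading += 1
    return misleading / len(choices)


# Alternate target and non-target responses over many simulated trials.
choices = [True, False] * 5000

deterministic = misleading_fraction(choices, p_target=1.0, p_nontarget=0.0)
probabilistic = misleading_fraction(choices, p_target=0.8, p_nontarget=0.2)  # assumed values
```

With deterministic contingencies no feedback is ever misleading, whereas under the assumed 0.8/0.2 schedule roughly one trial in five delivers feedback that contradicts the underlying contingency.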

The absence of genotype-mediated motivational alteration in the PRBT was somewhat unexpected, given a previous report that female HIVtg rats displayed reduced levels of responding for sucrose on both fixed and progressive ratio schedules (Bertrand et al., 2018). Critically, however, these rats were not food or water restricted, whereas the present cohort was maintained at ~90% free-feeding body weight for comparability to the PRLT findings. It is therefore possible that the previously documented motivational deficit of the HIVtg rat was satiety-dependent, and not prominent following food restriction.

The lack of a significant main effect of HIV genotype on BPM activity (rearing behavior notwithstanding; Fig. 5a–f) similarly contrasts with previous reports of HIV-mediated motor aberrations. Direct comparisons are difficult, however, given the considerable inter-report variation regarding the nature and degree of abnormality. Some studies report reductions to overall activity and distance traveled (June et al., 2009; Midde et al., 2011; Reid et al., 2016), while others describe more nuanced behavior dependent upon session bin and/or field region (Moran et al., 2013; Nemeth et al., 2014); one study even reports elevated overall activity in HIVtg rats (McLaurin, Cook, et al., 2018). Comparison is complicated further by the general absence of sex × genotype analyses in the extant literature. Given this heterogeneity of findings, it is improbable that the present BPM data indicate any gross behavioral variation from the general HIVtg population. Critically, the general absence of exploratory/motor abnormalities in the present cohort supports our interpretation that HIVtg behavior in the PRLT was not attributable to reduced movement speed or reluctance to initiate movement.

An advantage of the paradigms used in the present study is that they can be administered to humans, thereby enabling direct cross-species comparison (Bismark et al., 2017; Young et al., 2016). Indeed, our group has recently identified probabilistic learning deficits amongst PWH with detectable viral loads and methamphetamine dependence relative to virally suppressed PWH and HIV-negative methamphetamine-dependent individuals; interestingly, no effect of HIV diagnosis alone was observed (unpublished observations). The present findings of apparently intact PRLT performance therefore validate the HIVtg rat as a model of the cognitive effects of non-replicative HIV infection, although reproduction with larger sample sizes is necessary. An important caveat to this interpretation is that the present cohort of HIVtg rats was not maintained on cART regimens, as virally suppressed PWH would be. The cognitive effects of cART have yet to be characterized in the HIVtg rat, although daily administration of ARTs for 3 weeks induced cognitive impairments in healthy mice (Pistell et al., 2010). This knowledge gap reduces the applicability of the present study (and indeed, any study utilizing cART-untreated animal models of HIV) to clinical research, and represents a significant need for future investigation.

In summary, male and female HIVtg rats required fewer trials than controls to ascertain initial probabilistic reward schedules in the PRLT and were able to detect and adapt to the same number of reversals of reward contingency. Secondary behaviors suggested a speed-accuracy trade-off strategy that may have enabled HIVtg rats to maintain relatively optimized rates of criterion acquisition. Genotype did not affect effortful motivation in the PRBT, and the BPM detected minimal locomotor alterations. Altogether, these findings suggest that HIVtg rats demonstrate proficient performance of within-session reversal learning tasks operating on probabilistic reward contingencies and that their performance of such tasks is not appreciably impacted by motivational or motor deficits.