Do Virtual Environments Close the Gender Gap in Participation in Question-and-Answer Sessions at Academic Conferences? In Search of Moderation by Conference Format

Virtual conferences come with big hopes for inclusion in science. They have cheaper registration fees, require no travel or lodging costs, and often have events that allow for global participation. Although these shifts made strides for international researchers and researchers with fewer resources, it is not yet known how the shift to a virtual format has affected women's experience of conferences, specifically in question-and-answer (Q&A) sessions. In-person conferences show robust evidence of men participating in Q&A sessions more than women across disciplines (Hinsley et al., 2017; Jarvis et al., 2022; Käfer et al., 2018; Pritchard et al., 2014; Schmidt & Davenport, 2017; Telis et al., 2019). It could be the case that the structural shifts in online Q&A sessions successfully disrupt the masculine defaults present during in-person Q&A sessions, creating a more welcoming environment that allows for equitable gender participation. However, it is also possible that the gendered power dynamics are so pervasive that disproportionate participation persists despite the changes in format. Due to the onset of the COVID-19 pandemic, virtual conferences have become not only more popular and widespread, but necessary, creating an opportunity to more easily study them in their various instantiations. This paper seeks to investigate the impact of virtual conferences on disproportionate participation by gender in online Q&A sessions.

Impacts of Structural Shifts on Behavior

Structures and embedded systems have profound impacts on behaviors. People tend to act in ways that are consistent with and supported by the existing structures, called “channel factors” (Lewin, 1947; Ross & Nisbett, 1991). By altering structures within situations, it is possible to reduce friction and open channels to make different behavioral responses more likely (Lewin, 1952). For example, changing default options on retirement enrollment plans and appointments to receive flu shots so that the default is to be enrolled and have appointments increases participation in both programs (Carroll et al., 2009; Chapman et al., 2016; Leventhal et al., 1965). In this way, minor structural changes to the choice architecture can have meaningful impacts in how people engage in decision-making processes.

However, not all existing structures are as gender-neutral as the ones described above. One process by which fields and spaces stay majority-male is by relying on masculine defaults. Masculine defaults are aspects of systems that reward behaviors that are more associated with men and masculinity (Cheryan & Markus, 2020). Examples of masculine defaults in the workplace include policies that require self-nominations for promotion or awards (Kang, 2014) or that value individual contributions over collaboration and teamwork (Diekman et al., 2010). Such policies reward individualism and self-promotion, which are traits more commonly found among men (Cheryan & Markus, 2020; He et al., 2019; Moss-Racusin & Rudman, 2010). Masculine defaults put women in a bind: they must either conform to the masculine defaults and risk backlash for counterstereotypical behavior (Heilman, 2012; Rudman & Glick, 1999), or act in more feminine ways and be less likely to have their contributions acknowledged or valued.

Virtual conferences provide a unique opportunity to consider alternative ways to organize Q&A sessions and to test the impact of structural shifts in the conference format on participation. Rather than engaging in person with the speaker by getting in line or raising a hand, question-askers can be more distanced. Often reduced to their name or a thumbnail, participants may experience more social distance when preparing to ask a question. Chat-based question-asking and asynchronous formats may ease the barriers to entry to ask a question and increase the opportunities for questions to be answered.

What to Expect in Virtual Q&A Sessions

Disproportionate participation by men may be reduced to the extent that virtual Q&A sessions do not reward the same masculine defaults as traditional Q&A sessions. For example, during in-person Q&A sessions, individualism is rewarded when question-askers prioritize the answering of their own questions over what questions might be most informative for the full group. In contrast, in a virtual environment, time and space are less constrained due to the near infinite availability of chat space. Question-askers can therefore voice their own questions with less risk of crowding out the questions of others. Virtual contexts that incorporate chat messages also do not reward the assertiveness inherent in vying for space in line to ask questions or seeking to be called on first (i.e., before time runs out), because whether a question gets asked depends mainly on whether the question-asker chooses to type it. Rather, whether a question gets answered may be more within the discretion of the speaker or moderator. In chat messages, question-asking becomes a less visible action, which could reduce the value of participating as a way to garner status while simultaneously making the chat a more open and productive space for participation.

Although the creation of online discussion presents an opportunity for more equitable participation, it could also be the case that the same gendered communication dynamics are reproduced in this new environment. Previous research on gender and online communication finds that men are more likely to participate in online forums and online political debates (Albrecht, 2006; Baek et al., 2012). Sociolinguistic theory posits that men communicate to build status while women communicate to build rapport, and these dynamics are reproduced online (Gefen & Ridings, 2005) and are not sensitive to gender base rates (i.e., proportion of men and women attendees). In an undergraduate online discussion board for a psychology course where women outnumbered men 3:1, women were more likely to use hedges and express agreement whereas men were more likely to use authoritative language and express disagreement (Guiller & Durndell, 2006).

In a context similar to the present investigation, researchers investigated participation in Stack Overflow by gender. Stack Overflow is an online forum in which people can post questions related to computer coding and receive crowdsourced answers. In general, men were overrepresented in Q&A posts and answers, consistent with their higher status over women (Vasilescu et al., 2012). Vasilescu et al. argue that these gender differences could be due to the fact that status is driven through competitively earning prizes and the speed of response, and that men in that space do not consider sexism to be a problem (Begeny et al., 2020). However, if women encountered more women when they first joined the platform, they engaged more quickly in the community (Ford et al., 2017).

In contrast, the volubility (i.e., the amount one speaks or writes) of communication by gender could be flipped, wherein women participate more than men, in forums that are designed to be supportive rather than status seeking. In one study, women participated more in online cancer support groups compared to men (Ginossar, 2008). The researchers suggest that this effect is driven by the purpose of these groups to be supportive, which is more similar to the ways in which women communicate offline. However, a meta-analysis on support group communication found the support for this gendered effect to be mixed (Mo et al., 2009). In general, the existing research suggests that communication dynamics observed in in-person interactions are manifested virtually as well.

To date, work on virtual communication has primarily focused on online written communication. Little is known about the gendered experience of online video communication in terms of individual volubility and disproportionate gender participation. This paper seeks to address this gap in the literature by examining the impact of gender communication in both written and video formats.

Current Program of Research

A recent paper investigated gendered participation in Q&A sessions at in-person conferences and developed a methodology by which to evaluate how participants asked their questions (Jarvis et al., 2022). Specifically, we used a recorded small conference to explore trends in participation and confirmed our hypotheses with data from a larger recorded conference. Two research assistants watched all of the recorded Q&A sessions, timed the lengths of the questions and answers, and coded the questions for various behaviors. As predicted, men were more likely to initiate Q&A interactions, were more likely to be in the first group to ask questions, and took more total time, though they did not ask longer questions or significantly differ in any other question-asking behaviors. When asked to reflect on their experiences in Q&A sessions, men reported feeling more comfortable participating and women reported greater fears of backlash (also see Carter et al., 2018; Sandstrom et al., 2022). Women’s fears of backlash seem to be driven by their negative perceptions of the extent to which others in the audience are hostile and critical (Jarvis & Kray, 2022).

In the present studies, we use and build on this established paradigm to (1) test gender differences in Q&A behaviors across a variety of contexts to investigate the impacts of moving to a virtual environment on gender disparities in participation, and (2) test which structural factors potentially related to masculine defaults can exacerbate or mitigate gender disparities in participation using variations in Q&A formats between and within conferences. Study 1 provides an initial test on the impacts of moving to a virtual environment on Q&A participation by gender and was used in an exploratory manner to determine which of the two competing hypotheses would be most likely. In Study 2, we test the same conference as the exploratory conference from Jarvis et al. (2022) one year later. Though small, this conference provides the most direct test of the impacts of moving to a virtual format from an in-person format on gender disparities in participation by comparing gendered participation within the same organization.

In Study 3, the conference was majority women and participation was only available via text-based posting. This allows for a test of whether gender disparities are mitigated in the circumstances theoretically most amenable to encouraging participation from threatened groups. In a text-based communication medium, numerical representation and the opportunity space to engage are high while the visibility of the question-asker is low. If gender disparities persist, this would be the most conservative test of the impacts of structural aspects. The conference tested in Study 4 was a large conference, functioning as the main conference for an entire field. The Q&A format varied widely across sessions within the conference, allowing for exploration of which circumstances are most likely to reduce (or exacerbate) gender disparities in participation.

All materials, codebooks, data, and analysis code are available on the Open Science Framework (https://osf.io/b37et/). All studies were reviewed and approved by an Institutional Review Board prior to data collection.

Study 1: Exploratory Conference

Method

Data Source

This conference was a small, topical conference that varied from being single (i.e., all attendees see the same set of talks) or dual track (i.e., attendees could choose between one of two talks) across three conference days. It was attended by 243 attendees (103 men, 134 women, 6 unidentified) and had 23 speakers (16 men and 7 women). The conference occurred on Zoom and was recorded and posted online. There were 54 Q&A interactions across 20 research talks. Four research talks either did not have Q&A or had technology failures and were not recorded. Most questions were asked on video during the sessions, though some were asked via chat and answered on video. We were unable to collect the chat messages from the conference organizers, so Q&A interactions were only counted if they were answered on video. Further, six interactions were not included in analyses because the questions were asked via chat and were not attributed to the question-asker when answered over video.

Gender

Gender base rates for the conference were determined by obtaining a participant list from the conference organizers and searching for pronouns and gender presentations online on professional, personal, or social media webpages. Self-identified pronouns were prioritized (44%), followed by gender presentation in photos (47%), and lastly by their name (5%). We were unable to identify gender identities for 4% of participants. The gender of the question-askers was coded by two raters based on their gender presentation (Cohen’s Kappa = 1.00).

In our gender identification efforts, we prioritized self-identified pronouns; however, these could not be identified for every conference attendee. Basing gender on presentation could create a reliance on gender as a binary and assumes that question-askers identify with the gender they present as. Raters had the option to select a non-binary option, although they never chose to use it. Though this method leaves open the possibility of misgendering question-askers who identify outside the gender binary, we expect the instances to be relatively few and therefore to pose a low risk of biasing the results.

Qualitative Coding

Two research assistants watched the Q&A sessions from each talk and coded behaviors during the Q&A sessions according to the codebook from Jarvis et al. (2022). The present investigation highlights three key indices of participation.

Initiating Q&A Interactions

A Q&A interaction was counted when a conference attendee who was not the speaker or an author on the paper spoke during the Q&A session. This was analyzed by comparing the proportion of Q&A initiates by gender to the gender base rates of the conference attendees.

Total Time Per Question

Research assistants timed the length of the participant remarks and speaker responses. The times recorded by the two research assistants were considered discrepant if they differed by 5 s or more (weighted Kappa = 0.91). A weighted kappa was used to account for the data being ordinal (Cohen, 1968).

If the difference was less than 5 s, the two measurements were averaged and rounded to the nearest second. If the difference was five seconds or more, a senior member of the team also timed the discrepant measurement and the original measurement closest to the third measurement was selected. The time analyzed was the total amount of time the participant spoke during the interaction including their initial question and any follow up questions or comments.
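The reconciliation rule described above can be sketched in Python. This is an illustration only, not the authors' analysis code: the function name `reconcile_times`, the half-up rounding, and the tie-break toward the first rater are assumptions that the text does not specify.

```python
import math
from typing import Optional

DISCREPANCY_THRESHOLD_S = 5.0  # differences of 5 s or more count as discrepant


def reconcile_times(rater_a: float, rater_b: float,
                    senior: Optional[float] = None) -> float:
    """Return the analyzed duration (in seconds) for one timed segment."""
    if abs(rater_a - rater_b) < DISCREPANCY_THRESHOLD_S:
        # Agreement: average the two timings and round to the nearest second.
        # Half-up rounding is an assumption; the text does not specify it.
        return float(math.floor((rater_a + rater_b) / 2 + 0.5))
    if senior is None:
        raise ValueError("discrepant timings need a third (senior) measurement")
    # Disagreement: keep whichever ORIGINAL timing is closest to the senior
    # team member's timing. Ties go to rater_a (assumption).
    return min((rater_a, rater_b), key=lambda t: abs(t - senior))
```

For example, `reconcile_times(41, 44)` averages the two timings, whereas `reconcile_times(30, 38, senior=36)` keeps the second rater's 38 s because it lies closer to the third measurement.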

Total Q&A Time

Total Q&A time was calculated by aggregating all of the time spoken by conference attendees and separated by gender. The proportion of speaking time in seconds by gender was compared to the proportion of gender base rates at the conference.
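As a minimal sketch of this aggregation, assuming each Q&A interaction is stored as a pair of coded gender and seconds spoken (a hypothetical layout, with toy values rather than study data):

```python
from collections import defaultdict


def speaking_time_shares(interactions):
    """Sum attendee speaking time by gender and return each gender's share."""
    totals = defaultdict(float)
    for gender, seconds in interactions:
        totals[gender] += seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no speaking time recorded")
    return {gender: seconds / grand_total for gender, seconds in totals.items()}


# Hypothetical toy records: (coded gender, seconds the attendee spoke).
toy = [("man", 60), ("woman", 45), ("man", 75), ("woman", 20)]
shares = speaking_time_shares(toy)
```

The resulting shares are the quantities that would then be set against the attendee base rates.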

Results

To examine gender differences in participation, we first computed the gender base rates of conference attendees and then compared them to participation rates. Because men made up 42% of conference attendees, we would expect approximately 42% of questions to be asked by men if there were no gender gap in participation rates. The extent to which men collectively ask more or less than 42% of questions would suggest disproportionate participation. We did not have Zoom room-level participation data, so we used the conference-level gender base rates. As Q&A dynamics had yet to be systematically studied, we treated this study as exploratory and did not have any hypotheses. The direction of the results would influence the hypotheses for the rest of the studies.
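The logic of comparing observed participation to base rates can be illustrated with a one-degree-of-freedom chi-square goodness-of-fit test. The sketch below is a generic version written with the Python standard library. The function name and the counts are hypothetical, and the reported statistics may come from a different variant of the test (for example, one with a continuity correction), so this sketch is not expected to reproduce them.

```python
import math


def chi_square_vs_base_rate(observed_men, observed_women, base_rate_men):
    """Chi-square goodness-of-fit test (df = 1) against attendee base rates."""
    if not 0 < base_rate_men < 1:
        raise ValueError("base_rate_men must be strictly between 0 and 1")
    n = observed_men + observed_women
    expected_men = n * base_rate_men
    expected_women = n * (1 - base_rate_men)
    chi2 = ((observed_men - expected_men) ** 2 / expected_men
            + (observed_women - expected_women) ** 2 / expected_women)
    # For df = 1 the upper-tail p-value equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, not the study's data: 30 of 50 questions asked by men
# at a conference where men make up 40% of attendees.
chi2, p = chi_square_vs_base_rate(30, 20, 0.40)
```

A large statistic indicates that the observed split departs from what the base rates alone would predict.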

Figure 1 presents the proportion of participation for women and men in the Q&A sessions. There were no significant differences in propensity to initiate Q&A interactions (58% men, Nmen = 28; 42% women, Nwomen = 20), χ2 (1, N = 48) = 1.55, p = .214, d = 0.36, 95% CI [-0.22, 0.94], or length of speaking time for questions (Mmen = 60 s, SDmen = 40 s; Mwomen = 51 s, SDwomen = 37 s), b = -0.16, Z = -0.77, p = .444, 95% CI [-0.57, 0.26]. However, men took more total Q&A time than women compared to gender base rates of the conference (62% men, Nmen = 1685.5; 38% women, Nwomen = 1033), χ2 (1, N = 2718.5) = 188.53, p < .001, d = 0.55, 95% CI [0.47, 0.62].

Fig. 1

Proportion of Participation by Men and Women in Q&A Sessions. Note: The first set of columns is the expected rate of participation given the base rates of attendees. The second set of columns is the total amount of time used by gender. Panel A is the results from Study 1. Panel B is the results from Study 2. Panel C is the results from Study 3

Discussion

Though two of the analyses did not show statistically significant differences, likely due to the small sample size, all patterns of results were in the same direction. Not only did men participate more than women compared to what would be expected based on the gender base rates of attendees, men participated more than women outright despite being the numerical minority of attendees at the conference. Due to the consistent pattern of results, based on these exploratory analyses, we decided to preregister null hypothesis tests predicting men would disproportionately participate compared to gender base rates of attendees for all subsequent studies (https://osf.io/h7wfd/).

Study 2: Virtual vs. In-Person Conference Direct Comparison

Study 1 demonstrated a pattern of men over-participating in a virtual conference’s Q&A session relative to base rates of conference attendees, consistent with past research (Jarvis et al., 2022). The current study provides a more direct comparison of in-person versus virtual conferences’ Q&A sessions by examining a conference right before and directly after the switch to virtual format due to the COVID-19 pandemic.

Method

Data Source

This conference was a small, interdisciplinary conference that varied from being single or dual track across two conference days. It was attended by 172 attendees (104 men, 67 women, 1 person could not be identified). Of the conference speakers, 32% were women and 65% were men. The conference occurred on Zoom and was recorded and posted online. There were 34 Q&A interactions across 13 research talks. Two sessions were not coded because the posted recordings were cut before the Q&A occurred (N = 5 questions). Most questions were asked on video during the sessions, though some were asked via chat and answered on video. A unique benefit of this conference is that the previous year’s in-person conference was also recorded and posted online, allowing for a clearer test of how the structural move from in-person conferences to virtual conferences impacted Q&A participation by gender within roughly the same population (see Jarvis et al., 2022, exploratory conference: https://osf.io/9tuvb/).

Gender

Gender base rates for the conference were approximated by monitoring the participant list on the Zoom window in each session and recording their names. Gender was assessed from pronouns listed as part of their name on Zoom (4%, n = 8) or on their personal websites (24%, n = 41); from gender presentation on Zoom video (17%, n = 30), in their Zoom profile pictures (36%, n = 62), or on their personal websites (13%, n = 23); and from their names (4%, n = 7). The gender of the question-askers from synchronous Q&A sessions was coded by two raters based on their gender presentation (Cohen’s Kappa = 1.00).

Qualitative Coding

Synchronous participation of the virtual conference was coded in the same manner as Study 1. Three research assistants timed the questions and responses with good reliability (ICC = 0.98).

Results

Analytic Approach

First, gender differences in participation within the conference were tested using the same approach as Study 1. We hypothesized that there would be gender differences in asking questions, length of speaking time, and total Q&A time. Then, to test the impact of switching from in-person to virtual formats, the gender differences from each format were compared. This analysis was exploratory, and we did not preregister hypotheses for differences between in-person and virtual formats.

Participation in the Virtual Conference

There were no significant gender differences in the propensity to initiate Q&A interactions (74% men, Nmen = 25; 26% women, Nwomen = 9), χ2 (1, N = 34) = 0.73, p = .392, d = 0.29, 95% CI [-0.40, 0.98] or length of speaking time (Mmen = 44 s, SDmen = 45 s; Mwomen = 30 s, SDwomen = 22 s), b = -0.41, Z = -1.14, p = .256, 95% CI [-1.07, 0.34]. However, men occupied more of the total Q&A time compared to women (80% men, Nmen = 1102; 20% women, Nwomen = 268), χ2 (1, N = 1370) = 126.10, p < .001, d = 0.64, 95% CI [0.53, 0.75].

Two authors attended this conference and recorded when each question was asked, by whom, and their gender presentation in addition to the coded data from the recorded videos. Of the five uncoded questions, 4 were asked by men and 1 was asked by a woman. Disproportionate participation results were comparable when including the five uncoded questions (74% men, 26% women), χ2 (1, N = 39) = 1.07, p = .301, d = 0.33, 95% CI [-0.31, 0.98].

Differences Between the In-Person and Virtual Conference

To test for differences in disproportionate participation by gender between the in-person and virtual conferences, we compared the observed gender differences in participation for the in-person and virtual conferences (e.g., 47% in the virtual conference) to what would be expected based on the gender base rates of attendees (e.g., 21% in the virtual conference) using chi-square tests. There were no significant differences in men’s disproportionate participation between the in-person and virtual conferences in propensity to ask questions, χ2 (1, N = 50) < 0.01, p > .999, d = 0.00, 95% CI [-0.56, 0.56], and total floor time, χ2 (1, N = 3621) = 0.14, p = .710, d = 0.01, 95% CI [-0.05, 0.08]. Differences between in-person and virtual conferences in the propensity to ask questions were also comparable when including the five uncoded questions, χ2 (1, N = 53) < 0.01, p > .999, d = 0.00, 95% CI [-0.54, 0.54].

Whether the gender difference in speaking time varied between the in-person and virtual conferences was analyzed with a negative binomial regression including gender (effect coded: -0.5 = men, 0.5 = women), conference type (effect coded: -0.5 = in-person, 0.5 = virtual), and their interaction. There was an effect of gender such that across both types of conferences, men spoke for a longer amount of time compared to women, b = -0.55, Z = -2.61, p = .009, 95% CI [-0.94, -0.12]. There was no effect of conference type, b = -0.36, Z = -1.73, p = .084, 95% CI [-0.78, 0.05], nor a moderation of the gender difference by conference type, b = 0.28, Z = 0.67, p = .501, 95% CI [-0.55, 1.10].
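To make the effect coding concrete: because the two gender codes differ by exactly one unit, the gender coefficient within a given conference format equals the gender coefficient plus the format code times the interaction coefficient. The sketch below applies that arithmetic to the rounded coefficients reported above and assumes the usual log link for a negative binomial model. The function and constant names are illustrative, and the results are approximate because the inputs are rounded.

```python
import math

# Coefficients as reported in the text (log scale, negative binomial model).
B_GENDER = -0.55       # women (0.5) vs. men (-0.5), averaged over formats
B_INTERACTION = 0.28   # gender x conference type

CONFERENCE_CODES = {"in-person": -0.5, "virtual": 0.5}


def simple_gender_effect(conference_type: str) -> float:
    """Gender coefficient within one format under +/-0.5 effect coding."""
    return B_GENDER + CONFERENCE_CODES[conference_type] * B_INTERACTION


for name in CONFERENCE_CODES:
    b = simple_gender_effect(name)
    # exp(b) is the ratio of women's to men's expected speaking time.
    print(f"{name}: b = {b:.2f}, rate ratio = {math.exp(b):.2f}")
```

Under these assumptions the simple effect for the virtual conference works out to -0.55 + 0.5 × 0.28 = -0.41, which lines up with the virtual-only estimate reported earlier in the Results.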

Discussion

The key contribution of this study is the comparison of gendered participation between in-person and virtual conference formats of the same conference series held one year apart. In both years, individually, men took more of the Q&A session time than women compared to what would be expected from the gender base rates of attendees. Analyses comparing the two conferences found that gender differences in participation or speaking time did not significantly vary, suggesting that virtual conferences with video Q&A sessions might not provide a different (or protective) environment for women compared to in-person conferences.

Study 3: Chat-Only Participation

Studies 1 and 2 established the persistence of gender differences in Q&A participation across the total Q&A time at virtual academic conferences, a gap that was comparable to that of an in-person version of the same conference in Study 2 and to previously documented gaps in in-person conferences more broadly (Hinsley et al., 2017; Jarvis et al., 2022; Käfer et al., 2018; Pritchard et al., 2014; Schmidt & Davenport, 2017; Telis et al., 2019). The current study seeks to further explore the possibility that changing the communication medium of Q&A sessions to a text-based format might mitigate this gender gap.

Method

Data Source

This conference was a meeting for a subdiscipline and was attended by 3,888 attendees. All presentations were pre-recorded and available on the conference webpage. Some pre-recordings were scheduled during the conference period for collective watching (73 sessions), and others were available after the conference for asynchronous viewing. Because less emphasis is placed on engagement with poster sessions and asynchronous sessions compared to the “live” symposiums, only the “live” symposiums were analyzed. On the page for each presentation, there was a side-panel with a discussion thread specifically for Q&As and a discussion thread for general chats. Chat messages from the conference were web-scraped from the conference web portal one week after the official conference period ended. Both the Q&A window and chat window were checked for Q&A interactions. An unknown number of sessions engaged in Q&A in the chat window within video calls for the scheduled viewing of their sessions instead of the window on the conference platform. However, most sessions used the Q&A window to some degree, though seven sessions did not have any activity in the Q&A window. Across the conference, there were 750 Q&A interactions (436 at live research sessions) and 591 general chats (15 Q&A interactions were sent via general chat but analyzed with the Q&A interactions). To keep our coding of behaviors consistent with the video coding, if a participant who was not a speaker on the project replied to a Q&A thread, the reply (as well as corresponding speaker responses) was added to a new row and treated as a new Q&A interaction (N = 30). This method was designed to best capture attendee participation.

Gender

Gender base rates for the conference were determined by self-reported gender collected at registration and provided by the conference organizers. The conference was attended by 31% men (N = 1188) and 65% women (N = 2520). The names of all chat senders and presenters were recorded. The gender identities of the chat senders and presenters were determined by looking for their gender identity on professional or social media webpages. Self-identified pronouns were prioritized (78% of chat senders and presenters), followed by gender presentation in photos (21% of chat senders and presenters), and lastly by their name (1% of chat senders and presenters). We were unable to identify gender identities for 3 participants.

Participant Status

Participant status was determined by identifying which career stage they were in at the time of the conference. Web sources with career stages and year markers were prioritized in searches (e.g., CVs, LinkedIn, news articles), but information on university directories, personal web pages, and Twitter was also checked as a best alternative to web pages with time markers to confirm up-to-date positions. Status was classified into 8 categories: undergraduate, post bac (e.g., lab managers, project coordinators), graduate student, post doc, assistant professor (also included adjuncts and lecturers), associate professor, full professor, and industry. Status could not be identified for 48 chat senders and presenters (5% of the sample).

Participation

To adapt to the completely text-based nature of Q&A interactions in this conference, participation was operationalized as engaging in Q&A chats with speakers by gender of the participant and the number of words used per initial question chat message and across all participant chats in the Q&A interaction. We hypothesized that men would disproportionately ask questions, write longer questions, and use disproportionately more of the total chat space. We tested participant status as an exploratory moderator and did not specify any directional hypotheses about how it might interact with participation.

Results

Participation

Figure 1 presents the proportion of participation for women and men in the Q&A sessions. Men were more likely to initiate Q&A interactions compared to what would be expected from the gender base rates of conference attendees (39% men, Nmen = 173; 61% women, Nwomen = 273), χ2 (1, N = 446) = 4.04, p = .044, d = 0.19, 95% CI [0.00, 0.38]. There were no significant gender differences in total chat length (Mmen = 279 words, SDmen = 179 words; Mwomen = 263 words, SDwomen = 165 words), b = -0.06, Z = -1.04, p = .297, 95% CI [-0.17, 0.05]. Across the conference, men provided a disproportionate amount of the total Q&A question text compared to gender base rates (40% men, Nmen = 48,198; 60% women, Nwomen = 71,720), χ2 (1, N = 119,918) = 1684.4, p < .001, d = 0.24, 95% CI [0.23, 0.25].

Moderating Effects of Status

To test the moderating effects of status on gender differences in question length, status was treated in a continuous, linear manner. Question-askers who were identified as being in industry (N = 22 chats) were excluded from analyses because it was unclear how they would fit into the academic hierarchy. Status (centered) was entered into the negative binomial models as a predictor and interaction term with gender (effect-coded). Arguably, adjunct professors and other non-tenure-track faculty roles hold a lower status than tenure-track or tenured faculty in the university setting (Kezar & Sam, 2011; Morling & Lee, 2020). Models were also tested in which non-tenure-track faculty were ranked between post docs and assistant professors, and the results were consistent with those obtained when non-tenure-track faculty were grouped with assistant professors. Given how few non-tenure-track faculty there were, we include them in the same rank as assistant professors in the main text. See OSF for models with non-tenure-track faculty as a separate rank (https://osf.io/ebdgy).
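One way to picture this coding step is the sketch below. It assumes equally spaced integer ranks from 1 to 7 and centering on the sample mean. Neither detail is stated in the text, and the names `STATUS_RANK` and `centered_status` are hypothetical.

```python
# Assumed equally spaced ranks; the text only says status was treated as
# continuous and linear.
STATUS_RANK = {
    "undergraduate": 1,
    "post bac": 2,
    "graduate student": 3,
    "post doc": 4,
    "assistant professor": 5,  # adjuncts and lecturers grouped here
    "associate professor": 6,
    "full professor": 7,
}


def centered_status(statuses):
    """Drop industry, map career stage to a rank, and mean-center the ranks."""
    ranks = [STATUS_RANK[s] for s in statuses if s != "industry"]
    if not ranks:
        raise ValueError("no academic statuses to center")
    mean_rank = sum(ranks) / len(ranks)
    return [r - mean_rank for r in ranks]


# Hypothetical sample of question-askers, not study data.
sample = ["graduate student", "industry", "full professor", "post doc"]
centered = centered_status(sample)
```

The alternative specification would insert an additional rank for non-tenure-track faculty between post docs and assistant professors and shift the higher ranks up by one.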

There was a significant gender by status interaction for the total text used in the Q&A interaction, b = 0.11, Z = 2.52, p = .012, 95% CI [0.02, 0.19], with no main effect of gender, b = -0.05, Z = -0.81, p = .419, 95% CI [-0.16, 0.06], or status, b = -0.03, Z = -1.65, p = .099, 95% CI [-0.08, 0.01]. Upon evaluating the simple effects, the shape of the interaction indicates gender differences among lower status positions in which men write more than women, and gender differences are ameliorated at higher status levels (see Fig. 2). It is worth noting that the raw values for almost all of the status levels except graduate students show directionally that women had longer chat messages than men, suggesting that this interaction could be driven by the particularly large gender differences at that level (see Table 1). In a model testing gender differences in total text length for only graduate students, men wrote more than women, b = -0.24, Z = -3.04, p = .002, 95% CI [-0.39, -0.09].

Fig. 2

Fitted Slopes and 95% Confidence Interval Bands for Total Word Count of Each Q&A Interaction by Gender and Status

Table 1 Simple Effects Test Within Model Testing Gender by Status Interaction on Total Length of Participant Text in Q&A Interactions

Discussion

Study 3 represented what might be the best chance for women to reach equitable participation in Q&A sessions. Conference attendance and conference speakers had women in the numerical majority. In addition, to the extent that the reason for women not participating is due to the stress of using collective space to visibly ask a question (Carter et al., 2018; Holmes, 2013; Jarvis et al., 2022), these fears would be ameliorated in a completely virtual setting. Yet, men asked disproportionately more questions, and wrote disproportionately more of the question words across the conference. In other words, gender differences persisted even in a context that would theoretically be most supportive of women’s participation, providing the most conservative test of the impact of structural aspects of the conference on gender disparities. This durability of gender disparities across modalities and formats speaks to the power of sociocultural factors that discourage women from participating, relative to men, such as gender norms and stereotypes, risk of backlash, heightened anxiety, and lower self-confidence (Brescoll, 2011; Carter et al., 2018; Dupas et al., 2021; Heilman, 2012; Jarvis et al., 2022; Moss-Racusin & Rudman, 2010; Rudman, 1998; Rudman & Glick, 1999).

Academic status moderated total engagement in the Q&A interaction. In particular, at the graduate student level, men wrote more than women across Q&A interactions, and this effect diminished as status increased. Here, it was not that women wrote particularly short messages; rather, men in graduate school wrote much more than any other group. Although some may be concerned that gender gaps in Q&A participation simply reflect men being overrepresented in higher status academic positions, this finding adds to the evidence that this is not the case (Hinsley et al., 2017). If anything, the raw means suggested that women directionally engaged more in the Q&A interactions compared to men, which would suggest that there could be some benefits of a virtual format to women that were largely overshadowed by the disproportionate participation of men in graduate school.

Study 4: Participation Across Q&A Structures

To further investigate the conditions under which women may be more likely to participate, Study 4 used data from a large conference with video and text components. We compared Q&A formats that varied in how public participation would be (e.g., chat participation compared to video participation) and in when questions could be asked (e.g., opportunities to ask questions between talks compared to all questions at the end). Because the formats varied within one conference, we could test the impact of various formats on gendered participation within the same population. These analyses were exploratory in nature.

Method

Data Source

The conference was a large meeting with over 10,000 attendees and served as the main conference for a field. Our intention was to use a web scraper to collect the names of all attendees and use an algorithm to approximate their likely gender from their first names. However, an issue with the web scraper, not recognized until after the conference closed, meant that only the first twenty attendees of each session were recorded (N = 3696). It is unclear how representative this set of attendees is of the entire conference. For this study, we are therefore unable to test how representation differs from base rates. Instead, we test what factors predict participation.

Some symposia were presented synchronously with a live audience, and some were pre-recorded and available for asynchronous viewing. During live sessions, attendees were encouraged to use only the chat box on the conference webpage, rather than the chat within the Zoom window, so that future viewers could see the conversations. Both the video recordings and chat messages were checked for Q&A interactions and were coded separately.

The conference included several types of events including research symposia, professional development workshops, award ceremonies, discussions, and networking events. In order to be comparable to the other conferences, only the research symposia in which speakers presented novel research with time allotted for a Q&A session (either between talks or at the end of all of the talks) were coded and included in the analyses. Across the 1576 conference events, 861 events met our criteria for being research talks, and among those, 109 were live and included Q&A sessions.

At this conference, symposium chairs could affiliate their session with a topical division (N = 21) related to the subject matter of the research. This allowed conference attendees to more easily find sessions within their specific subfield or area of interest. Sessions could be associated with up to 7 divisions, and on average were associated with 1.4 divisions. We expect that conference attendees know and interact more with people in their division than others due to the research similarities, allowing it to be a proxy for the culture at the Q&A session.

The Q&A format varied by symposium, as determined by each symposium organizer. Across the included symposia, there were a total of 601 Q&A interactions. Some symposia had Q&A between each talk (64%), while others saved all Q&A for the end (36%). Symposia differed in whether Q&A interactions were only initiated live (21%), only via chat (22%), or included both formats (55%).

All chat messages sent through the conference portal were web-scraped during the last week the conference was available at the end of the three-month conference period. A total of 14,311 chats were sent across all conference events. Only chats sent during live sessions with Q&A sessions (543 chats) were analyzed for this study.

Gender

The gender of the question-askers from live Q&A sessions was coded by two raters based on their gender presentation (Cohen’s Kappa = 0.97).
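As a brief illustration of the agreement statistic reported above, the following is a minimal sketch of Cohen's kappa for two raters. The function name and the ten example codes are hypothetical and are not data from the study; the sketch also assumes the raters do not both assign a single identical label to every item, in which case chance agreement equals 1 and kappa is undefined.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: summed products of the raters' marginal proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[label] * counts_b[label] for label in counts_a) / n**2
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical codes for ten question-askers (illustration only).
rater_1 = ["W", "M", "M", "W", "W", "M", "W", "M", "M", "W"]
rater_2 = ["W", "M", "M", "W", "M", "M", "W", "M", "M", "W"]
kappa = cohens_kappa(rater_1, rater_2)
```

In this toy example the raters agree on nine of ten askers, and kappa discounts the agreement that would be expected by chance given each rater's marginal label frequencies.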

Q&A Session Coding

Synchronous participation was coded in the same manner as Study 1. Two of three research assistants timed questions and responses with good reliability (ICC = 0.96). Other data about each session were also recorded, including aspects of how the Q&A sessions were structured (e.g., if there was time for questions after each talk or if all questions were at the end; if questions were only sent via chat, asked via video, or both were encouraged; whether the question was sent via chat or video; and the division with which the talk was affiliated).

Results

Analytic Approach

We tested the effect of structural aspects of Q&A sessions on the gender of the question-asker and the quantity of participation (i.e., length of time or number of chat words). Gender of the question-asker was predicted using multi-level logistic regressions, and the quantity of participation was predicted using multi-level negative binomial regressions, each nested by symposium. Tested moderators include: whether questions were asked intermittently in the session or at the end (effect coded: -0.5 = end, 0.5 = middle), whether the question was asked via video or chat (effect coded: -0.5 = video, 0.5 = chat), and whether the format of the Q&A was video, chat, or a combination of the two (reference coded with video as the reference group to compare chat features to the status quo).

For the test of the number of chat words, the moderator for the format of the Q&A included only chat or a combination of video and chat (effect coded: -0.5 = chat, 0.5 = combination of video and chat) because chat questions were not a structurally endorsed mode for question asking in sessions that were intended to have only video questions (this excludes 15 questions). In the participation analyses, the interactions between gender (effect coded: -0.5 = men, 0.5 = women) and all moderators were included to test for gender differences in the impact of format. A model for each topical division was also run to explore how micro-cultures may be linked to gendered participation. These models did not converge with a nested data structure, so they were run without a multilevel framework.
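The effect coding described above can be sketched as follows. This is only an illustration of how the two-level predictors and their interactions with gender could be constructed before model fitting; the helper names and level labels (`"man"`, `"woman"`, `"end"`, `"middle"`, `"video"`, `"chat"`) are hypothetical, and the reference-coded three-level session format is omitted for brevity.

```python
def effect_code(value, low, high):
    """Return -0.5 for the `low` level and 0.5 for the `high` level."""
    if value == low:
        return -0.5
    if value == high:
        return 0.5
    raise ValueError(f"unexpected level: {value!r}")


def design_row(gender, timing, mode):
    """Build effect-coded predictors and gender interactions for one question."""
    g = effect_code(gender, "man", "woman")
    t = effect_code(timing, "end", "middle")
    m = effect_code(mode, "video", "chat")
    return {
        "gender": g,
        "timing": t,
        "mode": m,
        # Interaction terms are products of the effect-coded predictors.
        "gender_x_timing": g * t,
        "gender_x_mode": g * m,
    }


# A woman asking a video question in a session with questions between talks.
row = design_row("woman", "middle", "video")
```

With this coding, each main-effect coefficient is evaluated at the midpoint of the other predictors rather than at a reference group, which is what allows main effects to be read alongside interaction terms.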

In line with the previous studies, we hypothesized that men would write longer chat questions and spend more time asking their questions. Without the gender base rates, we did not evaluate disproportionate participation in this study. All analyses testing structural moderators are exploratory in nature.

Participation

In general, women initiated a higher proportion of the Q&A interactions that took place via chat (53%, N = 361) than of those that took place via video (44%, N = 294), χ2 (1, N = 655) = 5.58, p = .018, d = 0.26, 95% CI [0.04, 0.48] (see Fig. 3). However, no structural aspects of Q&A sessions predicted the gender of the question-asker (see Table 2). Analyzing participation by topical division, four divisions specifically saw gendered participation: three in which women were more likely to participate and one in which women were less likely to participate. These effects held while controlling for the structural predictors (see Table 3). Some structural effects emerged in this model that were not present in the nested model. We do not interpret these significant effects because the nested structure explained 24% of the variance in the model. It is likely that these effects are explained by within-session variance as opposed to between-session variance.
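To make the comparison of asking modes concrete, the following minimal sketch computes a Pearson chi-square test for a 2 x 2 table. Whether the reported test used a continuity correction is not stated, so the uncorrected statistic is an assumption here, and the cell counts are approximations rebuilt from the rounded percentages above rather than the study's actual counts.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 count table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Approximate counts rebuilt from rounded percentages (an assumption):
# rows are chat and video questions, columns are women and men.
counts = [[191, 170], [129, 165]]
statistic, p_value = chi_square_2x2(counts)
```

Because the counts are reconstructed from rounded percentages, the resulting statistic should be close to, but not identical to, the reported value.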

Fig. 3 Number of Initiated Q&A Interactions by Asking Mode and Gender

Table 2 Likelihood of Initiating a Q&A Interaction by Gender Regressed on Structural Factors
Table 3 Likelihood of Initiating a Q&A Interaction by Gender Regressed On Each Significant Division Controlling for Structural Factors

Length of Speaking Time

There were no significant gender differences between men (M = 46 s, SD = 46 s) and women (M = 41 s, SD = 40 s) on how long participants talked during the Q&A sessions in the video portions of the Q&A, without controlling for the structural moderators, b = -0.06, Z = -0.82, p = .41, 95% CI [-0.22, 0.09]. Of the structural moderators, one was significantly moderated by gender: whether questions were asked intermittently throughout the session versus all questions at the end, b = 0.35, Z = 2.29, p = .022 (see Table 4). There was a main effect where questions were longer when saved for the end as compared to when asked throughout the session, b = -0.22, Z = -2.24, p = .025, and there were no significant main effects of gender, b = 0.00, Z = 0.01, p = .989. Reviewing the pattern of effects, men asked longer questions when all questions were saved for the end of session compared to when questions were throughout the session, b = -0.39, Z = -3.36, p < .001, whereas women did not change the length of their questions based on when they occurred, b = -0.04, Z = -0.35, p = .725 (see Fig. 4).
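The link between the reported interaction and the simple effects can be checked with a short calculation. Under the effect coding used here (-0.5 = men, 0.5 = women), the slope of the timing predictor for each gender is the main effect plus the interaction multiplied by that gender's code. The sketch below uses the rounded coefficients reported above; the exponentiated values assume the negative binomial models used a log link, which is the usual default but is not stated in the text.

```python
import math


def simple_slope(b_moderator, b_interaction, gender_code):
    """Slope of the timing predictor at one effect-coded gender value."""
    return b_moderator + b_interaction * gender_code


b_timing = -0.22       # reported main effect of timing (-0.5 = end, 0.5 = middle)
b_interaction = 0.35   # reported gender by timing interaction

slope_men = simple_slope(b_timing, b_interaction, -0.5)
slope_women = simple_slope(b_timing, b_interaction, 0.5)

# Assuming a log link, exp(slope) is the ratio of expected speaking time for
# questions asked mid-session relative to questions held for the end.
ratio_men = math.exp(slope_men)
ratio_women = math.exp(slope_women)
```

The recovered slopes agree with the reported simple effects for men (b = -0.39) and women (b = -0.04) up to rounding of the published coefficients.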

Table 4 Length of Speaking Time Regressed on Gender, Structural Factors, and Their Interaction
Fig. 4 Total Word Count of Each Q&A Interaction by Gender and Whether the Question was Asked in a Session Where Questions were Asked Intermittently or at the End Using the Fitted Values

When analyzing gender differences in question length by division, six divisions showed interactions (see OSF: https://osf.io/bszef). Similar to what was seen with overall participation, micro-cultures within particular topics may have more of an influence over how question-askers participate.

Length of Initial Question Text

There was a significant effect of gender on word length of question text, b = -0.18, Z = -2.63, p = .008, 95% CI [-0.31, -0.04], with men (M = 239 words, SD = 155 words) writing longer chat questions than women (M = 195 words, SD = 107 words). Including all of the moderators and their interactions with gender in the model, there were no significant interactions between gender and the moderators, indicating that there were not any structural or cultural effects on gender differences in chat length (see Table 5). Additionally, including all of the structural factors washed out the gender effect, suggesting that the structural predictors could be explaining some of the same variance as gender. In the analysis testing gender effects within each division, only one division had a significant gender by division effect. Given that it was only one division of 21, it could be possible that this effect was a false positive and the micro-culture of divisions had little effect on chat question length.
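The false-positive concern can be illustrated with a rough calculation. The sketch below assumes 21 independent tests, each evaluated at a significance threshold of .05, and that no true effect exists in any division; neither the threshold nor independence is stated in the text, so both are simplifying assumptions.

```python
def prob_any_false_positive(n_tests, alpha=0.05):
    """P(at least one false positive) across independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests


n_divisions = 21
# Chance that at least one division appears significant when no effect exists.
familywise_rate = prob_any_false_positive(n_divisions)
# Expected number of spuriously significant divisions under the same assumptions.
expected_false_positives = n_divisions * 0.05
```

Under these assumptions, about one spuriously significant division would be expected, and the chance of observing at least one is roughly two in three, which is consistent with treating a single significant division cautiously.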

Table 5 Length of Question Text Regressed on Gender, Structural Factors, and Their Interaction

Discussion

Counter to our predictions, there were minimal to no effects of changes in Q&A formats on gendered participation. There were no gender differences on speaking time of questions, replicating effects seen in live conferences (Jarvis et al., 2022). However, men wrote longer questions in chats compared to women, replicating Study 3. Of note, within-session effects explained almost a quarter of the variance in whether there would be gendered participation, but did not explain variance in models predicting speaking length or number of words. It appears that some divisions had either inhibitory or encouraging effects on women’s likelihood of participation, though not for how long they participated, if they participated. This pattern suggests that immediate environments may play an integral role in women’s decision-making process for whether to participate. Without gender base rates, it is impossible to distinguish if this effect is a result of women being more likely to attend particular sessions. Should this effect be driven by selection effects, it would be worth considering why women are more comfortable attending particular sessions and how that relates to the culture within that particular research topic.

General Discussion

Across four studies we tested gender differences in participation in Q&A at virtual conferences and the formats that could inhibit or encourage participation (see Table 6). In Study 1, men used more of the collective time than women as compared to what would be expected by base rates and more than women outright even though they were in the numeric minority. In Study 2, men’s disproportionate participation did not significantly differ between in-person and virtual formats within the same conference before and during the pandemic. In Study 3, within an all-chat participation format, men asked disproportionately more questions. Additionally, greater volubility was moderated by status, such that graduate students had large gender differences, with men writing longer questions than women, that were not observed at higher statuses. In Study 4, alterations to how the Q&A sessions were structured had minimal impact on gender differences in Q&A participation. However, there was some evidence that environmental effects due to division-level differences could have particularly inhibitory or encouraging effects on women’s participation.

Table 6 Summary of Descriptive Statistics, Key Analyses, and Characteristics Across Studies

The finding that men take disproportionately more Q&A space is consistent with past research studying Q&A sessions at in-person conferences (Hinsley et al., 2017; Jarvis et al., 2022; Käfer et al., 2018; Pritchard et al., 2014; Schmidt & Davenport, 2017; Telis et al., 2019). Gender effects across modalities appear to be pervasive. Past work has also found that men have greater volubility at higher status levels whereas women do not change their participation by status (Brescoll, 2011). In contrast, this study found that men at lower statuses had greater volubility than low status women; this difference was not seen at higher levels. It could be the case that academic men have learned to step back or, more cynically, were less likely to engage in the virtual modality in general.

Understanding factors that promote and inhibit volubility is important because volubility influences impression formation in interpersonal and group settings. Verbal behavior influences how actors are judged by others, enabling inferences about enduring traits (e.g., dominance) and momentary states (e.g., self-assuredness). How much or how little people speak affects their perceived competence and whether they attain influence in group settings (Anderson & Kilduff, 2009). In this way, volubility contributes to the formation of social hierarchies, with a willingness to speak up signaling dominance and power to others (Mast, 2002; Pfrombeck et al., 2023; Schmid Mast, 2001). If men speak more than women in social settings, then it reinforces gender-based status hierarchies (Ridgeway, 2001). The present set of studies finds that virtual conferences may contribute to the maintenance of these hierarchies through men participating more than women.

Limitations and Future Directions

This research is limited by the methodological constraints of operating within a gender binary. Gendered experiences are not limited to two genders but span a multitude of categories and self-identifications. However, for the purpose of being able to make meaningful comparisons, we limited analyses to only men and women. That said, our work was trans-inclusive. Trans people who identified as men or women were included in their self-identified categories. For participants whose self-identified gender we could not find, gender was coded based on the asker’s gender presentation. Although perceived gender is not always congruent with self-identified gender, our coding would tend to match the average perceptions of other members of the audience. To the extent that participation is driven by expectations of backlash due to gender or gender expression or perceptions of who visibly participates, gender presentation is an important aspect of how gender identity is measured (Carter et al., 2018; Jarvis & Kray, 2022). Future research should examine the experience of Q&A sessions among gender non-binary people in mainstream and queer spaces more specifically.

Another key limitation of this work is that it is observational. Certain conference attendees could choose to attend conferences with particular formats, which may impact whether gender differences in participation outcomes emerge, thus limiting our ability to assess causal relationships. The relative consistency between the virtual and in-person effects suggests that this may not be too much of a concern. In addition, the conclusions drawn from these data are limited by the conferences that were studied. We did our best to sample conferences with varying formats, sizes, lengths, and gender representations of attendees to allow these results to be as generalizable as possible. Coding conferences is time-intensive and access can be difficult. More work should be done to examine virtual conferences across fields and formats to test the robustness of these effects.

Practice Implications

With the onslaught of virtual conferences at the start of the COVID-19 pandemic came excitement about the benefit of increased access for people restricted by distance, finances, home obligations, health and disability, etc. One might hope or expect that shifts in access to conferences would also translate to changes not just in attendance, but also participation. In the present data, the experience of virtual conferences was not markedly different from the experience of in-person conferences. Men continued to take up disproportionately larger amounts of Q&A time and space compared to what would be expected by base rates. In the largest observed conference, some smaller divisions were deterministic in who participated by gender. It could be the case that gender gaps in participation are driven not by structures but by deep-seated cultural norms and expectations. More work is necessary to understand the gendered experiences and perceptions of Q&A sessions and how they translate to these behavioral outcomes.

Conclusion

Despite increasing interest and attention in issues of gender equity in STEM in the 21st century, significant gender gaps in career attainment remain (Gruber et al., 2021; Llorens et al., 2021; Schmader, 2022). Consistent with gender disparities in power and status in society more generally, the process of “doing science” shows evidence of inequity. The present research focuses on a known gender gap in participation rates in scientific discourse through the asking of questions at in-person academic conferences. Specifically, we conducted a systematic investigation to determine whether moving academic conferences to a virtual environment shifts some of the masculine defaults associated with Q&A sessions that might deter women from speaking up. Despite the potential for structural shifts in how and when questions are asked to mitigate gender gaps in participation rates, the results of this investigation instead provide evidence of their durability. It may be that the very system of convening—whether in person or virtually—to present research and have it interrogated by our peers rewards behavior that is more often associated with men and masculinity than women and femininity. Until these cultural associations are overridden, science may continue to be steered disproportionately by the thoughts and opinions of men. Although virtual conferences may not be the panacea some had hoped they would be, our goal in sharing the present research is to stimulate efforts by those in formal positions of power in organizing conferences, and practitioners more broadly interested in promoting inclusion throughout STEM and society in general, to continue to find avenues for more of women scientists’ voices to be heard.