Over the past decades, gender balance has been widely achieved among PhD graduates (47.9% female) and assistant professors (46.4% female) in the member countries of the European Union (European Commission, 2019). However, larger gender gaps are still found in the transition to associate professorships (40.5% female) and full professorships (23.7% female). In “DACH countries,” or Germany (D), Austria (A), and Switzerland (CH), women’s underrepresentation in the highest professorial ranks is even more evident (European Commission, 2019): Women hold only 25.6% of associate professorships in Germany (19.4% full professorships), 26.1% in Austria (22.7% full professorships), and 33.9% in Switzerland (23.3% full professorships).

The current literature provides important theoretical perspectives and empirical findings that explain women’s persistent low share in academic high-level positions. However, the literature on the role of gender discrimination and potentially counteracting effects in selection processes remains inconsistent: For example, an experiment with U.S. faculty members revealed gender biases in tenure-track hiring processes; however, it was conducted two decades ago, and the findings were limited to the field of psychology (Steinpreis et al., 1999). Further evidence of gender discrimination in academia pertained to evaluations of junior researchers’ work products (e.g., Knobloch-Westerwick et al., 2013), or sexist recommendation and hiring practices (e.g., Dutt et al., 2016; Moss-Racusin et al., 2012; Trix & Psenka, 2003). In contrast, hiring data from U.S. universities (e.g., National Research Council, 2010) and experimental data from U.S. STEM fields (i.e., fields of science, technology, engineering, mathematics) demonstrated new developments: When a female and a male applicant for an assistant professorship position had equivalently strong job qualifications, faculty members favored the female applicant (Ceci & Williams, 2015; Williams & Ceci, 2015). This evidence of female-favoring hiring preferences has fueled the assumption that affirmative action efforts to counter evaluators’ biases might have been internalized by many academics today, at least when it comes to tenure-track hiring (Williams & Ceci, 2015).

Despite these findings, research efforts on the actual use of affirmative action policies in academic hiring practices for high-level positions remain scarce. This lack of research is of high relevance for the following reasons: First, most universities highlight affirmative action policies (e.g., gender-based preferential selection when job candidates are equally competent) in job advertisements for professorships to encourage women to apply. Although previous studies revealed that these policies may successfully attract female job seekers (e.g., Ibanez & Riener, 2018; Nater & Sczesny, 2016), other research revealed negative consequences for female applicants’ job chances (e.g., lower competence ascriptions) when they were associated with affirmative action policies rather than with gender-neutral policies (e.g., Garcia et al., 1981; Heilman & Blader, 2001; Heilman & Welle, 2006). Second, given that women’s representation decreases more strongly as they advance in the academic hierarchy (European Commission, 2019), more research is needed on hiring practices for high-level professorships in permanent positions: Even if women have greater access to lower-level positions today (i.e., assistant professorships), they might still face obstacles when advancing to high-level positions (i.e., associate and full professorships, which in DACH countries are both lifetime appointments; Academic Positions, 2021). Third, evidence on affirmative hiring preferences for female assistant professors is predominantly based on U.S. samples (Ceci & Williams, 2015; National Research Council, 2010; Williams & Ceci, 2015). Although Western European countries, including DACH countries, are culturally similar to the United States (Bosak & Sczesny, 2011), women’s underrepresentation in higher levels of professorship in the European countries is even more evident than in the United States and might thus reflect prevailing discriminatory hiring practices (e.g., Catalyst, 2017).

To address these issues, the current study seeks to contribute to the literature on gender and academic careers by examining the role of applicant gender and affirmative action policies in selection processes for high-level professorships. More specifically, the study presents an experimental framework based on well-established theories, such as the lack of fit model (Heilman, 1983, 2012) and signaling theory (Spence, 1973) to examine whether affirmative action policies either facilitate or hinder discriminatory hiring practices in a simulated selection process. To do so, we assess effects of different selection policies (gender-based preferential selection vs. non-gender-based selection) in job advertisements on evaluators’ ratings of equivalently qualified applicants (female vs. male) for an associate professorship in business administration, a field in which women are still substantially underrepresented. Furthermore, based on implications of the shifting standards model (Biernat et al., 1991), the study distinguishes between different measures of evaluators’ perceptions of applicants’ hireability to assess whether ranking orders are more likely to reveal gender stereotypes towards the female compared to the male applicant than subjective perceptions (i.e., perceived hireability). Finally, the study assesses whether affirmative hiring preferences rather than hiring biases against female applicants for professorships are similarly pronounced in cultural contexts other than the United States by examining simulated hiring decisions in DACH countries.

Gender Stereotypes in the Academic Context

Gender-aware management of higher education institutions includes practices to promote excellence in research by reducing the loss of female talent due to discriminatory processes at different levels of academic careers (European Commission, 2012). These practices positively affect gender-fair policies, selection and hiring processes. Yet, despite great efforts and initiatives to increase women’s share in academia, the proportion of women and men in Europe’s highest professorial ranks remains uneven (European Commission, 2019).

The lack of fit model (Heilman, 1983, 2012) explains how culturally shared gender stereotypes and the underrepresentation of women in (scientific) fields and positions contribute to the challenges that women may face in their career advancement. In most cultures, men (more than women) are associated with agentic qualities, such as achievement orientation, autonomy, and rationality (Heilman, 2012), as are people in male-dominated positions such as leadership or higher-level positions in both business (Koenig et al., 2011) and academic sectors (Carli et al., 2016). In contrast, women (more than men, leaders, and scientists) are typically associated with communal qualities, such as concern for others, deference, and emotional sensitivity (Heilman, 2012). These stereotypical beliefs can negatively affect expectations about women’s performance in male-dominated work domains when they are seen to lack the masculine skills that are perceived to be required for success in these domains (Heilman, 2012).

According to the lack of fit model (Heilman, 1983, 2012), prejudice against women occurs in situations where the masculinity of the work domain (e.g., a high prevalence of male job holders) or the female gender role (e.g., job advertisements that highlight gender-based preferential selection) is pronounced (Heilman, 2012; Koenig et al., 2011). Similar discriminatory effects occur for men in female-dominated work environments (e.g., male nurses), supporting the viewpoint that the gender-typing of work environments typically affects the discrimination experiences of members of social groups that represent a numerical minority (e.g., Eriksen & Einarsen, 2004).

In line with these theoretical implications, empirical evidence revealed that science is in fact more strongly associated with stereotypically male-typed qualities: The science-is-male stereotype was demonstrated, for example, in a large sample of over 175,000 college-educated participants, including scientists, engineers, and physicians (Smyth & Nosek, 2015). Scientists in fields with a lower representation of women showed stronger explicit associations between science and men than scientists in fields with a higher representation of women. Similar results were found by Miller et al. (2015) using survey data of 350,000 participants in 66 nations. Furthermore, studies identifying the women ≠ science stereotype revealed that men and scientists were described as highly agentic in comparison to women, whereas women were described as more communal. Finally, agentic traits were more strongly associated with scientific success than communal traits were (e.g., Carli et al., 2016; Ramsey, 2017).

In previous studies, these gender-science stereotypes were found to be responsible for gender biases (disadvantaging women) in academia, including in performance evaluations (e.g., Knobloch-Westerwick et al., 2013), career support (Moss-Racusin et al., 2012), recommendation processes (e.g., Dutt et al., 2016), and sexist hiring practices (e.g., Steinpreis et al., 1999). For example, in recommendation letters, women are associated with fewer agentic qualities than men (Madera et al., 2009), more expressions of doubt (Madera et al., 2019), and fewer descriptions of being “outstanding” (Schmader et al., 2007). Furthermore, women are less likely to be hired for academic positions, receive fewer recommendations for career mentoring although they are equally qualified (Moss-Racusin et al., 2012), and receive fewer academic grants than men (e.g., Witteman et al., 2019). In the context of publications, women are less often cited than men (e.g., Bendels et al., 2018; Maliniak et al., 2013) and less often invited as reviewers (Nature, 2017).

Furthermore, meta-analytical evidence confirmed that gender biases also continue to be an obstacle to women’s career advancement in the business sector (Koch et al., 2015; Paustian-Underdahl et al., 2014). However, these studies identified factors that could weaken biases in hiring decisions, for example when decision-makers were experienced professionals (in contrast to working adults or undergraduates) and when applicants’ performance records clearly indicated high job competence (Koch et al., 2015) – thus minimizing the scope for interpretations and expectations that are based on gender stereotypes. Accordingly, a series of experiments showed that anti-female biases did not occur among academic professional raters who evaluated hypothetical job candidates for an assistant professorship position. That is, U.S. faculty members in STEM disciplines preferred to hire a highly competent female applicant for a tenure-track assistant professorship position over an equivalently qualified male applicant, at a ratio of 2:1 (Williams & Ceci, 2015). A follow-up study (Ceci & Williams, 2015) demonstrated that the female hiring advantage occurred only in scenarios where the female applicant had equivalent or stronger qualifications and not in scenarios in which she had slightly weaker qualifications than a male counterpart.

According to these female-friendly hiring preferences, the authors concluded that affirmative action efforts (i.e., preferential hiring of the underrepresented group when job candidates have equal qualifications) finally seem to be working together successfully with universities’ meritocratic ideals (i.e., hiring of the most qualified candidates) and that most academics seem to have internalized this ideal today. However, to the best of our knowledge, there has been no explicit experimental testing of the effects of affirmative action policies in hiring decisions for high-level academic positions, such as professorships.

Affirmative Action Policies and Gender Bias

Affirmative action refers to mandatory and voluntary programs and efforts by governments and organizations to reduce discrimination against historically disadvantaged groups and to promote equal educational and employment opportunities (e.g., Crosby et al., 2003). Thus, affirmative action is a temporary measure to ensure that, for example, selection procedures, performance evaluations, and decisions at the workplace are fair and align with the principle of meritocracy (Crosby et al., 2006) by giving preference to members of discriminated groups when candidates have equivalent qualifications. The scientific literature offers various classifications of affirmative action programs that range from quotas to focused recruitment trainings to hiring policies (e.g., Harrison et al., 2006; Heilman & Haynes, 2006). For example, universities can promote gender-forward action in hiring policies to encourage female scientists to apply (Crosby et al., 2003). In DACH countries, some universities highlight their commitment to affirmative action in job advertisements for professorships (e.g., preferential selection of women in cases of equivalent job qualifications), whereas other universities solely highlight their commitment to excellence without mentioning gender-based selection in their job advertisements.

Although affirmative action programs were originally introduced as an equality measure to respond to and repair historical discrimination towards minority groups (e.g., Crosby et al., 2003; Heilman & Haynes, 2006; Kelly & Dobbin, 1998), they have stirred substantial debate across scholarly, policy, and public domains. On the one hand, research showed that affirmative hiring policies successfully attract women job candidates (e.g., Ibanez & Riener, 2018; Nater & Sczesny, 2016). On the other hand, female applicants who believed that they had benefited from gender-based preferential selection (compared to merit-based selection) inferred that others held negative expectations of their competence (Heilman & Alcott, 2001) and also devalued their own leadership capability and task performance (Heilman et al., 1987). Furthermore, affirmative action policies can have negative consequences for non-beneficiaries (i.e., men) who might become demotivated or feel that they have been treated unfairly (e.g., Heilman et al., 1996). Finally, affirmative action can have negative effects on evaluators’ perceptions of minority (i.e., underrepresented) group members’ and women’s job qualifications (e.g., Garcia et al., 1981; Heilman et al., 1992).

The underlying mechanisms of these stigmatizing effects towards minority group applicants have usually been explained by attribution theory (Kelley, 1973). From this viewpoint, stigmatizing effects occur when observers attribute an individual’s achievements to certain causes. For example, when preferential treatment provides an alternative attribution for an applicant’s success, evaluators are likely to discount the importance of the person’s job competencies. In contrast, when a person’s success can be plausibly attributed to performance based on records, evaluators are likely to discount alternative explanations (Crosby et al., 2003). Previous experimental and field studies have confirmed attribution theory’s discounting principle in the context of affirmative action. That is, evaluators rated women or minority job candidates as less competent and less hirable when they perceived them to be preferentially selected based on an affirmative action policy (e.g., demographic characteristic, minority status) compared to a non-affirmative action policy (e.g., Garcia et al., 1981; Heilman et al., 1997; Heilman & Haynes, 2006).

Despite this previous evidence revealing the possible detrimental effects of affirmative action in selection processes, we acknowledge that many of the experimental studies were conducted between the 1980s and the early 2000s. Furthermore, attribution theory (Kelley, 1973) seems to explain why evaluators rate female applicants as less qualified than male applicants after a selection decision was made (and affirmative action was said to be involved in the decision process). Recent evidence of hiring tendencies in favor of academic women has fueled the assumption that today, affirmative action policies might in fact be working in accordance with universities’ meritocratic ideals, at least when professional raters evaluate candidates with clear performance records for high-level positions, and that these policies have positive effects on women’s application intentions (e.g., Ibanez & Riener, 2018; Williams & Ceci, 2015).

Thus, a theoretical alternative to the point of view that women should typically be devalued when being associated with affirmative action follows from implications of signaling theory (Spence, 1973). According to the theory, individuals who are in situations that lack important information (e.g., about job candidates) usually interpret available information as providing signals about what is unknown. For example, when two candidates appear equally qualified for a position, it is difficult to determine who should be hired. In these situations, affirmative action policies might signal to decision makers that their institution values diversity and thus make it easier to decide in favor of the female applicant. In contrast, as scientific excellence is still more strongly associated with stereotyped masculine traits (e.g., Schmader et al., 2007), non-gender-based selection policies with a focus on excellence might signal to decision makers the valuing of men.

In sum, based on previously discussed theories, competing hypotheses could be formulated: First, previous findings based on well-established theories, such as the lack of fit model (Heilman, 1983, 2012), revealed that whether evaluators were biased against women in selection processes depended on the extent to which selection policies evoked or weakened stereotypical perceptions. To test these assumptions in the context of academic high-level positions, the study investigates whether evaluators perceive a female applicant as less qualified than an equally qualified male applicant for an associate professorship when the job advertisement states the university’s commitment to affirmative action (compared to its sole commitment to excellence).

Hypothesis 1. Female applicants are evaluated as less competent than male applicants when the university states a gender-based preferential selection policy but not when the university states a non-gender-based selection policy.

Hypothesis 2. Female applicants are evaluated as less hireable than male applicants when the university states a gender-based preferential selection policy but not when the university states a non-gender-based selection policy.

Second, however, we acknowledge that these effects may no longer occur in high-level hiring processes in the academic sector, given that short-listed applicants for an associate professorship position provide strong performance records and that universities are people-serving institutions that are publicly accountable as well as under stronger pressure to be inclusive of women and minority group members today. Thus, based on implications of signaling theory (Spence, 1973), affirmative action policies might in fact signal to decision makers to decide in favor of the female applicant, whereas non-gender-based policies (focusing solely on excellence and failing to shed light on the underrepresentation of women and minority groups) might signal to decision makers to decide in favor of the male applicant.

Further, the study distinguishes between: (a) evaluators’ perceptions of the female and male applicant’s competence and hireability, and (b) evaluators’ actual ranking order of the applicants. According to the shifting standards model (Biernat et al., 1991), gender stereotypes can impact social judgments by providing category-specific standards for individuals from different social groups. For example, given that leadership is commonly associated with masculine and agentic traits (e.g., Koenig et al., 2011), women can be perceived as highly skilled for leadership compared to most women, but evaluations of women and men may not be directly comparable. Accordingly, objective scales (e.g., ranking orders) are more likely to reveal evaluators’ stereotypical perceptions of applicants than subjective scales (e.g., Likert scales). Subjective scales allow evaluators to base their judgements on within-category rather than across-category standards (Biernat & Fuegen, 2001): A female and a male applicant can both be rated very competent or hireable compared to individuals in their own social group. Yet, when ranking these applicants as first and second choices for a job, evaluators are forced to rate both applicants according to the same evaluative dimension. Thus, given that an initial screening of a pool of equivalently strong applicants might evoke fewer differences in evaluators’ perceptions of applicants’ competence and hireability, stronger tendencies of hiring biases or affirmative hiring tendencies might occur when evaluators reveal their final decision by providing a ranking order of the applicants.

Hypothesis 3. Female applicants will be ranked as the first choice for the professorship less often than male applicants when the university states a gender-based preferential selection policy but not when the university states a non-gender-based selection policy.

Evaluator Gender and Hiring Bias

Previous research does not reveal a clear picture of whether evaluator gender is an important factor in academic hiring decisions. For example, implicit attitudes towards female authority were similarly negative for female and male participants, but women showed less explicit prejudice towards female authority than men did (Rudman & Kilianski, 2000). Meta-analytic findings indicated that male evaluators showed greater hiring biases than female evaluators for male-dominated jobs (Koch et al., 2015) as well as a stronger masculine construal of leadership (Koenig et al., 2011). In contrast, other meta-analyses (e.g., Davison & Burke, 2000) and experimental studies on hiring decisions (e.g., Moss-Racusin et al., 2012) found no differences between female and male evaluators. Therefore, we did not formulate hypotheses on this variable but included evaluator gender as a potential moderating factor in our analyses.

Material and Method

Participants and Design

Data were collected from 481 economic scientists at German (76.7%), Swiss (15.6%), and Austrian (4.4%) universities (see Footnote 1). Of these participants, 408 were academic mid-level faculty members (84.8%; 69 lecturers/post-doc level and 339 research associates/doctoral level), 63 student assistants (13.1%; 59 Master’s level and 4 Bachelor’s level), and 10 professors (2.0%; 6 assistant professors and 4 associate professors). Their ages ranged from 22 to 54 years (M = 29.4; SD = 3.8). Participant gender was largely balanced (47.4% female). The sample is ecologically valid, because in DACH countries, students and mid-level faculty act as members of selection committees for professorships (e.g., Frey et al., 2015). It should therefore be in their interest to appoint the most qualified person for a chair position in their own research department or academic unit. Additionally, a professorship in the field of economics represents a realistic career goal for mid-level faculty members and students in economic sciences.

The study was based on a 2 (selection policy: non-gender-based vs. gender-based preferential selection) x 2 (applicant gender: female vs. male) mixed factorial design, with applicant gender as a within-subjects factor and selection policy as a between-subjects factor. Evaluator gender (female vs. male) was included as a quasi-experimental factor. Participants were randomly assigned to one of the two experimental selection policy conditions.

Procedure and Material

Deans’ offices of economic sciences in Germany, Austria, and Switzerland were asked to forward the study invitation to mid-level faculty members and student assistants. Additional data were obtained by contacting participants via university websites and social networks. Study participation was voluntary, and anonymity was assured. The research was introduced as a web-based study investigating information processing and decision making in academic selection from different psychological angles.

The web-based materials stated that participants’ task was to assume the role of a search committee member, entrusted with the evaluation of three short-listed applicants for an associate professorship position in business administration. The materials comprised background information on the employing university and faculty, a job description, and three curricula vitae (CVs) of fictitious job applicants. Participants reviewed the study materials, before the three CVs were presented to them again for individual assessments of each applicant’s job competence and hireability. The study materials further explained that following the individual evaluations, participants would be asked to rank the candidates as their first, second, and third choices for the professorship position. Finally, participants were given manipulation checks to ensure that they were aware of both the university’s selection policy in the job advertisement and the applicants’ gender distribution. As an incentive, subjects could participate in a prize draw for vouchers for an online bookstore. All procedures performed in the study were in accordance with the ethical standards of the institutional research committee. Informed consent was obtained from all individual participants involved in the study.

Background information on the fictitious university and faculty of economics was adapted from an actual profile of a large Western European university. The university and the faculty were described as having excellent records regarding promotions of young researchers, international research, and teaching activities. Materials subtly noted women’s share among professors of economics (14%); this information appeared in brackets following the total number of professors and provided a realistic picture of the actual gender distribution among professors of economics in Western Europe. It was added to ensure that participants in all experimental conditions had a similar understanding of the vacant position. Job advertisements were based on actual job descriptions for associate professorships in business administration and included job requirements and the university’s selection policy (non-gender-based or gender-based preferential selection).

Application materials included three CVs: two target CVs (CV A and CV B), which were constructed to introduce two equivalently qualified applicants (female vs. male), and a distractor CV (CV C), which was constructed to make this applicant appear slightly less qualified than the target applicants. Following Williams and Ceci (2015), we introduced the distractor CV to make the aim of the study (i.e., gender bias in academic hiring) less obvious and create a more realistic short list for the vacant professorship. Therefore, the distractor CV was always displayed as a male applicant. The three CVs were based on actual applications from recently hired associate professors that the respective professors voluntarily shared to support the study. The CVs used in the study contained information on each applicant’s research focus, career history, teaching, research, and fund-raising activities in similar subfields of business administration. Three expert ratings and a pilot study with 46 mid-level faculty members and economics students revealed that the target applicants were perceived as equally competent and hireable for the advertised professorship position. Applicants’ gender and the university’s selection policy were not indicated in the pilot studies (see Supplement A in the online supplement for detailed information about the pilot test for the equivalence of the target CVs).

The order in which the two target CVs were displayed in the experiment was counterbalanced across study participants. Further, the gender of the target applicants was counterbalanced across participants, so that CV A was indicated to be either a female or male applicant and CV B was indicated to be the other gender, respectively. The male distractor CV was always presented third in the web-based study.
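The assignment scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the survey software used in the study: the function name `assign_materials` and the dictionary keys are hypothetical, and simple independent randomization stands in for whatever counterbalancing procedure was actually used.

```python
import random

POLICIES = ("non-gender-based", "gender-based preferential")
MALE_NAMES = ("Michael", "Daniel")


def assign_materials(rng: random.Random) -> dict:
    """Draw one participant's policy condition and CV layout (hypothetical helper)."""
    # Between-subjects factor: one of the two selection policies.
    policy = rng.choice(POLICIES)
    # Display order of the two target CVs varies; the distractor is always third.
    target_order = rng.sample(["CV A", "CV B"], k=2)
    # Applicant gender is crossed with CV content: either target CV can be the woman.
    female_cv = rng.choice(["CV A", "CV B"])
    male_cv = "CV B" if female_cv == "CV A" else "CV A"
    # The two male first names rotate between the male target and the distractor.
    target_name, distractor_name = rng.sample(MALE_NAMES, k=2)
    return {
        "policy": policy,
        "display_order": target_order + ["CV C"],
        "names": {female_cv: "Katrin", male_cv: target_name, "CV C": distractor_name},
    }


example = assign_materials(random.Random(1))
```

Crossing applicant gender with CV content in this way means that any residual difference between CV A and CV B cannot be mistaken for an effect of applicant gender.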

Experimental Manipulations

Selection Policies

Manipulations of the university’s selection policy were provided at the bottom of the job advertisement. The policy stated either a non-gender-based selection or a gender-based preferential selection policy. The wordings were drawn from actual selection policies used in job advertisements for professorships in DACH countries. The gender-based preferential selection condition highlighted the university’s commitment to affirmative action: “The university is committed to equal opportunities in science and research. The university strives to enhance the share of women by preferentially hiring women with equal abilities, aptitudes, and professional performance for leadership positions and the scientific staff.” The non-gender-based condition highlighted the university’s commitment to excellence: “The university is committed to excellence in science and research. The university strives to promote this excellence by hiring candidates with the highest qualifications for leadership positions and the scientific staff.”

Applicant Gender

Applicants’ gender was manipulated by: (a) gendered German linguistic forms of the applicants’ current and prior job titles (e.g., Assistenzprofessorin or Assistenzprofessor, ‘assistant professor, feminine noun’ or ‘assistant professor, masculine noun’), and (b) gender-specific first names (last names were blacked out). The first names appeared at the top of each CV as personal information. The female applicant (CV A or B) was introduced as Katrin throughout the study, and the male applicants’ names, Michael and Daniel, were counterbalanced between the male target’s CV (CV A or B) and the male distractor’s CV. A pilot study among 225 academics revealed no mean differences between the first names in terms of associations with attractiveness and competence of the name bearer (Rudolph et al., 2007; Rudolph & Spörrle, 1999). Target applicants’ gender was dummy coded (0 = male applicant, 1 = female applicant). (See Supplement B in the online supplement for detailed information about the pilot test for the use of gender-specific names.)

Dependent Measures

Perceived Competence

Three adjective scale items yielded participants’ perceived competence of each of the three applicants: competent, productive, and effective (Heilman & Okimoto, 2008). Participants responded on a visual analogue scale ranging from 0 (not at all) to 100 (very). The items were averaged, so that higher scale values indicated higher ratings of perceived competence. The internal consistencies were α = .88 for target CV A, α = .85 for target CV B, and α = .87 for the distractor CV C.
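To make the scale construction concrete, the following sketch averages the three item ratings into a scale score and computes Cronbach's alpha with the standard formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of the sum score). The function names and the ratings are made up for illustration and are not study data; the study's own computation may have used different software.

```python
from statistics import mean, variance


def scale_score(item_ratings):
    """Average one participant's item ratings (0-100) into a single scale score."""
    return mean(item_ratings)


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item ratings (one row per participant)."""
    k = len(rows[0])  # number of items, here 3
    # Sample variance of each item across participants.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the participants' sum scores.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up ratings for competent, productive, effective (illustration only).
toy_rows = [(80, 75, 85), (60, 65, 55), (90, 95, 90), (40, 50, 45)]
toy_alpha = cronbach_alpha(toy_rows)
toy_scores = [scale_score(row) for row in toy_rows]
```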

Perceived Hireability

Participants indicated the extent to which they would hire each applicant on a visual analogue scale item ranging from 0 (I am very certain not to hire the applicant) to 100 (I am very certain to hire the applicant; Bosak & Sczesny, 2011).

Ranking

Participants were asked to rank the applicants as their first, second, and third choice for the job (Williams & Ceci, 2015).

Control Variables

As participants’ perceptions of applicants’ job qualifications might be related to their own professional experiences (e.g., Koch et al., 2015), participants were asked to provide information on their current academic level (i.e., full, associate, or assistant professorship level, postdoctoral or doctoral level, Master’s level, or Bachelor’s level).

Results

Table 1 shows the mean scores and standard deviations for perceived competence and perceived hireability as a function of selection policy, applicant gender, and evaluator gender. Table 2 shows how often the study participants ranked the hypothetical female and male applicant as their first choice for the associate professorship position in the two selection policy conditions.

Table 1 Means and Standard Deviations for Dependent Measures as a Function of Selection Policy and Evaluator Gender (Controlling for Evaluators’ Academic Level)
Table 2 Proportions for Ranking Applicants as First Choice for the Professorship by Selection Policy Condition and Evaluator Gender

Manipulation Checks and Participant Flow

To examine whether manipulations of applicant gender and selection policies were effective, two manipulation checks were presented at the end of the web-based survey. In web-based surveys, participants’ attentiveness is a concern, and some participants might be motivated primarily by incentives (e.g., Hauser & Schwarz, 2016; Paolacci et al., 2010). To prevent participants from randomly selecting a response option when they in fact had not carefully reviewed the study materials, the response option “I don’t know” was available to the participants for both manipulation checks. Of the 594 participants who completed the survey, nine failed to remember the correct gender composition of the applicant pool. Further, 34 participants who were randomly assigned to the gender-based preferential selection condition falsely indicated that the university used a non-gender-based policy, and 58 participants who were randomly assigned to the non-gender-based condition falsely indicated that the university used a gender-based preferential selection policy. Additionally, the data of 12 individuals were deleted because they had filled out the survey twice. With these exclusions, the final sample comprised 481 participants.
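The participant flow can be retraced with a short Python sketch. It assumes that the four exclusion categories do not overlap, which the reported totals imply but the text does not state explicitly; the dictionary labels are paraphrases, not terms from the study materials.

```python
completed = 594

# Exclusion counts as reported above; assumed to be mutually exclusive
exclusions = {
    "misremembered gender composition of applicant pool": 9,
    "gender-based condition, indicated non-gender-based policy": 34,
    "non-gender-based condition, indicated gender-based policy": 58,
    "filled out the survey twice": 12,
}

final_n = completed - sum(exclusions.values())
print(final_n)  # 481
```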

Main Results

To test Hypotheses 1 and 2, we conducted two three-way mixed analyses of covariance (ANCOVAs), one with the competence rating and one with the hireability rating as the dependent variable. We included applicant gender as a within-subject factor and selection policy as a between-subject factor.

Additionally, we included evaluator gender as a between-subject factor to investigate whether the hypothesized patterns of relationships would be differentially pronounced in male and female evaluators, and we controlled for participants’ academic level.

Perceived Competence

Hypothesis 1 stated that the female applicant would be evaluated as less competent than the male applicant when the job advertisement stated a gender-based preferential selection policy but not when the job advertisement stated a non-gender-based selection policy. The three-way mixed ANCOVA for perceived competence showed that academic level was associated with perceptions of the applicants’ competence, F(1, 476) = 4.00, p = .046, ηp² = .01. Contrary to Hypothesis 1, the results did not reveal a significant interaction between applicant gender and selection policy, F(1, 476) = 2.77, p = .097, ηp² = .01.

However, there was a significant three-way interaction between applicant gender, selection policy, and evaluator gender on competence perceptions, F(1, 476) = 4.62, p = .032, ηp² = .01 (Fig. 1). To decompose the interaction, analyses for female and male evaluators are presented separately. For male evaluators, the two-way mixed ANCOVA revealed no significant main or interaction effects. That is, male evaluators in both selection policy conditions perceived the female and male applicant as equally competent. For female evaluators, however, there was a significant interaction between applicant gender and selection policy on competence perceptions, F(1, 225) = 7.08, p = .008, ηp² = .03. Simple effects analyses showed that in the gender-based preferential selection condition, female evaluators rated the female applicant (M = 88.90) and the male applicant (M = 88.54) as equally competent, p = .542. However, in the non-gender-based condition, female evaluators rated the female applicant (M = 88.98) as significantly more competent than the male applicant (M = 86.15), p < .001.

Fig. 1 Perceived job competence of the female and male applicant by selection policy and evaluator gender, controlled for academic level
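The partial eta squared values reported for the competence analyses can be cross-checked from the F ratios and their degrees of freedom, using the standard identity ηp² = (F · df_effect) / (F · df_effect + df_error). The sketch below is only a plausibility check on the rounded values in the text, not a reanalysis; the function name and the effect labels are illustrative assumptions.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# (label, F, df_effect, df_error, reported value), taken from the text above
reported_effects = [
    ("covariate: academic level", 4.00, 1, 476, 0.01),
    ("applicant gender x selection policy", 2.77, 1, 476, 0.01),
    ("applicant gender x policy x evaluator gender", 4.62, 1, 476, 0.01),
    ("female evaluators: applicant gender x policy", 7.08, 1, 225, 0.03),
]

# Round to two decimals, the precision used in the text
recomputed = {
    label: round(partial_eta_squared(f, df1, df2), 2)
    for label, f, df1, df2, _ in reported_effects
}
for label, value in recomputed.items():
    print(f"{label}: {value:.2f}")
```

Rounded to two decimals, the recomputed values agree with the reported ones, so the F statistics and effect sizes for perceived competence are internally consistent.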

Hireability

Hypothesis 2 stated that the female applicant would be evaluated as less hireable than the male applicant when the job advertisement stated a gender-based preferential selection policy but not when the job advertisement stated a non-gender-based selection policy. The three-way mixed ANCOVA showed that the covariate academic level was not associated with perceived hireability, F(1, 476) = .02, p = .888, ηp² = .00. Contrary to our hypothesis, the interaction between applicant gender and selection policy on hireability was not significant, F(1, 476) = 2.40, p = .122, ηp² = .01. Thus, we did not find support for Hypothesis 2. The three-way interaction between applicant gender, selection policy, and evaluator gender on hireability was also not significant, F(1, 476) = .24, p = .622, ηp² = .00.

However, the analysis revealed interesting results concerning affirmative hiring preferences. There was a significant main effect of applicant gender, F(1, 476) = 5.29, p = .022, ηp² = .01. Post hoc tests indicated that evaluators rated the female applicant (M = 86.27) as more hireable than the male applicant (M = 81.44), p < .001, regardless of the type of selection policy in the job advertisement. Further, there was a main effect for evaluator gender, F(1, 476) = 4.99, p = .026, ηp² = .01, indicating that female evaluators (M = 85.18) provided higher ratings for hireability than male evaluators (M = 82.53), p = .026. A significant interaction between applicant gender and evaluator gender, F(1, 476) = 3.95, p = .048, ηp² = .01 (Fig. 2), qualified the latter main effect. Simple effect analyses revealed that the female applicant was perceived as more hireable than the male applicant by both female evaluators (female applicant: M = 88.38; male applicant: M = 81.97; p < .001) and male evaluators (female applicant: M = 84.15; male applicant: M = 80.91; p = .003). Still, the preference for the female applicant was stronger among female evaluators. That is, the female applicant was rated as more hireable by female evaluators (M = 88.38) than by male evaluators (M = 84.15; p = .004). In contrast, there were no significant differences in the ratings of the male applicant’s hireability by female evaluators (M = 81.97) and male evaluators (M = 80.91), p = .448.

Fig. 2 Perceived hireability of the female and male applicant by evaluator gender, controlled for academic level

Ranking of Applicants

To test the hypothesized associations between selection policies and applicant ranking, we conducted chi-square tests (χ²). We also investigated the effects of evaluator gender. Overall, the female applicant was ranked as first choice for the job more often (366 times; 76.1%) than the male applicant (105 times; 21.8%). The distractor applicant was ranked as first choice 10 times (2.1%); these rankings were not considered in the following analyses.

Hypothesis 3 stated that female applicants would be ranked as first choice less often than male applicants when the university stated a gender-based preferential selection policy but not when the university stated a non-gender-based selection policy. Results revealed that the type of selection policy was significantly associated with ranking, χ²(1) = 6.54, p = .011. Although in both selection policy conditions the female applicant was ranked first for the job significantly more often than the male applicant, comparisons within each ranking category revealed different patterns for the applicants. That is, the female applicant was ranked first significantly more often in the gender-based preferential selection condition (222 times; 60.7%) than in the non-gender-based condition (144 times; 39.3%). In contrast, the male applicant was ranked first significantly more often in the non-gender-based condition (56 times; 53.3%) than in the gender-based preferential selection condition (49 times; 46.7%). Thus, we did not find support for Hypothesis 3.
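The reported test can be retraced from the first-choice counts given above. The following Python sketch computes a Pearson chi-square for the 2 × 2 table of applicant (female, male) by selection policy (gender-based, non-gender-based). It assumes that no continuity correction was applied, which the text does not state; the function name is illustrative.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) and p value for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of applicant and policy
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With df = 1, the upper-tail probability equals erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# First-choice counts: columns are gender-based, non-gender-based condition
first_choice = [
    [222, 144],  # female applicant ranked first
    [49, 56],    # male applicant ranked first
]

chi2, p = chi_square_2x2(first_choice)
print(f"chi2(1) = {chi2:.2f}, p = {p:.3f}")
```

Under these assumptions the uncorrected statistic rounds to 6.54 with p ≈ .011, which agrees with the reported values.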

Further, we assessed the relations between selection policy, evaluator gender, and first rank. Among male evaluators, there was a significant association between selection policy and their rankings, χ²(1) = 5.32, p = .021. That is, overall, male evaluators ranked the female applicant (177 times; 72.2%) more often as their first choice for the job than the male applicant (68 times; 27.8%). Yet, comparisons within each ranking category revealed that male evaluators ranked the male applicant significantly more often as their first choice in the non-gender-based condition (38 times; 55.9%) than in the gender-based preferential selection condition (30 times; 44.1%). In contrast, male evaluators ranked the female applicant more often as their first choice in the gender-based preferential selection condition (107 times; 60.5%) than in the non-gender-based selection condition (70 times; 39.5%). Among female evaluators, the results revealed no association between selection policies and their rankings of the female and the male applicant, χ²(1) = 1.16, p = .282. That is, overall, female evaluators ranked the female applicant as their first choice (189 times; 83.6%) more often than the male applicant (37 times; 16.4%), but the ranking did not depend on the selection policy condition.
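The tests by evaluator gender can be retraced in the same way. The text reports cell counts for the full sample and for male evaluators only, so the sketch below derives the female evaluators' cells by subtraction; these derived counts are an inference, not figures stated in the article. It uses the shortcut formula for a 2 × 2 table, χ² = N(ad − bc)² / [(a + b)(c + d)(a + c)(b + d)], again assuming no continuity correction.

```python
def chi_square_shortcut(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# First-choice counts as (gender-based, non-gender-based) per applicant
overall = {"female": (222, 144), "male": (49, 56)}
male_eval = {"female": (107, 70), "male": (30, 38)}

# Derived by subtraction (overall minus male evaluators); not reported directly
female_eval = {
    applicant: tuple(o - m for o, m in zip(overall[applicant], male_eval[applicant]))
    for applicant in overall
}

chi2_male = chi_square_shortcut(*male_eval["female"], *male_eval["male"])
chi2_female = chi_square_shortcut(*female_eval["female"], *female_eval["male"])
print(f"male evaluators:   chi2(1) = {chi2_male:.2f}")
print(f"female evaluators: chi2(1) = {chi2_female:.2f}")
```

The derived female-evaluator cells sum to the reported 189 and 37 first-choice rankings, and both statistics round to the reported values (5.32 and 1.16), which supports the subtraction-based reconstruction.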

Discussion

The study investigated the effects of applicant gender and selection policies in academic job advertisements (gender-based preferential selection vs. non-gender-based selection) on applicant perceptions in a simulated hiring process for an associate professorship position in business administration. Overall, the results did not yield strong evidence for discriminatory hiring practices against women among scientists in DACH countries. In fact, the overall pattern of results demonstrated a hiring preference for women both when the university highlighted its commitment to excellence and when it highlighted its commitment to affirmative action.

More precisely, the results were the following: First, we did not find evidence that the female applicant was evaluated as less competent than the equally qualified male applicant. In fact, evaluators perceived the applicants as equally competent in the gender-based preferential selection condition. In the non-gender-based selection condition, the female applicant was perceived as more competent than the male applicant, but only by female evaluators.

Second, the type of selection policy stated in the job advertisement did not affect perceptions of the applicants’ hireability. However, female and male evaluators perceived the female applicant as more hireable than the male applicant. This hiring preference favoring the woman was also more strongly pronounced among female than male evaluators. In contrast, the study revealed no gender differences for ratings of the male applicant’s hireability.

Third, the female candidate was ranked overall as the first choice for the professorship at an approximate ratio of 3:1. Yet, despite this preference for the female applicant, results revealed different patterns for evaluator gender. Female evaluators showed a preference for the female applicant independently of the type of selection policy. Male evaluators also preferred the female over the male applicant across selection policy conditions. However, they ranked the female applicant as their first choice more often in the gender-based preferential selection than in the non-gender-based condition, whereas they ranked the male applicant as their first choice more often in the non-gender-based than in the gender-based preferential selection condition.

In general, these findings on perceptions of competence and hireability seem to be in line with the implications of the shifting standards model (Biernat et al., 1991) and findings from related studies: Both women and men can be perceived as competent when evaluators judge them on subjective scales (e.g., Likert scales; Biernat et al., 1998). One possible explanation for the finding that the female applicant was perceived as even more hireable than the male applicant is the “paradoxical judgement effect” (Biernat et al., 2003, p. 2063): If evaluators judged the performance of the applicants according to the stereotype that men are more successful professors than women, they were likely to define success differently for the applicants. That is, to be seen as very competent or very hireable on a subjective scale, the male applicant might have been expected to provide even stronger performance records than the female applicant (e.g., Biernat et al., 2003).

However, the results also revealed affirmative hiring preferences favoring the female applicant on objective scales (ranking). In general, these results support recent claims that scientists seem to have internalized society’s goal to increase women’s representation in science (Williams & Ceci, 2015). Female and male evaluators were more likely to rank the female applicant as their first choice across selection policy conditions. Yet, the decision patterns by men seem to be more strongly influenced by perceptions of the university’s norm compared to the decision patterns by women. That is, although men ranked the female applicant first for the job more often than the male applicant overall, this hiring pattern was more evident in the affirmative action (gender-based preferential selection) than in the non-gender-based condition. Hence, in their decisions, men seem to rely on the universities’ commitment to support gender equality more strongly in the affirmative action condition. Nevertheless, male evaluators were more likely to support the male applicant in the non-gender-based condition than in the affirmative action condition, implying that male evaluators were still more likely to stereotypically associate excellence with men.

Given gender differences among evaluators, the ranking results were partly in line with the finding of a meta-analysis that female evaluators showed less hiring bias than male evaluators for male-dominated jobs (e.g., Koch et al., 2015) as well as fewer masculine construals of leadership and excellence (e.g., Koenig et al., 2011). Furthermore, at least for male evaluators, the findings seem to be in line with the implications of signaling theory (Spence, 1973): Particularly for non-beneficiaries of affirmative action (i.e., men) and in complex situations where two applicants appeared equally qualified, it was difficult to determine who should be hired. In these situations, affirmative action policies seemed to have made the hiring decision easier for male evaluators. That is, the policy might have signaled to male decision makers that the university values diversity and thus made it easier to support the underrepresented social group (i.e., women) in the hiring processes for the job as associate professor. However, given that the two sets of application materials presented equally strong applicants, competence ratings were not strongly affected by the hiring policies.

Another possible explanation for female-favoring hiring practices lies in the changing cultural construals of good leadership and, furthermore, in the political and organizational pressures to shift toward gender equality (Eagly, 2007). Recent studies from the private sector provide evidence that under certain conditions, a female advantage in male-dominated fields is beginning to emerge. For instance, evidence from field as well as experimental studies reveals that organizations with strong diversity goals are reversing the gender pay gap and rewarding women with higher pay – although this advantage is unique to high-potential women. These findings suggest that highly qualified women are perceived as having more diversity value for organizations than their male counterparts do (Gayle et al., 2012; Leslie et al., 2017). Furthermore, these findings that gender equality efforts are starting to be beneficial – at least for highly qualified women in fields in which they are underrepresented – are in line with our study results concerning evaluators’ actual rankings of applicants. Finally, the results build upon previous research showing the positive effects of affirmative action policies on academic achievements of minority students (e.g., Fischer & Massey, 2007) and application intentions of women (e.g., Nater & Sczesny, 2016).

Taken together, the study findings did not reveal the typical pattern of discriminatory practices towards women in hiring processes for professorships, pointing to potential changes towards equal treatment, as earlier research, scarce though it is, has shown (e.g., Ceci & Williams, 2015). Furthermore, the results did not replicate the negative consequences of gender-based preferential selection policies from previous experimental studies on women’s and minority applicants’ perceived job competence and hireability (e.g., Garcia et al., 1981; Heilman & Blader, 2001; Heilman & Welle, 2006); rather, the results seem to build on recent encouraging findings on hiring practices for assistant professorships in U.S. STEM fields. In the academic sector, hiring data (National Research Council, 2010) and recent experimental studies (Ceci & Williams, 2015; Williams & Ceci, 2015) suggest that affirmative hiring practices favoring women can take place, at least when applicants for high-level positions present strong job qualifications. In fact, the results suggest that gender-based preferential selection policies can have positive effects for the targeted underrepresented group, as they might provide hiring norms, especially for non-beneficiaries (i.e., male evaluators), when decision making is difficult. Furthermore, the results imply that although women’s underrepresentation in high-level leadership positions in the Western European DACH region is more evident than in the United States, affirmative hiring preferences seem to be similarly pronounced for high-level professorship positions.

Limitations and Future Research Directions

Although the study provides important contributions to the literature on gender and academic careers, there are some limitations that need to be considered in interpreting the results. First, as data are based on a web-based experiment, we acknowledge that this artificial scenario calls the external validity of our study results into question. However, an experimental design was a desirable methodology in our attempt to integrate previous research findings on evaluators’ reactions to affirmative action in academic hiring processes, which mostly relied on experimental designs (e.g., Garcia et al., 1981; Heilman et al., 1992). Furthermore, we attempted to increase the external validity of our results by: (1) designing our study materials based on actual university profiles, selection policies, job descriptions, and applicant CVs, (2) conducting pilot studies among experts stemming from the same population as our study sample, and (3) using university members as a sample to test our hypotheses. Moreover, it would be challenging and unethical to test our hypotheses in a field experiment with actual selection committees, as different types of policies would be manipulated. However, future research would benefit if universities collected systematic data on percentages of female applicants for vacant professorship positions, rankings of the pre-selected applicants, and the gender of the final hires. If universities or academic units highlighted certain types of selection policies during the selection process, information on these policies could be included as well. Furthermore, real-life academic hiring and recruitment occurs in a complex context that could not be addressed in the current experiment and thus further limits the external validity of the results. 
There are a variety of action plans and strategies funded by the European Union that aim at facilitating Member States’ progress towards gender equality in higher education institutions (European Institute for Gender Equality, 2021). For example, working conditions that allow female and male scientists to balance work with family and private life are an important criterion for gender equality that employers should address in selection processes. Furthermore, the European Code of Conduct for the Recruitment of Researchers highlights the importance of a representative gender balance in selection committees in recruitment processes as well as transparent selection criteria that align with the principle of merit (European Institute for Gender Equality, 2021). Therefore, future studies should apply a mixed-method approach to understand and integrate the context in which academic hiring takes place.

A second limitation of our study design was that the affirmative action beneficiary was always portrayed as a woman with strong job qualifications. We acknowledge that we cannot generalize our interpretations to qualified members from other social groups that have historically been disadvantaged in the work context (e.g., applicants with different ethnicities, disabilities, sexual orientations, or migration backgrounds). Future studies could test the effects of affirmative action for high-level academic positions among other social group members, and with a specific emphasis on intersectionality of social or diversity characteristics (e.g., female scientists from different ethnic backgrounds, female and male scientists with disabilities). In addition, future studies could test if the present results of affirmative hiring can be replicated: (a) for positions that are hierarchically below professorships (e.g., doctoral students, postdoc positions), and (b) across other disciplines. Although our study results reveal positive effects of affirmative action policies in academic hiring processes, previous research reported that gender biases in academic hiring processes were particularly evident when the job qualifications of applicants were not yet as visible (e.g., for laboratory manager) as they are for assistant professors applying for high-level professorships (e.g., Knobloch-Westerwick et al., 2013; Moss-Racusin et al., 2012). One reason for the different study results could be that evaluators might implicitly fill in missing information about performance records (e.g., publications of doctoral students) with stereotypical assumptions (e.g., Heilman, 2012). Thus, studies are needed that test the possibility that affirmative action policies could still evoke stereotypical perceptions and thus have negative effects on job applicants’ perceived job qualifications, specifically for lower-ranked academic positions.

Third, apart from the individual factors (evaluator gender) and contextual factors (selection policies) that the study considered, future studies might include other factors, such as fairness perceptions. For example, studies showed that non-beneficiaries (Heilman et al., 1996) and beneficiaries (Gillespie & Ryan, 2012) perceived preferential selection as unfair. Procedural fairness, in turn, seems to be a key factor for applicants’ psychological well-being and self-perceptions (Gilliland, 1993). Thus, further research should investigate procedural fairness perceptions and consequences from the evaluator’s perspective. Similarly, decisions impacted by affirmative action might vary based on evaluators’ political orientation or system justification motive (e.g., motives to justify the status quo; Jost & Banaji, 1994).

Practice Implications

Despite these limitations, the study has practical implications for universities, politicians, and practitioners who endeavor to enhance women’s presence in academic leadership: The finding that a female applicant was preferentially hired in a field in which women are still substantially underrepresented – at least when both candidates provide equivalent qualifications – seems to reflect recent efforts on the part of universities to enhance gender diversity. Still, universities would benefit from monitoring actual numbers of female applicants from different minorities as well as their success rates in order to ensure that affirmative action goals are effectively implemented over time.

Further, the results imply that affirmative action efforts seem to have been internalized by much of the academic community today (Williams & Ceci, 2015). However, the different decision patterns of female and male evaluators suggest that a visible and explicit commitment to gender equality can still be advantageous. Particularly, evaluators who are non-beneficiaries of affirmative action due to their gender, such as men, might be guided by organizational norms in decision-making processes on equivalently qualified candidates. These implications are in line with comments made by some of our study participants, who reported that ranking the applicants was difficult, as they perceived them as equally competent for the job. Additionally, some of the participants noted that because the university stated its commitment to affirmative action, they ranked the female candidate as their first choice for the job.

Although the present study finds positive effects of affirmative action policies for associate professorships, we want to stress – given the previous literature on negative consequences of affirmative selection policies in hiring processes – that universities should acknowledge that the use of such hiring statements might lead to different outcomes on different academic levels as they might implicitly enhance stereotypical perceptions when performance records are less visible. Specifically, we refer to lower ranks, such as doctoral levels where performance records (e.g., list of publications, funding) are less visible compared to higher ranks. Gender equality in high-level positions can only be achieved when gender discrimination at the lower levels is eliminated. Thus, universities should test the effects of affirmative action policies on hiring decisions among scientists who do not yet exhibit strong performance records. This differentiation is important, as previous research has shown that stereotypes can be invoked especially when information about applicants is limited or ambiguous or when evaluators lack the motivation to make their decisions carefully (e.g., Koch et al., 2015).

Conclusion

The present study assessed the role of applicant gender and affirmative action policies in a simulated selection process for associate professorships in DACH countries. Overall, the study results did not replicate findings on the negative consequences of gender-based preferential selection policies on perceptions of women’s job qualifications. Rather, the results build on recent encouraging findings showing that affirmative hiring practices towards highly qualified individuals in the corporate and academic sector (e.g., Leslie et al., 2017; Williams & Ceci, 2015) might in fact be working in accordance with meritocratic ideals of universities – at least for higher-level applicants. In fact, although both female and male evaluators largely perceived the fictitious female and male job candidate for an associate professorship position as equally competent, the female candidate was rated as more hireable and was ranked first for the position more often than the male candidate. However, given that male evaluators’ selection decisions seemed to rely more strongly on the universities’ commitment to support gender equality than female evaluators’ selection decisions, we positively acknowledge that gender-based preferential selection policies can have the intended effect for the targeted group. They might provide hiring norms, especially for non-beneficiaries (i.e., men), when decision making is difficult.