1. Introduction
Volunteering plays a vital role in the organization of many firms. In various situations, tasks and resources are not allocated among employees by a supervisor; rather, employees have to solve the allocation problem themselves. In fact, especially with the rise of remote work, many tasks are organized informally without a clear hierarchy and require team members to take their own initiative. While about 4% of the workforce in Germany, Europe’s biggest economy, worked from home in the years prior to 2020, up to 27% worked remotely in 2021 (Hans Böckler Stiftung, 2021). The US shows a similar picture, with 7.6% of workers working exclusively remotely before the COVID-19 pandemic and a peak of 31.4% in mid-2020 (Bick et al., 2021). In these situations, volunteering is a natural allocation mechanism, and it is the one we examine in this paper. Completing a task requires time and effort, so each team member would prefer another person to finish it. If the task is completed, the product advances, which yields a higher reputation for the team in the organization and may improve the firm’s overall performance in the market. This, in turn, benefits the whole product team, as it may improve the wages or job prospects of all team members.
While the described allocation process might seem inefficient due to coordination challenges and free-riding incentives, such situations occur frequently – and apparently for good reasons – in organizational contexts. For example, many duties in academia are allocated based on voluntary decisions (Babcock et al., 2017), as are the development of open source software (Johnson, 2004), contributions to network technology (Lee et al., 2007), the creation of online knowledge platforms like Wikipedia (Zhang & Zhu, 2011), and modern work allocation mechanisms such as the so-called agile project methods commonly used in software development (Hoda et al., 2018). The substantial increase in remote work in recent years might have further increased the use of these allocation mechanisms in the work environment.
The volunteering mechanism does have attractive qualities. Not least, it may reduce the organizational overhead required to organize task allocations. Nevertheless, volunteering in an organizational context is usually not an altruistic act toward others, but often a profit-maximizing individual response to an organizational problem (Murnighan et al., 1993; Kim & Murnighan, 1997). Economic (game) theory thus suggests that the volunteering mechanism also introduces two important obstacles. For one, it creates a coordination problem that the workers must resolve (Erev & Rapoport, 1990). More importantly, however, the individual incentive structure of the firm may give rise to a social dilemma: while working is costly and the exact costs are usually unknown, all team members may enjoy the fruits of labor. This, in turn, leads to the famous Volunteer’s Dilemma (Diekmann, 1985).
The strategic analysis of the Volunteer’s Dilemma captures several factors relevant to the success of volunteering in organizations. First, companies and the teams within them come in various sizes, which influences the degree of volunteering (Rapoport, 1985; Otsubo & Rapoport, 2008; Rapoport & Amaldoss, 2013). This has been demonstrated in various laboratory and field experiments, mostly finding a negative effect of group size on individual volunteering (Latané & Nida, 1981; Diekmann, 1986; Franzen, 1995; Barron & Yechiam, 2002; Przepiorka & Berger, 2016; Goeree et al., 2017; Kopányi-Peuker, 2019; Campos-Mercade, 2021). Second, the individual costs of volunteering may differ across workers, which also affects individual volunteering choices (Diekmann, 1993; Rapoport & Suleiman, 1993; Kugler et al., 2010; Przepiorka & Berger, 2016). Although laboratory evidence suggests that both factors negatively affect the provision of voluntary work, the prevalence of volunteering in real-world organizations is striking. This raises the question of whether economic theory and experimental results from the lab translate into real-world work environments or whether other factors, such as peer effects, image concerns, or the purpose of the task, counteract these problems.
In this paper, we scrutinize these economic arguments against volunteering in the workplace and the existing empirical evidence on volunteering by putting them to a test in real-life workplaces. In a large-scale field experiment with more than 2,000 workers, we analyze the prevalence of volunteering at the workplace and the effect of group size on volunteering behavior. Our main treatment manipulation is thus the group size of work teams: we compare the willingness to volunteer for a specific task when working alone to working in small (3 workers), medium (30 workers), and large (300 workers) teams.
In our field experiment, we act as an employer in an online labor market and offer a simple rating task to the online workers. Workers are assigned to a team of a certain group size, and after finishing the individual task, we offer each participant the opportunity to continue working on the task. If at least one person in a group volunteers for the additional team task, each team member receives an additional bonus payment, which resembles the Volunteer’s Dilemma. Before the beginning of the team task, participants receive information about their costs and the costs of others as we inform our workers about the time it has taken them to perform the individual task and how long it has taken others. Finally, we elicit workers’ beliefs about whether they think another co-worker will volunteer.
Our experimental setting allows us to study the causal effect of differences in team size. By creating a natural yet anonymous work environment, we can exclude reputational concerns as well as personal relationships between workers, which would impede the analysis of volunteering at more traditional workplaces. Compared to, say, a brick-and-mortar company, the online labor market provides a unique ecosystem with much tighter control over the task and environment while allowing us to measure individual opportunity costs, beliefs, and other essential variables. At the same time, we are explicitly interested in a natural task environment and thus did not impose artificial “lab-like” controls on browser use, task completion times, or other things that would be uncommon in this kind of labor market. Our setting makes individual and team incentives more directly visible than a typical work environment does. Still, the incentives capture the same trade-off between free-riding and volunteering as a typical work environment. Workers in our study do not learn whether others have done the task before. While this differs from many classical work environments, where the task would be announced as completed, this design allows us to measure workers’ individual willingness to volunteer, which would not be possible in a work environment where one can only observe whether there was a volunteer or not. We also contribute to the literature on the Volunteer’s Dilemma by providing an application in a real work environment. Compared to existing experiments where effort is purely monetary, we can capture more subtle differences using actual work tasks.
The results of our study stand in stark contrast to the game-theoretical predictions and results from earlier laboratory studies and field settings outside the work context. We find no support for the hypothesis that the group size influences the volunteering decision of our workers. More precisely, in groups of 3, 30, or 300 workers, the volunteering rates range between 51% and 55% and are not statistically different from each other. Also, the costs of volunteering play a minor role, given our proposed cost measure.
We can rule out several possible explanations for this null finding. First, we show that workers are aware of the group size, since around 93% of workers are able to recall the correct group size after the task is completed. In additional treatments in which workers work on the same task alone instead of in a group, we find that (i) workers are less likely to volunteer if volunteering is not compensated, suggesting that the effort is indeed costly, and (ii) workers are more likely to volunteer compared to the main treatments if their payment depends only on their own actions. This does not mean that every participant reacted to the strategic environment. Some might have simply enjoyed the task or perceived a duty to help out. It does, however, strongly suggest that our null result is indeed not just a result of strategic confusion or other external factors, but that our participants’ volunteering behavior is indeed insensitive to group size in this workplace environment.
A critical difference between our setting and more typical lab settings is that, in an organizational context like ours, work has a purpose beyond simply being a means to payoffs. In particular, workers in our setting do meaningful work in that their work is used to classify hate speech. In an additional experiment, we shut down this channel by clarifying to workers that there is no further importance to their volunteering decision beyond the monetary incentive. In fact, we observe a significant effect of group size in this setting. This suggests that having a purpose might activate non-strategic motivations to volunteer, making the choice a more individual decision in which the costs of the task or norms of behavior might still play a role, while strategic considerations such as the exact team size become less crucial; in the absence of a purpose, strategic motives are more prevalent.
We also observe a form of conditional volunteering. We asked the workers how likely it is that at least one other worker on their team also volunteers. We find that those who believe that it is more likely that there is at least one other volunteer are more likely to volunteer themselves. Image concerns could explain this seemingly irrational behavior. Subjects who believe that many others will probably volunteer might not want to be perceived (by themselves or by us as an employer) as selfish. We show that this form of conditional volunteering behavior is a critical driver of volunteering in our work setting. This finding also replicates similar evidence by Rapoport et al. (1989), whose results suggest that participants deviate from equilibrium play in inter-group competitions if they believe that all others contribute as well.
Overall, our results thus suggest that if a company only cares about the task being completed, volunteering can also be justified as an allocation mechanism in large groups, even though theoretical reasoning would predict otherwise. A key difference in our setting compared to the literature on the Volunteer’s Dilemma is that the task in itself has a purpose beyond simply ensuring financial gains. According to our results, this does not necessarily lead to a decline in volunteering rates as long as workers have a particular regard for how their work (or not working) is perceived.
The remainder of the paper is structured as follows. In Section 2, we explain the experimental design of the field experiment. Following this, we present our preregistered hypotheses in Section 3. Our hypotheses are empirically tested in Section 4. Section 5 discusses our findings and concludes the paper.
2. Experimental design
We study volunteering in a workplace environment through the lens of the Volunteer’s Dilemma (Darley & Latané, 1968; Diekmann, 1985). In the Volunteer’s Dilemma, a certain number of participants can volunteer to supply a public good to all group members. Volunteering is assumed to be costly. A single volunteer in the group is sufficient to produce a benefit for all its members, which no one receives if no volunteer can be found. Furthermore, the individual benefit from the public good is greater than the costs of volunteering. This gives rise to the dilemma situation of the game: if another group member were certain to volunteer, a worker would never volunteer herself. However, given that the benefit is greater than the costs, workers would prefer to volunteer if all of their colleagues defected. The model captures three essential characteristics of volunteering decisions in the workplace. First, volunteering is chosen simultaneously and without communication, which sets a lower bound for our purposes. Second, there is heterogeneity in, and incomplete information about, the costs of volunteering; and third, firms differ in size. Various tests in the lab (Diekmann, 1986; Franzen, 1995; Goeree et al., 2017; Kopányi-Peuker, 2019) and in field environments (Latané & Nida, 1981; Barron & Yechiam, 2002; Przepiorka & Berger, 2016) have shown that the predictions from the Volunteer’s Dilemma are reasonably robust. Thus, one might expect it to provide useful predictions in our setting of an online workplace as well.
We use the Volunteer’s Dilemma framework to guide the experimental design of our field experiment. In order to establish causal claims, we have to maintain a high degree of experimental control. Online labor markets are therefore not only a convenient, but also a particularly useful environment for our experiment. Importantly, they are a regular and natural workplace for our experimental workers, and workers differ in their effort costs. At the same time, they allow us to set group sizes exogenously. This further differentiates online labor markets from classical work environments, where workers are usually connected through personal relationships and a common history across and within teams. These factors would impede the identification of the causal effects of group sizes, making the use of an online labor market crucial for this study.
In a nutshell, the field experiment consists of two stages (see Figure 1). In the first part of the job, workers were invited to work individually on a coding task for a fixed payment. Upon joining the job, the workers were randomly matched to one of our treatments in which we varied the group size and the incentive structure.

Figure 1. Structure of the experiment
After completing the first individual task, workers were informed about a second stage. We asked them if they would like to volunteer for a second round of coding, just like the one they had done before, but with a different payment scheme. This second stage implements the actual Volunteer’s Dilemma: if at least one worker in the group volunteered for the team task, we paid a bonus to all group members. Finally, we elicited the workers’ beliefs about the volunteering decisions of their team members, and they had to answer a short questionnaire. All workers received their payoffs a few days after the experiment.
2.1. Workers and the online labor market
The field experiment was conducted on clickworker.de, an online crowdsourcing marketplace. Crowdsourcing marketplaces allow people to work on tasks that are usually easy to do for humans, but difficult to automate. Most tasks on such platforms require a couple of minutes to complete and include assignments like the processing of images or the cleaning of data (see Difallah et al. (2015) and Jain et al. (2017) for an overview of common tasks). Online labor markets have become increasingly popular in recent years (Difallah et al., 2015), with 0.5% of the US adult population working in the “sharing economy” in 2016 (Farrell & Greig, 2017). For many workers, these jobs serve as a substitute for traditional offline work in times of economic downturn (Borchert et al., 2023). For them, online labor markets are a regular work environment, which makes it a perfect test bed for studying volunteering at the workplace.
Each worker was only allowed to participate once and required to speak German, but we did not impose any further restrictions on the pool of workers. Every active worker on the platform fulfilling those requirements was free to join. Since the platform requires workers to register with a valid German bank account, multiple participations and bots can be practically excluded. In total, 3,344 workers joined the assignment and read the explanation of the task. Overall, 2,203 workers finished the first stage of the field experiment and were then exposed to our volunteering treatments. A drop-out rate of about a third is typical in online labor markets. Altogether, 2,142 workers reached the end of the experiment. For the analysis, we consider only those workers who reached the end of the study. We obtain a diverse sample with a wide range in age, gender, educational status, and employment. Of all participants, 26% report an age between 18 and 25 years, but a sizeable share (14%) is also above 45 years of age. Our sample comprises self-employed, employed, and unemployed workers with various educational backgrounds. Table B.5 in the Appendix reports the sample composition by treatment. It shows that the sample is quite balanced, so different participant characteristics should not drive potential treatment differences.
2.2. The advertised job
We offered a standard job to all workers active at the time through an advertisement on the platform. Notably, there was no mention of an experiment or anything similar. The workers were asked to rate user comments from another study, a task that is often found in online labor markets. We provided a short description of the task and informed the workers that they could earn 0.90 € and how long it would take them to complete it. While this wage may seem low, it was the standard wage set by the platform, and since we wanted the task to feel as natural and standard as possible, we had to stick to the platform’s suggestion. As we will discuss later, the workers felt this was a decent enough wage to complete the task.
Once they had clicked on our link on the platform, they were redirected to our oTree server (Chen et al., 2016). Workers then received detailed instructions about the task, where we also made them aware that they would be working in a team for this assignment. We clarified that they would first be working alone on the described assignment. After completion, they were offered a voluntary group project (see Appendix E for the full instructions).
The Task Workers were asked to evaluate user comments from an online forum. Each comment was made with reference to a picture and possibly to comments from other users in this online forum. The picture’s theme was always related to migration, refugees, or cultural differences. The workers rated the comments regarding the expressed sentiment and evaluated whether the comments contained hate speech. The coding scheme and a screenshot of the task can be found in Appendix E.2. In both possible experiment stages, 30 distinct comments had to be rated, which were randomly drawn from a set of 13,356 comments. All comments had been collected as auxiliary data in a different study and are not part of our research question. Companies and corporations frequently use online labor markets for similar tasks to better understand customer comments or reviews. According to Difallah et al. (2015), such verification and validation tasks were among the most common assignments between 2009 and 2014 on MTurk, a US-based competitor of clickworker.de. According to clickworker.de’s own information, verification tasks and sentiment analyses are the most common tasks in their industry. Thus, we are confident that most workers were familiar with this type of task and perceived it as a regular assignment rather than as part of a research project. The ratings of the comments will be used in a series of studies on hate speech for which the original comments had been collected as a dependent variable in the first place (Álvarez-Benjumea & Winter, 2018, 2020; Álvarez-Benjumea, 2020). Thus, the work was indeed meaningful and important.
2.3. The volunteering decision
After finishing the first stage of the experiment, we explained to the workers that we needed exactly one volunteer in their team to ensure data quality and to be better able to evaluate the quality of the ratings within their group. We clearly stated that one volunteer within the group was sufficient for the task. Each worker received the offer to volunteer in their team and to continue working on the task for another round of 30 comments in the second stage. If at least one person in the group volunteered, all members received a bonus payment of 0.90 €. If no person in the group volunteered, no group member received any bonus. The bonus payment did not increase if more than one person volunteered. Furthermore, we explained to the workers that, even if they had not volunteered themselves, they might still receive the bonus payment if one of their teammates volunteered. Importantly, workers were not told whether others had already done the task before making their own choice. This allows us to measure the actual willingness to volunteer. To avoid reputation effects, we clarified that their decision would not influence the payoff of the first stage or their user rating in the online labor market. The volunteering decision in the second stage will be our key dependent variable in the analysis in Section 4. A screenshot and a translation of the decision screen can be found in Appendix E.3.
We used the time spent in the first part of the job as a cost measure for volunteering. In our context, time spent on a job is a fair measure of opportunity costs, because those who spend more time on the task miss out on more opportunities to work, for example, on another job on the platform or on more leisure. As we show later, workers differ substantially in the time it takes them to complete the task. Since both parts of the job consist of the same task, the time spent on the first part is also a good predictor for time spent on the second part ($\rho = 0.73$, $p \lt .001$, Spearman’s rank correlation coefficient). Workers were made aware of the time it had taken them to complete the first stage.
In accordance with our theoretical model, we induced commonly known beliefs about the distribution of other workers’ costs before they made the volunteering decision. To this end, we informed them that other workers usually required between 7.5 and 15 minutes to complete the rating of the comments.Footnote 1 This gave workers a rough estimate of whether they had high or low costs of volunteering relative to the other workers.
2.4. Treatments
Volunteer’s Dilemma Treatments Workers in our main treatments faced team incentives in the form of a Volunteer’s Dilemma. Within the field experiment, we varied three group sizes, the Small group with 3 workers, the Medium group with 30 workers, and the Big group with 300 workers.Footnote 2
The instructions were adapted accordingly for each group size. Since the information on the size of the group is the key manipulation in our study, we made sure that the information was as clear as possible for our workers. The group size was mentioned several times, in particular, right before workers made a choice.
Baseline Treatments (N=1) In order to provide a benchmark for volunteering rates in our setting, and to establish whether volunteering is actually costly in this task, we designed two control treatments: Incentivized and Unincentivized. The basic setup in this study was identical to our main experiment, the difference being that the workers did not face a Volunteer’s Dilemma. That is, the instructions were the same as in the Volunteer’s Dilemma treatments, including the fact that workers operated in a team. We pointed out, however, that their actions did not influence the payoffs of their team members, and vice versa.
Workers were notified at the end of the first stage that we needed a volunteer to continue working on the task.Footnote 3 We varied two conditions. In the Incentivized condition, subjects were paid the same bonus as in the Volunteer’s Dilemma treatments (90 cents) for completing the second part. Yet, if they did not volunteer, they did not receive the bonus even if another team member volunteered. This condition would provide an upper bound of volunteering without any team incentives. In the Unincentivized condition, there was no bonus, which allows us to control for intrinsic or non-monetary motivations to finish the task. The volunteering rates of our main treatment, where strategic considerations play a role, should lie between these two conditions.
The assignment was made unavailable once each main treatment had reached 600 workers who had finished the first stage.Footnote 4 For the baseline treatments, the task was made unavailable after 200 workers had made a volunteering decision. This leaves us with the treatment composition shown in Table 1.
Table 1. Number of observations for each treatment

Note: The number of workers for each treatment, with N being the group size.
2.5. Belief elicitation
We elicited the beliefs of the participants about the volunteering decisions of the other members in their group. We asked each participant how likely it is that at least one of their team members volunteers for the task. Participants reported the probability on a scale from 0% (“For sure no other person”) to 100% (“Surely at least one other person”) using a slider. The slider did not have an initial value to avoid any anchoring effect.Footnote 5 The belief elicitation was incentivized using the binarized scoring rule, with a possible bonus of 0.90 € (Hossain & Okui, 2013).Footnote 6 The instructions were simple and natural. To increase truth-telling, we used the framing by Danz et al. (2022) and told participants that it is in their best interest to report their actual belief and that the chance of receiving the bonus depends on the accuracy of their guess. We randomized the order in which we asked participants for their volunteering decision and their belief. Half of the participants were asked about their belief just before their volunteering decision. The other half answered the belief question after their volunteering decision but before the potential second round of coding comments. This allows us to investigate possible order effects. As the order in which we ask participants for their volunteering decision and their belief plays a negligible role in the results, we pool all observations along this treatment variation for the analysis in Section 4.4.Footnote 7
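For readers unfamiliar with the mechanism, the following sketch illustrates one common way the binarized scoring rule can be implemented; the function name, variable names, and the exact draw mechanics are illustrative assumptions, not the code used on the platform.

```python
import random

def binarized_scoring_payout(reported_belief, outcome, bonus=0.90, draw=random.random):
    """Pay `bonus` with probability 1 - (reported_belief - outcome)**2.

    reported_belief: stated probability (0 to 1) that at least one teammate volunteers.
    outcome: 1 if at least one teammate actually volunteered, 0 otherwise.
    This is one common implementation of the binarized scoring rule
    (Hossain & Okui, 2013); the experiment's exact implementation may differ.
    """
    quadratic_loss = (reported_belief - outcome) ** 2
    return bonus if draw() >= quadratic_loss else 0.0

# Example: a worker reports 70% and at least one teammate volunteers;
# she then receives the 0.90 EUR bonus with probability 1 - (0.7 - 1)**2 = 0.91.
```

Under such a rule, reporting one’s true subjective probability maximizes the expected chance of winning the bonus, which is why participants are told that honest reporting is in their best interest.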
After the participants reported their beliefs, we also elicited how certain they were about this probability. To measure this “cognitive noise,” we followed the approach by Enke & Graeber (2023) and Enke et al. (2023).
2.6. Questionnaire
At the end of the field experiment, workers completed a questionnaire about their economic preferences (Falk et al., 2023) and their sociodemographic and economic background. We also asked the participants several questions about the task they had to perform. We use the answers as an additional measure of costs. Furthermore, we asked participants whether they could correctly recall the size of their team. We kept the initial questions to a minimum so as to preserve the impression that the job was a regular assignment rather than an experiment.
3. Hypotheses
The incentive structure in our field experiment resembles a Volunteer’s Dilemma (Diekmann, 1985) with incomplete information and heterogeneous costs. For our model, we thus focus on the setup by Weesie (1994). We introduce the model’s key features in the following to derive our hypotheses. The key insights from the model are that volunteering decreases in the group size and in the costs of volunteering. The full model, including all proofs, is provided in Appendix A.Footnote 8
To be more formal, there are N players in the game.Footnote 9 Each player $i \in \{1, \dots, N\}$ decides simultaneously to volunteer ($a_i=V$) or to defect ($a_i=D$). If at least one player in the game volunteers, all players receive a benefit of $b_i$. Volunteering is costly, and the costs are denoted by $c_i$. In line with Weesie (1994), we assume that $\gamma_i := \frac{c_i}{b_i}$ follows some arbitrary probability distribution $\gamma \sim \mathscr{F}$ with a continuous probability density function f and that $b_i \gt c_i \gt 0 \,\forall i$. The payoff $\pi_i$ of worker i when $X_{-i}$ others volunteer is
$$
\pi_i = \begin{cases} b_i - c_i & \text{if } a_i = V,\\ b_i & \text{if } a_i = D \text{ and } X_{-i} \gt 0,\\ 0 & \text{if } a_i = D \text{ and } X_{-i} = 0. \end{cases}
$$
The payoff structure gives rise to the dilemma situation of the game: If there were another volunteer in the game for sure, workers would never volunteer. However, given that the benefit is greater than the costs, workers would prefer to volunteer if all of their colleagues defected. Formally, player i volunteers if and only if her expected benefit from volunteering is weakly greater than the expected benefit from defecting,
$$
b_i - c_i \;\geq\; P(X_{-i} \gt 0)\, b_i. \qquad (1)
$$
Thus, she volunteers if and only if the cost-benefit ratio $\gamma_i=\frac{c_i}{b_i}$ of player i is weakly below the probability that there is no other volunteer. Weesie (1994) shows that there exists a pure strategy equilibrium where players with low cost-benefit ratios volunteer, while those with a $\gamma_i$ above some threshold do not volunteer.
In our field experiment, we can elicit the main parameters of the model. The additional bonus of 90 cents is the benefit $b_i$. Volunteering for the task costs time and effort. We argue that these costs of volunteering ($c_i$) can be approximated by the time it takes participants to complete the first part of the experiment and by additional survey measures that we elicit in a questionnaire. Also, we directly ask participants for their belief about $P(X_{-i} \gt 0)$, that is, the probability that at least one other team member volunteers. As a result, our experimental design allows us to test a wide range of hypotheses, which are derived from the equilibrium analysis by Weesie (1994). We provide further details on the theoretical foundations in Appendix A.
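To build intuition for the comparative statics behind the hypotheses below, the following numerical sketch computes the symmetric threshold equilibrium implied by the volunteering condition above, in which a player volunteers whenever her cost-benefit ratio lies below a threshold $\gamma^*$ solving $\gamma^* = (1 - \mathscr{F}(\gamma^*))^{N-1}$. The uniform distribution for $\gamma$ is purely an illustrative assumption and is not meant to describe our workers.

```python
def equilibrium_threshold(n, F=lambda g: g, tol=1e-10):
    """Bisection for gamma* solving gamma* = (1 - F(gamma*))**(n - 1).

    F is the cdf of the cost-benefit ratio gamma; the default uniform cdf on
    (0, 1) is an illustrative assumption only.
    """
    lo, hi = 0.0, 1.0  # x - (1 - F(x))**(n-1) is negative at 0 and positive at 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid - (1 - F(mid)) ** (n - 1) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for n in (3, 30, 300):
    g_star = equilibrium_threshold(n)
    print(f"N = {n:>3}: predicted share of volunteers F(gamma*) = {g_star:.3f}")
# With a uniform F, the predicted volunteering rate falls from about 0.38 (N = 3)
# to roughly 0.08 (N = 30) and 0.01 (N = 300): the 'diffusion of responsibility'.
```

Hypotheses 2 and 3 below are the qualitative counterparts of these comparative statics.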
In our baseline treatments, volunteering is an individual choice. This allows us to identify whether volunteering is indeed costly and whether participants react to the dilemma component of the game. If participants perceive the task as costly, they should volunteer less when there is no additional benefit. Thus, we expect volunteering rates in Incentivized to be higher than in the Unincentivized treatment. The group size treatments introduce the Volunteer’s Dilemma and are thus our main treatments. Accordingly, we expect the volunteering rate to be lower in the main treatments than in the Incentivized treatment, in which participants only influence their own benefit if they volunteer. Yet, as we expect volunteering to be perceived as costly, we hypothesize that more participants volunteer in the main treatments than in the Unincentivized treatment. This argument is summarized by Hypothesis 1.Footnote 10
Hypothesis 1.
The volunteering rate in the Volunteer’s Dilemma treatments is higher than in the Unincentivized treatment but lower than in the Incentivized treatment.
Participants with lower costs should volunteer more for the additional task in the main treatments. This prediction follows intuitively from Equation 1 as the expected utility from volunteering is decreasing in the costs. A formal proof is provided in Appendix A.
Hypothesis 2.
Workers with lower costs are more likely to volunteer.
From a theoretical perspective, the share of volunteers decreases in the group size. Large groups give rise to a greater “diffusion of responsibility”, and players volunteer less on an individual level. We expect this effect to carry over to our workplace environment.
Hypothesis 3.
The volunteering rate decreases in the group size.
We elicit the participants’ beliefs that there is another person in the team who volunteers. We can verify whether participants have correct beliefs using the average volunteering rate in each treatment. Furthermore, the expected value from defecting is increasing in this belief (see Equation 1). For higher beliefs, there exists a higher subjective probability that a participant receives the benefit without volunteering herself. It follows that participants with higher beliefs should volunteer less, as they can avoid incurring the volunteering costs while receiving the same benefit.
Hypothesis 4.
Participants who believe that there is another volunteer are less likely to volunteer themselves.
Clearly, all these hypotheses build on purely monetary-driven arguments. In a work environment, multiple additional factors, such as peer effects or image concerns, might play a role. These factors might influence the link between beliefs and actions in different ways.
4. Results
We will discuss our empirical results as follows: In the first step, we demonstrate that workers perceive the task as costly and react to free-riding incentives. We then follow the hypotheses outlined in Section 3. To understand the mechanisms underlying our results, we investigate the stated beliefs of subjects about their pivotality to provide the good and the interplay between beliefs and actions. We will finally discuss additional experimental evidence, suggesting that the perception of one’s work as “meaningful” may transform the “games” people play in social interaction.Footnote 11 Appendix B contains additional detailed empirical analyses supplementing our results.
4.1. Workers are sensitive to incentives and strategic situations
To satisfy our first identifying assumption, we have to establish whether the volunteering choice was actually costly. We therefore compare two baseline treatments in which we asked participants whether they wanted to volunteer in a second coding round. In these baseline treatments, the payment depends only on the worker’s own decision and not on other workers.Footnote 12 In the incentivized condition, participants were paid an additional 0.90 € in case they volunteered for a second round of coding. In the unincentivized condition, they were only paid for the first round.Footnote 13 The volunteering rate in the incentivized version was 68.4%, but only 27.1% in the unincentivized treatment (see Figure 2, $p \lt 0.01$, $\chi^2$-test). This allows us to conclude that the task was indeed perceived as costly effort and provides the basis for the following results.

Figure 2. Volunteering rates in the incentivized and the unincentivized baseline treatments in comparison to the main Volunteer’s Dilemma (VOD) treatments. The error bars represent the 95% confidence intervals
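The comparison above is a simple test of proportions. A minimal sketch of how such a test can be run is below; the cell counts are rough reconstructions from the reported rates (68.4% vs. 27.1%) and roughly 200 workers per baseline condition, not the exact numbers from Table 1.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: incentivized / unincentivized baseline; columns: volunteered / did not.
# Counts are illustrative reconstructions, not the exact cells from our data.
table = np.array([[137, 63],
                  [54, 146]])
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.4f}")
```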
Our second identifying assumption was that our participants react to the strategic situation of the Volunteer’s Dilemma and show different volunteering rates than in the individual choice situation. We expected volunteering rates in the main treatments to fall between those in the incentivized and the unincentivized individual choice conditions. This is indeed the case. Pooling the data for all main treatments, the volunteering rate, at 53.1%, is significantly lower than in the incentivized condition ($p \lt 0.01$, $\chi^2$-test) and significantly higher than in the unincentivized individual choice condition ($p \lt 0.01$, $\chi^2$-test, see Figure 2). Hence, we find clear support for Hypothesis 1.
Result 1.
Volunteering is costly, and workers react to free-riding incentives.
4.2. Heterogeneity in costs to volunteer
Hypothesis 2 predicts that participants with higher costs are less likely to volunteer. As explained in Section 2, we argue that individual costs of volunteering can be approximated by the time it takes a participant to finish the first stage of the field experiment. Most participants required between 8.23 (20th percentile) and 16.52 minutes (80th percentile) to complete the first stage, with an average of 14.50 minutes. Figure B.1(a) in the Appendix presents the estimated distribution of completion times by treatment.
The data does not support Hypothesis 2, since the effect of costs operationalized as time is insignificant. Model 1 in Table 2 reports a logistic regression that estimates the probability of volunteering as a function of the z-standardized completion time. Importantly, the coefficient of Time is insignificant.Footnote 14
Table 2. Logistic regression to estimate the volunteering choice as a function of different cost measures

Notes:
* p < 0.1; ** p < 0.05; *** p < 0.01. Standard errors in parentheses.
We constructed a z-standardized additive “Cost Index” from several control questions in our post-experimental questionnaire as an additional measure of subjective costs. We asked participants on a 7-point Likert scale whether they perceived the task as exhausting (µ = 3.01), interesting (µ = 2.91, reverse-coded), or emotionally challenging (µ = 2.30), and whether it was important to them to contribute to “better data quality” (µ = 2.44, reverse-coded). The subjective costs have a substantively meaningful and highly statistically significant negative effect on the probability to volunteer (see Models 2 and 3 in Table 2 and Table D.3). The effect of costs on volunteering is similar across treatments and visualized in Figures B.3 and B.4 in the Appendix.
We therefore conclude that
Result 2.
The probability to volunteer decreases in the subjective costs of volunteering.
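For concreteness, the sketch below shows how the time-based cost measure and the Cost Index could enter logistic regressions of the kind reported in Table 2. The column names and the reverse-coding of the 7-point items are illustrative placeholders and assumptions, not our actual estimation code.

```python
import pandas as pd
import statsmodels.api as sm

def zscore(x: pd.Series) -> pd.Series:
    return (x - x.mean()) / x.std()

def fit_cost_models(df: pd.DataFrame):
    """Logit models of volunteering on cost measures (sketch, hypothetical columns).

    df is assumed to contain, per worker: 'volunteered' (0/1), 'time_minutes',
    and the 7-point items 'exhausting', 'interesting', 'emotional', 'data_quality'.
    """
    df = df.copy()
    df["time_z"] = zscore(df["time_minutes"])
    # Reverse-code 'interesting' and 'data_quality' so that higher values mean
    # higher subjective costs, then build the z-standardized additive index.
    raw_index = (df["exhausting"] + (8 - df["interesting"])
                 + df["emotional"] + (8 - df["data_quality"]))
    df["cost_index"] = zscore(raw_index)

    m_time = sm.Logit(df["volunteered"], sm.add_constant(df[["time_z"]])).fit(disp=0)
    m_index = sm.Logit(df["volunteered"], sm.add_constant(df[["cost_index"]])).fit(disp=0)
    return m_time, m_index
```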
4.3. Insensitivity to team size in the Volunteer’s Dilemma
Hypothesis 3 predicts that the volunteering rate decreases in the group size. This is clearly not the case (see Figure 3 and Tables B.1 and B.2). Volunteering rates are relatively high and statistically indistinguishable with regard to group size (all p > 0.1, $\chi^2$-tests).Footnote 15 We therefore conclude that
Result 3.
The volunteering rate does not vary in the group size.
This result is surprising, given the extensive literature on the Volunteer’s Dilemma. It constitutes our main result and shows that our participants react to economic incentives and to the strategic nature of the Volunteer’s Dilemma (Result 1), but that team size seems irrelevant.

Figure 3. The average volunteering rate across the different group sizes. The error bars represent 95% confidence intervals
4.4. Beliefs
A potential explanation for this surprising result might be that our participants had similar beliefs about their pivotality across treatments. In this section, we show that while beliefs differ across treatments in their level and marginal effect on the likelihood of volunteering, this difference is insufficient to account for the striking similarities in volunteering rates. To do so, we compare the beliefs with the actual probability that at least one other person volunteers for the given group size.Footnote 16
Table 3 reports the results. In each treatment, participants report beliefs that are, on average, smaller than the correct belief. As such, they underestimate the probability that there is at least one other volunteer, thereby overestimating their own true pivotality. The differences are statistically significant (p < 0.01 for all treatments).Footnote 17 Hence, participants do not form correct beliefs about the volunteering decisions of other team members.
Table 3. Belief overview by treatment

* The correct belief is calculated based on the average volunteering rate in the respective treatment. Here, we define the belief of a participant as correct if it lies within ±5 percentage points of the correct belief.
Figure 4 shows the discrete distribution of the beliefs by group size. There is considerable variation in beliefs. While about a third of participants in N = 30 and N = 300 report beliefs close to the correct belief, beliefs are, on average, too low. In N = 3, even fewer participants report a belief close to the true probability that at least one other team member volunteers. This is also driven by a large share of participants who believe that no one else volunteers.

Figure 4. The belief that another person in the group volunteers by group size. To calculate the correct belief we use the average volunteering rate in the treatment
Result 4.
Beliefs about volunteering are on average incorrect.
Next, we consider differences in beliefs across treatments. Participants in the N = 3 treatment report statistically significantly smaller beliefs than in the other two treatments (p < 0.01 for both comparisons, two-sided Mann–Whitney U test). There are no differences in beliefs between N = 30 and N = 300 (p = 0.64, two-sided Mann–Whitney U test).
Result 5.
The belief that someone else volunteers is lower in small teams compared to larger teams.
The differences in beliefs make intuitive sense, given the observed volunteering rates in the field experiment. As volunteering rates are similar, the probability that another person in the team is a volunteer increases in the group size.Footnote 18
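The benchmark behind this statement follows directly from the observed rates: if each of the N − 1 other team members volunteers independently with probability p, the probability that at least one of them volunteers is $1 - (1-p)^{N-1}$. A quick sketch with the pooled rate of roughly 53% (the independence assumption is made here for illustration only):

```python
def prob_at_least_one_other(p: float, n: int) -> float:
    """P(at least one of the n-1 other team members volunteers), assuming
    each volunteers independently with probability p."""
    return 1 - (1 - p) ** (n - 1)

for n in (3, 30, 300):
    print(f"N = {n:>3}: {prob_at_least_one_other(0.53, n):.3f}")
# About 0.78 for N = 3 and essentially 1 for N = 30 and N = 300: with similar
# volunteering rates, the correct belief is much higher in larger groups.
```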
From a participant’s perspective, one would expect lower beliefs to lead to more volunteering and, thus, more volunteers in small groups. However, participants volunteer at similar rates across treatments. A possible explanation is that a non-trivial link exists between actions and beliefs in the workplace environment, which is not captured by the classical Volunteer’s Dilemma framework. We investigate the relationship between beliefs and actions in the following section.
4.5. Conditional volunteering
We now focus on how beliefs translate into actions. As discussed in Hypothesis 4, differences in beliefs should translate into differences in behavior. The higher the probability that someone else in the team volunteers, the higher the likelihood that the player still receives the benefit without volunteering herself. From a purely monetary perspective, subjects who are certain that another participant volunteers (i.e., report a belief of 100%) should never volunteer themselves as volunteering is costly and the benefit does not increase if there are multiple volunteers. On the other hand, participants with pessimistic beliefs about the volunteering of others should be more likely to volunteer. We regress the volunteering decision on beliefs in a linear probability model to investigate this relation in the data. Figure 5 visualizes those effects in a local linear regression.

Figure 5. Probability to volunteer based on belief about at least one other worker volunteering. Fitted values from a local linear regression with Wang Ryzin Kernel and leave-one-out cross-validated bandwidth
Table 4 shows the results, which are in sharp contrast to Hypothesis 4 and the theoretical predictions from the Volunteer’s Dilemma. There exists a strong positive correlation between beliefs and the volunteering decision. In other words, subjects volunteer more if they believe that it is more likely that there is another volunteer in their team. An increase in the belief by one percentage point increases the probability of volunteering by 0.5 percentage points (Model 1 in Table 4).
Table 4. The volunteering choice explained by beliefs and all treatment dummies in a linear probability model with robust standard errors

Notes:
* p < 0.1; ** p < 0.05; *** p < 0.01. Robust standard errors in parentheses.
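A minimal sketch of the kind of linear probability model reported in Table 4, using illustrative placeholder column names and heteroskedasticity-robust (HC1) standard errors as an assumption:

```python
import pandas as pd
import statsmodels.formula.api as smf

def fit_belief_lpm(df: pd.DataFrame):
    """OLS of the volunteering dummy on the stated belief with robust SEs.

    df is assumed to contain 'volunteered' (0/1), 'belief' (0-100), and a
    'treatment' label per worker; these column names are illustrative placeholders.
    """
    return smf.ols("volunteered ~ belief + C(treatment)", data=df).fit(cov_type="HC1")

# With beliefs measured in percentage points, a coefficient of about 0.005 on
# 'belief' corresponds to the reported 0.5-percentage-point increase per point.
```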
The tendency to contribute to a public good conditional on the contributions of others has been documented in other public good games (see, for instance, Rapoport, 1985; Fischbacher et al., 2001). However, compared to a classical public good game, the payoff remains the same when the number of volunteers increases in our setup. Clearly, this connection between beliefs and actions is not individually rational from a monetary perspective. However, it is well in line with results from the literature on (social or self-) image concerns (for example, Bénabou & Tirole, 2006), with the vast literature on descriptive norms, and with the literature on peer effects (see, for example, Cornelissen et al., 2017, for a recent study of peer effects at the workplace). We interpret this effect as a normative driver of the outcomes of our field experiment. Compared to a standard Volunteer’s Dilemma in the laboratory, it is reasonable to assume that subjects in the workplace environment want to comply with the decisions of their coworkers, that is, they do not want to be seen (or see themselves) as unsocial or lazy by acting differently from what they believe is the ‘correct’ thing to do. While the situation is arguably still very anonymous towards other workers, social image concerns could still apply towards the workplace or the experimenter.
Note that this finding has to be considered jointly with the result that beliefs are, on average, lower in the small group than in the two larger groups (Result 5). That is, while participants in the small group, on average, have lower beliefs about the volunteering of their group members, those with higher beliefs react more strongly to their beliefs than participants with similar beliefs in the larger groups. Therefore, we find differences in beliefs but no differences in volunteering rates across group sizes.
Result 6.
Workers are “conditional volunteers” and volunteer more if they believe others volunteer as well.
While Result 6 offers some insights into the driving forces of volunteering in our setup, it is essential to note that beliefs are not exogenous in this setting. We therefore consider further factors.
4.6. Replication and robustness of our results
Our results on volunteering behavior and beliefs are surprisingly different from previous results from lab and field studies on the Volunteer’s Dilemma. Thus, our first step is to confirm the validity of our results in a replication study. We find qualitatively and quantitatively equivalent results in another experiment with N = 2,733 participants on the same platform: Group size does not correlate with volunteering behavior. Volunteering rates in the replication range from 0.522 to 0.545, compared to the 0.514 to 0.552 reported in this study. Section C.1 in the Appendix provides a detailed discussion of the other experiment demonstrating the robustness of our findings.
In addition, we provide extensive randomization checks in Appendix C. We find that randomization worked very well, and the treatments are balanced according to several observable participant characteristics (see Table B.5 and Figure C.1 in the Appendix C).
Importantly, we find strong evidence that participants are well aware of their group size. First, while volunteering rates do not differ across treatments, beliefs are sensitive to group size, suggesting that participants were aware of it. Second, we explicitly asked participants at the end of the experiment if they remembered their group size, and 93% recalled it correctly (see Table C.2 in Appendix C). This demonstrates that the group size was salient and that participants were aware of it.
Overall, workers understood the incentives involved, so our results are not driven by a general misunderstanding of the task. In a further experiment, discussed in Section 4.7, we included control questions to ensure that there were no misunderstandings of the incentives to volunteer, and we also informed participants about the correct answers. In general, understanding was high, with around 80-85% of subjects answering the key control questions correctly. Crucially, also in the part of that study resembling our main treatments, we find high volunteering rates and no effect of group size.
To ensure that our results regarding beliefs are robust, we thoroughly analyze additional measures obtained in the questionnaire on economic preferences, such as reciprocity, risk preferences, and altruism (see Table C.3 and Section D in the Appendix). We find several main effects – for example, risk preferences negatively correlate with volunteering – but no substantial interaction that would come close to the explanatory power of measured beliefs.
While there is a significant main effect of reciprocity on volunteering, this effect does not explain our main finding that volunteering is insensitive to the group size. Table D.8 in the Appendix shows that there is no significant interaction effect between treatment and reciprocity on volunteering. This makes intuitive sense: if someone is certain that others will volunteer, equalizing the payoffs by volunteering is costly and does not add to anyone’s welfare. This speaks for the robustness of the explanation of beliefs as a driver of normative behavior.
Finally, we would like to reiterate that our baseline treatment allows us to rule out the possibility that participants did not understand the strategic incentives in the experiment or that volunteering was not costly enough. As summarized in Result 1, participants volunteer more when their payoff is independent of the decision of other team members (N = 1) and less when they are not paid for it. Thus, participants understood the strategic incentives of the experiment and that volunteering is costly, further strengthening the validity of our main results.
4.7. Purpose of the task
This section provides a potential explanation for the stark contrast between our robust results and previous (lab) experimental studies. One distinct feature of our work environment is that the task has a purpose beyond simply ensuring payments for the decision-makers and that it takes place in an organizational context. This might change the decision frame of decision-makers (Northcraft & Tenbrunsel, 2011) from an abstract strategic environment to an organizational one. As results by Kim & Murnighan (1997) suggest, measuring the intentions to volunteer within organizational scenarios may “dampen the effects of self-interest and other structural variables”. Thus, instead of perceiving the volunteering choice as a strategic situation in which the group size would be crucial, workers might perceive the game as non-strategic and focus more on the purpose of the task and perhaps their individual costs.
Using an additional experiment with 338 subjects that focuses only on the small and medium group sizes (3 and 30 workers), we actively vary the purpose of the volunteering choice. In the high-importance condition, the instructions are nearly identical to those in our main study. In the main study, workers are informed at the beginning that the task is meaningful because the ratings from the first part, and those of one volunteer, will be used in a scientific study; in the additional experiment, we again emphasize this right before the volunteering decision.Footnote 19 In the low-importance condition, we informed the participants that their work in the volunteering task would not be used, thus “downgrading” the importance of the work to a mere financial bonus task and opening the decision further to strategic considerations.Footnote 20 We provide a screenshot of the decision screens and the translation in Appendix E.
In line with our previous results, we find no significant group size effect in the high-importance setting (see Figure 6, $72.7\%$ vs. $64.1\%$, $\chi^2$-test, p > 0.1). However, we find a significant group size effect in the low-importance treatment, where volunteering under the medium group size is significantly lower ($71.9\%$ vs. $55.1\%$, $\chi^2$-test, p < 0.05). Taken together, we find no level effect of importance, that is, importance does not generally shift the willingness to volunteer.
Our results suggest that the importance or purpose of the task might shift the perception of the situation, with more strategic motives only entering the decision-making process under the low-importance tasks.

Figure 6. The average volunteering rate across the different group sizes and with varying task importance. The error bars represent 95% confidence intervals
5. Discussion and concluding remarks
In this paper, we study volunteering at the workplace. While volunteering as an allocation mechanism in work environments is now widespread and the industry strongly advocates its adoption, economic analysis and empirical results suggest that two factors might impede volunteering. First, the incentives create a coordination problem, that is, who should volunteer; and second, workers prefer others to do the work, often leading to a situation similar to the Volunteer’s Dilemma. In particular, our game-theoretical model and past research predict a higher diffusion of responsibility in larger groups, that is, lower volunteering rates.
Following the long-standing tradition of using experimental methods to understand business and organizational problems (see, for instance, Rapoport & Zwick, 2002; Camerer & Weber, 2013), we report the results of a field experiment with more than 2,000 workers conducted in an online labor market. In our experiment, workers individually worked on a standard classification task and were then asked to volunteer for a similar task to secure a bonus for all their team members. We exogenously vary the team size and thus study the causal effect of team size on volunteering behavior. Additionally, we elicit workers’ beliefs about their co-workers’ volunteering probability.
We find that workers react to the strategic situation of the Volunteer’s Dilemma and to shifts in the incentive structure. Workers volunteer much more if they are alone instead of embedded in a group, and thus in a Volunteer’s Dilemma. They also volunteer much less if they are not paid for the bonus task. Furthermore, in line with our theoretical predictions, workers with lower subjective costs are more likely to volunteer.
In stark contrast to previous results that study volunteering outside of the remote work context, we find no effect of the group size on volunteering behavior. On average, about 53% of workers choose to volunteer across all our main treatments.
We identify workers’ beliefs about the volunteering decision of their team members as a key driver of volunteering in our setting. Workers who believe that it is more likely that at least one other worker volunteers are more likely to volunteer themselves. This conditional volunteering behavior is puzzling from a purely material perspective. However, it aligns with findings by Rapoport et al. (1989) and with the literature on descriptive norms (Bicchieri, 2005). If workers perceive volunteering as a stronger descriptive norm, indicated by higher beliefs, they might be more inclined to follow this norm. A similar explanation can stem from image concerns: workers do not want to be perceived as selfish, by others or by themselves, for not volunteering if they believe that most other workers are volunteering. While this offers some insights into the driving forces of volunteering in our setup, it is essential to note that beliefs are not exogenous in this setting, and we cannot rule out that other confounding factors simultaneously influence beliefs and volunteering choices. Nonetheless, it would be an interesting path for future research to study explicitly how beliefs influence the volunteering choice in a treatment design.
In contrast to previous settings of the Volunteer’s Dilemma, where a group-size effect is usually observed, our workers make decisions in an organizational context with a task that has a purpose beyond simply providing financial incentives. This purpose and the organizational setting might change the perception of the game for workers, putting them in a “work frame” (Northcraft & Tenbrunsel, 2011) and thus dampening strategic considerations like group size. Similar arguments were made in an earlier study by Kim & Murnighan (1997), which also reports no group size effects on intended volunteering in a (framed) workplace scenario. Using additional experiments, we explicitly shut down the purpose of the task by telling workers that the work they do in the additional task will not be used for anything other than determining the bonus. Interestingly, in this setting, we observe a group size effect, with workers in larger groups volunteering significantly less. Thus, having a purpose for the task seems to crowd out some of the strategic considerations often assumed when modeling the volunteering decision theoretically (see also Section 3) or when considering it in more abstract lab experiments. These differences should be analyzed in more detail in future studies, and our results can provide a starting point for an encompassing model of volunteering at the workplace.
Our results suggest that volunteering at the workplace is distinctively different from other volunteering situations. Thereby, we contribute to a growing body of literature studying the scope conditions of group-size effects: Weimann et al. (2019) show that groups of up to 100 members successfully provide public goods, even though free-riding incentives are exceptionally high. The same observation has recently been found in peer punishment (Carpenter, 2007) and third-party punishment experiments (Kamei, 2020).
Those findings have several implications for firms’ organizational structure or, more generally, social groups that rely on volunteering. We deliberately set up the work environment so that workers are not informed whether others worked before them, which allows us to measure their individual willingness to volunteer. However, depending on the exact organizational structure, our results might be good or bad news for volunteering at the workplace. Interpreted strictly within the context of the Volunteer’s Dilemma, the high level of volunteering is wasteful and inefficient. Especially in work settings where it is not possible for a manager to prevent this over-provision, the allocation mechanism has adverse welfare effects. Indeed, these inefficiencies are discussed in open source software development (McConnell, Reference McConnell1999; Kenwood, Reference Kenwood2001). It is especially relevant in workplaces with limited communication between workers or with little oversight by a manager, like in remote work contexts that have become more popular in recent years. Here, managers should focus on smaller team sizes to prevent too much volunteering and reduce inefficiencies.
For organizations that cannot prevent over-provision and only care about finding at least one volunteer, that is, that only care about producing the good, our results are good news. Contrary to what theory or intuition would suggest, giving workers the freedom to self-organize does not hurt commitment. Using larger team sizes might be desirable from the perspective of a manager. This is true even without oversight and without any disciplinary measures. Our findings might thus relate to high volunteering rates in other contexts, such as writing open-source code or Wikipedia articles, even though the number of potential volunteers is sometimes very large.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/eec.2024.13.
Acknowledgements
We acknowledge funding by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) 395336584 and the ZEW - Leibniz Zentrum für Europäische Wirtschaftsforschung. Ethical approval was obtained from the German Association for Experimental Economic Research e.V. and can be accessed under https://gfew.de/ethik/Ft9eR5SK. The replication material for the study is available at https://doi.org/10.17605/OSF.IO/PRWCS.
Statements and declarations
The authors declare that they have no competing financial or non-financial interests.