• PNAS Sustainability Science

How social information can improve estimation accuracy in human groups

  1. Guy Theraulazb,e,1
  1. aLaboratoire de Physique Théorique, CNRS, Université de Toulouse (Paul Sabatier), 31062 Toulouse, France;
  2. bCentre de Recherches sur la Cognition Animale, Centre de Biologie Intégrative, CNRS, Université de Toulouse, 31062 Toulouse, France;
  3. cDepartment of Behavioral Science, Hokkaido University, 060-0810 Sapporo, Japan;
  4. dToulouse School of Economics, Institut National de la Recherche Agronomique (INRA), Université de Toulouse (Capitole), 31000 Toulouse, France;
  5. eInstitute for Advanced Study in Toulouse, 31015 Toulouse, France;
  6. fToulouse School of Economics, Université de Toulouse (Capitole), 31000 Toulouse, France;
  7. gDepartment of Social Psychology, The University of Tokyo, 113-0033 Tokyo, Japan
  1. Edited by Burton H. Singer, University of Florida, Gainesville, FL, and approved October 2, 2017 (received for review March 5, 2017)

Significance

Digital technologies deeply impact the way that people interact. Therefore, it is crucial to understand how social influence affects individual and collective decision-making. We performed experiments where subjects had to answer questions and then revise their opinion after knowing the average opinion of some previous participants. Moreover, unbeknownst to the subjects, we added a controlled number of virtual participants always giving the true answer, thus precisely controlling social information. Our experiments and data-driven model show how social influence can help a group of individuals collectively improve its performance and accuracy in estimation tasks depending on the quality and quantity of information provided. Our model also shows how giving slightly incorrect information could drive the group to a better performance.

Abstract

In our digital and connected societies, the development of social networks, online shopping, and reputation systems raises the questions of how individuals use social information and how it affects their decisions. We report experiments performed in France and Japan, in which subjects could update their estimates after having received information from other subjects. We measure and model the impact of this social information at individual and collective scales. We observe and justify that, when individuals have little prior knowledge about a quantity, the distribution of the logarithm of their estimates is close to a Cauchy distribution. We find that social influence helps the group improve its properly defined collective accuracy. We quantify the improvement of the group estimation when additional controlled and reliable information is provided, unbeknownst to the subjects. We show that subjects’ sensitivity to social influence permits us to define five robust behavioral traits and increases with the difference between personal and group estimates. We then use our data to build and calibrate a model of collective estimation to analyze the impact on the group performance of the quantity and quality of information received by individuals. The model quantitatively reproduces the distributions of estimates and the improvement of collective performance and accuracy observed in our experiments. Finally, our model predicts that providing a moderate amount of incorrect information to individuals can counterbalance the human cognitive bias to systematically underestimate quantities and thereby improve collective performance.

In a globalized, connected, and data-driven world, people rely increasingly on online services to fulfill their needs. AirBnB, Amazon, Ebay, and Trip Advisor, to name just a few, have in common the use of feedback and reputation mechanisms (1) to rate their products, services, sellers, and customers. Ideas and opinions increasingly propagate through social networks, such as Facebook or Twitter (2–4), to the point that they have the power to cause political shifts (5). In this context, it is crucial to understand how social influence affects individual decision-making and its resulting effects at the level of a group.

Two observations can be made about these collective phenomena: (i) people often make decisions not simultaneously but sequentially (6, 7), and (ii) decision tasks involve judgmental/subjective aspects. Social psychological research on group decision-making has established that consensual processes vary greatly depending on the demonstrability of answers (8). When the solution is easy to show, people often follow the “truth-wins” process, whereas when the demonstrability is low, they are much more susceptible to “majoritarian” social influence (9). Thus, collective estimation tasks where correct solutions cannot be easily shown are particularly well suited for measuring the impact of social influence on individuals’ decisions. Galton’s original work (10) on estimation tasks shows that the median of independent estimates of a quantity can be impressively close to its true value. This phenomenon has been popularized as the wisdom of crowds (WOC) effect (11), and it is generally used to measure a group’s performance. However, because of the independence condition, it does not consider potential effects of social influence.

In recent years, it has been debated whether social influence is detrimental to the WOC or not: some works argue that it reduces group diversity without improving the collective error (12, 13), while others show that it is beneficial if one defines collective performance otherwise (14, 15). One or two of the following measures were used to define performance and diversity. Let us define $E_i$ as the estimate of individual $i$, $\langle E_i \rangle$ as its average over all individuals, and $T$ as the true value of the quantity to estimate. Then, $\mathcal{G}_D = \langle (E_i - \langle E_i \rangle)^2 \rangle$ is a measure of group diversity, and $\mathcal{G} = (\langle E_i \rangle - T)^2$ and $\mathcal{G}' = \langle (E_i - T)^2 \rangle$ are two natural measures of the group performance. However, these estimators are not independent, since $\mathcal{G}' = \mathcal{G} + \mathcal{G}_D$, which shows that a decrease in diversity $\mathcal{G}_D$ is beneficial to group performance, as measured by $\mathcal{G}'$, contrary to the general claim. Later research showed that social influence helps the group perform better if one considers only information coming from informed (16), successful (17), or confident (18) individuals. We will show that these traits are actually strongly related. The way that social information is defined also matters: providing individuals with the arithmetic or geometric mean of estimates of other individuals has different consequences (18).
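
The identity $\mathcal{G}' = \mathcal{G} + \mathcal{G}_D$ can be checked numerically. The following is a minimal sketch, not code from the study: the function name `group_measures` and the four estimates are hypothetical, and the averages $\langle \cdot \rangle$ are taken as plain arithmetic means over the group.

```python
import numpy as np

def group_measures(estimates, true_value):
    """Return (G, G_D, G_prime) for a set of estimates of one quantity."""
    e = np.asarray(estimates, dtype=float)
    mean = e.mean()
    g = (mean - true_value) ** 2              # squared error of the group mean
    g_d = np.mean((e - mean) ** 2)            # group diversity (population variance)
    g_prime = np.mean((e - true_value) ** 2)  # mean squared individual error
    return g, g_d, g_prime

# Hypothetical estimates of a quantity whose true value is 100.
g, g_d, g_prime = group_measures([60.0, 90.0, 120.0, 250.0], 100.0)
print(g, g_d, g_prime)  # g + g_d equals g_prime up to rounding
```

Because the decomposition holds for any set of estimates, a change in $\mathcal{G}'$ cannot be interpreted separately from changes in $\mathcal{G}$ and $\mathcal{G}_D$.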

Other than these methodological issues, it is difficult to precisely analyze and characterize the impact of social influence on individual estimates without controlling the quality and quantity of information that is exchanged between subjects. Indeed, human groups are often composed of individuals with heterogeneous expertise; therefore, in a collective estimation task, one cannot rigorously control the quality and quantity of shared social information, and the quantification of individual sensitivity to this information is hence very delicate. To overcome this problem, we performed experiments in which subjects were asked to estimate quantities about which they had very little prior knowledge (low demonstrability of answers) before and after having received social information. The interactions between subjects were sequential and local, while most previous works have used a global kind of interaction, with all individuals being provided some information (estimates of other individuals in the group) at the same time (12–14, 18, 19). From the individuals’ estimates and the social information that they received, we were able to deduce their sensitivity to social influence. Moreover, by introducing virtual experts (artificial subjects providing the true answer, thus affecting social information) in the sequence of estimates, without the subjects being aware of it, we were able to control the quantity and quality of information provided to the subjects and to quantify the impact of this information on the group performance.

Our results show that the subjects’ reaction to social influence is heterogeneous and depends on the distance between personal and group opinion. We then use the data to build and calibrate a model of collective estimation to analyze and predict the impact of information quantity and quality received by individuals on the performances at the group level.

Experimental Design

Subjects were asked to answer questions for which they had to estimate various social, geographical, or astronomical quantities or the number or length of objects in a picture. For each question, the experiment proceeded in two steps: subjects had to first provide their personal estimate $E_p$. Then, after receiving the social information $I$, they were asked to give a new estimate $E$. $I$ is defined as the geometric mean of the $\tau$ previous estimates $E$ ($\tau = 1$ or 3). Subjects answered each question sequentially (SI Appendix, Fig. S1) and were not told the value of $\tau$. Since humans think in terms of orders of magnitude (20), we used the geometric mean for $I$, which averages orders of magnitude, rather than the arithmetic one.

Virtual “experts” providing the true value $E = T$ for each question were inserted at random into the sequence of participants (SI Appendix, Fig. S1). For each sequence involving 20 human participants, we controlled the number $n = 0$, 5, 15, or 80, and hence, the percentage $\rho = n/(n + 20) = 0$, 20, 43, or 80% of virtual experts, respectively. The social information delivered to human participants, being the geometric mean of previous estimates, is hence strongly affected by these virtual experts.
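
The design quantities above can be illustrated as follows. This is a sketch under stated assumptions rather than the experimental software: the function names, the example sequence, and the true value `T = 1000` are hypothetical, and the sketch simply requires at least $\tau$ previous estimates instead of modeling the initial conditions used in the experiment.

```python
import math

N_HUMANS = 20  # human participants per sequence, as stated in the text

def expert_percentage(n_experts, n_humans=N_HUMANS):
    """Percentage rho = n / (n + n_humans) of virtual experts in a sequence."""
    return 100.0 * n_experts / (n_experts + n_humans)

def social_information(previous_estimates, tau):
    """Geometric mean of the tau most recent estimates (experts included)."""
    if tau < 1 or len(previous_estimates) < tau:
        raise ValueError("need at least tau previous estimates")
    recent = previous_estimates[-tau:]
    # Averaging logarithms and exponentiating gives the geometric mean.
    return math.exp(sum(math.log(e) for e in recent) / tau)

percentages = [round(expert_percentage(n)) for n in (0, 5, 15, 80)]
print(percentages)

# Hypothetical sequence for a question whose true value is T = 1000:
# two human estimates followed by one virtual expert answering exactly T.
T = 1000.0
info = social_information([10.0, 100.0, T], tau=3)
print(info)
```

In this example a single expert among the last three answers moves the social information by a factor of about 3.2 relative to the geometric mean of the two human estimates alone, which illustrates how strongly the experts shape what subjects see.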

When providing their estimates $E_p$ and $E$, subjects had to report their confidence level in their answer on a Likert scale ranging from one (very low) to five (very high) and were asked to choose the reason that best explained their second estimate among a list of eight possibilities. We used initial conditions for the social information $I$ chosen reasonably far from the true answer $T$ and imposed loose limits to the estimates that subjects could give to prevent them from answering too absurdly. All graphs presented here are based on the 29 questions ($5{,}394 \times 2$ prior and final estimates) from the experiment performed in France. A similar experiment was conducted in Japan; all results can be found in SI Appendix, where the full experimental protocol is described in detail.

The aims and procedures of the experiments conformed to the ethical rules imposed by the Toulouse School of Economics and the Center for Experimental Research in Social Sciences at Hokkaido University. All subjects in France and Japan provided written consent for their participation.

Results

Distribution of Individual Estimates.

Previous works have shown that distributions of independent individual estimates are generally highly right-skewed, while distributions of their common logarithm are much more symmetric (12, 13, 18). This is because humans think in terms of orders of magnitude, especially when large quantities are involved, which makes the logarithmic scale more natural to represent human estimates (20). In these works, participants were mostly asked “easy” questions for which they had good prior knowledge (high demonstrability), such that the answers ranged over one to two orders of magnitude at most (12–14, 17–19, 21–23). To ensure that little information was present before the inclusion of our virtual experts and to more clearly identify the impact of social influence, we selected “hard” questions (low demonstrability). These questions involve very large quantities, and answers span several orders of magnitude, making the log transform of estimates even more relevant. To compare quantities that can differ by orders of magnitude, we normalize each estimate $E$ by the true answer $T$ to the question at hand and define the log-transformed estimate $X = \log(E/T)$. Note that the log transform of the actual answer $T$ is $X = 0$.

Fig. 1A shows the distribution of $X$ before and after social information has been provided to the subjects (SI Appendix, Table S1). Although such distributions have often been presented as close to Gaussian distributions (13, 18), we find that they are much better described by Cauchy distributions because of their fat tails, which account for the nonnegligible probability of estimates extremely far from the truth. The Cauchy probability distribution function reads
$$f(X, m, \sigma) = \frac{1}{\pi}\,\frac{\sigma}{(X - m)^2 + \sigma^2}, \tag{1}$$
where $m$ is the center/median and $\sigma$ is the width of the distribution. SI Appendix, Fig. S2A shows the distribution of estimates in the Japan experiment, and SI Appendix, Fig. S2B shows that, when the same questions were asked, distributions of personal estimates in France and Japan are almost identical.

For the Cauchy distribution, the mean and standard deviation (SD) are not defined. Therefore, good estimators of $m$ and $\sigma$ are the median and one-half the interquartile range (the difference between the third and the first quartiles) of the experimental distribution, respectively. In the following, $m_p$ ($m$) and $\sigma_p$ ($\sigma$) will refer to the median and one-half the interquartile range of the experimental distribution before social influence (after social influence), respectively.
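
These two estimators can be sketched as follows. The code is illustrative only: the helper name `cauchy_center_width`, the seed, and the synthetic parameters are assumptions, and the data are simulated rather than taken from the experiment. It relies on the fact that the quartiles of a Cauchy distribution lie at $m \pm \sigma$, so that one-half the interquartile range equals $\sigma$.

```python
import numpy as np

def cauchy_center_width(x):
    """Estimate (m, sigma) as the median and one-half the interquartile range."""
    q1, med, q3 = np.percentile(np.asarray(x, dtype=float), [25, 50, 75])
    return med, 0.5 * (q3 - q1)

# Synthetic log-estimates drawn from a Cauchy law with assumed parameters.
rng = np.random.default_rng(0)
m_true, sigma_true = -0.5, 1.2
sample = m_true + sigma_true * rng.standard_cauchy(200_000)

m_hat, sigma_hat = cauchy_center_width(sample)
print(m_hat, sigma_hat)                 # close to the assumed (-0.5, 1.2)
print(sample.mean(), sample.std())      # dominated by a few extreme values
```

The sample mean and SD printed on the last line are not reliable summaries here, since they are driven by the heaviest-tailed draws, whereas the quantile-based estimators remain stable.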

Cauchy and Gaussian distributions belong to the so-called stable distributions family. More generally, $\{X_i\}$ being a set of estimates drawn from a symmetric probability distribution $f$ characterized by its center $m$ and width $\sigma$, we define the weighted average $X' = \sum_i p_i X_i$, with $\sum_i p_i = 1$; $f$ is a stable distribution if $X'$ has the same probability distribution $f$ as the original $X_i$, up to the new width $\sigma'$. Indeed, the center $m$ remains the same because of the condition $\sum_i p_i = 1$, but the width may decrease after averaging (law of large numbers), depending on the stable distribution $f$ considered. Cauchy and Gaussian represent two extremes of the stable distribution family, with Lévy distributions being intermediate cases: for the Cauchy distribution, the width $\sigma$ remains unchanged, whereas the narrowing of $\sigma$ is maximum for the Gaussian distribution (SI Appendix). In the case of actual human estimates, the relevance of a certain distribution $f$ can be related to the degree of prior knowledge of the group. When individuals have no idea about the answer to a question, the weighted average of arbitrary answers cannot be statistically better ($\sigma' < \sigma$) or worse ($\sigma' > \sigma$) than the arbitrary answers themselves, leading to a Cauchy distribution for these estimates (the only distribution for which $\sigma' = \sigma$). However, when there is a good prior knowledge, one expects that combining answers gives a better statistical estimate ($\sigma' < \sigma$; Gaussian). When the quantity to estimate is closely related to general intuition (ages, dates, etc.), estimates should hence follow a Gaussian-like distribution, while when individuals have very little knowledge about the answer, as in our experiment, estimates should be Cauchy-like distributed. The rationale for naturally observing stable distributions is explained in SI Appendix.
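
The contrast between the two extremes can be illustrated with a short simulation. This is a sketch under assumptions that are not part of the study: equal weights $p_i = 1/k$ with $k = 10$, standard (unit-width) distributions, a fixed seed, and one-half the interquartile range as the width measure for both laws.

```python
import numpy as np

def half_iqr(x):
    """Width estimate: one-half the interquartile range."""
    q1, q3 = np.percentile(x, [25, 75])
    return 0.5 * (q3 - q1)

rng = np.random.default_rng(1)
n_groups, k = 100_000, 10  # number of simulated groups, estimates averaged per group

cauchy = rng.standard_cauchy((n_groups, k))
gauss = rng.standard_normal((n_groups, k))

# Ratio sigma' / sigma: width of the equal-weight average over width of one estimate.
ratio_cauchy = half_iqr(cauchy.mean(axis=1)) / half_iqr(cauchy[:, 0])
ratio_gauss = half_iqr(gauss.mean(axis=1)) / half_iqr(gauss[:, 0])

print(ratio_cauchy)  # close to 1: averaging Cauchy estimates does not narrow them
print(ratio_gauss)   # close to 1/sqrt(10): Gaussian estimates narrow when averaged
```

With these settings the averaged Cauchy estimates are as wide as a single estimate, while the averaged Gaussian estimates narrow by roughly a factor $\sqrt{k}$.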

We use the term Cauchy-like, because Fig. 1A shows that the distributions of prior ($X_p$) and final ($X$) estimates are slightly skewed toward low estimates ($X < 0$), reminiscent of the human cognitive bias to underestimate numbers, because of the nonlinear internal representation of quantities (24). As we will show, this phenomenon has strong implications on the influence of information provided to the group. We also observe a clear sharpening of the distribution of estimates after social influence, mainly caused by the presence of the virtual experts, which affects the value of the social information $M = \log(I/T)$ and, ultimately, the final estimate $X$ of the actual subjects. This sharpening becomes stronger as the percentage of experts increases (SI Appendix, Fig. S3).

Moreover, consistent with our introductory discussion of the measurement methods of group performance, we propose the two following indicators: (i) collective performance $|\mathrm{median}(X_i)|$, which represents how close the center of the distribution is to zero (the log transform of the true value $T$), and (ii) collective accuracy $\mathrm{median}(|X_i|)$, which is a measure of the proximity of individual estimates to the true value.
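
The difference between the two indicators can be seen on a small example. The following sketch uses hypothetical estimates of a quantity with true value 100 and assumes a base-10 logarithm for $X$; the function names are illustrative.

```python
import numpy as np

def collective_performance(x):
    """|median(X_i)|: distance of the center of the distribution from the truth."""
    return abs(np.median(x))

def collective_accuracy(x):
    """median(|X_i|): typical distance of individual estimates from the truth."""
    return np.median(np.abs(x))

# Hypothetical estimates, log-transformed and normalized by the true value T = 100.
estimates = np.array([20.0, 50.0, 100.0, 400.0, 5000.0])
x = np.log10(estimates / 100.0)

print(collective_performance(x))  # 0.0: the median estimate is exactly the truth
print(collective_accuracy(x))     # about 0.60: a typical individual is off by a factor of 4
```

A group can therefore be perfectly centered on the truth while its members remain individually far from it, which is why both indicators are reported. For both quantities, smaller values are better.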

Distribution of Individual Sensitivities to Social Influence.

After having received social information, an individual $i$ may reconsider her personal estimate ${E_p}_i$. The natural way for humans to aggregate estimates is to use the median (22) or the geometric mean (18), which both tend to reduce the effect of outliers. Here, the social information that we provided to the subject was the geometric mean of the $\tau$ previous answers (including that of the virtual experts providing the true answer $E_i = T$): $I_i = \left( \prod_{j=i-\tau}^{i-1} E_j \right)^{1/\tau}$. Moreover, one can always represent the new estimate $E_i$ as the weighted geometric average of the personal estimate ${E_p}_i$ and the social information $I_i$. Hence, we can uniquely define the sensitivity to social influence $S_i$ by $E_i = {E_p}_i^{\,1 - S_i} \, I_i^{\,S_i}$. The value $S_i = 0$ corresponds to subjects keeping their initial estimates, while $S_i = 1$ corresponds to subjects adopting the estimate of their peers. In terms of log-transformed variables $X_i = \log(E_i/T)$, we obtain
$$X_i = (1 - S_i)\,{X_p}_i + S_i M_i, \tag{2}$$
where the log-transformed social information is simply the arithmetic mean $M_i = (1/\tau) \sum_{j=i-\tau}^{i-1} X_j$, and thus, $S_i = (X_i - {X_p}_i)/(M_i - {X_p}_i)$. Note that, in this language, $S_i$ is simply the barycenter coordinate of the final estimate in terms of the initial personal estimate and the social information.

Fig. 1B shows that the experimental distribution of $S$ has a bell-shaped part that we roughly approximate by a Gaussian, with two additional Dirac peaks exactly at $S = 0$ and $S = 1$ (SI Appendix, Table S2 shows the numerical values). Five types of behavioral responses can be identified: keeping one’s opinion (peak at $S = 0$), adopting the group’s opinion (peak at $S = 1$), making a compromise between one’s opinion and the group’s opinion ($0 < S < 1$), overreacting to social information ($S > 1$), and contradicting it ($S < 0$). Quite surprisingly, overreacting and contradicting responses were generally overlooked in previous works (21–23, 25): they were either considered as noise and simply not taken into account or sometimes included in the peaks at $S = 0$ and $S = 1$, even though these behaviors are not negligible (especially overreacting). We find that the median of $S$ is 0.34, in agreement with previous results (15, 18, 25), meaning that individuals tend to give more weight to their own opinion than to information coming from others (14, 19). Moreover, the distributions of $S$ for the experiment performed in Japan and for men and women (in France) are very similar to that of Fig. 1B (SI Appendix, Fig. S4).
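
The five behavioral responses can be read as a simple partition of the $S$ axis. The sketch below is one possible way to label a measured sensitivity; the function name and the optional tolerance `tol` are hypothetical (the text describes the peaks as lying exactly at $S = 0$ and $S = 1$, which corresponds to `tol=0.0`).

```python
def classify_response(s, tol=0.0):
    """Map a sensitivity S to one of the five behavioral responses."""
    if abs(s) <= tol:
        return "keep"        # peak at S = 0
    if abs(s - 1.0) <= tol:
        return "adopt"       # peak at S = 1
    if s < 0:
        return "contradict"  # S < 0
    if s > 1:
        return "overreact"   # S > 1
    return "compromise"      # 0 < S < 1


labels = [classify_response(s) for s in (0.0, 1.0, 0.34, 1.6, -0.2)]
```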

We find that the subjects’ behavioral reactions are highly consistent, reflecting robust differences in personality or general knowledge. In each session, we split the subjects into three subgroups according to the way that they modified their estimates on average in the first 24 questions. We first define “confident” subjects as the one-quarter of the group minimizing $\langle |S_q| \rangle_q$, where $q$ is the index of the questions (i.e., the subjects who were on average closest to $S = 0$), and the “followers” as the one-quarter of the group minimizing $\langle |1 - S_q| \rangle_q$ (i.e., closest to $S = 1$). The other one-half of the group is defined as the “average” subjects. SI Appendix, Fig. S5 shows the distributions of $S$ for the three subgroups computed from questions 25–29. The differences are striking (SI Appendix, Fig. S6): for the group of confident subjects, the peak at $S = 0$ is about seven times higher than the peak at $S = 1$, while for the group of followers, it is less than twice as high. Moreover, the distribution for average subjects is found to be very close to the global distribution shown in Fig. 1B.
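
The subgroup definition can be sketched as follows. The function name and the toy data are hypothetical. The text does not say how a subject who would fall in both quarters is handled; this sketch assigns the confident quarter first and picks followers among the remaining subjects, which is an assumption.

```python
from statistics import mean


def split_subgroups(s_by_subject):
    """Split subjects into confident, followers, and average subgroups.

    s_by_subject maps a subject id to the list of its S values over the
    questions used for the classification.
    """
    n_quarter = len(s_by_subject) // 4

    # Confident: the quarter minimizing <|S_q|>_q (closest to S = 0).
    by_keep = sorted(s_by_subject,
                     key=lambda k: mean(abs(s) for s in s_by_subject[k]))
    confident = set(by_keep[:n_quarter])

    # Followers: the quarter minimizing <|1 - S_q|>_q (closest to S = 1),
    # chosen among the remaining subjects (assumption of this sketch).
    rest = [k for k in s_by_subject if k not in confident]
    by_follow = sorted(rest,
                       key=lambda k: mean(abs(1 - s) for s in s_by_subject[k]))
    followers = set(by_follow[:n_quarter])

    average = set(s_by_subject) - confident - followers
    return confident, followers, average


# Toy data: four subjects, two questions each.
toy = {"a": [0.0, 0.1], "b": [1.0, 0.9], "c": [0.4, 0.5], "d": [0.3, 0.6]}
confident, followers, average = split_subgroups(toy)
```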

Impact of the Difference Between Personal and Group’s Opinions on Individual Sensitivity to Social Influence.

Fig. 2A shows that, on average, $S$ depends on the distance $D = \log(E_p/I) = X_p - M$ between personal and group estimates. Up to a threshold of $t \approx 2.5$ orders of magnitude, there is a linear cusp relation between $S$ and $D$. The farther the social information $M$ is from a subject’s personal estimate $X_p$, the more the subject tends to trust the group (i.e., $S$ increases). Fig. 2B shows the origin of this correlation: as social information gets farther from personal opinion, the probability of keeping one’s opinion ($S = 0$) decreases, while the probability of compromising increases. Interestingly, the adopting behavior does not change with $D$. The same phenomena have been observed in the Japan experiment (SI Appendix, Fig. S8).
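
A curve of mean sensitivity against distance can be obtained by binning the pairs $(D, S)$ and averaging $S$ within each bin. The sketch below shows one way to do this; the function name, the bin width, and the toy data are hypothetical, and the binning actually used for Fig. 2A is not specified here.

```python
import math
from collections import defaultdict


def mean_sensitivity_by_distance(pairs, bin_width=0.5):
    """Average S in bins of D; returns {bin center: mean S}."""
    bins = defaultdict(list)
    for d, s in pairs:
        bins[math.floor(d / bin_width)].append(s)
    return {(idx + 0.5) * bin_width: sum(vals) / len(vals)
            for idx, vals in sorted(bins.items())}


# Toy data: (D, S) pairs.
toy_pairs = [(-1.2, 0.4), (-1.1, 0.6), (0.2, 0.1), (0.3, 0.3)]
curve = mean_sensitivity_by_distance(toy_pairs)
```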

Fig. 2.

(A) Mean sensitivity to social influence $S$ against the distance $D = X_p - M$ between personal estimate $X_p$ and social information $M$ (group estimate). Black circles correspond to experimental data, while red open circles are simulations of the model. Note that only about 14% of data are beyond three orders of magnitude. (B) Fraction of subjects keeping (maroon), adopting (pink), and being in the Gaussian-like part of the distribution of $S$ (mostly compromisers; purple) against $D$.

Model.

We now introduce an individual-based model to understand the respective effects of individual sensitivity to social influence and information quality and quantity on collective performance and accuracy observed at the group level. In the model, we simulate a sequence of 20 successive estimates performed by the agents (not including the virtual experts). A typical run of the model consists of the following steps for a given condition $(\rho, \tau)$.

  • i) An initial condition $X_0$ is chosen at random according to the experimental ratios of initial conditions.

  • ii) With probability $\rho$, the true value zero is introduced into the sequence, and with probability $(1 - \rho)$, an agent plays.

  • iii) The agent first draws its personal estimate $X_p$ from a Cauchy distribution $f(X_p, m_p, \sigma_p)$ restricted to $[-7, 7]$.

  • iv) The agent receives, as social information, the average $M$ of the $\tau$ previous final estimates.

  • v) The agent chooses its sensitivity to social influence $S$, consistent with the results of Figs. 1B and 2. In particular, $S$ is drawn from a Gaussian distribution of mean $m_{\mathrm{g}}$ with probability $P_{\mathrm{g}}$ or takes the value $S = 0$ or $S = 1$ with probability $P_0$ and $P_1 = 1 - P_0 - P_{\mathrm{g}}$, respectively. $P_0$ and $P_{\mathrm{g}}$ have a linear cusp dependence on $D = X_p - M$, while $P_1$ is kept independent of $D$. For a given value of $D$, the average sensitivity is $\langle S \rangle = P_0 \times 0 + P_1 \times 1 + P_{\mathrm{g}} \times m_{\mathrm{g}} = \alpha + \beta |D|$, where $\alpha$ and the slope $\beta$ are extracted from Fig. 2A. $P_{\mathrm{g}}$ is hence given by $P_{\mathrm{g}} = (\alpha + \beta |D| - P_1)/m_{\mathrm{g}}$. The threshold $t$ is determined consistently by the condition $S_{\mathrm{max}} = \alpha + \beta t$, where $S_{\mathrm{max}}$ is the value of the plateau beyond $t$ in Fig. 2A. The values of all parameters are reported in SI Appendix, Table S3.

  • vi) $S$ being drawn, the final estimate $X$ is given by Eq. 2. One starts again from step ii for the next agent.
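
Steps i to vi can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: all parameter values in `PARAMS` are placeholders (the calibrated values are in SI Appendix, Table S3), the Gaussian width `sigma_g` is a hypothetical name not given in the text, the history is seeded with $\tau$ copies of $X_0$, the mean sensitivity is held at its plateau beyond the threshold $t$, and $P_{\mathrm{g}}$ is clamped so that all three probabilities stay valid.

```python
import math
import random

# Placeholder parameters (hypothetical; not the calibrated values).
PARAMS = dict(m_p=-0.1, sigma_p=0.7,   # Cauchy center and width for X_p
              m_g=0.6, sigma_g=0.3,    # Gaussian part of the S distribution
              p1=0.05,                 # probability of adopting (S = 1)
              alpha=0.2, beta=0.1, s_max=0.45)


def truncated_cauchy(rng, m, sigma, lo=-7.0, hi=7.0):
    """Step iii: rejection-sample a Cauchy(m, sigma) restricted to [lo, hi]."""
    while True:
        x = m + sigma * math.tan(math.pi * (rng.random() - 0.5))
        if lo <= x <= hi:
            return x


def draw_sensitivity(rng, d, p):
    """Step v: peaks at S = 0 and S = 1 plus a Gaussian part."""
    t = (p["s_max"] - p["alpha"]) / p["beta"]         # from S_max = alpha + beta * t
    mean_s = p["alpha"] + p["beta"] * min(abs(d), t)  # linear cusp, plateau beyond t
    p_g = (mean_s - p["p1"]) / p["m_g"]
    p_g = min(max(p_g, 0.0), 1.0 - p["p1"])           # keep probabilities valid
    u = rng.random()
    if u < p_g:
        return rng.gauss(p["m_g"], p["sigma_g"])
    if u < p_g + p["p1"]:
        return 1.0
    return 0.0                                        # remaining probability P_0


def run_sequence(rho, tau, x0, n_agents=20, p=PARAMS, seed=None):
    """Simulate one sequence; returns a list of (X_p, X) for the agents."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must satisfy 0 <= rho < 1")
    rng = random.Random(seed)
    history = [x0] * tau              # step i (seeding choice is an assumption)
    results = []
    while len(results) < n_agents:
        if rng.random() < rho:        # step ii: a virtual expert gives the truth
            history.append(0.0)
            continue
        x_p = truncated_cauchy(rng, p["m_p"], p["sigma_p"])  # step iii
        m = sum(history[-tau:]) / tau                        # step iv
        s = draw_sensitivity(rng, x_p - m, p)                # step v
        x = (1 - s) * x_p + s * m                            # step vi (Eq. 2)
        history.append(x)
        results.append((x_p, x))
    return results


sequence = run_sequence(rho=0.4, tau=3, x0=0.0, seed=1)
```

Repeating `run_sequence` many times for each $(\rho, \tau)$ and pooling the results gives distributions of estimates before (`X_p`) and after (`X`) social influence.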

Comparison Between Theoretical and Experimental Results.

For all graphs, we ran 100,000 simulations, so that the error bars on the model predictions are negligible. Fig. 1B shows that the distribution of sensitivities to social influence $S$ obtained in the model (red curve in Fig. 1B) is similar by construction to the experimental one. Also, by construction of the model (step v above), the cusp dependence of the sensitivity to social influence on $D = X_p - M$ is well reproduced by the model (Fig. 2A, red curve with open symbols). We now address several nontrivial predictions of the model.

Estimates after social influence.

Fig. 1A (all values of $\rho$ aggregated) and SI Appendix, Fig. S3 (for each $\rho$) show that the distributions of estimates predicted by the model compare favorably with the experimental results (before and after social influence). Social influence leads to the sharpening of the distributions of estimates, and this effect increases as more information is provided to the group.

Impact of social information on collective performance.

Fig. 3 shows the collective performance (precisely defined above) and the width of the distribution of estimates for the different values of $\rho$ and $\tau$. The collective performance is zero when the distribution is centered on the true value, so that the closer it is to zero, the better. As expected, when $\rho = 0\%$, no significant improvement is observed in the collective performance. Then, as $\rho$ increases, the center gets closer to the true value, and the width decreases accordingly, as was also observed in the experiments in Japan (SI Appendix, Fig. S9). Note that the experimental error bars (SI Appendix describes their computation) decrease after social influence, reflecting the decrease of the width of the estimate distribution after social influence and the driving of people’s opinion by the virtual experts.

Fig. 3.

Collective performance, defined as the absolute value of the median of estimates (A), and width of the distribution of estimates (B) for all $(\rho, \tau)$ before (blue) and after (red) social influence. Both improve with $\rho$ after social influence, except for the collective performance at $\rho = 0\%$. Full circles correspond to experimental data, while open circles represent the predictions of the full model. The black lines are the predictions of the simple solvable model presented in SI Appendix. For $\rho = 60\%$, only model predictions are available.

The collective performance and estimate distribution width predicted by the model (Fig. 3, open circles) are in good agreement with those observed in the experiment. The very small effect of $\tau$, only reliably observed in the model in Fig. 3A, is explained in SI Appendix. As shown there, a simpler model, where we neglect the dependence of $S$ on $D = X_p - M$ (Fig. 2A), can be analytically solved. It leads to fair predictions (black lines in Fig. 3), although it tends to underestimate the collective performance improvement and does not capture the reduction of the distribution width already observed at $\rho = 0\%$. This model guided us in designing our experiments, and its relative failure motivated us to investigate the phenomenon illustrated in Fig. 2 and included in the full model described above.

Impact of sensitivity to social influence on collective accuracy.

Fig. 4 (SI Appendix, Fig. S11 shows an alternative representation) shows the collective accuracy for the five categories of behavioral responses identified in Fig. 1B and for the whole group before and after social information has been provided. Before social influence, keeping leads to the best accuracy, while adopting and overreacting behaviors are associated with the worst accuracy. However, as more reliable information is indirectly provided by the experts, and in particular for $\rho \geq 40\%$, adopting and overreacting lead to the best accuracy after social influence (14, 19). The contradicting behavior is the only one for which the accuracy deteriorates after social influence. Finally, compromising leads to a systematic improvement of the accuracy as the percentage of experts increases (better than keeping for $\rho \geq 40\%$), very similar to that of the whole group. The collective accuracy for each behavioral category is again fairly well predicted by the model (we discuss below the disagreement between model predictions and experimental data in Fig. 4 for the adopters before social influence).

Fig. 4.

Collective accuracy (median distance to the truth of individual estimates) before (blue) and after (red) social influence against $\rho$ for the five behavioral categories identified in Fig. 1B and for the whole group (all). Adopting leads to the sharpest improvement and the best accuracy for $\rho \geq 40\%$. Full circles correspond to experimental data, while open circles represent the predictions of the model (including for $\rho = 60\%$, a case not tested experimentally).
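
The two group-level measures named in the captions of Figs. 3 and 4 differ in where the absolute value is taken. The sketch below illustrates this with made-up log-transformed estimates, for which the truth corresponds to $X = 0$; the function names are hypothetical.

```python
from statistics import median


def collective_performance(xs):
    """Absolute value of the median of the log-transformed estimates."""
    return abs(median(xs))


def collective_accuracy(xs):
    """Median distance to the truth (X = 0) of the individual estimates."""
    return median(abs(x) for x in xs)


# Made-up estimates, symmetric around the truth but individually far from it.
toy_estimates = [-2.0, -1.0, 1.0, 2.0]
performance = collective_performance(toy_estimates)
accuracy = collective_accuracy(toy_estimates)
```

In this toy case, the group is well centered on the truth (performance of zero) even though a typical individual is more than one order of magnitude off, which is why the two measures are reported separately.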

The sensitivity to social influence and the collective accuracy are strongly related to confidence (SI Appendix, Fig. S10). The more confident the subjects, the less they tend to follow the group and the better their accuracy is, especially before social influence. This links confident (18), informed (16), and successful (17) individuals: they are generally the same persons. However, individuals who are too confident (keeping behavior; arguably because they have an idea about the answer, hence their good accuracy before social influence) tend to discard others’ opinions. Although this might sometimes work, especially if no external information is provided ($\rho = 0\%$), they lose the opportunity to benefit from valuable information learned by others. Meanwhile, adopting and overreacting subjects have poor confidence and accuracy before social influence, arguably because they do not know much about the questions. Note that the model, which does not include any notion of confidence or heterogeneous prior knowledge, overestimates the accuracy before social influence for the adopting behavior. However, even at $\rho = 0\%$, adopting subjects perform about as well as the other categories after social influence. In fact, if enough information is provided ($\rho = 80\%$), they are even able to reach almost perfect collective accuracy. Similar results have been found in the Japan experiment, as shown in SI Appendix, Fig. S12. SI Appendix, Figs. S13–S15 show similar graphs for the collective performance in France and Japan.

Predicting the effect of incorrect information given to the human group by virtual agents.

We used the model to investigate the influence on the group performance of the quality and quantity of information delivered to the group (i.e., the value $V$ of the answer provided by the percentage $\rho$ of virtual agents). In our experiments, the group was provided with the (log transform of the) true value $V = 0$ (the agents were experts). We expect a deterioration of the collective performance and accuracy as $V$ moves too far away from zero and as a greater amount of incorrect information is delivered to the group (by increasing $\rho$). The optimum collective accuracy is reached for a strictly positive $V$, whatever the value of $\rho > 0$ (SI Appendix, Fig. S16), as also predicted by our simple analytical model. Hence, incorrect information can be beneficial to the group: providing the group with overestimated values can counterbalance the human cognitive bias to underestimate quantities (24).

Discussion

Quantifying how social information affects individual estimations and opinions is a crucial step to understand and model the dynamics of collective choices or opinion formation (26). Here, we have measured and modeled the impact of social information at individual and collective scales in estimation tasks with low demonstrability. By controlling the quantity and quality of information delivered to the subjects, unbeknownst to them, we have been able to precisely quantify the impact of social influence on group performance. We also tested and confirmed the cross-cultural generality of our results by conducting experiments in France and Japan.

We showed and justified that, when individuals have poor prior knowledge about the questions, the distribution of their log-transformed estimates is close to a Cauchy distribution. The distribution of the sensitivity to social influence $S$ is bell-shaped (contradict, compromise, overreact), with two additional peaks exactly at $S = 0$ (keep) and $S = 1$ (adopt), which leads to the definition of robust social traits, as checked by further observing the subjects inclined to follow these behaviors. When subjects have little prior knowledge, we found that their sensitivity to social influence increases (linear cusp) with the difference between their estimate and that of the group, at variance with what was found in ref. 19 for questions where subjects had a high prior knowledge.

We used these experimental observations to build and calibrate a model that quantitatively predicts the sharpening of the distribution of individual estimates and the improvement in collective performance and accuracy as the amount of good information provided to the group increases. This model could be directly applied or straightforwardly adapted to similar situations where humans have to integrate information from other people or external sources.

We studied the impact of virtual experts on the group performance, a methodology allowing us to rigorously control the quantity ($\rho$) and quality ($V$) of the information provided to a group with little prior knowledge. These virtual experts can be seen either as an external source of information accessible to individuals (e.g., the Internet, social networks, media, etc.) or as a very cohesive (all having the same opinion $V$) and overconfident (all having $S = 0$) subgroup of the population, such as can happen with “groupthink” (27). When these experts provide reliable information to the group, a systematic improvement in collective performance and accuracy is obtained experimentally and is quantitatively reproduced by our model. Moreover, if the experts are not too numerous and the information that they give is slightly above the true value, the model predicts that social influence can help the group perform even better than when the truth is provided, as this incorrect information compensates for the human cognitive bias to underestimate quantities.

We also showed that the sensitivity to social influence is strongly related to confidence and accuracy: the most confident subjects are generally the best performers and tend to weight the opinion of others less. When the group has access to more reliable information, this behavior becomes detrimental to individual and collective accuracy, as too confident individuals lose the opportunity to benefit from this information.

Overall, we showed that individuals, even when they have very little prior knowledge about a quantity to estimate, are able to use information from their peers or from the environment to collectively improve the group performance as long as this information is not highly misleading. Ultimately, getting a better understanding of these influential processes opens perspectives to develop information systems aimed at enhancing cooperation and collaboration in human groups, thus helping crowds become smarter (28, 29).

Future research will have to focus on the experimental validation of our theoretical predictions when providing incorrect information to the group, with the intriguing possibility of actually improving its performance. It would also be interesting to study the impact on the group performance of the number of estimates given as social information (instead of only their mean) and of revealing the confidence and/or reputation of those who share these estimates.

Acknowledgments

We thank Ofer Tchernichovski for his valuable comments. This work was supported by Agence Nationale de la Recherche project 11-IDEX-0002-02–Transversalité–Multi-Disciplinary Study of Emergence Phenomena, a grant from the CNRS Mission for Interdisciplinarity (project SmartCrowd, AMI S2C3), and by Program Investissements d’Avenir under Agence Nationale de la Recherche program 11-IDEX-0002-02, reference ANR-10-LABX-0037-NEXT. B.J. was supported by a doctoral fellowship from the CNRS, and R.E. was supported by Marie Curie Core/Program Grant Funding Grant 655235–SmartMass. T.K. was supported by Japan Society for the Promotion of Science Grant-in-Aid for Scientific Research JP16H06324 and JP25118004.

Footnotes

  • ¹To whom correspondence should be addressed. Email: guy.theraulaz{at}univ-tlse3.fr.
  • Author contributions: B.J., C.S., and G.T. designed research; B.J., H.-r.K., R.E., S.C., A.B., T.K., C.S., and G.T. performed research; B.J., C.S., and G.T. analyzed data; and B.J., C.S., and G.T. wrote the paper.

  • The authors declare no conflict of interest.

  • This article is a PNAS Direct Submission.

  • This article contains supporting information online at www.danielhellerman.com/lookup/suppl/doi:10.1073/pnas.1703695114/-/DCSupplemental.
