
Bayesian posteriors for arbitrarily rare events

  1. Lorens A. Imhof (c, d, 1)
  1. (a) Department of Economics, Massachusetts Institute of Technology, Cambridge, MA 02139;
  2. (b) Department of Economics, Harvard University, Cambridge, MA 02138;
  3. (c) Department of Statistics, Bonn University, 53113 Bonn, Germany;
  4. (d) Hausdorff Center for Mathematics, Bonn University, 53113 Bonn, Germany
  1. Contributed by Drew Fudenberg, March 27, 2017 (sent for review November 14, 2016; reviewed by Keisuke Hirano, Demian Pouzo, and Bruno Strulovici)

Significance

Many decision problems in contexts ranging from drug safety tests to game-theoretic learning models require Bayesian comparisons between the likelihoods of two events. When both events are arbitrarily rare, a large data set is needed to reach the correct decision with high probability. The best result in previous work requires the data size to grow so quickly with rarity that the expectation of the number of observations of the rare event explodes. We show for a large class of priors that it is enough that this expectation exceeds a prior-dependent constant. However, without some restrictions on the prior the result fails, and our condition on the data size is the weakest possible.

Abstract

We study how much data a Bayesian observer needs to correctly infer the relative likelihoods of two events when both events are arbitrarily rare. Each period, either a blue die or a red die is tossed. The two dice land on side 1 with unknown probabilities $p_1$ and $q_1$, which can be arbitrarily low. Given a data-generating process where $p_1 \geq c q_1$, we are interested in how much data are required to guarantee that with high probability the observer’s Bayesian posterior mean for $p_1$ exceeds $(1-\delta)c$ times that for $q_1$. If the prior densities for the two dice are positive on the interior of the parameter space and behave like power functions at the boundary, then for every $\epsilon > 0$, there exists a finite $N$ so that the observer obtains such an inference after $n$ periods with probability at least $1-\epsilon$ whenever $n p_1 \geq N$. The condition on $n$ and $p_1$ is the best possible. The result can fail if one of the prior densities converges to zero exponentially fast at the boundary.
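The statement above can be explored numerically with a minimal sketch. It is not the authors' construction: it assumes two-sided dice, independent Beta(a, b) priors (whose densities behave like power functions at the boundary), and exactly `n` tosses of each die. The function names `posterior_mean`, `rare_binomial`, and `success_rate` are hypothetical helpers introduced only for this illustration.

```python
import math
import random


def posterior_mean(successes, trials, a=1.0, b=1.0):
    """Posterior mean of a success probability under a Beta(a, b) prior."""
    return (a + successes) / (a + b + trials)


def rare_binomial(n, p, rng):
    """Draw Binomial(n, p) for 0 < p < 1 by jumping over geometric gaps
    between successes, which costs roughly n * p steps instead of n."""
    count, position = 0, 0
    log_q = math.log1p(-p)  # log(1 - p), negative
    while True:
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        position += int(math.log(u) / log_q) + 1
        if position > n:
            return count
        count += 1


def success_rate(n, p1, q1, c, delta, a=1.0, b=1.0, runs=500, seed=0):
    """Estimate how often the posterior mean for p1 exceeds
    (1 - delta) * c times the posterior mean for q1.

    Assumption of this sketch: each die is tossed exactly n times.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        x = rare_binomial(n, p1, rng)  # side-1 count, blue die
        y = rare_binomial(n, q1, rng)  # side-1 count, red die
        if posterior_mean(x, n, a, b) > (1 - delta) * c * posterior_mean(y, n, a, b):
            hits += 1
    return hits / runs


if __name__ == "__main__":
    # Hold p1 = c * q1 fixed and vary n, so that n * p1 goes from small to large.
    for n in (50, 5_000, 50_000):
        rate = success_rate(n, p1=0.002, q1=0.001, c=2.0, delta=0.5)
        print(f"n * p1 = {n * 0.002:g}: estimated success rate = {rate:.3f}")
```

Under these assumptions, the estimated success rate should be low when $n p_1$ is small and approach 1 as $n p_1$ grows, which is the qualitative behavior the abstract describes.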

Footnotes

  • (1) To whom correspondence should be addressed. Email: drew.fudenberg{at}gmail.com.
  • Author contributions: D.F., K.H., and L.A.I. designed research, performed research, and wrote the paper.

  • Reviewers: K.H., Pennsylvania State University; D.P., University of California, Berkeley; and B.S., Northwestern University.

  • The authors declare no conflict of interest.

  • This article contains supporting information online at www.danielhellerman.com/lookup/suppl/doi:10.1073/pnas.1618780114/-/DCSupplemental.
