
Large numbers of explanatory variables, a semi-descriptive analysis

D. R. Cox (a,1) and H. S. Battey (b,1)

  a. Nuffield College, Oxford OX1 1NF, United Kingdom
  b. Department of Mathematics, Imperial College London, London SW7 2AZ, United Kingdom

Contributed by D. R. Cox, June 6, 2017 (sent for review March 7, 2017; reviewed by Victor M. Panaretos and Matthew Stephens)

Significance

Data with a small number of study individuals and a large number of potential explanatory features arise particularly in genomics. Existing methods of analysis result in a single model, but other sparse choices of explanatory features may fit virtually equally well. Our primary aim is to identify a set of acceptable simple representations rather than a single model. The method also allows the assessment of anomalies, such as nonlinearities and interactions.

Abstract

Data with a relatively small number of study individuals and a very large number of potential explanatory features arise particularly, but by no means only, in genomics. A powerful method of analysis, the lasso [Tibshirani R (1996) J Roy Stat Soc B 58:267–288], takes account of an assumed sparsity of effects, that is, that most of the features are nugatory. Standard criteria for model fitting, such as the method of least squares, are modified by imposing a penalty for each explanatory variable used. There results a single model, leaving open the possibility that other sparse choices of explanatory features fit virtually equally well. The method suggested in this paper aims to specify simple models that are essentially equally effective, leaving detailed interpretation to the specifics of the particular study. The method hinges on the ability to make initially a very large number of separate analyses, allowing each explanatory feature to be assessed in combination with many other such features. Further stages allow the assessment of more complex patterns such as nonlinear and interactive dependences. The method has formal similarities to so-called partially balanced incomplete block designs introduced 80 years ago [Yates F (1936) J Agric Sci 26:424–455] for the study of large-scale plant breeding trials. The emphasis in this paper is strongly on exploratory analysis; the more formal statistical properties obtained under idealized assumptions will be reported separately.
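The idea of making many separate analyses, with each feature assessed alongside several different companions, can be illustrated with a minimal sketch. It is not the authors' procedure, whose details the abstract does not give. The square arrangement of feature indices, the function name `grid_screen`, the |t| > 2 cutoff, the retention rule, and the simulated data are all assumptions made here for illustration only.

```python
import numpy as np

def grid_screen(X, y, side, t_threshold=2.0):
    """Fit one least-squares regression per row and per column of a
    side x side arrangement of the features (an assumed layout).
    Returns, for each feature, the number of fits (0, 1 or 2) in which
    its |t| statistic exceeds t_threshold."""
    n, p = X.shape
    assert p == side * side, "this sketch assumes p is a perfect square"
    grid = np.arange(p).reshape(side, side)  # hypothetical layout of feature indices
    # Each row and each column is one block, so every feature is
    # analysed twice, each time with a different set of companions.
    blocks = [grid[i, :] for i in range(side)] + [grid[:, j] for j in range(side)]
    counts = np.zeros(p, dtype=int)
    for block in blocks:
        # Design matrix: intercept plus the features in this block.
        Z = np.column_stack([np.ones(n), X[:, block]])
        beta, _, _, _ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        dof = n - Z.shape[1]
        sigma2 = resid @ resid / dof           # residual variance estimate
        cov = sigma2 * np.linalg.inv(Z.T @ Z)  # covariance of the estimates
        t = beta[1:] / np.sqrt(np.diag(cov)[1:])
        counts[block[np.abs(t) > t_threshold]] += 1
    return counts

# Simulated data (assumption): 100 features, only two of which matter.
rng = np.random.default_rng(0)
n, side = 80, 10
X = rng.standard_normal((n, side * side))
y = 3.0 * X[:, 7] + 3.0 * X[:, 42] + rng.standard_normal(n)

counts = grid_screen(X, y, side)
retained = np.flatnonzero(counts == 2)  # flagged in both of their analyses
print(retained)
```

Retaining only features flagged in both of their analyses is one possible rule. A more lenient rule (flagged at least once) would keep more candidates for the later stages that the abstract mentions.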

Footnotes

  • 1. To whom correspondence may be addressed. Email: David.cox{at}nuffield.ox.ac.uk or h.battey{at}imperial.ac.uk.

Freely available online through the PNAS open access option.
