# Philosophy of statistics

Statistics plays a central role in addressing the problem of induction.

## Issues and positions

### Problem of induction

• How do we infer universals from particulars?
• Hume [1]
• Weintraub [2]
• Mill
• Peirce
• Reichenbach [3]
• Salmon [4]
• Good [5]
• Hacking [6]
• Huber [7]

### Probability and uncertainty

• Kolmogorov
• Frequentist vs Bayesian probability
• Accuracy vs precision [8]
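
As a reminder of what the Kolmogorov and frequentist-vs-Bayesian items above refer to, the standard statements can be written out as below. The symbols ($$\Omega$$ for the sample space, $$A_i$$ for events, $$H$$ for a hypothesis, $$D$$ for data) are notational choices made here for illustration, not taken from the sources cited in this outline.

$$
P(A) \ge 0, \qquad P(\Omega) = 1, \qquad P\Big(\bigcup_i A_i\Big) = \sum_i P(A_i) \quad \text{for pairwise disjoint } A_i
$$

$$
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)}
$$

Both interpretations accept the axioms in the first line. They differ in what $$P$$ is taken to mean (a long-run relative frequency versus a degree of belief), and therefore in whether a prior $$P(H)$$ over hypotheses is considered meaningful.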

### Foundations of statistics

• Ronald Fisher, Jerzy Neyman, Egon Pearson, Harold Jeffreys
• Fisher: significance testing of the null hypothesis ($$p$$-values)
• On an absolute criterion for fitting frequency curves [10]
• Frequency distribution of the values of the correlation coefficient in samples of indefinitely large population [11]
• Neyman-Pearson: tests and confidence intervals with fixed error probabilities (these also use $$p$$-values, but considering two hypotheses introduces two types of error)
• Jeffreys: objective (non-informative) priors
• Responses
• Cox
• Berger [12]
• Mayo
• Learning from Error [13]
• Error statistics [14]
• Likelihood principle
• Birnbaum
• violated by frequentist methods and by some Bayesian ones (e.g. Jeffreys priors, which depend on the sampling model)
• Pedagogy
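
The tension between the likelihood principle and frequentist $$p$$-values listed above is often illustrated with a textbook coin-flipping example, sketched here in Python. The numbers (9 heads, 3 tails, a fair-coin null hypothesis) and all function names are illustrative assumptions, not taken from the sources cited in this outline. The two stopping rules give likelihoods that are proportional in the parameter, yet the one-sided $$p$$-values differ.

```python
from math import comb

def binomial_pmf(k, n, theta):
    """P(k heads in a fixed number n of flips)."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

def negative_binomial_pmf(k, r, theta):
    """P(k heads before the r-th tail, flipping until r tails are seen)."""
    return comb(k + r - 1, k) * theta**k * (1 - theta)**r

heads, tails = 9, 3
n = heads + tails

# The two likelihood functions differ only by a factor that is constant in theta.
ratios = [
    binomial_pmf(heads, n, t) / negative_binomial_pmf(heads, tails, t)
    for t in (0.2, 0.5, 0.8)
]

# One-sided p-values under theta = 0.5: probability of at least 9 heads.
p_binomial = sum(binomial_pmf(k, n, 0.5) for k in range(heads, n + 1))
# Complement of "fewer than 9 heads before the 3rd tail".
p_negative_binomial = 1 - sum(
    negative_binomial_pmf(k, tails, 0.5) for k in range(heads)
)

print(ratios)
print(round(p_binomial, 4), round(p_negative_binomial, 4))
```

With these inputs the likelihood ratio is the same at every value of theta, while the two $$p$$-values come out at roughly 0.073 and 0.033, so the same data can fall on either side of a 5% threshold depending on the stopping rule.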

### Point estimation and confidence intervals

• regression
• MLE: Maximum likelihood estimators, Fisher [19]
• Cramér-Rao bound [20]
• $$\chi^2$$
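
A minimal sketch of how the MLE and Cramér-Rao items above relate, assuming an exponential distribution with mean `tau` as the toy model (a choice made here for illustration, as are the function names, seed, and sample sizes). For this model the maximum likelihood estimator of the mean is the sample mean, and its variance should sit at the Cramér-Rao bound `tau**2 / n`.

```python
import random
from statistics import fmean, pvariance

def mle_exponential_mean(sample):
    """MLE of the mean tau of an exponential distribution: the sample mean."""
    return fmean(sample)

def cramer_rao_bound(tau, n):
    """Minimum variance of an unbiased estimator of tau from n observations.

    The Fisher information per observation is 1 / tau**2,
    so the bound is tau**2 / n.
    """
    return tau**2 / n

rng = random.Random(12345)  # fixed seed so the sketch is reproducible
tau_true, n, n_experiments = 2.0, 50, 20000

# Repeat the "experiment" many times and collect the estimate from each.
estimates = [
    mle_exponential_mean([rng.expovariate(1.0 / tau_true) for _ in range(n)])
    for _ in range(n_experiments)
]

print("mean of estimates:    ", fmean(estimates))
print("variance of estimates:", pvariance(estimates))
print("Cramér-Rao bound:     ", cramer_rao_bound(tau_true, n))
```

The spread of the estimates across pseudo-experiments is what the bound constrains. An estimator whose variance reaches the bound, as here, is called efficient.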

### Statistical hypothesis testing

• classification
• Type-1 and type-2 errors in Neyman-Pearson theory
• Power and confidence
• Neyman-Pearson lemma [21]
• Wilks [22] and Wald [23]
• $$p$$-values and significance [24]
• Flip-flopping and Feldman-Cousins confidence intervals [25]
• Asymptotics [26]
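
The items on $$p$$-values, power, and asymptotics above can be made concrete with a few one-line formulas, sketched here with Python's standard library. The function names are hypothetical. The last function uses the asymptotic expression for the median discovery significance of a counting experiment with known background that is usually quoted from Cowan et al. (2011); treat it as an illustration under that assumption rather than a full treatment.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def z_from_p(p):
    """One-sided significance Z whose upper-tail probability is p."""
    return -STD_NORMAL.inv_cdf(p)

def p_from_z(z):
    """One-sided p-value for a significance of z standard deviations."""
    return STD_NORMAL.cdf(-z)

def power_one_sided(alpha, shift):
    """Power of a one-sided Gaussian test of size alpha.

    `shift` is the separation of the two hypotheses in units of the
    standard deviation of the test statistic.
    """
    return STD_NORMAL.cdf(shift - z_from_p(alpha))

def asimov_significance(s, b):
    """Median discovery significance for s expected signal events on top
    of b expected background events, with b assumed known exactly."""
    return sqrt(2.0 * ((s + b) * log(1.0 + s / b) - s))

print(p_from_z(5.0))                # about 2.9e-7, the "5 sigma" convention
print(power_one_sided(0.05, 2.0))   # type-2 error rate is 1 minus this
print(asimov_significance(10.0, 100.0), 10.0 / sqrt(100.0))
```

The last line compares the asymptotic formula with the simpler `s / sqrt(b)` estimate, which it approaches when `s` is much smaller than `b`.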

### Systematic uncertainties

• Class-1, class-2, and class-3 systematic uncertainties (good, bad, ugly); classification by Pekka Sinervo (PhyStat2003) [27]
• Not to be confused with type-1 and type-2 errors in Neyman-Pearson theory

### Machine learning

• classification and regression
• supervised and unsupervised learning
• Hastie, Tibshirani, & Friedman [30]

### Auto-science

• Big data and extreme-scale computing: Pathways to Convergence…
• Note that this description of abduction is missing that it is normative (i.e. “best-fit”).

## My thoughts


## Annotated bibliography

• Mayo (1996)

• TODO

### Cowan, G. (1998). Statistical Data Analysis.

• Cowan (1998) and Cowan (2016)

• TODO

• James (2006)

• TODO

### Cowan, G. et al. (2011). Asymptotic formulae for likelihood-based tests of new physics.

• Cowan et al. (2011)
• Glen Cowan, Kyle Cranmer, Eilam Gross, Ofer Vitells

• TODO

• TODO

### Cranmer, K. (2015). Practical statistics for the LHC.

• Cranmer (2015)

#### My thoughts

• TODO

• Univariate Distribution Relationships [31]
• All of Statistics [32]
• The Foundations of Statistics [33]

## References

Aldrich, J. (1997). R. A. Fisher and the making of maximum likelihood 1912-1922. Statistical Science, 12, 162–176.

ATLAS Collaboration. (2012). Combined search for the Standard Model Higgs boson in $$pp$$ collisions at $$\sqrt{s}$$ = 7 TeV with the ATLAS detector. Physical Review D, 86, 032003. https://arxiv.org/abs/1207.0319

Benjamin, D. J. et al. (2017). Redefine statistical significance. PsyArXiv. July 22, 2017. https://psyarxiv.com/mky9j/

Berger, J. O. (2003). Could Fisher, Jeffreys and Neyman have agreed on testing? Statistical Science, 18, 1–32.

Cowan, G. (1998). Statistical Data Analysis. Clarendon Press.

———. (2012). Discovery sensitivity for a counting experiment with background uncertainty. https://www.pp.rhul.ac.uk/~cowan/stat/notes/medsigNote.pdf

———. (2016). Statistics. In C. Patrignani et al. (Particle Data Group), Chinese Physics C, 40, 100001. http://pdg.lbl.gov/2016/reviews/rpp2016-rev-statistics.pdf

Cowan, G., Cranmer, K., Gross, E., & Vitells, O. (2011). Asymptotic formulae for likelihood-based tests of new physics. European Physical Journal C, 71, 1544. https://arxiv.org/abs/1007.1727

Cramér, H. (1946). A contribution to the theory of statistical estimation. Skandinavisk Aktuarietidskrift, 29, 85–94.

Cranmer, K. (2015). Practical statistics for the LHC. https://arxiv.org/abs/1503.07622

Feldman, G. J. & Cousins, R. D. (1998). A unified approach to the classical statistical analysis of small signals. Physical Review D, 57, 3873. https://arxiv.org/abs/physics/9711021

Fisher, R. A. (1912). On an absolute criterion for fitting frequency curves. Statistical Science, 12, 39–41.

———. (1915). Frequency distribution of the values of the correlation coefficient in samples of indefinitely large population. Biometrika, 10, 507–521.

Fréchet, M. (1943). Sur l’extension de certaines évaluations statistiques au cas de petits échantillons. Revue de L’Institut International de Statistique, 11, 182–205.

Good, I. J. (1988). The interface between statistics and philosophy of science. Statistical Science, 3, 386–397.

Hacking, I. (2001). An Introduction to Probability and Inductive Logic. Cambridge University Press.

Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed.). Springer.

Huber, F. (2007). Confirmation and induction. Internet Encyclopedia of Philosophy. http://www.iep.utm.edu/conf-ind/

Hume, D. (2007). An Enquiry Concerning Human Understanding. (P. Millican, Ed.). Oxford University Press. (Originally published in 1748).

James, F. (2006). Statistical Methods in Experimental Particle Physics. World Scientific.

Kendall, M. G. (1946). The Advanced Theory of Statistics, Vol.II. London: Charles Griffin & Company.

Leemis, L. M. & McQueston, J. T. (2008). Univariate distribution relationships. The American Statistician, 62, 45–53.

Mayo, D. G. (1981). In defense of the Neyman-Pearson theory of confidence intervals. Philosophy of Science, 48, 269–280.

———. (1996). Error and the Growth of Experimental Knowledge. University of Chicago Press.

Neyman, J. & Pearson, E. S. (1933). On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society A, 231, 289–337.

Rao, C. R. (1945). Information and the accuracy attainable in the estimation of statistical parameters. Bulletin of the Calcutta Mathematical Society, 37, 81–91.

———. (1947). Minimum variance and the estimation of several parameters. In Mathematical Proceedings of the Cambridge Philosophical Society. 43, 280–283. Cambridge University Press.

Reichenbach, H. (1938). Experience and Prediction. University of Chicago Press.

———. (1940). On the justification of induction. The Journal of Philosophy, 37, 97–103.

Salmon, W. C. (1963). On vindicating induction. Philosophy of Science, 30, 252–261.

———. (1966). The Foundations of Scientific Inference. University of Pittsburgh Press.

———. (1991). Hans Reichenbach’s vindication of induction. Erkenntnis, 35, 99–122.

Savage, L. J. (1954). The Foundations of Statistics. John Wiley & Sons.

Sinervo, P. (2002). Signal significance in particle physics. In M. Whalley & L. Lyons (Eds.), Proceedings of the Conference on Advanced Statistical Techniques in Particle Physics. Durham, UK: Institute of Particle Physics Phenomenology. https://arxiv.org/abs/hep-ex/0208005v1

———. (2003). Definition and treatment of systematic uncertainties in high energy physics and astrophysics. In L. Lyons, R. Mount, & R. Reitmeyer (Eds.), Proceedings of the Conference on Statistical Problems in Particle Physics, Astrophysics, and Cosmology (PhyStat2003) (pp. 122–129). Stanford Linear Accelerator Center. https://www.slac.stanford.edu/econf/C030908/papers/TUAT004.pdf

Venn, J. (1888). The Logic of Chance. London: MacMillan and Co. (Originally published in 1866).

Wald, A. (1943). Tests of statistical hypotheses concerning several parameters when the number of observations is large. Transactions of the American Mathematical Society, 54, 426–482.

Wasserman, L. (2003). All of Statistics: A Concise Course in Statistical Inference. Springer.

Wasserstein, R. L. & Lazar, N. A. (2016). The ASA’s statement on p-values: Context, process, and purpose. The American Statistician, 70, 129–133.

Weintraub, R. (1995). What was Hume’s contribution to the problem of induction? The Philosophical Quarterly, 45, 460–470.

Wilks, S. S. (1938). The large-sample distribution of the likelihood ratio for testing composite hypotheses. The Annals of Mathematical Statistics, 9, 60–62.

1. Hume (2007).

2. Weintraub (1995).

3. Reichenbach (1938) and Reichenbach (1940).

4. Salmon (1963), Salmon (1966), Salmon (1991).

5. Good (1988).

6. Hacking (2001).

7. Huber (2007).

8. Cowan (1998) and Cowan (2016).

9. Venn (1888).

10. Fisher (1912).

11. Fisher (1915).

12. Berger (2003).

13. Mayo (1996).

14. Mayo (1981).

15. Kendall (1946).

16. James (2006).

17. Cowan (1998) and Cowan (2016).

18. Cranmer (2015).

19. Aldrich (1997).

20. Fréchet (1943), Cramér (1946), Rao (1945), and Rao (1947).

21. Neyman & Pearson (1933).

22. Wilks (1938).

23. Wald (1943).

24. Sinervo (2002) and Cowan (2012).

25. Feldman & Cousins (1998).

26. Cowan, Cranmer, Gross, & Vitells (2011).

27. Sinervo (2003).

28. Wasserstein & Lazar (2016).

29. Benjamin et al. (2017).

30. Hastie, Tibshirani, & Friedman (2009).

31. Leemis & McQueston (2008).

32. Wasserman (2003).

33. Savage (1954).