Philosophy of statistics

Statistics plays a central role in addressing the problem of induction.

Issues and positions

Problem of induction

• How do we infer universals from particulars?
• Hume1
• Weintraub2
• Mill
• Peirce
• Reichenbach3
• Salmon4
• Good5
• Hacking6
• Huber7

Probability and uncertainty

• Kolmogorov
• Frequentist vs Bayesian probability
• Accuracy vs precision8
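
As a brief reminder of what the first two bullets refer to, the Kolmogorov axioms and Bayes' theorem can be written as below. The symbols $$\Omega$$ (sample space), $$A$$ and $$A_i$$ (events), $$H$$ (hypothesis), and $$D$$ (data) are notation chosen here for illustration and are not taken from the cited sources.

$$
P(A) \geq 0, \qquad P(\Omega) = 1, \qquad P\Big(\bigcup_i A_i\Big) = \sum_i P(A_i) \quad \text{for pairwise disjoint } A_i
$$

$$
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)}
$$

Both interpretations accept the axioms. They differ in what $$P$$ is taken to mean: a long-run relative frequency for frequentists, and a degree of belief for Bayesians, who can therefore assign a prior $$P(H)$$ to a hypothesis.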

Foundations of statistics

• Ronald Fisher, Jerzy Neyman, Egon Pearson, Harold Jeffreys
• Fisher: significance tests of the null hypothesis ($$p$$-values)
• On an absolute criterion for fitting frequency curves.10
• Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population.11
• Neyman-Pearson: hypothesis tests and confidence intervals with fixed error probabilities (also uses $$p$$-values, but considering two hypotheses involves two types of errors)
• Jeffreys: objective (non-informative) priors
• Responses
• Cox
• Berger12
• Mayo
• Learning from Error13
• Error statistics14
• Likelihood principle
• Birnbaum
• violated by both Frequentists and Bayesians
• Pedagogy
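
A standard textbook illustration of how $$p$$-values can conflict with the likelihood principle is sketched below. It assumes a coin-tossing experiment that yields 9 heads and 3 tails, tested against the null hypothesis of a fair coin. The function names and the numbers are illustrative choices, not taken from the sources cited above. The two stopping rules (toss 12 times, or toss until the third tail) give likelihoods that are both proportional to $$\theta^9 (1-\theta)^3$$, yet the one-sided $$p$$-values differ.

```python
from math import comb

def binomial_p_value(heads, n, theta=0.5):
    """P(at least `heads` heads in a fixed number n of tosses | theta)."""
    return sum(
        comb(n, k) * theta**k * (1 - theta) ** (n - k)
        for k in range(heads, n + 1)
    )

def negative_binomial_p_value(heads, tails, theta=0.5):
    """P(at least `heads` heads before the `tails`-th tail | theta).

    Equivalent to seeing at most tails - 1 tails in the first
    heads + tails - 1 tosses.
    """
    m = heads + tails - 1
    return sum(
        comb(m, j) * (1 - theta) ** j * theta ** (m - j)
        for j in range(tails)
    )

# Same data (9 heads, 3 tails), two different stopping rules.
p_fixed_n = binomial_p_value(heads=9, n=12)
p_fixed_tails = negative_binomial_p_value(heads=9, tails=3)
print(f"fixed n = 12:       p = {p_fixed_n:.4f}")
print(f"stop at third tail: p = {p_fixed_tails:.4f}")
```

The first $$p$$-value is 299/4096 (about 0.073) and the second is 67/2048 (about 0.033), so a 5% significance threshold gives opposite verdicts on identical data. An inference that depends only on the likelihood function cannot distinguish the two cases.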

Point estimation and confidence intervals

• regression
• MLE: Maximum likelihood estimators, Fisher19
• Cramér-Rao bound20
• $$\chi^2$$
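
The relationship between a maximum likelihood estimator and the Cramér-Rao bound can be checked numerically. The sketch below is a minimal toy example, assuming an exponential pdf $$f(x; \tau) = e^{-x/\tau}/\tau$$, for which the MLE of $$\tau$$ is the sample mean and the bound on the variance of an unbiased estimator is $$\tau^2/n$$. The function names, the seed, and the parameter values are illustrative assumptions.

```python
import random
import statistics

def exponential_mle(sample):
    """MLE of the mean tau of an exponential pdf: the sample mean."""
    return sum(sample) / len(sample)

def cramer_rao_bound(tau, n):
    """Minimum variance of an unbiased estimator of tau from n observations."""
    return tau**2 / n

def simulate(tau=2.0, n=50, n_experiments=2000, seed=1):
    """Repeat the experiment many times; return mean and variance of the MLE."""
    rng = random.Random(seed)
    estimates = [
        # expovariate takes the rate 1/tau, not the mean tau.
        exponential_mle([rng.expovariate(1.0 / tau) for _ in range(n)])
        for _ in range(n_experiments)
    ]
    return statistics.mean(estimates), statistics.variance(estimates)

mean_est, var_est = simulate()
bound = cramer_rao_bound(2.0, 50)
print(f"mean of MLE:     {mean_est:.3f}")
print(f"variance of MLE: {var_est:.4f}")
print(f"Cramér-Rao:      {bound:.4f}")
```

In this particular model the estimator is unbiased and attains the bound, so the simulated variance should agree with $$\tau^2/n$$ up to statistical fluctuations. In general the MLE is only guaranteed to approach the bound asymptotically.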

Statistical hypothesis testing

• classification
• Type-1 and type-2 errors in Neyman-Pearson theory
• Power and confidence
• Neyman-Pearson lemma21
• Wilks22 and Wald23
• $$p$$-values and significance24
• Flip-flopping and Feldman-Cousins confidence intervals25
• Asymptotics26
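
The quantities in the list above can be made concrete with a toy test. The sketch below assumes a single observation $$x$$ and two simple hypotheses, $$H_0$$: $$x \sim N(0, 1)$$ and $$H_1$$: $$x \sim N(1, 1)$$. Because the likelihood ratio is monotonic in $$x$$, the Neyman-Pearson lemma says the most powerful test of size $$\alpha$$ rejects $$H_0$$ when $$x$$ exceeds a threshold $$c$$. The hypotheses, function names, and numbers are illustrative assumptions. The last function applies the common convention of quoting a $$p$$-value as a one-sided Gaussian significance $$Z = \Phi^{-1}(1 - p)$$.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

def critical_value(alpha, mu0=0.0, sigma=1.0):
    """Threshold c such that P(x > c | H0) = alpha (the type-1 error rate)."""
    return NormalDist(mu0, sigma).inv_cdf(1.0 - alpha)

def power(alpha, mu0=0.0, mu1=1.0, sigma=1.0):
    """P(x > c | H1) = 1 - beta, where beta is the type-2 error rate."""
    c = critical_value(alpha, mu0, sigma)
    return 1.0 - NormalDist(mu1, sigma).cdf(c)

def significance(p_value):
    """One-sided Gaussian significance Z = Phi^{-1}(1 - p)."""
    return STANDARD_NORMAL.inv_cdf(1.0 - p_value)

alpha = 0.05
c = critical_value(alpha)
test_power = power(alpha)
z = significance(2.87e-7)
print(f"critical value c = {c:.3f}")
print(f"power 1 - beta   = {test_power:.3f}")
print(f"Z for p=2.87e-7  = {z:.2f}")
```

With these assumptions the power is only about 0.26, which shows that fixing the type-1 error rate says nothing by itself about how often a true alternative is detected. A $$p$$-value of about $$2.87 \times 10^{-7}$$ corresponds to the familiar $$5\sigma$$ threshold.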

Systematic uncertainties

• Class-1, class-2, and class-3 systematic uncertainties (good, bad, ugly); classification by Pekka Sinervo (PhyStat2003)27
• Not to be confused with type-1 and type-2 errors in Neyman-Pearson theory

Figure 1: Classification of measurement uncertainties (philosophy-in-figures.tumblr.com).

Machine learning

• classification and regression
• supervised and unsupervised learning
• Hastie, Tibshirani, & Friedman30

Auto-science

Figure 2: The inference cycle for the process of scientific inquiry. The three distinct forms of inference (abduction, deduction, and induction) facilitate an all-encompassing vision, enabling HPC and HDA to converge in a rational and structured manner. HPC: high-performance computing; HDA: high-end data analysis (Asch, M. et al., 2018).

• Big data and extreme-scale computing: Pathways to Convergence-Toward a shaping strategy for a future software and data ecosystem for scientific inquiry.31
• Note that this description of abduction omits its normative character: abduction selects the best explanation (i.e. the “best-fit”).

My thoughts

Annotated bibliography

• Mayo (1996)

Cowan, G. (1998). Statistical Data Analysis.

• Cowan (1998) and Cowan (2016)

• James (2006)

Cowan, G. et al. (2011). Asymptotic formulae for likelihood-based tests of new physics.

• Cowan et al. (2011)
• Glen Cowan, Kyle Cranmer, Eilam Gross, Ofer Vitells

Cranmer, K. (2015). Practical statistics for the LHC.

• Cranmer (2015)

My thoughts

• Univariate Distribution Relationships32
• All of Statistics33
• The Foundations of Statistics34

Others

