My research
Research interests
- Deep learning, both natural language and vision
- Large-scale language modeling
- Normalization in deep neural nets
- High-performance computing
- Data science and visualization
- Statistical inference
- Searches for new physics (GUTs & SUSY)
- Measuring properties of the Higgs boson
- Reconstruction and identification of hadronic decays of tau leptons
- Philosophy of science and science communication
Figure 1: A display of a likely \(pp\) → \(tt\) → \(bb\mu\tau\) event in ATLAS in 2011.
Research history
ATLAS
Previously, as a graduate student at Penn working on the ATLAS Experiment at CERN, I helped commission the Transition Radiation Tracker (TRT), a sub-detector of the ATLAS tracker, during the start-up of the Large Hadron Collider (LHC) in 2009-2012. On July 4, 2012, the ATLAS and CMS experiments both announced the discovery of a new particle consistent with the long-sought-after Higgs boson, a key to explaining electroweak symmetry breaking in the Standard Model of particle physics.
I continued to work on the ATLAS Experiment as a postdoc at the Santa Cruz Institute for Particle Physics (SCIPP). I’ve spent 10 years romping through datasets from ATLAS, learning how to explore and model large, multidimensional datasets. I have a knack for developing data analysis frameworks and an eye for technical detail and good scholarship. I am passionate about using scientific techniques to solve important problems and about how technologies extend our reach.
Figure 2: A figure I drew with Kyle Cranmer of the big-picture flow of ATLAS data.
Many theories of physics beyond the Standard Model have revolutionary implications for the concepts of symmetry and space-time, and for our understanding of the early universe. Searching for new physics in collider data means detecting rare signal events among enormous backgrounds, requiring large data reductions, data-driven background modeling, and rigorous statistical hypothesis testing.
My research in physics has primarily focused on supporting and optimizing the reconstruction of hadronic tau decays, and searching for exotic new physics in ditau and diphoton events, including for signs of grand unified theories and supersymmetry. In my time with the ATLAS Collaboration I made substantial contributions to 14 publications in scientific journals and several more physics conference proceedings.
Figure 3: Search for \(Z^{\prime}\) → \(\tau\tau\) with ATLAS (1608.00890).
I have extensive experience in Neyman-Pearson statistical hypothesis testing, including the modeling of systematic uncertainties, often with a combination of Monte Carlo and data-driven methods. In analyses for ATLAS, I have contributed to the measurement and calibration of many systematic uncertainties including tracking detector thresholds, tau identification efficiencies and fake rates, and the effects of QCD Monte Carlo variations.
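To make that concrete, here is a minimal, illustrative sketch of a Neyman-Pearson likelihood-ratio test, reduced to a one-bin counting experiment with no systematic uncertainties. The expected yields and the observed count are made up, and this is not ATLAS analysis code; real analyses use binned profile likelihoods over many channels with many nuisance parameters.

```python
# Minimal sketch (illustration only, not ATLAS code): Neyman-Pearson likelihood-ratio
# test for a one-bin counting experiment, background-only (b) vs. signal+background (s+b).
import numpy as np
from scipy.stats import poisson

b, s, n_obs = 100.0, 25.0, 130  # assumed expected background, signal, and observed count

def q(n):
    """-2 ln of the likelihood ratio: the most powerful test statistic for simple hypotheses."""
    return -2.0 * (poisson.logpmf(n, b) - poisson.logpmf(n, b + s))

# p-value under the background-only hypothesis, estimated with Monte Carlo "toys"
rng = np.random.default_rng(0)
toys = rng.poisson(b, size=200_000)
p_value = np.mean(q(toys) >= q(n_obs))
print(f"q_obs = {q(n_obs):.2f}, background-only p-value ~ {p_value:.4f}")
```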
Tau leptons are heavy enough to have complex hadronic decays with identifiable signatures, which form an interesting domain for machine learning. During 2009-2017, I was an active developer in the ATLAS Tau Working Group, which developed the reconstruction and identification algorithms for taus in ATLAS during Runs 1 and 2 of the LHC. I led the development and optimization of the first cut-based tau identification used with ATLAS data, and I helped develop more advanced tau identification using Boosted Decision Trees (BDTs).
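As a purely illustrative toy of what BDT-based identification looks like (the features below are invented placeholders, not the actual ATLAS tau identification variables):

```python
# Toy sketch of BDT-style identification (not the ATLAS tau ID): train a boosted
# decision tree classifier to separate "taus" from "jet fakes" on made-up features.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 20_000
signal = rng.normal([0.1, 1.0, 0.8], 0.3, size=(n, 3))      # hypothetical tau candidates
background = rng.normal([0.4, 2.0, 0.3], 0.5, size=(n, 3))   # hypothetical QCD-jet fakes
X = np.vstack([signal, background])
y = np.concatenate([np.ones(n), np.zeros(n)])

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
bdt = GradientBoostingClassifier(n_estimators=200, max_depth=3)
bdt.fit(X_train, y_train)
print("toy tau-ID accuracy:", bdt.score(X_test, y_test))
```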
Since 2016, following the deep learning revolution, I’ve been trying to learn everything I can about deep learning. I’ve worked in computer vision, machine translation, and large-scale language modeling.
Cerebras
In April of 2018, I joined Cerebras Systems as a machine learning engineer. Cerebras makes cutting-edge accelerators for deep learning, the first to achieve Wafer-Scale Integration. Collaborators at Cerebras and I published a new normalization technique, Online Normalization, for training deep neural networks with small batch sizes (NeurIPS 2019, arXiv:1905.05894).
Figure 4: Training ResNet-50 on ImageNet-1k with Online Normalization (ON) and Batch Normalization (BN) (NeurIPS 2019, 1905.05894).
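As a rough, forward-only sketch of the core idea (not the full algorithm from the paper, which also applies control on the backward pass), Online Normalization replaces batch statistics with exponentially decaying running statistics updated per sample, so it stays well defined at batch size 1:

```python
# Forward-only toy sketch of the idea behind Online Normalization: normalize each
# sample with exponentially decaying running statistics instead of batch statistics.
# The published algorithm (1905.05894) also controls the backward pass, omitted here.
import torch
import torch.nn as nn

class OnlineNormForwardSketch(nn.Module):
    def __init__(self, num_features, alpha=0.999, eps=1e-5):
        super().__init__()
        self.alpha, self.eps = alpha, eps
        self.register_buffer("mu", torch.zeros(num_features))
        self.register_buffer("var", torch.ones(num_features))
        self.weight = nn.Parameter(torch.ones(num_features))
        self.bias = nn.Parameter(torch.zeros(num_features))

    def forward(self, x):  # x: (batch, num_features)
        out = (x - self.mu) / torch.sqrt(self.var + self.eps)
        if self.training:
            with torch.no_grad():  # update running statistics one sample at a time
                for sample in x:
                    self.mu = self.alpha * self.mu + (1.0 - self.alpha) * sample
                    self.var = self.alpha * self.var + (1.0 - self.alpha) * (sample - self.mu) ** 2
        return out * self.weight + self.bias
```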
I developed and supported several reference models for Cerebras in computer vision (ResNets) and NLP (GNMT, Transformer, Linformer, BERT, RoBERTa, GPT-2). Cerebras revealed its first product at Supercomputing 2019: the powerful CS-1 computer for AI. I helped represent Cerebras at its trade-show booth and directly supported customers at both national labs and in industry on applied deep learning tasks ranging from natural language to physics surrogate models. In April of 2022, I co-authored a blog post, Getting started with PyTorch BERT models on the Cerebras CS-2 System.
Tenstorrent
My work at Tenstorrent has included:
- BERT-based inference applications
- Falcon, Llama, and Mistral models
- Finetuning with QLoRA (see the sketch after this list)
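For the QLoRA item above, here is a hedged sketch of a typical setup with the Hugging Face transformers, peft, and bitsandbytes libraries; the checkpoint name and hyperparameters are illustrative choices, not a description of any particular production stack.

```python
# Hedged sketch of a typical QLoRA setup (illustrative model name and hyperparameters).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "mistralai/Mistral-7B-v0.1"    # any causal LM checkpoint works here

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize the frozen base weights to 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(                   # small trainable low-rank adapters
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()          # only the LoRA adapters are trainable
# ...then train as usual, e.g. with transformers.Trainer on an instruction dataset.
```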
Current focuses
- Large-scale language modeling
- Deep learning performance
- Sparsity in deep neural nets
- Hyperparameter optimization
- Understanding techniques for clustering and anomaly detection
Current independent research projects
Implementation and study of:
- Modern Portfolio Theory (AKA the “Markowitz portfolio”) and its variants
- Counterfactual Regret Minimization (CFR) and its variants
Figure 5: Markowitz portfolio analysis (left). Solving Kuhn and Leduc, simplified versions of poker, with CFR (right).
Not open sourced, but feel free to ask me about them. Toy sketches of the core calculations in each are below.
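For the Markowitz side, the essential step is mean-variance optimization; the minimum-variance portfolio has the closed form \(w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1})\). The covariance matrix below is made up purely for illustration:

```python
# Toy Markowitz sketch: closed-form minimum-variance portfolio w = Σ⁻¹1 / (1ᵀΣ⁻¹1),
# with a made-up covariance matrix for three assets (illustration only).
import numpy as np

sigma = np.array([[0.10, 0.02, 0.04],
                  [0.02, 0.08, 0.01],
                  [0.04, 0.01, 0.12]])   # assumed return covariance
ones = np.ones(len(sigma))
w = np.linalg.solve(sigma, ones)         # Σ⁻¹ 1
w /= w.sum()                             # normalize so the weights sum to 1
print("minimum-variance weights:", np.round(w, 3))
print("portfolio variance:", float(w @ sigma @ w))
```

For CFR, the heart of the algorithm is regret matching: at each information set, the next strategy is proportional to the positive cumulative regrets. A full solver also traverses the Kuhn or Leduc game tree to accumulate those regrets, which this snippet does not show:

```python
# Regret matching, the core update inside CFR: play each action with probability
# proportional to its positive cumulative regret, falling back to uniform.
import numpy as np

def regret_matching(cumulative_regret):
    positive = np.maximum(cumulative_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full_like(cumulative_regret, 1.0 / len(cumulative_regret))

print(regret_matching(np.array([3.0, -1.0, 1.0])))  # -> [0.75, 0.0, 0.25]
```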
Previous focuses
- Commissioning, operations, and threshold calibration of the ATLAS Transition Radiation Tracker [1005.5254]
- ATLAS tau reconstruction and identification, including the use of Boosted Decision Trees (BDTs) [1412.7086]
- ATLAS observation and cross section measurement of SM Z→ττ [1108.2016]
- ATLAS searches for exotic Higgs and Z’→ττ events [1210.6604, 1502.07177, 1608.00890]
- ATLAS searches for evidence of supersymmetry in diphoton events [1507.05493, 1606.09150, 1802.03158]
- Research in applications of deep learning (CNNs) for particle identification (\(e/\gamma/\pi\)) with ILC-CLIC simulation data [1807.02876]
- Development of a new type of normalization layer for training neural networks, Online Normalization, that can functionally replace BatchNorm and be used with batch size = 1 [NeurIPS 2019, 1905.05894]
Curriculum Vitae
- My brief résumé: html, pdf
- My full academic curriculum vitae: html, pdf
- More info about my projects:
Github profile
Selected software projects
Selected publications
Selected talks
Ph.D. thesis
My graduate research in particle physics covered the reconstruction and identification of hadronic tau decays with the ATLAS experiment, the measurement of the Z→ττ production cross section in proton-proton collisions at √s = 7 TeV, and searches for new physics in high-mass ditau events.