My research

Research interests

Deep learning, both natural language and vision
Large-scale language modeling
Normalization in deep neural nets
High-performance computing
Data science and visualization
Statistical inference
Searches for new physics (GUTs & SUSY)
Measuring properties of the Higgs boson
Reconstruction and identification of hadronic decays of tau leptons
Philosophy of science and science communication

Figure 1: A display of a likely $pp$ → $tt$ → $bb\mu\tau$ event in ATLAS in 2011.

Research history

ATLAS

Previously, as a graduate student at Penn working with the ATLAS Experiment at CERN, I helped commission the Transition Radiation Tracker (TRT), a sub-detector of the ATLAS tracker, during the start-up of the Large Hadron Collider (LHC) in 2009-2012. On July 4, 2012, the ATLAS and CMS experiments both announced the discovery of a new particle consistent with the long-sought-after Higgs boson, a key to explaining electroweak symmetry breaking in the Standard Model of particle physics.

I continued to work on the ATLAS Experiment as a postdoc at the Santa Cruz Institute for Particle Physics (SCIPP). I’ve spent 10 years romping through datasets from ATLAS, learning how to explore and model large, multidimensional datasets. I have a knack for developing data analysis frameworks and an eye for technical detail and good scholarship. I am passionate about using scientific techniques to solve important problems and about how technologies extend our reach.

Figure 2: A figure I drew with Kyle Cranmer showing the big picture of the flow of ATLAS data.

Many theories of physics beyond the Standard Model have revolutionary implications for the concepts of symmetry and space-time, and for our understanding of the early universe. Searching for new physics in collider data involves detecting rare events among many, requiring big data reductions, data-driven background modeling, and severe statistical hypothesis testing.

My research in physics has primarily focused on supporting and optimizing the reconstruction of hadronic tau decays, and on searching for exotic new physics in ditau and diphoton events, including signs of grand unified theories and supersymmetry. In my time with the ATLAS Collaboration I made substantial contributions to 14 publications in scientific journals and several more physics conference proceedings.

Figure 3: Search for $Z^{\prime}$ → $\tau\tau$ with ATLAS ([1608.00890](https://arxiv.org/abs/1608.00890)).

I have extensive experience in Neyman-Pearson statistical hypothesis testing, including the modeling of systematic uncertainties, often with a combination of Monte Carlo and data-driven methods. In analyses for ATLAS, I have contributed to the measurement and calibration of many systematic uncertainties including tracking detector thresholds, tau identification efficiencies and fake rates, and the effects of QCD Monte Carlo variations.

Tau leptons are heavy enough to have complex hadronic decays with identifiable signatures, which form an interesting domain for machine learning. During 2009-2017, I was an active developer for the ATLAS Tau Working Group that developed the reconstruction and identification algorithms for taus in ATLAS during Runs 1 and 2 of the LHC. I led the development and optimization of the first cut-based tau identification used with ATLAS data, and I helped develop more advanced tau identification using Boosted Decision Trees (BDTs).

Since 2016, following the deep learning revolution, I’ve been trying to learn everything I can about deep learning. I’ve worked in computer vision, machine translation, and large-scale language modeling.

Cerebras

In April of 2018, I joined Cerebras Systems as a machine learning engineer. Cerebras makes cutting-edge accelerators for deep learning, the first to achieve Wafer-Scale Integration. Collaborators at Cerebras and I published a new normalization technique, Online Normalization, for training deep neural networks with small batch sizes (NeurIPS 2019, arXiv:1905.05894).

Figure 4: Training ResNet-50 on ImageNet-1k with Online Normalization (ON) and Batch Normalization (BN) (NeurIPS 2019, 1905.05894).

I developed and supported several reference models for Cerebras in computer vision (ResNets) and NLP (GNMT, Transformer, Linformer, BERT, RoBERTa, GPT-2). Cerebras revealed its first product at Supercomputing 2019: the powerful CS-1 computer for AI. I helped represent Cerebras at its tradeshow booth. I directly supported customers at both national labs and industry in applied deep learning tasks ranging from natural language to physics surrogate models. In April of 2022, I co-authored a blog post, "Getting started with PyTorch BERT models on the Cerebras CS-2 System".

Tenstorrent

TODO

BERT-based inference applications
Falcon, Llama, and Mistral models
Finetuning with QLoRA

Current focuses

Large-scale language modeling
Deep learning performance
Sparsity in deep neural nets
Hyperparameter optimization
Understanding techniques for clustering and anomaly detection

Current independent research projects

Implementation and study of:

Modern Portfolio Theory (AKA the “Markowitz portfolio”) and its variants
Counterfactual Regret Minimization (CFR) and its variants

Figure 5: Markowitz portfolio analysis (left). Solving Kuhn and Leduc, simplified versions of poker, with CFR (right).

Not open sourced, but feel free to ask me about them.

Previous focuses

Commissioning, operations, and threshold calibration of the ATLAS Transition Radiation Tracker [1005.5254]
ATLAS tau reconstruction and identification, including the use of Boosted Decision Trees (BDTs) [1412.7086]
ATLAS observation and cross section measurement of SM Z→ττ [1108.2016]
ATLAS searches for exotic Higgs and Z’→ττ events [1210.6604, 1502.07177, 1608.00890]
ATLAS searches for evidence of supersymmetry in diphoton events [1507.05493, 1606.09150, 1802.03158]
Research in applications of deep learning (CNNs) for particle identification ($e/\gamma/\pi$) with ILC-CLIC simulation data [1807.02876]
Development of a new type of normalization layer for training neural networks, Online Normalization, that can functionally replace BatchNorm and be used with batch size = 1 [NeurIPS 2019, 1905.05894]

Curriculum Vitae

My brief résumé: html, pdf
My full academic curriculum vitae: html, pdf
More info about my projects:
  Github profile
  Selected software projects
  Selected publications
  Selected talks

Ph.D. thesis

My graduate research in particle physics was on the reconstruction and identification of hadronic tau decays with the ATLAS experiment, measuring the Z→ττ production cross section in proton-proton collisions at √s = 7 TeV, and searching for new physics in high-mass ditau events.

Download my thesis here
