Kushal Tirumala

About me

I'm a researcher at FAIR, Meta AI Research, where I mainly work on understanding and improving the capabilities of large models. Some of my recent areas of interest:

  • data curation for pre-training
  • methods for efficient fine-tuning (quantization, pruning, etc.)

Previously, I received my B.S. in math and computer science from Caltech, where I was fortunate to work with Yisong Yue, Yaser Abu-Mostafa, and Ashish Mahabal. I primarily worked on applying ML to different scientific fields, as well as on the science of deep learning.

Email: {firstname}{lastname}99 at gmail dot com

Please reach out if you'd like to collaborate or have any questions about my work!

Publications

D4: Improving LLM pretraining via document de-duplication and diversification. [arXiv].

Kushal Tirumala*, Daniel Simig*, Armen Aghajanyan, Ari S. Morcos

ICML DMLR Workshop 2023

NeurIPS 2023

SemDeDup: Data-efficient learning at web-scale through semantic deduplication. [arXiv].

Amro Abbas, Kushal Tirumala*, Dániel Simig*, Surya Ganguli, Ari S. Morcos

ICML 2023

Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models. [arXiv].

Kushal Tirumala*, Aram H. Markosyan*, Luke Zettlemoyer, Armen Aghajanyan

NeurIPS 2022 (Oral presentation, top 2% of accepted papers)

Investigating Generalization by Controlling Normalized Margin. [arXiv].

Alexander Farhang, Jeremy Bernstein, Kushal Tirumala, Yang Liu, Yisong Yue

ICML 2022

Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks. [arXiv].

Tristan Thrush, Kushal Tirumala, Anmol Gupta, Max Bartolo, Pedro Rodriguez, Tariq Kane, William Gaviria Rojas, Peter Mattson, Adina Williams, Douwe Kiela

ACL 2022 System Demos

A Method for Finding Anomalous Astronomical Light Curves and their Analogues. [arXiv][package].

J Rafael Martínez-Galarza, Federica B Bianco, Dennis Crake, Kushal Tirumala, Ashish A Mahabal, Matthew J Graham, Daniel Giles

MNRAS 2021

A Granular Method for Finding Anomalous Light Curves and their Analogs. [pdf]

Kushal Tirumala, J Rafael Martínez-Galarza, Federica B Bianco, Dennis Crake, Ashish A Mahabal, Matthew J Graham, Daniel Giles

NeurIPS ML4PS workshop 2021

DeepStreaks: identifying fast-moving objects in the Zwicky Transient Facility data with deep learning. [pdf]

Dmitry A Duev, Ashish Mahabal, Quanzhi Ye, Kushal Tirumala, Justin Belicki, Richard Dekany, Sara Frederick, Matthew J Graham, Russ R Laher, Frank J Masci, Thomas A Prince, Reed Riddle, Philippe Rosnet, Maayane T Soumagnac

MNRAS 2019

Machine learning for the Zwicky Transient Facility. [pdf]

Ashish Mahabal et al.

MNRAS 2018