Kushal Tirumala

Last updated: May 20, 2024

About me

I'm a researcher at FAIR, Meta AI Research, where I mainly work on understanding and improving the capabilities of large models. Some of my recent areas of interest:

  • image tokenization
  • data curation for pre-training

During my time at FAIR, I've been lucky to be mentored by many great researchers, including Armen Aghajanyan, Ari Morcos, Luke Zettlemoyer, and Surya Ganguli.

Previously, I received my B.S. in math and computer science from Caltech, where I was fortunate enough to work with Yisong Yue, Yaser Abu-Mostafa, and Ashish Mahabal. I primarily worked on applying ML to different scientific fields, as well as on the science of deep learning.

Email: {firstname}{lastname}99 at gmail dot com

Please reach out if you'd like to collaborate or have any questions about my work!

Selected Publications

See Google Scholar for more: https://scholar.google.com/citations?user=B8WLbLsAAAAJ

Deep Learning

Chameleon: Mixed-Modal Early-Fusion Foundation Models. [arXiv].

Pre-training team

In prep. 2024

The unreasonable ineffectiveness of the deeper layers. [arXiv].

Andrey Gromov*, Kushal Tirumala*, Hassan Shapourian, Paolo Glorioso, Daniel A. Roberts

In prep. 2024

Effective pruning of web-scale datasets based on complexity of concept clusters. [arXiv].

Amro Abbas*, Evgenia Rusak*, Kushal Tirumala, Wieland Brendel, Kamalika Chaudhuri, Ari S. Morcos

ICLR 2024

D4: Improving LLM pretraining via document de-duplication and diversification. [arXiv].

Kushal Tirumala*, Dániel Simig*, Armen Aghajanyan, Ari S. Morcos

NeurIPS 2023

SemDeDup: Data-efficient learning at web-scale through semantic deduplication. [arXiv].

Amro Abbas, Kushal Tirumala*, Dániel Simig*, Surya Ganguli, Ari S. Morcos

ICML 2023

ICLR Multimodal Representation Learning Workshop (Best paper award!)

Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models. [arXiv].

Kushal Tirumala*, Aram H. Markosyan*, Luke Zettlemoyer, Armen Aghajanyan

NeurIPS 2022 (Oral presentation, top 2% of accepted papers)

Investigating Generalization by Controlling Normalized Margin. [arXiv].

Alexander Farhang, Jeremy Bernstein, Kushal Tirumala, Yang Liu, Yisong Yue

ICML 2022

Astronomy

A Method for Finding Anomalous Astronomical Light Curves and their Analogues. [arXiv][package].

J Rafael Martínez-Galarza, Federica B Bianco, Dennis Crake, Kushal Tirumala, Ashish A Mahabal, Matthew J Graham, Daniel Giles

MNRAS 2021

A Granular Method for Finding Anomalous Light Curves and their Analogs. [pdf]

Kushal Tirumala, J Rafael Martínez-Galarza, Federica B Bianco, Dennis Crake, Ashish A Mahabal, Matthew J Graham, Daniel Giles

NeurIPS ML4PS workshop 2021

DeepStreaks: identifying fast-moving objects in the Zwicky Transient Facility data with deep learning. [pdf]

Dmitry A Duev, Ashish Mahabal, Quanzhi Ye, Kushal Tirumala, Justin Belicki, Richard Dekany, Sara Frederick, Matthew J Graham, Russ R Laher, Frank J Masci, Thomas A Prince, Reed Riddle, Philippe Rosnet, Maayane T Soumagnac

MNRAS 2019

Machine learning for the Zwicky Transient Facility. [pdf]

Ashish Mahabal et al.

MNRAS 2018