I'm a researcher at FAIR, Meta AI Research, where I mainly work on understanding and improving the capabilities of large models. Some of my recent areas of interest:
During my time at FAIR, I've been lucky to be mentored by many great researchers, including Armen Aghajanyan, Ari Morcos, Luke Zettlemoyer, and Surya Ganguli.
Previously, I received my B.S. in math and computer science from Caltech, where I was fortunate enough to work with Yisong Yue, Yaser Abu-Mostafa, and Ashish Mahabal. I primarily worked on applying ML to different scientific fields, as well as on the science of deep learning.
Email: {firstname}{lastname}99 at gmail dot com
Please reach out if you'd like to collaborate or have any questions about my work!
See Google Scholar for more: https://scholar.google.com/citations?user=B8WLbLsAAAAJ
The unreasonable ineffectiveness of the deeper layers. [arXiv].
Andrey Gromov*, Kushal Tirumala*, Hassan Shapourian, Paolo Glorioso, Daniel A. Roberts
In prep. 2024
Effective pruning of web-scale datasets based on complexity of concept clusters. [arXiv].
Amro Abbas*, Evgenia Rusak*, Kushal Tirumala, Wieland Brendel, Kamalika Chaudhuri, Ari S. Morcos
ICLR 2024
D4: Improving LLM pretraining via document de-duplication and diversification. [arXiv].
Kushal Tirumala*, Daniel Simig*, Armen Aghajanyan, Ari S. Morcos
NeurIPS 2023
SemDeDup: Data-efficient learning at web-scale through semantic deduplication. [arXiv].
Amro Abbas, Kushal Tirumala*, Dániel Simig*, Surya Ganguli, Ari S. Morcos
ICML 2023
ICLR Multimodal Representation Learning Workshop (Best paper award!)
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models. [arXiv].
Kushal Tirumala*, Aram H. Markosyan*, Luke Zettlemoyer, Armen Aghajanyan
NeurIPS 2022 (Oral presentation, top 2% of accepted papers)
Investigating Generalization by Controlling Normalized Margin. [arXiv].
Alexander Farhang, Jeremy Bernstein, Kushal Tirumala, Yang Liu, Yisong Yue
ICML 2022
A Method for Finding Anomalous Astronomical Light Curves and their Analogues. [arXiv][package].
J Rafael Martínez-Galarza, Federica B Bianco, Dennis Crake, Kushal Tirumala, Ashish A Mahabal, Matthew J Graham, Daniel Giles
MNRAS 2021
A Granular Method for Finding Anomalous Light Curves and their Analogs. [pdf]
Kushal Tirumala, J Rafael Martínez-Galarza, Federica B Bianco, Dennis Crake, Ashish A Mahabal, Matthew J Graham, Daniel Giles
NeurIPS ML4PS workshop 2021
DeepStreaks: identifying fast-moving objects in the Zwicky Transient Facility data with deep learning. [pdf]
Dmitry A Duev, Ashish Mahabal, Quanzhi Ye, Kushal Tirumala, Justin Belicki, Richard Dekany, Sara Frederick, Matthew J Graham, Russ R Laher, Frank J Masci, Thomas A Prince, Reed Riddle, Philippe Rosnet, Maayane T Soumagnac
MNRAS 2019