Merging Text Transformer Models from Different Initializations

Publication
High-Dimensional Learning Dynamics Workshop @ ICML 2024, Transactions on Machine Learning Research
Neha Verma
PhD Student

I am a PhD student at the Johns Hopkins Center for Language and Speech Processing.