Merging Text Transformer Models from Different Initializations

Publication: arXiv
Neha Verma
PhD Student

I am a PhD student at Johns Hopkins working on machine translation and multilingual representations.