Integrating $(IA)^3$ with HuggingFace's PEFT
This was a simple course project for CSE 256: Natural Language Processing, Spring 2023 at UCSD. Parameter-efficient fine-tuning (PEFT) methods are all the rage now, since they let you fine-tune large models like Falcon-7B on just Google Colab! A new state-of-the-art method called $(IA)^3$ (Infused Adapter by Inhibiting and Amplifying Inner Activations) was proposed recently and was shown to beat LoRA (the most popular, flexible, and powerful PEFT method) in certain parameter settings! The official implementation worked only for certain T5-based architectures, and the original authors only benchmarked performance on T0. We implemented (and benchmarked) $(IA)^3$ with support for encoder-decoder and decoder-only models, and also enabled features like int-8 training. Our implementation is now the official implementation of $(IA)^3$ in HuggingFace’s PEFT library! Pull Request.
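As a rough illustration, here is a minimal sketch of how $(IA)^3$ can be applied to a decoder-only model through PEFT's standard API. The checkpoint and module names are only illustrative (chosen here to fit BLOOM) and are architecture-specific; treat this as a sketch rather than the canonical usage.

```python
# Minimal sketch: wrapping a decoder-only model with (IA)^3 via PEFT.
# The checkpoint and module names are illustrative (they fit BLOOM);
# other architectures need different target/feedforward module names.
from transformers import AutoModelForCausalLM
from peft import IA3Config, TaskType, get_peft_model

model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

# (IA)^3 learns per-dimension scaling vectors for the attention
# projections (here BLOOM's fused qkv layer) and for the intermediate
# feed-forward activation. target_modules lists every layer to rescale;
# feedforward_modules marks which of those are feed-forward layers,
# whose inputs (rather than outputs) get scaled.
config = IA3Config(
    task_type=TaskType.CAUSAL_LM,
    target_modules=["query_key_value", "mlp.dense_4h_to_h"],
    feedforward_modules=["mlp.dense_4h_to_h"],
)

model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the (IA)^3 vectors are trainable
```

The same pattern carries over to encoder-decoder models (e.g. with `TaskType.SEQ_2_SEQ_LM`) and to 8-bit base models prepared with PEFT's int-8 training utilities.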