by bitL 1151 days ago
My research shows otherwise. Tuning via transformer adapters effectively added new knowledge to QA models, and could also be used for adversarial QA training. You can throw away learned adapters at any time and retrain from scratch with new information if your adapters become stale. Fine-tuning is cheap and the result is small (e.g. 60kB of data in an adapter). You can also customize it in production for each individual customer by swapping adapters at inference time. Embeddings for very short-term facts and adapters for medium- to long-term info seem like the best combination.
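
To make the per-customer swapping idea concrete, here is a rough sketch in plain Python. It is not the commenter's actual setup: `FrozenLayer`, `Adapter`, and `AdapterModel` are hypothetical names, the "base model" is a toy stand-in that just doubles its input, and the adapter is assumed to be the usual bottleneck form (down-projection, ReLU, up-projection, residual add). The point is only that the base weights stay shared and frozen while a small adapter is picked per request.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class FrozenLayer:
    """Toy stand-in for a frozen pretrained layer shared by every customer."""

    def __call__(self, x):
        return [2.0 * v for v in x]


class Adapter:
    """Bottleneck adapter (assumed form): h + up(relu(down(h)))."""

    def __init__(self, down, up):
        self.down = down  # shape: bottleneck x hidden
        self.up = up      # shape: hidden x bottleneck

    def __call__(self, h):
        z = [max(0.0, v) for v in matvec(self.down, h)]
        delta = matvec(self.up, z)
        return [a + b for a, b in zip(h, delta)]

    def num_params(self):
        # Only these weights are trained and stored per customer.
        return sum(len(row) for row in self.down) + sum(len(row) for row in self.up)


class AdapterModel:
    """Frozen base plus a registry of adapters selected at inference time."""

    def __init__(self, base):
        self.base = base
        self.adapters = {}

    def register(self, name, adapter):
        self.adapters[name] = adapter

    def drop(self, name):
        # A stale adapter can be discarded without touching the base model.
        self.adapters.pop(name, None)

    def forward(self, x, customer=None):
        h = self.base(x)
        adapter = self.adapters.get(customer)
        # Falling back to the plain base output is a choice made for this sketch.
        return adapter(h) if adapter is not None else h


model = AdapterModel(FrozenLayer())
model.register("customer_a", Adapter(down=[[1.0, 0.0]], up=[[1.0], [0.0]]))
model.register("customer_b", Adapter(down=[[0.0, 1.0]], up=[[0.0], [1.0]]))

x = [1.0, 2.0]
print(model.forward(x))                # [2.0, 4.0]  base only
print(model.forward(x, "customer_a"))  # [4.0, 4.0]
print(model.forward(x, "customer_b"))  # [2.0, 8.0]

model.drop("customer_a")               # adapter went stale, throw it away
print(model.forward(x, "customer_a"))  # [2.0, 4.0]  back to base behaviour
```

In a real system the registry would hold trained adapter weights loaded from disk and the base would be an actual transformer, but the shape of the logic (one frozen base, many small swappable modules) should be about the same.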
3 comments

You mean like what's described in this blog post, correct?

https://adapterhub.ml/blog/2022/03/adapter-transformers-v3-u...

Yes, those adapters. There are many types now, most recently LLaMA adapter:

https://arxiv.org/abs/2303.16199

Could you link to your research and/or describe the models, libraries, data and tests you used for this?
Have you tried fine-tuning via adapters? If so, what has been your experience, and what was the total cost?