by kouteiheika 525 days ago
> Oh 2x faster and uses >70% less memory than Hugging Face + Flash Attention 2!

Is this doing the same type of fine-tuning, or are you comparing full bf16 fine-tuning in HF with 4-bit QLoRA in Unsloth (in which case it's not really an apples-to-apples comparison)? If it's the latter then do you have a comparison of the former?
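
For context on why that distinction matters, here is a rough back-of-the-envelope sketch of weight storage alone at bf16 versus 4-bit. The 7B parameter count is an assumption picked purely for illustration, and the estimate ignores gradients, optimizer state, activations, LoRA adapter weights, and quantization metadata, so real numbers will differ:

```python
# Rough weight-only memory estimate. It ignores gradients, optimizer state,
# activations, LoRA adapter weights, and quantization metadata.

def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

NUM_PARAMS = 7e9  # assumed 7B-parameter model, purely illustrative

bf16_gb = weight_memory_gb(NUM_PARAMS, 16)  # 16 bits per weight
int4_gb = weight_memory_gb(NUM_PARAMS, 4)   # 4 bits per weight
reduction = 1 - int4_gb / bf16_gb           # fraction saved by quantization alone

print(f"bf16 weights:  {bf16_gb:.1f} GB")
print(f"4-bit weights: {int4_gb:.1f} GB")
print(f"reduction from quantization alone: {reduction:.0%}")
```

Under these assumptions quantization by itself cuts weight storage by about three quarters, which is why a bf16 baseline against a 4-bit run would not isolate whatever the library itself contributes.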

1 comment

Oh, I compared 4-bit QLoRA on HF + FA2 against 4-bit QLoRA on Unsloth, so it's the same type of fine-tuning on both sides.

16-bit LoRA gets similar performance boosts!

Full bf16 fine-tuning is not yet supported, but it'll come out soon!