Hacker News
by xianshou 811 days ago
Single-GPU, optimal efficiency: Unsloth + QLoRA + Mistral-7B on RunPod/Vast.ai/Lambda

Blazing fast compared to out-of-the-box transformers. Also make sure to use flash attention if you have A100s or better and context length >= 2k.
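
A rough sketch of that rule of thumb in Python, in case it helps. The thresholds are my assumptions: "A100 or better" is read as CUDA compute capability >= 8.0 (which also lets in other Ampere-and-newer cards), and "2k" is read as 2048 tokens. The helper name `pick_attn_implementation` is made up for illustration; the returned strings are values that Hugging Face transformers accepts for the `attn_implementation` argument of `from_pretrained`.

```python
from typing import Tuple

# Assumption: "A100 or better" is approximated as compute capability >= 8.0.
FLASH_MIN_CAPABILITY = (8, 0)
# Assumption: "2k" context is read as 2048 tokens.
FLASH_MIN_CONTEXT = 2048


def pick_attn_implementation(capability: Tuple[int, int], context_len: int) -> str:
    """Return an attention backend name following the rule of thumb above."""
    if capability >= FLASH_MIN_CAPABILITY and context_len >= FLASH_MIN_CONTEXT:
        return "flash_attention_2"
    # Otherwise fall back to PyTorch's built-in scaled dot-product attention.
    return "sdpa"


print(pick_attn_implementation((8, 0), 4096))  # flash_attention_2
print(pick_attn_implementation((7, 5), 4096))  # sdpa

# On a real machine the capability tuple would come from
# torch.cuda.get_device_capability(), and the result would be passed as
# AutoModelForCausalLM.from_pretrained(..., attn_implementation=choice).
```

Note that the `flash_attention_2` option also needs the separate flash-attn package installed and a half-precision dtype, so treat this as a starting point rather than a drop-in check.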

Add FAISS (https://github.com/facebookresearch/faiss) if you need fast local RAG