If you're looking for the best small model, I'd recommend Berkeley's Starling-7B [0].
It runs on a lot of commodity GPUs and holds up well in head-to-head comparisons against bigger models, edging out the latest GPT-3.5-Turbo [1].
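
For a rough sense of why a 7B model fits on commodity GPUs, here is a back-of-the-envelope sketch of weight memory at a few precisions. It assumes a round 7 billion parameters (taken from the name, not an exact count) and counts only the weights, so KV cache, activations, and runtime overhead come on top. The names `PARAMS`, `BYTES_PER_PARAM`, and `weight_memory_gb` are made up for illustration.

```python
# Rough, weight-only memory estimate for a ~7B-parameter model.
# Assumption: a round 7e9 parameters (from the "7B" in the name, not an exact count).
# Ignores KV cache, activations, and framework overhead.
PARAMS = 7_000_000_000

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: int, bytes_per_param: float) -> float:
    """Return weight storage in GB (1 GB = 10^9 bytes)."""
    return params * bytes_per_param / 1e9


estimates = {
    name: weight_memory_gb(PARAMS, size) for name, size in BYTES_PER_PARAM.items()
}

for name, gb in estimates.items():
    print(f"{name}: ~{gb:.1f} GB for weights")
```

Under these assumptions the fp16 weights come to about 14 GB and a 4-bit quantization to about 3.5 GB, so whether a given card can hold the model depends mostly on the precision you run it at.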