Lmsys hasn't released an official 4-bit version, so it might be better to wait for one. Still, it is interesting to learn that the third-party 4-bit version shows performance degradation.
Lmsys hasn't released official weights for anything. They've released "deltas", and other people have applied those deltas to the corresponding LLaMA weights and then done the quantization themselves.
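To make that concrete, here is a toy sketch of the two steps: adding a delta to base weights, then rounding the result to 4 bits. It is an illustration only, assuming plain element-wise addition and simple symmetric round-to-nearest quantization; it is not the actual delta-merging script or the quantizer used for the third-party builds, and the function names and numbers are made up.

```python
def apply_delta(base, delta):
    """Recover fine-tuned weights as element-wise base + delta."""
    if len(base) != len(delta):
        raise ValueError("base and delta must have the same length")
    return [b + d for b, d in zip(base, delta)]


def quantize_4bit(weights):
    """Symmetric round-to-nearest quantization to integers in [-8, 7]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map 4-bit integers back to approximate float weights."""
    return [v * scale for v in q]


# Made-up numbers standing in for real tensors (assumption, not real data).
base = [0.50, -1.25, 0.75, 2.00]
delta = [0.25, 0.25, -0.45, -0.25]

merged = apply_delta(base, delta)
q, scale = quantize_4bit(merged)
restored = dequantize(q, scale)

# Rounding error introduced by quantization, per weight.
max_error = max(abs(m - r) for m, r in zip(merged, restored))
print(q, scale, max_error)
```

The point of the sketch is that the delta merge itself is lossless, while any quality loss has to come from the rounding step, which is applied the same way regardless of which fine-tune produced the weights.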
I reject your premise that the 8-bit to 4-bit quantization is the cause of the Vicuna fine-tuned LLaMA's very average performance, though. That hasn't been the case for any of the other 8-bit to 4-bit quantizations, so Vicuna would be a unique outlier. I don't think quantization is the "cause" here.
My point is that I am not aware of any official 4-bit quantized version (delta or weights) from lmsys, so it might be too early to draw your conclusion that the Vicuna fine-tuned LLaMA loses a lot of performance at 4-bit while others are fine.