Not sure if this is of any value to you, but Ryzen 7 generates 2 tokens per second for the 7B-Instruct model.
The model itself is very unimpressive and I see no reason to play with it over the worst alternative from Hugging Face.
I can only imagine this was released for some bizarre compliance reasons.
For the 7B IT and a short factual query I see 5.3 tps on a 5 year old Skylake Gold 6154 CPU @ 3.00GHz, 16 threads.
Expect a slight increase as we improve scalability.
FYI using the NUQ (4.5-bit) quantization improves throughput by about 1.4x.