"It significantly outperforms existing models smaller or similar in size." is a statement that goes in that direction and would allow the comparison of a 1.7T param model with a 7b one