Everything, including "objective" metrics, includes bias. And that's before you take into account people outright gaming metrics (objective or subjective).
As a simplified example: if I write 1000 lines of code and you write 1000 lines of code, we should have the same rating if that's the metric used. There shouldn't be any bias there. Bias only enters when the manager feels your code is better than mine, etc.
Now the objective measure itself might have some sort of bias, but at least the rules are set and you're not getting screwed over based on someone's feelings. You can argue metrics; you can't argue your manager's feelings.
The thing is that those metrics are very poor metrics that don't correlate well with the "true ideal performance", even when compared to a manager's subjective intuition with all its randomness and biases.
Replacing a subjective metric that's at least somewhat effective with one that's totally useless (because its inherent inaccuracies/biases are worse than even a poor manager's judgement) is throwing out the baby with the bathwater. The primary purpose of a performance metric is to measure performance, and being prejudice-resistant is merely nice to have. The primary reason you implement a metric is not that you need something that can be argued.
But that is proper. The quality of craft/creative work matters. It belongs in the evaluation of craftsmen and creative workers. And it is fundamentally a feeling. When you are junior you may not yet have developed this judgement or taste. Your job is to learn it, from your own failures and the feedback of your senior colleagues. When you are senior, you have it. You are more valuable to an organization precisely because you can be trusted to have positive feelings about good work and negative feelings about bad work, and therefore do the right thing in a position of decision-making power. Also because you enculturate the next generation of senior craftsmen through your feedback.
This shouldn't be a surprise at performance review time, nor should it necessarily come from your manager -- it should be coming from your senior colleagues on each of your code reviews, giving you a chance to improve your bad code before it gets checked in. But when your senior colleagues think your PRs are worse on average than those of your peers, then yes, absolutely you should get a worse rating.
Your comment doesn't change if you replace lines of code with manager's perception of you. If you're both equally liked by your manager then you should receive the same rating. Within the metric being defined neither is biased since they have clear and explicit definitions. Against the true metric of "productive engineer" both are biased.
And how do you handle your manager having a cultural or unconscious bias against your <race / religion / body type / gender / appearance / clothing / hair color / fragrance of the soap you use / eyewear / etc.>? You just live with them not liking you and not measuring up to others in their mind?
Subjective evaluation for performance purposes is often done by committee for this reason. Your work is read by several people who are unlikely to have the same idiosyncratic biases, at least some of whom don't know you. (That cuts both ways, though; they also don't know the context for the work).
For these metrics to have even a tiny chance of not being completely gamed (even unintentionally), you'd have to define a rigorous formula of weighted metrics that takes into account things like the ones a sibling comment mentioned (did your 1000 lines create a regression, or five, while mine didn't? Code quality? Lots of review comments that took forever to debate and resolve?). And that's assuming you could actually measure those things properly. Was that comment a valid one about you missing quality guidelines, or was it someone trying to game your metrics negatively so that he'd look better?
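To make that concrete, here is a rough sketch of what such a weighted formula could look like. Everything in it is made up for illustration: the `Contribution` fields, the `WEIGHTS` values, and the `weighted_score` function are hypothetical and not taken from any real review system.

```python
from dataclasses import dataclass


@dataclass
class Contribution:
    """Hypothetical per-engineer counters for one review period."""
    lines_of_code: int
    regressions: int
    review_comments: int


# Hypothetical weights, chosen arbitrarily for this example.
WEIGHTS = {
    "lines_of_code": 1,       # reward per line written
    "regressions": -500,      # penalty per regression introduced
    "review_comments": -50,   # penalty per review comment received
}


def weighted_score(c: Contribution) -> int:
    """Combine the counters into one number using the fixed weights."""
    return (
        WEIGHTS["lines_of_code"] * c.lines_of_code
        + WEIGHTS["regressions"] * c.regressions
        + WEIGHTS["review_comments"] * c.review_comments
    )


# Same 1000 lines, but one of us introduced five regressions.
mine = Contribution(lines_of_code=1000, regressions=0, review_comments=4)
yours = Contribution(lines_of_code=1000, regressions=5, review_comments=4)

# Same clean work as `mine`, but a colleague left 26 extra nitpick comments.
mine_after_nitpicks = Contribution(lines_of_code=1000, regressions=0, review_comments=30)

print(weighted_score(mine))                 # 800
print(weighted_score(yours))                # -1700
print(weighted_score(mine_after_nitpicks))  # -500
```

The formula does separate the clean 1000 lines from the ones with regressions, but it also drops the clean work below zero once someone piles on review comments, and it has no way of telling a valid comment from a hostile one.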
I think it's impossible to create something like that; it'd be very bureaucratic and still prone to gaming. I think having something 'in between' is the best approach. You still allow a manager to interpret these things together with you, but the manager should give you a guideline for what to look out for. We can use these metrics to inform decisions about performance, but it's completely counterproductive to simply have a few metrics where you have to hit specific numbers.
PR throughput? No problem, I'll form a clique of people who OK each other's tiny PRs. This will result in so much overhead that we won't actually get much done. It will piss off other team members and create a hell of basically unusable commits. Code quality is more likely to suffer because nobody has any chance of keeping an overview of what you're doing overall, and you will likely create regressions that developed over multiple commits and would've been caught otherwise, because, let's face it, each unit test you write is its own PR. You say that obviously you won't get away with this because your manager is supposed to stop it? Well, he can't if we just want a completely objective and metrics-driven approach!
With the hybrid approach, you know from your manager that PR throughput is important, but not at the expense of quality and other things. You want small PRs for certain reasons, but not at all costs. There is no exact formula because no two situations are exactly the same. Of course bias comes in, and of course bad managers make this bad. So does a completely "objective" metrics-driven environment in which you play the metrics game. There is no perfect solution.