Im yet to see a system that consists of other versions of code than ”new” and “current”. You test against changes only, what you described is some mess in deployed versions / versions management.
Its losing „some” adventages of startup grade microservices and gain maintainability adventages of „netflix/facebook” level grid... Depends whats your scale. Shipping shit fast is often not the best solution at that scale, doing it right is. And I have already explained to someone else in this thread why that approach is important.
"New" and "current" are two different versions.