Since we're splitting hairs instead of training sets:
Classical RFs select one random feature and choose the optimal split for it, whereas Extremely Randomized Trees (ERTs) choose the best feature out of a random subset, testing only one random split per feature.
Another difference is that RFs use a bootstrapped dataset and ERTs use the full dataset.
Ah, I was under the impression that RFs choose from a subset of features, not just one feature.
In any case, I agree with the thrust of your original comment that the specifications of the RF algorithm can be relaxed, usually for computational reasons, and still retain strong predictive performance. But this goes back to my original comment that the performance considerations of random forests often aren't highlighted to new learners (whereas introducing ERTs to a beginner would probably shock them - how could you take totally random splits and still get any reasonable performance?)
What you're talking about, where you simply generate a set of random splits across features, is Extremely Randomized Trees (https://link.springer.com/article/10.1007%2Fs10994-006-6226-...).
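For anyone following along, here is a rough sketch of the split-selection difference being discussed, in plain Python. It is only an illustration under my own assumptions: the helper names (`gini`, `best_threshold`, `random_threshold`, `choose_split`) and the parameter `k` (number of candidate features) are made up for this example, Gini impurity is just one possible scoring criterion, and the bootstrapping difference is not covered at all. With `extremely_random=False` each candidate feature gets an exhaustive threshold search; with `extremely_random=True` each candidate feature gets a single uniformly drawn threshold, and the best-scoring of those random splits is kept.

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_impurity(xs, ys, threshold):
    """Size-weighted Gini impurity after splitting on x <= threshold."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    return (len(left) * gini(left) + len(right) * gini(right)) / len(ys)


def best_threshold(xs, ys):
    """Optimized split: try every midpoint between sorted distinct values."""
    values = sorted(set(xs))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return min(candidates, key=lambda t: split_impurity(xs, ys, t))


def random_threshold(xs, rng):
    """Randomized split: one uniform draw between the feature's min and max."""
    return rng.uniform(min(xs), max(xs))


def choose_split(X, ys, k, rng, extremely_random):
    """Pick k candidate features at random and return the best
    (impurity, feature_index, threshold) among them, or None if every
    candidate feature is constant."""
    features = rng.sample(range(len(X[0])), k)
    scored = []
    for f in features:
        xs = [row[f] for row in X]
        if len(set(xs)) < 2:
            continue  # a constant feature cannot be split
        if extremely_random:
            t = random_threshold(xs, rng)
        else:
            t = best_threshold(xs, ys)
        scored.append((split_impurity(xs, ys, t), f, t))
    return min(scored) if scored else None


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
ys = [0, 0, 1, 1]

optimized = choose_split(X, ys, k=2, rng=random.Random(0), extremely_random=False)
randomized = choose_split(X, ys, k=2, rng=random.Random(0), extremely_random=True)
```

The randomized variant never scans thresholds, which is where the speedup comes from; whether it still splits well depends on averaging over many such trees.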