| HN Mirror

You don't have to instantiate the records; that's just the way I did it here. For example, you could do two passes: one to randomly select IDs, and one to fetch a small number of records for the sample set. That's O(2N), but still better than what most databases will do for an ORDER BY RANDOM() query.

There's also another, less-known variant of this algorithm that I'll go into in a future post that alleviates the concern.