| HN Mirror

To optimize for fast nearest neighbors, I chose 256 dims. Notably, this actually hurt some of the pre-training classification losses pretty severely compared to 2k dims, so it definitely has a quality cost.

The site uses cosine distance. The code itself implements Euclidean distance, but I decided to normalize the vectors last minute out of FUD that some unusually small vectors would appear as neighbors for an abnormal number of examples.