Hacker News new | ask | show | jobs
by oldesthacker 914 days ago
The results of Table 3 are not really exciting. Could this change with 100 times more data? The key novelty in the specific context of this particular application is the quantized variational encoder used "to derive discrete codex encoding and align it with pre-trained language models."