Hacker News
by
xnx
1102 days ago
I second this. You've done a great service by collecting this data. I'm guessing the file would be much smaller than 20 GB when compressed.
2 comments
zX41ZdbW
1102 days ago
I also ran an experiment, generating and searching embeddings for all the comments on HN. Here is the walkthrough:
https://www.youtube.com/watch?v=hGRNcftpqAk
zX41ZdbW
1102 days ago
It is only around 5 GB in ClickHouse. Details:
https://github.com/ClickHouse/ClickHouse/issues/29693