by minimaxir 1151 days ago
If that's the intent, IMO the released dataset should have more metadata (e.g. paragraph heading, article taxonomy).

How would you add that data? As new columns, you mean? Or by adding the paragraph headings to the text of the paragraphs before embedding them?
New columns.

For the headings, I mean the Wikipedia section headings (which don't always map to individual paragraphs, my mistake).

In both cases the data can be used to classify/visualize, like the Show HNs in your linked post.
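
To make the two options from the question concrete, here is a rough sketch in plain Python. Everything in it is an assumption for illustration: the field names (`text`, `section_heading`, `taxonomy`), the helper functions, and the toy rows are made up and are not the dataset's actual schema.

```python
from collections import defaultdict

def with_metadata_columns(paragraphs, headings, taxonomies):
    """Option 1: leave the text untouched and keep metadata in separate columns."""
    return [
        {"text": p, "section_heading": h, "taxonomy": t}
        for p, h, t in zip(paragraphs, headings, taxonomies)
    ]

def with_heading_prefix(paragraphs, headings):
    """Option 2: prepend the heading to the text before it gets embedded."""
    return [f"{h}: {p}" for p, h in zip(paragraphs, headings)]

def group_by_column(rows, column):
    """Group row texts by a metadata column, e.g. as labels for classification or plot colors."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row["text"])
    return dict(groups)

# Toy data, purely illustrative.
paragraphs = ["First paragraph.", "Second paragraph.", "Third paragraph."]
headings = ["History", "History", "Geography"]
taxonomies = ["Places", "Places", "Places"]

rows = with_metadata_columns(paragraphs, headings, taxonomies)
by_heading = group_by_column(rows, "section_heading")
prefixed = with_heading_prefix(paragraphs, headings)
```

With separate columns the embedded text stays the same and the metadata is available afterwards as labels; with the prefix approach the heading changes what gets embedded.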