by crazymoka 692 days ago
I can put all the content into text. Would putting it all into JSON or CSV with headers be helpful? Should I break the content down by topic and category, and include links to other pages in case it needs to read other content?
1 comment

Yeah, whatever format gives you accurate vector embeddings.

You’ll combine those JSON properties into one vector per object (typically by joining them into a single string and embedding that).
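A rough sketch of what combining the properties could look like. The field names (`title`, `topic`, `category`, `body`) and the `toy_embed` function are made up for illustration; `toy_embed` is only a hashed bag-of-words stand-in so the example runs on its own, and you would swap in whatever real embedding model you use.

```python
import hashlib
import math

def record_to_text(record, fields=("title", "topic", "category", "body")):
    """Join the chosen properties into one labeled string to embed."""
    parts = [f"{name}: {record[name]}" for name in fields if record.get(name)]
    return "\n".join(parts)

def toy_embed(text, dims=64):
    """Stand-in for a real embedding model (assumption, not a real API):
    hashed bag of words, L2-normalized."""
    vec = [0.0] * dims
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Hypothetical record; the keys mirror the "topic, categories" idea in the question.
record = {
    "title": "Refund policy",
    "topic": "billing",
    "category": "support",
    "body": "Refunds are available within 30 days of purchase.",
}

text = record_to_text(record)
vector = toy_embed(text)  # one vector for the whole object
```

Labeling each property ("topic: billing") before joining keeps the structure visible to the model, which is the main thing the JSON or CSV headers buy you.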

For links, you might hard-code those as an extra field on each object, or potentially make another API call to retrieve relevant links from your “link store”.
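Both link options might look something like this. Everything here is hypothetical: the document shape, the `LINK_STORE` dict, and `fetch_links`, which stands in for the extra API call.

```python
# Option 1: links hard-coded as an extra field on the object itself.
# Option 2: no "links" field, so they are looked up separately.
documents = [
    {"id": "pricing", "text": "Plans and pricing.", "links": ["/plans", "/billing-faq"]},
    {"id": "refunds", "text": "How refunds work."},
]

# Stand-in for a separate "link store" (in practice, a database or service).
LINK_STORE = {"refunds": ["/refund-policy", "/contact"]}

def fetch_links(doc_id):
    """Placeholder for the extra API call to the link store."""
    return LINK_STORE.get(doc_id, [])

def links_for(doc):
    """Prefer links stored on the object, else fall back to the link store."""
    if "links" in doc:
        return doc["links"]
    return fetch_links(doc["id"])
```

Hard-coding is simpler and saves a round trip; the separate store is easier to update when links change without re-embedding anything.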

It’s useful to have benchmarks in place like:

input “should equal” output

And start building a suite of test cases to evaluate your setup.
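A minimal sketch of such a suite, assuming your retrieval step can be wrapped in a function that takes a query and returns a page id. The `PAGES` data, the `keyword_retrieve` function, and the test cases are invented for the example; `keyword_retrieve` is a crude word-overlap stand-in for a real embedding search.

```python
def evaluate(retrieve, cases):
    """Run every (input, expected output) pair and collect the mismatches."""
    failures = []
    for query, expected in cases:
        actual = retrieve(query)
        if actual != expected:
            failures.append((query, expected, actual))
    accuracy = 1 - len(failures) / len(cases)
    return accuracy, failures

# Hypothetical content, keyed by page id.
PAGES = {
    "pricing": "plans pricing monthly cost",
    "refunds": "refund policy money back",
}

def keyword_retrieve(query):
    """Stand-in retriever: pick the page sharing the most words with the query."""
    words = set(query.lower().split())
    return max(PAGES, key=lambda page: len(words & set(PAGES[page].split())))

# input "should equal" output
CASES = [
    ("how much does it cost", "pricing"),
    ("can I get a refund", "refunds"),
]

accuracy, failures = evaluate(keyword_retrieve, CASES)
```

Rerun the same cases whenever you change the chunking, the fields you embed, or the model, so you can tell whether a change helped or hurt.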