Hacker News
by fouc 2414 days ago
Organizing at the document level is the main mistake that virtually all of these tools make. Search doesn't get around that mistake either.

I'm really interested, what do you mean by that? How do you think they should be organised?
Not the original commenter, but he got me thinking. (I’d like to know what he specifically meant too.)

However, I find that nuggets are often embedded in subsections of larger documents.

Perhaps search could rank a document higher when one of its subsections has the query words in its title. (Maybe some tools already do?) Any set of query words can be present by chance, scattered across a hundred large documents that are not relevant to the user. Meanwhile, the tiny, specific subsection I’m looking for is lost in the noise.

This very thing happened when I was looking for the delivery and address information for a main office. A huge number of documents mentioning the address of something and the delivery of something showed up. I even remembered seeing an earlier revision of the subsection I was looking for, but it didn’t help. IIRC even adding parts of the street address didn’t help, because the plain address was common in many documents for random reasons.

Personally I also like to divide my technical documents into subsections with exact titles. It would be great if tools knew how to exploit this kind of hierarchy.
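
To make the ranking idea concrete, here is a rough sketch, not how any existing tool works. It assumes documents are already split into titled subsections, scores each document by its single best subsection rather than by its whole text, and weights title hits higher. All names here (Section, Document, section_score, rank, title_weight) are made up for illustration, and the whitespace tokenizer is a stand-in for a real one.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Section:
    title: str
    body: str


@dataclass
class Document:
    name: str
    sections: List[Section]


def tokens(text: str) -> Set[str]:
    """Lowercased whitespace-separated words; a placeholder for a real tokenizer."""
    return set(text.lower().split())


def section_score(query: str, section: Section, title_weight: float = 3.0) -> float:
    """Share of query words found in one section, with title hits weighted higher."""
    q = tokens(query)
    if not q:
        return 0.0
    title_hits = len(q & tokens(section.title))
    body_hits = len(q & tokens(section.body))
    return (title_weight * title_hits + body_hits) / len(q)


def rank(query: str, docs: List[Document]) -> List[Tuple[float, str, str]]:
    """Rank documents by their best single section, not by the whole text."""
    results = []
    for doc in docs:
        if not doc.sections:
            continue
        best = max(doc.sections, key=lambda s: section_score(query, s))
        results.append((section_score(query, best), doc.name, best.title))
    return sorted(results, reverse=True)


# Hypothetical data: "misc" contains every query word, but scattered over
# three unrelated sections; "handbook" has one section whose title matches.
docs = [
    Document("misc", [
        Section("Invoices", "address of the supplier"),
        Section("Shipping", "delivery of parts"),
        Section("Contacts", "main office phone"),
    ]),
    Document("handbook", [
        Section("Main office delivery address", "Deliveries go to the loading dock."),
        Section("Holidays", "The office is closed on public holidays."),
    ]),
]

for score, name, title in rank("main office delivery address", docs):
    print(score, name, title)
```

A whole-document match would treat "misc" as containing all four query words. Scoring per subsection puts the "handbook" section first, because the words have to occur together in one place to count.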

At the paragraph level or sentence level, with context markers (like tags) in order to "re-assemble" a document.
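
One possible reading of this, as a minimal sketch: store each paragraph as its own record with tags plus enough context (source document, position) to rebuild the original. The Fragment fields and the find and reassemble helpers are assumptions made for illustration, not something the comment specifies.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Fragment:
    text: str
    doc: str        # context marker: which document the paragraph came from
    position: int   # context marker: order within that document
    tags: Set[str] = field(default_factory=set)


def find(fragments: List[Fragment], tag: str) -> List[Fragment]:
    """Return every fragment carrying the tag, regardless of source document."""
    return [f for f in fragments if tag in f.tags]


def reassemble(fragments: List[Fragment], doc: str) -> str:
    """Rebuild one document by putting its fragments back in order."""
    parts = sorted((f for f in fragments if f.doc == doc), key=lambda f: f.position)
    return "\n\n".join(f.text for f in parts)


# Hypothetical store; fragments are deliberately out of order.
store = [
    Fragment("Deliveries go to the loading dock.", "handbook", 1, {"delivery", "main-office"}),
    Fragment("Welcome to the company.", "handbook", 0, {"intro"}),
    Fragment("Parts ship on Mondays.", "logistics", 0, {"delivery"}),
]

print([f.text for f in find(store, "main-office")])
print(reassemble(store, "handbook"))
```

Here the paragraph is the unit of retrieval, and the document becomes a view rebuilt on demand from the context markers.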