by dennisy 1089 days ago
Have you tried this on HTML?

Yes, I tried it on HTML to extract "metadata" that was not present in the HTML meta tags, such as author and publish date. It works well.
Actually, not on raw HTML, but with LangChain's WebBaseLoader, which strips away the HTML tags.
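For anyone wondering what "strips away HTML tags" amounts to, here is a rough stdlib-only sketch of the preprocessing step. To be clear, this is not LangChain's implementation and does not call WebBaseLoader; the names `TextExtractor` and `strip_tags` and the sample page are made up for illustration. The resulting plain text is what you would then pass on for author/date extraction.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text, skipping <script>/<style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # Entering a tag whose contents are not visible text.
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def strip_tags(html: str) -> str:
    """Return the text content of `html`, one text fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


# Made-up sample page: author and date appear only in the body text,
# not in any <meta> tag.
sample = (
    "<html><head><title>Post</title><style>p{color:red}</style></head>"
    "<body><p>By Jane Doe</p><p>Published 2021-05-01</p></body></html>"
)
text = strip_tags(sample)
print(text)
```

The tag-free text still contains the byline and the date, which is why extraction can recover them even when the meta tags are empty.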
Ahh cool thank you!