by joeriddles 851 days ago
I added my site and received the following response:

> josephriddle.com/ideas without update time

I looked into the source code to determine how it's finding the update time. Come to find out, it's using ChatGPT! [0] It appears to only be looking at the article contents for the date, not at any page metadata.

[0] https://github.com/lindylearn/aboutideasnow/blob/main/apps/a...


Yep, but there is a fallback to metascraper [0], which does check the HTML tags. However, the fallback didn't kick in when GPT returned a 1970 date -- I just fixed this! [1]

I think you can now remove the date from your post content and it should still work. If you submit your website again it should do a re-scrape if you changed the content text. Thanks for catching this :)

[0] https://metascraper.js.org/#/

[1] https://github.com/lindylearn/aboutideasnow/commit/8b0ea5b46...
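
For anyone curious what that kind of fallback could look like, here is a minimal sketch in Python. It is not the project's actual code: the names `pick_updated_at`, `is_plausible`, and the 1990 cutoff are assumptions made up for illustration. The idea is just to treat an implausible date (such as one in 1970) the same as a missing one, so the metadata-based date gets a chance.

```python
from datetime import datetime, timezone
from typing import Optional

# Assumed cutoff: anything earlier is treated as a bogus extraction result.
EARLIEST_PLAUSIBLE = datetime(1990, 1, 1, tzinfo=timezone.utc)


def is_plausible(date: Optional[datetime]) -> bool:
    """True if the date exists and lies between the cutoff and now."""
    if date is None:
        return False
    return EARLIEST_PLAUSIBLE <= date <= datetime.now(timezone.utc)


def pick_updated_at(
    llm_date: Optional[datetime], metadata_date: Optional[datetime]
) -> Optional[datetime]:
    """Prefer the LLM-extracted date, fall back to the HTML metadata date."""
    for candidate in (llm_date, metadata_date):
        if is_plausible(candidate):
            return candidate
    return None  # corresponds to "without update time"


epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
from_meta = datetime(2024, 1, 15, tzinfo=timezone.utc)
print(pick_updated_at(epoch, from_meta))
```

With a plain `is None` check instead of `is_plausible`, the 1970 value would win and the fallback would never run, which matches the behaviour described above.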

It would be nice if it also supported the If-Modified-Since and If-Unmodified-Since precondition headers.
Good idea, I just created https://github.com/lindylearn/aboutideasnow/issues/7 for this!
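
A rough sketch of what a conditional re-scrape with If-Modified-Since could look like, using only the Python standard library. The function names and the 10-second timeout are assumptions for illustration, and this says nothing about how the project will actually implement the issue. A server that honours the header answers 304 Not Modified, which `urllib` surfaces as an `HTTPError`.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Optional
import urllib.error
import urllib.request


def conditional_headers(last_scraped: datetime) -> dict:
    """Build the If-Modified-Since header from a timezone-aware datetime."""
    as_utc = last_scraped.astimezone(timezone.utc)
    # usegmt=True yields the HTTP date format, e.g. "Sun, 06 Nov 1994 08:49:37 GMT"
    return {"If-Modified-Since": format_datetime(as_utc, usegmt=True)}


def fetch_if_modified(url: str, last_scraped: datetime) -> Optional[bytes]:
    """Return the page body, or None if the server replies 304 Not Modified."""
    request = urllib.request.Request(url, headers=conditional_headers(last_scraped))
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return None  # unchanged since last scrape, skip re-processing
        raise
```

Servers that ignore the header simply return 200 with the full body, so the scraper would still need its own change detection as a backstop.
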
Does it also look at JSON-LD?
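
To make the question concrete: JSON-LD metadata lives in `<script type="application/ld+json">` blocks, and schema.org defines a `dateModified` property that a scraper could read. Below is a minimal standard-library sketch. The class and function names are made up for illustration, and it only checks top-level objects (no `@graph` traversal).

```python
import json
from html.parser import HTMLParser
from typing import Optional


class JsonLdCollector(HTMLParser):
    """Collects parsed JSON from every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # ignore malformed blocks


def jsonld_date_modified(html: str) -> Optional[str]:
    """Return the first top-level dateModified string found, else None."""
    collector = JsonLdCollector()
    collector.feed(html)
    for block in collector.blocks:
        items = block if isinstance(block, list) else [block]
        for item in items:
            if isinstance(item, dict) and item.get("dateModified"):
                return item["dateModified"]
    return None


page = (
    '<html><head><script type="application/ld+json">'
    '{"@type": "Article", "dateModified": "2024-01-15"}'
    "</script></head><body>Hello</body></html>"
)
print(jsonld_date_modified(page))
```
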
I tried to add https://jakeseliger.com, and I got an error saying that there is no "about" page. But if you look at https://jakeseliger.com/about, there is in fact one!
Oh, looks like the missing page detection went rogue in this case. It found the word "error" in your page and decided to use / instead of /about :)

I just fixed this, sorry!
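
For illustration, here is why matching on the word "error" misfires, and one possible alternative. This is a sketch under assumptions, not the actual fix: the function names, the 200-character threshold, and the sample text are all made up. It trusts the HTTP status code first and only treats very short pages with an explicit not-found phrase as missing.

```python
def naive_is_missing(body: str) -> bool:
    """The fragile approach: any page mentioning 'error' counts as missing."""
    return "error" in body.lower()


def is_missing(status_code: int, body: str) -> bool:
    """Trust the status code, then check for short 'soft 404' pages."""
    if status_code >= 400:
        return True
    text = body.strip().lower()
    # Assumed heuristic: real about pages are rarely this short.
    return len(text) < 200 and ("page not found" in text or "404" in text)


# Hypothetical about page that happens to use the word "error".
about_page = "I write about trial and error in grant writing. " * 10

print(naive_is_missing(about_page))  # flagged as missing by mistake
print(is_missing(200, about_page))   # kept as a valid page
```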

Thank you! I think the idea is very cool.