Show HN: Days Since Last Elon


	Show HN: Days Since Last Elon (dayssincelastelon.com)
	26 points by landric 1212 days ago
	A toy project I created to track the appearance of the text "Elon" on the front pages of various news sites.

4 comments

dgrin91 1212 days ago

This is funny. I'm curious how exactly you count an "Elon". For example, Google News shows no Elons, but that is almost certainly wrong.

Also, maybe you shouldn't be counting news aggregators like Google News? It's basically double counting, since the story is already on some other site.


landric 1212 days ago

I _had_ no good answer for the Google News result until you prompted me to Inspect source just now...

I'm basically scanning for <a> tags and searching the text within. Doing a Google News inspect, it appears that their links actually have no text, but are sibling elements of an <h#> tag. So, I need to figure out how to parse that correctly...
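For illustration, here is a minimal sketch of that kind of scan using only Python's standard library. It is not the site's actual code: the names `AnchorTextCollector`, `anchor_texts`, and `count_elons` are hypothetical, and the case-insensitive whole-word match is an assumption about what counts as an "Elon".

```python
import re
from html.parser import HTMLParser


class AnchorTextCollector(HTMLParser):
    """Collects the text nested inside each <a> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # > 0 while the parser is inside an <a> element
        self.parts = []   # text fragments of the anchor currently open
        self.texts = []   # one entry per anchor, possibly an empty string

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.depth += 1
            if self.depth == 1:
                self.parts = []

    def handle_endtag(self, tag):
        if tag == "a" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Join fragments and collapse runs of whitespace.
                self.texts.append(" ".join("".join(self.parts).split()))

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


def anchor_texts(html):
    parser = AnchorTextCollector()
    parser.feed(html)
    parser.close()
    return parser.texts


# Assumption: whole-word, case-insensitive match, so "melon" does not count.
ELON = re.compile(r"\belon\b", re.IGNORECASE)


def count_elons(html):
    return sum(1 for text in anchor_texts(html) if ELON.search(text))
```

On markup like the Google News case described above, where the anchor is empty and the title sits in a neighbouring heading, `anchor_texts` yields an empty string for that link, so the mention is missed.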


filoleg 1212 days ago

> Doing a Google News inspect, it appears that their links actually have no text, but are sibling elements of an <h#> tag. So, I need to figure out how to parse that correctly...

I just checked Google News myself, and you are correct that the sibling <h#> tag has the text. However, the <a> tag with the link has it too, as an attribute rather than as nested text. Unless I am mistaken about how that attribute is used here, you can just extract the text from the aria-label attribute of the <a> tag.

And in case you want to proceed with parsing text from the sibling <h#> tag instead, you can just get the list of the parent <article> tag's child nodes (yourAnchorTagNode.parentNode.parentNode.children; I had to do a double .parentNode because the <a> tag is wrapped in a single <div> tag) and then search for the only <h#> tag there. That will be your target tag with the text.
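A rough Python equivalent of the heading lookup, as a sketch only. It assumes the layout described above (an `<article>` containing a `<div>`-wrapped `<a>` plus a single `<h#>` tag) and that articles are not nested; `ArticleHeadingCollector` and `headings_by_link` are hypothetical names. Instead of walking up from the anchor, it pairs every link in an article with that article's heading text.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class ArticleHeadingCollector(HTMLParser):
    """Pairs each link inside an <article> with that article's heading text."""

    def __init__(self):
        super().__init__()
        self.in_article = False
        self.in_heading = False
        self.hrefs = []          # links seen in the current article
        self.heading_parts = []  # heading text fragments in the current article
        self.results = []        # (href, heading text) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.in_article = True
            self.hrefs = []
            self.heading_parts = []
        elif self.in_article and tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)
        elif self.in_article and tag in HEADINGS:
            self.in_heading = True

    def handle_endtag(self, tag):
        if tag in HEADINGS:
            self.in_heading = False
        elif tag == "article" and self.in_article:
            title = " ".join("".join(self.heading_parts).split())
            self.results.extend((href, title) for href in self.hrefs)
            self.in_article = False

    def handle_data(self, data):
        if self.in_article and self.in_heading:
            self.heading_parts.append(data)


def headings_by_link(html):
    parser = ArticleHeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.results
```

This only helps on pages that happen to use the same article/heading layout, so it is a site-specific rule rather than a general one.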


landric 1212 days ago

Yep, that's right.

I was _hoping_ to get away with the same xml-parsing for each site, but I guess I'll need to customize.


filoleg 1212 days ago

Practically speaking, you might actually sorta get away with it by using a single if-check, as long as you go with the aria-label approach instead of the <h#> sibling node search.

My logic is that it is very unlikely that another website will copy the exact HTML layout of Google News, so the <h#> approach is only going to work there. But I bet that Google News is far from the only website that puts the article title text inside the aria-label attribute of the <a> tag.

So you can cover a large majority of the websites you care about (if not all of them) by just checking both the inner text and (in case the inner text is absent) the aria-label attribute. No need for any custom logic implemented just for Google News, as it would likely solve this issue for a lot of other sources.
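The single if-check could look something like the sketch below, again using only Python's standard library. `LinkLabelCollector` and `link_labels` are hypothetical names, and preferring the inner text over aria-label is the ordering suggested above, not a confirmed detail of the site.

```python
from html.parser import HTMLParser


class LinkLabelCollector(HTMLParser):
    """Yields one label per <a>: its inner text, else its aria-label attribute."""

    def __init__(self):
        super().__init__()
        self.open = False   # True while inside an <a> element
        self.parts = []     # inner text fragments of the open anchor
        self.aria = ""      # aria-label of the open anchor, if any
        self.labels = []

    def handle_starttag(self, tag, attrs):
        if tag == "a" and not self.open:
            self.open = True
            self.parts = []
            # A valueless attribute is reported as None, hence the `or ""`.
            self.aria = dict(attrs).get("aria-label") or ""

    def handle_endtag(self, tag):
        if tag == "a" and self.open:
            text = " ".join("".join(self.parts).split())
            # The single check: use aria-label only when inner text is absent.
            self.labels.append(text or self.aria.strip())
            self.open = False

    def handle_data(self, data):
        if self.open:
            self.parts.append(data)


def link_labels(html):
    parser = LinkLabelCollector()
    parser.feed(html)
    parser.close()
    return parser.labels
```

The same parser then runs unchanged on every site, and links with neither inner text nor an aria-label simply produce an empty label.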


unglaublich 1212 days ago

If you include your own website, you can simplify the algorithm to "0".


sys42590 1212 days ago

Stack was hugged to death... does anyone have a screenshot or description?


bragr 1212 days ago

Calendars for a number of news sites and aggregators, showing, per their methodology, when "Elon" appeared on each front page. I have to say that a couple of the results seem suspect to me, so I question the methodology.


landric 1212 days ago

OP here. Happy to share any details as I deal with the hug-of-death...

What seemed off to you?


aristophenes 1212 days ago

For one, this article is now on the main page of Hacker News, and the site reports four days since the last Elon on Hacker News :D Might just be that it hasn’t been on the front page long?


croes 1212 days ago

And Twitter itself.

Can't open Twitter without one of Elon's tweets on top of it.
