| HN Mirror


	by smt88 2701 days ago
	FB intentionally obfuscates their HTML classes and IDs to make it harder to scrape or to build browser extensions on top of their site. It's almost impossible to keep up these days.

1 comments

stratenjine 2701 days ago

They do obfuscate, but they still have to follow basic markup rules, and obfuscation can't beat CSS selectors. Keeping up with Facebook changes does not even require releasing new versions, as updated regex(-ish) selectors can be downloaded. I think relying on such obfuscation is (theoretically) a losing battle.
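
A minimal sketch of the "downloadable selectors" idea above, not the commenter's actual code. It assumes posts carry a stable attribute such as `role="article"`; that attribute, the class names, the `id`, and the `RULES_JSON` layout are all hypothetical. Plain attribute matching stands in here for the regex(-ish) selectors, and the rules are ordinary data that could be fetched at runtime instead of shipped with a release.

```python
import json
from html.parser import HTMLParser

# Hypothetical rule set. In practice this JSON could be downloaded at runtime,
# so updating it would not require releasing a new version.
RULES_JSON = '{"post": {"tag": "div", "attrs": {"role": "article"}}}'


class RuleMatcher(HTMLParser):
    """Collects start tags whose name and attributes satisfy one rule."""

    def __init__(self, rule):
        super().__init__()
        self.rule = rule
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if tag != self.rule["tag"]:
            return
        attr_map = dict(attrs)
        # Every attribute named in the rule must be present with that value.
        # class and id are never consulted, since those are the obfuscated parts.
        if all(attr_map.get(k) == v for k, v in self.rule["attrs"].items()):
            self.matches.append(attr_map)


def find(html, rule):
    """Return the attribute maps of all elements matching the rule."""
    matcher = RuleMatcher(rule)
    matcher.feed(html)
    return matcher.matches


rules = json.loads(RULES_JSON)
# Made-up markup with random-looking class names and id.
page = (
    '<div class="_5pcb x1a2b">'
    '<div role="article" class="_4-u2 q9z" id="u_0_k">hi</div>'
    '</div>'
)
found = find(page, rules["post"])
```

The match survives any renaming of the class names and the `id`, but only as long as the assumed `role` attribute stays in place.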

db48x 2701 days ago

I don't work at Facebook, but I don't think that's intended primarily as an obfuscation step. Instead it's a _compilation_ step. It ensures that CSS rules for widget X can't accidentally apply to, and therefore break, widget Y.

But yes, it is effectively obfuscated, and it would be foolish to try to reverse-engineer it every time the identifiers change, since they're effectively random.
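
A generic sketch of what such a compilation step could look like, assuming class names are derived by hashing the widget name together with the local class name. This is not Facebook's actual build process; `scoped_class`, `compile_css`, and the widget names are hypothetical.

```python
import hashlib


def scoped_class(widget, local_name, length=8):
    """Derive a short class name unique to the (widget, local_name) pair."""
    digest = hashlib.sha256(f"{widget}:{local_name}".encode()).hexdigest()
    # The "_" prefix keeps the result a valid CSS identifier even when the
    # digest starts with a digit.
    return "_" + digest[:length]


def compile_css(widget, rules):
    """Rewrite each local class selector into its widget-scoped form."""
    return "\n".join(
        f".{scoped_class(widget, name)} {{ {body} }}"
        for name, body in rules.items()
    )


# Both widgets use the local name "title", but the compiled names differ,
# so a rule written for WidgetX cannot match WidgetY's markup.
x_title = scoped_class("WidgetX", "title")
y_title = scoped_class("WidgetY", "title")
css = compile_css("WidgetX", {"title": "font-weight: bold"})
```

The output looks random to an outside reader, which is why it is effectively obfuscated even if scoping is the main goal.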

stratenjine 2701 days ago

Actually, I built a side project/POC six months ago to check whether the concept works.

Recently I picked it up again and continued. The selectors are still valid.

Just sayin :)

smt88 2701 days ago

They also change the HTML structure. There's no selector that works well when the data, structure, class names, and IDs can change between page loads.
