Hacker News
by bluelightning2k 793 days ago
Amazing that this was done by a non-professional dev. Huge kudos. (Tbh why not become a professional dev? You clearly have the skill and at least some interest.)

There's a huge miss in this implementation though: 95% of the time, what is on screen is web browsing. So going from nicely formatted markup with title tags -> video -> OCR clearly misses a more obvious path (either a proxy or a browser extension).

2 comments

I'm not so sure. This generalizes, plus "nicely formatted markup" is a big assumption in the brave new world of modern web apps. I guess it's probably pretty reliable to look for visible text strings, but there's something cool about only caring about what is directly visible.

I've often thought that screen readers might be better off now by attempting this visual approach. It used to not be possible; now it may be easier than relying on accessible markup.

Thank you! However, coding feels quite daunting to me, as my talents lean more towards design. The reason I embarked on this project is that I strongly wanted to address a particular need, but had been unable to find an appropriate alternative or programming friends willing to help. Due to my limited development capabilities, many aspects of this app are probably not implemented optimally or on sound foundations. Potentially, an open-source approach may inspire others with the same needs to collaborate or provide better solutions.

And yes, most of my screen time is also web browsing. Currently, I can use the web page's window title as a reference, which provides quite a bit of information. To enhance this feature, I might consider using something like ChromeDriver to gather more details, such as web links and page text, similar to what 'Rewind' offers.
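
For what it's worth, here is a rough sketch of what gathering "web links and page text" from markup could look like. It is not how the app works today, just an illustration using only Python's standard-library `html.parser`. The names `PageSummary` and `summarize` are made up for this example, and it assumes the page HTML has already been obtained somehow (with Selenium and ChromeDriver that would presumably be `driver.page_source`).

```python
from html.parser import HTMLParser


class PageSummary(HTMLParser):
    """Hypothetical helper: collects title, link targets, and visible text."""

    # Text inside these tags is never shown to the user, so it is ignored.
    SKIPPED = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self.text = []
        self._stack = []  # currently open <title> or skipped tags

    def handle_starttag(self, tag, attrs):
        if tag == "title" or tag in self.SKIPPED:
            self._stack.append(tag)
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        chunk = data.strip()
        if not chunk:
            return
        current = self._stack[-1] if self._stack else None
        if current == "title":
            self.title += chunk
        elif current is None:
            self.text.append(chunk)


def summarize(html):
    """Return the title, links, and visible text of an HTML string."""
    parser = PageSummary()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title,
        "links": parser.links,
        "text": " ".join(parser.text),
    }
```

Calling `summarize(html)` on a page's source returns a dict that could be stored next to the window title for that moment. One caveat: this only sees what is in the markup, so text rendered into a canvas or loaded after the snapshot would still need the OCR path.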