Yeah, there were several attempts (including ar5iv), but distill.pub is no longer active and Semantic Scholar is PDF-based.
None of them made full use of HTML or had a robust conversion system. Jeff Dean's post is awesome, though using Gemini 3 is compute-intensive and may still hallucinate in the end (I'm using a source-based LaTeX-to-JSON parser). And the output is still... not very interactive.
Just passing by to mention that if you get excited about seeing your upgrades in arXiv itself, we can talk about contributing them to the arXiv HTML pages.
But seeing your plans for Science Stack, all the best with the endeavour!
And I am curious to know if arXiv:2105.10386 works well.
It works! After the initial data load (big paper), scrolling and performance are smooth.
You can view it at sciencestack.ai/arxiv/2105.10386
Note: no support for nomenclature/index yet.
I'm also working on refactoring the data/JSON to a streaming model (right now it's one big JSON dump on load).
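
For illustration only, here is a minimal sketch of what going from one big JSON dump to a streaming model could look like. It assumes the paper can be represented as a flat list of independent blocks and serialized as newline-delimited JSON, one block per line. The block shape (`type`, `text`) and the function names `write_ndjson` and `stream_blocks` are hypothetical, not Science Stack's actual format or API.

```python
import io
import json
from typing import Iterable, Iterator, TextIO


def write_ndjson(blocks: Iterable[dict], out: TextIO) -> None:
    """Serialize each block as one JSON document per line."""
    for block in blocks:
        out.write(json.dumps(block) + "\n")


def stream_blocks(src: TextIO) -> Iterator[dict]:
    """Yield blocks one at a time as lines are read, without parsing the whole payload."""
    for line in src:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


# Hypothetical paper content: a flat list of typed blocks.
paper = [
    {"type": "section", "text": "Introduction"},
    {"type": "paragraph", "text": "First paragraph."},
    {"type": "equation", "text": "E = mc^2"},
]

buf = io.StringIO()
write_ndjson(paper, buf)
buf.seek(0)

# The consumer can handle the first block before the rest has been parsed.
first = next(stream_blocks(buf))
```

The point of the line-per-block layout is that a reader can start rendering as soon as the first line arrives, rather than waiting for the closing bracket of a single large document.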