Hacker News
by jasondavies 4841 days ago
Yes. I will probably add the core wordtree layout as a plugin to https://github.com/d3/d3-plugins

The whole application ties together text processing, data retrieval, the wordtree and longscroll.js for fast rendering of the text view on the right-hand side: https://github.com/d3/d3-plugins/tree/master/longscroll

1 comment

Any way to shift the text processing to the client? I'd like to use the bookmarklet on some academic papers (many behind a paywall), but the few I've tried only seem to parse the abstract. I assume this is because the text processing is happening server-side, but I could be wrong.

Alternatively, could you release your backend code as well? I'd like to run this on larger corpora.

Very elegant and useful project!

The text is in fact processed in the client, and is quite fast even for large corpora such as the whole Bible: http://www.jasondavies.com/wordtree/?source=kjv.txt&pref...

It attempts to access URLs directly, but this only works if the server sends the appropriate CORS headers, which it hardly ever does.

Otherwise, it falls back to using a proxy, which means the client only sees what the proxy sees. However, you can also paste raw text on the main page.
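To make the direct-then-proxy behaviour concrete, here is a rough TypeScript sketch of that fallback. It is not the actual wordtree code: the function names, the `proxyBase` parameter, and the `?url=` query scheme are all assumptions made for illustration, and the fetch function is passed in so the sketch does not depend on any particular environment.

```typescript
// Minimal shape of the response fields this sketch relies on.
interface TextResponse {
  ok: boolean;
  text(): Promise<string>;
}

type FetchLike = (url: string) => Promise<TextResponse>;

// Hypothetical proxy URL scheme; the real proxy's interface is not described here.
function proxyUrlFor(proxyBase: string, target: string): string {
  return proxyBase + "?url=" + encodeURIComponent(target);
}

// Try the target URL directly; if that fails, retry through the proxy.
async function fetchTextWithFallback(
  target: string,
  proxyBase: string,
  fetchFn: FetchLike
): Promise<{ text: string; via: "direct" | "proxy" }> {
  try {
    const direct = await fetchFn(target);
    if (direct.ok) {
      return { text: await direct.text(), via: "direct" };
    }
    // A non-OK status falls through to the proxy attempt below.
  } catch {
    // In a browser, a request blocked by missing CORS headers rejects,
    // so we land here and fall through to the proxy as well.
  }
  const proxied = await fetchFn(proxyUrlFor(proxyBase, target));
  if (!proxied.ok) {
    throw new Error("Proxy request failed for " + target);
  }
  // The text returned here is whatever the proxy could see,
  // e.g. only the abstract of a paywalled paper.
  return { text: await proxied.text(), via: "proxy" };
}
```

In a browser you would pass the global `fetch` as `fetchFn`. The `via` field is only there to show which path supplied the text.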

I could imagine modifying the bookmarklet so it lifts the text directly from the browser instead of just copying the URL. This would solve the proxy issue neatly and would also work for local-only or intranet sites, for which the proxy also fails.
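As a rough idea of what "lifting the text directly from the browser" might look like, here is a small TypeScript sketch. It is only an assumption about one possible approach, not the bookmarklet's actual code: `liftPageText` and `PageLike` are hypothetical names, and how the lifted text would then be handed to the wordtree page is left open.

```typescript
// Minimal structural type so the sketch also runs outside a browser;
// the browser's `document` object satisfies it.
interface PageLike {
  body: { innerText?: string; textContent: string | null } | null;
}

// Read the text the browser has already loaded (including paywalled or
// intranet pages the user can see) and collapse runs of whitespace.
function liftPageText(page: PageLike): string {
  if (!page.body) {
    return "";
  }
  // Prefer innerText (roughly the rendered text); fall back to textContent.
  let raw = page.body.innerText;
  if (raw === undefined) {
    raw = page.body.textContent === null ? "" : page.body.textContent;
  }
  return raw.replace(/\s+/g, " ").trim();
}
```

Inside a bookmarklet this would be called as `liftPageText(document)`, so no request ever goes through the proxy.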