Show HN: Text data browser for NLP, LLM researchers and developers (github.com)
2 points by eulerian 479 days ago
I created an app to easily browse and analyze large text datasets (local or remote). The app supports many data formats including JSONL and HuggingFace. Key features include:

Intuitive Navigation: Effortlessly browse local (or remote) data in HuggingFace, JSONL, and other formats.

Efficient Browsing: Stream large local (or remote) datasets without loading (or downloading) them in full.

Powerful Analysis: Easily filter and sort data for better insights.

Pretty-Print Code: Human-friendly rendering of code embedded in your data.
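For anyone curious what streaming plus filtering looks like in principle, here is a minimal sketch in plain Python. It is not datahawk's actual implementation; the function names `stream_jsonl` and `first_matches` are made up for illustration, and it only assumes a local JSONL file with one JSON object per line.

```python
import itertools
import json
from typing import Any, Callable, Dict, Iterator, List


def stream_jsonl(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed record per non-empty line, reading the file lazily."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def first_matches(
    path: str,
    predicate: Callable[[Dict[str, Any]], bool],
    limit: int = 10,
) -> List[Dict[str, Any]]:
    """Return up to `limit` records that satisfy `predicate`.

    Reading stops as soon as enough matches are found, so the whole
    file is never held in memory.
    """
    matches = filter(predicate, stream_jsonl(path))
    return list(itertools.islice(matches, limit))
```

Because `stream_jsonl` is a generator, only the current line is held in memory at a time, which is the general idea behind browsing a large file without loading it whole. Sorting is a different story: a full sort needs to see every record, so it cannot be done purely as a stream.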

The package lives at https://github.com/nihaljn/datahawk and contributions are welcome!

Setup and usage are very simple: `pip install datahawk; datahawk -p $port`