For the last year I’ve been developing Hyperparam, a collection of small, fast, dependency-free open-source libraries designed for data scientists and ML engineers to actually look at their data.

- Hyparquet: Read any Parquet file in browser/Node.js
- Icebird: Explore Iceberg tables without needing Spark/Presto
- HighTable: Virtual scrolling of millions of rows
- Hyparquet-Writer: Export Parquet easily from JS
- Hyllama: Read llama.cpp .gguf LLM metadata efficiently

CLI for viewing local files: npx hyperparam dataset.parquet

Example dataset on Hugging Face Space: https://huggingface.co/spaces/hyperparam/hyperparam?url=http...

No cloud uploads. No backend servers. A better way to build frontend data applications.

GitHub: https://github.com/hyparam

Feedback and PRs welcome!
> This stems from an industry-wide realization that model performance is ultimately bounded by data quality, not just model architecture or hyperparameters.
Generally we think of model architecture + weights (parameters) as making up the model itself, while hyperparam(s|eters) govern how one arrives at those weights. For this reason they are more relevant to the efficacy of training than to the performance of the resultant model.
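To make the distinction concrete, here is a toy sketch (not taken from Hyperparam or any real training setup; the `train` function and the numbers are made up for illustration). The weight `w` is the parameter that ends up in the model, while `learning_rate` and `steps` are hyperparameters that only shape how training gets there.

```python
def train(xs, ys, learning_rate, steps):
    """Fit y ~ w * x by gradient descent on mean squared error."""
    w = 0.0  # parameter: learned from the data, part of the final model
    n = len(xs)
    for _ in range(steps):
        # Gradient of mean((w*x - y)^2) with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


# Hypothetical data generated by y = 2x, so the ideal weight is 2.0
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

# Same architecture and data; only the hyperparameters differ
w_good = train(xs, ys, learning_rate=0.05, steps=200)
w_slow = train(xs, ys, learning_rate=0.0001, steps=200)

print(w_good, w_slow)
```

Both runs produce a model of exactly the same shape (a single weight). The hyperparameters never appear in that model, but with the smaller learning rate training stops well short of the ideal weight within the same number of steps.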