Hey HN! I'm open-sourcing DataKit today.

GitHub: https://github.com/datakitpage/datakit
Live demo: https://datakit.page

DataKit is a browser-based data analysis platform that processes multi-gigabyte files (CSV, Parquet, JSON, Excel) entirely client-side using DuckDB-WASM. Your data never leaves your browser.

What it does:
• Process large files (tested up to 20GB) without any server
• Full SQL interface powered by DuckDB compiled to WebAssembly
• Python notebooks via Pyodide for data science workflows
• Connect to remote sources (PostgreSQL, MotherDuck, S3) with optional proxy
• AI assistant that only sees column schemas, not actual data

I was done with having to choose between cloud tools and heavy local installations. I wanted something that just works in a browser tab but has real power.

It's AGPL-licensed, with commercial licenses available for enterprises. I've been building this solo as a side project for the past few months.

Would love your feedback on:
- Performance bottlenecks you encounter
- Features you'd need for your workflows
- The architecture decisions (all client-side vs hybrid)
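
On the "AI assistant only sees column schemas" point, here is a minimal sketch of what a schema-only prompt builder could look like. It is illustrative and not taken from the DataKit codebase: `ColumnInfo`, `TableSchema`, and `buildSchemaPrompt` are hypothetical names, and the prompt wording is an assumption.

```typescript
// Hypothetical shapes for table metadata; not DataKit's actual types.
interface ColumnInfo {
  name: string;
  type: string; // e.g. "VARCHAR", "BIGINT"
}

interface TableSchema {
  table: string;
  columns: ColumnInfo[];
}

// Build the text sent to the assistant from schema metadata only.
// The function never receives row values, so none can reach the prompt.
function buildSchemaPrompt(schemas: TableSchema[], question: string): string {
  const described = schemas
    .map((s) => {
      const cols = s.columns.map((c) => `  ${c.name} ${c.type}`).join(",\n");
      return `TABLE ${s.table} (\n${cols}\n)`;
    })
    .join("\n\n");

  return `Schema:\n${described}\n\nQuestion: ${question}\nAnswer with a single SQL query.`;
}
```

The idea is that the assistant gets enough structure to write a query, and the query then runs locally against the data, so only names and types would leave the browser.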