Ask HN: Is there a case for small / medium data
2 points
by pklee
1851 days ago
A lot of innovation happens in big data (data that needs distributed compute, e.g. Spark): the ability to blend data across sources, schemaless / schema-on-the-fly handling, deploying analytical models to production, etc. Is there a case for similar innovations in small to medium data (working with a ~10M dataset), such as blending across data sources, simple analytical models, and so on? What percentage of use cases are in the big data realm vs. small/medium data?
|
There is definitely research being done on sparse data sets. Early stats methods were applied to what we would consider small data. Tukey did a lot of work on data viz and exploratory data analysis that was important and applies to small data sets. Many medical experiments use small data sets. Bayesian methods can apply to small data sets.