by teddyuk 2587 days ago
This really is true. I worked on a project where we were meant to be getting hourly files into a data lake. The files were so small that we couldn't get anywhere near the recommended size of 256 MB per file (compressed Parquet in Azure ADLS); they were around 1 MB each. A year's worth of data was tiny, and the processing overhead was ridiculous.
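For a rough sense of scale, here is a back-of-the-envelope sketch of those numbers. It assumes exactly one file per hour, a flat 1 MB per file, and a non-leap year; the function and constant names are made up for illustration and don't come from any real pipeline.

```python
import math

HOURS_PER_YEAR = 24 * 365   # assumption: one file per hour, non-leap year
FILE_SIZE_MB = 1            # assumption: the "around 1 MB each" figure, taken as exact
TARGET_FILE_MB = 256        # the recommended per-file size quoted above


def yearly_summary(file_size_mb=FILE_SIZE_MB, target_mb=TARGET_FILE_MB):
    """Return (files written per year, total MB per year, files needed at the target size)."""
    files = HOURS_PER_YEAR
    total_mb = files * file_size_mb
    # Round up: a partially filled last file still counts as a file.
    files_at_target = math.ceil(total_mb / target_mb)
    return files, total_mb, files_at_target


files, total_mb, files_at_target = yearly_summary()
print(f"{files} small files per year, {total_mb} MB in total")
print(f"the same data fits in {files_at_target} files of {TARGET_FILE_MB} MB")
```

Under those assumptions, a whole year of hourly files adds up to less than 9 GB, and it would take more than ten days of data to fill a single file of the recommended size.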

If your workflows don't bog down your servers, add big data technologies until they do!