by DiscreteTom
595 days ago
I tried splitting a large dataset into thousands of files on S3 and using Step Functions Distributed Map to launch thousands of Lambda instances that process those files in parallel, with DuckDB (or other libraries) running inside each Lambda. The parallel loading and processing is much faster than doing the same work on a single large EC2 instance.
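For anyone curious what the per-file worker might look like, here is a rough sketch of one Lambda invocation, not my exact code. It assumes the files are Parquet, that each Distributed Map item reaches the function as an object with a `Key` field (plus an optional `Bucket` field or a `DATA_BUCKET` environment variable, both names made up for this example), and that `boto3` and `duckdb` are packaged with the function. The row count is just a placeholder for whatever query you actually run.

```python
import os
import tempfile


def parse_item(event, default_bucket=None):
    """Return (bucket, key) for one map item.

    Assumption: the item carries a "Key" field; "Bucket" and the
    DATA_BUCKET environment variable are hypothetical names.
    """
    bucket = event.get("Bucket") or default_bucket or os.environ.get("DATA_BUCKET")
    key = event.get("Key")
    if not bucket or not key:
        raise ValueError("event must identify an S3 bucket and key")
    return bucket, key


def process_file(local_path):
    """Run a placeholder query (a row count) over one local Parquet file."""
    import duckdb  # imported lazily so the helpers above work without it

    con = duckdb.connect()  # in-memory database, one per invocation
    try:
        rows = con.execute(
            "SELECT count(*) FROM read_parquet(?)", [local_path]
        ).fetchone()[0]
    finally:
        con.close()
    return {"rows": rows}


def handler(event, context):
    """Lambda entry point: download one file to /tmp, then query it."""
    import boto3  # imported lazily for the same reason as duckdb

    bucket, key = parse_item(event)
    # /tmp is the writable scratch space inside a Lambda function.
    with tempfile.TemporaryDirectory(dir="/tmp") as tmp:
        local_path = os.path.join(tmp, "input.parquet")
        boto3.client("s3").download_file(bucket, key, local_path)
        result = process_file(local_path)
    return {"bucket": bucket, "key": key, **result}
```

Each invocation only ever touches one file, so the state machine decides how wide the fan-out goes and the function itself stays small. Downloading to `/tmp` first is just one option; reading straight from S3 inside DuckDB would also work but needs extra extension and credential setup.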