Hacker News
by kvz 3407 days ago
Getting an index of (millions of) files on S3 is very slow for us; it takes days. Is there anything you do to work around this? Since this is not an AWS Lambda project, it seems the client first has to acquire an index from S3 before the concurrency benefits set in?
1 comment

This has nothing to do with AWS Lambda; I'm thinking about renaming it to "functional-s3" or something similar.

To answer your question, there isn't really a workaround for this yet, although indexing should be much quicker than "days". All the keys are listed recursively before the lambda expression runs locally. If you have a huge number of files, this can take several minutes, maybe hours, depending on the scope.

A workaround I've been considering is using a generator function to list the keys; that way, the lambda expression can start immediately, generating keys as it needs them.
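
To make the generator idea concrete, here is a rough sketch, not the project's actual code. The names `iter_keys`, `fake_fetch_page` and `FETCH_LOG` are made up for illustration, and the page fetcher is a stand-in for a real paginated S3 list call. The point is only that keys are yielded page by page, so work on the first keys can start after a single list request.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical interface: given a continuation token (None for the first
# page), return the keys on that page and the token for the next page
# (None when there are no more pages).
PageFetcher = Callable[[Optional[str]], Tuple[List[str], Optional[str]]]

# Records which pages were requested, so the laziness is observable.
FETCH_LOG: List[Optional[str]] = []


def iter_keys(fetch_page: PageFetcher) -> Iterator[str]:
    """Yield keys page by page instead of building the full index first."""
    token: Optional[str] = None
    while True:
        keys, token = fetch_page(token)
        yield from keys
        if token is None:
            return


def fake_fetch_page(token: Optional[str]) -> Tuple[List[str], Optional[str]]:
    """Stand-in for a paginated S3 listing: three pages of two keys each.

    Assumption: a real fetcher would wrap something like boto3's
    list_objects_v2, passing the token as ContinuationToken.
    """
    pages = {
        None: (["a/1", "a/2"], "p2"),
        "p2": (["b/1", "b/2"], "p3"),
        "p3": (["c/1", "c/2"], None),
    }
    FETCH_LOG.append(token)
    return pages[token]
```

Something like `for key in iter_keys(fake_fetch_page): process(key)` (with `process` being whatever function you want to apply) would begin processing after the first page is fetched; later pages are only requested once the earlier keys have been consumed.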