
If you're committed to using a 3rd-party API, parallelizing your API calls is probably the easiest way to speed things up. The benefit of a 3rd-party API is, of course, that you can likely generate embeddings with a much more powerful model. That said, you may not need something as powerful as PaLM, and sending everything over a network might just take too long. IME (which is entirely use-case dependent), something like SentenceTransformers, even with the smallest pretrained models, can get you up and running on your own infra pretty quickly and generate embeddings with reasonable performance in a reasonable amount of time on modest hardware.
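For what it's worth, here's a rough sketch of what parallelizing the API calls could look like, using only the Python standard library. `fake_embed_batch` is a made-up stand-in for whatever network call your provider's client exposes, and `embed_parallel`, `batch_size`, and `max_workers` are names and defaults I picked for illustration, not anything from a real SDK. Threads are a reasonable fit here because the work is network-bound rather than CPU-bound.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Vector = List[float]


def fake_embed_batch(texts: Sequence[str]) -> List[Vector]:
    """Hypothetical stand-in for one network call to an embedding API.

    Swap this for your provider's client call. It returns a toy
    2-dimensional vector per input so the sketch runs offline.
    """
    return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]


def embed_parallel(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], List[Vector]] = fake_embed_batch,
    batch_size: int = 16,   # assumed value; check your API's per-request limit
    max_workers: int = 8,   # assumed value; check your API's rate limits
) -> List[Vector]:
    """Split texts into batches and send the batches concurrently."""
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input batches,
        # so the output lines up with the original texts.
        results = pool.map(embed_batch, batches)
        return [vec for batch in results for vec in batch]


if __name__ == "__main__":
    docs = [f"document number {i}" for i in range(100)]
    vectors = embed_parallel(docs, batch_size=10, max_workers=4)
    print(len(vectors))
```

In practice you'd also want retries and backoff around the real call, since most hosted APIs will rate-limit you once you start fanning out requests.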