You’d want persistence, since the embedding process takes some time. But you don’t need to go all Pinecone on this. There’s FAISS and there’s hnswlib, for example: libraries that run inside your own process, a bit like SQLite for vector search.
Friendly reminder that we (Pinecone) have a free tier that holds up to ~5M SBERT embeddings (768 dimensions each). For quick projects, going "all Pinecone on this" could turn out to be the easier and faster option.
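To put that number in perspective, here is a back-of-the-envelope estimate of the raw size of 5M vectors at 768 dimensions. It assumes 4-byte float32 values, which is a guess about the storage format and not something stated above, and it ignores any index overhead.

```python
num_vectors = 5_000_000   # ~5M embeddings, as quoted above
dim = 768                 # dimensions per embedding
bytes_per_value = 4       # assumption: float32 storage

# Raw vector data only; a real index adds its own overhead on top.
raw_bytes = num_vectors * dim * bytes_per_value
raw_gb = raw_bytes / 1e9       # decimal gigabytes
raw_gib = raw_bytes / 2**30    # binary gibibytes

print(f"{raw_gb:.2f} GB ({raw_gib:.2f} GiB) of raw vector data")
```

Under that assumption the raw vectors alone come to roughly 15 GB, which is worth keeping in mind when deciding between a hosted index and one held in local memory.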
I like to stand up for the little guy. I hear Pinecone this and Pinecone that. And nobody seems to pay any attention to the awesome dude who made hnswlib.