Just create a RAG with Wikipedia as the corpus and a low-parameter model to run it, and you basically have an instantly queryable corpus of human knowledge that runs on an old Raspberry Pi.
Even then, it depends on how you use them. Some embeddings pack the highest signal into the leading dimensions, so you can truncate the vector, while most cannot. You might want that truncated version for a fast, dirty index. The same goes for using multiple models with differing vector sizes for the same content.
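To make the truncation idea concrete, here is a minimal sketch of a two-stage lookup: a cheap scan over truncated vectors, then a re-rank of the shortlist with the full vectors. It assumes the embeddings really do front-load their signal, which is only true for models trained that way. The function names (`truncate`, `two_stage_search`) and the parameters `coarse_dims` and `shortlist` are made up for illustration, and a real index would precompute the truncated vectors rather than slice them per query.

```python
import math


def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def truncate(v, dims):
    """Keep only the first `dims` components, then re-normalize."""
    return normalize(v[:dims])


def dot(a, b):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


def two_stage_search(query, docs, coarse_dims, shortlist, top_k):
    """Return indices of the best `top_k` docs for `query`.

    Stage 1 ranks every doc using only the first `coarse_dims` dimensions.
    Stage 2 re-ranks the best `shortlist` candidates with the full vectors.
    """
    q_small = truncate(query, coarse_dims)
    coarse = sorted(
        range(len(docs)),
        key=lambda i: dot(q_small, truncate(docs[i], coarse_dims)),
        reverse=True,
    )[:shortlist]

    q_full = normalize(query)
    ranked = sorted(
        coarse,
        key=lambda i: dot(q_full, normalize(docs[i])),
        reverse=True,
    )
    return ranked[:top_k]


# Toy 4-dimensional "embeddings"; real ones have hundreds of dimensions.
docs = [
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.4],
    [0.0, 1.0, 0.0, 0.0],
    [0.8, 0.2, 0.5, 0.0],
]
query = [1.0, 0.0, 0.0, 0.5]
best = two_stage_search(query, docs, coarse_dims=2, shortlist=3, top_k=2)
```

The coarse pass alone would put document 0 first here; the full-vector re-rank is what moves document 1 ahead of it, which is the trade-off the fast, dirty index makes.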
Do you preprocess your text? There will be a model there. Likely the same model you would use to process the query.
There is also a model for answering questions from the retrieved context. Sometimes that is a different model. [2]
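A rough sketch of how those pieces fit together, with toy stand-ins instead of real models: `embed` is a bag-of-words counter standing in for an embedding model, and `read` is a sentence picker standing in for a question-answering model. Both are hypothetical placeholders. The only point is that the same `embed` is applied to the passages and to the query, while the reader is a separate component that can be swapped independently.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    overlap = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return overlap / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(passages):
    """Embed every passage once, up front."""
    return [(p, embed(p)) for p in passages]


def retrieve(index, question, k=1):
    """Embed the query with the same model used for the passages, then rank."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: similarity(q, item[1]), reverse=True)
    return [passage for passage, _ in ranked[:k]]


def read(question, context):
    """Toy stand-in for a reader model: pick the best-matching sentence."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]
    q = embed(question)
    return max(sentences, key=lambda s: similarity(q, embed(s)))


def answer(index, question):
    """Retrieve context with one component, answer from it with another."""
    context = " ".join(retrieve(index, question, k=1))
    return read(question, context)


passages = [
    "Paris is the capital of France. It lies on the Seine.",
    "The Raspberry Pi is a small single-board computer. It was first released in 2012.",
]
index = build_index(passages)
result = answer(index, "What is the capital of France?")
```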
> on an old raspberry pi
I bet the LLM responses will be great... You're better off just opening up a raw text dump of Wikipedia markup files in vim.