That's the only way to do it. You can't index the whole thing. The challenge is chunking. There are several different algorithms to chunk content for vectorization with different pros and cons.
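For a concrete picture, here is a minimal sketch of one of the simplest approaches: fixed-size chunks with overlap, split on whitespace. The function name `chunk_words` and the default sizes are made up for illustration and not taken from any particular library. Real pipelines often split on sentences, headings, or tokens instead.

```python
def chunk_words(text, chunk_size=200, overlap=50):
    """Split text into chunks of up to `chunk_size` words.

    Consecutive chunks share `overlap` words, so a concept that straddles
    a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the window has reached the end of the text,
        # otherwise the last chunk would be a redundant tail.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The tradeoff is the usual one: smaller chunks match queries more precisely but lose surrounding context, and more overlap helps with boundary cases at the cost of a bigger index.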
As far as I understand it, long contexts degrade LLM performance. Just because an LLM "supports" a large context length doesn't mean it uses all of it well: it basically pays attention to a chunk at the top and a chunk at the bottom and skips over the middle bits.
Why would you want chunks that big for vector search? Wouldn't there be too much information in each chunk, making it harder to match a query to a concept within the chunk?
In theory this would be an efficiency boost, but the performance math can be tricky.