The concept behind this is reference stability, and if you need a collection that has stable references, you must introduce a level of indirection, that is, instead of a vector<T>, you use a vector<unique_ptr<T>> and then you can take references as follows:
std::dequeue is as good as useless. The defaults for "batch size" in different compilers are at extreme opposites of the tradeoff spectrum. So unless you really don't care about performance, memory or portability, it's not a datastructure you can rely on.
From memory:
In MSVC, dequeue will allocate memory for every single element if your elements are > 8 bytes. This will never be changed, due to ABI compatibility.
Clang and gcc have batching sizes of 1K and 4K (i.e. you throw out a whole page of memory even if your dequeue contains only 1 element).
I had a vague feeling that std::deque is a "heavy" thing which you shouldn't have a million of, but iterating through a couple big ones is pretty fast. 1–4K batches wouldn't hurt my feelings. Looked up GCC, it's actually slightly less heavy at 512 bytes per node[1]. But the MSVC part — that caught me completely off guard.
In general it's better to use the same standard library everywhere - discrepancies like that occur for almost every type so if you care about having the same performance on every platform... Either use libc++ or boost
I would generally suggest avoiding reference stability here (extra heap allocations) and going with the offset-based approach mentioned in the other responses.
I would generally suggest going for correctness over performance and the solution I provided is correct in the general case. Using an offset is only correct in the special case where objects will not be inserted or removed at an index less than the offset, otherwise you will end up with bugs as the offset becomes invalid upon such operations.
Furthermore, depending on the size of T, the performance penalty of the extra heap allocations is amortized over the cost of resizing the vector. That is vector reallocation is significantly faster for a unique_ptr<T> than it is for T when T is large and almost all memory allocators are tuned to allocate objects close together in space when they are allocated close together in time, so you don't lose the cache locality or need to worry about memory fragmentation.
In addition to other answers, sometimes you do know the final/max length of the vector when you construct it. In that case reserve() can reserve the necessary space, and as long as you stay under the limit all the addresses will remain valid.
(Though it's still pretty brittle, so you may want to add a ton of comments to warn yourself in the future...)