What are we doing wrong? We use Prometheus for many things. For example, I'd like to know how specific latencies have changed over time. Why should I store these numbers somewhere else?
Prometheus is not intended as durable long-term storage; it's fundamentally limited to the size of a single machine. You should also design your monitoring to tolerate completely losing the data of a Prometheus server.
The problem is (as you know) that single machines are in practice still so reliable, and Prometheus is still so good at storing data for long periods, that many people have come to rely on it despite the warnings :)
We recommend using another system for long-term data. See https://prometheus.io/docs/operating/integrations/#remote-en... for some examples.
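As a rough sketch of what that can look like: Prometheus can forward samples to an external system through the `remote_write` section of its configuration file. The endpoint URL below is a hypothetical placeholder, not a real service; the actual URL and any authentication settings depend on which long-term storage system you pick.

```yaml
# prometheus.yml (fragment)
# Assumption: a remote-storage system is listening at this made-up URL.
remote_write:
  - url: "https://long-term-storage.example.com/api/v1/write"
```

With something like this in place, the local Prometheus keeps serving recent data, while questions such as how a latency has changed over a long period are answered from the remote system.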