Check the model's memory requirements (they can be hard to find) and compare them with the memory you plan to run it in: VRAM if you are running on a GPU, system RAM if you are running on a CPU.
E.g., this extended-context Llama 3 70B requires 64GB at 256K context and over 100GB at 1M.
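As a rough illustration of that comparison, here is a minimal Python sketch. The function name `fits_in_memory`, the 10% headroom, and the 80 GB of available memory are all assumptions made up for the example; the 64 GB and 100 GB figures are the ones quoted above, with 100 GB treated as a lower bound since the 1M-context requirement is "over 100GB".

```python
def fits_in_memory(required_gb: float, available_gb: float, headroom: float = 0.1) -> bool:
    """Return True if the requirement fits while keeping a fraction of memory free.

    headroom is a hypothetical safety margin (default 10%) for the OS,
    drivers, and anything else sharing the same memory.
    """
    usable_gb = available_gb * (1.0 - headroom)
    return required_gb <= usable_gb


# Requirements quoted above for the extended-context Llama 3 70B example.
# The 1M figure is "over 100GB", so 100 is only a lower bound.
requirements_gb = {"256K context": 64, "1M context": 100}

# Hypothetical value: replace with your GPU's VRAM (GPU inference)
# or your system RAM (CPU inference).
available_gb = 80

for label, required in requirements_gb.items():
    verdict = "fits" if fits_in_memory(required, available_gb) else "does not fit"
    print(f"{label}: needs at least {required} GB, {verdict} in {available_gb} GB")
```

The headroom is there because a requirement that exactly equals your total memory will usually not run comfortably in practice; adjust or drop it to taste.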