Here's an example: https://github.com/silphendio/sliced_llama
A gist pertaining to said example: https://gist.github.com/silphendio/535cd9c1821aa1290aa10d587...
Here's a discussion about integrating this capability with ExLlama: https://github.com/turboderp/exllamav2/pull/275
And same as above but for llama.cpp: https://github.com/ggerganov/llama.cpp/issues/4718#issuecomm...