by leroman
847 days ago
- we have llama.cpp (could be enough, or at least, as mentioned in the paper, a co-processor can be added to accelerate the calculation; less need for large RAM / high-end hardware)
- as most of the work is inference, we might not need as many GPUs
- consumer cards (24 GB) could possibly run the big models
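
To put rough numbers on the 24 GB point, here is a back-of-the-envelope sketch. The 70B parameter count and the bit widths are my own illustrative assumptions, not figures from the paper. It counts weight storage only, ignoring the KV cache, activations, and runtime overhead, so real usage would be higher.

```python
# Rough estimate of weight memory at different precisions.
# Assumptions (illustrative only): a hypothetical 70e9-parameter model,
# a 24 GB consumer card, and weights as the only memory consumer.

def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def fits_on_card(n_params: float, bits_per_weight: float, card_gb: float = 24.0) -> bool:
    """True if the weights alone would fit in the card's memory."""
    return weight_memory_gb(n_params, bits_per_weight) <= card_gb


if __name__ == "__main__":
    n_params = 70e9  # hypothetical "big model"
    for bits in (16, 8, 4, 2):
        gb = weight_memory_gb(n_params, bits)
        print(f"{bits:>2} bits/weight: {gb:6.1f} GB, fits in 24 GB: {fits_on_card(n_params, bits)}")
```

Under these assumptions, the weights only drop below 24 GB once the average is somewhere under about 2.7 bits per weight, which is why very low-bit formats matter for the consumer-card case.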