by jmward01
8 days ago
Yeah, that is the plan I think I have settled on. I'll release something interesting here shortly, but I am still weighing my options on the full architecture, including all the multimodal input/output streaming. I may even try to get to the stage of a moderately well-trained 1-2B model and host it to show how transformative cached states are compared to cached tokens.