Hacker News
by jmward01 9 days ago
As a follow-up: I can see there is not a lot of belief, which is why it is also hard to find a company to partner with on this. So how -do- you make money on something like this as an independent researcher? Maybe I release trick one and show how guided window attention (and NN memory, and probably a lot of robotics) can be trained? Thoughts? I can do that pretty quickly. By itself that is pretty great tech (combined with fixed windows of full attention it is pretty amazing). The second trick, I think, is a bit more powerful, although both are general purpose. If I do this, do you think people will believe trick two (and all the real-time multimodal streaming stuff)?
1 comment

Demonstrate results. If you can produce results that are somehow better than what already exists, it doesn't matter much what the actual trick is. If the way your results are better is difficult to explain without significant technical background knowledge, you might be limited to only a small pool of angel investors at first, but you only need to convince one to get funding for a better demo and intros to VCs with deeper pockets.
Yeah. That is the plan I think I have settled on. I'll release something interesting here shortly, but the full architecture, including all the multimodal input/output streaming, is something I am still considering my options on. I may even try to get to the stage of a moderately well-trained 1-2B model and host it, to show how transformative cached states are compared to cached tokens.