by wcallahan
321 days ago
I just used GPT-OSS-120B on a cross-Atlantic flight on my MacBook Pro (M4, 128GB RAM). A few things I noticed:
- it’s only fast with a small context window and a small total token count; once you’re past ~10k tokens you’re basically queueing everything for a long time
- MCPs/web search/URL fetch have already become a very important part of interacting with LLMs; when they’re not available, the LLM’s utility is greatly diminished
- a lot of CLI/TUI coding tools (e.g., opencode) did not work reliably offline with the model, despite being set up before I went offline
That’s in addition to the other quirks others have noted with the OSS models.
|
I think 99% of web searches lead to the same 100-1k websites. I assume a local copy of those would take only a few GB, but keeping one raises copyright concerns.