| HN Mirror


by seangrogg 829 days ago

This feels... a bit obvious to the point of being silly?

It is fairly well-established that context windows are a general issue among LLMs, because the compute cost of attention in SOTA models still scales somewhere worse than linearly with context length. It's also fairly well-established that LLMs aren't necessarily good at things they weren't trained on.

If you are unwilling or unable to throw enough hardware at the context window problem to overcome it, you'll need to reduce the context. If you're unwilling or unable to train the LLM for the task, you'll have to restructure the information so that the task is more tractable.

I'm glad to see that, given their constraints, they chose a sensible solution for the business. Overall, though, this really seems like a series of known limitations being called out, and it doesn't feel like a good look coming from a company that touts leveraging AI for pulling information from documents and integrating with existing systems...

1 comments

araghuvanshi 829 days ago

I don't think that this is obvious at all. Yes, AI people who read papers on arxiv and know what "SOTA" stands for know it, but that is no longer the main user base of LLMs.

This is meant for the developer who doesn't fit the above profile and thinks that a model with a million-token context window that "can handle complex analysis, longer tasks with multiple steps, and higher-order math and coding tasks" (direct quote from Anthropic's website) can actually do those things.

seangrogg 829 days ago

Valid! I think the disparity is that the article appears to be written for a fairly technical crowd, but the expectations appear to come from how these particular models are marketed. Most people who are fine-tuning LLMs or are aware of LongRoPE for extending context windows are probably consumers of research/white papers rather than marketing material.

Having read some of your other comments, it appears that part of the issue is that you were marketed a 1 million token context window and research has shown that's not quite the case. That said, the article doesn't do a good job of painting that picture. It is alluded to with "all fail at this task despite having big context windows", but I think it's worth being crystal clear here that the marketing says 1M and that, in your experience and according to research findings, that claim is disingenuous.