by Balgair 408 days ago
Yeah, that was intentional, well, somewhat.

The project requires the full list of every known city in the western hemisphere, plus Japan, Korea, and Taiwan. But that dataset is just maddeningly large, if it's possible to compile at all. Like, I expect it to take me years, as I have to do a lot of translations. So I figured I'd be nice and just ask the various models for the top 250.

There's a lot more data that we're trying to get too, and I'm hoping I can get approval to post it, as it's a work thing.

Sounds like you're having it conduct research and then solve the knapsack problem for you on the collected data. We should do the same for the traveling salesman one.

How do you validate its results in that scenario? Just take its word for it?

Ahh, no. We'll be doing more research on the data once we have it. Things like ranking and averages and distributions on the data will come later, but first we just need it to begin with.

If you have the data, but need to parse all of it, couldn’t you upload it to your LLM of choice (with a large enough context window) and have it finish your project?

I'm sorry I was unclear. No, I do not have the data yet and I need to get it.

Well, remember that listing and ranking things is structurally hard for these models, because the model has to keep track of what it has already listed and what it hasn't, etc.
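
One way around that bookkeeping is to do it outside the model. Here's a minimal sketch in Python, assuming you ask for cities in batches and merge the answers yourself. The names `normalize` and `merge_batch` are made up for illustration, and treating accent and case variants as the same city is my assumption, not something from the project above.

```python
import unicodedata


def normalize(name: str) -> str:
    """Fold case, accents, and extra whitespace so near-duplicates compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def merge_batch(seen: dict[str, str], batch: list[str]) -> list[str]:
    """Record unseen names from `batch` in `seen`; return only the new ones."""
    new_names = []
    for name in batch:
        key = normalize(name)
        if key and key not in seen:
            seen[key] = name  # keep the first spelling we saw
            new_names.append(name)
    return new_names


# Hypothetical batches, standing in for two separate model responses.
seen: dict[str, str] = {}
first = merge_batch(seen, ["São Paulo", "Lima", "Bogotá"])
second = merge_batch(seen, ["Sao Paulo", "lima ", "Quito"])
```

The second call only returns names that weren't in the first batch, so the model never has to remember what it already listed. This does nothing about wrong or invented entries, which is the validation question raised earlier in the thread.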