by zozbot234 336 days ago
> trying to cross-reference the tons of downloaded games on my HDD - for which I only have titles, as I never bothered to do any further categorization over the years other than the place I got them from - with Wikipedia articles - assuming they have one - to organize them by genre, add some info, etc. After some experimentation it turns out an LLM - specifically a quantized Mistral Small 3.2 - can make some sense of the chaos while being fast enough to run from scripts via a custom llama.cpp program

You can do this a lot easier with Wikidata queries, and that will also include known video games for which an English Wikipedia article doesn't exist yet.
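
A minimal sketch of what such a query could look like from Python, assuming the public Wikidata SPARQL endpoint and the usual identifiers (P31 "instance of", Q7889 "video game", P136 "genre"). The function names and the User-Agent string are made up for illustration, and the exact-label match is a simplification: it is case-sensitive and will miss games that are only listed under an alias.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata query endpoint (needs network access).
WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"


def build_genre_query(title: str) -> str:
    """Build a SPARQL query for the genres of video games with this exact English label."""
    # Escape backslashes and double quotes so the title is a valid SPARQL string literal.
    escaped = title.replace("\\", "\\\\").replace('"', '\\"')
    return f'''
SELECT ?game ?genreLabel WHERE {{
  ?game wdt:P31 wd:Q7889 ;
        rdfs:label "{escaped}"@en .
  OPTIONAL {{ ?game wdt:P136 ?genre . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
'''


def parse_genres(result: dict) -> dict:
    """Group genre labels by game URI from a SPARQL JSON result."""
    genres = {}
    for row in result["results"]["bindings"]:
        game = row["game"]["value"]
        genres.setdefault(game, [])
        # genreLabel is absent when the item has no genre statement.
        if "genreLabel" in row:
            genres[game].append(row["genreLabel"]["value"])
    return genres


def fetch_genres(title: str) -> dict:
    """Run the query against the live endpoint and return {game URI: [genre, ...]}."""
    params = urllib.parse.urlencode({"query": build_genre_query(title), "format": "json"})
    request = urllib.request.Request(
        WIKIDATA_SPARQL + "?" + params,
        headers={"User-Agent": "game-catalog-sketch/0.1 (hypothetical)"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_genres(json.load(response))
```

Calling `fetch_genres("Tron 2.0")` would return whatever Wikidata currently holds for that label, so the quality of the result depends entirely on the data there.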


I'm not sure about this, I just checked Tron 2.0 (just a random game I thought of) and Wikidata seems to have wrong info (e.g. genre) compared to the Wikipedia article. Also, I need it to describe a bit what the game is about, since I want to generate an HTML file with all the games and do a quick scan of them, and Wikidata doesn't have that.

IGDB would be a better source than Wikidata (especially since it does have a small description too), but I wanted to do things offline. And having Wikipedia locally doesn't hurt. And TBH I don't think it'd be any easier; extracting the data from Wikipedia pages was the most trivial part.
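
To give an idea of why the extraction step is small, here is a rough sketch, assuming the local copy is raw wikitext and the article uses the video game infobox with a `genre` parameter. The helper names are hypothetical, this is not necessarily how the actual script does it, and it ignores nested templates and multi-line values.

```python
import re


def infobox_field(wikitext: str, field: str):
    """Return the raw value of a single-line '| field = value' infobox entry, or None."""
    pattern = re.compile(
        r"^[ \t]*\|[ \t]*" + re.escape(field) + r"[ \t]*=[ \t]*(.*)$",
        re.MULTILINE,
    )
    match = pattern.search(wikitext)
    return match.group(1).strip() if match else None


def strip_links(value: str) -> str:
    """Turn [[Target|Label]] into Label and [[Target]] into Target."""
    return re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", value)
```

Something like `strip_links(infobox_field(page_text, "genre"))` then gives a plain-text genre string for pages where the field is present; `infobox_field` returns `None` otherwise, so that case needs a check first.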

That said, I'll need to use some other source at some point since, as you mentioned, Wikipedia does not have everything.

Would MobyGames be a better source for this information? The information is curated, and an API is available.