by nicola_alessi 195 days ago
You are 100% correct, and this is the central limitation. An LLM like ChatGPT, trained on general web text, is a terrible movie recommendation engine for exactly the reasons you state. Its knowledge is broad but shallow, skewed toward popular discourse, and it will happily confabulate titles.

Our approach with lumigo.tv is different by necessity, and it's a direct response to the problem you've nailed. We don't use an LLM for knowledge.

Here's the technical split:

The LLM is strictly a query translator. Its only job is to take your messy, natural language prompt ("a gloomy noir set in a rainy city") and convert it into a structured set of searchable tags, genres, and metadata filters. It is forbidden from generating or hallucinating movie titles, actors, or plots.

The recommendations come from a structured database. Those translated filters are executed against a traditional database of movies/shows (we've integrated with TMDB and similar sources). The results are ranked by existing metrics like popularity, rating, and release date. The LLM never invents a result; it can only return what exists in the connected data.

You're right that pure collaborative filtering (like Netflix's) has a massive data advantage for mainstream tastes. Where it falls short is for edge cases and specific intent. If you want "movies like the third act of Parasite," a collaborative filter has no vector for that. Our hypothesis is that a human can describe that intent, an LLM can map it to tags (e.g., "class tension," "thriller," "dark comedy"), and a database can find matches.
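
To make the split concrete, here is a rough Python sketch of the idea, not our production code. Everything in it is an assumption for illustration: the tag vocabulary, the `sanitize` and `search` helpers, the ranking order, and the placeholder catalog. The LLM call itself is left out; `raw` stands in for whatever JSON the model returns. The point is that the model's output is clamped to a closed vocabulary, and titles can only come from the catalog.

```python
from dataclasses import dataclass

# Hypothetical closed vocabulary; in practice this would come from the
# tag and genre lists of the connected movie database.
ALLOWED_GENRES = {"crime", "thriller", "comedy", "drama"}
ALLOWED_KEYWORDS = {"film noir", "rain", "class tension", "dark comedy"}


@dataclass(frozen=True)
class Movie:
    title: str
    genres: frozenset
    keywords: frozenset
    rating: float
    popularity: float


@dataclass(frozen=True)
class Query:
    genres: frozenset
    keywords: frozenset


def sanitize(raw: dict) -> Query:
    """Keep only known genres/keywords; ignore every other field.

    Anything else the model emits (for example a 'titles' field) is
    dropped, so it cannot inject a result of its own.
    """
    genres = {g.lower() for g in raw.get("genres", [])} & ALLOWED_GENRES
    keywords = {k.lower() for k in raw.get("keywords", [])} & ALLOWED_KEYWORDS
    return Query(frozenset(genres), frozenset(keywords))


def search(catalog: list, query: Query, limit: int = 10) -> list:
    """Return catalog entries matching the filters, best match first.

    A movie must carry every requested genre and, if keywords were
    requested, at least one of them. Ranking (an assumption): keyword
    overlap, then rating, then popularity.
    """
    matches = [
        m for m in catalog
        if query.genres <= m.genres
        and (not query.keywords or query.keywords & m.keywords)
    ]
    matches.sort(
        key=lambda m: (len(query.keywords & m.keywords), m.rating, m.popularity),
        reverse=True,
    )
    return matches[:limit]


# Placeholder catalog; titles and numbers are made up for the example.
CATALOG = [
    Movie("Placeholder A", frozenset({"crime", "drama"}),
          frozenset({"film noir", "rain"}), 7.9, 40.0),
    Movie("Placeholder B", frozenset({"crime"}),
          frozenset({"film noir"}), 8.4, 60.0),
    Movie("Placeholder C", frozenset({"comedy"}),
          frozenset({"rain"}), 9.0, 90.0),
    Movie("Placeholder D", frozenset({"crime", "thriller"}),
          frozenset({"class tension"}), 8.0, 70.0),
]

# Stand-in for the model's output for "a gloomy noir set in a rainy city".
# The unknown genre and the invented title are both discarded by sanitize().
raw = {
    "genres": ["Crime", "Musical"],
    "keywords": ["film noir", "rain"],
    "titles": ["Made Up Movie"],
}

query = sanitize(raw)
results = [m.title for m in search(CATALOG, query)]
print(results)
```

Note that the sketch also shows the weak spot: a phrase only matches if the vocabulary already has a tag for it, so coverage depends on how rich the underlying metadata is.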

So, it's not AI vs. collaborative filtering. It's AI as a natural-language front-end to a traditional database. The AI handles the "what I want" translation; the database handles the "what exists" retrieval. This avoids the hallucination problem but still allows for queries that a "Because you watched..." algorithm could never process.

Does that distinction make sense? It's an attempt to use each tool for what it's best at.

2 comments

Maybe it's just me, but I find it weird to ask for a movie with very detailed characteristics. What I care about above all is watching a good movie rather than wasting my time on a bad one. I have a long list of movies that I plan to watch because I expect them to be good. My mood decides the order in which I watch them, that's all. That's why I prefer collaborative filtering: I want to find movies that I'll like, I don't care if the city is rainy or sunny.
I'm convinced that in the future (5 or 10 years from now) you'll tell the AI precisely what movie you want to watch and it'll generate it on the fly. If you don't like the direction the story takes, you'll ask it to change course. It'll be the end of cinema as we know it today. I'm not sure it's a future that excites me :(
Yes, it does make sense, and it's a very interesting approach. So if you ask for "a gloomy noir set in a rainy city", it'll be translated into TMDB Keywords? I doubt that the TMDB Keywords have that depth (yet another data problem). How do you translate "in a rainy city"?