| I'm thinking about the same thing, i.e. building a personal HN reader with Python and Qt. I have never explored the API, and I'm not a professional programmer, but allow me to describe the ideas off the top of my head from reading your posts: Requirements: 1 - Self-hosted: Essentially, it's too heavy to keep the whole site offline. However, because I always read by topic (e.g. I'd type "SICP ycombinator" into Google and read the top pages), I think one approach is to let the user enter a topic, say "SICP", as well as the number of top-level stories to return, say 25, and invoke the API to return those stories. The app would then dump the stories and their comments into a database with a data model suited to a forum. 2 - Offline access of data: Essentially what I mentioned above. The app should also allow the user to remove a story (and its children), modify topics, favorite a story and create new topics. I think those are the bare-bones requirements. The backend would interact with the database and do the updates. 3 - Query data via SQL: I think it might be too much work to parse queries, and the easiest thing to do is to just accept a string as the query and pass it to the database engine. Or maybe only allow the user certain actions and let the backend assemble the queries. 4 - Full text search of stories and comments: Not sure what to do here, as my programming knowledge is very limited. I saw the second link you provided and it is very interesting. 5 - Notification of replies to comments: Maybe give the user a button to update all of their favorite stories. TBH I really want to see what other people's implementations will look like. |
1) It is possible to keep the whole site offline. A database from 2017 ( https://archive.org/details/hackernews-2017-05-18.db ) is 9 GB. I think a 2020 DB could be less than 20 GB.
2) My focus is on reading, not writing. Local favorites make sense. Maybe with importing public favorites. A user can set their name without logging in.
3/4) SQLite does the heavy lifting here.
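
To make that a bit more concrete, here is a rough sketch of what I have in mind, using only Python's built-in `sqlite3` module. The table names (`items`, `items_fts`), the columns and the helper functions are all made up for illustration and are not taken from any existing HN dump. It also assumes your SQLite build includes the FTS5 extension, which is common but not guaranteed.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) a local store. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS items (
            id     INTEGER PRIMARY KEY,  -- HN item id
            parent INTEGER,              -- NULL for top-level stories
            type   TEXT NOT NULL,        -- 'story' or 'comment'
            author TEXT,
            title  TEXT,
            body   TEXT
        );
        -- Full-text index over titles and comment text (requires FTS5).
        CREATE VIRTUAL TABLE IF NOT EXISTS items_fts USING fts5(title, body);
        """
    )
    return conn


def add_item(conn, item_id, parent, type_, author, title, body):
    """Store one story or comment and index its text under the same rowid."""
    conn.execute(
        "INSERT INTO items (id, parent, type, author, title, body) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (item_id, parent, type_, author, title, body),
    )
    conn.execute(
        "INSERT INTO items_fts (rowid, title, body) VALUES (?, ?, ?)",
        (item_id, title or "", body or ""),
    )


def search(conn, query):
    """Return ids of items matching an FTS5 query, best matches first."""
    rows = conn.execute(
        "SELECT rowid FROM items_fts WHERE items_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in rows]


def delete_story(conn, story_id):
    """Remove an item and all of its descendants; return how many were removed."""
    ids = [
        row[0]
        for row in conn.execute(
            """
            WITH RECURSIVE tree(id) AS (
                SELECT id FROM items WHERE id = ?
                UNION ALL
                SELECT items.id FROM items JOIN tree ON items.parent = tree.id
            )
            SELECT id FROM tree
            """,
            (story_id,),
        )
    ]
    params = [(i,) for i in ids]
    conn.executemany("DELETE FROM items WHERE id = ?", params)
    conn.executemany("DELETE FROM items_fts WHERE rowid = ?", params)
    return len(ids)
```

With something like this, "query data via SQL" is just handing the user's string to `conn.execute`, and "remove a story and its children" is a single recursive query. If you do pass raw user SQL through, opening the file read-only is probably a sensible precaution.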