by BoorishBears
1168 days ago
Not to go full "Dropbox in a weekend", but if you're technical enough to self-host, this is something you can build for yourself.

Everyone is going straight to embeddings, but it'd be easy enough to use old-school NLP summarization from NLTK (https://www.nltk.org/). Hook that up to a web scraping library like https://scrapy.org/ and get a summary of each page. Then embed a site map in your system prompt and use langchain (https://github.com/hwchase17/langchain) to let GPT query for a specific page's summary.

The point of this isn't to say that's how OP did it, but there might be people seeing stuff like this and wondering how on earth to get into it: this is something you could build in a weekend with pretty much no understanding of AI.
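To make the shape of that concrete, here's a rough sketch of the summarize-and-look-up part. It's not how OP built it and it's not NLTK: the frequency-based scorer below is a standard-library stand-in for old-school extractive summarization, and every name in it (`summarize`, `build_index`, `site_map_prompt`, `lookup_summary`, the example URLs) is made up for illustration. In a real build the `pages` dict would come from your scraper, and `lookup_summary` is the function you'd expose to GPT as a tool.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; NLTK ships a much fuller one).
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "for",
    "on", "with", "that", "this", "you", "your", "can", "are", "as", "be",
}


def _tokens(text: str) -> list[str]:
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def summarize(text: str, max_sentences: int = 2) -> str:
    """Extractive summary: keep the sentences whose words are most frequent overall."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(_tokens(text))

    def score(sentence: str) -> float:
        words = _tokens(sentence)
        # Average word frequency, so long sentences don't win by length alone.
        return sum(freq[w] for w in words) / (len(words) or 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)


def build_index(pages: dict[str, str]) -> dict[str, str]:
    """Map each scraped URL to a short summary of its text."""
    return {url: summarize(body) for url, body in pages.items()}


def site_map_prompt(index: dict[str, str]) -> str:
    """Text to embed in the system prompt so the model knows which pages exist."""
    listing = "\n".join(f"- {url}" for url in sorted(index))
    return "You can look up a summary for any of these pages:\n" + listing


def lookup_summary(index: dict[str, str], url: str) -> str:
    """The function you'd let the model call to fetch one page's summary."""
    return index.get(url, "No such page in the site map.")


if __name__ == "__main__":
    # Hypothetical scraped pages; a scraper would produce this dict.
    pages = {
        "https://example.com/docs/install": "Install the package with pip. "
        "The package needs Python 3.9. Windows users may need extra steps.",
        "https://example.com/docs/usage": "Import the package and call run. "
        "The run call takes a config path. Logs go to stdout.",
    }
    index = build_index(pages)
    print(site_map_prompt(index))
    print(lookup_summary(index, "https://example.com/docs/usage"))
```

Swapping the scorer for something from NLTK, or for an LLM call, doesn't change the rest: the system prompt only carries the list of URLs, and the model pulls in one summary at a time when it asks for it.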
|
What people want is something they can run on their own hardware without sending their queries to some third party service which is doing who knows what with them.
This is already possible if you're willing to mess around with very new code that isn't in system repositories yet and buy expensive hardware to make it fast, but you can imagine why some people don't have the time or money for that.
I'm waiting for Intel or AMD to realize there would be a line out the door if they'd make a CPU with an iGPU that could use system memory and run these models at even a quarter of the speed of typical discrete GPUs.