|
|
|
|
|
by netsectoday
2225 days ago
|
|
You can boot up your own custom search engine in a few minutes with YaCy (Ya See!) an open-source, P2P, Dockerized crawler and search engine built on top of Solr. https://yacy.net/ If you're generous; you can make your index available to other P2P instances. I wanted to run an API search the other week and was blown away with how quickly I could prop-up my own custom search portal (I didn't want to pay for API access to other search engines, and YaCy comes with a JSON and Solr endpoints). I ran it locally to test my crawl filters, then pushed a private instance out to Digital Ocean to turn up the heat with the crawling. The only issue I had was the crawler would hit the max memory threshold on long crawls and the container would restart, but that was fixed by scaling up the box. |
|
While I typically still use RSS for reading music blogs, I find having the search engine is a great way to go back and find something or discover something new! Every time I find a new blog, I just add it as an index to yacy to crawl.
I think it'd be great to see people spinning up larger instances that are highly specialized. For example, maybe a search engine that is dedicated solely to sci-fi and only crawls high quality boards, personal sites and blogs, and skips all the spammy, seo-optimized sites.