|
|
|
|
|
by searchcord
400 days ago
|
|
Hey, Thanks for your feedback. For software, I use ScyllaDB and Elasticsearch. It's split across 6 physical nodes (8 including the CDN). Data collection is handled using standard user accounts, accessing only public, discoverable servers. I plan to write a blog post about the technical aspect of how this was done soon. Admins of these servers weren't contacted, as the content indexed is already publicly accessible, comparable to a forum like this or public subreddit. That said, I understand the sensitivity around data visibility, and I've made it very simple for any user to opt out of indexing at any time. Private or invite-only servers are, of course, completely excluded. |
|