Hacker News
by nyclounge 510 days ago
Why is ClickHouse exposing unauthenticated database access at port 9000 to the public? Is this the default behavior or did DeepSeek open it up for dev purposes?
4 comments

ClickHouse does not allow external connections by default.

If someone wants to configure unauthenticated access from the Internet, they have to take the following extra steps:

- enable listening to the wildcard address;

- remove IP filtering for the default user;

- set up no-password authentication.

It is possible to ignore and turn off all the guardrails the system has by default, but it takes extra effort. However, it's possible that someone copy-pasted a wrong configuration file from somewhere without knowing what was inside, or did something like listening on localhost but exposing the ports from Docker.
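For anyone who wants to check their own setup against the three steps above, here is a minimal sketch. It assumes the usual XML layout (`listen_host` in the server config, `users/<name>/networks/ip` and `users/<name>/password` in the users config); the `audit` function and the messages it returns are hypothetical names made up for illustration, and it does not cover the Docker port-exposure case.

```python
import xml.etree.ElementTree as ET

# Values treated as "open to everyone" in this sketch.
WILDCARD_HOSTS = {"::", "0.0.0.0"}
OPEN_NETWORKS = {"::/0", "0.0.0.0/0"}
HASHED_PASSWORD_TAGS = ("password_sha256_hex", "password_double_sha1_hex")


def audit(server_xml: str, users_xml: str, user: str = "default") -> list[str]:
    """Return a list of findings for the three conditions named above."""
    findings = []

    # Step 1: is the server listening on a wildcard address?
    server = ET.fromstring(server_xml)
    hosts = {(el.text or "").strip() for el in server.iter("listen_host")}
    if hosts & WILDCARD_HOSTS:
        findings.append("server listens on a wildcard address")

    users = ET.fromstring(users_xml)
    entry = users.find(f"./users/{user}")
    if entry is None:
        return findings

    # Step 2: is IP filtering for the user effectively removed?
    ips = {(el.text or "").strip() for el in entry.findall("./networks/ip")}
    if ips & OPEN_NETWORKS:
        findings.append(f"user '{user}' accepts connections from any address")

    # Step 3: is the password explicitly empty, with no hashed password set?
    password = entry.find("password")
    has_hash = any(entry.find(tag) is not None for tag in HASHED_PASSWORD_TAGS)
    if password is not None and not (password.text or "").strip() and not has_hash:
        findings.append(f"user '{user}' has an empty password")

    return findings
```

All three findings together are what makes a server reachable without credentials; any one of them alone is a warning sign rather than a confirmed exposure.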

A use case for direct database access exists, and is acceptable, assuming you set up a readonly user, grant access to specific tables, limit queries by complexity, and limit total usage by quotas. This is demonstrated by the following public services:

https://play.clickhouse.com/

https://adsb.exposed/

https://reversedns.space/

In this way, ClickHouse can be used to implement public data APIs (which is probably not what DeepSeek wanted).
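A rough sketch of what the read-only setup described above could look like, written as a small Python helper that only builds the SQL text. The statements follow my reading of ClickHouse's SQL access-control syntax (`CREATE USER ... SETTINGS`, `GRANT SELECT`, `CREATE QUOTA`) and should be checked against the docs before use. The helper name, the user/database/table names, and all limit values are hypothetical.

```python
import re

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _ident(name: str) -> str:
    """Accept only plain identifiers so nothing unexpected ends up in the SQL."""
    if not _IDENT.match(name):
        raise ValueError(f"not a plain identifier: {name!r}")
    return name


def readonly_user_statements(
    user: str,
    database: str,
    tables: list[str],
    max_execution_time_s: int = 10,
    max_rows_to_read: int = 1_000_000_000,
    queries_per_hour: int = 1000,
) -> list[str]:
    """Build statements for a read-only user with limited grants and a quota."""
    user = _ident(user)
    database = _ident(database)
    statements = [
        # Read-only user, with query complexity limited via settings.
        f"CREATE USER {user} IDENTIFIED WITH no_password "
        f"SETTINGS readonly = 1, "
        f"max_execution_time = {int(max_execution_time_s)}, "
        f"max_rows_to_read = {int(max_rows_to_read)}"
    ]
    # Access to specific tables only.
    for table in tables:
        statements.append(f"GRANT SELECT ON {database}.{_ident(table)} TO {user}")
    # Total usage limited by a quota.
    statements.append(
        f"CREATE QUOTA {user}_quota FOR INTERVAL 1 hour "
        f"MAX queries = {int(queries_per_hour)} TO {user}"
    )
    return statements
```

Usage would be something like `readonly_user_statements("public_reader", "demo", ["events"])`, then running each returned statement from an admin account.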

ClickHouse has a wide range of security and access control features: authentication with SSL certificates or SSH keys; simple password-based auth that supports bcrypt and short-lived credentials; integration with LDAP and Kerberos; network-level limits on every authentication method; full Role-Based Access Control; and fine-grained restrictions on query complexity and resource consumption, plus user quotas.

But still, according to Shodan (https://www.shodan.io/search?query=clickhouse), there are 33,000 misconfigured ClickHouse servers on the Internet. This can be attributed to the high popularity of ClickHouse (it is the most widely used analytic DBMS).

When you use ClickHouse Cloud, which is a commercial cloud service based on the open-source ClickHouse database (https://clickhouse.com/cloud), it takes care of the needed security measures and strengthens the already strong defaults: TLS, strong credentials, and IP filtering; plus it supports private link, data encryption with customer keys, etc.

Thanks for your insight. I got ratioed to fuck for trying to defend the standpoint that it's an unusual expectation for a regular engineer to stand this up correctly.

https://news.ycombinator.com/item?id=42873134

If you're referring to the downvotes on https://news.ycombinator.com/item?id=42873211, I think that comment would have done better if you had omitted the swipes, as the site guidelines ask: https://news.ycombinator.com/newsguidelines.html.

e.g. "You are, in typical HN style, minimising the problem into insignificance" and "love how this is getting ratioed by egotistical self confessed x10 engineers". This is the sort of thing commenters here are asked to edit out of their comment, and when they don't, it's correct to downvote them (even though your underlying points may otherwise be correct).

lol, nice. getting out in front of anyone even potentially pointing fingers at ClickHouse. Good initiative.
That used to be the default setup for Redis, too. Might still be. You aren’t supposed to have it on a public subnet.
> You aren’t supposed to have it on a public subnet.

That's an incredibly bad assumption. To have defaults assume that you are on a protected network (what does that even mean? like what permissions are assumed just because you are on the same network? admin?) is just bad practice.

Private networking for internal things like databases has been the standard best practice for a long, long time.
Safe default configuration has been the standard practice for even longer.
I’m all for both.
It's not anymore! They actually changed their defaults and it helped tremendously to reduce the exposure of Redis instances on the Internet.
I don't have personal experience, but from a quick Google it looks like the default setup is to accept connections on localhost only [0], and there's a default user without the capability to run SQL statements. They would have had to open remote connections and enable SQL capability for the default user (it looks like this is the first step in creating other users; the third step is removing SQL capability for the default user) [1]:

  1. Enable SQL-driven access control and account management for the default user.
  2. Log in to the default user account and create all the required users. Don’t forget to create an administrator account (GRANT ALL ON *.* TO admin_user_account WITH GRANT OPTION).
  3. Restrict permissions for the default user and disable SQL-driven access control and account management for it.
[0] https://chistadata.com/knowledge-base/allow-clickhouse-to-ac...

[1] https://clickhouse.com/docs/en/operations/access-rights

From https://clickhouse.com/docs/en/operations/access-rights#acce...

> By default, the ClickHouse server provides the default user account which is not allowed using SQL-driven access control and account management but has all the rights and permissions. The default user account is used in any cases when the username is not defined, for example, at login from client or in distributed queries

This seems... very antiquated as a default? ClickHouse is relatively modern, first released in 2016, long after people were finding unauthenticated MongoDB servers left and right. Why not design it so that starting a server requires at least a user-provided password in a config file? And then, even if that password was shared amongst all DeepSeek devs, at least it wouldn't be publicly accessible.

I imagine it wouldn't necessarily require their opening of remote connections, just a misconfigured reverse proxy.
When deployed to Kubernetes you have to open it up to remote connections (that's how they were using it).
I suspect this is a Docker container hijacking the host firewall rules, which is a common pitfall. Of course there should be an ingress and other safeguards in front of it, but it is also common to roll out a VPS in a hurry. No bad intentions on any side, just lack of practice.
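To make the Docker pitfall concrete: a published port with no host IP (e.g. `-p 9000:9000`) is bound on all interfaces, and Docker adds its own packet-filter rules for it, so a host firewall that looks locked down may not be what decides who gets in. Below is a minimal sketch that flags such mappings, assuming the `[host_ip:]host_port:container_port[/proto]` form; the function name is hypothetical and it does not handle every variant Docker accepts.

```python
ALL_INTERFACES = {"", "0.0.0.0", "::"}


def publishes_to_all_interfaces(mapping: str) -> bool:
    """Return True if a port mapping string binds on every host interface."""
    spec = mapping.split("/", 1)[0]  # drop an optional "/tcp" or "/udp" suffix
    parts = spec.split(":")
    if len(parts) <= 2:
        # "9000" or "9000:9000": no host IP given, so all interfaces.
        return True
    # Everything before the last two fields is the host IP; IPv6 addresses
    # are written in brackets and contain colons themselves.
    host_ip = ":".join(parts[:-2]).strip("[]")
    return host_ip in ALL_INTERFACES


if __name__ == "__main__":
    for m in ["9000:9000", "127.0.0.1:9000:9000", "[::1]:8123:8123/tcp"]:
        status = "EXPOSED" if publishes_to_all_interfaces(m) else "local only"
        print(f"{m}: {status}")
```

Binding to `127.0.0.1` explicitly, as in the second mapping, keeps the port reachable only from the host itself.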