| HN Mirror



	by whalesalad 1777 days ago
	PostgreSQL should be everyone’s first choice for a data store. It can do so much, including serving as your full text search system.

8 comments

tempest_ 1777 days ago

Elastic's main selling point is not so much the full text search.

Search is what it does, but most of its value is centered on the management/scaling/monitoring of full text search over many machines.

I love Postgres but its "clustering" story is definitely not as user friendly.


dijit 1776 days ago

For logs (where you can shard on a modulo of the timestamp) you might have luck with CitusDB (PostgreSQL sharding).
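A minimal sketch of what "shard on a modulo of timestamp" could look like, in plain Python. This is not Citus's API (Citus does its own routing); `NUM_SHARDS`, `BUCKET_SECONDS`, and `shard_for` are hypothetical names chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

NUM_SHARDS = 4         # hypothetical shard count
BUCKET_SECONDS = 3600  # hypothetical bucket width: one bucket per hour


def shard_for(ts: datetime, num_shards: int = NUM_SHARDS) -> int:
    """Map a log timestamp to a shard: hour bucket modulo the shard count."""
    bucket = int(ts.timestamp()) // BUCKET_SECONDS
    return bucket % num_shards


start = datetime(2021, 1, 1, 0, 0, tzinfo=timezone.utc)
for offset_minutes in (0, 30, 60, 120):
    ts = start + timedelta(minutes=offset_minutes)
    print(ts.isoformat(), "-> shard", shard_for(ts))
```

Rows from the same hour land on the same shard, and consecutive hours rotate through the shards, so writes are spread out over time rather than across shards at any one moment.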


nuker 1776 days ago

Is there real scaling? Like increasing the node count based on indexing latency or CPU metrics?


bpicolo 1776 days ago

Yes. You can scale ES to fairly massive data volumes. Postgres is a very different system with different design constraints.

There are plenty of peta-scale ES clusters in the wild.


rad_gruchalski 1777 days ago

Have a look at YugabyteDB then.


threeseed 1776 days ago

Why would I do that when Elasticsearch is a proven search engine?


rad_gruchalski 1776 days ago

I don’t know, maybe you yourself get confused. Why would you say this otherwise:

> I love Postgres but its "clustering" story is definitely not as user friendly.

Then have a look at YugabyteDB.


ageitgey 1777 days ago

I love Postgres and the full-text search feature works great in some use cases, but it is not really comparable to Elasticsearch in many scenarios (huge document stores, complex text processing or querying, etc).


whalesalad 1776 days ago

For sure, but I would posit most startups and smaller stage companies can get by with it. It really comes down to indexing data properly and designing for your search patterns. If your search patterns are vast or change constantly, ES might be better, but if you just need basic text search over X attributes, Postgres will be sufficient.


catmanjan 1777 days ago

Do you know for sure that Postgres doesn't perform as well as Elasticsearch if you don't use the relational capabilities of Postgres?

Instinctively I believe what you're saying, just wondering if you know for sure.


rpedela 1777 days ago

Yes, I know for sure. Postgres search is essentially an easier to use regex engine. If you have a recall-only use case and/or a small dataset, then that works great. As soon as you need multiple languages, advanced autocomplete, misspelling detection, large documents, large datasets, custom scoring, etc., you need Solr or ES.


hardwaresofton 1776 days ago

I don't doubt that you know your use case and weighed/tried the option.

> Postgres search is essentially an easier to use regex engine.

I'm not sure exactly what you meant to convey here, but if you're searching with LIKE or `~` you're not doing Postgres's proper Full Text Search. You should be dealing with tsvectors[0]
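To make the LIKE-versus-tsvector distinction concrete, here is a toy sketch in plain Python. It is not how Postgres implements text search: the stop-word list, the suffix rules, and the helper names (`normalize`, `to_terms`, `substring_match`, `term_match`) are all hypothetical stand-ins for what a real text search configuration does with dictionaries and stemmers.

```python
import re

# Hypothetical, deliberately tiny stop-word list and suffix rules.
STOP_WORDS = {"a", "the", "is", "of", "and", "to"}
SUFFIXES = ("ing", "es", "s")


def normalize(word: str) -> str:
    """Lowercase a word and strip one common suffix (a crude stand-in for stemming)."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def to_terms(text: str) -> set[str]:
    """Split text into words, drop stop words, and normalize the rest."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    return {normalize(w) for w in words if w.lower() not in STOP_WORDS}


def substring_match(doc: str, query: str) -> bool:
    """LIKE-style matching: the raw query must appear as a substring."""
    return query.lower() in doc.lower()


def term_match(doc: str, query: str) -> bool:
    """Term-style matching: every normalized query term must appear in the document."""
    return to_terms(query) <= to_terms(doc)


print(substring_match("Indexing the logs", "index log"))  # False
print(term_match("Indexing the logs", "index log"))       # True
print(substring_match("catalog", "cat"))                  # True
print(term_match("catalog", "cat"))                       # False
```

The substring approach misses word-form variants and matches inside unrelated words; the term approach does the opposite, which is roughly the behavior people expect from search.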

> As soon as you need multiple languages

Postgres FTS supports multiple languages and you can create your own configurations[1]

> advanced autocomplete

I'm not sure what "advanced" autocomplete is but you can get pretty fast trigram searches going[2] (back to LIKE/ILIKE here but obviously this is an isolated use case). In the end I'd expect autocomplete results to actually not hit your DB most of the time (maybe I'm naive but that feels like a caching > cache invalidation > cache pushdown problem to me)
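For readers unfamiliar with trigram matching, a rough Python sketch of the idea follows. It mimics the general scheme (pad each word, cut it into three-character pieces, compare the sets), but it is an illustration under my own assumptions, not the extension's code; `trigrams`, `similarity`, and `suggest` are hypothetical names.

```python
import re


def trigrams(text: str) -> set[str]:
    """Return the set of 3-character grams of each word, padded with spaces."""
    grams: set[str] = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "  # two leading spaces, one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def suggest(prefix: str, candidates: list[str], limit: int = 3) -> list[str]:
    """Rank candidates by trigram similarity to what the user has typed so far."""
    ranked = sorted(candidates, key=lambda c: similarity(prefix, c), reverse=True)
    return ranked[:limit]


print(suggest("postgers", ["elastic", "postgres", "lucene"], limit=1))
```

Because the comparison is on sets of small fragments, a transposed or missing letter only knocks out a few trigrams, which is what makes this usable for forgiving autocomplete.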

> misspelling detection

The pg_similarity extension[3] might be of some help here, but it may require some wrangling.
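As a sketch of the kind of measure involved in misspelling detection, here is edit distance (Levenshtein) in plain Python with a naive "did you mean" lookup. This is not pg_similarity's interface; `levenshtein`, `did_you_mean`, and the `max_distance` threshold are hypothetical names and values picked for the example.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete from a
                current[j - 1] + 1,      # insert into a
                previous[j - 1] + cost,  # substitute (or keep)
            ))
        previous = current
    return previous[-1]


def did_you_mean(word: str, vocabulary: list[str], max_distance: int = 2) -> Optional[str]:
    """Return the closest vocabulary word if it is within max_distance edits."""
    if not vocabulary:
        return None
    best = min(vocabulary, key=lambda v: levenshtein(word, v))
    return best if levenshtein(word, best) <= max_distance else None


print(did_you_mean("postgers", ["postgres", "elastic", "lucene"]))
```

A linear scan over the vocabulary like this is fine for small word lists; larger ones would need an index, which is part of the "wrangling".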

> large documents, large datasets,

PG has TOAST[4], and obviously can scale (maybe not necessarily great at it) -- see pg_partman/Timescale/Citus/etc.

> custom scoring

Postgres only has basic ranking features[5], but you can write your own functions and extend it of course.

Solr/ES are definitely the right tools for the job (tm) when the job is search, but you can get surprisingly far with Postgres. I'd argue that many use cases actually don't want/need a perfect full text search solution -- it's often minor features that turn into overkill fests, with ops people learning/figuring out how to properly manage and scale an ES cluster and falling into pitfalls along the way.

[0]: https://www.postgresql.org/docs/current/textsearch-intro.htm...

[1]: https://www.postgresql.org/docs/current/textsearch-intro.htm...

[2]: https://about.gitlab.com/blog/2016/03/18/fast-search-using-p...

[3]: https://github.com/eulerto/pg_similarity

[4]: https://www.postgresql.org/docs/current/storage-toast.html

[5]: https://www.postgresql.org/docs/9.5/textsearch-controls.html...


MapleWalnut 1777 days ago

Scoring results in Postgres requires scanning all matches, which is slow if you have a lot of results.
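A toy Python sketch of the cost being described: if a score can only be computed by looking at each matching document, then returning even the single best hit means scoring every match. The scoring function, the `DOCS` data, and the names `score` and `top_k` are hypothetical and say nothing about how either Postgres or Elasticsearch is implemented internally.

```python
import heapq

# Hypothetical corpus: document id -> list of terms.
DOCS = {
    1: ["postgres", "search"],
    2: ["postgres", "postgres", "index"],
    3: ["elastic"],
    4: ["search", "postgres", "search"],
}


def score(doc_terms: list[str], query_terms: list[str]) -> int:
    """Toy relevance score: total occurrences of the query terms in the document."""
    return sum(doc_terms.count(t) for t in query_terms)


def top_k(docs: dict[int, list[str]], query_terms: list[str], k: int) -> tuple[list[int], int]:
    """Return the k best document ids and how many documents had to be scored."""
    scored = []
    for doc_id, terms in docs.items():
        if any(t in terms for t in query_terms):
            # Every match is scored, no matter how small k is.
            scored.append((score(terms, query_terms), doc_id))
    best = heapq.nlargest(k, scored)
    return [doc_id for _, doc_id in best], len(scored)


ids, scored_count = top_k(DOCS, ["postgres", "search"], k=1)
print(ids, scored_count)  # [4] 3
```

Asking for one result still scored all three matching documents; with millions of matches that per-match work is where the time goes.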

Elasticsearch and other search solutions don’t have this problem.


jeromegv 1776 days ago

Searching a structured database just isn't the same as having a full-on indexed search engine. Those are different tools for different uses.


ayush--s 1776 days ago

Even though we are currently replacing ES (hosted on elastic.co) with Postgres for a ~100M docs + low QPS use case, it's no real competition to Elasticsearch. There are better™ alternatives for niches (like Algolia), but nothing just works like Elasticsearch at scales where not everything can fit on a single machine.


threeseed 1776 days ago

a) Elasticsearch should not be used as a primary data store.

b) PostgreSQL does not compare to Elasticsearch when it comes to full text searching capabilities.

c) PostgreSQL has no vendor-supported, built-in solution for horizontal scalability, which is a big reason why you would choose Elasticsearch over a more lightweight search system.


eric4smith 1777 days ago

Not a good story with tokenizing Asian languages. And even the way it tokenizes Roman languages is not that great.

However, it does get one up to that 80% mark for text search. But that other 20% is why Elasticsearch, Algolia, etc. exist.


joshxyz 1777 days ago

I don't think so, elastic's https://www.elastic.co/guide/en/elasticsearch/reference/curr... is quite ahead of that.

Example here https://www.judyrecords.com/info


bdcravens 1777 days ago

How much of Elastic's usage is ELK logging vs. application search?


t-writescode 1776 days ago

Are you aware that Lucene, the technology that powers Elasticsearch, runs on top of SQL?


nl 1776 days ago

No it doesn't.
