by softwaredoug 1123 days ago
This would be a big deal if Lucene became competitive on http://ann-benchmarks.com and turned into a serious (and more holistic) alternative to the vector databases.

But it comes with continued challenges, if I understand correctly:

- The Panama Vector API is still incubating, and Java has taken its time settling on an official way to use SIMD. It could all change in Java 22.

- It only works on Java 20, with a very specific set of flags passed to the JVM. It’ll take time for this change to make it into Elasticsearch and Solr

- Panama itself is a weird and very low-level API.

- Lucene organizes the HNSW vector index graph alongside its inverted index segments, and these segments need to be merged/compacted periodically. Merging HNSW graphs, as I understand it, is computationally expensive because the graph gets rebuilt.
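To make the "specific set of flags" point concrete, here is a minimal sketch of how a program could check at runtime whether a SIMD path is even available. It assumes the incubating module is named `jdk.incubator.vector` and is enabled with `--add-modules jdk.incubator.vector`; the exact flags and checks Lucene itself uses may differ. The class `SimdCheck` and its methods are hypothetical names for illustration, and the scalar loop stands in for the kind of code a vectorized implementation would replace.

```java
public class SimdCheck {

    // Hypothetical helper: true when the incubating vector module has been
    // resolved in the boot layer (typically via --add-modules jdk.incubator.vector).
    public static boolean vectorModulePresent() {
        return ModuleLayer.boot().findModule("jdk.incubator.vector").isPresent();
    }

    // Plain scalar dot product: the fallback when no SIMD path is available.
    public static float dotProduct(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector lengths differ");
        }
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // Runtime.version().feature() is the major Java version (e.g. 20).
        System.out.println("Java feature version: " + Runtime.version().feature());
        System.out.println("jdk.incubator.vector resolved: " + vectorModulePresent());
        System.out.println("dot = " + dotProduct(new float[] {1f, 2f, 3f},
                                                 new float[] {4f, 5f, 6f}));
    }
}
```

Run without extra flags, the module check is expected to report false on most JDKs, which is the deployment friction described above: the fast path only exists if whoever launches the JVM opts in.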


Elasticsearch is already on Java 20: https://github.com/elastic/elasticsearch/pull/95373
Great if this forces Java upgrades.

Using Solr with Java 8 is still quite common.

The newer GC options should have already forced people to upgrade. A use case like search churns through memory.

Maybe it's just what I've seen, but Solr users usually stick with older versions of Java.

From Solr 9 on you need a Java 11 runtime.
For Solr 8, Java 8 is still used.

https://solr.apache.org/guide/8_11/solr-system-requirements....

Which is a pain: many products that use Solr as a dependency, like Sitecore, still lag behind in Solr versions.

https://www.searchstax.com/docs/searchstax-cloud-sitecore-so...

While it mostly works with more recent versions, there is no support if issues cannot be replicated on officially supported versions.