by RealJon 2527 days ago
Predicate fields are indeed an oddity, but not an architectural one - they're for situations where the documents need to specify criteria (predicates) for when they should match, e.g. only match for certain users, at certain times of day etc. It's probably an underused feature imho, since most people don't know this can be done efficiently.

If you have dynamic fields like in your SaaS example, I recommend using a single map field rather than letting data not under your control drive changes to the set of fields.
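
A rough sketch of what that could look like on the feeding side, in plain Python. This is not Vespa API; the field names (`id`, `title`, `attributes`) and the function name are made up for illustration, and values are stringified on the assumption that the map is string-to-string:

```python
# Hypothetical fixed fields declared in the schema; everything else is dynamic.
KNOWN_FIELDS = {"id", "title"}


def to_engine_doc(record: dict) -> dict:
    """Keep known fields as-is and pack all other keys into one map field."""
    doc = {k: v for k, v in record.items() if k in KNOWN_FIELDS}
    # Assumption: the map field holds string values, so stringify everything.
    doc["attributes"] = {
        k: str(v) for k, v in record.items() if k not in KNOWN_FIELDS
    }
    return doc
```

With this, a new customer-defined key only changes the contents of `attributes`, not the set of fields, so no redeploy is needed for it.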

> Do you know how long a package update takes in Vespa, to add, say, a single field?

A few seconds. However, rather than having operators do any of this manually, set up an automatic process which deploys on each change made to the repo (i.e., do CD).

> all the reference documentation in one place

https://docs.vespa.ai/documentation/api.html


Another thing is that Vespa doesn't seem to support indexing of nested data, either structs or arrays of structs. For example:

  {
    "location": {
      "city": "Washington",
      "state": "District of Columbia"
    },
    "friends": [
      {"firstName": "Bill", "lastName": "Clinton"}
    ]
  }
Maps aren't suitable here because they can't be used for ranking. So you have to use structs, but those aren't indexable.

An application's search module could flatten the location key (e.g. "location_city", "location_state") for simple attributes, but the same is not possible for the array, since it can hold an arbitrary number of elements. And you can't split it into parallel arrays of strings:

  "friends_firstName_elems": ["Bill"]
  "friends_lastName_elems": ["Clinton"]
...because queries like "firstName contains 'Bill' and lastName contains 'Clinton'" could match different records ("Bill Bryson" and "George Clinton"). Never mind deeply nested arrays of objects containing arrays containing objects containing arrays.
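
To make the cross-matching problem concrete, here is a small engine-agnostic sketch in plain Python. The data and helper names are made up; it only imitates what an AND over two independent array fields would do:

```python
# Two hypothetical documents. Only the first has a friend named Bill Clinton.
people = [
    {"friends": [{"firstName": "Bill", "lastName": "Clinton"}]},
    {"friends": [{"firstName": "Bill", "lastName": "Bryson"},
                 {"firstName": "George", "lastName": "Clinton"}]},
]


def flatten(doc):
    """Split the array of objects into two parallel arrays of strings."""
    return {
        "friends_firstName_elems": [f["firstName"] for f in doc["friends"]],
        "friends_lastName_elems": [f["lastName"] for f in doc["friends"]],
    }


def matches_flat(doc, first, last):
    """AND over two independent arrays: the element pairing is lost."""
    flat = flatten(doc)
    return (first in flat["friends_firstName_elems"]
            and last in flat["friends_lastName_elems"])


def matches_nested(doc, first, last):
    """Both conditions must hold on the same array element."""
    return any(f["firstName"] == first and f["lastName"] == last
               for f in doc["friends"])


flat_hits = [i for i, d in enumerate(people) if matches_flat(d, "Bill", "Clinton")]
nested_hits = [i for i, d in enumerate(people) if matches_nested(d, "Bill", "Clinton")]
print(flat_hits)    # [0, 1]  (second document is a false positive)
print(nested_hits)  # [0]
```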

This seems unnecessarily restrictive. A search engine should be able to index the data you already have, not force the application to contort its data to whatever shape the engine requires.

Is there no way around this?

Thanks. I'm still learning about Vespa, and it's still not clear how map fields work.

Edit: Documentation says: "Accessing attributes in maps and arrays of struct in ranking is not possible". So maps aren't really usable.

Regarding how long it takes to update a field, the application I described would have to do this programmatically. It would have to keep track of all known fields in some kind of registry, and then if a new unknown field came in, it would have to perform an "application package" deploy just for that field, using the REST API. (Unless there's a less cumbersome way to do it?)
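
Roughly what I have in mind, as a minimal Python sketch. The `deploy` callback is a placeholder for whatever actually rebuilds and deploys the application package; I'm not assuming anything about the real deploy API here, and the class name is made up:

```python
from typing import Callable, Iterable, List, Set


class FieldRegistry:
    """Tracks known field names and triggers a deploy when new ones appear."""

    def __init__(self, known: Iterable[str],
                 deploy: Callable[[List[str]], None]) -> None:
        self.known: Set[str] = set(known)
        # Hypothetical hook: receives the full, sorted list of field names.
        self.deploy = deploy

    def observe(self, record: dict) -> Set[str]:
        """Register unseen keys; deploy once per record that adds any."""
        new_fields = set(record) - self.known
        if new_fields:
            self.known |= new_fields
            self.deploy(sorted(self.known))
        return new_fields
```

Every record would have to pass through `observe` before being fed, which is the cumbersome part: the indexing path now blocks on a deploy whenever a customer invents a new key.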

Reference docs: That's nice, but that's just a bunch of links. Good reference documentation has tables of contents. Bonus points for runnable examples in multiple languages. For an example of good reference API documentation, look at Stripe's [1].

[1] https://stripe.com/docs/api