by kiyoto 4255 days ago
>Kdb+ has sharp elbows.

No shit. I used to work as a quant, and while I was an okay quant and mediocre trader at best, I survived for three years in the industry because of my kdb+ proficiency: the firm I was at spent a couple of million dollars on kdb+ only to find out that most people could not wrap their heads around kdb+ let alone debug it effectively.

My (former) colleagues were definitely smart people. In many ways, they were way smarter than me. But I somehow could get a much better handle on kdb+'s idiosyncrasies, and my ability to stare at dense k/q code (usually no more than a dozen lines) and figure out what's wrong with it earned me the reputation as the "q guy" - and some level of job security.

The firm eventually phased out kdb+ completely after my boss and I left (the two proponents of kdb+).

I've worked in many shops that use kdb+ and the ones who really benefit are the ones who bothered to get some training on it rather than those who just assume they'll wing it somehow. Kx themselves have been running great intro workshops for a couple of years now. Some guys at the next desk attended one and came back buzzing with excitement at how they now saw through the noise. So the takeaway is - if you didn't bother to learn it, stop complaining that you don't understand it. It's not difficult to get when it is explained well, but you're not gonna get it just by staring at it.
>I've worked in many shops that use kdb+ and the ones who really benefit are the ones who bothered to get some training on it rather than those who just assume they'll wing it somehow.

Yeah, I know all about the training and First Derivatives. My employer also hired them.

In their defense, every First Derivatives kdb+ consultant that I worked with was very sharp and an excellent teacher. They really knew their stuff, and First Derivatives is no small part of what has made kdb+ so successful. However, even with their excellent pedagogy, most of my co-workers were totally lost or weren't willing to apply themselves to learn q/kdb+ well.

Here is another way to think about it: many people can't ever get their heads around certain conceptually difficult topics, say, measure theory or quantum physics. I don't think kdb+ is nearly as hard, but it seemed that way looking at my peers who were no slowpokes.

I initially read that as:

> ... came back buzzing with excrement

Why were you a proponent of kdb+?
For several reasons, some more legitimate than others.

1. kdb+ was (and maybe is) a good solution to the problem that we had: doing complex data manipulation/simple statistical calculations against billions of rows of time series data. Hadoop is the term du jour for data processing, but the truth of the matter is that finance doesn't have really huge data. At best, it's a couple of terabytes, and most of the time, you are working with a small subset of it. Running kdb+ on a beefy server or two would usually do the job (rather well).

2. Maybe it's because I studied math, but I find k/q's vectorial/functional semantics appealing. I think the syntax is horrible, but the semantics are very neat.

3. Finally, because it helped me keep my job. It was rather amazing to me that all these Ph.D. statisticians that I worked with couldn't bring themselves to learn kdb+ effectively. Apparently this stuff can be very hard for even the smartest people (or maybe they thought it was such a niche skill with a low ROI).

It almost sounds like kdb needs an alternative syntax that is more human readable.
It has one: q. But once you get over the syntax, you realize that you also need to grok different semantics than you are used to.

Some q is readable English - e.g., an expression like

   sum price where size>3
is (to the uninitiated) more readable than the equivalent k

   +/price@&size>3
but that only works for simple stuff. The (idiomatic!) computation of maximum-subarray-sum[0]

   |/0(0|+)\
becomes

   max over 0 (0 max +) scan
which is not more readable. And you can drive the point ad absurdum by making it even more verbose:

   max over zero (zero max plus) scan
The syntax seems like it is what stops you from understanding it because it is the first thing you meet. But it's the semantics that you need to grok, and the syntax just matches them.

[0] http://en.wikipedia.org/wiki/Maximum_subarray_problem
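
For readers who don't know k, here is a rough Python sketch of how `+/price@&size>3` decomposes, reading right to left. The `price` and `size` values are made up for illustration, and `where` is a hypothetical helper standing in for k's `&` (q's `where`) applied to a boolean vector. It only approximates the semantics; it says nothing about how kdb+ implements them.

```python
# Made-up sample columns (not from the thread).
price = [10.0, 20.0, 30.0, 40.0]
size = [1, 5, 2, 7]

def where(mask):
    """Hypothetical helper: indices at which a boolean list is true."""
    return [i for i, flag in enumerate(mask) if flag]

mask = [s > 3 for s in size]        # size>3  : elementwise comparison
idx = where(mask)                   # &       : positions of the true entries
picked = [price[i] for i in idx]    # price@  : index price by those positions
total = sum(picked)                 # +/      : fold + over the result
```

Each k glyph maps to one whole-vector step, which is the semantic shift being described: there is no explicit loop over rows anywhere in the original expression.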

I would find something like

over(max, scan(0, max(+, 0)))

slightly more readable, even if that still gives me a higher-order-headache.

Because, y'know, it's confusing to have symbol salad with a weird mixture of infix and postfix operators, even more so if you have type raising (or the moral equivalent thereof) thrown in for good measure.

The q in your latter example is the same complexity in my book as a dense Python list comprehension. Could be worse.
The equivalent python is:

    mss = lambda x: max(scan(lambda a,b: max(0,a+b),0,x))
assuming a definition of "scan" (which is like "reduce", except it gives you all intermediate values), an example of which is:

    def scan(f,x0,x):
      r = [x0]
      for x1 in x:
        x0 = f(x0, x1)
        r.append(x0)
      return r
Note that the K is idiomatic whereas the python is (arguably) not. Of course, it could be worse; the advantage is that, much like math, the 9-char K version is pattern-matched by your eyes once you are familiar with it, whereas no other version presented here (or in almost any other language) can utilize that feature of your brain.
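
For anyone who wants to run it, here is a self-contained variant of the Python above, as a sketch: it swaps the hand-written `scan` for the standard library's `itertools.accumulate` (the `initial` argument needs Python 3.8+). The sample list is an illustrative input of my own choosing, not data from this thread.

```python
from itertools import accumulate

def mss(xs):
    """Maximum subarray sum, with the empty subarray counting as 0."""
    # accumulate with initial=0 yields 0 followed by every running value,
    # mirroring the scan(f, x0, x) helper defined above.
    running = accumulate(xs, lambda a, b: max(0, a + b), initial=0)
    return max(running)

sample = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
best = mss(sample)  # the best run is 4, -1, 2, 1
```

Because the running value is clamped at 0, an all-negative input gives 0 rather than the largest single element.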