by j42
3931 days ago
Hey, thank you for this! Bought your book a while ago -- didn't realize your background was in ad tech. I personally eked out ~30ms avg latencies with PHP by modifying ReactPHP (non-blocking event loop) and getting creative with Redis and nginx. It's performing comparably at high loads, and saving us a ton of money but obviously... it's PHP. I'm obsessed with Haskell and have spent a bit of time playing around with HLearn as I think SVM is an obvious use case where it would excel (though in terseness/structure more than performance relative to C). Any advice on selling a rewrite and/or challenges as one nears 1B daily queries for which Haskell would be uniquely advantageous? I want to integrate Haskell in our high-frequency serving arm (or as a backup, for machine learning), but honestly don't know enough about it relative to performance to sell its advantages over Erlang, Go or even C. Aside from the obvious "functional is better" and "types are great," which tend to appeal to the folks who see beauty in the language, is it feasible to claim that Haskell will result in a faster, more performant program in less time? ... And not to wear out my welcome, but I'm interested in reading more about performant Haskell (as per your post); any resources you recommend?
hand wobble
This is my second ad-tech gig but I'm more of a catch-all SSE that is into databases/dist-sys and who usually works at startups.
>I personally eked out ~30ms avg latencies with PHP by modifying ReactPHP
That's quite good. We haven't done any real tuning or cleanup with our application, I think if we did we could get a fair bit lower. The fact is, our app is fast because we avoid network I/O like the plague and we don't write _totally_ clownshoes Haskell code. It hasn't taken much work otherwise. We even have unnecessary String/List values, inefficient parsers, and some other things rolling around. Haven't had time for a clean sweep (feature push rn).
>Any advice on selling a rewrite and/or challenges as one nears 1B daily queries for which Haskell would be uniquely advantageous?
Haskell's just really good at anything enterprise really. Frontend JS, API, traditional web backend. As long as you avoid weird/unpleasant libraries and you're proficient/comfortable in Haskell it can be a great experience. Libraries in Haskell are generally pretty opinionated so you need to make sure you know what you like. HLearn is a good example of that opinionatedness being very fecund but also unusual to people accustomed to more ordinary ML libraries. HLearn is super cool :)
Haskell's a bit less compelling for non-JS apps because you'll be serving a volleyball over the FFI net (iOS, Android) but I know people that have done both fruitfully and happily. Haskell for Mac (http://haskellformac.com/) is a Swift+Haskell app, from what I understand from Chakravarty. Manuel Chakravarty's done some really cool work on enabling programmers to avoid the FFI (not that it's bad, the FFI is really quite nice actually) through inline Objective-C support.
> honestly don't know enough about it relative to performance to sell its advantages over Erlang, Go or even C.
If you have the labor, time, and money to write something in C, I want to work where you work.
So that I can write it in Haskell and use the spare time to make it extra shiny.
Okay I'll take the Erlang/Go bits seriously:
Erlang: if you're perf sensitive, shared memory is reaaaaallyyyy nice. The concurrency primitives in Haskell give you database-style transactions which COMPOSE, you can't do that in Erlang. Re: shared memory - look at how much of CouchDB was C. Haskell has a superset of the options Erlang offers. It doesn't excel as deeply in the niche Erlang offers but I think it does a credible job.
Worth considering re: Erlang:
1. http://hackage.haskell.org/package/courier
2. http://haskell-distributed.github.io/
Go: Okay, Go has shared memory but Haskell again has a superset of options. The most common concurrency primitives in Haskell are the TVar (I default to this one for correctness) and the MVar. The TVar is your STM container - that's where your transactions come from. MVar is a bit like a Golang channel in that it blocks by default, unlike Erlang's messaging.
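To make the "transactions which compose" point concrete, here's a minimal sketch using Control.Concurrent.STM from the stm package. The Account type and the withdraw/deposit/transfer names are made up for illustration, not from our codebase. The thing to notice is that transfer is built by gluing two smaller transactions together, and the whole thing still commits atomically.

```haskell
import Control.Concurrent.STM

-- Hypothetical account type for illustration: a balance held in a TVar.
type Account = TVar Int

-- Blocks (retries the transaction) until the balance covers the amount.
withdraw :: Account -> Int -> STM ()
withdraw acc n = do
  bal <- readTVar acc
  check (bal >= n)
  writeTVar acc (bal - n)

deposit :: Account -> Int -> STM ()
deposit acc n = modifyTVar' acc (+ n)

-- Two small transactions composed into one bigger transaction.
-- No other thread can observe the state between the two steps.
transfer :: Account -> Account -> Int -> STM ()
transfer from to n = withdraw from n >> deposit to n

-- Runs one transfer and returns the balances afterwards.
transferDemo :: IO (Int, Int)
transferDemo = do
  a <- newTVarIO 100
  b <- newTVarIO 0
  atomically (transfer a b 30)
  (,) <$> readTVarIO a <*> readTVarIO b

main :: IO ()
main = transferDemo >>= print  -- prints (70,30)
```

Nothing runs until you hand the composed STM action to atomically, which is why you can keep building bigger transactions out of smaller ones without thinking about lock ordering.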
However, we can get away with defaulting to having blocking APIs because our threading is _cheaper_ than Erlang's! Green threads + shared memory -> fuck it, fire off a thread. It's also nice because it means we can preserve our usual synchronous intuition within the context of a single thread and block when we really do want to block.
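A rough sketch of what "fire off a thread" looks like in practice, assuming nothing beyond base (forkIO and MVar). The squaresConcurrently name and the toy workload are hypothetical. Each green thread hands its result back through its own MVar, and takeMVar just blocks until the value is there.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Fork one green thread per input; each thread puts its result into
-- its own MVar. Collecting the results blocks on each MVar in turn.
squaresConcurrently :: Int -> IO [Int]
squaresConcurrently n = do
  boxes <- forM [1 .. n] $ \i -> do
    box <- newEmptyMVar
    _ <- forkIO (putMVar box (i * i))
    pure box
  mapM takeMVar boxes

main :: IO ()
main = do
  -- Ten thousand threads is an unremarkable number here.
  rs <- squaresConcurrently 10000
  print (length rs)
```

The code inside each thread reads like ordinary synchronous code; the blocking happens only at putMVar/takeMVar, where you asked for it.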
We actually have an even smaller and cheaper execution context called a spark, but that's more concerned with parallelism than concurrency so we'll put it aside for the moment.
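For the curious, a spark looks roughly like this. This is a toy sketch using par and pseq as exported from GHC.Conc in base (most people get them from Control.Parallel in the parallel package instead); fib and parSum are made-up names. par only hints that x may be evaluated in parallel, so the result is the same whether or not the spark actually runs on another core.

```haskell
import GHC.Conc (par, pseq)

-- Deliberately naive Fibonacci, just to have some work to do.
fib :: Int -> Integer
fib n = if n < 2 then toInteger n else fib (n - 1) + fib (n - 2)

-- Spark the evaluation of x, evaluate y on the current thread,
-- then combine the two results.
parSum :: Int -> Int -> Integer
parSum a b = x `par` (y `pseq` (x + y))
  where
    x = fib a
    y = fib b

main :: IO ()
main = print (parSum 25 24)
```

To actually see sparks picked up by other cores you need to build with -threaded and run with +RTS -N; otherwise it just evaluates sequentially.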
If you want a rundown on strengths & weak points, I suggest looking at the State of the Haskell ecosystem document Gabriel Gonzalez put together - it's very good!
https://github.com/Gabriel439/post-rfc/blob/master/sotu.md
Re: what I've just said, particularly note the points about concurrency:
https://github.com/Gabriel439/post-rfc/blob/master/sotu.md#s...
But all the bullshit aside, I write Haskell because it's really fun (I have no patience for things computers can do) and keeps me from going feral and quitting society.