
by crdb 3790 days ago

For our last two clients (in both cases we were offering a "data team as a service"), the reporting tools were written in Haskell using Servant [1], which itself grew out of an attempt to do the same kind of thing at Zalora, for a customer service management tool. The motivation was that it made painful stuff much simpler and faster to write (performance was a bonus).

It's also used for a bunch of tricky data manipulation problems that were harder to solve in pure SQL. Example: identifying a unique customer based on attributes (email, phone, address, etc.), one or more of which can change when a customer creates a "new" profile to grab the $10 new-customer signup voucher. This ran recursively on the entire customer dataset (12 countries, some customers had created as many as 250 profiles) in about 4 seconds on an m3.medium instance.
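To give a feel for the shape of that problem, here is a minimal sketch, not the code we actually ran. It assumes each profile is just an id plus a list of attributes, and it treats two profiles as the same customer if they are linked by any chain of shared attributes. The names (`Attr`, `Profile`, `Cluster`, `insertProfile`, `customers`, `sampleProfiles`) are made up for illustration, and it uses a simple fold rather than anything tuned for speed.

```haskell
module Main where

import Data.List (foldl', partition)
import qualified Data.Set as Set

-- Hypothetical attribute kinds (the comment mentions email, phone, address).
data Attr = Email String | Phone String | Address String
  deriving (Eq, Ord, Show)

-- Hypothetical profile record: an id plus whatever attributes it carries.
data Profile = Profile
  { profileId :: Int
  , attrs     :: [Attr]
  } deriving (Show)

-- A cluster is a set of profile ids plus every attribute seen on them.
data Cluster = Cluster
  { members :: Set.Set Int
  , seen    :: Set.Set Attr
  } deriving (Eq, Show)

-- Insert one profile: merge it with every existing cluster that shares
-- at least one attribute with it, and leave the other clusters untouched.
-- Clusters therefore never share attributes with each other.
insertProfile :: [Cluster] -> Profile -> [Cluster]
insertProfile clusters p = merged : unrelated
  where
    own = Set.fromList (attrs p)
    overlaps c = not (Set.null (Set.intersection own (seen c)))
    (related, unrelated) = partition overlaps clusters
    merged = Cluster
      { members = Set.insert (profileId p) (Set.unions (map members related))
      , seen    = Set.union own (Set.unions (map seen related))
      }

-- One set of profile ids per inferred customer.
customers :: [Profile] -> [Set.Set Int]
customers = map members . foldl' insertProfile []

-- Made-up sample data: 1 and 2 share a phone, 2 and 3 share an email.
sampleProfiles :: [Profile]
sampleProfiles =
  [ Profile 1 [Email "a@example.com", Phone "111"]
  , Profile 2 [Email "b@example.com", Phone "111"]
  , Profile 3 [Email "b@example.com", Address "1 Main St"]
  , Profile 4 [Email "c@example.com"]
  ]

main :: IO ()
main = mapM_ (print . Set.toList) (customers sampleProfiles)
```

On the sample data, profiles 1, 2 and 3 end up as one customer even though 1 and 3 have nothing directly in common, and profile 4 stays on its own. That transitive linking is the part that gets awkward in plain SQL.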

We don't (and didn't) write blog posts or tweets about it though. I think the set of people who write blog posts (i.e. who have the free time, the talent, and the motivation to do so) is relatively small, and the amount of blog posts and other visible output is proportional to the size of the community. So you'll see many more Node.js/RoR posts than posts on using set theory to systematically reduce a flawed 5,000-table data model, or on the kind of everyday stuff Haskellers are doing everywhere. Also, personally at least, I don't feel like I know enough to have much to offer by writing a blog, so I stick to making these semi-anonymous HN comments.

Interviewing Haskellers, I found a lot of examples of similar work - CRUD tasks, web services, etc. built in Haskell within a larger organisation and running quietly in the background. Something small and modular that can be tacked on quietly.

One of the creators of Servant moved on to Tweag [2], a French big data company, where he's working on petabyte-scale distributed machine learning projects for large corporate clients, which I think is where Haskell really shines today. I suspect NDAs will stop them from talking much about it... I've always wanted to do similar work for clients, but so far no dataset has been "big" enough to justify moving away from well-established R libraries.

[1] https://github.com/haskell-servant/servant

[2] http://www.tweag.io/