by arafa 3381 days ago
"Going from imperative programming to functional programming has been a powerful paradigm shift for us to think about financial processing and accounting. We can now think of this system as a straightforward actor/handler system rather than getting mired in complicated SQL-join logic."

Though whether SQL is a functional language (or a programming language at all, if you're talking ANSI SQL) is a subtle question, I would at the very least not describe it as a traditional imperative programming language. I think this is an important distinction for the article because, contrary to what this article suggests, I've found SQL to be quite helpful for understanding functional and declarative programming concepts. That said, it might be a lot easier to express the types of tasks in the article as straightforward functions rather than getting wrapped up in all this set-based talk in SQL.

3 comments

Relational algebra is a really useful model for thinking about ETL tasks generally; SQL is an awkward dialect for expressing relational algebra, but it is at least a well-known one, and reasonably portable for a subset of querying. You can see the payoff in the Hadoop ecosystem too: Hive with HQL, spark-sql, Impala - SQL being used to express a data flow graph with a bunch of relational operators.

When you program directly against Spark, you're effectively building SQL plans explicitly. It's both more indirect - instead of writing a program that does stuff, you write a program that creates a data flow graph that does stuff; and you have more responsibility for performance, for good and bad.
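
To make the "program that creates a data flow graph" idea concrete, here is a minimal sketch in plain Python. It is not Spark's API; the `Plan` class and its `filter`, `map`, `describe` and `collect` methods are hypothetical names made up for illustration. Calling `filter` or `map` only records a step, and nothing runs until `collect()` walks the recorded plan.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class Plan:
    """Hypothetical toy plan: a source plus a recorded list of operators."""
    source: Tuple[Any, ...]
    steps: Tuple[Tuple[str, Callable], ...] = ()

    def filter(self, pred: Callable[[Any], bool]) -> "Plan":
        # Record the operator; do not touch the data yet.
        return Plan(self.source, self.steps + (("filter", pred),))

    def map(self, fn: Callable[[Any], Any]) -> "Plan":
        # Record the operator; do not touch the data yet.
        return Plan(self.source, self.steps + (("map", fn),))

    def describe(self) -> list:
        # The plan can be inspected (or, in a real engine, optimized) before running.
        return [name for name, _ in self.steps]

    def collect(self) -> list:
        # Only now is the recorded graph actually executed.
        rows = iter(self.source)
        for name, fn in self.steps:
            rows = filter(fn, rows) if name == "filter" else map(fn, rows)
        return list(rows)


plan = Plan((1, 2, 3, 4, 5, 6)).filter(lambda x: x % 2 == 0).map(lambda x: x * 10)
steps = plan.describe()
result = plan.collect()
```

The indirection is the point: because the program hands over a description of the work rather than doing the work, an engine is free to reorder or batch the operators, and you are the one responsible for writing a plan it can run efficiently.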

I think to get good performance, you simply can't think on a per-item basis. You need to orient your thinking towards what can be efficiently performed at the bulk level. Whether it's column scanning in HDFS, or index scanning in an RDBMS, you need to be aware of the engineering properties of the operators you're applying. Doing lots of things per-item is a recipe for blowing your budgets, whether it's cache, memory, I/O, whatever. You want to iteratively do a little work to lots of items, and then join, rather than lots of work to each item one at a time.
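
A rough illustration of per-item versus bulk, using Python's built-in `sqlite3` module. The `bookings` and `payments` tables and their contents are invented for this sketch and are not taken from the article. Both functions compute the same totals per host; the first issues one extra query per booking, the second lets the database do a single join and aggregate.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bookings (id INTEGER PRIMARY KEY, host TEXT);
    CREATE TABLE payments (booking_id INTEGER, amount INTEGER);
    INSERT INTO bookings VALUES (1, 'ann'), (2, 'bob'), (3, 'ann');
    INSERT INTO payments VALUES (1, 100), (1, 50), (2, 70), (3, 30);
""")


def totals_per_item(conn):
    """Per-item style: one lookup query for every booking (N+1 queries)."""
    totals = {}
    bookings = conn.execute("SELECT id, host FROM bookings").fetchall()
    for booking_id, host in bookings:
        (amount,) = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM payments WHERE booking_id = ?",
            (booking_id,),
        ).fetchone()
        totals[host] = totals.get(host, 0) + amount
    return totals


def totals_bulk(conn):
    """Bulk style: a single set-based query that joins, then aggregates."""
    rows = conn.execute("""
        SELECT b.host, SUM(p.amount)
        FROM bookings AS b
        JOIN payments AS p ON p.booking_id = b.id
        GROUP BY b.host
    """)
    return dict(rows)


per_item = totals_per_item(conn)
bulk = totals_bulk(conn)
```

On four rows the difference is invisible, but the per-item version's query count grows with the number of bookings, while the bulk version stays at one query and leaves the scan and join strategy to the engine.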

Hi, I'm the author. Yeah, you have a good point. Imperative programming was just the way that we were using SQL to build that system.
Did you consider an off-the-shelf ERP product?

I've written something similar to the first part: extracting raw data from various source DBs using SQL queries, then pushing it to our organisation's ERP product (SAP) using A2A messaging.

From my view SAP is a black box, but it handles the actual accounting/financial logic, i.e. ledgers, product tracking, inventory management, etc. Our accountants all seem pretty comfortable using it.

Airbnb's data models weren't initially designed to be financially reported on, and by the time we needed better financial reporting, it was too late to change those models. 90% of the work was about rethinking how to approach all of this: what financial impact should be booked, and how it could be derived from the data. None of the ERP solutions fit our use case, and I think it would have been very difficult to integrate.

We still use a general ledger to book the outputs of our new financial pipeline, but I don't think we have a traditional business model (no traditional inventory management). That's more in the finance and accounting department though, and I can't speak much to that.

I don't know the volume of transactions they have, but in my past experience with SAP it was extremely hard and expensive to implement and make it scale (1~2 years for the initial implementation and migration and 10M+ USD spent), and we weren't even that big (~3M sales orders per year). Debugging it was also a nightmare, and they still had frequent data integrity issues, even just between SAP modules.
Do you plan on using Scala and Akka Persistence in the future for the entirely event-based system? http://doc.akka.io/docs/akka/current/scala/persistence.html

We have been using it at the finance company I work for to maintain customers' ledgers.

It's definitely something to consider; it depends on our project roadmap.
Thanks for the writeup! I was on the payments team at Groupon and have fond memories of the same challenges.
SQL is in fact one of the best-known declarative languages. It operates on sets, so I'm not sure it even knows how to do imperative.

Can it?

They mention triggers; my guess is that they were using something like PL/SQL to push data into the financial tables.