Hacker News
Using Scala to handle exponential growth at a startup (lucidchart.com)
62 points by bhanks 4930 days ago
6 comments

It's nice to see another shop switching to Scala. I have been advocating the same change where I work (currently PHP, and really feeling the pinch of what it can do). Looks like I now have another data point to use as a success story for transitioning away.
You're not yet invested in the JVM, so have you considered some alternatives like Haskell or OCaml? I've found these to be simpler and more productive. The main worry would be about libraries, but that really depends on what you're doing. If it's web development, both Haskell and OCaml have top-notch frameworks that are very good for asynchronous code. OCaml also has a very good JavaScript compiler, so you can use it on the front-end. (Haskell can also be compiled to JS, but I've never used it that way.)
I have looked at Haskell, but the problem is getting the team migrated over. It's a lot easier to migrate a team to Scala than to Haskell. Also, the library selection for the JVM is unparalleled.
A bit off-topic but I'm curious what people familiar/experienced with other Java web app frameworks think of the Play framework?
The choices really come down to (imo anyway),

1. JSP

2. GWT/GAE/AppScale

3. Play

JSP is just awful - taking PHP and trying to fit it into Java somehow. It's just terrible and needs to be phased out ASAP. (EDIT: Probably too harsh. JSP works in the same way PHP works: it powers most of the web, and it does a grudgingly good job. The ease of plopping some dynamic content into an HTML website made by a designer should not be discounted lightly for small websites. Probably best to just use PHP and not JSP in this use case though...)

The GWT/GAE/AppScale mix is my preference. You can share code across server and client. Building RESTful services is a dream with Jersey on the server/thick client, and RestyGWT in the browser. The GWT compiler produces some very well performing JavaScript (only a bit slower than highly optimized libraries, but it feels faster than nearly all webapp JavaScript). GWT also has incredibly good tooling support: automatic sprite sheet creation, compiled CSS styles, UiBinder for declarative layouts, etc. The downsides are (1) the Java language, so no functional programming, and (2) compile times that can get nasty when you head into the millions-of-lines-of-code arena. The GWT team is seemingly working on speeding up the dev cycle with SuperDevMode and the like.

The Play framework is most similar to the Ruby on Rails approach and works incredibly well for your standard webapp. The support for Comet and WebSockets is particularly nice, and making very dynamic 'push' webapps is its real strong point. The tooling can get pretty annoying at times with features/bugs, but it is steadily improving. The biggest downside to Play is that it is currently in a 'hipster phase' with the release of Play 2. This means a huge surge in interest, but that same surge can be a curse when it drops out of 'hipster' mode in the future and loses devs. Numerous promising frameworks/languages have hit this problem and never really recovered when the 'fashion' changed.

GWT, for example, has long since fallen out of fashion but continues to be heavily developed because of the huge and profitable existing user base inside Google (AdWords, etc.), guaranteeing it active development for a while to come.

I'd argue that Play will circumvent this "hipster phase" issue for (at least) these three reasons:

(1) The ubiquity of the Java/JVM ecosystem, paired with the interest in an alternative to Java, will keep functional JVM languages like Scala/Clojure in high demand.

(2) The Typesafe machine pushing Play (see http://blog.typesafe.com/typesafe-announces-14m-series-b-fin...).

(3) Scala is your best bet for a Statically Typed Functional JVM language. Also, we use Play 2.0 at the startup I develop for.

I also find that Play is pretty cool to work with. Gone are the days of having to use servlet containers for development.

There is also Lift, one of the early Scala web frameworks, but after using it for over a year, I'd say stay away.
What drove you away from Lift?
What about JBoss Seam (or Weld under J2EE6)?
I use Weld for a side project. I had a hard time getting it to work with JRebel, redeployment is a headache, and hotswap didn't work for proxied classes (injection / CDI). The JEE 6 stack, somewhat influenced by Gavin King (author of Seam and Hibernate), is not that bad; it's just not that cool anymore, the same way that Spring MVC is less cool (just because there are arguably easier and more rapid frameworks such as Play). By less cool I mean less fun, and less fun means, to me, less productive.
We heavily use the Play framework at https://balancedpayments.com where we wrap various Java SDKs over an internal JSON API to communicate to our various banks / processors from our Python backends.

It is ABSOLUTELY a pleasure to use. I did a thorough analysis of all the Java frameworks that I could use and Play is what I landed on. It gets out of your way, and if you use an IDE like IntelliJ IDEA you will develop at the same speed as with something like Django or Rails.

Play has made me really enjoy Java again. My CS courses and my time as a high-frequency algorithmic developer on Wall St were spent in C++/Python, so I never really needed to use Java.

Now, I can tell you that I have newfound respect for Java and I can't speak highly enough about the Play framework. I'm happy to elaborate a bit more on our use of it -- feel free to contact me via the email in my profile.

I "want to do a project with it" and "want it to succeed" more than I "like it as a framework because I've done production projects with it"

Why? Because it's in the official Typesafe stack (a company founded by Scala's creator to promote it and its ecosystem commercially), is based on concepts taken from RoR, has rapid deployment without the need to pay JRebel $100 per developer, and supports both Scala and Java.

So I see it as "finally something good happening to those who wait". It's the only way enterprises can start doing modern web development without giving up on their existing investment in the Java / Spring / Hibernate stack. My view on it? I can't wait to get a work related project done with it.

Lift is also an interesting option (Scala only though), but you asked about Play. Lift has its learning curve, but its creator claims it is better than Play in almost every aspect, and many tend to agree, but I don't know enough to judge...

As someone who has used Lift in production on a non-trivial webapp for the last 18 months, and has used Play 1 and 2 in both Java and Scala for side projects, I would not recommend Lift.

The "the-view-is-the-controller" approach encourages a messy blend of logic and presentation, with lots of HTML ending up in your Scala code.

It's stateful in the extreme, using opaque callback identifiers to identify serialized closures, so rather than having a nice, clean, easy-to-reason-about boundary between your app and the outside world (your routes file, controller methods, whatever), any code anywhere could be invoked by someone clicking something in their browser.

Validation now needs to be enforced in the models, since you no longer have a single place to check input as it comes in. It goes beyond that and embeds rendering logic in models, which have a toForm method that generates HTML.

The extreme statefulness also means you lose everything when you roll a server, you need sticky sessions, you spend way too much time GC'ing and you need far more RAM/user.

I work in an enterprise-ish setting and regularly develop with Spring MVC, Struts and Rails on the server side. More recently I have been using Play 2.0/Scala (I must admit I have not had a Play app make it to production yet). I find it more productive: like Rails in dev mode, you can make a change, refresh the browser and watch what happens. This kicks arse compared to hot deploy/in-place deploy with Java, or even to JRebel. The main advantage I find over the others in terms of framework features is the non-blocking/asynchronous constructs provided out of the box. For anyone new to it, it's also worth noting that from 2.1 (almost out) you can integrate with Spring, and that there seems to be a healthy collection of third-party modules for it, e.g. for stuff like OAuth. I could go on for ages, but essentially I prefer it to the other frameworks I'm familiar with...
I don't really feel like they actually get into the specifics of how scala actually benefited them outside of what everyone already knows about scala from a really high level.

Bummer. I'm using scala at a small startup.

No. But their "Hey Jude" diagram is worth a watch! https://www.lucidchart.com/pages/examples/flowchart_software
No, not at all. There was almost no meat to this article.
Since Scala is also a JVM language, from scalability perspective can someone expand on what it offers that Java, or any other JVM language, doesn't.

I'm not asking about language semantics since I believe that to be a personal preference, rather scalability and performance.

> "language semantics since I believe that to be a personal preference"

If you start with such patently false beliefs you aren't going to get anywhere. Good abstractions are good, regardless of your personal preference. I have a very hard time programming without immutability, catamorphisms, pattern matching, sequence comprehension, varargs, fmaps/functors, partial functions, monads, predicates, filters, higher order functions, lambdas, closures, trampolines, continuations, strong types and co/contra-variance, options, actors, immutable collections, regex, combinators ( kestrels ), a dash of abstract algebra ( monoids semigroups groups rings fields ), basic math structures ( at a minimum, graphs with traversal algos baked in ), macros, implicits, dsl capability, and a mathstat library.

If you give me a language without some of the above, I will first invest time buying/building whatever is missing, so that I get that full list.

The claim is that scalability is an acronym for that list. If you give me everything in that list I give you scalability. Scalability is not about being close to the metal, gobs of ram etc...rather we are talking about the same block of code that services 200 million clients with the same sla/latencies as easily as 200 clients when you throw reasonable extra horsepower at it.

Without that list, you are essentially programming in the 19th century. You will, without even knowing, end up building various versions of these in your own pet language...for example, after writing a "for loop" with an "if statement" to vary behavior based on containment in a collection for the 100th time, it will occur to you that there must be a better way to encapsulate this logic...and sure that's what a predicate with a partial function is...but it's not like you are newton and you sit under apple tree and apple strikes your head and you get gravity out of it...these things won't come to you like magic...you have to essentially sit down and read a book on fp where they tell you why they do what they do.
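
To make the "for loop with an if statement" versus "predicate with a partial function" point concrete, here is a minimal sketch. The `premium` set and the label format are made up for illustration; only `collect` and the fact that a `Set` can be used as a predicate come from the Scala standard library.

```scala
object PredicateSketch {
  // Hypothetical data: ids we treat as "premium".
  val premium: Set[Int] = Set(2, 3, 5)

  // Imperative style: a for loop with an if that branches on containment.
  def labelsLoop(ids: List[Int]): List[String] = {
    val out = scala.collection.mutable.ListBuffer.empty[String]
    for (id <- ids) {
      if (premium.contains(id)) out += s"premium-$id"
    }
    out.toList
  }

  // A Set[Int] is itself a predicate (Int => Boolean), and `collect`
  // takes a partial function that filters and transforms in one step.
  def labelsCollect(ids: List[Int]): List[String] =
    ids.collect { case id if premium(id) => s"premium-$id" }
}
```

Both methods return the same list; the second one just names the pattern instead of spelling it out each time.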

The thing with scala is that you can pair it up with akka/breeze/scalaz & hit every single item on that list and then some. So also with haskell, clos, ocaml, sml, clojure, erlang...the match degree varies but you can get close to 90-100%. If you try to hit that list with a non-fp lang like c++/java...the degree is like 20% and you will have to do 80% of the work to hit that list.

Now you could say (as my manager at my former big-co did) I don't give a shit about that list above, maybe your list consists of only 3 things - personal preference, easy availability of devs from india for hiring purposes, and spending less money. Then sure, java is on the table, as is js, ruby, python, some subset of c/c++ etc.

I'm not sure what you are looking for. Isn't your question akin to asking, since Java just compiles to bytecode, what advantage is there to writing Java instead of bytecode?

The higher-level abstractions that are directly supported by Scala are precisely what make it more "scalable" from an authoring perspective. In particular, the help it provides in writing the many objects needed to support an OO design, along with its functional support, goes a long way toward reducing your code to quickly communicating the abstractions you are using, and not the language used to describe them.

This is particularly true in a method body, where one can use inference to remove the types from the code rather well. With judicious use of map/etc, often changing the collection type you are using is reduced to a single change in the code, and not through every intermediate collection along the way.
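
A small sketch of what that looks like, with made-up values and names (`prices`, `labels`). The concrete collection type is written once, and every intermediate result gets its type by inference.

```scala
object InferenceSketch {
  // The only place the concrete collection type is named.
  // Swapping List for Vector here is the single change needed;
  // the intermediate results below follow by inference.
  val prices = List(10, 25, 40)

  val doubled = prices.map(_ * 2)        // inferred as List[Int]
  val large   = doubled.filter(_ > 30)   // inferred as List[Int]
  val labels  = large.map(p => s"price=$p") // inferred as List[String]
}
```

The Java of the time would repeat the element and collection types on each intermediate variable, so the same change would touch every line.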

As far as I know, Scala can't give you _better_ performance than Java, but I also can't see it being too far off.

I've never really seen the benefits of Scala. Everyone whines about Java syntax, but it's really not that bad. Verbose, sure, but it just doesn't seem like time spent typing is where my productivity goes. Maybe I'm just getting old, but I no longer care about language syntax as much as I used to. A preprocessor to give me Java 8 lambda syntax would probably solve all the issues I have with Java right now.

"A preprocessor to give me Java 8 lambda syntax would probably solve all the issues I have with Java right now."

That's one of the things that Scala gives you: the functional features are there now and have been for almost ten years.

Still, I know where you're coming from. I was a die-hard Java guy until I gave Scala a shot. I'm not looking back, though I do think that Java's tooling is much more mature than Scala's.

Scala has much worse tooling (SBT is an oxymoron), and it's much harder to read than Java.

Conceding that it's more fun to write, I think 'harder to read' is more important. Scaling refers to team size and amount of functionality in addition to requests per second.

The "much harder to read" part varies greatly between codebases. Most "sane" codebases are 90% (random percentage) as easy to read as Java. Just collection initialization, transformations, simple pattern matching, method invocations... Sometimes there are ugly nested flatMaps/maps/filters. But in my experience, in general, they're more the exception than the rule.
It's also the slowest compilation I've ever seen, even with really clever custom tooling.
Scala compilation is indeed slow (although I think C++ is worse). But I always have a "sbt ~compile" running, compiling every file as they change. So the compilation time is pretty much irrelevant to me. While using eclipse, I guess, you'd get the same speed too.
I didn't either, until I tried it. On one level, yes, it's java with type inference and lambdas and saves you typing. When you dig a little deeper in scala, things like case classes (immutability), pattern matching, for comprehension (monad syntax) suddenly fundamentally change the way you program.
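
For anyone who hasn't seen those three features together, a minimal sketch (the `Shape` hierarchy and `totalArea` are invented for the example, not taken from any real project):

```scala
object ShapesSketch {
  // Case classes are immutable by default and get equals/hashCode/copy for free.
  sealed trait Shape
  final case class Circle(radius: Double) extends Shape
  final case class Rect(width: Double, height: Double) extends Shape

  // Pattern matching destructures each case; because Shape is sealed,
  // the compiler can warn when a case is missing.
  def area(s: Shape): Double = s match {
    case Circle(r)  => math.Pi * r * r
    case Rect(w, h) => w * h
  }

  // A for comprehension over Option: the result is None if either lookup fails.
  def totalArea(shapes: Map[String, Shape], a: String, b: String): Option[Double] =
    for {
      x <- shapes.get(a)
      y <- shapes.get(b)
    } yield area(x) + area(y)
}
```

With a map holding a 2x3 and a 1x1 rectangle, `totalArea` yields `Some(7.0)` when both keys exist and `None` otherwise, without any explicit null checks.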

Code becomes a lot more clear, explicit, and... fun. I recently talked to a dev who used Akka with java for a network server, and he ended up coding it in 'an object for each callback handler' style.

When I asked why he did it that way, he said nesting callbacks to sequence operations would get too ugly fast.

This is trivial in scala. The language flaws in java ended up dictating the architecture of his entire application, and for the worse IMO.
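
A rough sketch of the sequencing in question. This uses the standard library's `scala.concurrent.Future` rather than Akka itself, and the three steps (`authenticate`, `fetchProfile`, `render`) are hypothetical stand-ins for real network calls.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequencingSketch {
  // Hypothetical asynchronous steps; real ones would do network I/O.
  def authenticate(user: String): Future[String]  = Future(s"token-for-$user")
  def fetchProfile(token: String): Future[String] = Future(s"profile($token)")
  def render(profile: String): Future[String]     = Future(s"<h1>$profile</h1>")

  // Each step starts only after the previous one completes, with no
  // nested callbacks and no handler object per step.
  def page(user: String): Future[String] =
    for {
      token   <- authenticate(user)
      profile <- fetchProfile(token)
      html    <- render(profile)
    } yield html

  // Blocking is only for the demo; a server would return the Future itself.
  def demo(): String = Await.result(page("alice"), 5.seconds)
}
```

The for comprehension is sugar for nested `flatMap`/`map` calls, which is exactly the nesting that gets ugly when written out by hand with anonymous classes.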

"A preprocessor to give me Java 8 lambda syntax would probably solve all the issues I have with Java right now." Xtend seems to do this and more http://blog.efftinge.de/2012/12/java-8-vs-xtend.html
I think the benefit is that people don't want to say they're Java programmers. It's like a hipster thing. Most modern scala programmers don't even use the language correctly, or couldn't write "hello world" if it wasn't in the play framework.
I think you mistook HN for YouTube.
He raised a point, though a little abrasively. You left an entirely content-free response. Who mistook what for what again?

Attack his point and win, instead of pointless drivel like this.

Off the top of my head: immutability means less locking, which is a big enemy of scalability; Actor Model support in the standard library also makes concurrency much easier to deal with. Wikipedia [1] mentions that Scala "actors may also be distributed or combined with software transactional memory".

[1] https://en.wikipedia.org/wiki/Scala_(programming_language)#C...
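
A minimal sketch of the "immutability means less locking" part, assuming a made-up `Account` type and using plain standard-library Futures rather than actors:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ImmutabilitySketch {
  final case class Account(owner: String, balance: Long)

  // An immutable snapshot can be read from many threads without a lock:
  // nobody can change it underneath a reader.
  val accounts: Vector[Account] =
    Vector(Account("a", 100), Account("b", 250), Account("c", 50))

  // "Updating" produces a new value instead of mutating shared state.
  def deposit(acc: Account, amount: Long): Account =
    acc.copy(balance = acc.balance + amount)

  // Several concurrent tasks read the same snapshot, each producing a
  // fresh value; no synchronized block is needed anywhere.
  def totalAfterBonus(bonus: Long): Long = {
    val tasks = accounts.map(a => Future(deposit(a, bonus).balance))
    Await.result(Future.sequence(tasks), 5.seconds).sum
  }
}
```

The original `accounts` vector is untouched after the call, which is why the concurrent readers never need to coordinate.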

Semantics aside, there is really no difference between performance in Scala and Java, equivalent code in both compiles to near identical byte code, save for a few niche optimisations (eg tail call recursion) that can make Scala faster. I discussed this here:

https://jazzy.id.au/default/2012/10/16/benchmarking_scala_ag...
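
For reference, the tail call case looks roughly like this. The `sumTo` function is just an illustration; the `@tailrec` annotation is from the standard library and only covers direct self-recursion, not tail calls in general.

```scala
import scala.annotation.tailrec

object TailRecSketch {
  // @tailrec makes the compiler reject this method unless the recursive
  // call is in tail position, in which case it is compiled to a loop.
  @tailrec
  def sumTo(n: Long, acc: Long = 0L): Long =
    if (n <= 0) acc else sumTo(n - 1, acc + n)
}
```

Because the recursion becomes a loop, a call like `sumTo(1000000L)` does not grow the stack the way the naive recursive version would.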

Since it's all just bytecode, there's no difference in what you can do when it comes to scalability. However, it's the semantics of Scala, which you think are just personal preference, that make writing scalable code easier: the language biases you towards scalable code with things like default immutability and asynchronous IO. Anything you do in Scala you can do in Java, but the question is, when the syntax overhead of, for example, monadic programming is over 5 times greater than the equivalent Scala code (and many times less readable and maintainable), would you do that? Most developers don't. I wrote another blog post about that here:

http://jazzy.id.au/default/2012/11/02/scaling_scala_vs_java....

Just a guess (and I didn't look at the link) but possibly the Akka library. http://akka.io/
Akka can be used in Java just fine :)
One simple example of Scala power is the greatness of the Future pattern (scala, akka, finagle/twitter and other implementations).

You can still have the exact same Futures functionality in Java, but they will be so much more verbose and annoying to use that it ends up not being a good idea.

Strawman: you're reducing differences between languages to semantic differences and that's simply not correct.

Someone already mentioned immutability. Another example is sound concurrency support: a language which is using non-blocking synchronization is not going to have threads constantly blocked monitoring locks (I'm not talking about Scala particularly: but in Java the language, the "synchronized" SNAFU is really totally messed up if you've done any serious concurrent programming and it's very hard to get "right").

Another example: crazy optimization can be achieved by using macro pre-processing at compile time. Java the language doesn't offer this. It's actually so problematic in certain cases that specific tools have been developed (e.g. by Nokia for accessing the J2ME Nokia API) that add macro support to Java.

Another example where macros do shine: an API like Trove, using primitive-backed maps/sets, runs circles around, say, a normal HashMap{Integer,Long}. But Java cannot automate/duplicate the code needed to offer this for all the primitive types (which, some may argue, is precisely the reason why such performant maps/sets have never been part of the stock Java APIs). So what do you do if you want to run circles around these default, totally lame HashMap{Long,Integer} if you're, say, the Trove author? You write your own code generator generating all the .java files needed from a template. Pretty cool (from the Trove author) but also pretty lame (from "Java the language").

With real macros this one is a no-brainer. So much for "semantics".

These are just two examples: your belief that language differences can be reduced to semantics is incorrect.

Would love to see a more detailed post on how Scala helps you specifically.
Been getting that feedback from others. If we did a followup article what specifics would be most interesting to know?
- How did Scala make it easier to design scalable abstractions - both for performance and for features?

- What kind of tooling do you use? Which version of Scala?

- Any tips around hiring and ramping up a team?

Thanks for the offer.

Edit: formatting

I'm so confused. Some diagramming app no one has heard of has to scale to 'exponential' growth and figure out:

* [not] loading a lot of session data on every request

* storing data across shards

* parallel processing

* a services approach

Wow, I think something is over-architected.

Curious: does exponential growth mean 2 concurrent users to 4? If they have more than 100 concurrent users, I would be highly skeptical.

And, if they can't solve that problem with any language, I would also be highly skeptical.

Lucidchart employee here. Though we don't publicly release total user stats, this recent post about 500,000 installs through the Chrome Web Store gives a hint:

https://www.lucidchart.com/blog/2012/10/31/500000-chrome-web...

Even with 550,000 installs there, the Chrome Web Store is just one of a number of strong channels for Lucidchart and contributes a small minority of our user base.

So, how many requests / second is your load balancer receiving from outside?
Why do you think no one has ever heard of an app just because you haven't? In general, your comment feels rather aggressive for no good reason.

That said, I have my doubts about the term 'exponential'. I forgive financial journalists who think that it means the same as 'explosive', but a programmer should know better. Unless, of course, Lucidchart does have exponential growth, in which case I guess congrats are in order.

This seems like an unfair reaction from the outside. There is no way you can know their use case and traffic better than they can. What does this add to the conversation anyway?