by slashdotdash 3555 days ago
Gary Bernhardt describes a similar architecture using a "Functional Core, Imperative Shell" in his Boundaries talk[1].

"Purely functional code makes some things easier to understand: because values don't change, you can call functions and know that only their return value matters—they don't change anything outside themselves. But this makes many real-world applications difficult: how do you write to a database, or to the screen?"

"This design has many nice side effects. For example, testing the functional pieces is very easy, and it often naturally allows isolated testing with no test doubles. It also leads to an imperative shell with few conditionals, making reasoning about the program's state over time much easier."

[1] https://www.destroyallsoftware.com/talks/boundaries

[2] https://www.destroyallsoftware.com/screencasts/catalog/funct...


At the lowest levels, the pattern can be reversed: you provide a nice functional shell around a little bit of imperative core (e.g. implementing "map" with a loop).
You are referring to, for example: http://clojure.org/reference/transients

Combining the two ideas

Transient imperative logic in the core (5%), Functional mantle (90%), Side-effecting imperative crust (5%).

Yes, because some purely functional approaches cannot beat imperative ones when it comes to resource usage.
Could you give a concrete example?
Most in-place algorithms. E.g. quicksort

You can do merge sort in Haskell asymptotically as well as C, but not quicksort (because you can't mutate things in place).

I am of course omitting things like ST which do give you this sort of ability in Haskell, but I doubt that's what the OP meant by "purely functional".

ST is exactly the same sort of thing junke was talking about, except it also uses the type system to ensure the imperative core doesn't leak into the outside world.
I think the grandparent comment boils down to saying "there's no known persistent data structure with O(1) random access for read and write". Whether you need such a structure (for a cache, histogram, frame buffer, etc.) is up to you.

Some functional languages allow transient data structures (via something like ST or uniqueness types), so you can match the performance of any imperative algorithm at the cost of ugly code.
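As a rough sketch of the "imperative core behind a pure interface" idea, here is a histogram built with an in-place mutable array inside ST. The function name `histogram` and the bucket layout are made up for illustration; `runSTUArray` comes from the `array` package that ships with GHC.

```haskell
import Data.Array.ST (newArray, readArray, writeArray, runSTUArray)
import Data.Array.Unboxed (UArray, elems)

-- Count how often each value in [0, buckets) occurs, mutating one array
-- in place. runSTUArray freezes the array on the way out, so callers
-- only ever see an ordinary pure function.
histogram :: Int -> [Int] -> UArray Int Int
histogram buckets xs = runSTUArray $ do
  counts <- newArray (0, buckets - 1) 0
  mapM_ (bump counts) (filter valid xs)
  return counts
  where
    -- Values outside the bucket range are ignored (an assumption of this sketch).
    valid x = x >= 0 && x < buckets
    bump counts x = do
      n <- readArray counts x
      writeArray counts x (n + 1)

-- elems (histogram 3 [0, 1, 1, 2, 2, 2, 5]) == [1, 2, 3]
```

The type of `histogram` mentions no mutation at all, which is the point: the imperative part cannot leak past `runSTUArray`.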

I was about to cite Okasaki, but I found a detailed answer over there:

http://stackoverflow.com/a/1990580

> Note also that all of this discusses only asymptotic running times. Many techniques for implementing purely functional data structures give you a certain amount of constant factor slowdown, due to extra bookkeeping necessary for them to work, and implementation details of the language in question. The benefits of purely functional data structures may outweigh these constant factor slowdowns, so you will generally need to make trade-offs based on the problem in question.

Hash tables are a big example. You can implement pure associative arrays, and they have some nice benefits, but the best functional implementations still have much poorer asymptotic performance than a basic hash table.

Pure data structures also tend to include a lot of extra pointers. That can create a lot of overhead. A pure list of 32-bit integers will need either 4 or 8 bytes of overhead (depending on whether you're running 32 or 64 bit) for every item stored. A resizable array just needs whatever empty space it preallocates, plus the occasional memcpy when it needs to add capacity. Also locality of reference and all that fun stuff.

Another example: caching of results, transparent to the user of your function. If your function is otherwise pure, you can simply hash its parameters to get a key into your result index. Examples where I've used this: caching regex results, and replacing file reads with modification-date polls, reading from the cache if the file hasn't changed.
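A minimal sketch of that caching idea, with the cache threaded explicitly rather than hidden behind mutation. The name `cached` is hypothetical, and it assumes an ordered key in a `Data.Map` instead of a hash table, purely to keep the example dependency-free.

```haskell
import qualified Data.Map.Strict as Map

-- Look the argument up in the cache. On a miss, run the pure function
-- and return the cache extended with the new entry.
cached :: Ord k => (k -> v) -> k -> Map.Map k v -> (v, Map.Map k v)
cached f key cache =
  case Map.lookup key cache of
    Just hit -> (hit, cache)
    Nothing  ->
      let result = f key
      in (result, Map.insert key result cache)
```

Because `f` is pure, a hit and a miss are indistinguishable to the caller apart from the time taken.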
I was disappointed to learn that Quicksort implemented functionally is almost always much much slower than if implemented procedurally, to the point that functional langs use other sorts such as mergesort. What makes it extra annoying is that Quicksort implemented functionally is so damn elegant!

http://stackoverflow.com/questions/7717691/why-is-the-minima...
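For reference, the elegant list-based version usually meant here looks roughly like this. It is a sketch of the well-known textbook definition, not an in-place quicksort: every step allocates new lists instead of partitioning the input.

```haskell
-- The classic list-based "quicksort". It filters the tail twice and
-- builds fresh lists, so it never partitions in place.
quicksort :: Ord a => [a] -> [a]
quicksort []       = []
quicksort (p : xs) = quicksort smaller ++ [p] ++ quicksort larger
  where
    smaller = [x | x <- xs, x < p]
    larger  = [x | x <- xs, x >= p]
```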

Neural networks are a good example where mutable structures are the only realistic choice. You're dealing with fully connected networks of millions of nodes, each of which needs to be updated multiple times for each layer at every pass.
the main thing being that side effects still aren't happening in the core, so the core is still referentially transparent.
mmmmmmmmmmmm. Functional sandwich.
This kind of abstraction layer is really challenging though unless you're totally aware of the extent of your imperative code's side effects.
Isn't this already happening, though, at a low level (i.e. ASM)?
Yes, but the domain of your computer memory is surprisingly disjoint from the domain of your business logic, and that's really what matters here.

Consider a counter-example: a functional method that takes an object and creates a copy with one field updated (a classic pattern for introducing immutability to a mutable environment). What if internally the constructor makes a log call or increments a shared resource tally?

Not unreasonable, but in a functional context an update now has weird side effects that create misleading results.

CakeML, a Standard ML subset, can compile to ASM with mathematical proof of correctness. You know your ASM will be what you expected it to be. There are also tools like QuickCheck, QuickSpec, and SPARK that can automate lots of analysis. Tons of work like this in compilers and static analysis with smart people always improving them.

Good chance your business logic will not be handled that well. So, good to structure it in a way to facilitate easy analysis or optimization by tools that currently exist or are in development. You get long-term benefits.

But the abstraction there is a lot more battle-hardened than your business logic at $COMPANY. Not that compiler bugs never happen, but they're comparatively quite rare.
I understand the value of referential transparency and how it makes "certain" things easy, but saying that it automatically makes testing functional code easy is a myth. Sometimes it does, sometimes it doesn't.

If you want to stick to referential transparency, you can't use dependency injection: you have to pass all the parameters the function needs. None of these can be implicit or belong to a field on the class since that would mean side effects. The `Reader` monad is not dependency injection, it's dependency passing and it comes with a lot of unpleasant effects on your code.

And because of that, functional code is often very tedious to test. Actually, in my experience, there is a clear tension between code that's referentially transparent and code that's easily testable. In practice, you have to pick one, you can't have both.

> And because of that, functional code is often very tedious to test. Actually, in my experience, there is a clear tension between code that's referentially transparent and code that's easily testable. In practice, you have to pick one, you can't have both.

In all my years of software development, I've never encountered a referentially-transparent function that was even remotely hard to test, let alone harder than one with environmental baggage. In fact, being referentially transparent opens you up to new kinds of powerful testing strategies that are nearly impossible if the function isn't, like QuickCheck. (I can't recommend QuickCheck highly enough; it's worth the little learning curve 100x over.)
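To make the property-testing point concrete, here is a small sketch. The properties are ordinary `Bool`-returning functions; the names and the hand-picked `sampleInputs` are assumptions made so the example runs with nothing but base. With QuickCheck installed, `quickCheck prop_sortIdempotent` would generate the inputs instead.

```haskell
import Data.List (sort)

-- Sorting twice gives the same result as sorting once.
prop_sortIdempotent :: [Int] -> Bool
prop_sortIdempotent xs = sort (sort xs) == sort xs

-- Sorting never adds or drops elements.
prop_sortKeepsLength :: [Int] -> Bool
prop_sortKeepsLength xs = length (sort xs) == length xs

-- Hand-picked inputs standing in for randomly generated ones.
sampleInputs :: [[Int]]
sampleInputs = [[], [1], [3, 1, 2], [2, 2, 1], [5, 4, 3, 2, 1]]

-- True when every property holds on every sample input.
allPropsHold :: Bool
allPropsHold =
  all prop_sortIdempotent sampleInputs && all prop_sortKeepsLength sampleInputs
```

No setup, mocks, or teardown are needed, because the function under test depends only on its argument.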

First of all, passing parameters in functions is "dependency injection". And what you're describing is a really good thing.

Let's be honest, most dependency injection frameworks and techniques are about hiding junk under the rug. But they fix the symptoms, not the disease. You see, if you find yourself having components with too many dependencies, feeling pain on initialization, the problem is that you have too many dependencies, which actually means you have too much tight coupling and not that it is hard to initialize them. At this point you should embrace that pain and treat the actual disease.

Also, functional programming naturally leads to building descriptions of what you want, in a declarative way. So instead of depending directly on services that trigger side effects, like a DB component that does inserts or something doing HTTP requests, you build your application to trigger events that will eventually be linked to those side-effect-triggering services.

There are multiple ways of doing this. For example you could have a channel / queue of messages, with listeners waiting for events on that queue. And TFA actually speaks about the Free monad. Well the Free monad is about separating the business logic from the needed side-effects, the idea being to describe your business logic in a pure way and then build an interpreter that will go over the resulting signals and trigger whatever effects you want. There's no dependency injection needed anymore, because you achieve decoupling.
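A minimal sketch of that separation, using a hand-rolled instruction type rather than a real free monad library. Everything here (`Program`, `fetch`, `logMsg`, `greet`, both interpreters) is hypothetical and only illustrates the shape: the business logic builds a description, and an interpreter decides what the effects mean.

```haskell
-- Each constructor describes an effect without performing it.
data Program a
  = Done a
  | Log String (Program a)
  | Fetch String (String -> Program a)

instance Functor Program where
  fmap f (Done a)      = Done (f a)
  fmap f (Log msg k)   = Log msg (fmap f k)
  fmap f (Fetch key k) = Fetch key (fmap f . k)

instance Applicative Program where
  pure = Done
  pf <*> pa = pf >>= \f -> fmap f pa

instance Monad Program where
  Done a      >>= f = f a
  Log msg k   >>= f = Log msg (k >>= f)
  Fetch key k >>= f = Fetch key (\v -> k v >>= f)

logMsg :: String -> Program ()
logMsg msg = Log msg (Done ())

fetch :: String -> Program String
fetch key = Fetch key Done

-- Business logic: a pure description with no dependency on any service.
greet :: String -> Program String
greet userId = do
  name <- fetch userId
  logMsg ("greeting " ++ name)
  return ("Hello, " ++ name)

-- One interpreter performs real effects (stdin/stdout stand in for a database).
runIO :: Program a -> IO a
runIO (Done a)      = return a
runIO (Log msg k)   = putStrLn msg >> runIO k
runIO (Fetch key k) = putStrLn ("value for " ++ key ++ "?") >> getLine >>= runIO . k

-- Another interpreter is pure: it answers from a lookup table and collects the logs.
runPure :: [(String, String)] -> Program a -> ([String], a)
runPure _  (Done a)      = ([], a)
runPure db (Log msg k)   = let (logs, a) = runPure db k in (msg : logs, a)
runPure db (Fetch key k) = runPure db (k (maybe "unknown" id (lookup key db)))

-- runPure [("u1", "Ada")] (greet "u1") == (["greeting Ada"], "Hello, Ada")
```

The test interpreter needs no mocks: `greet` never knew which database it was talking to in the first place.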

> And because of that, functional code is often very tedious to test.

That hasn't been my experience at all; quite the contrary, we've had really good results and we're doing such a good job of pushing the side-effects to the edge that we no longer care to unit-test side-effecting code. And yes, I believe you've had that experience, but I think it happens often with people new to FP who try to shoehorn their experience into the new paradigm.

E.g. do you need a component that needs to take input from a database? No, it doesn't have to depend on your database component at all. Do you need a component that has to insert stuff into the database? No, it doesn't have to depend on your database component at all. Etc.

> First of all, passing parameters in functions is "dependency injection". And what you're describing is a really good thing.

It's only injection if the parameter is passed automatically by a framework. Otherwise, it's parameter passing.

And it's only a good thing if you value referential transparency over ease of testing and encapsulation. Not everybody does (and personally, sometimes I do and sometimes I don't).

Dude you're mixing up terms. Quoting from https://en.wikipedia.org/wiki/Dependency_injection : "A dependency is an object that can be used (a service). An injection is the passing of a dependency to a dependent object (a client) that would use it". Basically if A depends on B, but A does not initialize B, but is instead receiving it as a parameter from somewhere, then that's dependency injection.

> it's only a good thing if you value referential transparency over ease of testing and encapsulation

I get the feeling that you're mixing up terms again, as you cannot have ease of testing or good encapsulation without referential transparency.

"passing parameters in functions is 'dependency injection'"

No, it's not.

You're agreeing with me.

I was saying that passing parameters to functions is "dependency passing", not "dependency injection".

When you do dependency injection well, you don't inject every object; that would be horrible/impossible. What you may have noticed is that there are two kinds of objects, ones that you inject and ones that you don't. The ones you don't are things like numbers, strings and maps/sets; things that you treat as values. The other objects do things, I tend to call them services.

In order to do a straightforward conversion to functional programming, I suggest leaving the values as they are and each service becomes a free monad transformer. So, instead of having a logger, you have a logging monad transformer that has a log instruction. Instead of having a database, you have a database monad transformer that has a query instruction, etc.

You are then free (no pun intended) to replace the interpreters of these free monads during testing with whatever mock implementation you please and the result is a more principled dependency injection inspired style.

Actually, I would constrain the monad type via type classes, rather than using free monads, but the approaches are equivalent.

I agree there is a dichotomy between objects you inject and objects you don't but I think your characterization is incorrect: what decides if an object needs to be injected is not tied to its type but to its role. Sometimes, I inject integers or strings or other primitive types. Other times, I pass them explicitly.

The decision is made based on whether that object is a runtime object (i.e. decided by the user or some other factor that cannot be known when the app starts) or a dependency that's decided early and won't change through the life of the app.

Either way, this aspect is independent of the point I was making above and which is that functional code is not inherently easier to test than procedural code.

> Either way, this aspect is independent of the point I was making above and which is that functional code is not inherently easier to test than procedural code.

That's true, you can certainly just write procedural code in functional languages and there's no benefit. However, you also have the ability to structure code in a way that is testable and is actually more structured than the equivalent OOP style. By which I mean: the operations on the dependencies are more constrained (since they can't be replaced or duplicated, etc).

> I understand the value of referential transparency and how it makes "certain" things easy, but saying that it automatically makes testing functional code easy is a myth. Sometimes it does, sometimes it doesn't.

> And because of that, functional code is often very tedious to test.

Your argument rests on a fundamentally wrong assumption. Expressions in functional programs do not have to be (and indeed are almost never) referentially transparent. Just consider global or module-level immutable variables. Those function names? Also not referentially transparent. This goes all the way back to free variables in the lambda calculus: https://en.wikipedia.org/wiki/Lambda_calculus#Free_variables

Further, dependency injection is a completely idiotic and broken pattern and IMO the worst thing to come out of object oriented programming. Once you have dynamic scoping (surprise! also not referentially transparent) everything that DI does (and much more) becomes trivial.

My own opinions on the matter aside, I don't fully understand why anyone who likes dynamic scoping would dislike dependency injection.
Because of things like: https://github.com/google/guice

4k LOC of "lightweight" garbage for... variable lookups?

To put that into perspective, the Squeak interpreter for the gold standard of OOP languages, Smalltalk, was about 3951 LOC of Smalltalk for logic to handle the language and 1681 LOC of C for the OS interface. A lightweight scheme for dependency injection took them more LOC to express than a whole Smalltalk interpreter. And to hack around bad OOP or tooling in the first place.

This might not mean anything. It just jumped out in my brain for some reason.

So your issue is with the implementation? I can understand that, I guess. Personally I don't like either dynamic scoping or DI; it had just never occurred to me that someone might prefer one to the other.
The reader monad removes 90% of the syntactic overhead of dependency passing. That's the point.

If you want dependency injection as you've defined it, you can use (if we're talking about Haskell) typeclasses or, by extension, implicit parameters, to do dependency injection in the way you like.

It's still much safer and easier to reason about than Java-style dynamic dependency injection.

Actually, `Reader` adds a lot of boilerplate that's not present with traditional @Inject injection:

- All your functions now need to return a Reader[C,A] instead of just A

- You need to pass all the parameters explicitly in each method signature as opposed to passing just the ones that don't need to be injected.

There is a very small amount of boilerplate. And I would argue strongly that it's a good thing; it indicates to the reader of the code that an object's behavior reads from some initialized value, or equivalently that its behavior depends on some initial value which remains fixed through the computation. The reader monad gives you a simple language to express this common pattern, as well as the ability to easily set the behavior.

    -- This function will always return the same thing given the same input
    function1 :: Int -> String
    
    -- This function depends on reading some configuration which 
    -- needs to be provided upstream
    function2 :: Int -> Reader Configuration String
You don't need to pass parameters explicitly in each signature; indeed this is exactly what the reader monad obviates: the details of what is being read are not expressed inside the function (until the point that they are actually used). This is hardly an onerous burden, in my opinion. And if typing `Reader X Y` is too annoying, you can just make a type alias.
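A small sketch of that claim, using a hand-rolled `Reader` so it runs with base alone (the real one lives in mtl/transformers). The `Configuration` fields and the helpers `padded` and `prefixed` are invented for illustration. Note that `function2` composes two config-reading steps without ever naming the configuration.

```haskell
-- A hand-rolled Reader, equivalent in spirit to the library version.
newtype Reader r a = Reader { runReader :: r -> a }

instance Functor (Reader r) where
  fmap f (Reader g) = Reader (f . g)

instance Applicative (Reader r) where
  pure a = Reader (const a)
  Reader f <*> Reader g = Reader (\r -> f r (g r))

instance Monad (Reader r) where
  Reader g >>= f = Reader (\r -> runReader (f (g r)) r)

ask :: Reader r r
ask = Reader id

-- Hypothetical configuration for this sketch.
data Configuration = Configuration { prefix :: String, width :: Int }

-- The type alias mentioned above.
type App = Reader Configuration

-- Reads the configured width at the point of use.
padded :: Int -> App String
padded n = do
  cfg <- ask
  let digits = show n
  return (replicate (width cfg - length digits) '0' ++ digits)

-- Reads the configured prefix at the point of use.
prefixed :: String -> App String
prefixed body = do
  cfg <- ask
  return (prefix cfg ++ body)

-- Depends on the configuration only through its type.
function2 :: Int -> App String
function2 n = padded n >>= prefixed

-- runReader (function2 42) (Configuration "id-" 5) == "id-00042"
```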
What if `function2` needs to log something? Without dependency injection, you need to pass that logger to the function. With dependency injection, that logger is available without having to pollute the method signature with an implementation detail.
If `function2` needs to log something, then it should have a type signature which reflects that.

Logging is a side-effect. Logging requires configuration to be passed in; it means having access to some file descriptor or other object to interact with, it could potentially fail to connect, or cause a computation to hang, or cause a service to trigger, or make a disk run out of space, etc. If a function wants to log something it's not a simple reader anymore but something more complex. The fact that in Haskell this is reflected in the type signature of the function is again a good thing. It's not "polluting" the method signature; it's putting more information in the method signature. Not letting you hide side effects in a computation that appears to have no externalities is a strength of Haskell, not a weakness.

A small example.

    {-# LANGUAGE GeneralizedNewtypeDeriving #-}
    import Prelude hiding (log)  -- our `log` would otherwise clash with Prelude's
    import Control.Monad.IO.Class (MonadIO, liftIO)
    import Control.Monad.Reader (ReaderT, MonadReader)
    
    -- Any monad that can log is marked as such in its type
    class (MonadIO m) => HasLogging m where
      log :: String -> m ()
    
    data AppConfig = AppConfig { stuff :: Int }
    
    newtype MyApp a = MyApp { runApp :: ReaderT AppConfig IO a}
      deriving (Functor, Applicative, Monad, MonadIO, MonadReader AppConfig)
    
    instance HasLogging MyApp where
      log s = liftIO (putStrLn s)
    
    function2 :: Int -> MyApp String
    function2 x = do
      log "hey guys I'm logging"
      return (show x)
    
    -- or without specifying the base monad, yay abstraction
    
    function2' x = do
      log "heyooo logging here"
      return (show x)
    
    -- Haskell will infer this type:
    -- function2' :: (HasLogging m, Show a) => a -> m String
Given that you'll notice whether or not function2 does any logging, it's definitely not an implementation detail.
Not sure I agree with you on that, but monads handle your concern nicely. For example, I can write some code that does some database operations, and by parameterizing the code over the type of database actions, my code doesn't care if it's calling a "real" database action or a "fake" one for testing or whatever. That is, the code is completely agnostic as to the implementation details, but we still have full visibility and static checking when we actually run the database code, because we have to specify which database implementation we want to use. Boom, statically verified dependency injection.
"Implementation detail"

We are in strong disagreement about what constitutes an "implementation detail".

But also, you can just use a monad transformer stack and add whatever side-effectful operations you want into it, use it as needed. Boom, dependency injection. And more control over what your functions actually do is there when you need it.

You might be interested in reflection/implicit configurations: https://hackage.haskell.org/package/reflection

This (ab)uses Haskell's type class mechanism to essentially implement dependency injection directly. The implementation looks a bit dirty, but this is a feature that more modern approaches to generic programming can handle natively (e.g., http://homepages.inf.ed.ac.uk/wadler/papers/implicits/implic... ).

In particular, there is nothing shady about the semantics of implicitly passing configuration values/dependencies. Your functions are still referentially transparent if you treat the implicit dependencies as additional parameters (which is what they are, no matter how you implement it).

I like the 'weakly pure' concept in D. A function like

  pure int frignate(database db, const config cfg);
cannot change anything except the database object (and anything reachable from it). It cannot mutate the environment. It cannot mutate the config object parameter. This is finer control than pure functional programming and safer than imperative/object-oriented programming.
You can achieve this level of control using monad transformers or extensible effects -- this is a standard technique in Haskell.

    frignate :: (MonadDB m) => Config -> m Int
    frignate cfg = do
      db <- getDB
      ...
      return 1
And it is composable, so if a function calls a function that uses one of the managed resources then the requirement propagates upward.

And you can swap in non-IO based instances for testing, or whatever else you want.
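Here is a rough, self-contained sketch of what swapping in a non-IO instance could look like. The class methods `readKey` and `writeKey`, the `Config` field, and the `FakeDB` type are all hypothetical; they stand in for whatever a real `MonadDB` would offer.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical capability class: the only way to touch the database.
class Monad m => MonadDB m where
  readKey  :: String -> m (Maybe Int)
  writeKey :: String -> Int -> m ()

newtype Config = Config { step :: Int }

-- The constraint says this function may use the database and nothing else.
frignate :: MonadDB m => Config -> m Int
frignate cfg = do
  current <- readKey "counter"
  let next = maybe 0 id current + step cfg
  writeKey "counter" next
  return next

-- A pure test instance: state-passing over an in-memory map, no IO involved.
newtype FakeDB a = FakeDB
  { runFakeDB :: Map.Map String Int -> (a, Map.Map String Int) }

instance Functor FakeDB where
  fmap f (FakeDB g) = FakeDB (\s -> let (a, s') = g s in (f a, s'))

instance Applicative FakeDB where
  pure a = FakeDB (\s -> (a, s))
  FakeDB f <*> FakeDB g =
    FakeDB (\s -> let (h, s')  = f s
                      (a, s'') = g s'
                  in (h a, s''))

instance Monad FakeDB where
  FakeDB g >>= f = FakeDB (\s -> let (a, s') = g s in runFakeDB (f a) s')

instance MonadDB FakeDB where
  readKey k    = FakeDB (\s -> (Map.lookup k s, s))
  writeKey k v = FakeDB (\s -> ((), Map.insert k v s))
```

Running `runFakeDB (frignate (Config 2) >> frignate (Config 3)) Map.empty` exercises the same `frignate` code a production instance would run, with the whole "database" visible as a plain value afterwards.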

I'm not sure how D's purity system works, but that doesn't seem very pure to me. Mutating the database object is a globally visible effect, after all.
The guarantee it's making is that it's not going to manipulate program state that is outside the scope of "db". That is a pretty big deal for a systems developer since, as I discovered while working with D, many standard library functions that you wouldn't have given a second thought about happen to do impure things like set a processor flag. We aren't even talking about your own program.

From an abstracted viewpoint, it's not great, since a whole database covers potentially a lot of scope, and you may not want to care about the details of your floating point calculations in hardware, but in a concrete sense this is totally correct!

I'd like to use the occasion to say that I'm really waiting for the video of his talk "Ideology" given at StrangeLoop 2015.
If it helps at all, here's a transcript of that talk: https://github.com/strangeloop/StrangeLoop2015/blob/master/t...
Same, but for whatever he talked about at PyCon 2015
I'm enjoying watching the Boundaries talk, especially when he converts the serial code to concurrent code using actors.
This is one of the best talks I've ever seen, highly recommend anything by Gary Bernhardt
I was quite disappointed the author chose not to cite Bernhardt's work, as this is pretty clearly derivative of that talk and other work in the community around this design.

I'm certain I've heard Hickey talk about it a few years ago as well. Trying to remember where.

I don't think it's derivative. He mentions the onion architecture, which is an older concept. I really liked the way Gary presented the idea, but he wasn't the originator.