by lukaslalinsky 439 days ago
Whenever I hear someone talking about purely functional programming, no side effects, I wonder what kind of programs they are writing. Pretty much anything I've written over the last 30 years, the main purpose was to do I/O, it doesn't matter whether it's disk, network, or display. And that's where most of the complications come from: these devices you are communicating with have quirks that you need to deal with. Purely functional programming is very nice in theory, but how far can you actually get away with it?
22 comments

The idea of pure functional programming is that you can really go quite far if you think of your program as a pure function f(input) -> outputs with a messy impure thing that calls f and does the necessary I/O before/after that.

Batch programs are easy to fit in this model generally. A compiler is pretty clearly a pure function f(program source code) -> list of instructions, with just a very thin layer to read/write the input/output to files.

Web servers can often fit this model well too: a web server is an f(request, database snapshot) -> (response, database update). Making that work well is going to be gnarly in the impure side of things, but it's going to be quite doable for a lot of basic CRUD servers--probably every web server I've ever written (which is a lot of tiny stuff, to be fair) could be done purely functional without much issue.
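A minimal sketch of that web server shape, assuming a toy key-value store. The names `Request`, `Response`, `handle`, and `serve_one` are hypothetical and only illustrate the split: `handle` is the pure f(request, database snapshot) -> (response, database update), and `serve_one` is the thin impure layer around it.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class Request:  # hypothetical request type
    method: str
    key: str
    body: Optional[str] = None


@dataclass(frozen=True)
class Response:  # hypothetical response type
    status: int
    body: str = ""


# A database update is described as data: (key, new_value), or None for "no change".
Update = Optional[Tuple[str, str]]


def handle(request: Request, snapshot: Mapping[str, str]) -> Tuple[Response, Update]:
    """Pure core: reads only its arguments and returns a response plus a described update."""
    if request.method == "GET":
        if request.key in snapshot:
            return Response(200, snapshot[request.key]), None
        return Response(404), None
    if request.method == "PUT" and request.body is not None:
        return Response(204), (request.key, request.body)
    return Response(400), None


def serve_one(request: Request, db: dict) -> Response:
    """Impure shell: takes a read-only snapshot, calls the pure core, applies the update."""
    response, update = handle(request, MappingProxyType(dict(db)))
    if update is not None:
        key, value = update
        db[key] = value  # the only mutation lives here
    return response
```

Tests for `handle` need nothing but plain values; retries, caching, and transactions would all belong in `serve_one`, which this sketch leaves deliberately naive.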

Display can also be made to work: it's f(input event, state) -> (display frame, new state). Building the display frame here is something like an immediate mode GUI, where instead of mutating the state of widgets, you're building the entire widget tree from scratch each time.

In many cases, the limitation of purely functional programming isn't that somebody somewhere has to do I/O, but rather the impracticality of faking immutability if the state is too complicated.

I guess my point is that you actually have to write the impure code somehow and it's hard, external world has tendencies to fail, needs to be retried, coordinated with other things. You have to fake all these issues. In your web server example, if you need a cache layer for a certain part of the data, you really can't add one without encoding it into the state management tooling. And at this point you are writing a lot of non-functional code in order to glue it together with pure functions and maybe do some simple transformation in the middle. Is it worth it?

I have respect for OCaml, but that's mostly because it allows you to write mutable code fairly easily.

Roc codifies the world vs core split, but I'm skeptical how much of the world logic can be actually reused across multiple instances of FP applications.

There's a spectrum of FP languages with Haskell near the "pure" end where it truly becomes a pain to do things like io and Clojure at the more pragmatic end where not only is it accepted that you'll need to do non functional things but specific facilities are provided to help you do them well and in a way that can be readily brought into the functional parts of the language.

(I'm biased though as I am immersed in Clojure and have never coded in Haskell. But the creator of Clojure has gone out of his way to praise Haskell a bunch and openly admits where he looked at or borrowed ideas from it.)

> external world has tendencies to fail, needs to be retried, coordinated with other things.

This is exactly why I'm so aggressive in splitting IO from non-IO.

A pure function generally has no need to raise an exception, so if you see one, you know you need to fix your algorithm not handle the exception.

Whereas every IO action can succeed or fail, so those exceptions need to be handled, not fixed.

> You have to fake all these issues.

You've hit the nail on the head. Every programmer at some point writes code that depends on a clock, and tries to write a test for it. Those tests should not take seconds to run!

In some code bases the full time is taken.

  handle <- startProcess
  while handle.notDone
    sleep 1000ms
  check handle.result
In other code-bases, some refactoring is done, and a fake clock is invented.

   fakeClock <- new FakeClock(10:00am)
   handle <- startProcess(fakeClock);
   fakeClock.setTime(10:05am)
   waitForProcess handle
Why not go even further and just pass in a time, not a clock?

   let result = process(start=10:00am, stop=10:05am)
Typically my colleagues are pretty accepting of doing the work to fake clocks, but don't generalise that solution to faking other things, or even skipping the fakes, and operating directly on the inputs or outputs.
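As a rough illustration of "pass in a time, not a clock", here is a sketch in Python. The function names and the timeout rule are assumptions made up for the example; the point is only that the pure function receives timestamps as values, and the real clock is read in exactly one place.

```python
from datetime import datetime, timedelta


def has_timed_out(started_at: datetime, now: datetime, limit: timedelta) -> bool:
    """Pure: depends only on the two timestamps and the limit."""
    return now - started_at > limit


def check_job(started_at: datetime, limit: timedelta) -> bool:
    """Thin impure wrapper: the only place the real clock is read."""
    return has_timed_out(started_at, datetime.now(), limit)
```

A test for `has_timed_out` runs instantly and needs neither a sleep nor a fake clock object, since "10:00am" and "10:05am" are just arguments.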

Does your algorithm need to upload a file to S3? No it doesn't, it needs to produce some bytes and a url where those bytes should go. That can be done in unit-test land without any IO or even a mocking framework. Then some trivial one-liner higher up the call-chain can call your algorithm and do the real S3 upload.
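One possible shape for this, sketched in Python. `UploadPlan`, `build_report`, `publish`, and the bucket name are hypothetical, and `put_object` stands for whatever callable really performs the upload; no particular S3 client API is assumed.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class UploadPlan:  # hypothetical: "some bytes and a url where those bytes should go"
    url: str
    payload: bytes


def build_report(customer_id: str, totals: dict) -> UploadPlan:
    """Pure: decides what bytes go where and performs no IO."""
    payload = json.dumps(totals, sort_keys=True).encode("utf-8")
    # The bucket name below is made up for illustration.
    return UploadPlan(url=f"s3://example-reports/{customer_id}/totals.json", payload=payload)


def publish(plan: UploadPlan, put_object: Callable[[str, bytes], None]) -> None:
    """Impure one-liner: hand the plan to whatever really performs the upload."""
    put_object(plan.url, plan.payload)
```

The unit test for `build_report` compares plain values. Error handling and retries for the real upload would live in or around `put_object`, which this sketch does not attempt.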

I completely agree, but I still question the purpose of FP languages. Writing the S3 upload code is quite hard, if you really want to handle all possible error scenarios. Even if you use whatever library for that, you still need to know which errors it can trigger, which need to be handled, and how to handle them. The mental work can be equal to the core function for generating the file. In any language, I'd separate these two pieces of code, but I'm not sure if I'd want to handle S3 upload logic with all the error handling in an FP language. That said, I've not used Clojure yet and that seems like a very pragmatic language, which might be actually usable even for these parts of the code.
Think of it like other features:

* Encapsulation? What's the point of having it if it's perfectly sealed off from the world? Just dead-code eliminate it.

* Private? It's not really private if I can Get() to it. I want access to that variable, so why hide it from myself? Private adds nothing because I can just choose not to use that variable.

* Const? A constant variable is an oxymoron. All the programs I write change variables. If I want a variable to remain the same, I just won't update it.

Of course I don't believe in any of the framings above, but it's how arguments against FP typically sound.

Anyway, the above features are small potatoes compared to the big hammer that is functional purity: you (and the compiler) will know and agree upon whether the same input will yield the same output.

Where am I using it right now?

I'm doing some record linkage - matching old transactions with new transactions, where some details may have shifted. I say "shifted", but what really happened was that upstream decided to mutate its data in-place. If they'd had an FPer on the team, they would not have mutated shared state, and I wouldn't even need to do this work. But I digress.

Now I'm trying out Dijkstra's algorithm, to most efficiently match pairs of transactions. It's a search algorithm, which tries out different alternatives, so it can never mutate things in-place - mutating inside one alternative will silently break another alternative. I'm in C#, and was pleasantly surprised that ImmutableList etc actually exist. But I wish I didn't have to be so vigilant. I really miss Haskell doing that part of my carefulness for me.

>I'm in C#, and was pleasantly surprised that ImmutableList etc actually exist.

C# has introduced many functional concepts. Records, pattern matching, lambda functions, LINQ.

The only thing I am missing, which will come later, is discriminated unions.

Of course, F# is a better fit for the job if you want a mostly functional workflow.

I don't want functional-flavoured programming, I want functional programming.

Back when I was more into pushing Haskell on my team (10+ years ago), I pitched the idea something like:

  You get: the knowledge that your function's output will only depend on its input.

  You pay: you gotta stop using those for-loops and [i]ndexes, and start using maps, folds, filters etc.
Those higher-order functions are a tough sell for programmers who only ever want to do things the way they've always done them.

But 5 years after that, in Java-land everyone was using maps, folds and filters like crazy (or in C#-land, Selects and Wheres and SelectManys etc.), with some half-thought-out bullshit reasoning like "it's functional, so it must be good!"

So we paid the price, but didn't get the reward.

Using map, fold etc. is not the hard part of functional programming. The hard part is managing effects (via monads, monad transformers, or effects). Trying to convert a procedural inner mutating algorithm to say Haskell is challenging.
Never used monads with Clojure (the only Lisp I've done "serious" work in). Haskell introduced them to me, but I've never done anything large with Haskell (no jobs!). Scala, however, has monads via the cats or (more recently) the ZIO library and they work just fine there.

The main problem with Monads is you're almost always the only programmer on a team who even knows what a Monad is.

> The hard part is managing effects

You can say that again!

Right now I'm working in C#, so I wish C# managed effects, but it doesn't. It's all left to the programmer.

I don't know, stacking monads is a comparable level of pain to me.
One struggle I’ve had with wrapping my head around using FP and Lisp-like languages for a “real world” system is handling something like logging. Ideally that’s handled outside of the function that might be doing a data transformation, but how do you build a log message that outputs information about old and new values without contaminating your “pure” transducer?

You could I guess have a “before” step that iterates your data stream and logs all the before values, and then an “after” step that iterates after and logs all the after and get something like:

  (->> (map log-before data)
       (map transform-data)
       (map log-after-data))

But doesn’t that cause you to iterate your data 2x more times than you “need” to and also split your logging into 2x as many statements (and thus 2x as much IO)?

So, do you mean like you have some big array, and you want to do something like this? (Below is not a real programming language.)

  for i in 0 to arr.len() {
      new_val = f(arr[i]);
      log("Changing {arr[i]} to {new_val}.\n");
      arr[i] = new_val;
  }
I haven't used Haskell in a long time, but here's a kind of pure way you might do it in that language, which I got after tinkering in the GHCi REPL for a bit. In Haskell, since you want to separate IO from pure logic as much as possible, functions that would do logging return instead a tuple of the log to print at the end, and the pure value. But because that's annoying and would require rewriting a lot of code manipulating tuples, there's a monad called the Writer monad which does it for you, and you extract it at the end with the `runWriter` function, which gives you back the tuple after you're done doing the computation you want to log.

You shouldn't use Text or String as the log type, because using the Writer involves appending a lot of strings, which is really inefficient. You should use a Text Builder, because it's efficient to append Builder types together, and because they become Text at the end, which is the string type you're supposed to use for Unicode text in Haskell.

So, this is it:

  import qualified Data.Text.Lazy as T
  import qualified Data.Text.Lazy.Builder as B
  import qualified Data.Text.Lazy.IO as TIO
  import Control.Monad.Writer
  
  mapWithLog :: (Traversable t, Show a, Show b) => (a -> b) -> t a -> Writer B.Builder (t b)
  mapWithLog f = mapM helper
    where
      helper x = do 
        let x' = f x
        tell (make x <> B.fromString " becomes " <> make x' <> B.fromString ". ")
        pure x'
      make x = B.fromString (show x)

  theActualIOFunction list = do
    let (newList, logBuilder) = runWriter (mapWithLog negate list)
    let log = B.toLazyText logBuilder
    TIO.putStrLn log
    -- do something with the new list...
So "theActualIOFunction [1,2,3]" would print:

  1 becomes -1. 2 becomes -2. 3 becomes -3.
And then it does something with the new list, which has been negated now.
Does this imply that the logging doesn't happen until all the items have been processed though? If I'm processing a list of 10M items, I have to store up 10M*${num log statements} messages until the whole thing is done?
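One way around buffering, sketched in Python rather than Haskell and using a different technique from the Writer example above: produce each new value together with its log message lazily, and let the impure caller emit messages as it consumes them. The names `map_with_log` and `run` are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def map_with_log(f: Callable[[A], B], items: Iterable[A]) -> Iterator[Tuple[B, str]]:
    """Lazily yield (new_value, log_message) pairs; performs no IO itself."""
    for old in items:
        new = f(old)
        yield new, f"{old!r} becomes {new!r}."


def run(items: Iterable[int], emit: Callable[[str], None] = print) -> List[int]:
    """Impure consumer: emits each message as soon as its item is processed."""
    results = []
    for new, message in map_with_log(lambda x: -x, items):
        emit(message)  # only one pending message exists at a time
        results.append(new)
    return results
```

With this shape the data is traversed once, and a 10M-item input never holds more than one unwritten log message. Whether the results list itself needs to be kept is a separate choice made by the caller.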
Now repeat it for every function where you want to log.

Now repeat this for every location where you want to log something because you're debugging

> You get: the knowledge that your function's output will only depend on its input.

> You pay: you gotta stop using those for-loops and [i]ndexes, and start using maps, folds, filters etc.

You're my type of guy. And literally none of my coworkers in the last 10 years were your type of guy. When they read this, they don't look at it in awe, but in horror. For them, functions should be allowed to have side effects, and for loops are a basic thing they see no good reason to abandon.

Statistically, most of one's coworkers will never have looked at a functional language or used one to write actual code, so it is understandable they don't get it. What makes me sad is the apparent unwillingness to learn such a thing and sticking with "everything must be OOP" even in situations where it would be (with a little practice and knowledge in functional languages) simple to make it purely functional and make testing and parallelization trivial.
> Statistically, most of one's coworkers will never have looked at a functional language or used one to write actual code, so it is understandable they don't get it.

I'm not against functional languages. My point was that if you want to encourage others to try it, those two are not what you want to lead with.

But that's the irony of it, they did abandon the for-loops!

Maps and folds and filters are everywhere now. Why? Because 'functional is good!' ... but why is functional good?

> I don't want functional-flavoured programming, I want functional programming.

> you gotta stop using those for-loops and [i]ndexes, and start using maps, folds, filters etc.

You mean what C# literally does everywhere because Enumerable is the premier weapon of choice in the language, and has a huge amount of exactly what you want: https://learn.microsoft.com/en-us/dotnet/api/system.linq.enu...

(well, with the only exception of foreach, which for some odd reason is still a loop).

> But 5 years after that

Since .net 3.5 18 years ago: https://learn.microsoft.com/en-us/dotnet/api/system.linq.enu...

> So we paid the price, but didn't get the reward.

Who is "we", what was the price, and what was the imagined reward?

> Who is "we", what was the price, and what was the imagined reward?

Slow down and re-read.

>> You get: the knowledge that your function's output will only depend on its input.

>> You pay: you gotta stop using those for-loops and [i]ndexes, and start using maps, folds, filters etc.

Still makes no sense. Once again: who paid, what was the price, what was the expected reward?
Those starred rhetorical questions initially looked to me like a critique of Lisp! Because that's how Lisp (particularly Common Lisp) works. All those things are softish. You can see unexported symbols even if you're not supposed to use them. There is no actual privacy unless you do something special like unintern then recreate a symbol.
> you (and the compiler) will know and agree upon whether the same input will yield the same output

What exactly does this mean? Haskell has plenty of non-deterministic functions — everything involving IO, for instance. I know that IO is non-deterministic, but how is that expressed within the language?

Functions which use IO are tagged as such in the type system. IO can call non-IO, but not vice-versa.
Not even the most fanatical functional programming zealots would claim that programs can be 100% functional. By definition, a program requires inputs and outputs, otherwise there is literally no reason to run it.

Functional programming simply says: separate the IO from the computation.

> Pretty much anything I've written over the last 30 years, the main purpose was to do I/O, it doesn't matter whether it's disk, network, or display.

Every useful program ever written takes inputs and produces outputs. The interesting part is what you actually do in the middle to transforms inputs -> outputs. And that can be entirely functional.

> Every useful program ever written takes inputs and produces outputs. The interesting part is what you actually do in the middle to transforms inputs -> outputs. And that can be entirely functional.

My work needs pseudorandom numbers throughout the big middle, for example, drawing samples from probability distributions and running randomized algorithms. That's pretty messy in a FP setting, particularly when the PRNGs get generated within deeply nested libraries.

At what point does this get messy?
When deeply nested libraries generate PRNGs, all that layering becomes impure and must be treated like any other stateful or IO code. In Haskell, that typically means living with a monad transformer or effect system managing the whole stack, and relatively little pure code remains.

The messiness gets worse when libraries use different conventions to manage their PRNG statefulness. This is a non-issue in most languages but a mess in a 100% pure setting.

What I don't understand about your comment is: Where do these "deeply nested libraries" come from? I use one library or even std library and pass the RNG along in function arguments or as a parameter. Why would there be "deeply nested" libraries? Is it like that in Haskell or something? Perhaps we are using different definitions of "library"?
It's not that bad if you're using a splittable RNG, is it? Any function that (transitively) depends on an RNG needs an extra input, but that's it.
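A small sketch of the "extra input" idea in Python. The `split` helper here is only a stand-in for a real splittable RNG (it derives child generators from the parent's output), and all function names are hypothetical. Note that `random.Random` objects are mutable, so this shows explicit threading of generators, not purity in the Haskell sense.

```python
import random
from typing import List, Tuple


def split(rng: random.Random) -> Tuple[random.Random, random.Random]:
    """Derive two child generators from a parent (a simple stand-in for a splittable RNG)."""
    return random.Random(rng.getrandbits(64)), random.Random(rng.getrandbits(64))


def sample_noise(rng: random.Random, n: int) -> List[float]:
    """Draw n standard-normal samples from the generator passed in."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def pick_index(rng: random.Random, size: int) -> int:
    """Pick an index in [0, size) from the generator passed in."""
    return rng.randrange(size)


def simulate(seed: int) -> Tuple[List[float], int]:
    """Deterministic in its seed: each sub-computation gets its own generator."""
    noise_rng, index_rng = split(random.Random(seed))
    return sample_noise(noise_rng, 3), pick_index(index_rng, 10)
```

Because every function that needs randomness takes its generator as an argument, the same seed reproduces the same result, and changing how many samples `sample_noise` draws does not disturb what `pick_index` sees.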
>Not even the most fanatical functional programming zealots would claim that programs can be 100% functional. By definition, a program requires inputs and outputs, otherwise there is literally no reason to run it.

So a program is a function that transforms the input to the output.

>separate the IO from the computation.

What about managing state? I think that is an important part and it's easy to mess it up.

Each step calculates the next state and returns it. You can then compose those state calculators. If you need to save the state that’s IO and you have a bit specifically for it.
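A minimal sketch of "each step calculates the next state and returns it", assuming a toy account as the state. `Account`, `deposit`, `withdraw`, and `run_steps` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, replace
from functools import reduce
from typing import Callable, Iterable


@dataclass(frozen=True)
class Account:  # hypothetical immutable state
    balance: int = 0
    rejected: int = 0


Step = Callable[[Account], Account]


def deposit(amount: int) -> Step:
    """Build a step that returns a new state with a higher balance."""
    return lambda acct: replace(acct, balance=acct.balance + amount)


def withdraw(amount: int) -> Step:
    """Build a step that either lowers the balance or records a rejection."""
    def step(acct: Account) -> Account:
        if amount > acct.balance:
            return replace(acct, rejected=acct.rejected + 1)
        return replace(acct, balance=acct.balance - amount)
    return step


def run_steps(initial: Account, steps: Iterable[Step]) -> Account:
    """Compose state calculators by feeding each result into the next step."""
    return reduce(lambda acct, step: step(acct), steps, initial)
```

Nothing is mutated along the way: the initial `Account` is still intact afterwards, and persisting the final state would be a separate IO step outside `run_steps`.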
It takes a bit of discipline, but generally all state additions should be scoped to the current context. Meaning, when you enter a subcontext, it has become input and treated as holy, and when you leave to the parent context, only the result matters.

But that particular context has become impure and decried as such in the documentation, so that carefulness is increased when interacting with it.

> separate the IO from the computation.

Can you please elaborate on this point? I read it as this web page (https://wiki.c2.com/?SeparateIoFromCalculation) describes, but I fail to see why it is a functional programming concept.

> but I fail to see why it is a functional programming concept.

"Functional programming" means that you primarily use functions (not C functions, but mathematical pure functions) to solve your problems.

This means you won't do IO in your computation because you can't do that. It also means you won't modify data, because you can't do that either. Also you might have access to first class functions, and can pass them around as values.

If you do procedural programming in C++ but your functions don't do IO or modify (not local) values, then congrats, you're doing functional programming.

Thanks. I now see why it makes sense to me. I work in DE so in most of our cases we do streaming (IO) without any transformation (computation), and then we do transformation in a totally different pipeline. We never transform anything we consumed, always keep the original copy, even if it's bad.
> I fail to see why it is a functional programming concept.

Excellent! You will encounter 0 friction in using an FP then.

To the extent that programmers find friction using Haskell, it's usually because their computations unintentionally update the state of the world, and the compiler tells them off for it.

Think about this: if a function calls another function that produces a side effect, both functions become impure (non-functional). Simply separating them isn't enough. That's the difference when thinking of it in functional terms

Normally what functional programmers will do is pull their state and side effects up as high as they can so that most of their program is functional

Having functions which do nothing but computation is core functional programming. I/O should be delegated to the edges of your program, where it is necessary.
> The interesting part is what you actually do in the middle to transforms inputs -> outputs.

Can you actually name something? The only thing I can come up with is working with interesting algorithms or datastructures, but that kind of fundamental work is very rare in my experience. Even if you do, it's quite often a very small part of the entire project.

A whole web app. The IO is generally user-facing network connections (request and response), IPC and RPC (databases, other services), and file interaction. Anything else is logic. An FP program is a collection of pipes, and IO is the endpoints. With FP the blob of data passes cleanly from one section to another, while in imperative code some of it sticks. In OOP, there are a lot of blobs that fling stuff at each other and in the process create more blobs.
A general "web app"'s germane parts are:

- The part that receives the connection

- The part that sends back a response

- Interacting with other unspecified systems through IPC, RPC or whatever (databases mainly)

The shit in between, calculating a derivative or setting up a fancy data structure of some kind or something, is interesting but how much of that do we actually do as programmers? I'm not being obtuse - intentionally anyway - I'm actually curious what interesting things functional programmers do because I'm not seeing much of it.

Edit: my point is, you say "Anything else is logic." to which I respond "What's left?"

> calculating a derivative or setting up a fancy data structure of some kind or something, is interesting but how much of that do we actually do as programmers?

A LOT, depending on the domain. There are many R&D and HPC labs throughout the US in which programmers work directly with specialists in the hard sciences. A significant percentage of their work is akin to "calculating a derivative".

There's lots left!

"When a customer in our East Coast location makes this purchase then we apply this rate, blah blah blah".

"When someone with >X karma visits HN they get downvote buttons on comments, blah blah blah".

Yes! In most projects, those requirements are stretched across technicalities like IO. But you can pull them back to the core of your project. It takes effort, but the end result is a pleasure to work with. It can be done with FP, OOP, LP,…
> Even if you do, it's quite often a very small part of the entire project.

So your projects are only moving bits from one place to another? I've literally never seen that in 20 years of programming professionally. Even network systems that are seen as "dumb pipes" need to parse and interpret packet headers, apply validation rules, maintain BGP routing tables, add their own headers etc.

Surely the program calculates something, otherwise why would you need to run the program at all if the output is just a copy of the input?

Yes and I notice you still did not provide an interesting example. Surely parsing packets is not an interesting example of functional programming's powers?

What interesting things do you do as a programmer, really?

> parse and interpret packet headers, apply validation rules, maintain BGP routing tables, add their own headers etc.

That's a few more than zero. I don't do network programming, that was just an example to show how even the quintessential IO-heavy application requires non-trivial calculations internally.

Fair enough. It's just that in my experience the "cool bits" are quickly done and then we get bogged down in endless layers of inter-systems communication (HTTP, RPC, file systems, caches). I often see FP people saying stuff like "it's not 100% pure, of course there are some isolated side-effects" and I'm thinking.. my brother, I live inside side-effects. The days I can have even a few pure functions are few and far between. I'm honestly curious what percentage of your code bases can be this pure.

But of course this heavily depends on the domain you are working in. Some people work in simulation or physics or whatever and that's where the interesting bits begin. (Even then I'm thinking "programming" is not the interesting bit, it's the physics)

It's a matter of framing. Think of any of the following:

- Refreshing daily "points" in some mobile app (handling the clock running backward, network connectivity lapses, ...)

- Deciding whether to send a marketing e-mail (have you been unsubscribed, how recently did you send one, have you sent the same one, should you fail open or closed, is this person receptive to marketing, ...)

- How do you represent a person's name and transform it into the things your system needs (different name fields, capitalization rules, max characters, what if you try to put it on an envelope and it doesn't fit, ...)

- Authorization logic (it's not enough to "just use a framework" no matter your programming style; you'll still have important business logic about who can access what when and how the whole thing works together)

And so on. Everything you're doing is mapping inputs to outputs, and it's important that you at least get it kind of close to correct. Some people think functional programming helps with that.
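To make the marketing e-mail item from the list above concrete, here is a sketch of that decision as a pure function. The `Recipient` fields, the 7-day minimum gap, and the specific rules are all assumptions invented for illustration, not anyone's real policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Recipient:  # hypothetical view of what the system knows about a person
    unsubscribed: bool
    last_sent_at: Optional[datetime]
    last_campaign: Optional[str]


def should_send(recipient: Recipient, campaign: str, now: datetime,
                min_gap: timedelta = timedelta(days=7)) -> bool:
    """Pure decision: every fact it needs arrives as an argument."""
    if recipient.unsubscribed:
        return False
    if recipient.last_campaign == campaign:
        return False  # never send the same campaign twice in a row
    if recipient.last_sent_at is not None and now - recipient.last_sent_at < min_gap:
        return False  # too soon after the previous e-mail
    return True
```

Loading the recipient and actually sending the mail stay outside this function, so each business rule can be checked with a one-line test.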

When I see this list all I can think of is how all these things are just generic, abstract rules and have nothing to do with programming. This, of course, is my problem. I have a strange mental model of things.

I can't shake off the feeling we should be defining some clean sort of "business algebra" that can be used to describe these kinds of notions in a proper closed form and can then be used to derive or generate the actual code in whatever paradigm you need. What we call code feels like a distraction.

I am wrong and strange. But thanks for the list, it's helpful and I see FP's points.

You're maybe strange (probably not, when restricted to people interested in code), but wrongness hasn't been proven yet.

I'd push back, slightly, in that you need to encode those abstract rules _somehow_, and in any modern parlance that "somehow" would be a programming language, even if it looks very different from what we're used to.

From the FP side of things, they'd tend to agree with you. The point is that these really are generic, abstract rules, and we should _just_ encode the rules and not the other state mutations and whatnot that also gets bundled in.

That implicitly assumes a certain rule representation though -- one which takes in data and outputs data. It's perfectly possible, in theory, to describe constraints instead. Looking at the example of daily scheduling in the presence of the clock running backward; you can define that in terms of inputs and outputs, or you can say that the desired result satisfies (a) never less than the wall clock, (b) never decreases, (c) is the minimal such solution. Whether that's right or not is another story (it probably isn't, by itself -- lots of mobile games have bugs like that allowing you to skip ads or payment forever), but it's an interesting avenue for exploration given that those rules can be understood completely orthogonally and are the business rules we _actually_ care about, whereas the FP, OOP, and imperative versions must be holistically analyzed to ensure they satisfy business rules which are never actually written down in code.
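For the clock example, the three constraints happen to pin down a well-known function: the smallest sequence that is never below the wall clock and never decreases is the running maximum of the readings. A sketch, with hypothetical function names; `satisfies_rules` checks constraints (a) and (b) directly and does not check minimality (c).

```python
from itertools import accumulate
from typing import Iterable, List


def effective_times(wall_clock_readings: Iterable[float]) -> List[float]:
    """Smallest sequence that is >= each reading and never decreases: the running maximum."""
    return list(accumulate(wall_clock_readings, max))


def satisfies_rules(readings: List[float], result: List[float]) -> bool:
    """Check (a) never less than the wall clock and (b) never decreases."""
    not_behind = all(r >= w for r, w in zip(result, readings))
    monotonic = all(b >= a for a, b in zip(result, result[1:]))
    return len(result) == len(readings) and not_behind and monotonic
```

The checker is written independently of the implementation, which is roughly the appeal of the constraint view: the rules can be read and tested on their own, whatever code ends up producing the result.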

I agree.

Especially when reading Rust or C++.

That's code I would prefer to have generated for me as needed in many cases, I'm generally not that interested in manually filling in all the details.

Whatever it is, it hasn't been created yet.

You can name almost anything (these are general-purpose languages, after all), but I'll just throw a couple of things out there:

1. A compiler. The actual algorithms and datastructures might not be all that interesting (or they might be if you're really interested in that sort of thing), but the kinds of transformations you're doing from stage to stage are sophisticated.

2. An analytics pipeline. If you're working in the Spark/Scala world, you're writing high-level functional code that represents the transformation of data from input to output, and the framework is compiling it into a distributed program that loads your data across a cluster of nodes, executes the necessary transformations, and assembles the results. In this case there is a ton of stateful I/O involved, all interleaved with your code, but the framework abstracts it away from you.

Thanks, especially two is very interesting. Admittedly the framework itself is the actually interesting part and that's what I meant with this work being "rare" (I mean how many people work on those kinds of frameworks fulltime? It's not zero, but..)

I think what I engaged with is the notion that most programming "has some side-effects" ("it's not 100% pure"), but much of what I see is like 95% side-effects with some cool, interesting bits stuffed in between the endless layers of communication (without which the "interesting" stuff won't be worth shit).

I feel FP is very, very cool if you got yourself isolated in one of those interesting layers but I feel that's a rare place to be.

Yeah, it's not that e.g. Haskell won't allow side effects, it's that side effects are constrained: 1) all the side-effectful operations have types that forbid you from using them outside of a side-effect context; 2) and it's a good thing they do, because Haskell's laziness means the results you would get otherwise are counterintuitive.

Other FP frameworks are far less strict about such things, and many FP features are now firmly in the mainstream. So no, I don't think this stuff is particularly rare, though Haskell/OCaml systems probably still are. There are pluses and minuses with structuring code in a pure-core-with-side-effect-shell way – FP enthusiasts tend to think the pluses outweigh the minuses.

Best, I think, to view FP not as dogma or as a class of FP-only languages, but rather as a paradigm first, a set of languages second.

It's always hard to parse if people mean functional programming when bringing up Lisp. Common Lisp certainly is anything but a functional language. Sure, you have first order functions, but you in a way have that in pretty much all programming languages (including C!).

But most functions in Common Lisp do mutate things, there is an extensive OO system and the most hideous macros like LOOP.

I certainly never felt constrained writing Common Lisp.

That said, there are pretty effective patterns for dealing with IO that allow you to stay in a mostly functional / compositional flow (dare I say monads? but that sounds way more clever than it is in practice).

> It's always hard to parse if people mean functional programming when bringing up Lisp. Common Lisp certainly is anything but a functional language. Sure, you have first order functions, but you in a way have that in pretty much all programming languages (including C!).

It's less about what the language "allows" you to do and more about what the ecosystem and libraries "encourage" you to do.

Any useful program has side-effects. IMHO the point is to isolate the part of the code that has the side-effects as much as possible, and keep the rest purely functional. That makes it easier to debug, test, and create good abstractions. Long term it is a very good approach.
> Pretty much anything I've written over the last 30 years, the main purpose was to do I/O, it doesn't matter whether it's disk, network, or display.

Erlang is a strictly (?) functional language, and the reason it was invented was to do network-y stuff in the telco space. So I'm not sure why I/O and functional programming would be opposed to each other like you imply.

> Erlang is a strictly (?) a functional language,

First and foremost Erlang is a pragmatic programming language :)

This is discussing Common Lisp which is not even a mostly-functional language, and far from purely functional.
He says Lisp, rather than Common Lisp. Sure, given the context he's writing in now, maybe he means Common Lisp, but Joe Marshall was a Lisp programmer before Common Lisp existed, so he may not mean Common Lisp specifically.
Somehow Haskell and friends shifted the discussion around functional programming to pure vs non-pure! I am pretty sure it started with functions as first-class objects as the differentiator in Schemes, Lisps and ML-family languages. Thus functional, but that's just a guess.
> Somehow haskell and friends shifted the discussion around functional programming to pure vs non-pure

In direct response to every other language in the mid-2010s saying, "Look, we're functional too, we can pass functions to other functions, see?"

  foo.bar()
     .map(x => fireTheMissiles())
     .collect();
C's had that forever:

  void qsort(void *base, size_t nmemb, size_t size,
             int (*compar)(const void *, const void *));
In a way this is true.

A function pointer is already half way there. What it lacks is lexical environment capture.

And things that are possible to do with closures never stop amazing me.

Anyways, functional programming is not about purity. It is something that came from academia, with 2 major language families: ML-likes and Lisp-likes, each focusing on certain key features.

And purity is not even the key feature of MLs in general.

Closures bring me joy.
They are one of those language features that, having learned them, it's a little hard to flip my brain around into the world I knew before I learned them.

If I think hard, I can sort of remember how I used to do things before I worked almost exclusively in languages that natively support closures ("Let's see... I create a state object, and it copies or retains reference to all the relevant variables... and for convenience I put my function pointer in there too usually... But I still need rules for disposing the state when I'm done with it..." It's so much nicer when the language handles all of that bookkeeping for you and auto-generates those state constructs).
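To make that bookkeeping concrete, here is a rough sketch in Python of the two styles side by side. All names (`CounterState`, `counter_call`, `make_counter`) are made up for illustration; the first half imitates the "state object plus plain function" approach, the second lets the language capture the variables.

```python
from typing import Callable


class CounterState:
    """Hand-rolled 'closure': the captured variables live in an explicit object."""

    def __init__(self, step: int) -> None:
        self.step = step
        self.total = 0


def counter_call(state: CounterState) -> int:
    # A plain function (think: C function pointer) that needs its state passed in.
    state.total += state.step
    return state.total


def make_counter(step: int) -> Callable[[], int]:
    total = 0

    def call() -> int:
        # The language captures `step` and `total` for us.
        nonlocal total
        total += step
        return total

    return call


# Manual style: the caller has to carry the state object around.
state = CounterState(step=2)
manual = [counter_call(state), counter_call(state)]

# Closure style: the state travels with the function.
counter = make_counter(2)
closed = [counter(), counter()]
```

Both versions compute the same thing; the difference is only in who keeps track of the captured state and its lifetime.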

No, functions aren't first class in C. When you use a function in an expression, the function designator is implicitly converted to (it "decays" to) a pointer to the function. You can only call, store, etc. function pointers, not functions. Function pointers are first class. Functions are not, as you can't create them at runtime.

A functional programming language is one with first class functions.

What is the impact on the user of having first class functions vs first class function pointers?

Last I checked, when you implement lambda in Lisp it's also a pointer to the lambda internally.

Function pointers can’t close over variables.
I wrote a recursive descent parser in Lisp for a YAML replacement language[1]. It wasn't difficult. Lisp makes it easy to write I/O, but also easy to separate logic from I/O. This made it easy for me to write unit tests without mocking.

I also wrote a toy resource scheduler at an HTTP endpoint in Haskell[2]. Writing I/O in Haskell was a learning curve but was ultimately fine. Keeping logic separate from I/O was the easy thing to do.

1: https://github.com/djha-skin/nrdl

2: https://github.com/djha-skin/lighthouse

It's about minimizing and isolating state and side effects, not eliminating them completely.

Functional core, imperative shell is a common pattern. Keep the side effects on the outside. Instead of doing side effects directly, just return a data structure that can be used to enact the side effect.
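A minimal sketch of that idea in Python, under the assumption that two effect types are enough for the example. `WriteFile`, `Log`, `plan_report` and `run` are hypothetical names, not from any library: the core only returns descriptions of effects, and a small shell is the one place that performs them.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class WriteFile:
    path: str
    text: str


@dataclass(frozen=True)
class Log:
    message: str


Effect = Union[WriteFile, Log]


def plan_report(scores: Dict[str, int]) -> List[Effect]:
    """Pure core: decides what should happen, performs nothing."""
    lines = [f"{name}: {score}" for name, score in sorted(scores.items())]
    return [
        WriteFile(path="report.txt", text="\n".join(lines)),
        Log(message=f"wrote {len(lines)} lines"),
    ]


def run(effects: List[Effect]) -> None:
    """Impure shell: the only place that touches the outside world."""
    for effect in effects:
        if isinstance(effect, WriteFile):
            with open(effect.path, "w", encoding="utf-8") as handle:
                handle.write(effect.text)
        elif isinstance(effect, Log):
            print(effect.message)
```

Testing `plan_report` then needs no mocks: you compare the returned list against the effects you expected, and only `run(plan_report(...))` in the real entry point touches the disk.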

As others have said, a pure program is a useless program. The only place stuff like that has in this world is as a proof assistant.

What I will add is: look up how the GHC runtime works, and the STG machine (the Spineless Tagless G-machine). You may find it extremely interesting. I didn't "get" functional programming until I found out how exotic efficient execution of functional programs ends up being.

"Purely Functional Programming", I guess mostly Haskell/Purescript.

So this really only means:

Purely Functional Programming by default.

In most programming languages you can write

"hello " + readLine()

And this would intermix a pure function (string concatenation) and an impure effect (asking the user to write some text). And this would work perfectly.

By doing so, the order of evaluation becomes essential.
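A small Python sketch of why the order matters once effects sit inside expressions. `make_reader` is a made-up stand-in for `readLine()` that hands out canned lines, so the example runs without a terminal.

```python
from typing import Callable, List


def make_reader(lines: List[str]) -> Callable[[], str]:
    """Stand-in for readLine(): each call consumes the next line."""
    remaining = iter(lines)
    return lambda: next(remaining)


read_line = make_reader(["Ada", "Lovelace"])
# Python evaluates operands left to right, so the first call gets "Ada".
left_to_right = read_line() + " " + read_line()

read_line = make_reader(["Ada", "Lovelace"])
a = read_line()  # consumes "Ada"
b = read_line()  # consumes "Lovelace"
# Same two calls, but the results are used in the opposite order.
swapped = b + " " + a
```

With pure operands, reordering the two sides of the concatenation could not change which value each side produces; with the effectful reader it does, which is the dependency on evaluation order described above.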

With purely functional programming (by default), you must explicitly separate the part of your program doing I/O from the part doing only pure computation. And this is enforced using a type system focusing on I/O. Thus the difference between Haskell's default `IO` and OCaml, which does not need it, for example.

In Haskell you are forced by the type system to write something like:

    do 
      name <- getLine
      let s = "Hello " <> name <> "!"
      putStrLn s
You cannot mix the `getLine` directly in the middle of the concatenation operation.

But while this is a very different style of programming, I/O is just more explicit, and it "costs" more, because code with I/O is not as elegant or as easy to manipulate as pure code. Thus it naturally induces a way of coding that makes you conscious of which parts of your program need IO and which parts you could write with only pure functions.

In practice, ... yep, you end up working in a "Specific to your application domain" Monad that looks a lot like the IO Monad, but will most often contain IO.

Another option is to use a free monad for your entire program, which lets you write in your own domain language and control its evaluation (either using IO or another system that simulates IO but is not really IO, typically for testing purposes).
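For readers who don't write Haskell, here is a loose imitation of that idea using a Python generator. It is not a real free monad, just the same shape: the program yields instructions in a tiny domain language and an interpreter decides what they mean. `Ask`, `Tell`, `greet`, `run_console` and `run_simulated` are all invented for this sketch.

```python
from dataclasses import dataclass
from typing import Generator, List, Optional, Union


@dataclass(frozen=True)
class Ask:
    prompt: str


@dataclass(frozen=True)
class Tell:
    text: str


Instruction = Union[Ask, Tell]
Program = Generator[Instruction, Optional[str], None]


def greet() -> Program:
    """The program: yields instructions, never performs I/O itself."""
    name = yield Ask("Name? ")
    yield Tell(f"Hello {name}!")


def run_console(program: Program) -> None:
    """Interpreter backed by real I/O."""
    reply: Optional[str] = None
    try:
        while True:
            step = program.send(reply)
            if isinstance(step, Ask):
                reply = input(step.prompt)
            else:
                print(step.text)
                reply = None
    except StopIteration:
        pass  # the program has finished


def run_simulated(program: Program, answers: List[str]) -> List[str]:
    """Interpreter that simulates I/O, e.g. for tests."""
    pending = iter(answers)
    output: List[str] = []
    reply: Optional[str] = None
    try:
        while True:
            step = program.send(reply)
            if isinstance(step, Ask):
                reply = next(pending)
            else:
                output.append(step.text)
                reply = None
    except StopIteration:
        pass  # the program has finished (or the canned answers ran out)
    return output
```

`run_console(greet())` talks to the terminal, while `run_simulated(greet(), ["Ada"])` runs the same program against canned answers and returns what it would have printed.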

This is a good write-up. Thank you for it. My experience with Haskell only comes from university, and the focus there was primarily on the pure side of the code. I'll have a look at how Haskell deals with the pure/impure split for some real-world tasks. The Lisp way of doing it just seems weird to me, too ad hoc, not really structured.
OpenSCAD is such a good school of functional programming. There is no "time" or flow of execution. Or variables, scopes and environments. You are not constructing a program, but a static model which has desired properties in space and time.
The point of functional, "sans I/O" style is to separate the definition of I/O from the rest of your logic. You're still doing I/O, but what sorts of I/O you're doing has a clear self-contained definition within your program. https://sans-io.readthedocs.io/how-to-sans-io.html
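A tiny sketch of what that can look like, assuming a made-up newline-delimited protocol. `LineProtocol` and `receive_data` are hypothetical names; the object still keeps a buffer, but it never touches a socket, so the caller decides where the bytes come from.

```python
from typing import List


class LineProtocol:
    """Sans-I/O parser: bytes in, complete lines out, no sockets involved."""

    def __init__(self) -> None:
        self._buffer = b""

    def receive_data(self, data: bytes) -> List[bytes]:
        # Append the new chunk, hand back every complete line,
        # and keep the trailing partial line for the next call.
        self._buffer += data
        *complete, self._buffer = self._buffer.split(b"\n")
        return complete


protocol = LineProtocol()
first = protocol.receive_data(b"PING\nPO")
second = protocol.receive_data(b"NG\n")
```

The same class can be fed from a blocking socket, an async transport, or a list of byte strings in a test, because the I/O lives entirely in the caller.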
There is no reason you can't use side effects in pure functional programming. You just need to provide the appropriate description of the side effect to avoid caching and force a particular evaluation order. If you have linear types, you do it by passing around opaque tokens. I'm not entirely sure how IO works in Haskell, but I think the implementation is similar. Even C compilers use a system like that internally.
The boundary between the program and the rest of the system allows I/O of course. What FP does is "virtualize" I/O by representing it as data (thus it can be passed around). Then at some point these changes get "committed" to the outside. Representing I/O separately from how it is carried out allows a lot of things to be done, such as cancelling (ctrl+z) operations.
Everyone writes real programs that have side effects. Functional programming is no different. But the side effects happen in specific ways or places, rather than all over the place.
Most of the code in most programs is not the part that is doing the I/O. It's doing stuff on a set of values to transform them. It gets values from somewhere, does stuff using those values, and then outputs some values. The complicated part is not the transfer of the final byte sequence to whatever I/O interface they go to, the core behavior of the program is the stuff that happens before that.
There are ways to handle side effects with pure functions only (it’s kind of cheating, because the actual side effects are performed by the non-pure runtime/framework that’s abstracted away, while the pure user code just defines when to perform them and how to respond to them). It’s possible, but it gets very awkward very fast. I wouldn’t use FP for any part of the code that deals with IO.
That was the question, what would you use FP for?
I had the same question until I understood one key pattern of pure functional programming. Not sure it has a name but here goes.

There is the world, and there is a model of the world - your program. The point of the program, and all functions, is to interact with the model. This part, data structures and all, is pure.

The world interacts with the model through an IO layer, as in Haskell.

Purity is just an enforcement of this separation.

I think it's the imperative shell, functional core. The shell provides the world, the core acts on it, then the shell commits it at various intervals.

Functional React follows this pattern. The issue is when the programmer thinks the world is some kind of stable state that you can store results in. It’s not; the whole point is for it to be created anew, restarting the whole computation flow. The escape hatches are the hooks. Each has a specific usage and pattern to follow to survive world recreation, which is why you should be careful with them, as they are effectively the world for subcomponents. So when you add to the world with hooks, interactions with the addition should stay at the same level.

> Whenever I hear someone talking about purely functional programming, no side effects, I wonder what kind of programs they are writing

Where have you ever heard anyone talk about side-effect free programs, outside of academic exercises? The linked post certainly isn't about 100% side-effect/state free code.

Usually, people talk about minimizing side-effects as much as possible, but since we build programs to do something, sometimes connected to the real world, it's basically impossible to build a program that is both useful and 100% side-effect free, as you wouldn't be able to print anything to the screen, or communicate with other programs.

And minimizing side-effects (and minimizing state overall) has a real impact on how easy it is to reason about the program. Being really careful about where you mutate things leads to most of the code being very explicit about what it's doing, and code only affects data that is close to where the code itself is, compared to intertwined state mutation, where things everywhere in the codebase can affect state anywhere.

The pragmatic approach is to see that FP's key point is statelessness and use that in your code (written in more mainstream languages) when appropriate.
I never believed any FP evangelist ever since I realized I can't even write quicksort with it *.

(* Yes, you can technically write it procedurally like a good C programmer, sure.)
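For reference, the version usually shown in FP tutorials looks roughly like this in Python. It is a sketch of the list-building variant, not the in-place partitioning algorithm, which is presumably the objection here: it allocates new lists at every level instead of swapping elements inside one array.

```python
from typing import List


def quicksort(items: List[int]) -> List[int]:
    """Returns a new sorted list; the input is never mutated."""
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)
```

Whether that still counts as "quicksort" without the in-place partition step is the part people argue about.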