by weavejester 4241 days ago

I realised I might not have been very clear in my previous comment. Let me see if I can improve it with an example.

Let's forget about all other considerations and instead consider the simplest possible build system we can conceive. This build system should take a directory structure of source files, and produce a directory structure of output files.

If our sole consideration is simplicity, we might construct a build system like:

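    ;; read-dir, run-task, write-dir and cwd are assumed helpers, not defined here.
    (defn -main [task & args]
      (-> (read-dir (cwd))        ; source tree -> in-memory data structure
          (run-task task args)    ; pure transformation: files in, files out
          (write-dir (cwd))))     ; write the resulting structure back to disk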

So we take every file in the current working directory, read everything into memory, perform some functional transformation that produces a data structure of output files, then write that to disk. This minimises I/O, and gives us a functional data structure to play around with.
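To make this concrete, here is a minimal sketch of how those helpers might look. The names `read-dir`, `run-task` and `write-dir` come from the snippet above, but their bodies, the map of relative path to contents, and the `"upcase"` example task are assumptions for illustration only. It treats every file as text for brevity.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn read-dir
  "Read every regular file under root into a map of relative path -> contents."
  [root]
  (let [root-file (.getCanonicalFile (io/file root))
        ^java.nio.file.Path root-path (.toPath root-file)]
    (into {}
          (for [^java.io.File f (file-seq root-file)
                :when (.isFile f)]
            [(str (.relativize root-path (.toPath f))) (slurp f)]))))

(defn write-dir
  "Write a map of relative path -> contents under root, creating directories."
  [files root]
  (doseq [[path contents] files]
    (let [out (io/file root path)]
      (io/make-parents out)
      (spit out contents))))

(defmulti run-task
  "Pure function from a file map to a file map, dispatching on the task name."
  (fn [_files task _args] task))

;; Hypothetical example task: upper-case the contents of every file.
(defmethod run-task "upcase" [files _task _args]
  (into {} (for [[path contents] files]
             [path (str/upper-case contents)])))
```

The file map is the first argument to `run-task` and `write-dir` so that both slot into the `->` pipeline. Only `read-dir` and `write-dir` touch the disk; a task is an ordinary function on a map, so it can be tested without any files at all.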

It's a naive approach, and one made without regard for memory or efficiency, but given that the amount of memory on a modern machine is far larger than the source directory is likely to be, it actually seems feasible.

However, we can also consider optimisations that don't alter the behaviour. For instance, we could read a file only when its contents are accessed. To guard against the file changing between the start of the build and the read, we could check its modification time and abort if it has changed. It's a compromise, but a small one.
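One way the lazy variant might look is sketched below. Each file's contents are wrapped in a `delay`, so nothing is read until a task dereferences it, and the modification time recorded at scan time is compared with the current one at read time. The names `lazy-file` and `read-dir-lazy` are hypothetical, and the check only narrows the window for a race rather than closing it.

```clojure
(require '[clojure.java.io :as io])

(defn lazy-file
  "Return a delay that reads f on first deref, aborting if f's modification
  time differs from the one recorded when the delay was created."
  [^java.io.File f]
  (let [mtime (.lastModified f)]
    (delay
      (let [contents (slurp f)]
        (when (not= mtime (.lastModified f))
          (throw (ex-info "File changed during build"
                          {:file (.getPath f)})))
        contents))))

(defn read-dir-lazy
  "Map of relative path -> delay of contents for every file under root."
  [root]
  (let [root-file (.getCanonicalFile (io/file root))
        ^java.nio.file.Path root-path (.toPath root-file)]
    (into {}
          (for [^java.io.File f (file-seq root-file)
                :when (.isFile f)]
            [(str (.relativize root-path (.toPath f))) (lazy-file f)]))))
```

A task would then write `@(get files "some/path")` to force a read, and files it never dereferences are never opened.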

We might also conceive of a system where the contents of the file are memory mapped, or held in some temporary file, or any number of clever ways to avoid keeping the file in memory while not breaking the integrity of the data structure.
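As a rough sketch of the memory-mapped option, the JVM's `FileChannel.map` can stand in for an in-memory string. The helper names `mmap-file` and `buffer->string` are hypothetical, and decoding as UTF-8 is an assumption.

```clojure
(require '[clojure.java.io :as io])
(import '[java.nio.channels FileChannel FileChannel$MapMode]
        '[java.nio.file OpenOption StandardOpenOption]
        '[java.nio.charset StandardCharsets])

(defn mmap-file
  "Map the file at path into memory read-only. The mapping stays valid after
  the channel is closed, so the returned buffer can be kept in the file map."
  ^java.nio.ByteBuffer [path]
  (with-open [^FileChannel ch (FileChannel/open
                                (.toPath (io/file path))
                                (into-array OpenOption [StandardOpenOption/READ]))]
    (.map ch FileChannel$MapMode/READ_ONLY 0 (.size ch))))

(defn buffer->string
  "Decode a buffer as UTF-8. Works on a duplicate so the original buffer's
  position is untouched and it can be decoded again later."
  [^java.nio.ByteBuffer buf]
  (str (.decode StandardCharsets/UTF_8 (.duplicate buf))))
```

The operating system pages the contents in on demand, so the value in the file map is a handle rather than a copy. A task that wants text calls `buffer->string` on it.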

This is just a toy example, and it leaves out many things, such as network I/O. But it's easier to start simple and add complexity when necessary than it is to start from an assumption of complexity and try to work backward to simplicity. This is why I think it's incorrect to start with side-effectful functions: doing so means starting from complexity.