by barrkel 4151 days ago
OK. How do you easily fork to run a command in the background? How does setting up pipes work? What's the idiom for chdir'ing to a subdirectory such that you pop back out again when you're done (I'd use a subshell with (cd xxx; ...) in bash)?

Getting into more tricky stuff, what's the equivalent of <() in bash?

This doesn't really demonstrate anything that shell scripts are actually written for: orchestrating and composing other processes, and job control.

If you wanted to leverage type checking for safety, it would be more interesting to typecheck the streams input and output by pipes.


Did we read the same article? The entire 'streaming section' is about pipes and I/O redirects. Running a command in the background is just forkIO $ proc ..etc.., as in regular Haskell.

Nothing tricky about it.

The streaming section of the article has nothing about composing processes, that I could see; it appeared to be about treating the output of commands as input to Haskell lazy lists. I may have misread it, though.

Here's a pattern that comes up fairly frequently for me:

  foo | fgrep -v -f <(cut -f 2 info.csv) | bar
It uses the second column in info.csv as fixed strings to match inside lines in the output of foo, and filters them out, with the remaining lines going to bar.

All 4 processes (foo, bar, fgrep, cut) run concurrently. Likely fgrep will block on cut sooner or later, but the point is that multiple communicating concurrent processes are set up using a fairly easy to use DSL.
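For anyone who wants to try that pattern, here is a minimal runnable sketch. The stand-ins are my assumptions, not part of the original: `printf` plays the role of foo, `sort` plays bar, and info.csv is written as a tab-separated file because `cut -f 2` splits on tabs by default (a comma-separated file would need `-d,`). `grep -F` is the same thing as `fgrep`.

```bash
#!/usr/bin/env bash
# Hypothetical stand-ins: printf acts as `foo`, sort acts as `bar`.
set -euo pipefail

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Tab-separated sample data; column 2 holds the strings to filter out.
printf 'id1\tapple\nid2\tcherry\n' > "$tmp/info.csv"

# Drop every input line that contains a fixed string from column 2,
# then hand the surviving lines to the last stage of the pipeline.
filtered=$(
  printf 'banana pie\napple tart\ncherry cake\nlemon curd\n' |
    grep -F -v -f <(cut -f 2 "$tmp/info.csv") |
    sort
)

printf '%s\n' "$filtered"
```

Only the lines that mention neither "apple" nor "cherry" reach the final stage.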

That's what a shell is, to me.

There are two ways you can embed that within `turtle`. You can either embed each step as its own concurrent process, like this:

    -- Note, the flow is right-to-left, not left-to-right
    inshell "bar" (inshell "fgrep ..." (inshell "foo" empty))
Or you can just embed the entire thing within a single `inshell` command:

    inshell "foo | fgrep ... | bar"
The reason this works is that the type of `inshell` is:

    inshell
        :: Text        -- Command line
        -> Shell Text  -- Standard input to feed
        -> Shell Text  -- Standard output from command
This lets you stream to any shell command's input and read the command's output as a stream, too.
> What's the idiom for chdir'ing to a subdirectory such that you pop back out again when you're done

This can be done with the "bracket" function, which works roughly like a context manager in Python:

  import Control.Exception
  import System.Directory

  withDirectory :: FilePath -> IO a -> IO a
  withDirectory path action = bracket (getCurrentDirectory <* setCurrentDirectory path) 
                                      setCurrentDirectory 
                                      (const action)
> How do you easily fork to run a command in the background?

`turtle` provides `fork` for running a command in the background. Example usage:

    example = do
        using (fork commandToForkInAnotherThread)
        theseCommandsStillRunInTheOriginalThread
> How does setting up pipes work?

See the `inproc` and `inshell` commands, which let you convert any shell command into a stream transformation embedded within Haskell.

> What's the idiom for chdir'ing to a subdirectory such that you pop back out again when you're done (I'd use a subshell with (cd xxx; ...) in bash)?

You can write a combinator for this using `turtle` pretty easily:

    pushd newDir = do
        oldDir <- pwd
        cd newDir
        return (cd oldDir)
... and you use it like this:

    example = do
        popDir <- pushd "/tmp"
        ... do stuff ...
        popDir
> what's the equivalent of <() in bash?

That would be `inproc`/`inshell`, which let you read in a command's standard output as a stream.

FWIW:

<(foo) in bash creates a pipe (a /dev/fd entry on systems that support it, a named fifo otherwise), and pipes the output of foo to it. It then replaces the whole <(foo) argument with the path to that pipe. This means that commands that normally expect to read from a file on the command line can instead be wired to read their input from a process. And, of course, both processes run concurrently.

>(foo) does the same thing, except the other way around, for process output.
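You can see the substitution happen by echoing it. This is a small sketch, assuming bash; the exact path is system-dependent (typically something like /dev/fd/63 on Linux), so the script prints it rather than relying on a particular value.

```bash
#!/usr/bin/env bash
# The substitution is replaced by a path, so echo simply prints that path.
path=$(echo <(true))
printf 'expanded to: %s\n' "$path"

# diff expects two file names; here each "file" is the output of a process.
# diff exits non-zero when the inputs differ, hence the `|| true`.
diff_out=$(diff <(printf 'a\nb\n') <(printf 'a\nc\n') || true)
printf '%s\n' "$diff_out"
```

diff never knows it is reading from processes; it just opens the two paths it was given.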

There is a "createPipe" function in the "unix" package http://hackage.haskell.org/package/unix-2.7.1.0/docs/System-... that gets us half-way towards process substitution.

Unfortunately, I don't know how to get the name of the device file associated to the pipe, and I need it in order to pass it as an argument to the reading process :(

In my opinion it would be nice to write it as a wrapper-type thing:

    inDir d action = do
      oldDir <- pwd
      cd d
      result <- action
      cd oldDir
      return result
Then you could use it like a Python `with` statement. :)