by pnathan 5274 days ago
I have never been able to figure out how, in Python, to asynchronously stream both stdout and stderr from a subprocess, printing both of them as well as writing the data to a file.
3 comments

I'm using the mkfifo method on Linux and Mac OS X:

    import os
    import sys
    import time
    import subprocess

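    # turn off stdout buffering. otherwise we won't see things like wget progress-bars that update without newlines.
    # binary mode ('wb') is required for an unbuffered stream on python 3, and also works on python 2.
    sys.stdout = os.fdopen(sys.stdout.fileno(), 'wb', 0)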

    pipename = "tempfile"

    if os.path.exists(pipename):
        os.remove(pipename)

    # create a pipe. one side is connected to the ping process, other side is connected to python.
    os.mkfifo(pipename)
    read_fd = os.open(pipename, os.O_RDONLY|os.O_NONBLOCK)
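    # the read end is already open, so opening for writing does not block.
    # plain "w" is enough; "w+" fails on python 3 because a fifo is not seekable.
    writer = open(pipename, "w")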

    proc = subprocess.Popen("ping www.google.com", cwd=sys.path[0], stdout=writer, stderr=writer, shell=True)

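    while 1:
        # check for exit before reading, so trailing output is not lost.
        done = proc.poll() is not None
        try:
            # nonblocking poll data from the external process.
            s = os.read(read_fd, 1024)
        except OSError:
            # nothing to read right now.
            s = None
        if s:
            sys.stdout.write(s)
        elif done:
            break
        else:
            # sidenote: minimum sleep time is 1/64 seconds on many windows pc-s.
            time.sleep(0.1)

    # remove the pipe "tempfile" once the process has finished.
    writer.close()
    os.close(read_fd)
    os.remove(pipename)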
Replying to myself. Using mkfifo is not necessary:

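    import os, sys, time, subprocess, fcntl
    # 'wb' keeps the unbuffered stdout working on python 3 as well as python 2.
    sys.stdout = os.fdopen(sys.stdout.fileno(), 'wb', 0)
    read_fd, write_fd = os.pipe()
    fcntl.fcntl(read_fd, fcntl.F_SETFL, os.O_NONBLOCK) # don't know of any windows equivalent for this line
    proc = subprocess.Popen("ping www.google.com", cwd=sys.path[0], stdout=write_fd, stderr=write_fd, shell=True)
    while 1:
        # check for exit before reading, so trailing output is not lost.
        done = proc.poll() is not None
        try:
            s = os.read(read_fd, 1024)
        except OSError:
            s = None # nothing to read right now
        if s:
            sys.stdout.write(s)
        elif done:
            break
        else:
            time.sleep(0.1)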
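The snippets above only print; the original question also asked for the data to land in a file. Here is a rough Python 3 sketch of the same os.pipe polling idea that also appends every chunk to a log. It assumes Python 3.5+ on a Unix-like system (for os.set_blocking); the names run_with_pipe, log_path and poll_interval are made up for illustration. As in the snippet above, stdout and stderr are merged into one stream.

```python
import os
import subprocess
import sys
import time


def run_with_pipe(cmd, log_path, poll_interval=0.1):
    """Run cmd, echoing its merged stdout/stderr and appending it to log_path."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)  # same effect as the fcntl call above
    proc = subprocess.Popen(cmd, stdout=write_fd, stderr=write_fd)
    # The parent must close its copy of the write end,
    # otherwise the read end never reports end-of-file.
    os.close(write_fd)
    with open(log_path, "ab") as log:
        while True:
            try:
                chunk = os.read(read_fd, 1024)
            except BlockingIOError:
                # Nothing to read yet: wait a little and poll again.
                time.sleep(poll_interval)
                continue
            if not chunk:
                break  # every writer has closed the pipe
            # Decoding per chunk can split a multi-byte character;
            # errors="replace" keeps the sketch from crashing on that.
            sys.stdout.write(chunk.decode(errors="replace"))
            sys.stdout.flush()
            log.write(chunk)
            log.flush()
    os.close(read_fd)
    return proc.wait()
```

Something like run_with_pipe(["ping", "-c", "3", "www.google.com"], "ping.log") would print the ping output as it arrives and leave a copy in ping.log (the file name is just a placeholder).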
You're listening for events on two file descriptors, so you need some sort of event loop. select can do it, but it's low-level; and since there can be only one event loop per program, your choices are frameworks rather than plain libraries.

Here's a way to do it with Twisted (docs here: http://twistedmatrix.com/documents/current/core/howto/proces... ):

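  import os
  from twisted.internet import reactor, protocol

  class PrintAndLogProtocol(protocol.ProcessProtocol):
      def __init__(self, logpath):
          self.log = open(logpath, 'ab')

      def outReceived(self, data):
          # print and log: data is raw bytes from the child process
          os.write(1, data)  # fd 1 is this program's stdout
          self.log.write(data)
      errReceived = outReceived

      def processEnded(self, reason):
          # stop the event loop once the child has exited
          self.log.close()
          reactor.stop()

  # 'output.log' is a placeholder file name
  reactor.spawnProcess(PrintAndLogProtocol('output.log'),
       '/path/to/exe', ['exe', 'arg1', 'arg2'])
  reactor.run()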
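For comparison, the standard library's asyncio offers a similar event loop without a third-party framework. This is only a sketch, assuming Python 3.7 or later; the names _pump, run_and_log and log_path are hypothetical and not part of any library.

```python
import asyncio
import sys


async def _pump(reader, terminal, log):
    """Forward chunks from one child pipe until it reaches end-of-file."""
    while True:
        chunk = await reader.read(1024)
        if not chunk:
            break
        # Per-chunk decoding may split a multi-byte character;
        # errors="replace" avoids an exception in that case.
        terminal.write(chunk.decode(errors="replace"))
        terminal.flush()
        log.write(chunk)
        log.flush()


async def run_and_log(cmd, log_path):
    """Run cmd, echoing stdout and stderr separately and logging both."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    with open(log_path, "ab") as log:
        # Both pipes are serviced concurrently by the one event loop.
        await asyncio.gather(
            _pump(proc.stdout, sys.stdout, log),
            _pump(proc.stderr, sys.stderr, log),
        )
    return await proc.wait()
```

It would be started with asyncio.run(run_and_log(['/path/to/exe', 'arg1', 'arg2'], 'output.log')), where the log file name is again a placeholder.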
I've done this using select.
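A minimal sketch of what a select-based version might look like, not necessarily what the commenter wrote. It assumes Python 3 on a Unix-like system (select only works on sockets on Windows); tee_process and log_path are illustrative names.

```python
import os
import select
import subprocess
import sys


def tee_process(cmd, log_path):
    """Run cmd, echoing stdout and stderr live while appending both to log_path."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # Map each pipe's file descriptor to the terminal stream it mirrors.
    targets = {
        proc.stdout.fileno(): sys.stdout,
        proc.stderr.fileno(): sys.stderr,
    }
    with open(log_path, "ab") as log:
        while targets:
            # Block until at least one pipe has data or has reached end-of-file.
            readable, _, _ = select.select(list(targets), [], [])
            for fd in readable:
                chunk = os.read(fd, 1024)
                if not chunk:
                    # An empty read means the child closed this pipe.
                    del targets[fd]
                    continue
                stream = targets[fd]
                # Per-chunk decoding may split a multi-byte character;
                # errors="replace" avoids an exception in that case.
                stream.write(chunk.decode(errors="replace"))
                stream.flush()
                log.write(chunk)
                log.flush()
    return proc.wait()
```

Because select blocks until something is readable, there is no sleep-and-poll loop, and stdout and stderr stay on separate streams while sharing one log file.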