Running external programs with Python (system and subprocess)


One of the things we often have to do is glue together various programs written by other people. If these other programs are GUI based then we will have a very hard time doing so, but if they are command line based then there are some nice ways to do that in Python. We'll see two different ways to accomplish this.

Our external program

In order to demonstrate this we need an "external tool" that we will handle as a "black box". As you are using Python I can assume you already have Python on your computer so we'll use a script written in Python as the "external tool".

You can see it here:

examples/python/process/process.py

import time
import sys

if len(sys.argv) != 3:
    sys.exit(f"{sys.argv[0]} SECONDS EXIT_CODE")

seconds = int(sys.argv[1])
exit_code = int(sys.argv[2])

for sec in range(seconds):
    print("OUT {}".format(sec), flush=True)
    print("ERR {}".format(sec), file=sys.stderr)
    time.sleep(1)

sys.exit(exit_code)

The idea is to have a program that can demonstrate a process with

Output on Standard Output (STDOUT)
Output on Standard Error (STDERR)
A process that takes time
Various exit codes (ERRORLEVELs)

So we can see how to deal with each of these.

The user of process.py can tell it how many iterations to do. On every iteration it prints to both STDOUT and STDERR and then waits for 1 second. The user can also tell the process which exit code to set.

This can be a nice tool to fake the behavior of some external tool.

If we run the process as follows:

$ python process.py 3 4

We get the following output:

OUT 0
ERR 0
OUT 1
ERR 1
OUT 2
ERR 2

We can also observe the exit code on Linux/macOS:

$ echo $?
4

and on Windows:

> echo %ERRORLEVEL%
4

Using os.system

The simplest way to run an external program is to use os.system.

examples/python/process/os_system.py

import os

# On Linux/macOS the returned value is a wait status; the exit code is its high byte.
exit_code = os.system("python process.py 5 2")
print(f'exit code: {exit_code // 256}')

It accepts a string, exactly what you would type on the command line, and executes it through the shell.

The external process inherits the STDOUT and STDERR channels of your program, so whatever it prints to STDOUT and STDERR while it is running ends up in the same place as if your program had printed it directly.

It waits until the external program ends and then returns a value that encodes the exit code of the external program. On Linux and macOS this is a two-byte wait status and the real exit code is the higher byte, so we need an integer division, exit_code // 256, to get to the real value. (It is the same as int(exit_code / 256).) On Windows os.system returns the exit code as is, so no division is needed there.

$ python os_system.py

OUT 0
ERR 0
OUT 1
ERR 1
OUT 2
ERR 2
OUT 3
ERR 3
OUT 4
ERR 4
exit code: 2
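
The division by 256 only applies to Linux and macOS, and only when the program exited normally. As a hedged sketch of a more portable approach, the helper below (exit_code_from_status is a name made up for this illustration) uses os.waitstatus_to_exitcode, which exists from Python 3.9, on POSIX systems and passes the value through unchanged on Windows. To stay self-contained it asks the shell itself to exit with code 2 instead of calling process.py.

```python
import os


def exit_code_from_status(status):
    """Turn the value returned by os.system into a plain exit code.

    Hypothetical helper for illustration; not part of the original example.
    """
    if os.name == "nt":
        # On Windows os.system already returns the exit code of the command.
        return status
    # On POSIX the value is a wait status. Python 3.9+ can decode it;
    # a negative result means the process was killed by a signal.
    return os.waitstatus_to_exitcode(status)


# "exit 2" makes the shell itself finish with exit code 2.
status = os.system("exit 2")
exit_code = exit_code_from_status(status)
print(f"raw status: {status}  exit code: {exit_code}")
```

In the original script you would pass "python process.py 5 2" to os.system instead of "exit 2".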

This can be very useful, but this way the output of the external program "gets lost" to our program. Often this is not what we want.

More often we want to capture the output of the external program, parse it, and act on what we find in it.

We might also want to do something else while the external program does its thing.

Let's see how the subprocess module can help us. We will see a few examples.

subprocess waiting for external process to finish

In the first example we will imitate os.system just to lay the groundwork.

We have created a function called run_process. Instead of a string holding the command as we would type it, it expects a list: the pieces of the command divided up. That is probably not a problem to write.

I sprinkled the whole program with print statements to make it easier to see the order in which things happen.

The first thing is to call proc = subprocess.Popen(command). This starts the external program and returns immediately, handing us an object that represents the external process. (Popen stands for "process open".)

At this point the external program runs regardless of what our program does, so we can wait for 1.5 seconds and see the output (and error) of the external program. We could also do some other work while the external program runs. We'll see that later.

At some point, however, we will likely want to wait for the external program to end. This is what proc.communicate() does. (Its name is strange, I know. The next example will shed some light on why it is called that.) It blocks our program until the external program ends.

Then we can fetch the exit code (that Windows calls ERRORLEVEL) from the attribute returncode of the proc object.

(Are you having fun yet with the fact that the same thing is called "exit code", ERRORLEVEL, and "returncode" by three different systems?)

Anyway, here is the code:

examples/python/process/run_command.py

import subprocess
import time

def run_process(command):
    print("Before Popen")
    proc = subprocess.Popen(command)  # This starts running the external process
    print("After Popen")
    time.sleep(1.5)

    print("Before communicate")
    proc.communicate()
    print("After communicate")

    exit_code = proc.returncode
    return exit_code

print("Before run_process", flush=True)
exit_code = run_process(['python', 'process.py', '5', '0'])
print("After run_process", flush=True)

print(f'exit code: {exit_code}', flush=True)

Here is how you'd run it:

$ python run_command.py

Here is the output. As you can see the external program already managed to print out 4 lines while we were sleeping, before we called "communicate".

In this case we still let STDOUT and STDERR of the external program through to the respective channels of our script.

Before run_process
Before Popen
After Popen
OUT 0
ERR 0
OUT 1
ERR 1
Before communicate
OUT 2
ERR 2
OUT 3
ERR 3
OUT 4
ERR 4
After communicate
After run_process
exit code: 0
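
If the command arrives as a single string, the standard shlex module can split it into the list form that Popen expects. The sketch below assumes POSIX-style quoting rules (Windows paths with backslashes need more care). It also shows subprocess.run, which starts the program and waits for it in a single call when there is nothing to do in between. The child here is a throwaway one-liner started with sys.executable, an assumption made so the snippet does not depend on process.py.

```python
import shlex
import subprocess
import sys

# Split a command line the way a POSIX shell would.
command = shlex.split("python process.py 5 0")
print(command)

# subprocess.run starts the program and waits for it to finish.
# sys.executable is the interpreter running this script; the one-liner
# is a stand-in for the external tool and exits with code 3.
result = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])
print(f"exit code: {result.returncode}")
```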

subprocess capture both STDOUT and STDERR separately

In the next example we passed stdout = subprocess.PIPE and stderr = subprocess.PIPE to the subprocess.Popen() call.

These connect the STDOUT and STDERR channels of the external program to two separate pipes in our program. Anything the external program prints goes into these pipes instead of to the screen.

In this example too, we call the communicate method to wait for the external program to end. Once the external program terminates, communicate returns all the content it collected from the external program as two separate bytes objects.

Our own run_process function then returns the exit code along with these two.

We can then use these two variables directly, or we can convert them to strings by calling decode('utf8') on each of them.

examples/python/process/run_command_collect_output.py

import subprocess
import time

def run_process(command):
    print("Before Popen")
    proc = subprocess.Popen(command,
        stdout = subprocess.PIPE,
        stderr = subprocess.PIPE,
    )  # This starts running the external process
    print("After Popen")
    time.sleep(1.5)

    print("Before communicate")
    out, err = proc.communicate()
    print("After communicate")

    # out and err are two bytes objects
    exit_code = proc.returncode
    return exit_code, out, err

print("Before run_process")
exit_code, out, err = run_process(['python', 'process.py', '5', '0'])
print("After run_process")

print("")
print(f'exit code: {exit_code}')

print("")
print('out:')
for line in out.decode('utf8').split('\n'):
    print(line)

print('err:')
for line in err.decode('utf8').split('\n'):
    print(line)

$ python run_command_collect_output.py

In the output you can see that we only print the output of the external program after run_process ended. Of course instead of printing them to the screen you could parse them using a regular expression or some other tool.

Before run_process
Before Popen
After Popen
Before communicate
After communicate
After run_process

exit code: 0

out:
OUT 0
OUT 1
OUT 2
OUT 3
OUT 4

err:
ERR 0
ERR 1
ERR 2
ERR 3
ERR 4
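
When there is nothing to do between starting the program and collecting its output, the same capture can be sketched with subprocess.run. This is an alternative to the Popen version above, not a replacement for it. The capture_output and text parameters need Python 3.7 or later, and text=True decodes with the locale's default encoding rather than an explicit UTF-8. The child is again a stand-in one-liner (an assumption for this illustration) instead of process.py.

```python
import subprocess
import sys

# Stand-in for the external tool: one line on each channel, exit code 3.
child_code = (
    "import sys; "
    "print('OUT 0'); "
    "print('ERR 0', file=sys.stderr); "
    "sys.exit(3)"
)

result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,  # same as stdout=PIPE and stderr=PIPE
    text=True,            # decode the captured bytes into str for us
)

print(f"exit code: {result.returncode}")
print("out:", result.stdout.splitlines())
print("err:", result.stderr.splitlines())
```

Using splitlines() instead of split('\n') avoids the empty last element that the trailing newline would otherwise produce.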

Run external process and capture STDOUT and STDERR merged together

In the next example we combine the STDERR and STDOUT channels.

In the Popen call we pass stdout = subprocess.PIPE as previously, but now we pass stderr = subprocess.STDOUT. This will tell subprocess to merge STDERR into STDOUT.

The rest of the code is the same.

examples/python/process/run_command_combine_stderr_and_stdout.py

import subprocess
import time

def run_process(command):
    print("Before Popen")
    proc = subprocess.Popen(command,
        stdout = subprocess.PIPE,
        stderr = subprocess.STDOUT,
    )  # This starts running the external process
    print("After Popen")
    time.sleep(1.5)

    print("Before communicate")
    out, err = proc.communicate()
    print("After communicate")

    # out is a bytes object holding both channels; err is None
    exit_code = proc.returncode
    return exit_code, out, err

print("Before run_process")
exit_code, out, err = run_process(['python', 'process.py', '5', '0'])
print("After run_process")

print("")
print(f'exit code: {exit_code}')

print("")
print('out:')
for line in out.decode('utf8').split('\n'):
    print(line)

print('err:')
print(err)

In the output you can see that the two channels are now mixed again, as they were in the first case. However, there is no guarantee that the lines arrive in exactly the same order as before. They do here, but our output was very regular. If the external program writes on a less regular schedule, or buffers one of the channels, Standard Output and Standard Error might be interleaved differently.

$ python run_command_combine_stderr_and_stdout.py

Before run_process
Before Popen
After Popen
Before communicate
After communicate
After run_process

exit code: 0

out:
OUT 0
ERR 0
OUT 1
ERR 1
OUT 2
ERR 2
OUT 3
ERR 3
OUT 4
ERR 4

err:
None
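
The same merge can be sketched with subprocess.run. The snippet below uses a stand-in one-liner (an assumption for this illustration, not process.py) and checks only that both lines arrive on the single captured stream and that the stderr attribute of the result stays None, just as err did above.

```python
import subprocess
import sys

# Stand-in for the external tool: writes one line to each channel.
child_code = (
    "import sys; "
    "print('OUT 0', flush=True); "
    "print('ERR 0', file=sys.stderr)"
)

result = subprocess.run(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,  # send STDERR into the same pipe as STDOUT
    text=True,
)

lines = result.stdout.splitlines()
print(lines)
print(result.stderr)  # nothing was captured separately
```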

Doing things while subprocess is running in the background

There are some more examples showing how to do something else while our external program is running in the background, but for now I think it is enough.
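
Just to give a taste, here is a minimal sketch of one way to do it, using the poll method of the Popen object: it returns None while the external program is still running and the exit code once it has finished. The child is a stand-in one-liner that sleeps for a second (an assumption for this illustration), and the "other work" is only a counter.

```python
import subprocess
import sys
import time

# Stand-in for the external tool: sleeps, then prints one line.
child_code = "import time; time.sleep(1); print('child done')"

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
)

ticks = 0
while proc.poll() is None:  # None means the child is still running
    ticks += 1              # placeholder for real work
    time.sleep(0.2)

# The child has ended; collect what it wrote to the pipe.
out, _ = proc.communicate()
print(f"did {ticks} rounds of other work")
print(f"exit code: {proc.returncode}")
print(out.decode('utf8').strip())
```

This works here because the child prints very little. A program that writes a lot into a pipe that nobody reads can block once the pipe is full, so for chatty programs the output has to be read while waiting.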

Written by
Gabor Szabo

Published on 2022-12-08
