Subprocess fundamentals

Python's subprocess module lets your scripts run external commands. Understanding the difference between run() and Popen, and why shell=True is usually wrong, keeps your scripts safe and predictable.

Every operating system ships with useful command-line tools: git, ffmpeg, grep, image converters, database CLI clients. Python's subprocess module is the bridge that lets your scripts invoke these tools without leaving Python. You get the best of both worlds — Python for data processing and control flow, existing tools for what they do best.

subprocess.run(): run and wait

The standard call for most situations:

import subprocess

result = subprocess.run(["echo", "hello"])
print(result.returncode)   # 0 means success

subprocess.run() launches the process, waits for it to finish, and returns a CompletedProcess object. Your Python code is blocked until the command exits. For the vast majority of automation tasks — running a compiler, converting a file, invoking a CLI tool — this is exactly the behaviour you want.
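As a minimal sketch of what the returned CompletedProcess holds, the example below runs a child that exits with a non-zero status. The child command is an assumption made for illustration: it uses sys.executable (the running interpreter) with a one-line script so that it does not depend on any particular system tool.

```python
import subprocess
import sys

# Illustrative child: a Python one-liner that exits with status 3.
result = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])

print(result.args)         # the command list that was run
print(result.returncode)   # 3: run() records the exit status; it does not raise by default
```

Note that run() returned normally here even though the command failed, so it is up to the caller to look at returncode.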

subprocess.Popen(): launch and continue

Popen is the lower-level primitive. It launches a process and returns immediately without waiting. Your Python code continues executing while the subprocess runs in parallel. You call .wait() or .communicate() when you need the result:

proc = subprocess.Popen(["sleep", "2"])
print("Subprocess started — Python keeps running")
proc.wait()
print("Subprocess finished")

Use Popen when you need to run processes in parallel, stream output as it arrives, or build a pipeline between two processes (covered in the pipelines lesson). For everything else, run() is simpler.
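The parallel case might look like the sketch below. The workload is hypothetical: three children that each sleep briefly, started through sys.executable so the example does not rely on a sleep command being installed.

```python
import subprocess
import sys

# Hypothetical workload: each child just sleeps for a moment.
child_code = "import time; time.sleep(0.2)"

# Start all three without waiting; each Popen call returns immediately.
procs = [subprocess.Popen([sys.executable, "-c", child_code]) for _ in range(3)]

# Now block until each child exits and collect the exit codes.
exit_codes = [proc.wait() for proc in procs]
print(exit_codes)   # [0, 0, 0]
```

Because the children run at the same time, the total wait is roughly the duration of one child rather than the sum of all three.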

Capturing output

By default, a subprocess inherits stdout and stderr from the Python process, so its output usually appears in the same terminal Python prints to. If you want to capture that output and do something with it in Python, add capture_output=True:

result = subprocess.run(
    ["echo", "hello"],
    capture_output=True,
    text=True,    # decode bytes to str automatically
)
print(result.stdout)   # "hello\n"

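Without text=True, result.stdout is a bytes object. Adding text=True decodes it to str using the locale's preferred encoding, which saves you from calling .decode() manually. Pass encoding="utf-8" instead when you need a specific encoding regardless of the system's settings.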
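The difference between the two return types can be seen in a short sketch. The child command is an assumption for illustration: a Python one-liner that writes the UTF-8 bytes for "café" followed by a newline straight to its stdout.

```python
import subprocess
import sys

# Illustrative child: writes raw UTF-8 bytes so the output is the same on any locale.
child_code = r"import sys; sys.stdout.buffer.write(b'caf\xc3\xa9\n')"
cmd = [sys.executable, "-c", child_code]

# No text/encoding argument: stdout comes back as bytes.
raw = subprocess.run(cmd, capture_output=True)
print(type(raw.stdout))

# With an explicit encoding: stdout comes back as str.
decoded = subprocess.run(cmd, capture_output=True, encoding="utf-8")
print(type(decoded.stdout))
```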

capture_output=True is shorthand for stdout=subprocess.PIPE, stderr=subprocess.PIPE. The longer form is useful when you want to capture stdout but let stderr pass through to the terminal for debugging.
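A sketch of the longer form, capturing stdout only, could look like this. The child is again an illustrative Python one-liner (not a command from this lesson) that writes one line to each stream.

```python
import subprocess
import sys

# Illustrative child: one line to stdout, one line to stderr.
child_code = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"

result = subprocess.run(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,   # capture stdout only
    text=True,                # stderr is not redirected, so it is inherited
)
print(repr(result.stdout))   # 'to stdout\n'
print(result.stderr)         # None, because stderr was not captured
```

The stderr line still shows up in the terminal, which is what makes this form handy while debugging.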

Why shell=True is usually wrong

You might see code like this:

subprocess.run("echo hello", shell=True)   # avoid this

With shell=True, Python passes the entire string to the system shell (/bin/sh on POSIX, cmd.exe by default on Windows), which interprets it. This introduces two problems.

First, injection risk: if any part of the command string comes from user input or an external source, a malicious value can run arbitrary commands. shell=True with constructed strings is a classic security vulnerability.

Second, portability: shell behaviour differs between sh, bash, and Windows cmd.exe. The list form is unambiguous.

Pass a list of strings instead:

subprocess.run(["echo", "hello"])         # safe: no shell parses the arguments

Each list element reaches the program as a single argument, with no shell parsing in between. Spaces, quotes, and special characters in arguments are handled correctly because there is no shell to misinterpret them.
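To illustrate, the sketch below passes a value that a shell would treat as two commands. The value and the child script are hypothetical: the child simply prints its first argument back, which shows that the whole string arrived as one literal argument.

```python
import subprocess
import sys

# Hypothetical untrusted input: a shell would split this at the semicolon.
untrusted = "hello; echo INJECTED"

# Illustrative child: prints its first command-line argument unchanged.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted],
    capture_output=True,
    text=True,
)
print(result.stdout)   # the full string, semicolon included; nothing extra was executed
```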

The main legitimate use of shell=True is running a short, fully hardcoded shell pipeline that cannot easily be expressed with the list API. Even then, think twice: the pipelines lesson shows how to replicate pipes in Python without the shell.

Where to go next

Next: subprocess in practice — a runnable example showing check=True, .stdout, .returncode, and what happens when a command fails.
