How to use multiprocessing in Python
Learn how to use multiprocessing in Python. This guide covers different methods, tips, real-world applications, and debugging common errors.

Python's multiprocessing module lets you run tasks in parallel, which can dramatically improve your application's performance: it uses multiple CPU cores and bypasses the Global Interpreter Lock (GIL).
Here, you'll explore core techniques, practical tips, and real-world applications. You'll also find effective advice to debug your code, so you can confidently implement multiprocessing in your own projects.
Using the multiprocessing module for basic parallelism
```python
import multiprocessing

def worker(num):
    return f"Worker {num} result"

if __name__ == "__main__":
    with multiprocessing.Pool(processes=3) as pool:
        results = pool.map(worker, [1, 2, 3])
        print(results)
```
--OUTPUT--
```
['Worker 1 result', 'Worker 2 result', 'Worker 3 result']
```
The if __name__ == "__main__" guard is a critical safeguard. It ensures the main script's code doesn't re-run inside the child processes that are spawned, which prevents errors and infinite loops.
The multiprocessing.Pool object is what orchestrates the parallel execution. It creates a pool of worker processes—three, in this example. The pool.map() function then distributes the worker function to these processes, applying it to each item in the list. It automatically manages the workload and gathers the results in order once all tasks are complete.
Core multiprocessing techniques
While Pool is great for many cases, you can get more control by managing processes directly with Process and sharing data using Queue or Manager.
Using Process to run functions in parallel
```python
import multiprocessing

def task(name):
    print(f"Running task {name}")

if __name__ == "__main__":
    processes = []
    for i in range(3):
        p = multiprocessing.Process(target=task, args=(f"Process-{i}",))
        processes.append(p)
        p.start()
    for p in processes:
        p.join()
```
--OUTPUT--
```
Running task Process-0
Running task Process-1
Running task Process-2
```
The Process class offers fine-grained control by letting you manage individual processes directly. You instantiate it by passing your function to the target parameter and its arguments via args. This setup gives you a handle on each parallel task.
- Calling p.start() kicks off the process, letting it run independently.
- Using p.join() makes your main program wait for the process to finish. It's essential for ensuring all your parallel tasks are complete before the script exits.
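Once a process has been joined, its exitcode attribute tells you how it finished, which is a cheap way to confirm a clean exit. A minimal sketch (the task function here is illustrative):

```python
from multiprocessing import Process

def task(name):
    print(f"Running task {name}")

if __name__ == "__main__":
    p = Process(target=task, args=("Process-0",))
    p.start()
    p.join()
    # exitcode is 0 for a clean exit, positive if the function raised,
    # and negative if the process was killed by a signal
    if p.exitcode == 0:
        print("Process-0 finished cleanly")
```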
Passing data with Queue between processes
```python
from multiprocessing import Process, Queue

def producer(q):
    q.put("Hello")
    q.put("from")
    q.put("another process")

if __name__ == "__main__":
    q = Queue()
    p = Process(target=producer, args=(q,))
    p.start()
    p.join()
    while not q.empty():
        print(q.get())
```
--OUTPUT--
```
Hello
from
another process
```
When processes need to exchange data, the multiprocessing.Queue provides a safe and simple way to do it. It acts like a pipeline between your main script and worker processes, handling all the complex synchronization for you.
- The producer process uses q.put() to add items to the queue.
- Back in the main process, q.get() retrieves items one by one.
- The queue follows a first-in, first-out order, so you get the data in the same sequence it was sent.
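One caveat: q.empty() can be unreliable while other processes are still writing, so a common alternative is to send a sentinel value that tells the consumer to stop. A sketch of that pattern (the None sentinel is a convention, not an API requirement):

```python
from multiprocessing import Process, Queue

def producer(q):
    for item in ["Hello", "from", "another process"]:
        q.put(item)
    q.put(None)  # sentinel: tells the consumer no more data is coming

def consumer(q):
    while True:
        item = q.get()  # blocks until an item is available
        if item is None:
            break
        print(item)

if __name__ == "__main__":
    q = Queue()
    p1 = Process(target=producer, args=(q,))
    p2 = Process(target=consumer, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
```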
Sharing data with Manager dictionaries
```python
from multiprocessing import Process, Manager

def update_dict(shared_dict, key, value):
    shared_dict[key] = value

if __name__ == "__main__":
    with Manager() as manager:
        shared_dict = manager.dict()
        processes = []
        for i in range(3):
            p = Process(target=update_dict, args=(shared_dict, f"key{i}", i*10))
            processes.append(p)
            p.start()
        for p in processes:
            p.join()
        print(dict(shared_dict))
```
--OUTPUT--
```
{'key0': 0, 'key1': 10, 'key2': 20}
```
For sharing complex data structures like dictionaries, the Manager class is a powerful tool. It creates a server process that holds Python objects, allowing multiple processes to modify them safely. This is ideal when you need a shared state rather than just passing messages with a Queue.
- The with Manager() as manager: block starts this server and ensures it's cleaned up afterward.
- You create a shared dictionary using manager.dict().
- Each process can then update this dictionary directly, and the manager handles the underlying synchronization to prevent data corruption.
Advanced process management
Beyond the basics of Process, Queue, and Manager, the multiprocessing module offers even more powerful tools for synchronizing tasks and sharing memory efficiently.
Synchronizing processes with Lock and Event
```python
from multiprocessing import Process, Lock, Event
import time

def worker(lock, event, worker_id):
    with lock:
        print(f"Worker {worker_id} acquired the lock")
    event.wait()
    print(f"Worker {worker_id} received event signal")

if __name__ == "__main__":
    lock = Lock()
    event = Event()
    processes = [Process(target=worker, args=(lock, event, i)) for i in range(2)]
    for p in processes:
        p.start()
    time.sleep(0.5)
    event.set()
    for p in processes:
        p.join()
```
--OUTPUT--
```
Worker 0 acquired the lock
Worker 1 acquired the lock
Worker 0 received event signal
Worker 1 received event signal
```
Synchronization primitives like Lock and Event help you coordinate complex workflows. They prevent race conditions and manage the timing of your processes.
- A Lock ensures only one process can access a critical section of code at a time. The with lock: statement automatically acquires and releases the lock, making it a safe way to protect shared resources.
- An Event acts as a simple signaling mechanism. Processes can wait for a signal using event.wait(), and another process can send the signal to all waiting processes by calling event.set().
Creating a custom Pool with different processing methods
```python
from multiprocessing import Pool
import time

def slow_task(x):
    time.sleep(0.1)
    return x * x

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # Asynchronous processing
        result = pool.apply_async(slow_task, (10,))
        print(f"Async result: {result.get()}")
        # Map for multiple inputs
        print(f"Map results: {pool.map(slow_task, [1, 2, 3])}")
```
--OUTPUT--
```
Async result: 100
Map results: [1, 4, 9]
```
The Pool object gives you flexible ways to manage tasks. You can run a single function asynchronously or apply one function to many inputs at once, depending on your needs.
- apply_async lets you offload a single task to a worker process. Your main program doesn't have to wait and can continue running. You retrieve the result later by calling .get() on the returned object.
- map is perfect for parallelizing a function across a list of items. It blocks execution until all tasks are complete and returns the results in a list, maintaining their original order.
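To see the non-blocking side of apply_async, you can submit several tasks up front and collect the results later. A short sketch, using a hypothetical square function:

```python
from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == "__main__":
    with Pool(processes=2) as pool:
        # submit every task immediately; none of these calls block
        pending = [pool.apply_async(square, (n,)) for n in range(4)]
        # the main process is free to do other work here
        results = [r.get(timeout=10) for r in pending]
        print(results)  # [0, 1, 4, 9]
```

Passing a timeout to .get() also guards against a worker that never returns.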
Using multiprocessing.Value and Array for shared memory
```python
from multiprocessing import Process, Value, Array
import ctypes

def update_shared(number, arr):
    number.value += 100
    for i in range(len(arr)):
        arr[i] *= 2

if __name__ == "__main__":
    num = Value(ctypes.c_int, 0)
    arr = Array(ctypes.c_int, [1, 2, 3, 4])
    p = Process(target=update_shared, args=(num, arr))
    p.start()
    p.join()
    print(f"Shared number: {num.value}")
    print(f"Shared array: {list(arr)}")
```
--OUTPUT--
```
Shared number: 100
Shared array: [2, 4, 6, 8]
```
For high-performance data sharing, Value and Array let processes modify data in shared memory directly. It's more efficient than using a Manager for simple types because it avoids inter-process communication overhead. These objects are essentially memory-safe wrappers around C data types from the ctypes module.
- You create a Value for a single piece of data, like a number, and access it using the .value attribute.
- An Array works for a sequence of items and can be modified in place by worker processes.
- Both require a ctypes data type, like ctypes.c_int, to define how the data is stored.
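One caveat worth knowing: the wrapper makes individual reads and writes safe, but a compound update like counter.value += 1 is a read-modify-write that can lose updates when several processes race. Value exposes its internal lock via get_lock() for exactly this case; a sketch (the increment helper is illustrative):

```python
from multiprocessing import Process, Value
import ctypes

def increment(counter):
    for _ in range(1000):
        # counter.value += 1 alone reads, adds, then writes, so two
        # processes can overwrite each other; the built-in lock prevents that
        with counter.get_lock():
            counter.value += 1

if __name__ == "__main__":
    counter = Value(ctypes.c_int, 0)
    workers = [Process(target=increment, args=(counter,)) for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(counter.value)  # 4000: no increments were lost
```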
Move faster with Replit
Replit is an AI-powered development platform that transforms natural language into working applications. You can take the concepts from this article and use Replit Agent to build complete apps—with databases, APIs, and deployment—directly from a description.
For the multiprocessing techniques we've explored, Replit Agent can turn them into production-ready tools. For example, you could:
- Build a batch image processor that resizes thousands of files in parallel using a process Pool.
- Create a web scraper where one process finds links and adds them to a Queue, while multiple worker processes fetch and parse the pages.
- Deploy a real-time monitoring dashboard that uses shared Value and Array objects to display system metrics from different processes.
Describe your app idea, and Replit Agent writes the code, tests it, and fixes issues automatically, all in your browser.
Common errors and challenges
Navigating multiprocessing requires avoiding a few common pitfalls, from infinite loops to data corruption and unresponsive programs.
- Forgetting the if __name__ == "__main__" guard is a classic mistake. Without it, each new process re-imports and runs your main script, creating a recursive loop that quickly overwhelms your system.
- Directly sharing standard Python objects like lists or dictionaries between processes can lead to silent bugs. Each process gets its own copy, so modifications made in one won't appear in another. For shared state, you must use a Manager to ensure all processes are working on the same, synchronized object.
- A worker process can sometimes hang, causing your main program to wait forever when it calls join(). You can prevent this by adding a timeout, like p.join(timeout=10). This tells the main process to wait for a set number of seconds before moving on, making your application more resilient.
Avoiding recursive spawning with if __name__ == "__main__"
Each new process re-imports the main script. Without the if __name__ == "__main__" guard, your process-creation code runs again inside the child process, triggering an infinite loop that can crash your program. The code below shows this error in action.
```python
import multiprocessing

def worker(num):
    return f"Worker {num} result"

# Missing if __name__ == "__main__" guard
pool = multiprocessing.Pool(processes=3)
results = pool.map(worker, [1, 2, 3])
print(results)
```
When the script is imported by a new process, the multiprocessing.Pool() line runs again, creating another pool. This triggers an infinite loop of process creation. The corrected code below shows how to prevent this from happening.
```python
import multiprocessing

def worker(num):
    return f"Worker {num} result"

if __name__ == "__main__":
    pool = multiprocessing.Pool(processes=3)
    results = pool.map(worker, [1, 2, 3])
    print(results)
```
By wrapping your process-creation logic in an if __name__ == "__main__" block, you ensure it only runs when the script is executed directly. When a child process is created, it imports the script, but the code inside this block won't run again. This is the standard way to prevent the recursive spawning error. It's a crucial habit to adopt whenever you're using the multiprocessing module to keep your applications stable.
Properly handling mutable objects with Manager() vs direct sharing
Directly sharing mutable objects like a list or dict between processes can lead to unexpected behavior. Each process receives a separate copy, not a shared reference, so modifications in one process are invisible to others, causing silent data inconsistencies. The following code illustrates this problem in action.
```python
from multiprocessing import Process

def update_list(lst):
    lst.append(100)
    print(f"Inside process: {lst}")

if __name__ == "__main__":
    my_list = [1, 2, 3]
    p = Process(target=update_list, args=(my_list,))
    p.start()
    p.join()
    print(f"In main process: {my_list}")  # Still [1, 2, 3]
```
The update_list function modifies its own private copy of the list, so the original my_list in the main process remains untouched. The changes aren't synchronized back. The corrected code below shows how to fix this.
```python
from multiprocessing import Process, Manager

def update_list(lst):
    lst.append(100)
    print(f"Inside process: {lst}")

if __name__ == "__main__":
    with Manager() as manager:
        my_list = manager.list([1, 2, 3])
        p = Process(target=update_list, args=(my_list,))
        p.start()
        p.join()
        print(f"In main process: {my_list}")  # Now contains 100
```
The Manager solves this by creating a proxy object—a special version of the list that can be safely shared across processes. This ensures that any modifications are synchronized correctly.
- When you pass the manager.list() to the new process, both the main and child processes are working on the same underlying data.
- This guarantees that changes, like lst.append(100), are reflected everywhere.
You'll need this solution whenever multiple processes must read from and write to the same mutable object.
Safely terminating processes with timeout in join()
A worker process can sometimes hang, leaving your main program stuck waiting indefinitely when it calls join(). This can make your entire application unresponsive. The code below shows what happens when a long-running task blocks the main process from continuing.
```python
from multiprocessing import Process
import time

def long_task():
    print("Starting long task...")
    time.sleep(10)  # Simulate a task that might hang
    print("Task complete")

if __name__ == "__main__":
    p = Process(target=long_task)
    p.start()
    p.join()  # This will wait indefinitely if the process hangs
    print("Main process continued")
```
The main program is stuck because p.join() has no time limit, forcing it to wait for the 10-second sleep to finish. If the task hangs, the application freezes. The corrected code below shows a more resilient approach.
```python
from multiprocessing import Process
import time

def long_task():
    print("Starting long task...")
    time.sleep(10)  # Simulate a task that might hang
    print("Task complete")

if __name__ == "__main__":
    p = Process(target=long_task)
    p.start()
    p.join(timeout=5)  # Wait for at most 5 seconds
    if p.is_alive():
        print("Process is taking too long, terminating")
        p.terminate()
    print("Main process continued")
```
By adding a timeout to p.join(), you prevent your main program from getting stuck. If the process doesn't finish within the specified time, your script moves on. You can then check if it's still running with p.is_alive() and forcefully stop it using p.terminate(). This is crucial for tasks that might hang, like network requests or long computations, as it keeps your application responsive and stable.
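One refinement to this pattern, sketched below: calling join() once more after terminate() lets the operating system reap the killed process, and exitcode then records how it ended (a negative value means it was killed by a signal). The stuck_task function here is illustrative:

```python
from multiprocessing import Process
import time

def stuck_task():
    time.sleep(60)  # stands in for a task that never finishes

if __name__ == "__main__":
    p = Process(target=stuck_task)
    p.start()
    p.join(timeout=1)
    if p.is_alive():
        p.terminate()
        p.join()  # reap the killed process so it doesn't linger
    print(p.exitcode)  # negative, e.g. -15 when killed by SIGTERM on Unix
```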
Real-world applications
With the core techniques and error-handling patterns covered, you can now apply multiprocessing to solve practical problems in data processing and task scheduling.
Processing large datasets with Pool.map()
You can significantly speed up data analysis by using Pool.map() to divide a large dataset into manageable chunks and process each one on a separate CPU core.
```python
import multiprocessing
import random

def analyze_chunk(chunk):
    # Simulate processing a chunk of data
    return sum(chunk) / len(chunk)

if __name__ == "__main__":
    data = [random.randint(1, 100) for _ in range(1000)]
    chunks = [data[i:i+250] for i in range(0, len(data), 250)]
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(analyze_chunk, chunks)
    print(f"Chunk averages: {results}")
```
This example breaks a large list of random numbers into smaller chunks for parallel processing. A multiprocessing.Pool manages a group of four worker processes to handle the workload, distributing the chunks among them.
- The pool.map() function is the core of the operation. It assigns the analyze_chunk task to each process, applying it to one of the data chunks.
- Once all processes complete their calculations, pool.map() gathers the individual results and returns them in a single, ordered list.
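Note that averaging the per-chunk averages is only exact when every chunk has the same length. A more robust variant, sketched here with a hypothetical chunk_stats helper, returns partial sums so chunks of any size combine correctly:

```python
import multiprocessing

def chunk_stats(chunk):
    # return partial results so chunks of different sizes combine exactly
    return sum(chunk), len(chunk)

if __name__ == "__main__":
    data = list(range(1, 1001))
    chunks = [data[i:i+300] for i in range(0, len(data), 300)]  # last chunk is shorter
    with multiprocessing.Pool(processes=4) as pool:
        partials = pool.map(chunk_stats, chunks)
    total, count = map(sum, zip(*partials))
    print(f"Overall average: {total / count}")  # 500.5
```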
Creating a simple parallel task scheduler with Pool.map()
You can also use Pool.map() to build a simple task scheduler, which is perfect for running a batch of independent jobs like data backups and log analysis concurrently.
```python
import multiprocessing
import time
from datetime import datetime

def scheduled_task(task_info):
    name, delay = task_info
    time.sleep(delay)  # Simulate task running time
    return name, datetime.now().strftime("%H:%M:%S")

if __name__ == "__main__":
    tasks = [("Data backup", 2), ("Log analysis", 1), ("Email sending", 3)]
    print(f"Starting tasks at {datetime.now().strftime('%H:%M:%S')}")
    with multiprocessing.Pool(processes=3) as pool:
        results = pool.map(scheduled_task, tasks)
    for task_name, completion_time in results:
        print(f"{task_name} completed at {completion_time}")
```
This code uses a Pool to run several independent functions, each with different arguments. The tasks list defines three jobs, each with a unique name and a simulated delay. A Pool with three processes is then created to handle the work concurrently.
- The pool.map() function applies the scheduled_task function to every item in the tasks list.
- Because each task runs in its own process, they don't have to wait for each other to finish.
The program gathers all the results and prints them in order once every task is complete.
Get started with Replit
Turn these concepts into a real tool with Replit Agent. Describe what you want to build, like “a batch image resizer that uses a process Pool” or “a web scraper that uses a Queue to manage tasks.”
Replit Agent writes the code, tests for errors, and deploys your app. Start building with Replit.
Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.

