How to pickle in Python
Learn how to pickle in Python. Discover different methods, tips, real-world applications, and how to debug common errors.

Python's pickle module converts objects into a byte stream through serialization. This process lets you save complex data structures to a file or transfer them across a network with ease.
Here, you'll explore core pickle techniques and practical tips. You will also find real-world applications and get advice to debug common issues you might encounter.
Basic pickling with pickle.dump() and pickle.load()
import pickle
data = {'name': 'Alice', 'age': 30, 'scores': [85, 90, 95]}
with open('data.pkl', 'wb') as file:
    pickle.dump(data, file)
with open('data.pkl', 'rb') as file:
    loaded_data = pickle.load(file)
print(loaded_data)
--OUTPUT--
{'name': 'Alice', 'age': 30, 'scores': [85, 90, 95]}
The pickle.dump() function serializes the data dictionary. It’s essential to open the file in binary write mode ('wb') because pickling creates a byte representation of your object, not a simple text string.
Conversely, pickle.load() reconstructs the object from the file. You must open the file in binary read mode ('rb') so Python can correctly interpret the byte stream and deserialize it back into its original data structure.
Common pickling techniques
Beyond saving objects to a file, you can also serialize them directly into memory, handle multiple objects in sequence, and optimize the process with protocols.
Using pickle.dumps() and pickle.loads() for in-memory serialization
import pickle
user = {'username': 'john_doe', 'email': '[email protected]'}
# Pickle to bytes object instead of file
pickled_data = pickle.dumps(user)
# Unpickle from bytes
unpickled_data = pickle.loads(pickled_data)
print(f"Pickled data (first 15 bytes): {pickled_data[:15]}")
print(f"Unpickled data: {unpickled_data}")
--OUTPUT--
Pickled data (first 15 bytes): b'\x80\x04\x95$\x00\x00\x00\x00\x00\x00\x00}\x94('
Unpickled data: {'username': 'john_doe', 'email': '[email protected]'}
Sometimes you don't need a file. The pickle.dumps() function serializes an object directly into a bytes object in memory. This is incredibly useful for tasks like sending data over a network or storing it in a database that handles byte strings.
- pickle.dumps() converts the object into bytes.
- pickle.loads() takes those bytes and reconstructs the original object.
This allows you to easily move complex data structures between different parts of an application without touching the file system.
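As a minimal sketch of the database case, the following stores a pickled dictionary in a BLOB column of an in-memory SQLite database and reads it back. The table name kv and the key 'user:1' are made up for illustration. Only unpickle bytes that your own application wrote, since loading a pickle can execute arbitrary code.

```python
import pickle
import sqlite3

user = {'username': 'john_doe', 'roles': ['admin', 'editor']}

# In-memory database; the table and key names are illustrative only
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE kv (key TEXT PRIMARY KEY, value BLOB)')

# pickle.dumps() returns bytes, which sqlite3 stores as a BLOB
conn.execute('INSERT INTO kv (key, value) VALUES (?, ?)',
             ('user:1', pickle.dumps(user)))

# Read the bytes back and rebuild the dictionary
row = conn.execute('SELECT value FROM kv WHERE key = ?', ('user:1',)).fetchone()
restored = pickle.loads(row[0])
conn.close()

print(restored == user)
```

The same pattern should apply to any store that accepts raw bytes, such as a cache server or a message queue.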
Pickling multiple objects in sequence
import pickle
user1 = {'name': 'Bob', 'role': 'admin'}
user2 = {'name': 'Charlie', 'role': 'user'}
with open('users.pkl', 'wb') as file:
    pickle.dump(user1, file)
    pickle.dump(user2, file)
with open('users.pkl', 'rb') as file:
    loaded_user1 = pickle.load(file)
    loaded_user2 = pickle.load(file)
print(loaded_user1, loaded_user2, sep='\n')
--OUTPUT--
{'name': 'Bob', 'role': 'admin'}
{'name': 'Charlie', 'role': 'user'}
You can serialize multiple objects into a single file by calling pickle.dump() sequentially. Each call simply appends the byte stream of the next object to the file.
To retrieve the objects, you must call pickle.load() for each one you saved. The key is maintaining the correct sequence.
- The first call to pickle.load() retrieves the first object that was dumped.
- The second call retrieves the second, and so on.
Calling pickle.load() again after the last object raises an EOFError, so you either need to know how many objects the file holds or keep reading until that error occurs.
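When the number of objects isn't known in advance, one common approach is to keep calling pickle.load() until it raises EOFError. The sketch below assumes a hypothetical helper named load_all and uses io.BytesIO as a stand-in for a file opened in binary mode.

```python
import io
import pickle

def load_all(file):
    """Yield every pickled object in an open binary file, in order."""
    while True:
        try:
            yield pickle.load(file)
        except EOFError:
            # No more pickles in the stream
            return

# io.BytesIO stands in for a file opened in binary mode
buffer = io.BytesIO()
for user in ({'name': 'Bob'}, {'name': 'Charlie'}, {'name': 'Dana'}):
    pickle.dump(user, buffer)

buffer.seek(0)
users = list(load_all(buffer))
print(users)
```

Because load_all is a generator, it can also be consumed lazily in a for loop instead of building the whole list at once.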
Optimizing pickle with different protocol versions
import pickle
data = [1, 2, 3, 4, 5]
# Compare different protocol versions
protocol0 = pickle.dumps(data, protocol=0) # ASCII protocol
protocol5 = pickle.dumps(data, protocol=5) # Latest binary protocol
print(f"Protocol 0 size: {len(protocol0)} bytes")
print(f"Protocol 5 size: {len(protocol5)} bytes")
print(f"Protocol 0 starts with: {protocol0[:15]}")
--OUTPUT--
Protocol 0 size: 26 bytes
Protocol 5 size: 26 bytes
Protocol 0 starts with: b'(lp0\nI1\naI2\naI3'
The pickle module supports different data formats, called protocols, which you can specify with the protocol argument. Think of them as different versions of the pickling language.
- protocol=0 is the original ASCII format. It’s human-readable but less efficient.
- Newer protocols are binary and encode most data more compactly and faster. They also add a small fixed header, which is why a five-item list like this one comes out the same size under both protocols; the savings appear as the data grows.
For modern applications, it's best to use pickle.HIGHEST_PROTOCOL to get the latest optimizations, unless you need compatibility with older Python versions.
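The size difference is easiest to see on larger data, since the binary protocols carry a small fixed header that can cancel out the savings on tiny objects. Here is a rough sketch that pickles a made-up list of 1,000 integers with every protocol the running interpreter supports.

```python
import pickle

data = list(range(1000))

# Map each supported protocol number to the size of its output
sizes = {protocol: len(pickle.dumps(data, protocol=protocol))
         for protocol in range(pickle.HIGHEST_PROTOCOL + 1)}

for protocol, size in sizes.items():
    print(f"Protocol {protocol}: {size} bytes")
```

Exact byte counts are left out on purpose, since they depend on the data; the point is the trend from protocol 0 to the highest protocol.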
Advanced pickling strategies
Now that you've covered the fundamentals, you can tackle more complex scenarios like serializing custom classes and tailoring pickle's behavior for greater control.
Pickling custom class objects
import pickle
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    def greet(self):
        return f"Hello, my name is {self.name} and I'm {self.age} years old."
person = Person("David", 25)
with open('person.pkl', 'wb') as file:
    pickle.dump(person, file)
with open('person.pkl', 'rb') as file:
    loaded_person = pickle.load(file)
print(loaded_person.greet())
--OUTPUT--
Hello, my name is David and I'm 25 years old.
You can easily pickle instances of your own custom classes. When you use pickle.dump() on an object, it serializes the object's attributes—in this case, name and age. The methods themselves, like greet(), are not pickled.
- The key requirement is that the class definition, Person in this example, must be available when you call pickle.load().
Pickle needs the class blueprint to reconstruct the object. Once loaded, the object is a fully functional instance, ready to use its original methods.
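One way to see this for yourself is to inspect the pickled bytes. The sketch below uses a shortened version of the Person class and checks which names appear in the output: the class name and the attribute values are recorded, while the method name is not.

```python
import pickle

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def greet(self):
        return f"Hello, my name is {self.name}."

blob = pickle.dumps(Person("David", 25))

# The class is stored by name, the instance by its attribute values
print(b'Person' in blob)   # the class name is recorded
print(b'David' in blob)    # the attribute value is recorded
print(b'greet' in blob)    # the method name is absent
```

This is why the class must be importable under the same module and name when the data is loaded: the pickle holds a reference to the class, not a copy of its code.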
Customizing pickling with __getstate__ and __setstate__
import pickle
class Counter:
    def __init__(self, start=0):
        self.count = start
        self.temporary = "This won't be pickled"
    def __getstate__(self):
        return {'count': self.count}  # Only pickle count
    def __setstate__(self, state):
        self.count = state['count']
        self.temporary = "Restored default"
counter = Counter(10)
pickled = pickle.dumps(counter)
unpickled = pickle.loads(pickled)
print(f"Original: {counter.temporary}, Unpickled: {unpickled.temporary}")
print(f"Count preserved: {unpickled.count}")
--OUTPUT--
Original: This won't be pickled, Unpickled: Restored default
Count preserved: 10
For more control over serialization, you can implement the __getstate__ and __setstate__ methods. They let you decide exactly what gets saved and how the object is rebuilt, which is perfect for excluding temporary data or handling complex states that don't pickle well.
- __getstate__() returns a dictionary of the attributes you want to save. In this example, it intentionally omits the temporary attribute from the pickled data.
- __setstate__() receives that dictionary during unpickling to restore the object. It's also where you can reinitialize any excluded attributes, like setting temporary to a default value.
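A typical case of state that doesn't pickle well is a threading lock. The sketch below uses a hypothetical Account class that drops its lock in __getstate__ and creates a fresh one in __setstate__; the class and attribute names are assumptions for illustration.

```python
import pickle
import threading

class Account:
    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()  # locks cannot be pickled

    def __getstate__(self):
        state = self.__dict__.copy()
        del state['_lock']             # drop the unpicklable attribute
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._lock = threading.Lock()  # create a fresh lock on load

account = Account(100)
restored = pickle.loads(pickle.dumps(account))
print(restored.balance)
```

Copying self.__dict__ before deleting from it matters: without the copy, pickling would remove the lock from the live object as well.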
Using dill for enhanced pickling capabilities
import dill # pip install dill
# Pickle a lambda function (not possible with standard pickle)
square = lambda x: x**2
pickled_func = dill.dumps(square)
# Unpickle and use the function
unpickled_func = dill.loads(pickled_func)
result = unpickled_func(5)
print(f"Square of 5: {result}")
print(f"Function type: {type(unpickled_func).__name__}")
--OUTPUT--
Square of 5: 25
Function type: function
Sometimes, the standard pickle module hits its limits. It can't serialize some Python objects, such as lambda functions or functions defined inside other functions. This is where the third-party library dill comes in, offering a more robust solution.
- dill extends pickle's capabilities, allowing you to serialize a much wider range of objects.
- As the example shows, you can easily pickle a lambda function, which would normally raise an error with the standard library.
- It uses the same function names, like dumps() and loads(), so switching is straightforward.
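To see where the standard library draws the line, the following sketch uses only pickle. A named module-level function is pickled by reference (its module and name), so it round-trips, while a lambda has no importable name and is rejected. The variable lambda_error is just a holder for the exception in this illustration.

```python
import pickle

def square(x):
    return x ** 2

# A module-level function is pickled by reference to its module and name
restored = pickle.loads(pickle.dumps(square))
print(restored(5))

# A lambda has no importable name, so the standard pickler refuses it
lambda_error = None
try:
    pickle.dumps(lambda x: x ** 2)
except (pickle.PicklingError, AttributeError) as exc:
    lambda_error = exc

print(f"Lambda rejected: {lambda_error is not None}")
```

If adding a dependency isn't an option, replacing the lambda with a named module-level function is often enough.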
Move faster with Replit
Replit is an AI-powered development platform that transforms natural language into working applications. Describe what you want to build, and Replit Agent creates it—complete with databases, APIs, and deployment.
For the pickling techniques we've explored, Replit Agent can turn them into production-ready tools:
- Build a session manager that saves user settings to a file, preserving their configuration between visits.
- Create a simple caching system that serializes expensive database query results to speed up future requests.
- Deploy a progress tracker for a game or application that saves the current state, allowing users to resume exactly where they left off.
Describe your app idea to Replit Agent, and it will write the code, test it, and fix issues automatically, all in your browser.
Common errors and challenges
While pickling is powerful, you'll likely run into a few common roadblocks, from changed class definitions to file mode errors and recursion limits.
Handling unpickling errors with incompatible class definitions
One of the most frequent issues arises when you change a class definition after an object has already been pickled. pickle.load() restores the attributes exactly as they were saved, so if you've since renamed or added attributes, the object loads but code that expects the new attributes fails with an AttributeError. If the class itself was renamed or moved to another module, pickle.load() fails outright because it can no longer find the class.
- To prevent this, you can implement a versioning system for your classes.
- Alternatively, use the __setstate__ method to gracefully handle old data formats, allowing your new class to correctly initialize itself from an outdated state dictionary.
Dealing with file mode errors when pickling
A simple but common mistake is opening the file in the wrong mode. Since pickle generates a byte stream, you must use binary modes. Writing to a file opened in text mode ('w') raises a TypeError because text files only accept strings, and reading in text mode ('r') fails as well, typically with a UnicodeDecodeError or a TypeError.
Always remember to open your file with 'wb' (write binary) when using pickle.dump() and 'rb' (read binary) for pickle.load(). This ensures Python handles the data as raw bytes, just as pickle expects.
Troubleshooting recursion depth issues with circular references
You might encounter a RecursionError when pickling deeply nested objects. Circular references alone don't trigger it, because pickle tracks the objects it has already serialized. The error appears when a long chain of objects, such as a linked list or a deep tree, needs more levels of recursion than the interpreter allows.
To fix this, reduce the depth that pickle has to traverse, for example by implementing __getstate__ to store a flatter representation of the object. As a stopgap, you can also raise the limit with sys.setrecursionlimit().
Handling unpickling errors with incompatible class definitions
It's a classic "gotcha" with pickling. You save an object, then later refactor its class by renaming an attribute. The old object still loads, but it carries the old attribute, so accessing the new name raises an AttributeError. See it happen below.
import pickle
# Original class definition when pickling
class User:
    def __init__(self, name, role):
        self.name = name
        self.role = role
user = User("John", "admin")
with open('user.pkl', 'wb') as f:
    pickle.dump(user, f)
# Class definition has changed (attribute renamed)
class User:
    def __init__(self, username, role):
        self.username = username
        self.role = role
with open('user.pkl', 'rb') as f:
    loaded_user = pickle.load(f)  # Loads fine, but restores 'name'
print(f"Username: {loaded_user.username}")  # Raises AttributeError
Unpickling itself succeeds, but the restored object has a name attribute while the new class definition expects username, so the final line raises an AttributeError. The following code demonstrates how to handle this gracefully.
import pickle
# Original class definition when pickling
class User:
    def __init__(self, name, role):
        self.name = name
        self.role = role
user = User("John", "admin")
with open('user.pkl', 'wb') as f:
    pickle.dump(user, f)
# Class definition has changed, but handle compatibility
class User:
    def __init__(self, username, role):
        self.username = username
        self.role = role
    # Handle backward compatibility
    def __setstate__(self, state):
        state = dict(state)
        if 'name' in state:
            # Move the old attribute to its new name
            state['username'] = state.pop('name')
        self.__dict__.update(state)
with open('user.pkl', 'rb') as f:
    loaded_user = pickle.load(f)
print(f"Username: {loaded_user.username}")
The fix is to implement the __setstate__ method, which acts as a translator during unpickling. It lets you handle outdated data structures gracefully.
- It checks if the old attribute, name, exists in the saved data.
- If so, it maps its value to the new username attribute.
- Finally, self.__dict__.update(state) applies any remaining attributes from the pickled object.
This approach is vital for applications where class definitions might change over time.
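For classes that change more than once, one option is to store an explicit version tag in the pickled state and migrate old layouts in __setstate__. This is a minimal sketch rather than a standard recipe: the VERSION constant and the '_version' key are names invented for the example.

```python
import pickle

class User:
    VERSION = 2  # bump whenever the attribute layout changes

    def __init__(self, username, role):
        self.username = username
        self.role = role

    def __getstate__(self):
        state = self.__dict__.copy()
        state['_version'] = self.VERSION
        return state

    def __setstate__(self, state):
        state = dict(state)
        version = state.pop('_version', 1)  # untagged pickles count as version 1
        if version < 2:
            # Version 1 stored the username under 'name'
            state['username'] = state.pop('name')
        self.__dict__.update(state)

# Simulate a version 1 pickle by restoring an old-style state dictionary
old_user = User.__new__(User)
old_user.__setstate__({'name': 'John', 'role': 'admin'})
print(old_user.username)

# New pickles round-trip and carry the version tag
new_user = pickle.loads(pickle.dumps(User('Ana', 'user')))
print(new_user.username, new_user.role)
```

Each future layout change then becomes one more version check in __setstate__, applied in order from oldest to newest.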
Dealing with file mode errors when pickling
It's a classic TypeError source: opening the file in text mode ('w') instead of binary ('wb'). Because pickle creates a byte stream, not a text string, this mismatch will always cause an error. The code below shows this common mistake in action.
import pickle
data = {"key": "value"}
# Using text mode instead of binary mode
with open('data.pkl', 'w') as f:  # Wrong: 'w' instead of 'wb'
    pickle.dump(data, f)
The pickle.dump() function tries to write bytes, but the file is opened in text mode ('w'), which only accepts strings. This data type conflict is what causes the TypeError. The following example shows the simple fix.
import pickle
data = {"key": "value"}
# Using correct binary mode
with open('data.pkl', 'wb') as f:  # Correct: 'wb' for writing binary
    pickle.dump(data, f)
# Similarly, use 'rb' for reading
with open('data.pkl', 'rb') as f:
    loaded_data = pickle.load(f)
The fix is simple: always use binary mode.
- Always use 'wb' (write binary) when saving with pickle.dump().
- Always use 'rb' (read binary) when loading with pickle.load().
This ensures Python handles the data as the raw bytes that pickle requires.
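If you'd rather not think about file modes at all, one alternative is to combine pickle.dumps() and pickle.loads() with pathlib, whose write_bytes() and read_bytes() methods always work in binary mode. The sketch writes into a temporary directory so it leaves nothing behind.

```python
import pickle
import tempfile
from pathlib import Path

data = {"key": "value"}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'data.pkl'
    # write_bytes() and read_bytes() always use binary mode
    path.write_bytes(pickle.dumps(data))
    loaded_data = pickle.loads(path.read_bytes())

print(loaded_data)
```

This holds the whole pickle in memory at once, so for very large objects the streaming pickle.dump() and pickle.load() calls remain the better fit.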
Troubleshooting recursion depth issues with circular references
A RecursionError can pop up when you pickle deeply nested objects. Circular references on their own are not the cause: pickle keeps a memo of every object it has already serialized, so an object that refers back to an earlier one is written as a short back-reference, on every protocol. The error comes from depth. The pickler recurses once per level of nesting, and a long chain of objects can exceed the interpreter's recursion limit.
The following code shows both cases: a parent and child dictionary that refer to each other, and a list nested 100,000 levels deep.
import pickle
# Circular reference: pickles cleanly, even with protocol 0
parent = {}
child = {'parent': parent}
parent['child'] = child
restored = pickle.loads(pickle.dumps(parent, protocol=0))
print(restored['child']['parent'] is restored)  # True: the cycle survives
# Deep nesting: each level adds one level of recursion in the pickler
nested = []
for _ in range(100_000):
    nested = [nested]
try:
    pickle.dumps(nested)
except RecursionError as exc:
    print(f"RecursionError: {exc}")
The parent and child dictionaries round-trip without trouble because the pickler recognizes the object it has already seen. The nested list fails because serializing it would need 100,000 levels of recursion. The next example shows the usual stopgap for structures that are only moderately deep.
import pickle
import sys
# Build a structure deeper than the default recursion limit
nested = []
for _ in range(2000):
    nested = [nested]
# Temporarily raise the recursion limit, and always restore it
original_limit = sys.getrecursionlimit()
sys.setrecursionlimit(10_000)
try:
    with open('nested.pkl', 'wb') as f:
        pickle.dump(nested, f, protocol=pickle.HIGHEST_PROTOCOL)
    print("Pickled successfully")
except RecursionError:
    # Some interpreter versions guard C-level recursion separately
    print("Still too deep: flatten the structure instead")
finally:
    sys.setrecursionlimit(original_limit)
Raising the limit can get a moderately deep structure through, but it's a patch rather than a cure.
- Circular references need no special handling. The memo that tracks object references works on every protocol, so switching to pickle.HIGHEST_PROTOCOL does not change how cycles are handled.
- sys.setrecursionlimit() trades safety for depth: a limit that is too high can crash the interpreter, and on CPython 3.12 and later the limit applies only to Python code, so it may not affect the C-accelerated pickler.
- The more robust fix is to reduce the depth, for example with a __getstate__ method that stores a flat list instead of a chain of linked objects.
Keep an eye out for this error when working with long linked lists, deep trees, and other heavily nested objects.
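A way to avoid the recursion limit altogether is to pickle a flat representation of the structure. The sketch below assumes a hypothetical singly linked list (the Node and LinkedList classes are invented for this example) whose __getstate__ stores the values as a plain list and whose __setstate__ rebuilds the chain with a loop.

```python
import pickle

class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node

class LinkedList:
    def __init__(self, values=()):
        self.head = None
        # Build the chain back to front so the values keep their order
        for value in reversed(list(values)):
            self.head = Node(value, self.head)

    def __iter__(self):
        node = self.head
        while node is not None:
            yield node.value
            node = node.next

    def __getstate__(self):
        # Store a flat list of values instead of the chain of nodes
        return {'values': list(self)}

    def __setstate__(self, state):
        # Rebuild the chain iteratively from the flat list
        self.__init__(state['values'])

numbers = LinkedList(range(50_000))
restored = pickle.loads(pickle.dumps(numbers))
print(sum(restored) == sum(range(50_000)))
```

Because the pickler only ever sees a dictionary holding one flat list, the depth of the original chain no longer matters.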
Real-world applications
Now that you've seen how to handle its complexities, you can use pickle for practical tasks like caching computation results and building object databases.
Caching computation results with pickle
Pickling is a straightforward way to cache the output of expensive operations, saving you valuable computation time on subsequent runs.
import pickle
def expensive_calculation(n):
    # Simulate an expensive computation
    return [i**2 for i in range(n)]
# Cache the results
results = expensive_calculation(5)
with open('calc_cache.pkl', 'wb') as f:
    pickle.dump(results, f)
print(f"Cached results: {results}")
# Later, load the cached results
with open('calc_cache.pkl', 'rb') as f:
    loaded_results = pickle.load(f)
print(f"Loaded from cache: {loaded_results}")
This example demonstrates how to persist a function's output. After expensive_calculation runs, pickle.dump() serializes the resulting list and writes it to the calc_cache.pkl file. This effectively saves the object's state to your disk.
You can then retrieve the data without rerunning the original function.
- The pickle.dump() function takes your object and writes its byte representation into a file.
- Conversely, pickle.load() reads that byte stream from the file and reconstructs the original Python object in memory.
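In practice a cache also needs to decide whether to compute or to load. The following sketch wraps that decision in a hypothetical cached_calculation helper that returns the result together with a flag saying whether it came from the cache. It keeps a single result per file and does not check that n matches the cached run, which a real cache would need to do.

```python
import os
import pickle
import tempfile

def expensive_calculation(n):
    # Stand-in for slow work
    return [i**2 for i in range(n)]

def cached_calculation(n, cache_file):
    """Return (result, hit): load from cache_file if present, else compute and store."""
    if os.path.exists(cache_file):
        with open(cache_file, 'rb') as f:
            return pickle.load(f), True    # cache hit
    result = expensive_calculation(n)
    with open(cache_file, 'wb') as f:
        pickle.dump(result, f)
    return result, False                   # cache miss

with tempfile.TemporaryDirectory() as tmp:
    cache_file = os.path.join(tmp, 'calc_cache.pkl')
    first, first_hit = cached_calculation(5, cache_file)
    second, second_hit = cached_calculation(5, cache_file)

print(first_hit, second_hit)
```

The first call computes and writes the file; the second finds the file and skips the computation.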
Implementing a simple object database with pickle
You can also use pickle to build a simple, persistent key-value store, turning a file into a basic database for your objects. This is perfect for small-scale applications where you need to save and retrieve data without the overhead of a traditional database. By wrapping the logic in a class, you can abstract away the file handling and serialization, leaving you with clean set() and get() methods to work with.
import pickle
import os
class SimpleDB:
    def __init__(self, db_file='simple_db.pkl'):
        self.db_file = db_file
        self.data = {}
        if os.path.exists(db_file):
            with open(db_file, 'rb') as f:
                self.data = pickle.load(f)
    def save(self):
        with open(self.db_file, 'wb') as f:
            pickle.dump(self.data, f)
    def set(self, key, value):
        self.data[key] = value
        self.save()
    def get(self, key):
        return self.data.get(key)
# Using our simple database
db = SimpleDB()
db.set('user1', {'name': 'Eva', 'email': '[email protected]'})
print(f"Retrieved: {db.get('user1')}")
The SimpleDB class shows how to build a persistent key-value store. When you create an instance, its __init__ method loads data from a pickle file if one exists, or it starts with an empty dictionary. This ensures your data is ready for the session.
- The set() method adds or updates an entry in the in-memory dictionary and immediately calls save() to write the entire dataset back to the file.
- The get() method retrieves an item by its key, giving you quick access to the stored data.
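Because save() rewrites the whole file on every call, a crash in the middle of a write could leave a truncated database behind. One possible hardening, sketched here with a hypothetical atomic_pickle_dump helper that is not part of the original class, is to write to a temporary file in the same directory and then swap it into place with os.replace().

```python
import os
import pickle
import tempfile

def atomic_pickle_dump(obj, path):
    """Write obj to path so readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'wb') as f:
            pickle.dump(obj, f)
        # Swap the finished file into place in a single step
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong
        os.unlink(tmp_path)
        raise

with tempfile.TemporaryDirectory() as tmp:
    db_file = os.path.join(tmp, 'simple_db.pkl')
    atomic_pickle_dump({'user1': {'name': 'Eva'}}, db_file)
    with open(db_file, 'rb') as f:
        stored = pickle.load(f)

print(stored)
```

A save() method could call this helper instead of opening the database file directly; the temporary file lives in the same directory so the final rename stays on one filesystem.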
Get started with Replit
Turn your knowledge into a real tool. Tell Replit Agent: "Create a CLI tool that saves user settings with pickle" or "Build a script that caches expensive function results to a file."
Replit Agent writes the code, tests for errors, and deploys your application right from your browser. Start building with Replit.
Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.