How to Use Pickle for Object Serialization in Python

Abdou Rockikz · 03 nov 2019


Object serialization is the process of translating data structures or object state into a format that can be stored in a file or transmitted and reconstructed later. In this tutorial, you will learn how to use the built-in pickle module to serialize and deserialize objects in Python.

Serialization in Python is often called pickling. Pickling is simply the process whereby a Python object hierarchy is converted into a byte stream, and unpickling is the inverse operation.


Let's start off by pickling basic Python data structures:

import pickle

# define any Python data structure including lists, sets, tuples, dicts, etc.
l = list(range(10000))

I used a list of 10000 elements here just for demonstration purposes; you can use any picklable Python object. The code below saves this list to a file:

# save it to a file
with open("list.pickle", "wb") as file:
    pickle.dump(l, file)

pickle.dump(obj, file) writes a pickled representation of obj (in this case, the list) to the open file, which must be opened in binary write mode ("wb"). Let's load the object again:

# load it again
with open("list.pickle", "rb") as file:
    unpickled_l = pickle.load(file)

pickle.load(file) reads the pickle data from a file opened in binary read mode ("rb") and returns the reconstructed object. Let's compare the original and the unpickled objects:

print("unpickled_l == l: ", unpickled_l == l)
print("unpickled_l is l: ", unpickled_l is l)

Output:

unpickled_l == l:  True
unpickled_l is l:  False

This makes sense: the unpickled list holds the same values as the original (they are equal), but the two are not identical. The unpickled list lives at a different place in memory, so it is a copy of the original object.
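To make the "copy" point concrete, here is a minimal sketch. It assumes a small hypothetical dictionary instead of the list above, and it uses pickle.dumps and pickle.loads (the in-memory counterparts of dump and load) to avoid touching the disk:

```python
import pickle

# Hypothetical sample data with a nested, mutable list.
original = {"name": "John", "scores": [90, 85]}

# Round-trip through pickle entirely in memory.
clone = pickle.loads(pickle.dumps(original))

# Mutating the clone, including its nested list, leaves the original untouched.
clone["scores"].append(70)

print(clone == original)                      # False after the mutation
print(clone["scores"] is original["scores"])  # False: the nested list was copied too
print(original["scores"])                     # [90, 85]
```

Nested objects are copied as well, so a pickle round trip behaves much like a deep copy of the data.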

You can also save and load instances of user-defined classes. For instance, let's define a simple Person class:

class Person:
    def __init__(self, first_name, last_name, age, gender):
        self.first_name = first_name
        self.last_name = last_name
        self.age = age
        self.gender = gender

    def __str__(self):
        return f"<Person name={self.first_name} {self.last_name}, age={self.age}, gender={self.gender}>"

p = Person("John", "Doe", 99, "Male")

Let's repeat the same process:

# save the object
with open("person.pickle", "wb") as file:
    pickle.dump(p, file)

# load the object
with open("person.pickle", "rb") as file:
    p2 = pickle.load(file)

print(p)
print(p2)

This outputs:

<Person name=John Doe, age=99, gender=Male>
<Person name=John Doe, age=99, gender=Male>

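In general, to unpickle an instance of a user-defined class, the class must be importable under the same module and name it had when the object was pickled (for a script, that means it must be defined in the current scope). Otherwise, unpickling raises an error, typically an AttributeError or a ModuleNotFoundError.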
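The failure mode can be reproduced in a single script. The sketch below is only an illustration: the module name shapes_demo and the Point class are hypothetical, and the module is built in memory with types.ModuleType so that nothing needs to exist on disk:

```python
import pickle
import sys
import types

# Hypothetical module "shapes_demo", created in memory for this demo only.
shapes_demo = types.ModuleType("shapes_demo")
sys.modules["shapes_demo"] = shapes_demo


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


# Pretend Point was defined inside shapes_demo.
Point.__module__ = "shapes_demo"
Point.__qualname__ = "Point"
shapes_demo.Point = Point

data = pickle.dumps(Point(1, 2))

# The class is reachable as shapes_demo.Point, so unpickling succeeds.
restored = pickle.loads(data)
print(restored.x, restored.y)

# Remove the class from the module and try again.
del shapes_demo.Point
failure = None
try:
    pickle.loads(data)
except AttributeError as error:
    failure = error
    print("Unpickling failed:", error)
```

The pickle only records where to find the class, not the class itself, which is why the second load fails once the name is gone.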

On the other hand, if you unpickle a NumPy array (or any other object whose class lives in a module you have installed), pickle automatically imports that module and loads the object for you.
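A quick way to see this is to look for the module name inside the pickled bytes. The sketch below uses fractions.Fraction from the standard library as a stand-in for a NumPy array, so it runs without third-party packages; the same idea should apply to any class from an installed module:

```python
import pickle
from fractions import Fraction

half = Fraction(1, 2)
data = pickle.dumps(half)

# The byte stream names the defining module and class instead of embedding the class code.
print(b"fractions" in data, b"Fraction" in data)  # True True

# On load, pickle imports that module (if needed) and rebuilds the object.
restored = pickle.loads(data)
print(restored == half, type(restored) is Fraction)  # True True
```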

You can also use the pickle.dumps(obj) function, which returns the pickled representation of the object as a bytes object, so you can encrypt it, send it over a network, and so on. The code below pickles and unpickles the previous object using the pickle.dumps(obj) and pickle.loads(data) functions:

# get the dumped bytes
dumped_p = pickle.dumps(p)
print(dumped_p)

# write them to a file
with open("person.pickle", "wb") as file:
    file.write(dumped_p)

# load it
with open("person.pickle", "rb") as file:
    p2 = pickle.loads(file.read())

Take a look at the bytes representation of that object:

b'\x80\x03c__main__\nPerson\nq\x00)\x81q\x01}q\x02(X\n\x00\x00\x00first_nameq\x03X\x04\x00\x00\x00Johnq\x04X\t\x00\x00\x00last_nameq\x05X\x03\x00\x00\x00Doeq\x06X\x03\x00\x00\x00ageq\x07KcX\x06\x00\x00\x00genderq\x08X\x04\x00\x00\x00Maleq\tub.'

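As you can see, it is not human-readable, because pickle uses a binary format. The exact bytes depend on the pickle protocol version, so your output may look different on a newer Python.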
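If you are curious about what those bytes mean, the standard library's pickletools module can decode them. The sketch below assumes a small hypothetical dictionary and explicitly requests protocol 3, which is what the leading \x80\x03 in the output above indicates:

```python
import pickle
import pickletools

# Hypothetical sample data; protocol=3 matches the byte string shown in this tutorial.
data = pickle.dumps({"name": "John", "age": 99}, protocol=3)

# The first two bytes are the PROTO opcode and the protocol number.
print(data[:2])  # b'\x80\x03'

# genops yields one (opcode, argument, position) triple per instruction.
opcode_names = [opcode.name for opcode, arg, pos in pickletools.genops(data)]
print(opcode_names[0], opcode_names[-1])  # PROTO STOP

# Print a full, annotated disassembly of the pickle.
pickletools.dis(data)
```

Each pickle is a small program for a stack machine, and it always ends with the STOP opcode (the trailing "." in the byte string).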

Finally, here is the list of objects you can pickle and unpickle:

  • None.
  • Boolean variables (True and False).
  • Integers, floating point numbers and complex numbers.
  • Strings, bytes, bytearrays.
  • Tuples, lists, sets and dictionaries containing only picklable objects.
  • Functions defined at the top level of a module (using def, not lambda).
  • Built-in functions defined at the top level of a module (such as max, min, len, etc.).
  • Classes that are defined at the top level of a module, and instances of such classes whose attributes are picklable.
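The rules for functions in the list above can be checked with a short sketch. It uses len and json.dumps purely as convenient examples of a built-in function and a top-level def function, and it catches both PicklingError and AttributeError because the exact exception can vary between Python versions:

```python
import json
import pickle

# Functions are pickled by reference: only the module and name are stored,
# so unpickling returns the very same function object.
same_builtin = pickle.loads(pickle.dumps(len))
same_function = pickle.loads(pickle.dumps(json.dumps))
print(same_builtin is len, same_function is json.dumps)  # True True

# A lambda has no importable name, so it cannot be pickled.
double = lambda x: x * 2
lambda_error = None
try:
    pickle.dumps(double)
except (pickle.PicklingError, AttributeError) as error:
    lambda_error = error
    print("Cannot pickle lambda:", error)
```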

See the official Python documentation for more information.

Happy Coding ♥
