For the benefit of object persistence, the pickle module
supports the notion of a reference to an object outside the pickled
data stream. Such objects are referenced by a "persistent id",
which is just an arbitrary string of printable ASCII characters.
The resolution of such names is not defined by the pickle
module; it delegates this resolution to user-defined functions on
the pickler and unpickler.
To define external persistent id resolution, you need to set the
persistent_id attribute of the pickler object and the
persistent_load attribute of the unpickler object.
To pickle objects that have an external persistent id, the pickler
must have a custom persistent_id() method that takes an
object as an argument and returns either None or the persistent
id for that object. When None is returned, the pickler simply
pickles the object as normal. When a persistent id string is
returned, the pickler will pickle that string, along with a marker
so that the unpickler will recognize the string as a persistent id.
To unpickle external objects, the unpickler must have a custom
persistent_load() function that takes a persistent id
string and returns the referenced object.
Here's a silly example that might shed more light:
import pickle
from cStringIO import StringIO

src = StringIO()
p = pickle.Pickler(src)

def persistent_id(obj):
    if hasattr(obj, 'x'):
        return 'the value %d' % obj.x
    else:
        return None

p.persistent_id = persistent_id

class Integer:
    def __init__(self, x):
        self.x = x
    def __str__(self):
        return 'My name is integer %d' % self.x

i = Integer(7)
print i
p.dump(i)

datastream = src.getvalue()
print repr(datastream)
dst = StringIO(datastream)

up = pickle.Unpickler(dst)

class FancyInteger(Integer):
    def __str__(self):
        return 'I am the integer %d' % self.x

def persistent_load(persid):
    if persid.startswith('the value '):
        value = int(persid.split()[2])
        return FancyInteger(value)
    else:
        raise pickle.UnpicklingError, 'Invalid persistent id'

up.persistent_load = persistent_load

j = up.load()
print j
In the cPickle module, the unpickler's
persistent_load attribute can also be set to a Python
list, in which case, when the unpickler reaches a persistent id, the
persistent id string will simply be appended to this list. This
functionality exists so that a pickle data stream can be "sniffed"
for object references without actually instantiating all the objects
in a pickle. Setting
persistent_load to a list is usually used in conjunction with
the noload() method on the Unpickler.
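The list form of persistent_load and the noload() method are specific
to cPickle. Where they are not available, the same idea can be roughly
approximated with a persistent_load() hook that records each persistent
id instead of resolving it. The following is only a sketch under that
assumption, written for a current Python 3 interpreter: the names
Document, RefPickler, SniffingUnpickler and the 'doc:' prefix are
hypothetical, and unlike noload() the ordinary objects in the stream are
still constructed.

```python
import io
import pickle


class Document(object):
    """Hypothetical object that is stored outside the pickle stream."""
    def __init__(self, key):
        self.key = key


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Emit a persistent id for Document instances, pickle the rest normally.
        if isinstance(obj, Document):
            return 'doc:%s' % obj.key
        return None


class SniffingUnpickler(pickle.Unpickler):
    def __init__(self, file):
        pickle.Unpickler.__init__(self, file)
        self.seen_ids = []

    def persistent_load(self, persid):
        # Record the id and return a placeholder instead of the real object.
        self.seen_ids.append(persid)
        return None


buf = io.BytesIO()
RefPickler(buf).dump({'first': Document('a'), 'second': Document('b')})
buf.seek(0)

sniffer = SniffingUnpickler(buf)
sniffer.load()  # the result is discarded; only the recorded ids matter
```

After the load, sniffer.seen_ids holds every persistent id found in the
stream, and no Document was looked up or created on the reading side.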
The actual mechanism for
associating these user-defined functions is slightly different for
pickle and cPickle. The description given here
works the same for both implementations. Users of the pickle
module could also use subclassing to effect the same results,
overriding the persistent_id() and persistent_load()
methods in the derived classes.
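The subclassing approach can be sketched as follows. This is a minimal
illustration rather than a definitive implementation: it assumes a
current Python 3 interpreter (hence io.BytesIO as the stream), and the
class names ExternalPickler and ExternalUnpickler are made up for the
example.

```python
import io
import pickle


class Integer(object):
    """Toy object that is referenced by persistent id, not pickled."""
    def __init__(self, x):
        self.x = x


class ExternalPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Return a persistent id string for external objects, None otherwise.
        if isinstance(obj, Integer):
            return 'the value %d' % obj.x
        return None


class ExternalUnpickler(pickle.Unpickler):
    def persistent_load(self, persid):
        # Resolve a persistent id string back into an object.
        if persid.startswith('the value '):
            return Integer(int(persid.split()[2]))
        raise pickle.UnpicklingError('Invalid persistent id')


buf = io.BytesIO()
ExternalPickler(buf).dump([Integer(7), 'plain string'])
buf.seek(0)
restored = ExternalUnpickler(buf).load()
```

Here the list and the plain string are pickled as usual, because
persistent_id() returns None for them, while the Integer instance
travels through the stream only as its persistent id and is rebuilt by
persistent_load().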