Most of the security issues surrounding the pickle and
cPickle modules involve unpickling. There are no known
security vulnerabilities related to pickling, because you (the
programmer) control the objects that pickle will interact with, and
all it produces is a string.
Unpickling is a different matter: it is never a good idea to unpickle
a string from an untrusted or unknown source, for example, a string
read from a socket. This is because unpickling can create unexpected
objects and even potentially run methods of those objects, such as
their class constructor or destructor.
You can defend against this by customizing your unpickler so that you
can control exactly what gets unpickled and what gets called.
Unfortunately, exactly how you do this is different depending on
whether you're using pickle or cPickle.
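To make the risk concrete, here is a minimal sketch of a pickle string that names a callable and asks the unpickler to call it. It assumes a current Python 3 interpreter, which differs from the pickle and cPickle modules this section describes and does not apply the checks discussed here. The payload is hand-written for illustration and deliberately calls something harmless (os.getcwd()).

```python
import os
import pickle

# Hand-written protocol 0 pickle (illustrative, not taken from the text):
#   c    GLOBAL: read two lines, the module name "os" and the name "getcwd"
#   (t   MARK + TUPLE: build an empty argument tuple
#   R    REDUCE: call the looked-up callable with that tuple
#   .    STOP: return whatever is on top of the stack
payload = b"cos\ngetcwd\n(tR."

# The unpickler imports os and calls os.getcwd() because the data says so.
# Nothing in this program asked for that call explicitly.
result = pickle.loads(payload)
print(result == os.getcwd())
```

An attacker who controls the string could name a far less benign callable in the same way, which is why the source of the data matters more than its apparent content.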
One common feature that both modules implement is the
__safe_for_unpickling__ attribute. Before calling a callable
which is not a class, the unpickler will check to make sure that the
callable has either been registered as a safe callable via the
copy_reg module, or that it has an
attribute __safe_for_unpickling__ with a true value. This
prevents the unpickling environment from being tricked into doing
evil things like calling os.unlink() with an arbitrary file name.
See section 3.14.5 for more details.
For safely unpickling class instances, you need to control exactly
which classes will get created. Be aware that a class's constructor
could be called (if the pickler found a __getinitargs__()
method) and the class's destructor (i.e. its __del__() method)
might get called when the object is garbage collected. Depending on
the class, it isn't very hard to trick either method into doing bad
things, such as removing a file. The way to
control the classes that are safe to instantiate differs in
pickle and cPickle.
In the pickle module, you need to derive a subclass from
Unpickler, overriding the load_global()
method. load_global() should read two lines from the pickle
data stream, where the first line will be the name of the module
containing the class and the second line will be the name of the
instance's class. It then looks up the class, possibly importing the
module and digging out the attribute, and appends what it finds to
the unpickler's stack. Later on, this class will be assigned to the
__class__ attribute of an empty class, as a way of magically
creating an instance without calling its class's __init__().
Your job (should you choose to accept it) would be to have
load_global() push onto the unpickler's stack a known safe
version of any class you deem safe to unpickle. It is up to you to
produce such a class. Or you could raise an error if you want to
disallow all unpickling of instances. If this sounds like a hack,
you're right. UTSL (use the source, Luke).
Things are a little cleaner with cPickle, but not by much.
To control what gets unpickled, you can set the unpickler's
find_global attribute to a function or None. If it is
None then any attempts to unpickle instances will raise an
UnpicklingError. If it is a function,
then it should accept a module name and a class name, and return the
corresponding class object. It is responsible for looking up the
class, again performing any necessary imports, and it may raise an
error to prevent instances of the class from being unpickled.
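The lookup function described above can be sketched as an allow-list check. The find_global(module_name, class_name) function below follows the contract given in the text: return the class, or raise to refuse. Everything else is an assumption for illustration: the SAFE_CLASSES set and the restricted_loads() helper are hypothetical names, and the function is wired in through Python 3's pickle.Unpickler.find_class() hook rather than cPickle's find_global attribute, since cPickle is not available on current interpreters.

```python
import importlib
import io
import pickle
from collections import OrderedDict

# Hypothetical allow-list: (module name, class name) pairs deemed safe.
SAFE_CLASSES = {("collections", "OrderedDict")}


def find_global(module_name, class_name):
    """Return the named class if it is allow-listed, otherwise raise."""
    if (module_name, class_name) not in SAFE_CLASSES:
        raise pickle.UnpicklingError(
            "unpickling %s.%s is not allowed" % (module_name, class_name))
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


class RestrictedUnpickler(pickle.Unpickler):
    # Assumption: Python 3, where find_class() is the documented hook that
    # receives the module name and the class name for every global lookup.
    def find_class(self, module, name):
        return find_global(module, name)


def restricted_loads(data):
    """Unpickle data, resolving globals only through find_global()."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# An allow-listed class round-trips normally.
ok = restricted_loads(pickle.dumps(OrderedDict(a=1)))

# A payload naming any other global is refused before anything is called.
try:
    restricted_loads(b"cos\ngetcwd\n(tR.")
    blocked = None
except pickle.UnpicklingError as exc:
    blocked = str(exc)
```

Raising from the lookup function stops the load before the named callable is ever fetched, so the check does not depend on what that callable would have done.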
The moral of the story is that you should be really careful about the
source of the strings your application unpickles.
A special note of
caution is worth raising about the Cookie
module. By default, the Cookie.Cookie class is an alias for
the Cookie.SmartCookie class, which ``helpfully'' attempts to
unpickle any cookie data string it is passed. This is a huge security
hole because cookie data typically comes from an untrusted source.
You should either explicitly use the Cookie.SimpleCookie class
-- which doesn't attempt to unpickle its string -- or you should
implement the defensive programming steps described in this
section.
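As a minimal sketch of the first option, the snippet below parses a cookie header with SimpleCookie only. The header string is made up for illustration, and the import fallback is an assumption that lets the same code run where the module has been renamed to http.cookies (Python 3).

```python
try:
    # Assumption: Python 3, where the module lives at http.cookies.
    from http.cookies import SimpleCookie
except ImportError:
    # Module name used in the text above.
    from Cookie import SimpleCookie

# Hypothetical cookie header, as it might arrive from a client.
raw_header = "session=abc123; theme=dark"

cookie = SimpleCookie()
cookie.load(raw_header)

# Values come back as plain strings; nothing is unpickled.
session_value = cookie["session"].value
theme_value = cookie["theme"].value
print(session_value, theme_value)
```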
A word of caution: the
mechanisms described here use internal attributes and methods, which
are subject to change in future versions of Python. We intend to
someday provide a common interface for controlling this behavior,
which will work in either pickle or cPickle.