mirror of
https://github.com/python/cpython.git
synced 2024-12-01 03:01:36 +01:00
7696344182
untrusted data.
101 lines
4.6 KiB
TeX
101 lines
4.6 KiB
TeX
\section{\module{marshal} ---
|
|
Internal Python object serialization}
|
|
|
|
\declaremodule{builtin}{marshal}
|
|
\modulesynopsis{Convert Python objects to streams of bytes and back
|
|
(with different constraints).}
|
|
|
|
|
|
This module contains functions that can read and write Python
|
|
values in a binary format. The format is specific to Python, but
|
|
independent of machine architecture issues (e.g., you can write a
|
|
Python value to a file on a PC, transport the file to a Sun, and read
|
|
it back there). Details of the format are undocumented on purpose;
|
|
it may change between Python versions (although it rarely
|
|
does).\footnote{The name of this module stems from a bit of
|
|
terminology used by the designers of Modula-3 (amongst others), who
|
|
use the term ``marshalling'' for shipping of data around in a
|
|
self-contained form. Strictly speaking, ``to marshal'' means to
|
|
convert some data from internal to external form (in an RPC buffer for
|
|
instance) and ``unmarshalling'' for the reverse process.}
|
|
|
|
This is not a general ``persistence'' module. For general persistence
|
|
and transfer of Python objects through RPC calls, see the modules
|
|
\refmodule{pickle} and \refmodule{shelve}. The \module{marshal} module exists
|
|
mainly to support reading and writing the ``pseudo-compiled'' code for
|
|
Python modules of \file{.pyc} files. Therefore, the Python
|
|
maintainers reserve the right to modify the marshal format in backward
|
|
incompatible ways should the need arise. If you're serializing and
|
|
de-serializing Python objects, use the \module{pickle} module instead.
|
|
\refstmodindex{pickle}
|
|
\refstmodindex{shelve}
|
|
\obindex{code}
|
|
|
|
\begin{notice}[warning]
|
|
The \module{marshal} module is not intended to be secure against
|
|
erroneous or maliciously constructed data. Never unmarshal data
|
|
received from an untrusted or unauthenticated source.
|
|
\end{notice}
|
|
|
|
Not all Python object types are supported; in general, only objects
|
|
whose value is independent from a particular invocation of Python can
|
|
be written and read by this module. The following types are supported:
|
|
\code{None}, integers, long integers, floating point numbers,
|
|
strings, Unicode objects, tuples, lists, dictionaries, and code
|
|
objects, where it should be understood that tuples, lists and
|
|
dictionaries are only supported as long as the values contained
|
|
therein are themselves supported; and recursive lists and dictionaries
|
|
should not be written (they will cause infinite loops).
|
|
|
|
\strong{Caveat:} On machines where C's \code{long int} type has more than
|
|
32 bits (such as the DEC Alpha), it is possible to create plain Python
|
|
integers that are longer than 32 bits.
|
|
If such an integer is marshaled and read back in on a machine where
|
|
C's \code{long int} type has only 32 bits, a Python long integer object
|
|
is returned instead. While of a different type, the numeric value is
|
|
the same. (This behavior is new in Python 2.2. In earlier versions,
|
|
all but the least-significant 32 bits of the value were lost, and a
|
|
warning message was printed.)
|
|
|
|
There are functions that read/write files as well as functions
|
|
operating on strings.
|
|
|
|
The module defines these functions:
|
|
|
|
\begin{funcdesc}{dump}{value, file}
|
|
Write the value on the open file. The value must be a supported
|
|
type. The file must be an open file object such as
|
|
\code{sys.stdout} or returned by \function{open()} or
|
|
\function{posix.popen()}. It must be opened in binary mode
|
|
(\code{'wb'} or \code{'w+b'}).
|
|
|
|
If the value has (or contains an object that has) an unsupported type,
|
|
a \exception{ValueError} exception is raised --- but garbage data
|
|
will also be written to the file. The object will not be properly
|
|
read back by \function{load()}.
|
|
\end{funcdesc}
|
|
|
|
\begin{funcdesc}{load}{file}
|
|
Read one value from the open file and return it. If no valid value
|
|
is read, raise \exception{EOFError}, \exception{ValueError} or
|
|
\exception{TypeError}. The file must be an open file object opened
|
|
in binary mode (\code{'rb'} or \code{'r+b'}).
|
|
|
|
\warning{If an object containing an unsupported type was
|
|
marshalled with \function{dump()}, \function{load()} will substitute
|
|
\code{None} for the unmarshallable type.}
|
|
\end{funcdesc}
|
|
|
|
\begin{funcdesc}{dumps}{value}
|
|
Return the string that would be written to a file by
|
|
\code{dump(\var{value}, \var{file})}. The value must be a supported
|
|
type. Raise a \exception{ValueError} exception if value has (or
|
|
contains an object that has) an unsupported type.
|
|
\end{funcdesc}
|
|
|
|
\begin{funcdesc}{loads}{string}
|
|
Convert the string to a value. If no valid value is found, raise
|
|
\exception{EOFError}, \exception{ValueError} or
|
|
\exception{TypeError}. Extra characters in the string are ignored.
|
|
\end{funcdesc}
|