Move the 2.6 reST doc tree in place.

Georg Brandl 2007-08-15 14:28:01 +00:00
parent 4fc61114e5
commit bc32dde2a5
445 changed files with 0 additions and 136056 deletions


@@ -1,196 +0,0 @@
Contributors to the Python Documentation
----------------------------------------
This file lists people who have contributed in some way to the Python
documentation. It is probably not complete -- if you feel that you or
anyone else should be on this list, please let us know (send email to
docs@python.org), and we'll be glad to correct the problem.
* Aahz
* Michael Abbott
* Steve Alexander
* Jim Ahlstrom
* Fred Allen
* A. Amoroso
* Pehr Anderson
* Oliver Andrich
* Jesús Cea Avión
* Daniel Barclay
* Chris Barker
* Don Bashford
* Anthony Baxter
* Bennett Benson
* Jonathan Black
* Robin Boerdijk
* Michal Bozon
* Aaron Brancotti
* Georg Brandl
* Keith Briggs
* Lee Busby
* Lorenzo M. Catucci
* Carl Cerecke
* Mauro Cicognini
* Gilles Civario
* Mike Clarkson
* Steve Clift
* Dave Cole
* Matthew Cowles
* Jeremy Craven
* Andrew Dalke
* Ben Darnell
* L. Peter Deutsch
* Robert Donohue
* Fred L. Drake, Jr.
* Jeff Epler
* Michael Ernst
* Blame Andy Eskilsson
* Carey Evans
* Martijn Faassen
* Carl Feynman
* Hernán Martínez Foffani
* Stefan Franke
* Jim Fulton
* Peter Funk
* Lele Gaifax
* Matthew Gallagher
* Ben Gertzfield
* Nadim Ghaznavi
* Jonathan Giddy
* Shelley Gooch
* Nathaniel Gray
* Grant Griffin
* Thomas Guettler
* Anders Hammarquist
* Mark Hammond
* Harald Hanche-Olsen
* Manus Hand
* Gerhard Häring
* Travis B. Hartwell
* Tim Hatch
* Janko Hauser
* Bernhard Herzog
* Magnus L. Hetland
* Konrad Hinsen
* Stefan Hoffmeister
* Albert Hofkamp
* Gregor Hoffleit
* Steve Holden
* Thomas Holenstein
* Gerrit Holl
* Rob Hooft
* Brian Hooper
* Randall Hopper
* Michael Hudson
* Eric Huss
* Jeremy Hylton
* Roger Irwin
* Jack Jansen
* Philip H. Jensen
* Pedro Diaz Jimenez
* Kent Johnson
* Lucas de Jonge
* Andreas Jung
* Robert Kern
* Jim Kerr
* Jan Kim
* Greg Kochanski
* Guido Kollerie
* Peter A. Koren
* Daniel Kozan
* Andrew M. Kuchling
* Dave Kuhlman
* Erno Kuusela
* Detlef Lannert
* Piers Lauder
* Glyph Lefkowitz
* Marc-André Lemburg
* Ulf A. Lindgren
* Everett Lipman
* Mirko Liss
* Martin von Löwis
* Fredrik Lundh
* Jeff MacDonald
* John Machin
* Andrew MacIntyre
* Vladimir Marangozov
* Vincent Marchetti
* Laura Matson
* Daniel May
* Doug Mennella
* Paolo Milani
* Skip Montanaro
* Paul Moore
* Ross Moore
* Sjoerd Mullender
* Dale Nagata
* Ng Pheng Siong
* Koray Oner
* Tomas Oppelstrup
* Denis S. Otkidach
* Zooko O'Whielacronx
* William Park
* Joonas Paalasmaa
* Harri Pasanen
* Bo Peng
* Tim Peters
* Christopher Petrilli
* Justin D. Pettit
* Chris Phoenix
* François Pinard
* Paul Prescod
* Eric S. Raymond
* Edward K. Ream
* Sean Reifschneider
* Bernhard Reiter
* Armin Rigo
* Wes Rishel
* Jim Roskind
* Guido van Rossum
* Donald Wallace Rouse II
* Nick Russo
* Chris Ryland
* Constantina S.
* Hugh Sasse
* Bob Savage
* Scott Schram
* Neil Schemenauer
* Barry Scott
* Joakim Sernbrant
* Justin Sheehy
* Michael Simcich
* Ionel Simionescu
* Gregory P. Smith
* Roy Smith
* Clay Spence
* Nicholas Spies
* Tage Stabell-Kulo
* Frank Stajano
* Anthony Starks
* Greg Stein
* Peter Stoehr
* Mark Summerfield
* Reuben Sumner
* Kalle Svensson
* Jim Tittsler
* Ville Vainio
* Martijn Vries
* Charles G. Waldman
* Greg Ward
* Barry Warsaw
* Corran Webster
* Glyn Webster
* Bob Weiner
* Eddy Welbourne
* Mats Wichmann
* Gerry Wiener
* Timothy Wild
* Collin Winter
* Blake Winton
* Dan Wolfe
* Steven Work
* Thomas Wouters
* Ka-Ping Yee
* Rory Yorke
* Moshe Zadka
* Milan Zamazal
* Cheng Zhang


@@ -1,62 +0,0 @@
#
# Makefile for Python documentation
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#
# You can set these variables from the command line.
PYTHON ?= python
SVNROOT ?= http://svn.python.org/projects
SPHINXOPTS ?=
ALLSPHINXOPTS = -b$(BUILDER) -dbuild/doctrees $(SPHINXOPTS) . build/$(BUILDER)
.PHONY: help checkout update build html web htmlhelp clean
help:
@echo "Please use \`make <target>' where <target> is one of"
@echo " html to make standalone HTML files"
@echo " web to make file usable by Sphinx.web"
@echo " htmlhelp to make HTML files and a HTML help project"
checkout:
@if [ ! -d tools/sphinx ]; then \
echo "Checking out Sphinx..."; \
svn checkout $(SVNROOT)/doctools/trunk/sphinx tools/sphinx; \
fi
@if [ ! -d tools/docutils ]; then \
echo "Checking out Docutils..."; \
svn checkout $(SVNROOT)/external/docutils-0.4/docutils tools/docutils; \
fi
@if [ ! -d tools/pygments ]; then \
echo "Checking out Pygments..."; \
svn checkout $(SVNROOT)/external/Pygments-0.8.1/pygments tools/pygments; \
fi
update:
svn update tools/sphinx
svn update tools/docutils
svn update tools/pygments
build: checkout
mkdir -p build/$(BUILDER) build/doctrees
$(PYTHON) tools/sphinx-build.py $(ALLSPHINXOPTS)
@echo
html: BUILDER = html
html: build
@echo "Build finished. The HTML pages are in build/html."
web: BUILDER = web
web: build
@echo "Build finished; now you can run"
@echo " PYTHONPATH=tools $(PYTHON) -m sphinx.web build/web"
@echo "to start the server."
htmlhelp: BUILDER = htmlhelp
htmlhelp: build
@echo "Build finished; now you can run HTML Help Workshop with the" \
"build/hhp/pydoc.hhp project file."
clean:
-rm -rf build/*
-rm -rf tools/sphinx


@@ -1,121 +0,0 @@
Python Documentation README
~~~~~~~~~~~~~~~~~~~~~~~~~~~
This directory contains the reStructuredText (reST) sources to the Python
documentation. You don't need to build them yourself; prebuilt versions are
available at http://docs.python.org/download/.
Documentation on authoring Python documentation, including information about
both style and markup, is available in the "Documenting Python" chapter of the
documentation. There's also a chapter intended to point out differences to
those familiar with the previous docs written in LaTeX.
Building the docs
=================
You need to install Python 2.5 or higher; the toolset used to build the docs is
written in Python. The toolset is called *Sphinx*; it is not included in this
tree, but maintained separately in the Python Subversion repository. Also
needed are Jinja, a templating engine (included in Sphinx as a Subversion
external), and optionally Pygments, a code highlighter.
Using make
----------
Luckily, a Makefile has been prepared so that on Unix, provided you have
installed Python and Subversion, you can just run ::
make html
to check out the necessary toolset in the `tools/` subdirectory and build the
HTML output files. To view the generated HTML, point your favorite browser at
the top-level index `build/html/index.html` after running "make".
Available make targets are:
* "html", which builds standalone HTML files for offline viewing.
* "web", which builds files usable with the Sphinx.web application (used to
serve the docs online at http://docs.python.org/).
* "htmlhelp", which builds HTML files and a HTML Help project file usable to
convert them into a single Compiled HTML (.chm) file -- these are popular
under Microsoft Windows, but very handy on every platform.
To create the CHM file, you need to run the Microsoft HTML Help Workshop
over the generated project (.hhp) file.
A "make update" updates the Subversion checkouts in `tools/`.
Without make
------------
You'll need to check out the Sphinx package into the `tools/` directory::
svn co http://svn.python.org/projects/doctools/trunk/sphinx tools/sphinx
Then, you need to install Docutils 0.4 (the SVN snapshot won't work), either
by checking it out via ::
svn co http://svn.python.org/projects/external/docutils-0.4/docutils tools/docutils
or by installing it from http://docutils.sf.net/.
You can optionally also install Pygments, either as a checkout via ::
svn co http://svn.python.org/projects/external/Pygments-0.8.1/pygments tools/pygments
or from PyPI at http://pypi.python.org/pypi/Pygments.
Then, make an output directory, e.g. under `build/`, and run ::
python tools/sphinx-build.py -b<builder> . build/<outputdirectory>
where `<builder>` is one of html, web or htmlhelp (for explanations see the make
targets above).
Contributing
============
For bugs in the content, the online version at http://docs.python.org/ has a
"suggest change" facility that can be used to correct errors in the source text
and submit them as a patch to the maintainers.
Bugs in the toolset should be reported in the Python bug tracker at
http://bugs.python.org/.
You can also send a mail to the Python Documentation Team at docs@python.org,
and we will process your request as soon as possible.
If you want to help the Documentation Team, you are always welcome. Just send
a mail to docs@python.org.
Copyright notice
================
The Python source is copyrighted, but you can freely use and copy it
as long as you don't change or remove the copyright notice:
----------------------------------------------------------------------
Copyright (c) 2000-2007 Python Software Foundation.
All rights reserved.
Copyright (c) 2000 BeOpen.com.
All rights reserved.
Copyright (c) 1995-2000 Corporation for National Research Initiatives.
All rights reserved.
Copyright (c) 1991-1995 Stichting Mathematisch Centrum.
All rights reserved.
See the file "license.rst" for information on usage and redistribution
of this file, and for a DISCLAIMER OF ALL WARRANTIES.
----------------------------------------------------------------------


@@ -1,6 +0,0 @@
To do
=====
* split very large files and add toctrees
* finish "Documenting Python"
* care about XXX comments


@@ -1,33 +0,0 @@
=====================
About these documents
=====================
These documents are generated from `reStructuredText
<http://docutils.sf.net/rst.html>`_ sources by *Sphinx*, a document processor
specifically written for the Python documentation.
In the online version of these documents, you can submit comments and suggest
changes directly on the documentation pages.
Development of the documentation and its toolchain takes place on the
docs@python.org mailing list. We're always looking for volunteers wanting
to help with the docs, so feel free to send a mail there!
Many thanks go to:
* Fred L. Drake, Jr., the creator of the original Python documentation toolset
and writer of much of the content;
* the `docutils <http://docutils.sf.net/>`_ project for creating
reStructuredText and the docutils suite;
* Fredrik Lundh for his `Alternative Python Reference
<http://effbot.org/zone/pyref.htm>`_ project from which Sphinx got many good
ideas.
See :ref:`reporting-bugs` for information on how to report bugs in Python itself.
.. including the ACKS file here so that it can be maintained separately
.. include:: ACKS.txt
It is only with the input and contributions of the Python community
that Python has such wonderful documentation -- Thank You!


@@ -1,59 +0,0 @@
.. _reporting-bugs:
************************
Reporting Bugs in Python
************************
Python is a mature programming language which has established a reputation for
stability. In order to maintain this reputation, the developers would like to
know of any deficiencies you find in Python.
If you find errors in the documentation, please use either the "Add a comment"
or the "Suggest a change" features of the relevant page in the most recent
online documentation at http://docs.python.org/.
All other bug reports should be submitted via the Python Bug Tracker
(http://bugs.python.org/). The bug tracker offers a Web form which allows
pertinent information to be entered and submitted to the developers.
The first step in filing a report is to determine whether the problem has
already been reported. The advantage in doing so, aside from saving the
developers time, is that you learn what has been done to fix it; it may be that
the problem has already been fixed for the next release, or additional
information is needed (in which case you are welcome to provide it if you can!).
To do this, search the bug database using the search box on the top of the page.
If the problem you're reporting is not already in the bug tracker, go back to
the Python Bug Tracker. If you don't already have a tracker account, select the
"Register" link in the sidebar and undergo the registration procedure.
Otherwise, if you're not logged in, enter your credentials and select "Login".
It is not possible to submit a bug report anonymously.
Once you are logged in, you can submit a bug. Select the "Create New" link in the
sidebar to open the bug reporting form.
The submission form has a number of fields. For the "Title" field, enter a
*very* short description of the problem; less than ten words is good. In the
"Type" field, select the type of your problem; also select the "Component" and
"Versions" to which the bug relates.
In the "Change Note" field, describe the problem in detail, including what you
expected to happen and what did happen. Be sure to include whether any
extension modules were involved, and what hardware and software platform you
were using (including version information as appropriate).
Each bug report will be assigned to a developer who will determine what needs to
be done to correct the problem. You will receive an update each time action is
taken on the bug.
.. seealso::
`How to Report Bugs Effectively <http://www-mice.cs.ucl.ac.uk/multimedia/software/documentation/ReportingBugs.html>`_
Article which goes into some detail about how to create a useful bug report.
This describes what kind of information is useful and why it is useful.
`Bug Writing Guidelines <http://www.mozilla.org/quality/bug-writing-guidelines.html>`_
Information about writing a good bug report. Some of this is specific to the
Mozilla project, but describes general good practices.

File diff suppressed because it is too large

File diff suppressed because it is too large


@@ -1,534 +0,0 @@
.. highlightlang:: c
.. _exceptionhandling:
******************
Exception Handling
******************
The functions described in this chapter will let you handle and raise Python
exceptions. It is important to understand some of the basics of Python
exception handling. It works somewhat like the Unix :cdata:`errno` variable:
there is a global indicator (per thread) of the last error that occurred. Most
functions don't clear this on success, but will set it to indicate the cause of
the error on failure. Most functions also return an error indicator, usually
*NULL* if they are supposed to return a pointer, or ``-1`` if they return an
integer (exception: the :cfunc:`PyArg_\*` functions return ``1`` for success and
``0`` for failure).
When a function must fail because some function it called failed, it generally
doesn't set the error indicator; the function it called already set it. It is
responsible for either handling the error and clearing the exception or
returning after cleaning up any resources it holds (such as object references or
memory allocations); it should *not* continue normally if it is not prepared to
handle the error. If returning due to an error, it is important to indicate to
the caller that an error has been set. If the error is not handled or carefully
propagated, additional calls into the Python/C API may not behave as intended
and may fail in mysterious ways.
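As an illustrative sketch (not part of the original manual), a helper that calls
into the API and propagates failure to its caller might look like this; ``obj``
is assumed to be some object carrying a ``size`` attribute::

   /* Fetch obj.size as a C long; return 0 on success, -1 with the
      error indicator left set on failure. */
   static int
   get_size(PyObject *obj, long *result)
   {
       PyObject *attr = PyObject_GetAttrString(obj, "size");
       if (attr == NULL)
           return -1;              /* error already set by the failed call */
       *result = PyInt_AsLong(attr);
       Py_DECREF(attr);
       if (*result == -1 && PyErr_Occurred())
           return -1;              /* conversion failed; leave the indicator set */
       return 0;
   }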
.. index::
single: exc_type (in module sys)
single: exc_value (in module sys)
single: exc_traceback (in module sys)
The error indicator consists of three Python objects corresponding to the
Python variables ``sys.exc_type``, ``sys.exc_value`` and ``sys.exc_traceback``.
API functions exist to interact with the error indicator in various ways. There
is a separate error indicator for each thread.
.. % XXX Order of these should be more thoughtful.
.. % Either alphabetical or some kind of structure.
.. cfunction:: void PyErr_Print()
Print a standard traceback to ``sys.stderr`` and clear the error indicator.
Call this function only when the error indicator is set. (Otherwise it will
cause a fatal error!)
.. cfunction:: PyObject* PyErr_Occurred()
Test whether the error indicator is set. If set, return the exception *type*
(the first argument to the last call to one of the :cfunc:`PyErr_Set\*`
functions or to :cfunc:`PyErr_Restore`). If not set, return *NULL*. You do not
own a reference to the return value, so you do not need to :cfunc:`Py_DECREF`
it.
.. note::
Do not compare the return value to a specific exception; use
:cfunc:`PyErr_ExceptionMatches` instead, shown below. (The comparison could
easily fail since the exception may be an instance instead of a class, in the
case of a class exception, or it may be a subclass of the expected exception.)
.. cfunction:: int PyErr_ExceptionMatches(PyObject *exc)
Equivalent to ``PyErr_GivenExceptionMatches(PyErr_Occurred(), exc)``. This
should only be called when an exception is actually set; a memory access
violation will occur if no exception has been raised.
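For example (a sketch; ``mapping``, ``key`` and ``default_value`` are
hypothetical objects), code that wants to treat only :exc:`KeyError` specially
could be written as::

   PyObject *item = PyObject_GetItem(mapping, key);
   if (item == NULL) {
       if (!PyErr_ExceptionMatches(PyExc_KeyError))
           return NULL;            /* propagate any other exception */
       PyErr_Clear();              /* missing key: fall back to a default */
       Py_INCREF(default_value);
       item = default_value;
   }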
.. cfunction:: int PyErr_GivenExceptionMatches(PyObject *given, PyObject *exc)
Return true if the *given* exception matches the exception in *exc*. If *exc*
is a class object, this also returns true when *given* is an instance of a
subclass. If *exc* is a tuple, all exceptions in the tuple (and recursively in
subtuples) are searched for a match. If *given* is *NULL*, a memory access
violation will occur.
.. cfunction:: void PyErr_NormalizeException(PyObject**exc, PyObject**val, PyObject**tb)
Under certain circumstances, the values returned by :cfunc:`PyErr_Fetch` below
can be "unnormalized", meaning that ``*exc`` is a class object but ``*val`` is
not an instance of the same class. This function can be used to instantiate
the class in that case. If the values are already normalized, nothing happens.
The delayed normalization is implemented to improve performance.
.. cfunction:: void PyErr_Clear()
Clear the error indicator. If the error indicator is not set, there is no
effect.
.. cfunction:: void PyErr_Fetch(PyObject **ptype, PyObject **pvalue, PyObject **ptraceback)
Retrieve the error indicator into three variables whose addresses are passed.
If the error indicator is not set, set all three variables to *NULL*. If it is
set, it will be cleared and you own a reference to each object retrieved. The
value and traceback object may be *NULL* even when the type object is not.
.. note::
This function is normally only used by code that needs to handle exceptions or
by code that needs to save and restore the error indicator temporarily.
.. cfunction:: void PyErr_Restore(PyObject *type, PyObject *value, PyObject *traceback)
Set the error indicator from the three objects. If the error indicator is
already set, it is cleared first. If the objects are *NULL*, the error
indicator is cleared. Do not pass a *NULL* type and non-*NULL* value or
traceback. The exception type should be a class. Do not pass an invalid
exception type or value. (Violating these rules will cause subtle problems
later.) This call takes away a reference to each object: you must own a
reference to each object before the call and after the call you no longer own
these references. (If you don't understand this, don't use this function. I
warned you.)
.. note::
This function is normally only used by code that needs to save and restore the
error indicator temporarily; use :cfunc:`PyErr_Fetch` to save the current
exception state.
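A minimal sketch of the save-and-restore idiom (``do_cleanup`` is a hypothetical
function that may itself use the Python/C API)::

   PyObject *ptype, *pvalue, *ptraceback;

   PyErr_Fetch(&ptype, &pvalue, &ptraceback);   /* clears the error indicator */
   do_cleanup();                                /* may set and clear its own errors */
   PyErr_Restore(ptype, pvalue, ptraceback);    /* gives the references back */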
.. cfunction:: void PyErr_SetString(PyObject *type, const char *message)
This is the most common way to set the error indicator. The first argument
specifies the exception type; it is normally one of the standard exceptions,
e.g. :cdata:`PyExc_RuntimeError`. You need not increment its reference count.
The second argument is an error message; it is converted to a string object.
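For example, a function that rejects negative arguments might contain
(illustrative only)::

   if (value < 0) {
       PyErr_SetString(PyExc_ValueError, "value must be non-negative");
       return NULL;
   }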
.. cfunction:: void PyErr_SetObject(PyObject *type, PyObject *value)
This function is similar to :cfunc:`PyErr_SetString` but lets you specify an
arbitrary Python object for the "value" of the exception.
.. cfunction:: PyObject* PyErr_Format(PyObject *exception, const char *format, ...)
This function sets the error indicator and returns *NULL*. *exception* should be
a Python exception (class, not an instance). *format* should be a string,
containing format codes, similar to :cfunc:`printf`. The ``width.precision``
before a format code is parsed, but the width part is ignored.
.. % This should be exactly the same as the table in PyString_FromFormat.
.. % One should just refer to the other.
.. % The descriptions for %zd and %zu are wrong, but the truth is complicated
.. % because not all compilers support the %z width modifier -- we fake it
.. % when necessary via interpolating PY_FORMAT_SIZE_T.
.. % %u, %lu, %zu should have "new in Python 2.5" blurbs.
+-------------------+---------------+--------------------------------+
| Format Characters | Type | Comment |
+===================+===============+================================+
| :attr:`%%` | *n/a* | The literal % character. |
+-------------------+---------------+--------------------------------+
| :attr:`%c` | int | A single character, |
| | | represented as a C int. |
+-------------------+---------------+--------------------------------+
| :attr:`%d` | int | Exactly equivalent to |
| | | ``printf("%d")``. |
+-------------------+---------------+--------------------------------+
| :attr:`%u` | unsigned int | Exactly equivalent to |
| | | ``printf("%u")``. |
+-------------------+---------------+--------------------------------+
| :attr:`%ld` | long | Exactly equivalent to |
| | | ``printf("%ld")``. |
+-------------------+---------------+--------------------------------+
| :attr:`%lu` | unsigned long | Exactly equivalent to |
| | | ``printf("%lu")``. |
+-------------------+---------------+--------------------------------+
| :attr:`%zd` | Py_ssize_t | Exactly equivalent to |
| | | ``printf("%zd")``. |
+-------------------+---------------+--------------------------------+
| :attr:`%zu` | size_t | Exactly equivalent to |
| | | ``printf("%zu")``. |
+-------------------+---------------+--------------------------------+
| :attr:`%i` | int | Exactly equivalent to |
| | | ``printf("%i")``. |
+-------------------+---------------+--------------------------------+
| :attr:`%x` | int | Exactly equivalent to |
| | | ``printf("%x")``. |
+-------------------+---------------+--------------------------------+
| :attr:`%s` | char\* | A null-terminated C character |
| | | array. |
+-------------------+---------------+--------------------------------+
| :attr:`%p` | void\* | The hex representation of a C |
| | | pointer. Mostly equivalent to |
| | | ``printf("%p")`` except that |
| | | it is guaranteed to start with |
| | | the literal ``0x`` regardless |
| | | of what the platform's |
| | | ``printf`` yields. |
+-------------------+---------------+--------------------------------+
An unrecognized format character causes all the rest of the format string to be
copied as-is to the result string, and any extra arguments discarded.
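An illustrative use (``index`` and ``size`` are assumed to be
:ctype:`Py_ssize_t` values)::

   if (index >= size) {
       return PyErr_Format(PyExc_IndexError,
                           "index %zd is out of range (size is %zd)",
                           index, size);
   }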
.. cfunction:: void PyErr_SetNone(PyObject *type)
This is a shorthand for ``PyErr_SetObject(type, Py_None)``.
.. cfunction:: int PyErr_BadArgument()
This is a shorthand for ``PyErr_SetString(PyExc_TypeError, message)``, where
*message* indicates that a built-in operation was invoked with an illegal
argument. It is mostly for internal use.
.. cfunction:: PyObject* PyErr_NoMemory()
This is a shorthand for ``PyErr_SetNone(PyExc_MemoryError)``; it returns *NULL*
so an object allocation function can write ``return PyErr_NoMemory();`` when it
runs out of memory.
.. cfunction:: PyObject* PyErr_SetFromErrno(PyObject *type)
.. index:: single: strerror()
This is a convenience function to raise an exception when a C library function
has returned an error and set the C variable :cdata:`errno`. It constructs a
tuple object whose first item is the integer :cdata:`errno` value and whose
second item is the corresponding error message (gotten from :cfunc:`strerror`),
and then calls ``PyErr_SetObject(type, object)``. On Unix, when the
:cdata:`errno` value is :const:`EINTR`, indicating an interrupted system call,
this calls :cfunc:`PyErr_CheckSignals`, and if that set the error indicator,
leaves it set to that. The function always returns *NULL*, so a wrapper
function around a system call can write ``return PyErr_SetFromErrno(type);``
when the system call returns an error.
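For instance, a thin wrapper around a failing C library call might read (a
sketch, inside a function that returns a new Python object on success)::

   FILE *fp = fopen(path, "rb");
   if (fp == NULL)
       return PyErr_SetFromErrno(PyExc_IOError);   /* returns NULL */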
.. cfunction:: PyObject* PyErr_SetFromErrnoWithFilename(PyObject *type, const char *filename)
Similar to :cfunc:`PyErr_SetFromErrno`, with the additional behavior that if
*filename* is not *NULL*, it is passed to the constructor of *type* as a third
parameter. In the case of exceptions such as :exc:`IOError` and :exc:`OSError`,
this is used to define the :attr:`filename` attribute of the exception instance.
.. cfunction:: PyObject* PyErr_SetFromWindowsErr(int ierr)
This is a convenience function to raise :exc:`WindowsError`. If called with
*ierr* of :cdata:`0`, the error code returned by a call to :cfunc:`GetLastError`
is used instead. It calls the Win32 function :cfunc:`FormatMessage` to retrieve
the Windows description of error code given by *ierr* or :cfunc:`GetLastError`,
then it constructs a tuple object whose first item is the *ierr* value and whose
second item is the corresponding error message (gotten from
:cfunc:`FormatMessage`), and then calls ``PyErr_SetObject(PyExc_WindowsError,
object)``. This function always returns *NULL*. Availability: Windows.
.. cfunction:: PyObject* PyErr_SetExcFromWindowsErr(PyObject *type, int ierr)
Similar to :cfunc:`PyErr_SetFromWindowsErr`, with an additional parameter
specifying the exception type to be raised. Availability: Windows.
.. versionadded:: 2.3
.. cfunction:: PyObject* PyErr_SetFromWindowsErrWithFilename(int ierr, const char *filename)
Similar to :cfunc:`PyErr_SetFromWindowsErr`, with the additional behavior that
if *filename* is not *NULL*, it is passed to the constructor of
:exc:`WindowsError` as a third parameter. Availability: Windows.
.. cfunction:: PyObject* PyErr_SetExcFromWindowsErrWithFilename(PyObject *type, int ierr, char *filename)
Similar to :cfunc:`PyErr_SetFromWindowsErrWithFilename`, with an additional
parameter specifying the exception type to be raised. Availability: Windows.
.. versionadded:: 2.3
.. cfunction:: void PyErr_BadInternalCall()
This is a shorthand for ``PyErr_SetString(PyExc_TypeError, message)``, where
*message* indicates that an internal operation (e.g. a Python/C API function)
was invoked with an illegal argument. It is mostly for internal use.
.. cfunction:: int PyErr_WarnEx(PyObject *category, char *message, int stacklevel)
Issue a warning message. The *category* argument is a warning category (see
below) or *NULL*; the *message* argument is a message string. *stacklevel* is a
positive number giving a number of stack frames; the warning will be issued from
the currently executing line of code in that stack frame. A *stacklevel* of 1
is the function calling :cfunc:`PyErr_WarnEx`, 2 is the function above that,
and so forth.
This function normally prints a warning message to *sys.stderr*; however, it is
also possible that the user has specified that warnings are to be turned into
errors, and in that case this will raise an exception. It is also possible that
the function raises an exception because of a problem with the warning machinery
(the implementation imports the :mod:`warnings` module to do the heavy lifting).
The return value is ``0`` if no exception is raised, or ``-1`` if an exception
is raised. (It is not possible to determine whether a warning message is
actually printed, nor what the reason is for the exception; this is
intentional.) If an exception is raised, the caller should do its normal
exception handling (for example, :cfunc:`Py_DECREF` owned references and return
an error value).
Warning categories must be subclasses of :cdata:`Warning`; the default warning
category is :cdata:`RuntimeWarning`. The standard Python warning categories are
available as global variables whose names are ``PyExc_`` followed by the Python
exception name. These have the type :ctype:`PyObject\*`; they are all class
objects. Their names are :cdata:`PyExc_Warning`, :cdata:`PyExc_UserWarning`,
:cdata:`PyExc_UnicodeWarning`, :cdata:`PyExc_DeprecationWarning`,
:cdata:`PyExc_SyntaxWarning`, :cdata:`PyExc_RuntimeWarning`, and
:cdata:`PyExc_FutureWarning`. :cdata:`PyExc_Warning` is a subclass of
:cdata:`PyExc_Exception`; the other warning categories are subclasses of
:cdata:`PyExc_Warning`.
For information about warning control, see the documentation for the
:mod:`warnings` module and the :option:`-W` option in the command line
documentation. There is no C API for warning control.
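A typical call (illustrative; the message and category are arbitrary) checks the
return value and bails out if the warning was turned into an exception::

   if (PyErr_WarnEx(PyExc_DeprecationWarning,
                    "the 'level' argument is deprecated", 1) < 0)
       return NULL;    /* the warning became an exception */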
.. cfunction:: int PyErr_Warn(PyObject *category, char *message)
Issue a warning message. The *category* argument is a warning category (see
below) or *NULL*; the *message* argument is a message string. The warning will
appear to be issued from the function calling :cfunc:`PyErr_Warn`, equivalent to
calling :cfunc:`PyErr_WarnEx` with a *stacklevel* of 1.
Deprecated; use :cfunc:`PyErr_WarnEx` instead.
.. cfunction:: int PyErr_WarnExplicit(PyObject *category, const char *message, const char *filename, int lineno, const char *module, PyObject *registry)
Issue a warning message with explicit control over all warning attributes. This
is a straightforward wrapper around the Python function
:func:`warnings.warn_explicit`, see there for more information. The *module*
and *registry* arguments may be set to *NULL* to get the default effect
described there.
.. cfunction:: int PyErr_CheckSignals()
.. index::
module: signal
single: SIGINT
single: KeyboardInterrupt (built-in exception)
This function interacts with Python's signal handling. It checks whether a
signal has been sent to the processes and if so, invokes the corresponding
signal handler. If the :mod:`signal` module is supported, this can invoke a
signal handler written in Python. In all cases, the default effect for
:const:`SIGINT` is to raise the :exc:`KeyboardInterrupt` exception. If an
exception is raised the error indicator is set and the function returns ``-1``;
otherwise the function returns ``0``. The error indicator may or may not be
cleared if it was previously set.
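For example, a long-running loop in C might poll for signals like this
(``do_step`` and ``n`` are hypothetical; ``do_step`` does not touch Python
objects)::

   Py_ssize_t i;
   for (i = 0; i < n; i++) {
       do_step(i);
       if (PyErr_CheckSignals() != 0)
           return NULL;    /* e.g. KeyboardInterrupt was raised */
   }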
.. cfunction:: void PyErr_SetInterrupt()
.. index::
single: SIGINT
single: KeyboardInterrupt (built-in exception)
This function simulates the effect of a :const:`SIGINT` signal arriving --- the
next time :cfunc:`PyErr_CheckSignals` is called, :exc:`KeyboardInterrupt` will
be raised. It may be called without holding the interpreter lock.
.. % XXX This was described as obsolete, but is used in
.. % thread.interrupt_main() (used from IDLE), so it's still needed.
.. cfunction:: PyObject* PyErr_NewException(char *name, PyObject *base, PyObject *dict)
This utility function creates and returns a new exception object. The *name*
argument must be the name of the new exception, a C string of the form
``module.class``. The *base* and *dict* arguments are normally *NULL*. This
creates a class object derived from :exc:`Exception` (accessible in C as
:cdata:`PyExc_Exception`).
The :attr:`__module__` attribute of the new class is set to the first part (up
to the last dot) of the *name* argument, and the class name is set to the last
part (after the last dot). The *base* argument can be used to specify alternate
base classes; it can either be only one class or a tuple of classes. The *dict*
argument can be used to specify a dictionary of class variables and methods.
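A common pattern (sketched here; the module name ``spam`` and the method table
``SpamMethods`` are hypothetical) is to create a module-specific exception in
the module's init function::

   static PyObject *SpamError;

   PyMODINIT_FUNC
   initspam(void)
   {
       PyObject *m = Py_InitModule("spam", SpamMethods);
       if (m == NULL)
           return;
       SpamError = PyErr_NewException("spam.error", NULL, NULL);
       Py_XINCREF(SpamError);
       PyModule_AddObject(m, "error", SpamError);
   }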
.. cfunction:: void PyErr_WriteUnraisable(PyObject *obj)
This utility function prints a warning message to ``sys.stderr`` when an
exception has been set but it is impossible for the interpreter to actually
raise the exception. It is used, for example, when an exception occurs in an
:meth:`__del__` method.
The function is called with a single argument *obj* that identifies the context
in which the unraisable exception occurred. The repr of *obj* will be printed in
the warning message.
.. _standardexceptions:
Standard Exceptions
===================
All standard Python exceptions are available as global variables whose names are
``PyExc_`` followed by the Python exception name. These have the type
:ctype:`PyObject\*`; they are all class objects. For completeness, here are all
the variables:
+------------------------------------+----------------------------+----------+
| C Name | Python Name | Notes |
+====================================+============================+==========+
| :cdata:`PyExc_BaseException` | :exc:`BaseException` | (1), (4) |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_Exception` | :exc:`Exception` | \(1) |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_StandardError` | :exc:`StandardError` | \(1) |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_ArithmeticError` | :exc:`ArithmeticError` | \(1) |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_LookupError` | :exc:`LookupError` | \(1) |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_AssertionError` | :exc:`AssertionError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_AttributeError` | :exc:`AttributeError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_EOFError` | :exc:`EOFError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_EnvironmentError` | :exc:`EnvironmentError` | \(1) |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_FloatingPointError` | :exc:`FloatingPointError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_IOError` | :exc:`IOError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_ImportError` | :exc:`ImportError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_IndexError` | :exc:`IndexError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_KeyError` | :exc:`KeyError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_KeyboardInterrupt` | :exc:`KeyboardInterrupt` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_MemoryError` | :exc:`MemoryError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_NameError` | :exc:`NameError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_NotImplementedError` | :exc:`NotImplementedError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_OSError` | :exc:`OSError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_OverflowError` | :exc:`OverflowError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_ReferenceError` | :exc:`ReferenceError` | \(2) |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_RuntimeError` | :exc:`RuntimeError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_SyntaxError` | :exc:`SyntaxError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_SystemError` | :exc:`SystemError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_SystemExit` | :exc:`SystemExit` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_TypeError` | :exc:`TypeError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_ValueError` | :exc:`ValueError` | |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_WindowsError` | :exc:`WindowsError` | \(3) |
+------------------------------------+----------------------------+----------+
| :cdata:`PyExc_ZeroDivisionError` | :exc:`ZeroDivisionError` | |
+------------------------------------+----------------------------+----------+
.. index::
single: PyExc_BaseException
single: PyExc_Exception
single: PyExc_StandardError
single: PyExc_ArithmeticError
single: PyExc_LookupError
single: PyExc_AssertionError
single: PyExc_AttributeError
single: PyExc_EOFError
single: PyExc_EnvironmentError
single: PyExc_FloatingPointError
single: PyExc_IOError
single: PyExc_ImportError
single: PyExc_IndexError
single: PyExc_KeyError
single: PyExc_KeyboardInterrupt
single: PyExc_MemoryError
single: PyExc_NameError
single: PyExc_NotImplementedError
single: PyExc_OSError
single: PyExc_OverflowError
single: PyExc_ReferenceError
single: PyExc_RuntimeError
single: PyExc_SyntaxError
single: PyExc_SystemError
single: PyExc_SystemExit
single: PyExc_TypeError
single: PyExc_ValueError
single: PyExc_WindowsError
single: PyExc_ZeroDivisionError
Notes:
(1)
This is a base class for other standard exceptions.
(2)
This is the same as :exc:`weakref.ReferenceError`.
(3)
Only defined on Windows; protect code that uses this by testing that the
preprocessor macro ``MS_WINDOWS`` is defined.
(4)
.. versionadded:: 2.5
Deprecation of String Exceptions
================================
.. index:: single: BaseException (built-in exception)
All exceptions built into Python or provided in the standard library are derived
from :exc:`BaseException`.
String exceptions are still supported in the interpreter to allow existing code
to run unmodified, but this will also change in a future release.


@@ -1,33 +0,0 @@
.. _c-api-index:
##################################
Python/C API Reference Manual
##################################
:Release: |version|
:Date: |today|
This manual documents the API used by C and C++ programmers who want to write
extension modules or embed Python. It is a companion to :ref:`extending-index`,
which describes the general principles of extension writing but does not
document the API functions in detail.
.. warning::
The current version of this document is somewhat incomplete. However, most of
the important functions, types and structures are described.
.. toctree::
:maxdepth: 2
intro.rst
veryhigh.rst
refcounting.rst
exceptions.rst
utilities.rst
abstract.rst
concrete.rst
init.rst
memory.rst
newtypes.rst


@@ -1,936 +0,0 @@
.. highlightlang:: c
.. _initialization:
*****************************************
Initialization, Finalization, and Threads
*****************************************
.. cfunction:: void Py_Initialize()
.. index::
single: Py_SetProgramName()
single: PyEval_InitThreads()
single: PyEval_ReleaseLock()
single: PyEval_AcquireLock()
single: modules (in module sys)
single: path (in module sys)
module: __builtin__
module: __main__
module: sys
triple: module; search; path
single: PySys_SetArgv()
single: Py_Finalize()
Initialize the Python interpreter. In an application embedding Python, this
should be called before using any other Python/C API functions; with the
exception of :cfunc:`Py_SetProgramName`, :cfunc:`PyEval_InitThreads`,
:cfunc:`PyEval_ReleaseLock`, and :cfunc:`PyEval_AcquireLock`. This initializes
the table of loaded modules (``sys.modules``), and creates the fundamental
modules :mod:`__builtin__`, :mod:`__main__` and :mod:`sys`. It also initializes
the module search path (``sys.path``). It does not set ``sys.argv``; use
:cfunc:`PySys_SetArgv` for that. This is a no-op when called for a second time
(without calling :cfunc:`Py_Finalize` first). There is no return value; it is a
fatal error if the initialization fails.
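A very small embedding sketch (error checking omitted) that initializes the
interpreter, runs a statement and shuts down again::

   #include <Python.h>

   int
   main(int argc, char *argv[])
   {
       Py_SetProgramName(argv[0]);   /* optional, must come before Py_Initialize */
       Py_Initialize();
       PyRun_SimpleString("print 'hello from embedded Python'");
       Py_Finalize();
       return 0;
   }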
.. cfunction:: void Py_InitializeEx(int initsigs)
This function works like :cfunc:`Py_Initialize` if *initsigs* is 1. If
*initsigs* is 0, it skips initialization registration of signal handlers, which
might be useful when Python is embedded.
.. versionadded:: 2.4
.. cfunction:: int Py_IsInitialized()
Return true (nonzero) when the Python interpreter has been initialized, false
(zero) if not. After :cfunc:`Py_Finalize` is called, this returns false until
:cfunc:`Py_Initialize` is called again.
.. cfunction:: void Py_Finalize()
Undo all initializations made by :cfunc:`Py_Initialize` and subsequent use of
Python/C API functions, and destroy all sub-interpreters (see
:cfunc:`Py_NewInterpreter` below) that were created and not yet destroyed since
the last call to :cfunc:`Py_Initialize`. Ideally, this frees all memory
allocated by the Python interpreter. This is a no-op when called for a second
time (without calling :cfunc:`Py_Initialize` again first). There is no return
value; errors during finalization are ignored.
This function is provided for a number of reasons. An embedding application
might want to restart Python without having to restart the application itself.
An application that has loaded the Python interpreter from a dynamically
loadable library (or DLL) might want to free all memory allocated by Python
before unloading the DLL. During a hunt for memory leaks in an application a
developer might want to free all memory allocated by Python before exiting from
the application.
**Bugs and caveats:** The destruction of modules and objects in modules is done
in random order; this may cause destructors (:meth:`__del__` methods) to fail
when they depend on other objects (even functions) or modules. Dynamically
loaded extension modules loaded by Python are not unloaded. Small amounts of
memory allocated by the Python interpreter may not be freed (if you find a leak,
please report it). Memory tied up in circular references between objects is not
freed. Some memory allocated by extension modules may not be freed. Some
extensions may not work properly if their initialization routine is called more
than once; this can happen if an application calls :cfunc:`Py_Initialize` and
:cfunc:`Py_Finalize` more than once.
.. cfunction:: PyThreadState* Py_NewInterpreter()
.. index::
module: __builtin__
module: __main__
module: sys
single: stdout (in module sys)
single: stderr (in module sys)
single: stdin (in module sys)
Create a new sub-interpreter. This is an (almost) totally separate environment
for the execution of Python code. In particular, the new interpreter has
separate, independent versions of all imported modules, including the
fundamental modules :mod:`__builtin__`, :mod:`__main__` and :mod:`sys`. The
table of loaded modules (``sys.modules``) and the module search path
(``sys.path``) are also separate. The new environment has no ``sys.argv``
variable. It has new standard I/O stream file objects ``sys.stdin``,
``sys.stdout`` and ``sys.stderr`` (however these refer to the same underlying
:ctype:`FILE` structures in the C library).
The return value points to the first thread state created in the new
sub-interpreter. This thread state is made in the current thread state.
Note that no actual thread is created; see the discussion of thread states
below. If creation of the new interpreter is unsuccessful, *NULL* is
returned; no exception is set since the exception state is stored in the
current thread state and there may not be a current thread state. (Like all
other Python/C API functions, the global interpreter lock must be held before
calling this function and is still held when it returns; however, unlike most
other Python/C API functions, there needn't be a current thread state on
entry.)
.. index::
single: Py_Finalize()
single: Py_Initialize()
Extension modules are shared between (sub-)interpreters as follows: the first
time a particular extension is imported, it is initialized normally, and a
(shallow) copy of its module's dictionary is squirreled away. When the same
extension is imported by another (sub-)interpreter, a new module is initialized
and filled with the contents of this copy; the extension's ``init`` function is
not called. Note that this is different from what happens when an extension is
imported after the interpreter has been completely re-initialized by calling
:cfunc:`Py_Finalize` and :cfunc:`Py_Initialize`; in that case, the extension's
``initmodule`` function *is* called again.
.. index:: single: close() (in module os)
**Bugs and caveats:** Because sub-interpreters (and the main interpreter) are
part of the same process, the insulation between them isn't perfect --- for
example, using low-level file operations like :func:`os.close` they can
(accidentally or maliciously) affect each other's open files. Because of the
way extensions are shared between (sub-)interpreters, some extensions may not
work properly; this is especially likely when the extension makes use of
(static) global variables, or when the extension manipulates its module's
dictionary after its initialization. It is possible to insert objects created
in one sub-interpreter into a namespace of another sub-interpreter; this should
be done with great care to avoid sharing user-defined functions, methods,
instances or classes between sub-interpreters, since import operations executed
by such objects may affect the wrong (sub-)interpreter's dictionary of loaded
modules. (XXX This is a hard-to-fix bug that will be addressed in a future
release.)
Also note that the use of this functionality is incompatible with extension
modules such as PyObjC and ctypes that use the :cfunc:`PyGILState_\*` APIs (and
this is inherent in the way the :cfunc:`PyGILState_\*` functions work). Simple
things may work, but confusing behavior will always be near.
.. cfunction:: void Py_EndInterpreter(PyThreadState *tstate)
.. index:: single: Py_Finalize()
Destroy the (sub-)interpreter represented by the given thread state. The given
thread state must be the current thread state. See the discussion of thread
states below. When the call returns, the current thread state is *NULL*. All
thread states associated with this interpreter are destroyed. (The global
interpreter lock must be held before calling this function and is still held
when it returns.) :cfunc:`Py_Finalize` will destroy all sub-interpreters that
haven't been explicitly destroyed at that point.
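A sketch of creating and destroying a sub-interpreter from code that already
holds the global interpreter lock (error handling abbreviated)::

   PyThreadState *main_tstate = PyThreadState_Get();
   PyThreadState *sub_tstate = Py_NewInterpreter();
   if (sub_tstate == NULL)
       return -1;                       /* creation failed */
   PyRun_SimpleString("import sys; print sys.path");
   Py_EndInterpreter(sub_tstate);       /* current thread state becomes NULL */
   PyThreadState_Swap(main_tstate);     /* switch back to the main interpreter */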
.. cfunction:: void Py_SetProgramName(char *name)
.. index::
single: Py_Initialize()
single: main()
single: Py_GetPath()
This function should be called before :cfunc:`Py_Initialize` is called for
the first time, if it is called at all. It tells the interpreter the value
of the ``argv[0]`` argument to the :cfunc:`main` function of the program.
This is used by :cfunc:`Py_GetPath` and some other functions below to find
the Python run-time libraries relative to the interpreter executable. The
default value is ``'python'``. The argument should point to a
zero-terminated character string in static storage whose contents will not
change for the duration of the program's execution. No code in the Python
interpreter will change the contents of this storage.
.. cfunction:: char* Py_GetProgramName()
.. index:: single: Py_SetProgramName()
Return the program name set with :cfunc:`Py_SetProgramName`, or the default.
The returned string points into static storage; the caller should not modify its
value.
.. cfunction:: char* Py_GetPrefix()
Return the *prefix* for installed platform-independent files. This is derived
through a number of complicated rules from the program name set with
:cfunc:`Py_SetProgramName` and some environment variables; for example, if the
program name is ``'/usr/local/bin/python'``, the prefix is ``'/usr/local'``. The
returned string points into static storage; the caller should not modify its
value. This corresponds to the :makevar:`prefix` variable in the top-level
:file:`Makefile` and the :option:`--prefix` argument to the :program:`configure`
script at build time. The value is available to Python code as ``sys.prefix``.
It is only useful on Unix. See also the next function.
.. cfunction:: char* Py_GetExecPrefix()
Return the *exec-prefix* for installed platform-*dependent* files. This is
derived through a number of complicated rules from the program name set with
:cfunc:`Py_SetProgramName` and some environment variables; for example, if the
program name is ``'/usr/local/bin/python'``, the exec-prefix is
``'/usr/local'``. The returned string points into static storage; the caller
should not modify its value. This corresponds to the :makevar:`exec_prefix`
variable in the top-level :file:`Makefile` and the :option:`--exec-prefix`
argument to the :program:`configure` script at build time. The value is
available to Python code as ``sys.exec_prefix``. It is only useful on Unix.
Background: The exec-prefix differs from the prefix when platform dependent
files (such as executables and shared libraries) are installed in a different
directory tree. In a typical installation, platform dependent files may be
installed in the :file:`/usr/local/plat` subtree while platform independent files
may be installed in :file:`/usr/local`.
Generally speaking, a platform is a combination of hardware and software
families, e.g. Sparc machines running the Solaris 2.x operating system are
considered the same platform, but Intel machines running Solaris 2.x are another
platform, and Intel machines running Linux are yet another platform. Different
major revisions of the same operating system generally also form different
platforms. Non-Unix operating systems are a different story; the installation
strategies on those systems are so different that the prefix and exec-prefix are
meaningless, and set to the empty string. Note that compiled Python bytecode
files are platform independent (but not independent from the Python version by
which they were compiled!).
System administrators will know how to configure the :program:`mount` or
:program:`automount` programs to share :file:`/usr/local` between platforms
while having :file:`/usr/local/plat` be a different filesystem for each
platform.
.. cfunction:: char* Py_GetProgramFullPath()
.. index::
single: Py_SetProgramName()
single: executable (in module sys)
Return the full program name of the Python executable; this is computed as a
side-effect of deriving the default module search path from the program name
(set by :cfunc:`Py_SetProgramName` above). The returned string points into
static storage; the caller should not modify its value. The value is available
to Python code as ``sys.executable``.
.. cfunction:: char* Py_GetPath()
.. index::
triple: module; search; path
single: path (in module sys)
Return the default module search path; this is computed from the program name
(set by :cfunc:`Py_SetProgramName` above) and some environment variables. The
returned string consists of a series of directory names separated by a platform
dependent delimiter character. The delimiter character is ``':'`` on Unix and
Mac OS X, ``';'`` on Windows. The returned string points into static storage;
the caller should not modify its value. The value is available to Python code
as the list ``sys.path``, which may be modified to change the future search path
for loaded modules.
.. % XXX should give the exact rules
.. cfunction:: const char* Py_GetVersion()
Return the version of this Python interpreter. This is a string that looks
something like ::
"1.5 (#67, Dec 31 1997, 22:34:28) [GCC 2.7.2.2]"
.. index:: single: version (in module sys)
The first word (up to the first space character) is the current Python version;
the first three characters are the major and minor version separated by a
period. The returned string points into static storage; the caller should not
modify its value. The value is available to Python code as ``sys.version``.
.. cfunction:: const char* Py_GetBuildNumber()
Return a string representing the Subversion revision that this Python executable
was built from. This number is a string because it may contain a trailing 'M'
if Python was built from a mixed revision source tree.
.. versionadded:: 2.5
.. cfunction:: const char* Py_GetPlatform()
.. index:: single: platform (in module sys)
Return the platform identifier for the current platform. On Unix, this is
formed from the "official" name of the operating system, converted to lower
case, followed by the major revision number; e.g., for Solaris 2.x, which is
also known as SunOS 5.x, the value is ``'sunos5'``. On Mac OS X, it is
``'darwin'``. On Windows, it is ``'win'``. The returned string points into
static storage; the caller should not modify its value. The value is available
to Python code as ``sys.platform``.
.. cfunction:: const char* Py_GetCopyright()
Return the official copyright string for the current Python version, for example
``'Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam'``
.. index:: single: copyright (in module sys)
The returned string points into static storage; the caller should not modify its
value. The value is available to Python code as ``sys.copyright``.
.. cfunction:: const char* Py_GetCompiler()
Return an indication of the compiler used to build the current Python version,
in square brackets, for example::
"[GCC 2.7.2.2]"
.. index:: single: version (in module sys)
The returned string points into static storage; the caller should not modify its
value. The value is available to Python code as part of the variable
``sys.version``.
.. cfunction:: const char* Py_GetBuildInfo()
Return information about the sequence number and build date and time of the
current Python interpreter instance, for example ::
"#67, Aug 1 1997, 22:34:28"
.. index:: single: version (in module sys)
The returned string points into static storage; the caller should not modify its
value. The value is available to Python code as part of the variable
``sys.version``.
.. cfunction:: void PySys_SetArgv(int argc, char **argv)
.. index::
single: main()
single: Py_FatalError()
single: argv (in module sys)
Set ``sys.argv`` based on *argc* and *argv*. These parameters are similar to
those passed to the program's :cfunc:`main` function with the difference that
the first entry should refer to the script file to be executed rather than the
executable hosting the Python interpreter. If there isn't a script that will be
run, the first entry in *argv* can be an empty string. If this function fails
to initialize ``sys.argv``, a fatal condition is signalled using
:cfunc:`Py_FatalError`.
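For example (a sketch with a made-up script name), an embedding application
could set up ``sys.argv`` like this::

   static char *args[] = { "myscript.py", "--verbose" };

   Py_Initialize();
   PySys_SetArgv(2, args);    /* sys.argv == ['myscript.py', '--verbose'] */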
.. % XXX impl. doesn't seem consistent in allowing 0/NULL for the params;
.. % check w/ Guido.
.. % XXX Other PySys thingies (doesn't really belong in this chapter)
.. _threads:
Thread State and the Global Interpreter Lock
============================================
.. index::
single: global interpreter lock
single: interpreter lock
single: lock, interpreter
The Python interpreter is not fully thread safe. In order to support
multi-threaded Python programs, there's a global lock that must be held by the
current thread before it can safely access Python objects. Without the lock,
even the simplest operations could cause problems in a multi-threaded program:
for example, when two threads simultaneously increment the reference count of
the same object, the reference count could end up being incremented only once
instead of twice.
.. index:: single: setcheckinterval() (in module sys)
Therefore, the rule exists that only the thread that has acquired the global
interpreter lock may operate on Python objects or call Python/C API functions.
In order to support multi-threaded Python programs, the interpreter regularly
releases and reacquires the lock --- by default, every 100 bytecode instructions
(this can be changed with :func:`sys.setcheckinterval`). The lock is also
released and reacquired around potentially blocking I/O operations like reading
or writing a file, so that other threads can run while the thread that requests
the I/O is waiting for the I/O operation to complete.
.. index::
single: PyThreadState
single: PyThreadState
The Python interpreter needs to keep some bookkeeping information separate per
thread --- for this it uses a data structure called :ctype:`PyThreadState`.
There's one global variable, however: the pointer to the current
:ctype:`PyThreadState` structure. While most thread packages have a way to
store "per-thread global data," Python's internal platform independent thread
abstraction doesn't support this yet. Therefore, the current thread state must
be manipulated explicitly.
This is easy enough in most cases. Most code manipulating the global
interpreter lock has the following simple structure::
Save the thread state in a local variable.
Release the interpreter lock.
...Do some blocking I/O operation...
Reacquire the interpreter lock.
Restore the thread state from the local variable.
This is so common that a pair of macros exists to simplify it::
Py_BEGIN_ALLOW_THREADS
...Do some blocking I/O operation...
Py_END_ALLOW_THREADS
.. index::
single: Py_BEGIN_ALLOW_THREADS
single: Py_END_ALLOW_THREADS
The :cmacro:`Py_BEGIN_ALLOW_THREADS` macro opens a new block and declares a
hidden local variable; the :cmacro:`Py_END_ALLOW_THREADS` macro closes the
block. Another advantage of using these two macros is that when Python is
compiled without thread support, they are defined empty, thus saving the thread
state and lock manipulations.
When thread support is enabled, the block above expands to the following code::
PyThreadState *_save;
_save = PyEval_SaveThread();
...Do some blocking I/O operation...
PyEval_RestoreThread(_save);
Using even lower level primitives, we can get roughly the same effect as
follows::
PyThreadState *_save;
_save = PyThreadState_Swap(NULL);
PyEval_ReleaseLock();
...Do some blocking I/O operation...
PyEval_AcquireLock();
PyThreadState_Swap(_save);
.. index::
single: PyEval_RestoreThread()
single: errno
single: PyEval_SaveThread()
single: PyEval_ReleaseLock()
single: PyEval_AcquireLock()
There are some subtle differences; in particular, :cfunc:`PyEval_RestoreThread`
saves and restores the value of the global variable :cdata:`errno`, since the
lock manipulation does not guarantee that :cdata:`errno` is left alone. Also,
when thread support is disabled, :cfunc:`PyEval_SaveThread` and
:cfunc:`PyEval_RestoreThread` don't manipulate the lock; in this case,
:cfunc:`PyEval_ReleaseLock` and :cfunc:`PyEval_AcquireLock` are not available.
This is done so that dynamically loaded extensions compiled with thread support
enabled can be loaded by an interpreter that was compiled with disabled thread
support.
The global interpreter lock is used to protect the pointer to the current thread
state. When releasing the lock and saving the thread state, the current thread
state pointer must be retrieved before the lock is released (since another
thread could immediately acquire the lock and store its own thread state in the
global variable). Conversely, when acquiring the lock and restoring the thread
state, the lock must be acquired before storing the thread state pointer.
Why am I going on with so much detail about this? Because when threads are
created from C, they don't have the global interpreter lock, nor is there a
thread state data structure for them. Such threads must bootstrap themselves
into existence, by first creating a thread state data structure, then acquiring
the lock, and finally storing their thread state pointer, before they can start
using the Python/C API. When they are done, they should reset the thread state
pointer, release the lock, and finally free their thread state data structure.
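A sketch of that bootstrap sequence, assuming *interp* points to an existing
interpreter state and the lock has already been created (this is an
illustration, not code taken from the Python sources)::

   PyThreadState *tstate = PyThreadState_New(interp);

   PyEval_AcquireThread(tstate);    /* acquire the lock and set the thread state */
   /* ...use the Python/C API... */
   PyThreadState_Clear(tstate);     /* must be done while the lock is still held */
   PyEval_ReleaseThread(tstate);    /* reset the thread state, release the lock */
   PyThreadState_Delete(tstate);    /* finally free the thread state structure */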
Beginning with version 2.3, threads can now take advantage of the
:cfunc:`PyGILState_\*` functions to do all of the above automatically. The
typical idiom for calling into Python from a C thread is now::
PyGILState_STATE gstate;
gstate = PyGILState_Ensure();
/* Perform Python actions here. */
result = CallSomeFunction();
/* evaluate result */
/* Release the thread. No Python API allowed beyond this point. */
PyGILState_Release(gstate);
Note that the :cfunc:`PyGILState_\*` functions assume there is only one global
interpreter (created automatically by :cfunc:`Py_Initialize`). Python still
supports the creation of additional interpreters (using
:cfunc:`Py_NewInterpreter`), but mixing multiple interpreters and the
:cfunc:`PyGILState_\*` API is unsupported.
.. ctype:: PyInterpreterState
This data structure represents the state shared by a number of cooperating
threads. Threads belonging to the same interpreter share their module
administration and a few other internal items. There are no public members in
this structure.
Threads belonging to different interpreters initially share nothing, except
process state like available memory, open file descriptors and such. The global
interpreter lock is also shared by all threads, regardless of to which
interpreter they belong.
.. ctype:: PyThreadState
This data structure represents the state of a single thread. The only public
data member is :ctype:`PyInterpreterState \*`:attr:`interp`, which points to
this thread's interpreter state.
.. cfunction:: void PyEval_InitThreads()
.. index::
single: PyEval_ReleaseLock()
single: PyEval_ReleaseThread()
single: PyEval_SaveThread()
single: PyEval_RestoreThread()
Initialize and acquire the global interpreter lock. It should be called in the
main thread before creating a second thread or engaging in any other thread
operations such as :cfunc:`PyEval_ReleaseLock` or
``PyEval_ReleaseThread(tstate)``. It is not needed before calling
:cfunc:`PyEval_SaveThread` or :cfunc:`PyEval_RestoreThread`.
.. index:: single: Py_Initialize()
This is a no-op when called for a second time. It is safe to call this function
before calling :cfunc:`Py_Initialize`.
.. index:: module: thread
When only the main thread exists, no lock operations are needed. This is a
common situation (most Python programs do not use threads), and the lock
operations slow the interpreter down a bit. Therefore, the lock is not created
initially. This situation is equivalent to having acquired the lock: when
there is only a single thread, all object accesses are safe. Therefore, when
this function initializes the lock, it also acquires it. Before the Python
:mod:`thread` module creates a new thread, knowing that either it has the lock
or the lock hasn't been created yet, it calls :cfunc:`PyEval_InitThreads`. When
this call returns, it is guaranteed that the lock has been created and that the
calling thread has acquired it.
It is **not** safe to call this function when it is unknown which thread (if
any) currently has the global interpreter lock.
This function is not available when thread support is disabled at compile time.
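For an embedding application that intends to create threads, one plausible
sequence is the following sketch (error handling omitted; the ordering of the
main thread's other work is an assumption of the example)::

   PyThreadState *tstate;

   Py_Initialize();
   PyEval_InitThreads();            /* create and acquire the interpreter lock */
   /* ...start worker threads; they call into Python via PyGILState_Ensure()... */
   tstate = PyEval_SaveThread();    /* release the lock so the workers can run */
   /* ...do other work in the main thread... */
   PyEval_RestoreThread(tstate);    /* reacquire the lock before finalizing */
   Py_Finalize();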
.. cfunction:: int PyEval_ThreadsInitialized()
Returns a non-zero value if :cfunc:`PyEval_InitThreads` has been called. This
function can be called without holding the lock, and therefore can be used to
avoid calls to the locking API when running single-threaded. This function is
not available when thread support is disabled at compile time.
.. versionadded:: 2.4
.. cfunction:: void PyEval_AcquireLock()
Acquire the global interpreter lock. The lock must have been created earlier.
If this thread already has the lock, a deadlock ensues. This function is not
available when thread support is disabled at compile time.
.. cfunction:: void PyEval_ReleaseLock()
Release the global interpreter lock. The lock must have been created earlier.
This function is not available when thread support is disabled at compile time.
.. cfunction:: void PyEval_AcquireThread(PyThreadState *tstate)
Acquire the global interpreter lock and set the current thread state to
*tstate*, which should not be *NULL*. The lock must have been created earlier.
If this thread already has the lock, deadlock ensues. This function is not
available when thread support is disabled at compile time.
.. cfunction:: void PyEval_ReleaseThread(PyThreadState *tstate)
Reset the current thread state to *NULL* and release the global interpreter
lock. The lock must have been created earlier and must be held by the current
thread. The *tstate* argument, which must not be *NULL*, is only used to check
that it represents the current thread state --- if it isn't, a fatal error is
reported. This function is not available when thread support is disabled at
compile time.
.. cfunction:: PyThreadState* PyEval_SaveThread()
Release the interpreter lock (if it has been created and thread support is
enabled) and reset the thread state to *NULL*, returning the previous thread
state (which is not *NULL*). If the lock has been created, the current thread
must have acquired it. (This function is available even when thread support is
disabled at compile time.)
.. cfunction:: void PyEval_RestoreThread(PyThreadState *tstate)
Acquire the interpreter lock (if it has been created and thread support is
enabled) and set the thread state to *tstate*, which must not be *NULL*. If the
lock has been created, the current thread must not have acquired it, otherwise
deadlock ensues. (This function is available even when thread support is
disabled at compile time.)
The following macros are normally used without a trailing semicolon; look for
example usage in the Python source distribution.
.. cmacro:: Py_BEGIN_ALLOW_THREADS
This macro expands to ``{ PyThreadState *_save; _save = PyEval_SaveThread();``.
Note that it contains an opening brace; it must be matched with a following
:cmacro:`Py_END_ALLOW_THREADS` macro. See above for further discussion of this
macro. It is a no-op when thread support is disabled at compile time.
.. cmacro:: Py_END_ALLOW_THREADS
This macro expands to ``PyEval_RestoreThread(_save); }``. Note that it contains
a closing brace; it must be matched with an earlier
:cmacro:`Py_BEGIN_ALLOW_THREADS` macro. See above for further discussion of
this macro. It is a no-op when thread support is disabled at compile time.
.. cmacro:: Py_BLOCK_THREADS
This macro expands to ``PyEval_RestoreThread(_save);``: it is equivalent to
:cmacro:`Py_END_ALLOW_THREADS` without the closing brace. It is a no-op when
thread support is disabled at compile time.
.. cmacro:: Py_UNBLOCK_THREADS
This macro expands to ``_save = PyEval_SaveThread();``: it is equivalent to
:cmacro:`Py_BEGIN_ALLOW_THREADS` without the opening brace and variable
declaration. It is a no-op when thread support is disabled at compile time.
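One common way of combining these finer-grained macros, shown here only as an
assumed pattern, is to reacquire the lock briefly in the middle of an
allow-threads block::

   Py_BEGIN_ALLOW_THREADS
   /* ...do blocking work without the lock... */
   Py_BLOCK_THREADS
   /* ...make a short Python/C API call, e.g. to report progress... */
   Py_UNBLOCK_THREADS
   /* ...continue the blocking work... */
   Py_END_ALLOW_THREADS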
All of the following functions are only available when thread support is enabled
at compile time, and must be called only when the interpreter lock has been
created.
.. cfunction:: PyInterpreterState* PyInterpreterState_New()
Create a new interpreter state object. The interpreter lock need not be held,
but may be held if it is necessary to serialize calls to this function.
.. cfunction:: void PyInterpreterState_Clear(PyInterpreterState *interp)
Reset all information in an interpreter state object. The interpreter lock must
be held.
.. cfunction:: void PyInterpreterState_Delete(PyInterpreterState *interp)
Destroy an interpreter state object. The interpreter lock need not be held.
The interpreter state must have been reset with a previous call to
:cfunc:`PyInterpreterState_Clear`.
.. cfunction:: PyThreadState* PyThreadState_New(PyInterpreterState *interp)
Create a new thread state object belonging to the given interpreter object. The
interpreter lock need not be held, but may be held if it is necessary to
serialize calls to this function.
.. cfunction:: void PyThreadState_Clear(PyThreadState *tstate)
Reset all information in a thread state object. The interpreter lock must be
held.
.. cfunction:: void PyThreadState_Delete(PyThreadState *tstate)
Destroy a thread state object. The interpreter lock need not be held. The
thread state must have been reset with a previous call to
:cfunc:`PyThreadState_Clear`.
.. cfunction:: PyThreadState* PyThreadState_Get()
Return the current thread state. The interpreter lock must be held. When the
current thread state is *NULL*, this issues a fatal error (so that the caller
needn't check for *NULL*).
.. cfunction:: PyThreadState* PyThreadState_Swap(PyThreadState *tstate)
Swap the current thread state with the thread state given by the argument
*tstate*, which may be *NULL*. The interpreter lock must be held.
.. cfunction:: PyObject* PyThreadState_GetDict()
Return a dictionary in which extensions can store thread-specific state
information. Each extension should use a unique key to store state in
the dictionary. It is okay to call this function when no current thread state
is available. If this function returns *NULL*, no exception has been raised and
the caller should assume no current thread state is available.
.. versionchanged:: 2.3
Previously this could only be called when a current thread is active, and *NULL*
meant that an exception was raised.
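A minimal sketch of how an extension might use this dictionary (the key string
``"myextension.counter"`` is purely illustrative)::

   PyObject *dict = PyThreadState_GetDict();

   if (dict != NULL) {
       PyObject *value = PyInt_FromLong(1L);
       if (value != NULL) {
           /* PyDict_SetItemString() does not steal the reference. */
           if (PyDict_SetItemString(dict, "myextension.counter", value) < 0)
               PyErr_Clear();
           Py_DECREF(value);
       }
   }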
.. cfunction:: int PyThreadState_SetAsyncExc(long id, PyObject *exc)
Asynchronously raise an exception in a thread. The *id* argument is the thread
id of the target thread; *exc* is the exception object to be raised. This
function does not steal any references to *exc*. To prevent naive misuse, you
must write your own C extension to call this. Must be called with the GIL held.
Returns the number of thread states modified; this is normally one, but will be
zero if the thread id isn't found. If *exc* is :const:`NULL`, the pending
exception (if any) for the thread is cleared. This raises no exceptions.
.. versionadded:: 2.3
.. cfunction:: PyGILState_STATE PyGILState_Ensure()
Ensure that the current thread is ready to call the Python C API regardless of
the current state of Python, or of its thread lock. This may be called as many
times as desired by a thread as long as each call is matched with a call to
:cfunc:`PyGILState_Release`. In general, other thread-related APIs may be used
between :cfunc:`PyGILState_Ensure` and :cfunc:`PyGILState_Release` calls as long
as the thread state is restored to its previous state before the Release(). For
example, normal usage of the :cmacro:`Py_BEGIN_ALLOW_THREADS` and
:cmacro:`Py_END_ALLOW_THREADS` macros is acceptable.
The return value is an opaque "handle" to the thread state when
:cfunc:`PyGILState_Ensure` was called, and must be passed to
:cfunc:`PyGILState_Release` to ensure Python is left in the same state. Even
though recursive calls are allowed, these handles *cannot* be shared - each
unique call to :cfunc:`PyGILState_Ensure` must save the handle for its call to
:cfunc:`PyGILState_Release`.
When the function returns, the current thread will hold the GIL. Failure is a
fatal error.
.. versionadded:: 2.3
.. cfunction:: void PyGILState_Release(PyGILState_STATE)
Release any resources previously acquired. After this call, Python's state will
be the same as it was prior to the corresponding :cfunc:`PyGILState_Ensure` call
(but generally this state will be unknown to the caller, hence the use of the
GILState API.)
Every call to :cfunc:`PyGILState_Ensure` must be matched by a call to
:cfunc:`PyGILState_Release` on the same thread.
.. versionadded:: 2.3
.. _profiling:
Profiling and Tracing
=====================
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
The Python interpreter provides some low-level support for attaching profiling
and execution tracing facilities. These are used for profiling, debugging, and
coverage analysis tools.
Starting with Python 2.2, the implementation of this facility was substantially
revised, and an interface from C was added. This C interface allows the
profiling or tracing code to avoid the overhead of calling through Python-level
callable objects, making a direct C function call instead. The essential
attributes of the facility have not changed; the interface allows trace
functions to be installed per-thread, and the basic events reported to the trace
function are the same as had been reported to the Python-level trace functions
in previous versions.
.. ctype:: int (*Py_tracefunc)(PyObject *obj, PyFrameObject *frame, int what, PyObject *arg)
The type of the trace function registered using :cfunc:`PyEval_SetProfile` and
:cfunc:`PyEval_SetTrace`. The first parameter is the object passed to the
registration function as *obj*, *frame* is the frame object to which the event
pertains, *what* is one of the constants :const:`PyTrace_CALL`,
:const:`PyTrace_EXCEPTION`, :const:`PyTrace_LINE`, :const:`PyTrace_RETURN`,
:const:`PyTrace_C_CALL`, :const:`PyTrace_C_EXCEPTION`, or
:const:`PyTrace_C_RETURN`, and *arg* depends on the value of *what*:
+------------------------------+--------------------------------------+
| Value of *what* | Meaning of *arg* |
+==============================+======================================+
| :const:`PyTrace_CALL` | Always *NULL*. |
+------------------------------+--------------------------------------+
| :const:`PyTrace_EXCEPTION` | Exception information as returned by |
| | :func:`sys.exc_info`. |
+------------------------------+--------------------------------------+
| :const:`PyTrace_LINE` | Always *NULL*. |
+------------------------------+--------------------------------------+
| :const:`PyTrace_RETURN` | Value being returned to the caller. |
+------------------------------+--------------------------------------+
| :const:`PyTrace_C_CALL` | Name of function being called. |
+------------------------------+--------------------------------------+
| :const:`PyTrace_C_EXCEPTION` | Always *NULL*. |
+------------------------------+--------------------------------------+
| :const:`PyTrace_C_RETURN` | Always *NULL*. |
+------------------------------+--------------------------------------+
.. cvar:: int PyTrace_CALL
The value of the *what* parameter to a :ctype:`Py_tracefunc` function when a new
call to a function or method is being reported, or a new entry into a generator.
Note that the creation of the iterator for a generator function is not reported
as there is no control transfer to the Python bytecode in the corresponding
frame.
.. cvar:: int PyTrace_EXCEPTION
The value of the *what* parameter to a :ctype:`Py_tracefunc` function when an
exception has been raised. The callback function is called with this value for
*what* after any bytecode is processed that leaves the exception set within the
frame being executed. The effect of this is that as exception propagation
causes the Python stack to unwind, the callback is called upon return to each
frame as the exception propagates. Only trace functions receive these events;
they are not needed by the profiler.
.. cvar:: int PyTrace_LINE
The value passed as the *what* parameter to a trace function (but not a
profiling function) when a line-number event is being reported.
.. cvar:: int PyTrace_RETURN
The value for the *what* parameter to :ctype:`Py_tracefunc` functions when a
call is returning without propagating an exception.
.. cvar:: int PyTrace_C_CALL
The value for the *what* parameter to :ctype:`Py_tracefunc` functions when a C
function is about to be called.
.. cvar:: int PyTrace_C_EXCEPTION
The value for the *what* parameter to :ctype:`Py_tracefunc` functions when a C
function has thrown an exception.
.. cvar:: int PyTrace_C_RETURN
The value for the *what* parameter to :ctype:`Py_tracefunc` functions when a C
function has returned.
.. cfunction:: void PyEval_SetProfile(Py_tracefunc func, PyObject *obj)
Set the profiler function to *func*. The *obj* parameter is passed to the
function as its first parameter, and may be any Python object, or *NULL*. If
the profile function needs to maintain state, using a different value for *obj*
for each thread provides a convenient and thread-safe place to store it. The
profile function is called for all monitored events except the line-number
events.
.. cfunction:: void PyEval_SetTrace(Py_tracefunc func, PyObject *obj)
Set the tracing function to *func*. This is similar to
:cfunc:`PyEval_SetProfile`, except the tracing function does receive line-number
events.
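As a minimal sketch (the names ``my_profiler`` and ``call_count`` are invented
for this example), a profile function that merely counts function entries could
be written and registered as follows::

   #include "Python.h"
   #include "frameobject.h"

   static long call_count = 0;

   static int
   my_profiler(PyObject *obj, PyFrameObject *frame, int what, PyObject *arg)
   {
       /* Count Python-level and C-level calls; ignore all other events. */
       if (what == PyTrace_CALL || what == PyTrace_C_CALL)
           call_count++;
       return 0;
   }

   /* During initialization of the extension or embedding application: */
   PyEval_SetProfile(my_profiler, NULL);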
.. _advanced-debugging:
Advanced Debugger Support
=========================
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
These functions are only intended to be used by advanced debugging tools.
.. cfunction:: PyInterpreterState* PyInterpreterState_Head()
Return the interpreter state object at the head of the list of all such objects.
.. versionadded:: 2.2
.. cfunction:: PyInterpreterState* PyInterpreterState_Next(PyInterpreterState *interp)
Return the next interpreter state object after *interp* from the list of all
such objects.
.. versionadded:: 2.2
.. cfunction:: PyThreadState * PyInterpreterState_ThreadHead(PyInterpreterState *interp)
Return a pointer to the first :ctype:`PyThreadState` object in the list of
threads associated with the interpreter *interp*.
.. versionadded:: 2.2
.. cfunction:: PyThreadState* PyThreadState_Next(PyThreadState *tstate)
Return the next thread state object after *tstate* from the list of all such
objects belonging to the same :ctype:`PyInterpreterState` object.
.. versionadded:: 2.2
.. highlightlang:: c
.. _api-intro:
************
Introduction
************
The Application Programmer's Interface to Python gives C and C++ programmers
access to the Python interpreter at a variety of levels. The API is equally
usable from C++, but for brevity it is generally referred to as the Python/C
API. There are two fundamentally different reasons for using the Python/C API.
The first reason is to write *extension modules* for specific purposes; these
are C modules that extend the Python interpreter. This is probably the most
common use. The second reason is to use Python as a component in a larger
application; this technique is generally referred to as :dfn:`embedding` Python
in an application.
Writing an extension module is a relatively well-understood process, where a
"cookbook" approach works well. There are several tools that automate the
process to some extent. While people have embedded Python in other
applications since its early existence, the process of embedding Python is less
straightforward than writing an extension.
Many API functions are useful independent of whether you're embedding or
extending Python; moreover, most applications that embed Python will need to
provide a custom extension as well, so it's probably a good idea to become
familiar with writing an extension before attempting to embed Python in a real
application.
.. _api-includes:
Include Files
=============
All function, type and macro definitions needed to use the Python/C API are
included in your code by the following line::
#include "Python.h"
This implies inclusion of the following standard headers: ``<stdio.h>``,
``<string.h>``, ``<errno.h>``, ``<limits.h>``, and ``<stdlib.h>`` (if
available).
.. warning::
Since Python may define some pre-processor definitions which affect the standard
headers on some systems, you *must* include :file:`Python.h` before any standard
headers are included.
All user visible names defined by Python.h (except those defined by the included
standard headers) have one of the prefixes ``Py`` or ``_Py``. Names beginning
with ``_Py`` are for internal use by the Python implementation and should not be
used by extension writers. Structure member names do not have a reserved prefix.
**Important:** user code should never define names that begin with ``Py`` or
``_Py``. This confuses the reader, and jeopardizes the portability of the user
code to future Python versions, which may define additional names beginning with
one of these prefixes.
The header files are typically installed with Python. On Unix, these are
located in the directories :file:`{prefix}/include/pythonversion/` and
:file:`{exec_prefix}/include/pythonversion/`, where :envvar:`prefix` and
:envvar:`exec_prefix` are defined by the corresponding parameters to Python's
:program:`configure` script and *version* is ``sys.version[:3]``. On Windows,
the headers are installed in :file:`{prefix}/include`, where :envvar:`prefix` is
the installation directory specified to the installer.
To include the headers, place both directories (if different) on your compiler's
search path for includes. Do *not* place the parent directories on the search
path and then use ``#include <pythonX.Y/Python.h>``; this will break on
multi-platform builds since the platform independent headers under
:envvar:`prefix` include the platform specific headers from
:envvar:`exec_prefix`.
C++ users should note that though the API is defined entirely using C, the
header files do properly declare the entry points to be ``extern "C"``, so there
is no need to do anything special to use the API from C++.
.. _api-objects:
Objects, Types and Reference Counts
===================================
.. index:: object: type
Most Python/C API functions have one or more arguments as well as a return value
of type :ctype:`PyObject\*`. This type is a pointer to an opaque data type
representing an arbitrary Python object. Since all Python object types are
treated the same way by the Python language in most situations (e.g.,
assignments, scope rules, and argument passing), it is only fitting that they
should be represented by a single C type. Almost all Python objects live on the
heap: you never declare an automatic or static variable of type
:ctype:`PyObject`, only pointer variables of type :ctype:`PyObject\*` can be
declared. The sole exception are the type objects; since these must never be
deallocated, they are typically static :ctype:`PyTypeObject` objects.
All Python objects (even Python integers) have a :dfn:`type` and a
:dfn:`reference count`. An object's type determines what kind of object it is
(e.g., an integer, a list, or a user-defined function; there are many more as
explained in :ref:`types`). For each of the well-known types there is a macro
to check whether an object is of that type; for instance, ``PyList_Check(a)`` is
true if (and only if) the object pointed to by *a* is a Python list.
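For instance, an extension function that insists on being passed a list might
begin like this (``process_list`` is an invented name; the function is only a
sketch)::

   static PyObject *
   process_list(PyObject *self, PyObject *args)
   {
       PyObject *obj;

       if (!PyArg_ParseTuple(args, "O", &obj))
           return NULL;
       if (!PyList_Check(obj)) {
           PyErr_SetString(PyExc_TypeError, "expected a list");
           return NULL;
       }
       /* ...operate on the list... */
       Py_RETURN_NONE;
   }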
.. _api-refcounts:
Reference Counts
----------------
The reference count is important because today's computers have a finite (and
often severely limited) memory size; it counts how many different places there
are that have a reference to an object. Such a place could be another object,
or a global (or static) C variable, or a local variable in some C function.
When an object's reference count becomes zero, the object is deallocated. If
it contains references to other objects, their reference count is decremented.
Those other objects may be deallocated in turn, if this decrement makes their
reference count become zero, and so on. (There's an obvious problem with
objects that reference each other here; for now, the solution is "don't do
that.")
.. index::
single: Py_INCREF()
single: Py_DECREF()
Reference counts are always manipulated explicitly. The normal way is to use
the macro :cfunc:`Py_INCREF` to increment an object's reference count by one,
and :cfunc:`Py_DECREF` to decrement it by one. The :cfunc:`Py_DECREF` macro
is considerably more complex than the incref one, since it must check whether
the reference count becomes zero and then cause the object's deallocator to be
called. The deallocator is a function pointer contained in the object's type
structure. The type-specific deallocator takes care of decrementing the
reference counts for other objects contained in the object if this is a compound
object type, such as a list, as well as performing any additional finalization
that's needed. There's no chance that the reference count can overflow; at
least as many bits are used to hold the reference count as there are distinct
memory locations in virtual memory (assuming ``sizeof(long) >= sizeof(char*)``).
Thus, the reference count increment is a simple operation.
It is not necessary to increment an object's reference count for every local
variable that contains a pointer to an object. In theory, the object's
reference count goes up by one when the variable is made to point to it and it
goes down by one when the variable goes out of scope. However, these two
cancel each other out, so at the end the reference count hasn't changed. The
only real reason to use the reference count is to prevent the object from being
deallocated as long as our variable is pointing to it. If we know that there
is at least one other reference to the object that lives at least as long as
our variable, there is no need to increment the reference count temporarily.
An important situation where this arises is in objects that are passed as
arguments to C functions in an extension module that are called from Python;
the call mechanism guarantees to hold a reference to every argument for the
duration of the call.
However, a common pitfall is to extract an object from a list and hold on to it
for a while without incrementing its reference count. Some other operation might
conceivably remove the object from the list, decrementing its reference count
and possibly deallocating it. The real danger is that innocent-looking
operations may invoke arbitrary Python code which could do this; there is a code
path which allows control to flow back to the user from a :cfunc:`Py_DECREF`, so
almost any operation is potentially dangerous.
A safe approach is to always use the generic operations (functions whose name
begins with ``PyObject_``, ``PyNumber_``, ``PySequence_`` or ``PyMapping_``).
These operations always increment the reference count of the object they return.
This leaves the caller with the responsibility to call :cfunc:`Py_DECREF` when
they are done with the result; this soon becomes second nature.
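A small sketch of that pattern, assuming *obj* is an object handed to us by a
caller (so we merely borrow it) and that we are inside a function returning
:ctype:`PyObject\*`::

   PyObject *repr = PyObject_Repr(obj);    /* new reference, or NULL on error */

   if (repr == NULL)
       return NULL;
   printf("%s\n", PyString_AsString(repr));
   Py_DECREF(repr);                        /* we own it, so we must release it */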
.. _api-refcountdetails:
Reference Count Details
^^^^^^^^^^^^^^^^^^^^^^^
The reference count behavior of functions in the Python/C API is best explained
in terms of *ownership of references*. Ownership pertains to references, never
to objects (objects are not owned: they are always shared). "Owning a
reference" means being responsible for calling Py_DECREF on it when the
reference is no longer needed. Ownership can also be transferred, meaning that
the code that receives ownership of the reference then becomes responsible for
eventually decref'ing it by calling :cfunc:`Py_DECREF` or :cfunc:`Py_XDECREF`
when it's no longer needed---or passing on this responsibility (usually to its
caller). When a function passes ownership of a reference on to its caller, the
caller is said to receive a *new* reference. When no ownership is transferred,
the caller is said to *borrow* the reference. Nothing needs to be done for a
borrowed reference.
Conversely, when a calling function passes it a reference to an object, there
are two possibilities: the function *steals* a reference to the object, or it
does not. *Stealing a reference* means that when you pass a reference to a
function, that function assumes that it now owns that reference, and you are not
responsible for it any longer.
.. index::
single: PyList_SetItem()
single: PyTuple_SetItem()
Few functions steal references; the two notable exceptions are
:cfunc:`PyList_SetItem` and :cfunc:`PyTuple_SetItem`, which steal a reference
to the item (but not to the tuple or list into which the item is put!). These
functions were designed to steal a reference because of a common idiom for
populating a tuple or list with newly created objects; for example, the code to
create the tuple ``(1, 2, "three")`` could look like this (forgetting about
error handling for the moment; a better way to code this is shown below)::
PyObject *t;
t = PyTuple_New(3);
PyTuple_SetItem(t, 0, PyInt_FromLong(1L));
PyTuple_SetItem(t, 1, PyInt_FromLong(2L));
PyTuple_SetItem(t, 2, PyString_FromString("three"));
Here, :cfunc:`PyInt_FromLong` returns a new reference which is immediately
stolen by :cfunc:`PyTuple_SetItem`. When you want to keep using an object
although the reference to it will be stolen, use :cfunc:`Py_INCREF` to grab
another reference before calling the reference-stealing function.
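For example, to keep using the string from the last line of the snippet above
after it has been handed to the stealing call::

   PyObject *three = PyString_FromString("three");

   Py_INCREF(three);               /* grab an extra reference of our own... */
   PyTuple_SetItem(t, 2, three);   /* ...because this call steals one */
   /* three remains usable here; call Py_DECREF(three) when done with it. */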
Incidentally, :cfunc:`PyTuple_SetItem` is the *only* way to set tuple items;
:cfunc:`PySequence_SetItem` and :cfunc:`PyObject_SetItem` refuse to do this
since tuples are an immutable data type. You should only use
:cfunc:`PyTuple_SetItem` for tuples that you are creating yourself.
Equivalent code for populating a list can be written using :cfunc:`PyList_New`
and :cfunc:`PyList_SetItem`.
However, in practice, you will rarely use these ways of creating and populating
a tuple or list. There's a generic function, :cfunc:`Py_BuildValue`, that can
create most common objects from C values, directed by a :dfn:`format string`.
For example, the above two blocks of code could be replaced by the following
(which also takes care of the error checking)::
PyObject *tuple, *list;
tuple = Py_BuildValue("(iis)", 1, 2, "three");
list = Py_BuildValue("[iis]", 1, 2, "three");
It is much more common to use :cfunc:`PyObject_SetItem` and friends with items
whose references you are only borrowing, like arguments that were passed in to
the function you are writing. In that case, their behaviour regarding reference
counts is much saner, since you don't have to increment a reference count so you
can give a reference away ("have it be stolen"). For example, this function
sets all items of a list (actually, any mutable sequence) to a given item::
   int
   set_all(PyObject *target, PyObject *item)
   {
       int i, n;

       n = PyObject_Length(target);
       if (n < 0)
           return -1;
       for (i = 0; i < n; i++) {
           PyObject *index = PyInt_FromLong(i);
           if (!index)
               return -1;
           if (PyObject_SetItem(target, index, item) < 0) {
               Py_DECREF(index);    /* don't leak the index on failure */
               return -1;
           }
           Py_DECREF(index);
       }
       return 0;
   }
.. index:: single: set_all()
The situation is slightly different for function return values. While passing
a reference to most functions does not change your ownership responsibilities
for that reference, many functions that return a reference to an object give
you ownership of the reference. The reason is simple: in many cases, the
returned object is created on the fly, and the reference you get is the only
reference to the object. Therefore, the generic functions that return object
references, like :cfunc:`PyObject_GetItem` and :cfunc:`PySequence_GetItem`,
always return a new reference (the caller becomes the owner of the reference).
It is important to realize that whether you own a reference returned by a
function depends on which function you call only --- *the plumage* (the type of
the object passed as an argument to the function) *doesn't enter into it!*
Thus, if you extract an item from a list using :cfunc:`PyList_GetItem`, you
don't own the reference --- but if you obtain the same item from the same list
using :cfunc:`PySequence_GetItem` (which happens to take exactly the same
arguments), you do own a reference to the returned object.
.. index::
single: PyList_GetItem()
single: PySequence_GetItem()
Here is an example of how you could write a function that computes the sum of
the items in a list of integers; once using :cfunc:`PyList_GetItem`, and once
using :cfunc:`PySequence_GetItem`. ::
long
sum_list(PyObject *list)
{
int i, n;
long total = 0;
PyObject *item;
n = PyList_Size(list);
if (n < 0)
return -1; /* Not a list */
for (i = 0; i < n; i++) {
item = PyList_GetItem(list, i); /* Can't fail */
if (!PyInt_Check(item)) continue; /* Skip non-integers */
total += PyInt_AsLong(item);
}
return total;
}
.. index:: single: sum_list()
::
long
sum_sequence(PyObject *sequence)
{
int i, n;
long total = 0;
PyObject *item;
n = PySequence_Length(sequence);
if (n < 0)
return -1; /* Has no length */
for (i = 0; i < n; i++) {
item = PySequence_GetItem(sequence, i);
if (item == NULL)
return -1; /* Not a sequence, or other failure */
if (PyInt_Check(item))
total += PyInt_AsLong(item);
Py_DECREF(item); /* Discard reference ownership */
}
return total;
}
.. index:: single: sum_sequence()
.. _api-types:
Types
-----
There are few other data types that play a significant role in the Python/C
API; most are simple C types such as :ctype:`int`, :ctype:`long`,
:ctype:`double` and :ctype:`char\*`. A few structure types are used to
describe static tables used to list the functions exported by a module or the
data attributes of a new object type, and another is used to describe the value
of a complex number. These will be discussed together with the functions that
use them.
.. _api-exceptions:
Exceptions
==========
The Python programmer only needs to deal with exceptions if specific error
handling is required; unhandled exceptions are automatically propagated to the
caller, then to the caller's caller, and so on, until they reach the top-level
interpreter, where they are reported to the user accompanied by a stack
traceback.
.. index:: single: PyErr_Occurred()
For C programmers, however, error checking always has to be explicit. All
functions in the Python/C API can raise exceptions, unless an explicit claim is
made otherwise in a function's documentation. In general, when a function
encounters an error, it sets an exception, discards any object references that
it owns, and returns an error indicator --- usually *NULL* or ``-1``. A few
functions return a Boolean true/false result, with false indicating an error.
Very few functions return no explicit error indicator or have an ambiguous
return value, and require explicit testing for errors with
:cfunc:`PyErr_Occurred`.
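A sketch of the usual checking pattern, assuming *a* and *b* are objects we
already hold references to (:cfunc:`PyNumber_Add` serves here merely as an
example of a call that returns *NULL* on failure)::

   PyObject *sum = PyNumber_Add(a, b);

   if (sum == NULL)
       return NULL;    /* an exception is already set; let the caller see it */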
.. index::
single: PyErr_SetString()
single: PyErr_Clear()
Exception state is maintained in per-thread storage (this is equivalent to
using global storage in an unthreaded application). A thread can be in one of
two states: an exception has occurred, or not. The function
:cfunc:`PyErr_Occurred` can be used to check for this: it returns a borrowed
reference to the exception type object when an exception has occurred, and
*NULL* otherwise. There are a number of functions to set the exception state:
:cfunc:`PyErr_SetString` is the most common (though not the most general)
function to set the exception state, and :cfunc:`PyErr_Clear` clears the
exception state.
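Setting the exception state from C typically looks like the following sketch,
where ``value`` is assumed to be a C integer derived from the function's
arguments and the message text is arbitrary::

   if (value < 0) {
       PyErr_SetString(PyExc_ValueError, "value must be non-negative");
       return NULL;
   }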
.. index::
single: exc_type (in module sys)
single: exc_value (in module sys)
single: exc_traceback (in module sys)
The full exception state consists of three objects (all of which can be
*NULL*): the exception type, the corresponding exception value, and the
traceback. These have the same meanings as the Python objects
``sys.exc_type``, ``sys.exc_value``, and ``sys.exc_traceback``; however, they
are not the same: the Python objects represent the last exception being handled
by a Python :keyword:`try` ... :keyword:`except` statement, while the C level
exception state only exists while an exception is being passed on between C
functions until it reaches the Python bytecode interpreter's main loop, which
takes care of transferring it to ``sys.exc_type`` and friends.
.. index:: single: exc_info() (in module sys)
Note that starting with Python 1.5, the preferred, thread-safe way to access the
exception state from Python code is to call the function :func:`sys.exc_info`,
which returns the per-thread exception state for Python code. Also, the
semantics of both ways to access the exception state have changed so that a
function which catches an exception will save and restore its thread's exception
state so as to preserve the exception state of its caller. This prevents common
bugs in exception handling code caused by an innocent-looking function
overwriting the exception being handled; it also reduces the often unwanted
lifetime extension for objects that are referenced by the stack frames in the
traceback.
As a general principle, a function that calls another function to perform some
task should check whether the called function raised an exception, and if so,
pass the exception state on to its caller. It should discard any object
references that it owns, and return an error indicator, but it should *not* set
another exception --- that would overwrite the exception that was just raised,
and lose important information about the exact cause of the error.
.. index:: single: sum_sequence()
A simple example of detecting exceptions and passing them on is shown in the
:cfunc:`sum_sequence` example above. It so happens that that example doesn't
need to clean up any owned references when it detects an error. The following
example function shows some error cleanup. First, to remind you why you like
Python, we show the equivalent Python code::
def incr_item(dict, key):
try:
item = dict[key]
except KeyError:
item = 0
dict[key] = item + 1
.. index:: single: incr_item()
Here is the corresponding C code, in all its glory::
int
incr_item(PyObject *dict, PyObject *key)
{
/* Objects all initialized to NULL for Py_XDECREF */
PyObject *item = NULL, *const_one = NULL, *incremented_item = NULL;
int rv = -1; /* Return value initialized to -1 (failure) */
item = PyObject_GetItem(dict, key);
if (item == NULL) {
/* Handle KeyError only: */
if (!PyErr_ExceptionMatches(PyExc_KeyError))
goto error;
/* Clear the error and use zero: */
PyErr_Clear();
item = PyInt_FromLong(0L);
if (item == NULL)
goto error;
}
const_one = PyInt_FromLong(1L);
if (const_one == NULL)
goto error;
incremented_item = PyNumber_Add(item, const_one);
if (incremented_item == NULL)
goto error;
if (PyObject_SetItem(dict, key, incremented_item) < 0)
goto error;
rv = 0; /* Success */
/* Continue with cleanup code */
error:
/* Cleanup code, shared by success and failure path */
/* Use Py_XDECREF() to ignore NULL references */
Py_XDECREF(item);
Py_XDECREF(const_one);
Py_XDECREF(incremented_item);
return rv; /* -1 for error, 0 for success */
}
.. index:: single: incr_item()
.. index::
single: PyErr_ExceptionMatches()
single: PyErr_Clear()
single: Py_XDECREF()
This example represents an endorsed use of the :keyword:`goto` statement in C!
It illustrates the use of :cfunc:`PyErr_ExceptionMatches` and
:cfunc:`PyErr_Clear` to handle specific exceptions, and the use of
:cfunc:`Py_XDECREF` to dispose of owned references that may be *NULL* (note the
``'X'`` in the name; :cfunc:`Py_DECREF` would crash when confronted with a
*NULL* reference). It is important that the variables used to hold owned
references are initialized to *NULL* for this to work; likewise, the proposed
return value is initialized to ``-1`` (failure) and only set to success after
the final call made is successful.
.. _api-embedding:
Embedding Python
================
The one important task that only embedders (as opposed to extension writers) of
the Python interpreter have to worry about is the initialization, and possibly
the finalization, of the Python interpreter. Most functionality of the
interpreter can only be used after the interpreter has been initialized.
.. index::
single: Py_Initialize()
module: __builtin__
module: __main__
module: sys
module: exceptions
triple: module; search; path
single: path (in module sys)
The basic initialization function is :cfunc:`Py_Initialize`. This initializes
the table of loaded modules, and creates the fundamental modules
:mod:`__builtin__`, :mod:`__main__`, :mod:`sys`, and :mod:`exceptions`. It also
initializes the module search path (``sys.path``).
.. index:: single: PySys_SetArgv()
:cfunc:`Py_Initialize` does not set the "script argument list" (``sys.argv``).
If this variable is needed by Python code that will be executed later, it must
be set explicitly with a call to ``PySys_SetArgv(argc, argv)`` subsequent to
the call to :cfunc:`Py_Initialize`.
On most systems (in particular, on Unix and Windows, although the details are
slightly different), :cfunc:`Py_Initialize` calculates the module search path
based upon its best guess for the location of the standard Python interpreter
executable, assuming that the Python library is found in a fixed location
relative to the Python interpreter executable. In particular, it looks for a
directory named :file:`lib/python{X.Y}` relative to the parent directory
where the executable named :file:`python` is found on the shell command search
path (the environment variable :envvar:`PATH`).
For instance, if the Python executable is found in
:file:`/usr/local/bin/python`, it will assume that the libraries are in
:file:`/usr/local/lib/python{X.Y}`. (In fact, this particular path is also
the "fallback" location, used when no executable file named :file:`python` is
found along :envvar:`PATH`.) The user can override this behavior by setting the
environment variable :envvar:`PYTHONHOME`, or insert additional directories in
front of the standard path by setting :envvar:`PYTHONPATH`.
.. index::
single: Py_SetProgramName()
single: Py_GetPath()
single: Py_GetPrefix()
single: Py_GetExecPrefix()
single: Py_GetProgramFullPath()
The embedding application can steer the search by calling
``Py_SetProgramName(file)`` *before* calling :cfunc:`Py_Initialize`. Note that
:envvar:`PYTHONHOME` still overrides this and :envvar:`PYTHONPATH` is still
inserted in front of the standard path. An application that requires total
control has to provide its own implementation of :cfunc:`Py_GetPath`,
:cfunc:`Py_GetPrefix`, :cfunc:`Py_GetExecPrefix`, and
:cfunc:`Py_GetProgramFullPath` (all defined in :file:`Modules/getpath.c`).
.. index:: single: Py_IsInitialized()
Sometimes, it is desirable to "uninitialize" Python. For instance, the
application may want to start over (make another call to
:cfunc:`Py_Initialize`) or the application is simply done with its use of
Python and wants to free memory allocated by Python. This can be accomplished
by calling :cfunc:`Py_Finalize`. The function :cfunc:`Py_IsInitialized` returns
true if Python is currently in the initialized state. More information about
these functions is given in a later chapter. Notice that :cfunc:`Py_Finalize`
does *not* free all memory allocated by the Python interpreter, e.g. memory
allocated by extension modules currently cannot be released.
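Putting these pieces together, a minimal embedding skeleton might look like the
following sketch (a real application would add error handling and run more
interesting code)::

   #include "Python.h"

   int
   main(int argc, char **argv)
   {
       Py_SetProgramName(argv[0]);   /* optional: helps locate the library */
       Py_Initialize();
       PyRun_SimpleString("print 'hello from embedded Python'");
       Py_Finalize();
       return 0;
   }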
.. _api-debugging:
Debugging Builds
================
Python can be built with several macros to enable extra checks of the
interpreter and extension modules. These checks tend to add a large amount of
overhead to the runtime so they are not enabled by default.
A full list of the various types of debugging builds is in the file
:file:`Misc/SpecialBuilds.txt` in the Python source distribution. Builds are
available that support tracing of reference counts, debugging the memory
allocator, or low-level profiling of the main interpreter loop. Only the most
frequently-used builds will be described in the remainder of this section.
Compiling the interpreter with the :cmacro:`Py_DEBUG` macro defined produces
what is generally meant by "a debug build" of Python. :cmacro:`Py_DEBUG` is
enabled in the Unix build by adding :option:`--with-pydebug` to the
:file:`configure` command. It is also implied by the presence of the
not-Python-specific :cmacro:`_DEBUG` macro. When :cmacro:`Py_DEBUG` is enabled
in the Unix build, compiler optimization is disabled.
In addition to the reference count debugging described below, the following
extra checks are performed:
* Extra checks are added to the object allocator.
* Extra checks are added to the parser and compiler.
* Downcasts from wide types to narrow types are checked for loss of information.
* A number of assertions are added to the dictionary and set implementations.
In addition, the set object acquires a :meth:`test_c_api` method.
* Sanity checks of the input arguments are added to frame creation.
* The storage for long ints is initialized with a known invalid pattern to catch
references to uninitialized digits.
* Low-level tracing and extra exception checking are added to the runtime
virtual machine.
* Extra checks are added to the memory arena implementation.
* Extra debugging is added to the thread module.
There may be additional checks not mentioned here.
Defining :cmacro:`Py_TRACE_REFS` enables reference tracing. When defined, a
circular doubly linked list of active objects is maintained by adding two extra
fields to every :ctype:`PyObject`. Total allocations are tracked as well. Upon
exit, all existing references are printed. (In interactive mode this happens
after every statement run by the interpreter.) Implied by :cmacro:`Py_DEBUG`.
Please refer to :file:`Misc/SpecialBuilds.txt` in the Python source distribution
for more detailed information.
.. highlightlang:: c
.. _memory:
*****************
Memory Management
*****************
.. sectionauthor:: Vladimir Marangozov <Vladimir.Marangozov@inrialpes.fr>
.. _memoryoverview:
Overview
========
Memory management in Python involves a private heap containing all Python
objects and data structures. The management of this private heap is ensured
internally by the *Python memory manager*. The Python memory manager has
different components which deal with various dynamic storage management aspects,
like sharing, segmentation, preallocation or caching.
At the lowest level, a raw memory allocator ensures that there is enough room in
the private heap for storing all Python-related data by interacting with the
memory manager of the operating system. On top of the raw memory allocator,
several object-specific allocators operate on the same heap and implement
distinct memory management policies adapted to the peculiarities of every object
type. For example, integer objects are managed differently within the heap than
strings, tuples or dictionaries because integers imply different storage
requirements and speed/space tradeoffs. The Python memory manager thus delegates
some of the work to the object-specific allocators, but ensures that the latter
operate within the bounds of the private heap.
It is important to understand that the management of the Python heap is
performed by the interpreter itself and that the user has no control over it,
even if she regularly manipulates object pointers to memory blocks inside that
heap. The allocation of heap space for Python objects and other internal
buffers is performed on demand by the Python memory manager through the Python/C
API functions listed in this document.
.. index::
single: malloc()
single: calloc()
single: realloc()
single: free()
To avoid memory corruption, extension writers should never try to operate on
Python objects with the functions exported by the C library: :cfunc:`malloc`,
:cfunc:`calloc`, :cfunc:`realloc` and :cfunc:`free`. This will result in mixed
calls between the C allocator and the Python memory manager with fatal
consequences, because they implement different algorithms and operate on
different heaps. However, one may safely allocate and release memory blocks
with the C library allocator for individual purposes, as shown in the following
example::
PyObject *res;
char *buf = (char *) malloc(BUFSIZ); /* for I/O */
if (buf == NULL)
return PyErr_NoMemory();
/* ...Do some I/O operation involving buf... */
res = PyString_FromString(buf);
free(buf); /* malloc'ed */
return res;
In this example, the memory request for the I/O buffer is handled by the C
library allocator. The Python memory manager is involved only in the allocation
of the string object returned as a result.
In most situations, however, it is recommended to allocate memory from the
Python heap specifically because the latter is under control of the Python
memory manager. For example, this is required when the interpreter is extended
with new object types written in C. Another reason for using the Python heap is
the desire to *inform* the Python memory manager about the memory needs of the
extension module. Even when the requested memory is used exclusively for
internal, highly-specific purposes, delegating all memory requests to the Python
memory manager causes the interpreter to have a more accurate image of its
memory footprint as a whole. Consequently, under certain circumstances, the
Python memory manager may or may not trigger appropriate actions, like garbage
collection, memory compaction or other preventive procedures. Note that by using
the C library allocator as shown in the previous example, the allocated memory
for the I/O buffer completely escapes the Python memory manager.
.. _memoryinterface:
Memory Interface
================
The following function sets, modeled after the ANSI C standard, but specifying
behavior when requesting zero bytes, are available for allocating and releasing
memory from the Python heap:
.. cfunction:: void* PyMem_Malloc(size_t n)
Allocates *n* bytes and returns a pointer of type :ctype:`void\*` to the
allocated memory, or *NULL* if the request fails. Requesting zero bytes returns
a distinct non-*NULL* pointer if possible, as if :cfunc:`PyMem_Malloc(1)` had
been called instead. The memory will not have been initialized in any way.
.. cfunction:: void* PyMem_Realloc(void *p, size_t n)
Resizes the memory block pointed to by *p* to *n* bytes. The contents will be
unchanged to the minimum of the old and the new sizes. If *p* is *NULL*, the
call is equivalent to :cfunc:`PyMem_Malloc(n)`; else if *n* is equal to zero,
the memory block is resized but is not freed, and the returned pointer is
non-*NULL*. Unless *p* is *NULL*, it must have been returned by a previous call
to :cfunc:`PyMem_Malloc` or :cfunc:`PyMem_Realloc`. If the request fails,
:cfunc:`PyMem_Realloc` returns *NULL* and *p* remains a valid pointer to the
previous memory area.
.. cfunction:: void PyMem_Free(void *p)
Frees the memory block pointed to by *p*, which must have been returned by a
previous call to :cfunc:`PyMem_Malloc` or :cfunc:`PyMem_Realloc`. Otherwise, or
if :cfunc:`PyMem_Free(p)` has been called before, undefined behavior occurs. If
*p* is *NULL*, no operation is performed.
The following type-oriented macros are provided for convenience. Note that
*TYPE* refers to any C type.
.. cfunction:: TYPE* PyMem_New(TYPE, size_t n)
Same as :cfunc:`PyMem_Malloc`, but allocates ``(n * sizeof(TYPE))`` bytes of
memory. Returns a pointer cast to :ctype:`TYPE\*`. The memory will not have
been initialized in any way.
.. cfunction:: TYPE* PyMem_Resize(void *p, TYPE, size_t n)
Same as :cfunc:`PyMem_Realloc`, but the memory block is resized to ``(n *
sizeof(TYPE))`` bytes. Returns a pointer cast to :ctype:`TYPE\*`. On return,
*p* will be a pointer to the new memory area, or *NULL* in the event of failure.
.. cfunction:: void PyMem_Del(void *p)
Same as :cfunc:`PyMem_Free`.
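A short sketch of the type-oriented macros in use (the sizes are arbitrary;
note that on failure :cfunc:`PyMem_Resize` sets the pointer to *NULL* without
freeing the original block, a detail this minimal sketch does not try to
recover from)::

   size_t n = 16;
   char *buf = PyMem_New(char, n);

   if (buf == NULL)
       return PyErr_NoMemory();
   /* The buffer turned out to be too small; grow it. */
   n = 64;
   PyMem_Resize(buf, char, n);
   if (buf == NULL)
       return PyErr_NoMemory();
   /* ...use buf... */
   PyMem_Del(buf);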
In addition, the following macro sets are provided for calling the Python memory
allocator directly, without involving the C API functions listed above. However,
note that their use does not preserve binary compatibility across Python
versions and is therefore deprecated in extension modules.
:cfunc:`PyMem_MALLOC`, :cfunc:`PyMem_REALLOC`, :cfunc:`PyMem_FREE`.
:cfunc:`PyMem_NEW`, :cfunc:`PyMem_RESIZE`, :cfunc:`PyMem_DEL`.
.. _memoryexamples:
Examples
========
Here is the example from section :ref:`memoryoverview`, rewritten so that the
I/O buffer is allocated from the Python heap by using the first function set::
PyObject *res;
char *buf = (char *) PyMem_Malloc(BUFSIZ); /* for I/O */
if (buf == NULL)
return PyErr_NoMemory();
/* ...Do some I/O operation involving buf... */
res = PyString_FromString(buf);
PyMem_Free(buf); /* allocated with PyMem_Malloc */
return res;
The same code using the type-oriented function set::
PyObject *res;
char *buf = PyMem_New(char, BUFSIZ); /* for I/O */
if (buf == NULL)
return PyErr_NoMemory();
/* ...Do some I/O operation involving buf... */
res = PyString_FromString(buf);
PyMem_Del(buf); /* allocated with PyMem_New */
return res;
Note that in the two examples above, the buffer is always manipulated via
functions belonging to the same set. Indeed, it is required to use the same
memory API family for a given memory block, so that the risk of mixing different
allocators is reduced to a minimum. The following code sequence contains two
errors, one of which is labeled as *fatal* because it mixes two different
allocators operating on different heaps. ::
char *buf1 = PyMem_New(char, BUFSIZ);
char *buf2 = (char *) malloc(BUFSIZ);
char *buf3 = (char *) PyMem_Malloc(BUFSIZ);
...
PyMem_Del(buf3); /* Wrong -- should be PyMem_Free() */
free(buf2); /* Right -- allocated via malloc() */
free(buf1); /* Fatal -- should be PyMem_Del() */
In addition to the functions aimed at handling raw memory blocks from the Python
heap, objects in Python are allocated and released with :cfunc:`PyObject_New`,
:cfunc:`PyObject_NewVar` and :cfunc:`PyObject_Del`.
These will be explained in the next chapter on defining and implementing new
object types in C.
.. highlightlang:: c
.. _countingrefs:
******************
Reference Counting
******************
The macros in this section are used for managing reference counts of Python
objects.
.. cfunction:: void Py_INCREF(PyObject *o)
Increment the reference count for object *o*. The object must not be *NULL*; if
you aren't sure that it isn't *NULL*, use :cfunc:`Py_XINCREF`.
.. cfunction:: void Py_XINCREF(PyObject *o)
Increment the reference count for object *o*. The object may be *NULL*, in
which case the macro has no effect.
.. cfunction:: void Py_DECREF(PyObject *o)
Decrement the reference count for object *o*. The object must not be *NULL*; if
you aren't sure that it isn't *NULL*, use :cfunc:`Py_XDECREF`. If the reference
count reaches zero, the object's type's deallocation function (which must not be
*NULL*) is invoked.
.. warning::
The deallocation function can cause arbitrary Python code to be invoked (e.g.
when a class instance with a :meth:`__del__` method is deallocated). While
exceptions in such code are not propagated, the executed code has free access to
all Python global variables. This means that any object that is reachable from
a global variable should be in a consistent state before :cfunc:`Py_DECREF` is
invoked. For example, code to delete an object from a list should copy a
reference to the deleted object in a temporary variable, update the list data
structure, and then call :cfunc:`Py_DECREF` for the temporary variable.
.. cfunction:: void Py_XDECREF(PyObject *o)
Decrement the reference count for object *o*. The object may be *NULL*, in
which case the macro has no effect; otherwise the effect is the same as for
:cfunc:`Py_DECREF`, and the same warning applies.
.. cfunction:: void Py_CLEAR(PyObject *o)
Decrement the reference count for object *o*. The object may be *NULL*, in
which case the macro has no effect; otherwise the effect is the same as for
:cfunc:`Py_DECREF`, except that the argument is also set to *NULL*. The warning
for :cfunc:`Py_DECREF` does not apply with respect to the object passed because
the macro carefully uses a temporary variable and sets the argument to *NULL*
before decrementing its reference count.
It is a good idea to use this macro whenever decrementing the value of a
variable that might be traversed during garbage collection.
.. versionadded:: 2.4
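For instance, a ``tp_clear`` handler for a hypothetical container type (a
sketch only, not taken from the reference text) would typically use
:cfunc:`Py_CLEAR` for each member::

   #include <Python.h>

   typedef struct {
       PyObject_HEAD
       PyObject *first;    /* owned references that may form a cycle */
       PyObject *last;
   } PairObject;

   /* Py_CLEAR sets each field to NULL before releasing the reference, so
      arbitrary code run during deallocation never sees a dangling pointer
      through these members. */
   static int
   Pair_clear(PairObject *self)
   {
       Py_CLEAR(self->first);
       Py_CLEAR(self->last);
       return 0;
   }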
The following functions are for runtime dynamic embedding of Python:
``Py_IncRef(PyObject *o)``, ``Py_DecRef(PyObject *o)``. They are
simply exported function versions of :cfunc:`Py_XINCREF` and
:cfunc:`Py_XDECREF`, respectively.
The following functions or macros are only for use within the interpreter core:
:cfunc:`_Py_Dealloc`, :cfunc:`_Py_ForgetReference`, :cfunc:`_Py_NewReference`,
as well as the global variable :cdata:`_Py_RefTotal`.

File diff suppressed because it is too large Load Diff

View File

@ -1,278 +0,0 @@
.. highlightlang:: c
.. _veryhigh:
*************************
The Very High Level Layer
*************************
The functions in this chapter will let you execute Python source code given in a
file or a buffer, but they will not let you interact in a more detailed way with
the interpreter.
Several of these functions accept a start symbol from the grammar as a
parameter. The available start symbols are :const:`Py_eval_input`,
:const:`Py_file_input`, and :const:`Py_single_input`. These are described
following the functions which accept them as parameters.
Note also that several of these functions take :ctype:`FILE\*` parameters. One
particular issue which needs to be handled carefully is that the :ctype:`FILE`
structure for different C libraries can be different and incompatible. Under
Windows (at least), it is possible for dynamically linked extensions to actually
use different libraries, so care should be taken that :ctype:`FILE\*` parameters
are only passed to these functions if it is certain that they were created by
the same library that the Python runtime is using.
.. cfunction:: int Py_Main(int argc, char **argv)
The main program for the standard interpreter. This is made available for
programs which embed Python. The *argc* and *argv* parameters should be
prepared exactly as those which are passed to a C program's :cfunc:`main`
function. It is important to note that the argument list may be modified (but
the contents of the strings pointed to by the argument list are not). The return
value will be the integer passed to the :func:`sys.exit` function, ``1`` if the
interpreter exits due to an exception, or ``2`` if the parameter list does not
represent a valid Python command line.
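A minimal embedding sketch (shown here for illustration; it is not part of the
reference entry) simply hands control to the standard interpreter::

   #include <Python.h>

   int
   main(int argc, char *argv[])
   {
       /* Py_Main() initializes and finalizes the interpreter itself. */
       return Py_Main(argc, argv);
   }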
.. cfunction:: int PyRun_AnyFile(FILE *fp, const char *filename)
This is a simplified interface to :cfunc:`PyRun_AnyFileExFlags` below, leaving
*closeit* set to ``0`` and *flags* set to *NULL*.
.. cfunction:: int PyRun_AnyFileFlags(FILE *fp, const char *filename, PyCompilerFlags *flags)
This is a simplified interface to :cfunc:`PyRun_AnyFileExFlags` below, leaving
the *closeit* argument set to ``0``.
.. cfunction:: int PyRun_AnyFileEx(FILE *fp, const char *filename, int closeit)
This is a simplified interface to :cfunc:`PyRun_AnyFileExFlags` below, leaving
the *flags* argument set to *NULL*.
.. cfunction:: int PyRun_AnyFileExFlags(FILE *fp, const char *filename, int closeit, PyCompilerFlags *flags)
If *fp* refers to a file associated with an interactive device (console or
terminal input or Unix pseudo-terminal), return the value of
:cfunc:`PyRun_InteractiveLoop`, otherwise return the result of
:cfunc:`PyRun_SimpleFile`. If *filename* is *NULL*, this function uses
``"???"`` as the filename.
.. cfunction:: int PyRun_SimpleString(const char *command)
This is a simplified interface to :cfunc:`PyRun_SimpleStringFlags` below,
leaving the *PyCompilerFlags\** argument set to NULL.
.. cfunction:: int PyRun_SimpleStringFlags(const char *command, PyCompilerFlags *flags)
Executes the Python source code from *command* in the :mod:`__main__` module
according to the *flags* argument. If :mod:`__main__` does not already exist, it
is created. Returns ``0`` on success or ``-1`` if an exception was raised. If
there was an error, there is no way to get the exception information. For the
meaning of *flags*, see below.
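For illustration only (this example is not part of the reference text), the
flag-free variant :cfunc:`PyRun_SimpleString` is often all an embedding
application needs::

   #include <Python.h>

   int
   main(int argc, char *argv[])
   {
       Py_Initialize();
       if (PyRun_SimpleString("print 'hello from embedded Python'\n") != 0) {
           /* -1: an exception was raised and has already been printed. */
           Py_Finalize();
           return 1;
       }
       Py_Finalize();
       return 0;
   }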
.. cfunction:: int PyRun_SimpleFile(FILE *fp, const char *filename)
This is a simplified interface to :cfunc:`PyRun_SimpleFileExFlags` below,
leaving *closeit* set to ``0`` and *flags* set to *NULL*.
.. cfunction:: int PyRun_SimpleFileFlags(FILE *fp, const char *filename, PyCompilerFlags *flags)
This is a simplified interface to :cfunc:`PyRun_SimpleFileExFlags` below,
leaving *closeit* set to ``0``.
.. cfunction:: int PyRun_SimpleFileEx(FILE *fp, const char *filename, int closeit)
This is a simplified interface to :cfunc:`PyRun_SimpleFileExFlags` below,
leaving *flags* set to *NULL*.
.. cfunction:: int PyRun_SimpleFileExFlags(FILE *fp, const char *filename, int closeit, PyCompilerFlags *flags)
Similar to :cfunc:`PyRun_SimpleStringFlags`, but the Python source code is read
from *fp* instead of an in-memory string. *filename* should be the name of the
file. If *closeit* is true, the file is closed before :cfunc:`PyRun_SimpleFileExFlags`
returns.
.. cfunction:: int PyRun_InteractiveOne(FILE *fp, const char *filename)
This is a simplified interface to :cfunc:`PyRun_InteractiveOneFlags` below,
leaving *flags* set to *NULL*.
.. cfunction:: int PyRun_InteractiveOneFlags(FILE *fp, const char *filename, PyCompilerFlags *flags)
Read and execute a single statement from a file associated with an interactive
device according to the *flags* argument. If *filename* is *NULL*, ``"???"`` is
used instead. The user will be prompted using ``sys.ps1`` and ``sys.ps2``.
Returns ``0`` when the input was executed successfully, ``-1`` if there was an
exception, or an error code from the :file:`errcode.h` include file distributed
as part of Python if there was a parse error. (Note that :file:`errcode.h` is
not included by :file:`Python.h`, so it must be included specifically if needed.)
.. cfunction:: int PyRun_InteractiveLoop(FILE *fp, const char *filename)
This is a simplified interface to :cfunc:`PyRun_InteractiveLoopFlags` below,
leaving *flags* set to *NULL*.
.. cfunction:: int PyRun_InteractiveLoopFlags(FILE *fp, const char *filename, PyCompilerFlags *flags)
Read and execute statements from a file associated with an interactive device
until EOF is reached. If *filename* is *NULL*, ``"???"`` is used instead. The
user will be prompted using ``sys.ps1`` and ``sys.ps2``. Returns ``0`` at EOF.
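For illustration (again, not part of the reference text), an embedding
application can turn itself into a bare interactive interpreter by reading
statements from ``stdin`` until EOF::

   #include <Python.h>

   int
   main(int argc, char *argv[])
   {
       int ret;

       Py_Initialize();
       ret = PyRun_InteractiveLoop(stdin, "<stdin>");   /* returns 0 at EOF */
       Py_Finalize();
       return ret;
   }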
.. cfunction:: struct _node* PyParser_SimpleParseString(const char *str, int start)
This is a simplified interface to
:cfunc:`PyParser_SimpleParseStringFlagsFilename` below, leaving *filename* set
to *NULL* and *flags* set to ``0``.
.. cfunction:: struct _node* PyParser_SimpleParseStringFlags( const char *str, int start, int flags)
This is a simplified interface to
:cfunc:`PyParser_SimpleParseStringFlagsFilename` below, leaving *filename* set
to *NULL*.
.. cfunction:: struct _node* PyParser_SimpleParseStringFlagsFilename( const char *str, const char *filename, int start, int flags)
Parse Python source code from *str* using the start token *start* according to
the *flags* argument. The result can be used to create a code object which can
be evaluated efficiently. This is useful if a code fragment must be evaluated
many times.
.. cfunction:: struct _node* PyParser_SimpleParseFile(FILE *fp, const char *filename, int start)
This is a simplified interface to :cfunc:`PyParser_SimpleParseFileFlags` below,
leaving *flags* set to ``0``.
.. cfunction:: struct _node* PyParser_SimpleParseFileFlags(FILE *fp, const char *filename, int start, int flags)
Similar to :cfunc:`PyParser_SimpleParseStringFlagsFilename`, but the Python
source code is read from *fp* instead of an in-memory string.
.. cfunction:: PyObject* PyRun_String(const char *str, int start, PyObject *globals, PyObject *locals)
This is a simplified interface to :cfunc:`PyRun_StringFlags` below, leaving
*flags* set to *NULL*.
.. cfunction:: PyObject* PyRun_StringFlags(const char *str, int start, PyObject *globals, PyObject *locals, PyCompilerFlags *flags)
Execute Python source code from *str* in the context specified by the
dictionaries *globals* and *locals* with the compiler flags specified by
*flags*. The parameter *start* specifies the start token that should be used to
parse the source code.
Returns the result of executing the code as a Python object, or *NULL* if an
exception was raised.
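A short sketch (not part of the reference entry) that evaluates an expression
in a private namespace::

   #include <Python.h>

   int
   main(int argc, char *argv[])
   {
       PyObject *globals, *result;

       Py_Initialize();
       globals = PyDict_New();
       /* Give the namespace access to the built-ins so that names like
          len() resolve. */
       PyDict_SetItemString(globals, "__builtins__", PyEval_GetBuiltins());

       result = PyRun_String("6 * 7", Py_eval_input, globals, globals);
       if (result == NULL)
           PyErr_Print();               /* an exception was raised */
       else {
           PyObject_Print(result, stdout, 0);
           printf("\n");
           Py_DECREF(result);
       }
       Py_DECREF(globals);
       Py_Finalize();
       return 0;
   }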
.. cfunction:: PyObject* PyRun_File(FILE *fp, const char *filename, int start, PyObject *globals, PyObject *locals)
This is a simplified interface to :cfunc:`PyRun_FileExFlags` below, leaving
*closeit* set to ``0`` and *flags* set to *NULL*.
.. cfunction:: PyObject* PyRun_FileEx(FILE *fp, const char *filename, int start, PyObject *globals, PyObject *locals, int closeit)
This is a simplified interface to :cfunc:`PyRun_FileExFlags` below, leaving
*flags* set to *NULL*.
.. cfunction:: PyObject* PyRun_FileFlags(FILE *fp, const char *filename, int start, PyObject *globals, PyObject *locals, PyCompilerFlags *flags)
This is a simplified interface to :cfunc:`PyRun_FileExFlags` below, leaving
*closeit* set to ``0``.
.. cfunction:: PyObject* PyRun_FileExFlags(FILE *fp, const char *filename, int start, PyObject *globals, PyObject *locals, int closeit, PyCompilerFlags *flags)
Similar to :cfunc:`PyRun_StringFlags`, but the Python source code is read from
*fp* instead of an in-memory string. *filename* should be the name of the file.
If *closeit* is true, the file is closed before :cfunc:`PyRun_FileExFlags`
returns.
.. cfunction:: PyObject* Py_CompileString(const char *str, const char *filename, int start)
This is a simplified interface to :cfunc:`Py_CompileStringFlags` below, leaving
*flags* set to *NULL*.
.. cfunction:: PyObject* Py_CompileStringFlags(const char *str, const char *filename, int start, PyCompilerFlags *flags)
Parse and compile the Python source code in *str*, returning the resulting code
object. The start token is given by *start*; this can be used to constrain the
code which can be compiled and should be :const:`Py_eval_input`,
:const:`Py_file_input`, or :const:`Py_single_input`. The filename specified by
*filename* is used to construct the code object and may appear in tracebacks or
:exc:`SyntaxError` exception messages. This returns *NULL* if the code cannot
be parsed or compiled.
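A sketch of the compile-once, run-many-times pattern (illustrative only; the
fragment and names are not from the reference text)::

   #include <Python.h>

   /* Assumes Py_Initialize() has already been called. */
   static void
   run_twice(void)
   {
       PyObject *code, *globals, *result;
       int i;

       globals = PyDict_New();
       PyDict_SetItemString(globals, "__builtins__", PyEval_GetBuiltins());

       /* Compile once... */
       code = Py_CompileString("print 'tick'", "<embedded>", Py_file_input);
       if (code == NULL) {
           PyErr_Print();
           Py_DECREF(globals);
           return;
       }
       /* ...evaluate as often as needed. */
       for (i = 0; i < 2; i++) {
           result = PyEval_EvalCode((PyCodeObject *)code, globals, globals);
           if (result == NULL)
               PyErr_Print();
           else
               Py_DECREF(result);
       }
       Py_DECREF(code);
       Py_DECREF(globals);
   }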
.. cvar:: int Py_eval_input
.. index:: single: Py_CompileString()
The start symbol from the Python grammar for isolated expressions; for use with
:cfunc:`Py_CompileString`.
.. cvar:: int Py_file_input
.. index:: single: Py_CompileString()
The start symbol from the Python grammar for sequences of statements as read
from a file or other source; for use with :cfunc:`Py_CompileString`. This is
the symbol to use when compiling arbitrarily long Python source code.
.. cvar:: int Py_single_input
.. index:: single: Py_CompileString()
The start symbol from the Python grammar for a single statement; for use with
:cfunc:`Py_CompileString`. This is the symbol used for the interactive
interpreter loop.
.. ctype:: struct PyCompilerFlags
This is the structure used to hold compiler flags. In cases where code is only
being compiled, it is passed as ``int flags``, and in cases where code is being
executed, it is passed as ``PyCompilerFlags *flags``. In this case, ``from
__future__ import`` can modify *flags*.
Whenever ``PyCompilerFlags *flags`` is *NULL*, :attr:`cf_flags` is treated as
equal to ``0``, and any modification due to ``from __future__ import`` is
discarded. ::
struct PyCompilerFlags {
int cf_flags;
}
.. cvar:: int CO_FUTURE_DIVISION
This bit can be set in *flags* to cause division operator ``/`` to be
interpreted as "true division" according to :pep:`238`.
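For example (a sketch, not part of the reference entry), true division can be
forced for a single call like this::

   #include <Python.h>

   /* Assumes Py_Initialize() has already been called. */
   static void
   run_with_true_division(void)
   {
       PyCompilerFlags flags;

       flags.cf_flags = CO_FUTURE_DIVISION;
       /* Prints 0.5 rather than 0, as if "from __future__ import division"
          were in effect. */
       PyRun_SimpleStringFlags("print 1 / 2\n", &flags);
   }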

View File

@ -1,55 +0,0 @@
# -*- coding: utf-8 -*-
#
# Python documentation build configuration file
#
# The contents of this file are pickled, so don't put values in the namespace
# that aren't pickleable (module imports are okay, they're removed automatically).
#
# The default replacements for |version| and |release|.
# If 'auto', Sphinx looks for the Include/patchlevel.h file in the current Python
# source tree and replaces the values accordingly.
#
# The short X.Y version.
# version = '2.6'
version = 'auto'
# The full version, including alpha/beta/rc tags.
# release = '2.6a0'
release = 'auto'
# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
today = ''
# Else, today_fmt is used as the format for a strftime call.
today_fmt = '%B %d, %Y'
# The base URL for download links.
download_base_url = 'http://docs.python.org/ftp/python/doc/'
# List of files that shouldn't be included in the build.
unused_files = [
'whatsnew/2.0.rst',
'whatsnew/2.1.rst',
'whatsnew/2.2.rst',
'whatsnew/2.3.rst',
'whatsnew/2.4.rst',
'whatsnew/2.5.rst',
'maclib/scrap.rst',
'library/xmllib.rst',
'library/xml.etree.rst',
]
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
last_updated_format = '%b %d, %Y'
# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
use_smartypants = True
# If true, '()' will be appended to :func: etc. cross-reference text.
add_function_parentheses = True
# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
add_module_names = True

View File

@ -1,21 +0,0 @@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Python Documentation contents
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
.. toctree::
whatsnew/2.6.rst
tutorial/index.rst
reference/index.rst
library/index.rst
extending/index.rst
c-api/index.rst
distutils/index.rst
install/index.rst
documenting/index.rst
howto/index.rst
about.rst
bugs.rst
copyright.rst
license.rst

View File

@ -1,19 +0,0 @@
*********
Copyright
*********
Python and this documentation are:
Copyright © 2001-2007 Python Software Foundation. All rights reserved.
Copyright © 2000 BeOpen.com. All rights reserved.
Copyright © 1995-2000 Corporation for National Research Initiatives. All rights
reserved.
Copyright © 1991-1995 Stichting Mathematisch Centrum. All rights reserved.
-------
See :ref:`history-and-license` for complete license and permissions information.

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@ -1,405 +0,0 @@
.. _built-dist:
****************************
Creating Built Distributions
****************************
A "built distribution" is what you're probably used to thinking of either as a
"binary package" or an "installer" (depending on your background). It's not
necessarily binary, though, because it might contain only Python source code
and/or byte-code; and we don't call it a package, because that word is already
spoken for in Python. (And "installer" is a term specific to the world of
mainstream desktop systems.)
A built distribution is how you make life as easy as possible for installers of
your module distribution: for users of RPM-based Linux systems, it's a binary
RPM; for Windows users, it's an executable installer; for Debian-based Linux
users, it's a Debian package; and so forth. Obviously, no one person will be
able to create built distributions for every platform under the sun, so the
Distutils are designed to enable module developers to concentrate on their
specialty---writing code and creating source distributions---while an
intermediary species called *packagers* springs up to turn source distributions
into built distributions for as many platforms as there are packagers.
Of course, the module developer could be his own packager; or the packager could
be a volunteer "out there" somewhere who has access to a platform which the
original developer does not; or it could be software periodically grabbing new
source distributions and turning them into built distributions for as many
platforms as the software has access to. Regardless of who they are, a packager
uses the setup script and the :command:`bdist` command family to generate built
distributions.
As a simple example, if I run the following command in the Distutils source
tree::
python setup.py bdist
then the Distutils builds my module distribution (the Distutils itself in this
case), does a "fake" installation (also in the :file:`build` directory), and
creates the default type of built distribution for my platform. The default
format for built distributions is a "dumb" tar file on Unix, and a simple
executable installer on Windows. (That tar file is considered "dumb" because it
has to be unpacked in a specific location to work.)
Thus, the above command on a Unix system creates
:file:`Distutils-1.0.{plat}.tar.gz`; unpacking this tarball from the right place
installs the Distutils just as though you had downloaded the source distribution
and run ``python setup.py install``. (The "right place" is either the root of
the filesystem or Python's :file:`{prefix}` directory, depending on the options
given to the :command:`bdist_dumb` command; the default is to make dumb
distributions relative to :file:`{prefix}`.)
Obviously, for pure Python distributions, this isn't any simpler than just
running ``python setup.py install``\ ---but for non-pure distributions, which
include extensions that would need to be compiled, it can mean the difference
between someone being able to use your extensions or not. And creating "smart"
built distributions, such as an RPM package or an executable installer for
Windows, is far more convenient for users even if your distribution doesn't
include any extensions.
The :command:`bdist` command has a :option:`--formats` option, similar to the
:command:`sdist` command, which you can use to select the types of built
distribution to generate: for example, ::
python setup.py bdist --format=zip
would, when run on a Unix system, create :file:`Distutils-1.0.{plat}.zip`\
---again, this archive would be unpacked from the root directory to install the
Distutils.
The available formats for built distributions are:
+-------------+------------------------------+---------+
| Format | Description | Notes |
+=============+==============================+=========+
| ``gztar`` | gzipped tar file | (1),(3) |
| | (:file:`.tar.gz`) | |
+-------------+------------------------------+---------+
| ``ztar`` | compressed tar file | \(3) |
| | (:file:`.tar.Z`) | |
+-------------+------------------------------+---------+
| ``tar`` | tar file (:file:`.tar`) | \(3) |
+-------------+------------------------------+---------+
| ``zip`` | zip file (:file:`.zip`) | \(4) |
+-------------+------------------------------+---------+
| ``rpm`` | RPM | \(5) |
+-------------+------------------------------+---------+
| ``pkgtool`` | Solaris :program:`pkgtool` | |
+-------------+------------------------------+---------+
| ``sdux`` | HP-UX :program:`swinstall` | |
+-------------+------------------------------+---------+
| ``wininst`` | self-extracting ZIP file for | (2),(4) |
| | Windows | |
+-------------+------------------------------+---------+
Notes:
(1)
default on Unix
(2)
default on Windows
**\*\*** to-do! **\*\***
(3)
requires external utilities: :program:`tar` and possibly one of :program:`gzip`,
:program:`bzip2`, or :program:`compress`
(4)
requires either external :program:`zip` utility or :mod:`zipfile` module (part
of the standard Python library since Python 1.6)
(5)
requires external :program:`rpm` utility, version 3.0.4 or better (use ``rpm
--version`` to find out which version you have)
You don't have to use the :command:`bdist` command with the :option:`--formats`
option; you can also use the command that directly implements the format you're
interested in. Some of these :command:`bdist` "sub-commands" actually generate
several similar formats; for instance, the :command:`bdist_dumb` command
generates all the "dumb" archive formats (``tar``, ``ztar``, ``gztar``, and
``zip``), and :command:`bdist_rpm` generates both binary and source RPMs. The
:command:`bdist` sub-commands, and the formats generated by each, are:
+--------------------------+-----------------------+
| Command | Formats |
+==========================+=======================+
| :command:`bdist_dumb` | tar, ztar, gztar, zip |
+--------------------------+-----------------------+
| :command:`bdist_rpm` | rpm, srpm |
+--------------------------+-----------------------+
| :command:`bdist_wininst` | wininst |
+--------------------------+-----------------------+
The following sections give details on the individual :command:`bdist_\*`
commands.
.. _creating-dumb:
Creating dumb built distributions
=================================
**\*\*** Need to document absolute vs. prefix-relative packages here, but first
I have to implement it! **\*\***
.. _creating-rpms:
Creating RPM packages
=====================
The RPM format is used by many popular Linux distributions, including Red Hat,
SuSE, and Mandrake. If one of these (or any of the other RPM-based Linux
distributions) is your usual environment, creating RPM packages for other users
of that same distribution is trivial. Depending on the complexity of your module
distribution and differences between Linux distributions, you may also be able
to create RPMs that work on different RPM-based distributions.
The usual way to create an RPM of your module distribution is to run the
:command:`bdist_rpm` command::
python setup.py bdist_rpm
or the :command:`bdist` command with the :option:`--formats` option::
python setup.py bdist --formats=rpm
The former allows you to specify RPM-specific options; the latter allows you to
easily specify multiple formats in one run. If you need to do both, you can
explicitly specify multiple :command:`bdist_\*` commands and their options::
python setup.py bdist_rpm --packager="John Doe <jdoe@example.org>" \
bdist_wininst --target-version="2.0"
Creating RPM packages is driven by a :file:`.spec` file, much as using the
Distutils is driven by the setup script. To make your life easier, the
:command:`bdist_rpm` command normally creates a :file:`.spec` file based on the
information you supply in the setup script, on the command line, and in any
Distutils configuration files. Various options and sections in the
:file:`.spec` file are derived from options in the setup script as follows:
+------------------------------------------+----------------------------------------------+
| RPM :file:`.spec` file option or section | Distutils setup script option |
+==========================================+==============================================+
| Name | :option:`name` |
+------------------------------------------+----------------------------------------------+
| Summary (in preamble) | :option:`description` |
+------------------------------------------+----------------------------------------------+
| Version | :option:`version` |
+------------------------------------------+----------------------------------------------+
| Vendor | :option:`author` and :option:`author_email`, |
| | or :option:`maintainer` and |
| | :option:`maintainer_email` |
+------------------------------------------+----------------------------------------------+
| Copyright | :option:`license` |
+------------------------------------------+----------------------------------------------+
| Url | :option:`url` |
+------------------------------------------+----------------------------------------------+
| %description (section) | :option:`long_description` |
+------------------------------------------+----------------------------------------------+
Additionally, there are many options in :file:`.spec` files that don't have
corresponding options in the setup script. Most of these are handled through
options to the :command:`bdist_rpm` command as follows:
+-------------------------------+-----------------------------+-------------------------+
| RPM :file:`.spec` file option | :command:`bdist_rpm` option | default value |
| or section | | |
+===============================+=============================+=========================+
| Release | :option:`release` | "1" |
+-------------------------------+-----------------------------+-------------------------+
| Group | :option:`group` | "Development/Libraries" |
+-------------------------------+-----------------------------+-------------------------+
| Vendor | :option:`vendor` | (see above) |
+-------------------------------+-----------------------------+-------------------------+
| Packager | :option:`packager` | (none) |
+-------------------------------+-----------------------------+-------------------------+
| Provides | :option:`provides` | (none) |
+-------------------------------+-----------------------------+-------------------------+
| Requires | :option:`requires` | (none) |
+-------------------------------+-----------------------------+-------------------------+
| Conflicts | :option:`conflicts` | (none) |
+-------------------------------+-----------------------------+-------------------------+
| Obsoletes | :option:`obsoletes` | (none) |
+-------------------------------+-----------------------------+-------------------------+
| Distribution | :option:`distribution_name` | (none) |
+-------------------------------+-----------------------------+-------------------------+
| BuildRequires | :option:`build_requires` | (none) |
+-------------------------------+-----------------------------+-------------------------+
| Icon | :option:`icon` | (none) |
+-------------------------------+-----------------------------+-------------------------+
Obviously, supplying even a few of these options on the command-line would be
tedious and error-prone, so it's usually best to put them in the setup
configuration file, :file:`setup.cfg`\ ---see section :ref:`setup-config`. If
you distribute or package many Python module distributions, you might want to
put options that apply to all of them in your personal Distutils configuration
file (:file:`~/.pydistutils.cfg`).
There are three steps to building a binary RPM package, all of which are
handled automatically by the Distutils:
#. create a :file:`.spec` file, which describes the package (analogous to the
Distutils setup script; in fact, much of the information in the setup script
winds up in the :file:`.spec` file)
#. create the source RPM
#. create the "binary" RPM (which may or may not contain binary code, depending
on whether your module distribution contains Python extensions)
Normally, RPM bundles the last two steps together; when you use the Distutils,
all three steps are typically bundled together.
If you wish, you can separate these three steps. You can use the
:option:`--spec-only` option to make :command:`bdist_rpm` just create the
:file:`.spec` file and exit; in this case, the :file:`.spec` file will be
written to the "distribution directory"---normally :file:`dist/`, but
customizable with the :option:`--dist-dir` option. (Normally, the :file:`.spec`
file winds up deep in the "build tree," in a temporary directory created by
:command:`bdist_rpm`.)
.. % \XXX{this isn't implemented yet---is it needed?!}
.. % You can also specify a custom \file{.spec} file with the
.. % \longprogramopt{spec-file} option; used in conjunction with
.. % \longprogramopt{spec-only}, this gives you an opportunity to customize
.. % the \file{.spec} file manually:
.. %
.. % \ begin{verbatim}
.. % > python setup.py bdist_rpm --spec-only
.. % # ...edit dist/FooBar-1.0.spec
.. % > python setup.py bdist_rpm --spec-file=dist/FooBar-1.0.spec
.. % \ end{verbatim}
.. %
.. % (Although a better way to do this is probably to override the standard
.. % \command{bdist\_rpm} command with one that writes whatever else you want
.. % to the \file{.spec} file.)
.. _creating-wininst:
Creating Windows Installers
===========================
Executable installers are the natural format for binary distributions on
Windows. They display a nice graphical user interface, display some information
about the module distribution to be installed taken from the metadata in the
setup script, let the user select a few options, and start or cancel the
installation.
Since the metadata is taken from the setup script, creating Windows installers
is usually as easy as running::
python setup.py bdist_wininst
or the :command:`bdist` command with the :option:`--formats` option::
python setup.py bdist --formats=wininst
If you have a pure module distribution (only containing pure Python modules and
packages), the resulting installer will be version independent and have a name
like :file:`foo-1.0.win32.exe`. These installers can even be created on Unix or
Mac OS platforms.
If you have a non-pure distribution, the extensions can only be created on a
Windows platform, and will be Python version dependent. The installer filename
will reflect this and now has the form :file:`foo-1.0.win32-py2.0.exe`. You
have to create a separate installer for every Python version you want to
support.
The installer will try to compile pure modules into bytecode after installation
on the target system in normal and optimizing mode. If you don't want this to
happen for some reason, you can run the :command:`bdist_wininst` command with
the :option:`--no-target-compile` and/or the :option:`--no-target-optimize`
option.
By default the installer will display the cool "Python Powered" logo when it is
run, but you can also supply your own bitmap which must be a Windows
:file:`.bmp` file with the :option:`--bitmap` option.
The installer will also display a large title on the desktop background window
when it is run, which is constructed from the name of your distribution and the
version number. This can be changed to another text by using the
:option:`--title` option.
The installer file will be written to the "distribution directory" --- normally
:file:`dist/`, but customizable with the :option:`--dist-dir` option.
.. _postinstallation-script:
The Postinstallation script
---------------------------
Starting with Python 2.3, a postinstallation script can be specified with the
:option:`--install-script` option. The basename of the script must be
specified, and the script filename must also be listed in the ``scripts``
argument to the setup function.
This script will be run at installation time on the target system after all the
files have been copied, with ``argv[1]`` set to :option:`-install`, and again at
uninstallation time before the files are removed with ``argv[1]`` set to
:option:`-remove`.
The installation script runs embedded in the Windows installer; all output
(``sys.stdout``, ``sys.stderr``) is redirected into a buffer and will be
displayed in the GUI after the script has finished.
Some functions especially useful in this context are available as additional
built-in functions in the installation script.
.. function:: directory_created(path)
file_created(path)
These functions should be called when a directory or file is created by the
postinstall script at installation time. They will register *path* with the
uninstaller, so that it will be removed when the distribution is uninstalled.
To be safe, directories are only removed if they are empty.
.. function:: get_special_folder_path(csidl_string)
This function can be used to retrieve special folder locations on Windows like
the Start Menu or the Desktop. It returns the full path to the folder.
*csidl_string* must be one of the following strings::
"CSIDL_APPDATA"
"CSIDL_COMMON_STARTMENU"
"CSIDL_STARTMENU"
"CSIDL_COMMON_DESKTOPDIRECTORY"
"CSIDL_DESKTOPDIRECTORY"
"CSIDL_COMMON_STARTUP"
"CSIDL_STARTUP"
"CSIDL_COMMON_PROGRAMS"
"CSIDL_PROGRAMS"
"CSIDL_FONTS"
If the folder cannot be retrieved, :exc:`OSError` is raised.
Which folders are available depends on the exact Windows version, and probably
also the configuration. For details refer to Microsoft's documentation of the
:cfunc:`SHGetSpecialFolderPath` function.
.. function:: create_shortcut(target, description, filename[, arguments[, workdir[, iconpath[, iconindex]]]])
This function creates a shortcut. *target* is the path to the program to be
started by the shortcut. *description* is the description of the shortcut.
*filename* is the title of the shortcut that the user will see. *arguments*
specifies the command line arguments, if any. *workdir* is the working directory
for the program. *iconpath* is the file containing the icon for the shortcut,
and *iconindex* is the index of the icon in the file *iconpath*. Again, for
details consult the Microsoft documentation for the :class:`IShellLink`
interface.

View File

@ -1,104 +0,0 @@
.. _reference:
*****************
Command Reference
*****************
.. % \section{Building modules: the \protect\command{build} command family}
.. % \label{build-cmds}
.. % \subsubsection{\protect\command{build}}
.. % \label{build-cmd}
.. % \subsubsection{\protect\command{build\_py}}
.. % \label{build-py-cmd}
.. % \subsubsection{\protect\command{build\_ext}}
.. % \label{build-ext-cmd}
.. % \subsubsection{\protect\command{build\_clib}}
.. % \label{build-clib-cmd}
.. _install-cmd:
Installing modules: the :command:`install` command family
=========================================================
The install command ensures that the build commands have been run and then runs
the subcommands :command:`install_lib`, :command:`install_data` and
:command:`install_scripts`.
.. % \subsubsection{\protect\command{install\_lib}}
.. % \label{install-lib-cmd}
.. _install-data-cmd:
:command:`install_data`
-----------------------
This command installs all data files provided with the distribution.
.. _install-scripts-cmd:
:command:`install_scripts`
--------------------------
This command installs all (Python) scripts in the distribution.
.. % \subsection{Cleaning up: the \protect\command{clean} command}
.. % \label{clean-cmd}
.. _sdist-cmd:
Creating a source distribution: the :command:`sdist` command
============================================================
**\*\*** fragment moved down from above: needs context! **\*\***
The manifest template commands are:
+-------------------------------------------+-----------------------------------------------+
| Command | Description |
+===========================================+===============================================+
| :command:`include pat1 pat2 ...` | include all files matching any of the listed |
| | patterns |
+-------------------------------------------+-----------------------------------------------+
| :command:`exclude pat1 pat2 ...` | exclude all files matching any of the listed |
| | patterns |
+-------------------------------------------+-----------------------------------------------+
| :command:`recursive-include dir pat1 pat2 | include all files under *dir* matching any of |
| ...` | the listed patterns |
+-------------------------------------------+-----------------------------------------------+
| :command:`recursive-exclude dir pat1 pat2 | exclude all files under *dir* matching any of |
| ...` | the listed patterns |
+-------------------------------------------+-----------------------------------------------+
| :command:`global-include pat1 pat2 ...` | include all files anywhere in the source tree |
| | matching any of the listed patterns |
+-------------------------------------------+-----------------------------------------------+
| :command:`global-exclude pat1 pat2 ...` | exclude all files anywhere in the source tree |
| | matching any of the listed patterns |
+-------------------------------------------+-----------------------------------------------+
| :command:`prune dir` | exclude all files under *dir* |
+-------------------------------------------+-----------------------------------------------+
| :command:`graft dir` | include all files under *dir* |
+-------------------------------------------+-----------------------------------------------+
The patterns here are Unix-style "glob" patterns: ``*`` matches any sequence of
regular filename characters, ``?`` matches any single regular filename
character, and ``[range]`` matches any of the characters in *range* (e.g.,
``a-z``, ``a-zA-Z``, ``a-f0-9_.``). The definition of "regular filename
character" is platform-specific: on Unix it is anything except slash; on Windows
anything except backslash or colon; on Mac OS 9 anything except colon.
**\*\*** Windows support not there yet **\*\***
.. % \section{Creating a built distribution: the
.. % \protect\command{bdist} command family}
.. % \label{bdist-cmds}
.. % \subsection{\protect\command{bdist}}
.. % \subsection{\protect\command{bdist\_dumb}}
.. % \subsection{\protect\command{bdist\_rpm}}
.. % \subsection{\protect\command{bdist\_wininst}}

View File

@ -1,130 +0,0 @@
.. _setup-config:
************************************
Writing the Setup Configuration File
************************************
Often, it's not possible to write down everything needed to build a distribution
*a priori*: you may need to get some information from the user, or from the
user's system, in order to proceed. As long as that information is fairly
simple---a list of directories to search for C header files or libraries, for
example---then providing a configuration file, :file:`setup.cfg`, for users to
edit is a cheap and easy way to solicit it. Configuration files also let you
provide default values for any command option, which the installer can then
override either on the command-line or by editing the config file.
The setup configuration file is a useful middle-ground between the setup script
---which, ideally, would be opaque to installers [#]_---and the command-line to
the setup script, which is outside of your control and entirely up to the
installer. In fact, :file:`setup.cfg` (and any other Distutils configuration
files present on the target system) are processed after the contents of the
setup script, but before the command-line. This has several useful
consequences:
.. % (If you have more advanced needs, such as determining which extensions
.. % to build based on what capabilities are present on the target system,
.. % then you need the Distutils ``auto-configuration'' facility. This
.. % started to appear in Distutils 0.9 but, as of this writing, isn't mature
.. % or stable enough yet for real-world use.)
* installers can override some of what you put in :file:`setup.py` by editing
:file:`setup.cfg`
* you can provide non-standard defaults for options that are not easily set in
:file:`setup.py`
* installers can override anything in :file:`setup.cfg` using the command-line
options to :file:`setup.py`
The basic syntax of the configuration file is simple::
[command]
option=value
...
where *command* is one of the Distutils commands (e.g. :command:`build_py`,
:command:`install`), and *option* is one of the options that command supports.
Any number of options can be supplied for each command, and any number of
command sections can be included in the file. Blank lines are ignored, as are
comments, which run from a ``'#'`` character until the end of the line. Long
option values can be split across multiple lines simply by indenting the
continuation lines.
You can find out the list of options supported by a particular command with the
universal :option:`--help` option, e.g. ::
> python setup.py --help build_ext
[...]
Options for 'build_ext' command:
--build-lib (-b) directory for compiled extension modules
--build-temp (-t) directory for temporary files (build by-products)
--inplace (-i) ignore build-lib and put compiled extensions into the
source directory alongside your pure Python modules
--include-dirs (-I) list of directories to search for header files
--define (-D) C preprocessor macros to define
--undef (-U) C preprocessor macros to undefine
--swig-opts list of SWIG command line options
[...]
Note that an option spelled :option:`--foo-bar` on the command-line is spelled
:option:`foo_bar` in configuration files.
For example, say you want your extensions to be built "in-place"---that is, you
have an extension :mod:`pkg.ext`, and you want the compiled extension file
(:file:`ext.so` on Unix, say) to be put in the same source directory as your
pure Python modules :mod:`pkg.mod1` and :mod:`pkg.mod2`. You can always use the
:option:`--inplace` option on the command-line to ensure this::
python setup.py build_ext --inplace
But this requires that you always specify the :command:`build_ext` command
explicitly, and remember to provide :option:`--inplace`. An easier way is to
"set and forget" this option, by encoding it in :file:`setup.cfg`, the
configuration file for this distribution::
[build_ext]
inplace=1
This will affect all builds of this module distribution, whether or not you
explicitly specify :command:`build_ext`. If you include :file:`setup.cfg` in
your source distribution, it will also affect end-user builds---which is
probably a bad idea for this option, since always building extensions in-place
would break installation of the module distribution. In certain peculiar cases,
though, modules are built right in their installation directory, so this is
conceivably a useful ability. (Distributing extensions that expect to be built
in their installation directory is almost always a bad idea, though.)
Another example: certain commands take a lot of options that don't change from
run to run; for example, :command:`bdist_rpm` needs to know everything required
to generate a "spec" file for creating an RPM distribution. Some of this
information comes from the setup script, and some is automatically generated by
the Distutils (such as the list of files installed). But some of it has to be
supplied as options to :command:`bdist_rpm`, which would be very tedious to do
on the command-line for every run. Hence, here is a snippet from the Distutils'
own :file:`setup.cfg`::
[bdist_rpm]
release = 1
packager = Greg Ward <gward@python.net>
doc_files = CHANGES.txt
README.txt
USAGE.txt
doc/
examples/
Note that the :option:`doc_files` option is simply a whitespace-separated string
split across multiple lines for readability.
.. seealso::
:ref:`inst-config-syntax` in "Installing Python Modules"
More information on the configuration files is available in the manual for
system administrators.
.. rubric:: Footnotes
.. [#] This ideal probably won't be achieved until auto-configuration is fully
supported by the Distutils.

View File

@ -1,241 +0,0 @@
.. _examples:
********
Examples
********
This chapter provides a number of basic examples to help get started with
distutils. Additional information about using distutils can be found in the
Distutils Cookbook.
.. seealso::
`Distutils Cookbook <http://www.python.org/cgi-bin/moinmoin/DistutilsCookbook>`_
Collection of recipes showing how to achieve more control over distutils.
.. _pure-mod:
Pure Python distribution (by module)
====================================
If you're just distributing a couple of modules, especially if they don't live
in a particular package, you can specify them individually using the
:option:`py_modules` option in the setup script.
In the simplest case, you'll have two files to worry about: a setup script and
the single module you're distributing, :file:`foo.py` in this example::
<root>/
setup.py
foo.py
(In all diagrams in this section, *<root>* will refer to the distribution root
directory.) A minimal setup script to describe this situation would be::
from distutils.core import setup
setup(name='foo',
version='1.0',
py_modules=['foo'],
)
Note that the name of the distribution is specified independently with the
:option:`name` option, and there's no rule that says it has to be the same as
the name of the sole module in the distribution (although that's probably a good
convention to follow). However, the distribution name is used to generate
filenames, so you should stick to letters, digits, underscores, and hyphens.
Since :option:`py_modules` is a list, you can of course specify multiple
modules, e.g. if you're distributing modules :mod:`foo` and :mod:`bar`, your
setup might look like this::
<root>/
setup.py
foo.py
bar.py
and the setup script might be ::
from distutils.core import setup
setup(name='foobar',
version='1.0',
py_modules=['foo', 'bar'],
)
You can put module source files into another directory, but if you have enough
modules to do that, it's probably easier to specify modules by package rather
than listing them individually.
.. _pure-pkg:
Pure Python distribution (by package)
=====================================
If you have more than a couple of modules to distribute, especially if they are
in multiple packages, it's probably easier to specify whole packages rather than
individual modules. This works even if your modules are not in a package; you
can just tell the Distutils to process modules from the root package, and that
works the same as any other package (except that you don't have to have an
:file:`__init__.py` file).
The setup script from the last example could also be written as ::
from distutils.core import setup
setup(name='foobar',
version='1.0',
packages=[''],
)
(The empty string stands for the root package.)
If those two files are moved into a subdirectory, but remain in the root
package, e.g.::
<root>/
    setup.py
    src/
        foo.py
        bar.py
then you would still specify the root package, but you have to tell the
Distutils where source files in the root package live::
from distutils.core import setup
setup(name='foobar',
version='1.0',
package_dir={'': 'src'},
packages=[''],
)
More typically, though, you will want to distribute multiple modules in the same
package (or in sub-packages). For example, if the :mod:`foo` and :mod:`bar`
modules belong in package :mod:`foobar`, one way to layout your source tree is
::
<root>/
setup.py
foobar/
__init__.py
foo.py
bar.py
This is in fact the default layout expected by the Distutils, and the one that
requires the least work to describe in your setup script::
from distutils.core import setup
setup(name='foobar',
version='1.0',
packages=['foobar'],
)
If you want to put modules in directories not named for their package, then you
need to use the :option:`package_dir` option again. For example, if the
:file:`src` directory holds modules in the :mod:`foobar` package::
<root>/
setup.py
src/
__init__.py
foo.py
bar.py
an appropriate setup script would be ::
from distutils.core import setup
setup(name='foobar',
version='1.0',
package_dir={'foobar': 'src'},
packages=['foobar'],
)
Or, you might put modules from your main package right in the distribution
root::
<root>/
setup.py
__init__.py
foo.py
bar.py
in which case your setup script would be ::
from distutils.core import setup
setup(name='foobar',
version='1.0',
package_dir={'foobar': ''},
packages=['foobar'],
)
(The empty string also stands for the current directory.)
If you have sub-packages, they must be explicitly listed in :option:`packages`,
but any entries in :option:`package_dir` automatically extend to sub-packages.
(In other words, the Distutils does *not* scan your source tree, trying to
figure out which directories correspond to Python packages by looking for
:file:`__init__.py` files.) Thus, if the default layout grows a sub-package::
<root>/
    setup.py
    foobar/
        __init__.py
        foo.py
        bar.py
        subfoo/
            __init__.py
            blah.py
then the corresponding setup script would be ::
from distutils.core import setup
setup(name='foobar',
version='1.0',
packages=['foobar', 'foobar.subfoo'],
)
(Again, the empty string in :option:`package_dir` stands for the current
directory.)
.. _single-ext:
Single extension module
=======================
Extension modules are specified using the :option:`ext_modules` option.
:option:`package_dir` has no effect on where extension source files are found;
it only affects the source for pure Python modules. The simplest case, a
single extension module in a single C source file, is::
<root>/
setup.py
foo.c
If the :mod:`foo` extension belongs in the root package, the setup script for
this could be ::
from distutils.core import setup
from distutils.extension import Extension
setup(name='foobar',
version='1.0',
ext_modules=[Extension('foo', ['foo.c'])],
)
If the extension actually belongs in a package, say :mod:`foopkg`, then with
exactly the same source tree layout, this extension can be put in the
:mod:`foopkg` package simply by changing the name of the extension::
from distutils.core import setup
from distutils.extension import Extension
setup(name='foobar',
version='1.0',
ext_modules=[Extension('foopkg.foo', ['foo.c'])],
)
.. % \section{Multiple extension modules}
.. % \label{multiple-ext}
.. % \section{Putting it all together}

View File

@ -1,96 +0,0 @@
.. _extending:
*******************
Extending Distutils
*******************
Distutils can be extended in various ways. Most extensions take the form of new
commands or replacements for existing commands. New commands may be written to
support new types of platform-specific packaging, for example, while
replacements for existing commands may be made to modify details of how the
command operates on a package.
Most extensions of the distutils are made within :file:`setup.py` scripts that
want to modify existing commands; many simply add a few file extensions that
should be copied into packages in addition to :file:`.py` files as a
convenience.
Most distutils command implementations are subclasses of the :class:`Command`
class from :mod:`distutils.cmd`. New commands may directly inherit from
:class:`Command`, while replacements often derive from :class:`Command`
indirectly, directly subclassing the command they are replacing. Commands are
required to derive from :class:`Command`.
.. % \section{Extending existing commands}
.. % \label{extend-existing}
.. % \section{Writing new commands}
.. % \label{new-commands}
.. % \XXX{Would an uninstall command be a good example here?}
Integrating new commands
========================
There are different ways to integrate new command implementations into
distutils. The most difficult is to lobby for the inclusion of the new features
in distutils itself, and wait for (and require) a version of Python that
provides that support. This is really hard for many reasons.
The most common, and possibly the most reasonable for most needs, is to include
the new implementations with your :file:`setup.py` script, and cause the
:func:`distutils.core.setup` function to use them::
from distutils.command.build_py import build_py as _build_py
from distutils.core import setup
class build_py(_build_py):
"""Specialized Python source builder."""
# implement whatever needs to be different...
setup(cmdclass={'build_py': build_py},
...)
This approach is most valuable if the new implementations must be used in
order to use a particular package, as everyone interested in the package will
need to have the
new command implementation.
Beginning with Python 2.4, a third option is available, intended to allow new
commands to be added which can support existing :file:`setup.py` scripts without
requiring modifications to the Python installation. This is expected to allow
third-party extensions to provide support for additional packaging systems, but
the commands can be used for anything distutils commands can be used for. A new
configuration option, :option:`command_packages` (command-line option
:option:`--command-packages`), can be used to specify additional packages to be
searched for modules implementing commands. Like all distutils options, this
can be specified on the command line or in a configuration file. This option
can only be set in the ``[global]`` section of a configuration file, or before
any commands on the command line. If set in a configuration file, it can be
overridden from the command line; setting it to an empty string on the command
line causes the default to be used. This should never be set in a configuration
file provided with a package.
This new option can be used to add any number of packages to the list of
packages searched for command implementations; multiple package names should be
separated by commas. When not specified, the search is only performed in the
:mod:`distutils.command` package. When :file:`setup.py` is run with the option
:option:`--command-packages` :option:`distcmds,buildcmds`, however, the packages
:mod:`distutils.command`, :mod:`distcmds`, and :mod:`buildcmds` will be searched
in that order. New commands are expected to be implemented in modules of the
same name as the command by classes sharing the same name. Given the example
command line option above, the command :command:`bdist_openpkg` could be
implemented by the class :class:`distcmds.bdist_openpkg.bdist_openpkg` or
:class:`buildcmds.bdist_openpkg.bdist_openpkg`.
Adding new distribution types
=============================
Commands that create distributions (files in the :file:`dist/` directory) need
to add ``(command, filename)`` pairs to ``self.distribution.dist_files`` so that
:command:`upload` can upload them to PyPI. The *filename* in the pair contains no
path information, only the name of the file itself. In dry-run mode, pairs
should still be added to represent what would have been created.

View File

@ -1,30 +0,0 @@
.. _distutils-index:
###############################
Distributing Python Modules
###############################
:Authors: Greg Ward, Anthony Baxter
:Email: distutils-sig@python.org
:Release: |version|
:Date: |today|
This document describes the Python Distribution Utilities ("Distutils") from
the module developer's point of view, describing how to use the Distutils to
make Python modules and extensions easily available to a wider audience with
very little overhead for build/release/install mechanics.
.. toctree::
:maxdepth: 2
introduction.rst
setupscript.rst
configfile.rst
sourcedist.rst
builtdist.rst
packageindex.rst
uploading.rst
examples.rst
extending.rst
commandref.rst
apiref.rst

View File

@ -1,208 +0,0 @@
.. _distutils-intro:
****************************
An Introduction to Distutils
****************************
This document covers using the Distutils to distribute your Python modules,
concentrating on the role of developer/distributor: if you're looking for
information on installing Python modules, you should refer to the
:ref:`install-index` chapter.
.. _distutils-concepts:
Concepts & Terminology
======================
Using the Distutils is quite simple, both for module developers and for
users/administrators installing third-party modules. As a developer, your
responsibilities (apart from writing solid, well-documented and well-tested
code, of course!) are:
* write a setup script (:file:`setup.py` by convention)
* (optional) write a setup configuration file
* create a source distribution
* (optional) create one or more built (binary) distributions
Each of these tasks is covered in this document.
Not all module developers have access to a multitude of platforms, so it's not
always feasible to expect them to create a multitude of built distributions. It
is hoped that a class of intermediaries, called *packagers*, will arise to
address this need. Packagers will take source distributions released by module
developers, build them on one or more platforms, and release the resulting built
distributions. Thus, users on the most popular platforms will be able to
install most popular Python module distributions in the most natural way for
their platform, without having to run a single setup script or compile a line of
code.
.. _distutils-simple-example:
A Simple Example
================
The setup script is usually quite simple, although since it's written in Python,
there are no arbitrary limits to what you can do with it, though you should be
careful about putting arbitrarily expensive operations in your setup script.
Unlike, say, Autoconf-style configure scripts, the setup script may be run
multiple times in the course of building and installing your module
distribution.
If all you want to do is distribute a module called :mod:`foo`, contained in a
file :file:`foo.py`, then your setup script can be as simple as this::
from distutils.core import setup
setup(name='foo',
version='1.0',
py_modules=['foo'],
)
Some observations:
* most information that you supply to the Distutils is supplied as keyword
arguments to the :func:`setup` function
* those keyword arguments fall into two categories: package metadata (name,
version number) and information about what's in the package (a list of pure
Python modules, in this case)
* modules are specified by module name, not filename (the same will hold true
for packages and extensions)
* it's recommended that you supply a little more metadata, in particular your
name, email address and a URL for the project (see section :ref:`setup-script`
for an example)
To create a source distribution for this module, you would create a setup
script, :file:`setup.py`, containing the above code, and run::
python setup.py sdist
which will create an archive file (e.g., tarball on Unix, ZIP file on Windows)
containing your setup script :file:`setup.py`, and your module :file:`foo.py`.
The archive file will be named :file:`foo-1.0.tar.gz` (or :file:`.zip`), and
will unpack into a directory :file:`foo-1.0`.
If an end-user wishes to install your :mod:`foo` module, all she has to do is
download :file:`foo-1.0.tar.gz` (or :file:`.zip`), unpack it, and---from the
:file:`foo-1.0` directory---run ::
python setup.py install
which will ultimately copy :file:`foo.py` to the appropriate directory for
third-party modules in their Python installation.
This simple example demonstrates some fundamental concepts of the Distutils.
First, both developers and installers have the same basic user interface, i.e.
the setup script. The difference is which Distutils *commands* they use: the
:command:`sdist` command is almost exclusively for module developers, while
:command:`install` is more often for installers (although most developers will
want to install their own code occasionally).
If you want to make things really easy for your users, you can create one or
more built distributions for them. For instance, if you are running on a
Windows machine, and want to make things easy for other Windows users, you can
create an executable installer (the most appropriate type of built distribution
for this platform) with the :command:`bdist_wininst` command. For example::
python setup.py bdist_wininst
will create an executable installer, :file:`foo-1.0.win32.exe`, in the current
directory.
Other useful built distribution formats are RPM, implemented by the
:command:`bdist_rpm` command, Solaris :program:`pkgtool`
(:command:`bdist_pkgtool`), and HP-UX :program:`swinstall`
(:command:`bdist_sdux`). For example, the following command will create an RPM
file called :file:`foo-1.0.noarch.rpm`::
python setup.py bdist_rpm
(The :command:`bdist_rpm` command uses the :command:`rpm` executable, therefore
this has to be run on an RPM-based system such as Red Hat Linux, SuSE Linux, or
Mandrake Linux.)
You can find out what distribution formats are available at any time by running
::
python setup.py bdist --help-formats
.. _python-terms:
General Python terminology
==========================
If you're reading this document, you probably have a good idea of what modules,
extensions, and so forth are. Nevertheless, just to be sure that everyone is
operating from a common starting point, we offer the following glossary of
common Python terms:
module
the basic unit of code reusability in Python: a block of code imported by some
other code. Three types of modules concern us here: pure Python modules,
extension modules, and packages.
pure Python module
a module written in Python and contained in a single :file:`.py` file (and
possibly associated :file:`.pyc` and/or :file:`.pyo` files). Sometimes referred
to as a "pure module."
extension module
a module written in the low-level language of the Python implementation: C/C++
for Python, Java for Jython. Typically contained in a single dynamically
loadable pre-compiled file, e.g. a shared object (:file:`.so`) file for Python
extensions on Unix, a DLL (given the :file:`.pyd` extension) for Python
extensions on Windows, or a Java class file for Jython extensions. (Note that
currently, the Distutils only handles C/C++ extensions for Python.)
package
a module that contains other modules; typically contained in a directory in the
filesystem and distinguished from other directories by the presence of a file
:file:`__init__.py`.
root package
the root of the hierarchy of packages. (This isn't really a package, since it
doesn't have an :file:`__init__.py` file. But we have to call it something.)
The vast majority of the standard library is in the root package, as are many
small, standalone third-party modules that don't belong to a larger module
collection. Unlike regular packages, modules in the root package can be found in
many directories: in fact, every directory listed in ``sys.path`` contributes
modules to the root package.
.. _distutils-term:
Distutils-specific terminology
==============================
The following terms apply more specifically to the domain of distributing Python
modules using the Distutils:
module distribution
a collection of Python modules distributed together as a single downloadable
resource and meant to be installed *en masse*. Examples of some well-known
module distributions are Numeric Python, PyXML, PIL (the Python Imaging
Library), or mxBase. (This would be called a *package*, except that term is
already taken in the Python context: a single module distribution may contain
zero, one, or many Python packages.)
pure module distribution
a module distribution that contains only pure Python modules and packages.
Sometimes referred to as a "pure distribution."
non-pure module distribution
a module distribution that contains at least one extension module. Sometimes
referred to as a "non-pure distribution."
distribution root
the top-level directory of your source tree (or source distribution); the
directory where :file:`setup.py` exists. Generally :file:`setup.py` will be
run from this directory.

View File

@ -1,65 +0,0 @@
.. _package-index:
**********************************
Registering with the Package Index
**********************************
The Python Package Index (PyPI) holds meta-data describing distributions
packaged with distutils. The distutils command :command:`register` is used to
submit your distribution's meta-data to the index. It is invoked as follows::
python setup.py register
Distutils will respond with the following prompt::
running register
We need to know who you are, so please choose either:
1. use your existing login,
2. register as a new user,
3. have the server generate a new password for you (and email it to you), or
4. quit
Your selection [default 1]:
Note: if your username and password are saved locally, you will not see this
menu.
If you have not registered with PyPI, then you will need to do so now. You
should choose option 2, and enter your details as required. Soon after
submitting your details, you will receive an email which will be used to confirm
your registration.
Once you are registered, you may choose option 1 from the menu. You will be
prompted for your PyPI username and password, and :command:`register` will then
submit your meta-data to the index.
You may submit any number of versions of your distribution to the index. If you
alter the meta-data for a particular version, you may submit it again and the
index will be updated.
PyPI holds a record for each (name, version) combination submitted. The first
user to submit information for a given name is designated the Owner of that
name. They may submit changes through the :command:`register` command or through
the web interface. They may also designate other users as Owners or Maintainers.
Maintainers may edit the package information, but not designate other Owners or
Maintainers.
By default PyPI will list all versions of a given package. To hide certain
versions, the Hidden property should be set to yes. This must be edited through
the web interface.
.. _pypirc:
The .pypirc file
================
The :file:`.pypirc` file is formatted as follows::
[server-login]
repository: <repository-url>
username: <username>
password: <password>
*repository* can be omitted and defaults to ``http://www.python.org/pypi``.
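As an illustration, a filled-in :file:`.pypirc` might look like this (the
username and password shown are placeholders, not real credentials)::

   [server-login]
   repository: http://www.python.org/pypi
   username: jane.doe
   password: s3cret-placeholder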

View File

@ -1,669 +0,0 @@
.. _setup-script:
************************
Writing the Setup Script
************************
The setup script is the centre of all activity in building, distributing, and
installing modules using the Distutils. The main purpose of the setup script is
to describe your module distribution to the Distutils, so that the various
commands that operate on your modules do the right thing. As we saw in section
:ref:`distutils-simple-example` above, the setup script consists mainly of a call to
:func:`setup`, and most information supplied to the Distutils by the module
developer is supplied as keyword arguments to :func:`setup`.
Here's a slightly more involved example, which we'll follow for the next couple
of sections: the Distutils' own setup script. (Keep in mind that although the
Distutils are included with Python 1.6 and later, they also have an independent
existence so that Python 1.5.2 users can use them to install other module
distributions. The Distutils' own setup script, shown here, is used to install
the package into Python 1.5.2.) ::
#!/usr/bin/env python
from distutils.core import setup
setup(name='Distutils',
version='1.0',
description='Python Distribution Utilities',
author='Greg Ward',
author_email='gward@python.net',
url='http://www.python.org/sigs/distutils-sig/',
packages=['distutils', 'distutils.command'],
)
There are only two differences between this and the trivial one-file
distribution presented in section :ref:`distutils-simple-example`: more metadata, and the
specification of pure Python modules by package, rather than by module. This is
important since the Distutils consist of a couple of dozen modules split into
(so far) two packages; an explicit list of every module would be tedious to
generate and difficult to maintain. For more information on the additional
meta-data, see section :ref:`meta-data`.
Note that any pathnames (files or directories) supplied in the setup script
should be written using the Unix convention, i.e. slash-separated. The
Distutils will take care of converting this platform-neutral representation into
whatever is appropriate on your current platform before actually using the
pathname. This makes your setup script portable across operating systems, which
of course is one of the major goals of the Distutils. In this spirit, all
pathnames in this document are slash-separated. (Mac OS 9 programmers should
keep in mind that the *absence* of a leading slash indicates a relative path,
the opposite of the Mac OS convention with colons.)
This, of course, only applies to pathnames given to Distutils functions. If
you, for example, use standard Python functions such as :func:`glob.glob` or
:func:`os.listdir` to specify files, you should be careful to write portable
code instead of hardcoding path separators::
glob.glob(os.path.join('mydir', 'subdir', '*.html'))
os.listdir(os.path.join('mydir', 'subdir'))
.. _listing-packages:
Listing whole packages
======================
The :option:`packages` option tells the Distutils to process (build, distribute,
install, etc.) all pure Python modules found in each package mentioned in the
:option:`packages` list. In order to do this, of course, there has to be a
correspondence between package names and directories in the filesystem. The
default correspondence is the most obvious one, i.e. package :mod:`distutils` is
found in the directory :file:`distutils` relative to the distribution root.
Thus, when you say ``packages = ['foo']`` in your setup script, you are
promising that the Distutils will find a file :file:`foo/__init__.py` (which
might be spelled differently on your system, but you get the idea) relative to
the directory where your setup script lives. If you break this promise, the
Distutils will issue a warning but still process the broken package anyway.
If you use a different convention to lay out your source directory, that's no
problem: you just have to supply the :option:`package_dir` option to tell the
Distutils about your convention. For example, say you keep all Python source
under :file:`lib`, so that modules in the "root package" (i.e., not in any
package at all) are in :file:`lib`, modules in the :mod:`foo` package are in
:file:`lib/foo`, and so forth. Then you would put ::
package_dir = {'': 'lib'}
in your setup script. The keys to this dictionary are package names, and an
empty package name stands for the root package. The values are directory names
relative to your distribution root. In this case, when you say ``packages =
['foo']``, you are promising that the file :file:`lib/foo/__init__.py` exists.
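Putting these two options together, a minimal setup script for this layout
(package :mod:`foo` kept under :file:`lib`) might look like the following
sketch::

   from distutils.core import setup

   setup(name='foo',
         version='1.0',
         package_dir={'': 'lib'},
         packages=['foo'],
         )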
Another possible convention is to put the :mod:`foo` package right in
:file:`lib`, the :mod:`foo.bar` package in :file:`lib/bar`, etc. This would be
written in the setup script as ::
package_dir = {'foo': 'lib'}
A ``package: dir`` entry in the :option:`package_dir` dictionary implicitly
applies to all packages below *package*, so the :mod:`foo.bar` case is
automatically handled here. In this example, having ``packages = ['foo',
'foo.bar']`` tells the Distutils to look for :file:`lib/__init__.py` and
:file:`lib/bar/__init__.py`. (Keep in mind that although :option:`package_dir`
applies recursively, you must explicitly list all packages in
:option:`packages`: the Distutils will *not* recursively scan your source tree
looking for any directory with an :file:`__init__.py` file.)
.. _listing-modules:
Listing individual modules
==========================
For a small module distribution, you might prefer to list all modules rather
than listing packages---especially the case of a single module that goes in the
"root package" (i.e., no package at all). This simplest case was shown in
section :ref:`distutils-simple-example`; here is a slightly more involved example::
py_modules = ['mod1', 'pkg.mod2']
This describes two modules, one of them in the "root" package, the other in the
:mod:`pkg` package. Again, the default package/directory layout implies that
these two modules can be found in :file:`mod1.py` and :file:`pkg/mod2.py`, and
that :file:`pkg/__init__.py` exists as well. And again, you can override the
package/directory correspondence using the :option:`package_dir` option.
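A complete setup script using this option might be sketched as follows (the
distribution name is invented for illustration)::

   from distutils.core import setup

   setup(name='mydist',
         version='1.0',
         py_modules=['mod1', 'pkg.mod2'],
         )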
.. _describing-extensions:
Describing extension modules
============================
Just as writing Python extension modules is a bit more complicated than writing
pure Python modules, describing them to the Distutils is a bit more complicated.
Unlike pure modules, it's not enough just to list modules or packages and expect
the Distutils to go out and find the right files; you have to specify the
extension name, source file(s), and any compile/link requirements (include
directories, libraries to link with, etc.).
.. % XXX read over this section
All of this is done through another keyword argument to :func:`setup`, the
:option:`ext_modules` option. :option:`ext_modules` is just a list of
:class:`Extension` instances, each of which describes a single extension module.
Suppose your distribution includes a single extension, called :mod:`foo` and
implemented by :file:`foo.c`. If no additional instructions to the
compiler/linker are needed, describing this extension is quite simple::
Extension('foo', ['foo.c'])
The :class:`Extension` class can be imported from :mod:`distutils.core` along
with :func:`setup`. Thus, the setup script for a module distribution that
contains only this one extension and nothing else might be::
from distutils.core import setup, Extension
setup(name='foo',
version='1.0',
ext_modules=[Extension('foo', ['foo.c'])],
)
The :class:`Extension` class (actually, the underlying extension-building
machinery implemented by the :command:`build_ext` command) supports a great deal
of flexibility in describing Python extensions, which is explained in the
following sections.
Extension names and packages
----------------------------
The first argument to the :class:`Extension` constructor is always the name of
the extension, including any package names. For example, ::
Extension('foo', ['src/foo1.c', 'src/foo2.c'])
describes an extension that lives in the root package, while ::
Extension('pkg.foo', ['src/foo1.c', 'src/foo2.c'])
describes the same extension in the :mod:`pkg` package. The source files and
resulting object code are identical in both cases; the only difference is where
in the filesystem (and therefore where in Python's namespace hierarchy) the
resulting extension lives.
If you have a number of extensions all in the same package (or all under the
same base package), use the :option:`ext_package` keyword argument to
:func:`setup`. For example, ::
setup(...
ext_package='pkg',
ext_modules=[Extension('foo', ['foo.c']),
Extension('subpkg.bar', ['bar.c'])],
)
will compile :file:`foo.c` to the extension :mod:`pkg.foo`, and :file:`bar.c` to
:mod:`pkg.subpkg.bar`.
Extension source files
----------------------
The second argument to the :class:`Extension` constructor is a list of source
files. Since the Distutils currently only support C, C++, and Objective-C
extensions, these are normally C/C++/Objective-C source files. (Be sure to use
appropriate extensions to distinguish C++\ source files: :file:`.cc` and
:file:`.cpp` seem to be recognized by both Unix and Windows compilers.)
However, you can also include SWIG interface (:file:`.i`) files in the list; the
:command:`build_ext` command knows how to deal with SWIG extensions: it will run
SWIG on the interface file and compile the resulting C/C++ file into your
extension.
**\*\*** SWIG support is rough around the edges and largely untested! **\*\***
This warning notwithstanding, options to SWIG can be currently passed like
this::
setup(...
ext_modules=[Extension('_foo', ['foo.i'],
swig_opts=['-modern', '-I../include'])],
py_modules=['foo'],
)
Or on the commandline like this::
> python setup.py build_ext --swig-opts="-modern -I../include"
On some platforms, you can include non-source files that are processed by the
compiler and included in your extension. Currently, this just means Windows
message text (:file:`.mc`) files and resource definition (:file:`.rc`) files for
Visual C++. These will be compiled to binary resource (:file:`.res`) files and
linked into the executable.
Preprocessor options
--------------------
Three optional arguments to :class:`Extension` will help if you need to specify
include directories to search or preprocessor macros to define/undefine:
``include_dirs``, ``define_macros``, and ``undef_macros``.
For example, if your extension requires header files in the :file:`include`
directory under your distribution root, use the ``include_dirs`` option::
Extension('foo', ['foo.c'], include_dirs=['include'])
You can specify absolute directories there; if you know that your extension will
only be built on Unix systems with X11R6 installed to :file:`/usr`, you can get
away with ::
Extension('foo', ['foo.c'], include_dirs=['/usr/include/X11'])
You should avoid this sort of non-portable usage if you plan to distribute your
code: it's probably better to write C code like ::
#include <X11/Xlib.h>
If you need to include header files from some other Python extension, you can
take advantage of the fact that header files are installed in a consistent way
by the Distutils :command:`install_header` command. For example, the Numerical
Python header files are installed (on a standard Unix installation) to
:file:`/usr/local/include/python1.5/Numerical`. (The exact location will differ
according to your platform and Python installation.) Since the Python include
directory---\ :file:`/usr/local/include/python1.5` in this case---is always
included in the search path when building Python extensions, the best approach
is to write C code like ::
#include <Numerical/arrayobject.h>
If you must put the :file:`Numerical` include directory right into your header
search path, though, you can find that directory using the Distutils
:mod:`distutils.sysconfig` module::
from distutils.sysconfig import get_python_inc
incdir = os.path.join(get_python_inc(plat_specific=1), 'Numerical')
setup(...,
Extension(..., include_dirs=[incdir]),
)
Even though this is quite portable---it will work on any Python installation,
regardless of platform---it's probably easier to just write your C code in the
sensible way.
You can define and undefine pre-processor macros with the ``define_macros`` and
``undef_macros`` options. ``define_macros`` takes a list of ``(name, value)``
tuples, where ``name`` is the name of the macro to define (a string) and
``value`` is its value: either a string or ``None``. (Defining a macro ``FOO``
to ``None`` is the equivalent of a bare ``#define FOO`` in your C source: with
most compilers, this sets ``FOO`` to the string ``1``.) ``undef_macros`` is
just a list of macros to undefine.
For example::
Extension(...,
define_macros=[('NDEBUG', '1'),
('HAVE_STRFTIME', None)],
undef_macros=['HAVE_FOO', 'HAVE_BAR'])
is the equivalent of having this at the top of every C source file::
#define NDEBUG 1
#define HAVE_STRFTIME
#undef HAVE_FOO
#undef HAVE_BAR
Library options
---------------
You can also specify the libraries to link against when building your extension,
and the directories to search for those libraries. The ``libraries`` option is
a list of libraries to link against, ``library_dirs`` is a list of directories
to search for libraries at link-time, and ``runtime_library_dirs`` is a list of
directories to search for shared (dynamically loaded) libraries at run-time.
For example, if you need to link against libraries known to be in the standard
library search path on target systems ::
Extension(...,
libraries=['gdbm', 'readline'])
If you need to link with libraries in a non-standard location, you'll have to
include the location in ``library_dirs``::
Extension(...,
library_dirs=['/usr/X11R6/lib'],
libraries=['X11', 'Xt'])
(Again, this sort of non-portable construct should be avoided if you intend to
distribute your code.)
**\*\*** Should mention clib libraries here or somewhere else! **\*\***
Other options
-------------
There are still some other options which can be used to handle special cases.
The :option:`extra_objects` option is a list of object files to be passed to the
linker. These files must not have extensions, as the default extension for the
compiler is used.
:option:`extra_compile_args` and :option:`extra_link_args` can be used to
specify additional command line options for the respective compiler and linker
command lines.
:option:`export_symbols` is only useful on Windows. It can contain a list of
symbols (functions or variables) to be exported. This option is not needed when
building compiled extensions: Distutils will automatically add ``initmodule``
to the list of exported symbols.
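As a sketch, extra compiler and linker flags might be passed like this (the
flags shown are only illustrative and are compiler-specific)::

   Extension('foo', ['foo.c'],
             extra_compile_args=['-O0'],
             extra_link_args=['-s'])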
Relationships between Distributions and Packages
================================================
A distribution may relate to packages in three specific ways:
#. It can require packages or modules.
#. It can provide packages or modules.
#. It can obsolete packages or modules.
These relationships can be specified using keyword arguments to the
:func:`distutils.core.setup` function.
Dependencies on other Python modules and packages can be specified by supplying
the *requires* keyword argument to :func:`setup`. The value must be a list of
strings. Each string specifies a package that is required, and optionally what
versions are sufficient.
To specify that any version of a module or package is required, the string
should consist entirely of the module or package name. Examples include
``'mymodule'`` and ``'xml.parsers.expat'``.
If specific versions are required, a sequence of qualifiers can be supplied in
parentheses. Each qualifier may consist of a comparison operator and a version
number. The accepted comparison operators are::
< > ==
<= >= !=
These can be combined by using multiple qualifiers separated by commas (and
optional whitespace). In this case, all of the qualifiers must be matched; a
logical AND is used to combine the evaluations.
Let's look at a bunch of examples:
+-------------------------+----------------------------------------------+
| Requires Expression | Explanation |
+=========================+==============================================+
| ``==1.0`` | Only version ``1.0`` is compatible |
+-------------------------+----------------------------------------------+
| ``>1.0, !=1.5.1, <2.0`` | Any version after ``1.0`` and before ``2.0`` |
| | is compatible, except ``1.5.1`` |
+-------------------------+----------------------------------------------+
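For example, a hypothetical distribution might combine the expressions above
like this (the package names are only examples)::

   setup(...,
         requires=['mymodule', 'xml.parsers.expat',
                   'otherpkg (>1.0, !=1.5.1, <2.0)'],
         )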
Now that we can specify dependencies, we also need to be able to specify what we
provide that other distributions can require. This is done using the *provides*
keyword argument to :func:`setup`. The value for this keyword is a list of
strings, each of which names a Python module or package, and optionally
identifies the version. If the version is not specified, it is assumed to match
that of the distribution.
Some examples:
+---------------------+----------------------------------------------+
| Provides Expression | Explanation |
+=====================+==============================================+
| ``mypkg`` | Provide ``mypkg``, using the distribution |
| | version |
+---------------------+----------------------------------------------+
| ``mypkg (1.1)`` | Provide ``mypkg`` version 1.1, regardless of |
| | the distribution version |
+---------------------+----------------------------------------------+
A package can declare that it obsoletes other packages using the *obsoletes*
keyword argument. The value for this is similar to that of the *requires*
keyword: a list of strings giving module or package specifiers. Each specifier
consists of a module or package name optionally followed by one or more version
qualifiers. Version qualifiers are given in parentheses after the module or
package name.
The versions identified by the qualifiers are those that are obsoleted by the
distribution being described. If no qualifiers are given, all versions of the
named module or package are understood to be obsoleted.
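As a sketch, a distribution that replaces an older, hypothetically named
package might declare::

   setup(...,
         provides=['mypkg (1.1)'],
         obsoletes=['oldpkg (<=0.9)'],
         )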
Installing Scripts
==================
So far we have been dealing with pure and non-pure Python modules, which are
usually not run by themselves but imported by scripts.
Scripts are files containing Python source code, intended to be started from the
command line. Scripts don't require Distutils to do anything very complicated.
The only clever feature is that if the first line of the script starts with
``#!`` and contains the word "python", the Distutils will adjust the first line
to refer to the current interpreter location by default. The
:option:`--executable` (or :option:`-e`) option allows the interpreter path to
be explicitly overridden.
The :option:`scripts` option simply is a list of files to be handled in this
way. From the PyXML setup script::
setup(...
scripts=['scripts/xmlproc_parse', 'scripts/xmlproc_val']
)
Installing Package Data
=======================
Often, additional files need to be installed into a package. These files are
often data that's closely related to the package's implementation, or text files
containing documentation that might be of interest to programmers using the
package. These files are called :dfn:`package data`.
Package data can be added to packages using the ``package_data`` keyword
argument to the :func:`setup` function. The value must be a mapping from
package name to a list of relative path names that should be copied into the
package. The paths are interpreted as relative to the directory containing the
package (information from the ``package_dir`` mapping is used if appropriate);
that is, the files are expected to be part of the package in the source
directories. They may contain glob patterns as well.
The path names may contain directory portions; any necessary directories will be
created in the installation.
For example, if a package should contain a subdirectory with several data files,
the files can be arranged like this in the source tree::
setup.py
src/
mypkg/
__init__.py
module.py
data/
tables.dat
spoons.dat
forks.dat
The corresponding call to :func:`setup` might be::
setup(...,
packages=['mypkg'],
package_dir={'mypkg': 'src/mypkg'},
package_data={'mypkg': ['data/*.dat']},
)
.. versionadded:: 2.4
Installing Additional Files
===========================
The :option:`data_files` option can be used to specify additional files needed
by the module distribution: configuration files, message catalogs, data files,
anything which doesn't fit in the previous categories.
:option:`data_files` specifies a sequence of (*directory*, *files*) pairs in the
following way::
setup(...
data_files=[('bitmaps', ['bm/b1.gif', 'bm/b2.gif']),
('config', ['cfg/data.cfg']),
('/etc/init.d', ['init-script'])]
)
Note that you can specify the directory names where the data files will be
installed, but you cannot rename the data files themselves.
Each (*directory*, *files*) pair in the sequence specifies the installation
directory and the files to install there. If *directory* is a relative path, it
is interpreted relative to the installation prefix (Python's ``sys.prefix`` for
pure-Python packages, ``sys.exec_prefix`` for packages that contain extension
modules). Each file name in *files* is interpreted relative to the
:file:`setup.py` script at the top of the package source distribution. No
directory information from *files* is used to determine the final location of
the installed file; only the name of the file is used.
You can specify the :option:`data_files` option as a simple sequence of files
without specifying a target directory, but this is not recommended, and the
:command:`install` command will print a warning in this case. To install data
files directly in the target directory, an empty string should be given as the
directory.
.. _meta-data:
Additional meta-data
====================
The setup script may include additional meta-data beyond the name and version.
This information includes:
+----------------------+---------------------------+-----------------+--------+
| Meta-Data | Description | Value | Notes |
+======================+===========================+=================+========+
| ``name`` | name of the package | short string | \(1) |
+----------------------+---------------------------+-----------------+--------+
| ``version`` | version of this release | short string | (1)(2) |
+----------------------+---------------------------+-----------------+--------+
| ``author`` | package author's name | short string | \(3) |
+----------------------+---------------------------+-----------------+--------+
| ``author_email`` | email address of the | email address | \(3) |
| | package author | | |
+----------------------+---------------------------+-----------------+--------+
| ``maintainer`` | package maintainer's name | short string | \(3) |
+----------------------+---------------------------+-----------------+--------+
| ``maintainer_email`` | email address of the | email address | \(3) |
| | package maintainer | | |
+----------------------+---------------------------+-----------------+--------+
| ``url`` | home page for the package | URL | \(1) |
+----------------------+---------------------------+-----------------+--------+
| ``description`` | short, summary | short string | |
| | description of the | | |
| | package | | |
+----------------------+---------------------------+-----------------+--------+
| ``long_description`` | longer description of the | long string | |
| | package | | |
+----------------------+---------------------------+-----------------+--------+
| ``download_url`` | location where the | URL | \(4) |
| | package may be downloaded | | |
+----------------------+---------------------------+-----------------+--------+
| ``classifiers`` | a list of classifiers | list of strings | \(4) |
+----------------------+---------------------------+-----------------+--------+
Notes:
(1)
These fields are required.
(2)
It is recommended that versions take the form *major.minor[.patch[.sub]]*.
(3)
Either the author or the maintainer must be identified.
(4)
These fields should not be used if your package is to be compatible with Python
versions prior to 2.2.3 or 2.3. The list is available from the `PyPI website
<http://www.python.org/pypi>`_.
'short string'
A single line of text, not more than 200 characters.
'long string'
Multiple lines of plain text in reStructuredText format (see
http://docutils.sf.net/).
'list of strings'
See below.
None of the string values may be Unicode.
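Purely as an illustration, a setup script supplying most of the fields from the
table above might look like this (every value is a placeholder)::

   setup(name='mypkg',
         version='1.0.1',
         author='A. N. Author',
         author_email='author@example.org',
         maintainer='A. Maintainer',
         maintainer_email='maintainer@example.org',
         url='http://example.org/mypkg/',
         description='A short summary of mypkg',
         long_description='A longer description of mypkg in reStructuredText.',
         download_url='http://example.org/mypkg/mypkg-1.0.1.tar.gz',
         )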
Encoding the version information is an art in itself. Python packages generally
adhere to the version format *major.minor[.patch][sub]*. The major number is 0
for initial, experimental releases of software. It is incremented for releases
that represent major milestones in a package. The minor number is incremented
when important new features are added to the package. The patch number
increments when bug-fix releases are made. Additional trailing version
information is sometimes used to indicate sub-releases. These are
"a1,a2,...,aN" (for alpha releases, where functionality and API may change),
"b1,b2,...,bN" (for beta releases, which only fix bugs) and "pr1,pr2,...,prN"
(for final pre-release testing). Some examples:
0.1.0
the first, experimental release of a package
1.0.1a2
the second alpha release of the first patch version of 1.0
:option:`classifiers` are specified in a Python list::
setup(...
classifiers=[
'Development Status :: 4 - Beta',
'Environment :: Console',
'Environment :: Web Environment',
'Intended Audience :: End Users/Desktop',
'Intended Audience :: Developers',
'Intended Audience :: System Administrators',
'License :: OSI Approved :: Python Software Foundation License',
'Operating System :: MacOS :: MacOS X',
'Operating System :: Microsoft :: Windows',
'Operating System :: POSIX',
'Programming Language :: Python',
'Topic :: Communications :: Email',
'Topic :: Office/Business',
'Topic :: Software Development :: Bug Tracking',
],
)
If you wish to include classifiers in your :file:`setup.py` file and also wish
to remain backwards-compatible with Python releases prior to 2.2.3, then you can
include the following code fragment in your :file:`setup.py` before the
:func:`setup` call. ::
# patch distutils if it can't cope with the "classifiers" or
# "download_url" keywords
from sys import version
if version < '2.2.3':
from distutils.dist import DistributionMetadata
DistributionMetadata.classifiers = None
DistributionMetadata.download_url = None
Debugging the setup script
==========================
Sometimes things go wrong, and the setup script doesn't do what the developer
wants.
Distutils catches any exceptions when running the setup script, and prints a
simple error message before the script is terminated. The motivation for this
behaviour is to not confuse administrators who don't know much about Python and
are trying to install a package. If they get a big long traceback from deep
inside the guts of Distutils, they may think the package or the Python
installation is broken because they don't read all the way down to the bottom
and see that it's a permission problem.
On the other hand, this doesn't help the developer to find the cause of the
failure. For this purpose, the :envvar:`DISTUTILS_DEBUG` environment variable
can be set to anything except an empty string; distutils will then print
detailed information about what it is doing and a full traceback when an
exception occurs.
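For example, on a Unix shell the variable can be set for a single run like
this::

   DISTUTILS_DEBUG=1 python setup.py install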

View File

@ -1,207 +0,0 @@
.. _source-dist:
******************************
Creating a Source Distribution
******************************
As shown in section :ref:`distutils-simple-example`, you use the :command:`sdist` command
to create a source distribution. In the simplest case, ::
python setup.py sdist
(assuming you haven't specified any :command:`sdist` options in the setup script
or config file), :command:`sdist` creates the archive of the default format for
the current platform. The default format is a gzip'ed tar file
(:file:`.tar.gz`) on Unix, and ZIP file on Windows.
You can specify as many formats as you like using the :option:`--formats`
option, for example::
python setup.py sdist --formats=gztar,zip
to create a gzipped tarball and a zip file. The available formats are:
+-----------+-------------------------+---------+
| Format | Description | Notes |
+===========+=========================+=========+
| ``zip`` | zip file (:file:`.zip`) | (1),(3) |
+-----------+-------------------------+---------+
| ``gztar`` | gzip'ed tar file | (2),(4) |
| | (:file:`.tar.gz`) | |
+-----------+-------------------------+---------+
| ``bztar`` | bzip2'ed tar file | \(4) |
| | (:file:`.tar.bz2`) | |
+-----------+-------------------------+---------+
| ``ztar`` | compressed tar file | \(4) |
| | (:file:`.tar.Z`) | |
+-----------+-------------------------+---------+
| ``tar`` | tar file (:file:`.tar`) | \(4) |
+-----------+-------------------------+---------+
Notes:
(1)
default on Windows
(2)
default on Unix
(3)
requires either external :program:`zip` utility or :mod:`zipfile` module (part
of the standard Python library since Python 1.6)
(4)
requires external utilities: :program:`tar` and possibly one of :program:`gzip`,
:program:`bzip2`, or :program:`compress`
.. _manifest:
Specifying the files to distribute
==================================
If you don't supply an explicit list of files (or instructions on how to
generate one), the :command:`sdist` command puts a minimal default set into the
source distribution:
* all Python source files implied by the :option:`py_modules` and
:option:`packages` options
* all C source files mentioned in the :option:`ext_modules` or
:option:`libraries` options (
**\*\*** getting C library sources currently broken---no
:meth:`get_source_files` method in :file:`build_clib.py`! **\*\***)
* scripts identified by the :option:`scripts` option
* anything that looks like a test script: :file:`test/test\*.py` (currently, the
Distutils don't do anything with test scripts except include them in source
distributions, but in the future there will be a standard for testing Python
module distributions)
* :file:`README.txt` (or :file:`README`), :file:`setup.py` (or whatever you
called your setup script), and :file:`setup.cfg`
Sometimes this is enough, but usually you will want to specify additional files
to distribute. The typical way to do this is to write a *manifest template*,
called :file:`MANIFEST.in` by default. The manifest template is just a list of
instructions for how to generate your manifest file, :file:`MANIFEST`, which is
the exact list of files to include in your source distribution. The
:command:`sdist` command processes this template and generates a manifest based
on its instructions and what it finds in the filesystem.
If you prefer to roll your own manifest file, the format is simple: one filename
per line, regular files (or symlinks to them) only. If you do supply your own
:file:`MANIFEST`, you must specify everything: the default set of files
described above does not apply in this case.
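A hand-written :file:`MANIFEST` is then just a flat list of filenames, for
example (the names are only illustrative)::

   README.txt
   setup.py
   foo.py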
The manifest template has one command per line, where each command specifies a
set of files to include or exclude from the source distribution. For an
example, again we turn to the Distutils' own manifest template::
include *.txt
recursive-include examples *.txt *.py
prune examples/sample?/build
The meanings should be fairly clear: include all files in the distribution root
matching :file:`\*.txt`, all files anywhere under the :file:`examples` directory
matching :file:`\*.txt` or :file:`\*.py`, and exclude all directories matching
:file:`examples/sample?/build`. All of this is done *after* the standard
include set, so you can exclude files from the standard set with explicit
instructions in the manifest template. (Or, you can use the
:option:`--no-defaults` option to disable the standard set entirely.) There are
several other commands available in the manifest template mini-language; see
section :ref:`sdist-cmd`.
The order of commands in the manifest template matters: initially, we have the
list of default files as described above, and each command in the template adds
to or removes from that list of files. Once we have fully processed the
manifest template, we remove files that should not be included in the source
distribution:
* all files in the Distutils "build" tree (default :file:`build/`)
* all files in directories named :file:`RCS`, :file:`CVS` or :file:`.svn`
Now we have our complete list of files, which is written to the manifest for
future reference, and then used to build the source distribution archive(s).
You can disable the default set of included files with the
:option:`--no-defaults` option, and you can disable the standard exclude set
with :option:`--no-prune`.
Following the Distutils' own manifest template, let's trace how the
:command:`sdist` command builds the list of files to include in the Distutils
source distribution:
#. include all Python source files in the :file:`distutils` and
:file:`distutils/command` subdirectories (because packages corresponding to
those two directories were mentioned in the :option:`packages` option in the
setup script---see section :ref:`setup-script`)
#. include :file:`README.txt`, :file:`setup.py`, and :file:`setup.cfg` (standard
files)
#. include :file:`test/test\*.py` (standard files)
#. include :file:`\*.txt` in the distribution root (this will find
:file:`README.txt` a second time, but such redundancies are weeded out later)
#. include anything matching :file:`\*.txt` or :file:`\*.py` in the sub-tree
under :file:`examples`,
#. exclude all files in the sub-trees starting at directories matching
:file:`examples/sample?/build`\ ---this may exclude files included by the
previous two steps, so it's important that the ``prune`` command in the manifest
template comes after the ``recursive-include`` command
#. exclude the entire :file:`build` tree, and any :file:`RCS`, :file:`CVS` and
:file:`.svn` directories
Just like in the setup script, file and directory names in the manifest template
should always be slash-separated; the Distutils will take care of converting
them to the standard representation on your platform. That way, the manifest
template is portable across operating systems.
.. _manifest-options:
Manifest-related options
========================
The normal course of operations for the :command:`sdist` command is as follows:
* if the manifest file, :file:`MANIFEST` doesn't exist, read :file:`MANIFEST.in`
and create the manifest
* if neither :file:`MANIFEST` nor :file:`MANIFEST.in` exist, create a manifest
with just the default file set
* if either :file:`MANIFEST.in` or the setup script (:file:`setup.py`) is more
recent than :file:`MANIFEST`, recreate :file:`MANIFEST` by reading
:file:`MANIFEST.in`
* use the list of files now in :file:`MANIFEST` (either just generated or read
in) to create the source distribution archive(s)
There are a couple of options that modify this behaviour. First, use the
:option:`--no-defaults` and :option:`--no-prune` to disable the standard
"include" and "exclude" sets.
Second, you might want to force the manifest to be regenerated---for example, if
you have added or removed files or directories that match an existing pattern in
the manifest template, you should regenerate the manifest::
python setup.py sdist --force-manifest
Or, you might just want to (re)generate the manifest, but not create a source
distribution::
python setup.py sdist --manifest-only
:option:`--manifest-only` implies :option:`--force-manifest`. :option:`-o` is a
shortcut for :option:`--manifest-only`, and :option:`-f` for
:option:`--force-manifest`.

View File

@ -1,37 +0,0 @@
.. _package-upload:
***************************************
Uploading Packages to the Package Index
***************************************
.. versionadded:: 2.5
The Python Package Index (PyPI) not only stores the package info, but also the
package data if the author of the package wishes to. The distutils command
:command:`upload` pushes the distribution files to PyPI.
The command is invoked immediately after building one or more distribution
files. For example, the command ::
python setup.py sdist bdist_wininst upload
will cause the source distribution and the Windows installer to be uploaded to
PyPI. Note that these will be uploaded even if they are built using an earlier
invocation of :file:`setup.py`, but that only distributions named on the command
line for the invocation including the :command:`upload` command are uploaded.
The :command:`upload` command uses the username, password, and repository URL
from the :file:`$HOME/.pypirc` file (see section :ref:`pypirc` for more on this
file).
You can use the :option:`--sign` option to tell :command:`upload` to sign each
uploaded file using GPG (GNU Privacy Guard). The :program:`gpg` program must
be available for execution on the system :envvar:`PATH`. You can also specify
which key to use for signing using the :option:`--identity=*name*` option.
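For example, a signed upload of a source distribution might be invoked like
this (the identity shown is a placeholder)::

   python setup.py sdist upload --sign --identity="A. N. Author"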
Other :command:`upload` options include :option:`--repository=*url*` (which
lets you override the repository setting from :file:`$HOME/.pypirc`), and
:option:`--show-response` (which displays the full response text from the PyPI
server for help in debugging upload problems).

View File

@ -1,192 +0,0 @@
.. highlightlang:: rest
Differences to the LaTeX markup
===============================
Though the markup language is different, most of the concepts and markup types
of the old LaTeX docs have been kept -- environments as reST directives, inline
commands as reST roles and so forth.
However, there are some differences in the way these work, partly due to the
differences in the markup languages, partly due to improvements in Sphinx. This
section lists these differences, in order to give those familiar with the old
format a quick overview of what they might run into.
Inline markup
-------------
These changes have been made to inline markup:
* **Cross-reference roles**
Most of the following semantic roles existed previously as inline commands,
but didn't do anything except formatting the content as code. Now, they
cross-reference to known targets (some names have also been shortened):
| *mod* (previously *refmodule* or *module*)
| *func* (previously *function*)
| *data* (new)
| *const*
| *class*
| *meth* (previously *method*)
| *attr* (previously *member*)
| *exc* (previously *exception*)
| *cdata*
| *cfunc* (previously *cfunction*)
| *cmacro* (previously *csimplemacro*)
| *ctype*
Also different is the handling of *func* and *meth*: while previously
parentheses were added to the callable name (like ``\func{str()}``), they are
now appended by the build system -- appending them in the source will result
in double parentheses. This also means that ``:func:`str(object)``` will not
work as expected -- use ````str(object)```` instead!
* **Inline commands implemented as directives**
These were inline commands in LaTeX, but are now directives in reST:
| *deprecated*
| *versionadded*
| *versionchanged*
These are used like so::
.. deprecated:: 2.5
Reason of deprecation.
Also, no period is appended to the text for *versionadded* and
*versionchanged*.
| *note*
| *warning*
These are used like so::
.. note::
Content of note.
* **Otherwise changed commands**
The *samp* command previously formatted code and added quotation marks around
it. The *samp* role, however, features a new highlighting system just like
*file* does:
``:samp:`open({filename}, {mode})``` results in :samp:`open({filename}, {mode})`
* **Dropped commands**
These were commands in LaTeX, but are not available as roles:
| *bfcode*
| *character* (use :samp:`\`\`'c'\`\``)
| *citetitle* (use ```Title <URL>`_``)
| *code* (use ````code````)
| *email* (just write the address in body text)
| *filenq*
| *filevar* (use the ``{...}`` highlighting feature of *file*)
| *programopt*, *longprogramopt* (use *option*)
| *ulink* (use ```Title <URL>`_``)
| *url* (just write the URL in body text)
| *var* (use ``*var*``)
| *infinity*, *plusminus* (use the Unicode character)
| *shortversion*, *version* (use the ``|version|`` and ``|release|`` substitutions)
| *emph*, *strong* (use the reST markup)
* **Backslash escaping**
In reST, a backslash must be escaped in normal text, and in the content of
roles. However, in code literals and literal blocks, it must not be escaped.
Example: ``:file:`C:\\Temp\\my.tmp``` vs. ````open("C:\Temp\my.tmp")````.
Information units
-----------------
Information units (*...desc* environments) have been made reST directives.
These changes to information units should be noted:
* **New names**
"desc" has been removed from every name. Additionally, these directives have
new names:
| *cfunction* (previously *cfuncdesc*)
| *cmacro* (previously *csimplemacrodesc*)
| *exception* (previously *excdesc*)
| *function* (previously *funcdesc*)
| *attribute* (previously *memberdesc*)
The *classdesc\** and *excclassdesc* environments have been dropped; the
*class* and *exception* directives support classes documented both with and
without constructor arguments.
* **Multiple objects**
The equivalent of the *...line* commands is::
.. function:: do_foo(bar)
do_bar(baz)
Description of the functions.
In other words, just give one signature per line, at the same indentation level.
* **Arguments**
There is no *optional* command. Just give function signatures like they
should appear in the output::
.. function:: open(filename[, mode[, buffering]])
Description.
Note: markup in the signature is not supported.
* **Indexing**
The *...descni* environments have been dropped. To mark an information unit
as unsuitable for index entry generation, use the *noindex* option like so::
.. function:: foo_*
:noindex:
Description.
* **New information unit**
There is a new generic information unit called "describe" which can be used
to document things that are not covered by the other units::
.. describe:: a == b
The equals operator.
Structure
---------
The LaTeX docs were split in several toplevel manuals. Now, all files
are part of the same documentation tree, as indicated by the *toctree*
directives in the sources. Every *toctree* directive embeds other files
as subdocuments of the current file (this structure is not necessarily
mirrored in the filesystem layout). The toplevel file is
:file:`contents.rst`.
However, most of the old directory structure has been kept, with the
directories renamed as follows:
* :file:`api` -> :file:`c-api`
* :file:`dist` -> :file:`distutils`, with the single TeX file split up
* :file:`doc` -> :file:`documenting`
* :file:`ext` -> :file:`extending`
* :file:`inst` -> :file:`installing`
* :file:`lib` -> :file:`library`
* :file:`mac` -> merged into :file:`library`, with `mac/using.tex`
moved to `howto/pythonmac.rst`
* :file:`ref` -> :file:`reference`
* :file:`tut` -> :file:`tutorial`, with the single TeX file split up
.. XXX more (index-generating, production lists, ...)

View File

@ -1,34 +0,0 @@
.. _documenting-index:
######################
Documenting Python
######################
The Python language has a substantial body of documentation, much of it
contributed by various authors. The markup used for the Python documentation is
`reStructuredText`_, developed by the `docutils`_ project, amended by custom
directives and using a toolset named *Sphinx* to postprocess the HTML output.
This document describes the style guide for our documentation, the custom
reStructuredText markup introduced to support Python documentation and how it
should be used, as well as the Sphinx build system.
.. _reStructuredText: http://docutils.sf.net/rst.html
.. _docutils: http://docutils.sf.net/
If you're interested in contributing to Python's documentation, there's no need
to write reStructuredText if you're not so inclined; plain text contributions
are more than welcome as well.
.. toctree::
intro.rst
style.rst
rest.rst
markup.rst
fromlatex.rst
sphinx.rst
.. XXX add credits, thanks etc.

View File

@ -1,29 +0,0 @@
Introduction
============
Python's documentation has long been considered to be good for a free
programming language. There are a number of reasons for this, the most
important being the early commitment of Python's creator, Guido van Rossum, to
providing documentation on the language and its libraries, and the continuing
involvement of the user community in providing assistance for creating and
maintaining documentation.
The involvement of the community takes many forms, from authoring to bug reports
to just plain complaining when the documentation could be more complete or
easier to use.
This document is aimed at authors and potential authors of documentation for
Python. More specifically, it is for people contributing to the standard
documentation and developing additional documents using the same tools as the
standard documents. This guide will be less useful for authors using the Python
documentation tools for topics other than Python, and less useful still for
authors not using the tools at all.
If your interest is in contributing to the Python documentation, but you don't
have the time or inclination to learn reStructuredText and the markup structures
documented here, there's a welcoming place for you among the Python contributors
as well. Any time you feel that you can clarify existing documentation or
provide documentation that's missing, the existing documentation team will
gladly work with you to integrate your text, dealing with the markup for you.
Please don't let the material in this document stand between the documentation
and your desire to help out!

View File

@ -1,775 +0,0 @@
.. highlightlang:: rest
Additional Markup Constructs
============================
Sphinx adds a lot of new directives and interpreted text roles to standard reST
markup. This section contains the reference material for these facilities.
Documentation for "standard" reST constructs is not included here, though
they are used in the Python documentation.
File-wide metadata
------------------
reST has the concept of "field lists"; these are a sequence of fields marked up
like this::
:Field name: Field content
A field list at the very top of a file is parsed as the "docinfo", which in
normal documents can be used to record the author, date of publication and
other metadata. In Sphinx, the docinfo is used as metadata, too, but not
displayed in the output.
At the moment, only one metadata field is recognized:
``nocomments``
If set, the web application won't display a comment form for a page generated
from this source file.
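For example, a source file that should not get a comment form could start with
this field list before its title (a minimal sketch)::

   :nocomments: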
Meta-information markup
-----------------------
.. describe:: sectionauthor
Identifies the author of the current section. The argument should include the
author's name, in a form that could be used for presentation (though it
currently isn't), and an email address. The domain name portion of the address should be lower
case. Example::
.. sectionauthor:: Guido van Rossum <guido@python.org>
Currently, this markup isn't reflected in the output in any way, but it helps
keep track of contributions.
Module-specific markup
----------------------
The markup described in this section is used to provide information about a
module being documented. Each module should be documented in its own file.
Normally this markup appears after the title heading of that file; a typical
file might start like this::
:mod:`parrot` -- Dead parrot access
===================================
.. module:: parrot
:platform: Unix, Windows
:synopsis: Analyze and reanimate dead parrots.
.. moduleauthor:: Eric Cleese <eric@python.invalid>
.. moduleauthor:: John Idle <john@python.invalid>
As you can see, the module-specific markup consists of two directives, the
``module`` directive and the ``moduleauthor`` directive.
.. describe:: module
This directive marks the beginning of the description of a module (or package
submodule, in which case the name should be fully qualified, including the
package name).
The ``platform`` option, if present, is a comma-separated list of the
platforms on which the module is available (if it is available on all
platforms, the option should be omitted). The keys are short identifiers;
examples that are in use include "IRIX", "Mac", "Windows", and "Unix". It is
important to use a key which has already been used when applicable.
The ``synopsis`` option should consist of one sentence describing the
module's purpose -- it is currently only used in the Global Module Index.
.. describe:: moduleauthor
The ``moduleauthor`` directive, which can appear multiple times, names the
authors of the module code, just like ``sectionauthor`` names the author(s)
of a piece of documentation. It too does not result in any output currently.
.. note::
It is important to make the section title of a module-describing file
meaningful since that value will be inserted in the table-of-contents trees
in overview files.
Information units
-----------------
There are a number of directives used to describe specific features provided by
modules. Each directive requires one or more signatures to provide basic
information about what is being described, and the content should be the
description. The basic version makes entries in the general index; if no index
entry is desired, you can give the directive option flag ``:noindex:``. The
following example shows all of the features of this directive type::
.. function:: spam(eggs)
ham(eggs)
:noindex:
Spam or ham the foo.
The signatures of object methods or data attributes should always include the
type name (``.. method:: FileInput.input(...)``), even if it is obvious from the
context which type they belong to; this is to enable consistent
cross-references. If you describe methods belonging to an abstract protocol,
such as "context managers", include a (pseudo-)type name too to make the
index entries more informative.
The directives are:
.. describe:: cfunction
Describes a C function. The signature should be given as in C, e.g.::
.. cfunction:: PyObject* PyType_GenericAlloc(PyTypeObject *type, Py_ssize_t nitems)
This is also used to describe function-like preprocessor macros. The names
of the arguments should be given so they may be used in the description.
Note that you don't have to backslash-escape asterisks in the signature,
as it is not parsed by the reST inliner.
.. describe:: cmember
Describes a C struct member. Example signature::
.. cmember:: PyObject* PyTypeObject.tp_bases
The text of the description should include the range of values allowed, how
the value should be interpreted, and whether the value can be changed.
References to structure members in text should use the ``member`` role.
.. describe:: cmacro
Describes a "simple" C macro. Simple macros are macros which are used
for code expansion, but which do not take arguments so cannot be described as
functions. This is not to be used for simple constant definitions. Examples
of its use in the Python documentation include :cmacro:`PyObject_HEAD` and
:cmacro:`Py_BEGIN_ALLOW_THREADS`.
.. describe:: ctype
Describes a C type. The signature should just be the type name.
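For example, a structure type from the C API might be documented like this
(the description is abbreviated)::

   .. ctype:: PyTypeObject

      The C structure of the objects used to describe built-in types.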
.. describe:: cvar
Describes a global C variable. The signature should include the type, such
as::
.. cvar:: PyObject* PyClass_Type
.. describe:: data
Describes global data in a module, including both variables and values used
as "defined constants." Class and object attributes are not documented
using this environment.
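A minimal example, using a made-up constant name::

   .. data:: MAX_RETRIES

      The default number of connection attempts made by the functions in this
      module.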
.. describe:: exception
Describes an exception class. The signature can, but need not, include
parentheses with constructor arguments.
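For example, with constructor arguments shown (the exception name is
hypothetical)::

   .. exception:: ConversionError(message)

      Raised when a parrot cannot be converted to the requested state.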
.. describe:: function
Describes a module-level function. The signature should include the
parameters, enclosing optional parameters in brackets. Default values can be
given if it enhances clarity. For example::
.. function:: Timer.repeat([repeat=3[, number=1000000]])
Object methods are not documented using this directive. Bound object methods
placed in the module namespace as part of the public interface of the module
are documented using this, as they are equivalent to normal functions for
most purposes.
The description should include information about the parameters required and
how they are used (especially whether mutable objects passed as parameters
are modified), side effects, and possible exceptions. A small example may be
provided.
.. describe:: class
Describes a class. The signature can include parentheses with parameters
which will be shown as the constructor arguments.
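For example, a class whose constructor takes one optional argument might be
documented as follows (the names are made up)::

   .. class:: Parrot([state='dead'])

      Return a new parrot object in the given *state*.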
.. describe:: attribute
Describes an object data attribute. The description should include
information about the type of the data to be expected and whether it may be
changed directly.
.. describe:: method
Describes an object method. The parameters should not include the ``self``
parameter. The description should include similar information to that
described for ``function``.
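For example, continuing the hypothetical class above (note that ``self`` is
not part of the signature)::

   .. method:: Parrot.revive(voltage)

      Attempt to reanimate the parrot by applying the given *voltage*.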
.. describe:: opcode
Describes a Python bytecode instruction.
There is also a generic version of these directives:
.. describe:: describe
This directive produces the same formatting as the specific ones explained
above but does not create index entries or cross-referencing targets. It is
used, for example, to describe the directives in this document. Example::
.. describe:: opcode
Describes a Python bytecode instruction.
Showing code examples
---------------------
Examples of Python source code or interactive sessions are represented using
standard reST literal blocks. They are started by a ``::`` at the end of the
preceding paragraph and delimited by indentation.
Representing an interactive session requires including the prompts and output
along with the Python code. No special markup is required for interactive
sessions. After the last line of input or output presented, there should not be
an "unused" primary prompt; this is an example of what *not* to do::
>>> 1 + 1
2
>>>
Syntax highlighting is handled in a smart way:
* There is a "highlighting language" for each source file. By default,
this is ``'python'``, as the majority of files will have to highlight Python
snippets.
* Within Python highlighting mode, interactive sessions are recognized
automatically and highlighted appropriately.
* The highlighting language can be changed using the ``highlightlang``
directive, used as follows::
.. highlightlang:: c
This language is used until the next ``highlightlang`` directive is
encountered.
* The valid values for the highlighting language are:
* ``python`` (the default)
* ``c``
* ``rest``
* ``none`` (no highlighting)
* If highlighting with the current language fails, the block is not highlighted
in any way.
Longer displays of verbatim text may be included by storing the example text in
an external file containing only plain text. The file may be included using the
``literalinclude`` directive. [1]_ For example, to include the Python source file
:file:`example.py`, use::
.. literalinclude:: example.py
The file name is relative to the current file's path. Documentation-specific
include files should be placed in the ``Doc/includes`` subdirectory.
Inline markup
-------------
As said before, Sphinx uses interpreted text roles to insert semantic markup in
documents.
Variable names are an exception; they should be marked simply with ``*var*``.
For all other roles, you have to write ``:rolename:`content```.
The following roles refer to objects in modules and are possibly hyperlinked if
a matching identifier is found:
.. describe:: mod
The name of a module; a dotted name may be used. This should also be used for
package names.
.. describe:: func
The name of a Python function; dotted names may be used. The role text
should include trailing parentheses to enhance readability. The parentheses
are stripped when searching for identifiers.
.. describe:: data
The name of a module-level variable.
.. describe:: const
The name of a "defined" constant. This may be a C-language ``#define``
or a Python variable that is not intended to be changed.
.. describe:: class
A class name; a dotted name may be used.
.. describe:: meth
The name of a method of an object. The role text should include the type
name, method name and the trailing parentheses. A dotted name may be used.
.. describe:: attr
The name of a data attribute of an object.
.. describe:: exc
The name of an exception. A dotted name may be used.
The name enclosed in this markup can include a module name and/or a class name.
For example, ``:func:`filter``` could refer to a function named ``filter`` in
the current module, or the built-in function of that name. In contrast,
``:func:`foo.filter``` clearly refers to the ``filter`` function in the ``foo``
module.
A similar heuristic is used to determine whether the name is an attribute of
the currently documented class.
The following roles create cross-references to C-language constructs if they
are defined in the API documentation:
.. describe:: cdata
The name of a C-language variable.
.. describe:: cfunc
The name of a C-language function. Should include trailing parentheses.
.. describe:: cmacro
The name of a "simple" C macro, as defined above.
.. describe:: ctype
The name of a C-language type.
The following role may create a cross-reference, but does not refer
to objects:
.. describe:: token
The name of a grammar token (used in the reference manual to create links
between production displays).
---------
The following roles don't do anything special except formatting the text
in a different style:
.. describe:: command
The name of an OS-level command, such as ``rm``.
.. describe:: dfn
Mark the defining instance of a term in the text. (No index entries are
generated.)
.. describe:: envvar
An environment variable. Index entries are generated.
.. describe:: file
The name of a file or directory. Within the contents, you can use curly
braces to indicate a "variable" part, for example::
... is installed in :file:`/usr/lib/python2.{x}/site-packages` ...
In the built documentation, the ``x`` will be displayed differently to
indicate that it is to be replaced by the Python minor version.
.. describe:: guilabel
Labels presented as part of an interactive user interface should be marked
using ``guilabel``. This includes labels from text-based interfaces such as
those created using :mod:`curses` or other text-based libraries. Any label
used in the interface should be marked with this role, including button
labels, window titles, field names, menu and menu selection names, and even
values in selection lists.
.. describe:: kbd
Mark a sequence of keystrokes. What form the key sequence takes may depend
on platform- or application-specific conventions. When there are no relevant
conventions, the names of modifier keys should be spelled out, to improve
accessibility for new users and non-native speakers. For example, an
*xemacs* key sequence may be marked like ``:kbd:`C-x C-f```, but without
reference to a specific application or platform, the same sequence should be
marked as ``:kbd:`Control-x Control-f```.
.. describe:: keyword
The name of a keyword in a programming language.
.. describe:: mailheader
The name of an RFC 822-style mail header. This markup does not imply that
the header is being used in an email message, but can be used to refer to any
header of the same "style." This is also used for headers defined by the
various MIME specifications. The header name should be entered in the same
way it would normally be found in practice, with the camel-casing conventions
being preferred where there is more than one common usage. For example:
``:mailheader:`Content-Type```.
.. describe:: makevar
The name of a :command:`make` variable.
.. describe:: manpage
A reference to a Unix manual page including the section,
e.g. ``:manpage:`ls(1)```.
.. describe:: menuselection
Menu selections should be marked using the ``menuselection`` role. This is
used to mark a complete sequence of menu selections, including selecting
submenus and choosing a specific operation, or any subsequence of such a
sequence. The names of individual selections should be separated by
``-->``.
For example, to mark the selection "Start > Programs", use this markup::
:menuselection:`Start --> Programs`
When including a selection that includes some trailing indicator, such as the
ellipsis some operating systems use to indicate that the command opens a
dialog, the indicator should be omitted from the selection name.
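For example, for a menu entry displayed as "Save As...", the markup would
simply be (a hypothetical menu)::

   :menuselection:`File --> Save As`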
.. describe:: mimetype
The name of a MIME type, or a component of a MIME type (the major or minor
portion, taken alone).
.. describe:: newsgroup
The name of a Usenet newsgroup.
.. describe:: option
A command-line option to an executable program. The leading hyphen(s) must
be included.
.. describe:: program
The name of an executable program. This may differ from the file name for
the executable for some platforms. In particular, the ``.exe`` (or other)
extension should be omitted for Windows programs.
.. describe:: regexp
A regular expression. Quotes should not be included.
.. describe:: samp
A piece of literal text, such as code. Within the contents, you can use
curly braces to indicate a "variable" part, as in ``:file:``.
If you don't need the "variable part" indication, use the standard
````code```` instead.
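For example, with the variable part indicated by braces (the command line is
only an illustration)::

   Install it by running :samp:`python setup.py install --prefix={directory}`.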
.. describe:: var
A Python or C variable or parameter name.
The following roles generate external links:
.. describe:: pep
A reference to a Python Enhancement Proposal. This generates appropriate
index entries. The text "PEP *number*\ " is generated; in the HTML output,
this text is a hyperlink to an online copy of the specified PEP.
.. describe:: rfc
A reference to an Internet Request for Comments. This generates appropriate
index entries. The text "RFC *number*\ " is generated; in the HTML output,
this text is a hyperlink to an online copy of the specified RFC.
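For example, referring to the style guide PEP and the mail format RFC::

   See :pep:`8` and :rfc:`2822` for more background.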
Note that there are no special roles for including hyperlinks, as you can use
the standard reST markup for that purpose.
.. _doc-ref-role:
Cross-linking markup
--------------------
To support cross-referencing to arbitrary sections in the documentation, the
standard reST labels are "abused" a bit: Every label must precede a section
title; and every label name must be unique throughout the entire documentation
source.
You can then refer to these sections using the ``:ref:`label-name``` role.
Example::
.. _my-reference-label:
Section to cross-reference
--------------------------
This is the text of the section.
It refers to the section itself, see :ref:`my-reference-label`.
The ``:ref:`` invocation is replaced with the section title.
Paragraph-level markup
----------------------
These directives create short paragraphs and can be used inside information
units as well as normal text:
.. describe:: note
An especially important bit of information about an API that a user should be
aware of when using whatever bit of API the note pertains to. The content of
the directive should be written in complete sentences and include all
appropriate punctuation.
Example::
.. note::
This function is not suitable for sending spam e-mails.
.. describe:: warning
An important bit of information about an API that a user should be very aware
of when using whatever bit of API the warning pertains to. The content of
the directive should be written in complete sentences and include all
appropriate punctuation. This differs from ``note`` in that it is recommended
over ``note`` for information regarding security.
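Example (the warning text is made up)::

   .. warning::

      This function is not thread-safe; protect concurrent calls with a lock.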
.. describe:: versionadded
This directive documents the version of Python which added the described
feature to the library or C API. When this applies to an entire module, it
should be placed at the top of the module section before any prose.
The first argument must be given and is the version in question; you can add
a second argument consisting of a *brief* explanation of the change.
Example::
.. versionadded:: 2.5
The `spam` parameter.
Note that there must be no blank line between the directive head and the
explanation; this is to make these blocks visually continuous in the markup.
.. describe:: versionchanged
Similar to ``versionadded``, but describes when and what changed in the named
feature in some way (new parameters, changed side effects, etc.).
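Example (the parameter name is made up)::

   .. versionchanged:: 2.6
      The *spam* parameter now also accepts ``None``.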
--------------
.. describe:: seealso
Many sections include a list of references to module documentation or
external documents. These lists are created using the ``seealso`` directive.
The ``seealso`` directive is typically placed in a section just before any
sub-sections. For the HTML output, it is shown boxed off from the main flow
of the text.
The content of the ``seealso`` directive should be a reST definition list.
Example::
.. seealso::
Module :mod:`zipfile`
Documentation of the :mod:`zipfile` standard module.
`GNU tar manual, Basic Tar Format <http://link>`_
Documentation for tar archive files, including GNU tar extensions.
.. describe:: rubric
This directive creates a paragraph heading that is not used to create a
table of contents node. It is currently used for the "Footnotes" caption.
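Example::

   .. rubric:: Footnotes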
.. describe:: centered
This directive creates a centered boldfaced paragraph. Use it as follows::
.. centered::
Paragraph contents.
Table-of-contents markup
------------------------
Since reST does not have facilities to interconnect several documents, or split
documents into multiple output files, Sphinx uses a custom directive to add
relations between the single files the documentation is made of, as well as
tables of contents. The ``toctree`` directive is the central element.
.. describe:: toctree
This directive inserts a "TOC tree" at the current location, using the
individual TOCs (including "sub-TOC trees") of the files given in the
directive body. A numeric ``maxdepth`` option may be given to indicate the
depth of the tree; by default, all levels are included.
Consider this example (taken from the library reference index)::
.. toctree::
:maxdepth: 2
intro.rst
strings.rst
datatypes.rst
numeric.rst
(many more files listed here)
This accomplishes two things:
* Tables of contents from all those files are inserted, with a maximum depth
of two, which means one nested heading. ``toctree`` directives in those
files are also taken into account.
* Sphinx knows the relative order of the files ``intro.rst``,
``strings.rst`` and so forth, and it knows that they are children of the
shown file, the library index. From this information it generates "next
chapter", "previous chapter" and "parent chapter" links.
In the end, all files included in the build process must occur in one
``toctree`` directive; Sphinx will emit a warning if it finds a file that is
not included, because that means that this file will not be reachable through
standard navigation.
The special file ``contents.rst`` at the root of the source directory is the
"root" of the TOC tree hierarchy; from it the "Contents" page is generated.
Index-generating markup
-----------------------
Sphinx automatically creates index entries from all information units (like
functions, classes or attributes), as discussed before.
However, there is also an explicit directive available, to make the index more
comprehensive and enable index entries in documents where information is not
mainly contained in information units, such as the language reference.
The directive is ``index`` and contains one or more index entries. Each entry
consists of a type and a value, separated by a colon.
For example::
.. index::
single: execution; context
module: __main__
module: sys
triple: module; search; path
This directive contains four entries, which will be converted to entries in the
generated index which link to the exact location of the index statement (or, in
case of offline media, the corresponding page number).
The possible entry types are:
single
Creates a single index entry. Can be made a subentry by separating the
subentry text with a semicolon (this is also used below to describe what
entries are created).
pair
``pair: loop; statement`` is a shortcut that creates two index entries,
namely ``loop; statement`` and ``statement; loop``.
triple
Likewise, ``triple: module; search; path`` is a shortcut that creates three
index entries, which are ``module; search path``, ``search; path, module`` and
``path; module search``.
module, keyword, operator, object, exception, statement, builtin
These all create two index entries. For example, ``module: hashlib`` creates
the entries ``module; hashlib`` and ``hashlib; module``.
Grammar production displays
---------------------------
Special markup is available for displaying the productions of a formal grammar.
The markup is simple and does not attempt to model all aspects of BNF (or any
derived forms), but provides enough to allow context-free grammars to be
displayed in a way that causes uses of a symbol to be rendered as hyperlinks to
the definition of the symbol. There is this directive:
.. describe:: productionlist
This directive is used to enclose a group of productions. Each production is
given on a single line and consists of a name, separated by a colon from the
following definition. If the definition spans multiple lines, each
continuation line must begin with a colon placed at the same column as in the
first line.
Blank lines are not allowed within ``productionlist`` directive arguments.
The definition can contain token names which are marked as interpreted text
(e.g. ``sum ::= `integer` "+" `integer```) -- this generates cross-references
to the productions of these tokens.
Note that no further reST parsing is done in the production, so that you
don't have to escape ``*`` or ``|`` characters.
.. XXX describe optional first parameter
The following is an example taken from the Python Reference Manual::
.. productionlist::
try_stmt: try1_stmt | try2_stmt
try1_stmt: "try" ":" `suite`
: ("except" [`expression` ["," `target`]] ":" `suite`)+
: ["else" ":" `suite`]
: ["finally" ":" `suite`]
try2_stmt: "try" ":" `suite`
: "finally" ":" `suite`
Substitutions
-------------
The documentation system provides three substitutions that are defined by default.
They are set in the build configuration file, see :ref:`doc-build-config`.
.. describe:: |release|
Replaced by the Python release the documentation refers to. This is the full
version string including alpha/beta/release candidate tags, e.g. ``2.5.2b3``.
.. describe:: |version|
Replaced by the Python version the documentation refers to. This consists
only of the major and minor version parts, e.g. ``2.5``, even for version
2.5.1.
.. describe:: |today|
Replaced by either today's date, or the date set in the build configuration
file. Normally has the format ``April 14, 2007``.
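In the source text they are used like any other reST substitution, for
example::

   This documentation describes Python |version| and was generated on |today|.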
.. rubric:: Footnotes
.. [1] There is a standard ``.. include`` directive, but it raises errors if the
file is not found. This one only emits a warning.
View File
@ -1,251 +0,0 @@
.. highlightlang:: rest
reStructuredText Primer
=======================
This section is a brief introduction to reStructuredText (reST) concepts and
syntax, intended to provide authors with enough information to author documents
productively. Since reST was designed to be a simple, unobtrusive markup
language, this will not take too long.
.. seealso::
The authoritative `reStructuredText User
Documentation <http://docutils.sourceforge.net/rst.html>`_.
Paragraphs
----------
The paragraph is the most basic block in a reST document. Paragraphs are simply
chunks of text separated by one or more blank lines. As in Python, indentation
is significant in reST, so all lines of the same paragraph must be left-aligned
to the same level of indentation.
Inline markup
-------------
The standard reST inline markup is quite simple: use
* one asterisk: ``*text*`` for emphasis (italics),
* two asterisks: ``**text**`` for strong emphasis (boldface), and
* backquotes: ````text```` for code samples.
If asterisks or backquotes appear in running text and could be confused with
inline markup delimiters, they have to be escaped with a backslash.
Be aware of some restrictions of this markup:
* it may not be nested,
* content may not start or end with whitespace: ``* text*`` is wrong,
* it must be separated from surrounding text by non-word characters. Use a
backslash escaped space to work around that: ``thisis\ *one*\ word``.
These restrictions may be lifted in future versions of the docutils.
reST also allows for custom "interpreted text roles", which signify that the
enclosed text should be interpreted in a specific way. Sphinx uses this to
provide semantic markup and cross-referencing of identifiers, as described in
the appropriate section. The general syntax is ``:rolename:`content```.
Lists and Quotes
----------------
List markup is natural: just place an asterisk at the start of a paragraph and
indent properly. The same goes for numbered lists; they can also be
autonumbered using a ``#`` sign::
* This is a bulleted list.
* It has two items, the second
item uses two lines.
1. This is a numbered list.
2. It has two items too.
#. This is a numbered list.
#. It has two items too.
Note that Sphinx disables the use of enumerated lists introduced by alphabetic
or roman numerals, such as ::
A. First item
B. Second item
Nested lists are possible, but be aware that they must be separated from the
parent list items by blank lines::
* this is
* a list
* with a nested list
* and some subitems
* and here the parent list continues
Definition lists are created as follows::
term (up to a line of text)
Definition of the term, which must be indented
and can even consist of multiple paragraphs
next term
Description.
Paragraphs are quoted by just indenting them more than the surrounding
paragraphs.
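For example::

   This is a normal paragraph.

      This indented paragraph is rendered as a block quote.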
Source Code
-----------
Literal code blocks are introduced by ending a paragraph with the special marker
``::``. The literal block must be indented, to be able to include blank lines::
This is a normal text paragraph. The next paragraph is a code sample::
It is not processed in any way, except
that the indentation is removed.
It can span multiple lines.
This is a normal text paragraph again.
The handling of the ``::`` marker is smart:
* If it occurs as a paragraph of its own, that paragraph is completely left
out of the document.
* If it is preceded by whitespace, the marker is removed.
* If it is preceded by non-whitespace, the marker is replaced by a single
colon.
That way, the second sentence in the above example's first paragraph would be
rendered as "The next paragraph is a code sample:".
Hyperlinks
----------
External links
^^^^^^^^^^^^^^
Use ```Link text <http://target>`_`` for inline web links. If the link text
should be the web address, you don't need special markup at all, the parser
finds links and mail addresses in ordinary text.
Internal links
^^^^^^^^^^^^^^
Internal linking is done via a special reST role, see the section on specific
markup, :ref:`doc-ref-role`.
Sections
--------
Section headers are created by underlining (and optionally overlining) the
section title with a punctuation character, at least as long as the text::
=================
This is a heading
=================
Normally, there are no heading levels assigned to certain characters as the
structure is determined from the succession of headings. However, for the
Python documentation, we use this convention:
* ``#`` with overline, for parts
* ``*`` with overline, for chapters
* ``=``, for sections
* ``-``, for subsections
* ``^``, for subsubsections
* ``"``, for paragraphs
Explicit Markup
---------------
"Explicit markup" is used in reST for most constructs that need special
handling, such as footnotes, specially-highlighted paragraphs, comments, and
generic directives.
An explicit markup block begins with a line starting with ``..`` followed by
whitespace and is terminated by the next paragraph at the same level of
indentation. (There needs to be a blank line between explicit markup and normal
paragraphs. This may all sound a bit complicated, but it is intuitive enough
when you write it.)
Directives
----------
A directive is a generic block of explicit markup. Besides roles, it is one of
the extension mechanisms of reST, and Sphinx makes heavy use of it.
Basically, a directive consists of a name, arguments, options and content. (Keep
this terminology in mind; it is used in the next chapter describing custom
directives.) Looking at this example, ::
.. function:: foo(x)
foo(y, z)
:bar: no
Return a line of text input from the user.
``function`` is the directive name. It is given two arguments here, the
remainder of the first line and the second line, as well as one option ``bar``
(as you can see, options are given in the lines immediately following the
arguments and indicated by the colons).
The directive content follows after a blank line and is indented relative to the
directive start.
Footnotes
---------
For footnotes, use ``[#]_`` to mark the footnote location, and add the footnote
body at the bottom of the document after a "Footnotes" rubric heading, like so::
Lorem ipsum [#]_ dolor sit amet ... [#]_
.. rubric:: Footnotes
.. [#] Text of the first footnote.
.. [#] Text of the second footnote.
You can also explicitly number the footnotes for better context.
Comments
--------
Every explicit markup block which isn't a valid markup construct (like the
footnotes above) is regarded as a comment.
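For example::

   .. This is a comment; it will not appear in the generated output.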
Source encoding
---------------
Since the easiest way to include special characters like em dashes or copyright
signs in reST is to directly write them as Unicode characters, one has to
specify an encoding:
All Python documentation source files must be in UTF-8 encoding, and the HTML
documents written from them will be in that encoding as well.
Gotchas
-------
There are some problems one commonly runs into while authoring reST documents:
* **Separation of inline markup:** As said above, inline markup spans must be
separated from the surrounding text by non-word characters; you have to use
an escaped space to get around that.
.. XXX more?
View File
@ -1,60 +0,0 @@
.. highlightlang:: rest
The Sphinx build system
=======================
.. XXX: intro...
.. _doc-build-config:
The build configuration file
----------------------------
The documentation root, that is the ``Doc`` subdirectory of the source
distribution, contains a file named ``conf.py``. This file is called the "build
configuration file", and it contains several variables that are read and used
during a build run.
These variables are:
version : string
A string that is used as a replacement for the ``|version|`` reST
substitution. It should be the Python version the documentation refers to.
This consists only of the major and minor version parts, e.g. ``2.5``, even
for version 2.5.1.
release : string
A string that is used as a replacement for the ``|release|`` reST
substitution. It should be the full version string including
alpha/beta/release candidate tags, e.g. ``2.5.2b3``.
Both ``release`` and ``version`` can be ``'auto'``, which means that they are
determined at runtime from the ``Include/patchlevel.h`` file, if a complete
Python source distribution can be found, or else from the interpreter running
Sphinx.
today_fmt : string
A ``strftime`` format that is used to format a replacement for the
``|today|`` reST substitution.
today : string
A string that can contain a date that should be written to the documentation
output literally. If this is a non-empty string, it is used instead of
``strftime(today_fmt)``.
unused_files : list of strings
A list of reST filenames that are to be disregarded during building. This
could be docs for temporarily disabled modules or documentation that's not
yet ready for public consumption.
last_updated_format : string
If this is not an empty string, it will be given to ``time.strftime()`` and
written to each generated output file after "last updated on:".
use_smartypants : bool
If true, use SmartyPants to convert quotes and dashes to the typographically
correct entities.
add_function_parentheses : bool
If true, ``()`` will be appended to the content of ``:func:``, ``:meth:`` and
``:cfunc:`` cross-references.
View File
@ -1,70 +0,0 @@
.. highlightlang:: rest
Style Guide
===========
The Python documentation should follow the `Apple Publications Style Guide`_
wherever possible. This particular style guide was selected mostly because it
seems reasonable and is easy to get online.
Topics which are not covered in Apple's style guide will be discussed in
this document.
All reST files use an indentation of 3 spaces. The maximum line length is 80
characters for normal text, but tables, deeply indented code samples and long
links may extend beyond that.
Make generous use of blank lines where applicable; they help grouping things
together.
A sentence-ending period may be followed by one or two spaces; while reST
ignores the second space, it is customarily put in by some users, for example
to aid Emacs' auto-fill mode.
Footnotes are generally discouraged, though they may be used when they are the
best way to present specific information. When a footnote reference is added at
the end of the sentence, it should follow the sentence-ending punctuation. The
reST markup should appear something like this::
This sentence has a footnote reference. [#]_ This is the next sentence.
Footnotes should be gathered at the end of a file, or if the file is very long,
at the end of a section. The docutils will automatically create backlinks to
the footnote reference.
Footnotes may appear in the middle of sentences where appropriate.
Many special names are used in the Python documentation, including the names of
operating systems, programming languages, standards bodies, and the like. Most
of these entities are not assigned any special markup, but the preferred
spellings are given here to aid authors in maintaining the consistency of
presentation in the Python documentation.
Other terms and words deserve special mention as well; these conventions should
be used to ensure consistency throughout the documentation:
CPU
For "central processing unit." Many style guides say this should be spelled
out on the first use (and if you must use it, do so!). For the Python
documentation, this abbreviation should be avoided since there's no
reasonable way to predict which occurrence will be the first seen by the
reader. It is better to use the word "processor" instead.
POSIX
The name assigned to a particular group of standards. This is always
uppercase.
Python
The name of our favorite programming language is always capitalized.
Unicode
The name of a character set and matching encoding. This is always written
capitalized.
Unix
The name of the operating system developed at AT&T Bell Labs in the early
1970s.
.. _Apple Publications Style Guide: http://developer.apple.com/documentation/UserExperience/Conceptual/APStyleGuide/AppleStyleGuide2003.pdf
View File
@ -1,131 +0,0 @@
.. highlightlang:: c
.. _building:
********************************************
Building C and C++ Extensions with distutils
********************************************
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>
Starting in Python 1.4, Python provided, on Unix, a special make file for
building make files that in turn build dynamically-linked extensions and custom
interpreters. Starting with Python 2.0, this mechanism (based on
:file:`Makefile.pre.in` and :file:`Setup` files) is no longer supported.
Building custom interpreters was rarely used, and extension modules can be
built using distutils.
Building an extension module using distutils requires that distutils be
installed on the build machine; distutils is included in Python 2.x and
available separately for Python 1.5. Since distutils also supports creation of binary
packages, users don't necessarily need a compiler and distutils to install the
extension.
A distutils package contains a driver script, :file:`setup.py`. This is a plain
Python file which, in the simplest case, could look like this::
from distutils.core import setup, Extension
module1 = Extension('demo',
sources = ['demo.c'])
setup (name = 'PackageName',
version = '1.0',
description = 'This is a demo package',
ext_modules = [module1])
With this :file:`setup.py`, and a file :file:`demo.c`, running ::
python setup.py build
will compile :file:`demo.c`, and produce an extension module named ``demo`` in
the :file:`build` directory. Depending on the system, the module file will end
up in a subdirectory :file:`build/lib.system`, and may have a name like
:file:`demo.so` or :file:`demo.pyd`.
In the :file:`setup.py`, all execution is performed by calling the ``setup``
function. This takes a variable number of keyword arguments, of which the
example above uses only a subset. Specifically, the example specifies
meta-information to build packages, and it specifies the contents of the
package. Normally, a package will contain additional modules, like Python
source modules, documentation, subpackages, etc. Please refer to the distutils
documentation in :ref:`distutils-index` to learn more about the features of
distutils; this section explains building extension modules only.
It is common to pre-compute arguments to :func:`setup`, to better structure the
driver script. In the example above, the ``ext_modules`` argument to
:func:`setup` is a list of extension modules, each of which is an instance of
the :class:`Extension` class. In the example, the instance defines an extension
named ``demo`` which is built by compiling a single source file, :file:`demo.c`.
In many cases, building an extension is more complex, since additional
preprocessor defines and libraries may be needed. This is demonstrated in the
example below. ::
from distutils.core import setup, Extension
module1 = Extension('demo',
define_macros = [('MAJOR_VERSION', '1'),
('MINOR_VERSION', '0')],
include_dirs = ['/usr/local/include'],
libraries = ['tcl83'],
library_dirs = ['/usr/local/lib'],
sources = ['demo.c'])
setup (name = 'PackageName',
version = '1.0',
description = 'This is a demo package',
author = 'Martin v. Loewis',
author_email = 'martin@v.loewis.de',
url = 'http://www.python.org/doc/current/ext/building.html',
long_description = '''
This is really just a demo package.
''',
ext_modules = [module1])
In this example, :func:`setup` is called with additional meta-information, which
is recommended when distribution packages have to be built. For the extension
itself, it specifies preprocessor defines, include directories, library
directories, and libraries. Depending on the compiler, distutils passes this
information in different ways to the compiler. For example, on Unix, this may
result in the compilation commands ::
gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -DMAJOR_VERSION=1 -DMINOR_VERSION=0 -I/usr/local/include -I/usr/local/include/python2.2 -c demo.c -o build/temp.linux-i686-2.2/demo.o
gcc -shared build/temp.linux-i686-2.2/demo.o -L/usr/local/lib -ltcl83 -o build/lib.linux-i686-2.2/demo.so
These lines are for demonstration purposes only; distutils users should trust
that distutils gets the invocations right.
.. _distributing:
Distributing your extension modules
===================================
When an extension has been successfully built, there are three ways to use it.
End-users will typically want to install the module; they do so by running ::
python setup.py install
Module maintainers should produce source packages; to do so, they run ::
python setup.py sdist
In some cases, additional files need to be included in a source distribution;
this is done through a :file:`MANIFEST.in` file; see the distutils documentation
for details.
If the source distribution has been built successfully, maintainers can also
create binary distributions. Depending on the platform, one of the following
commands can be used to do so. ::
python setup.py bdist_wininst
python setup.py bdist_rpm
python setup.py bdist_dumb
View File
@ -1,297 +0,0 @@
.. highlightlang:: c
.. _embedding:
***************************************
Embedding Python in Another Application
***************************************
The previous chapters discussed how to extend Python, that is, how to extend the
functionality of Python by attaching a library of C functions to it. It is also
possible to do it the other way around: enrich your C/C++ application by
embedding Python in it. Embedding provides your application with the ability to
implement some of the functionality of your application in Python rather than C
or C++. This can be used for many purposes; one example would be to allow users
to tailor the application to their needs by writing some scripts in Python. You
can also use it yourself if some of the functionality can be written in Python
more easily.
Embedding Python is similar to extending it, but not quite. The difference is
that when you extend Python, the main program of the application is still the
Python interpreter, while if you embed Python, the main program may have nothing
to do with Python --- instead, some parts of the application occasionally call
the Python interpreter to run some Python code.
So if you are embedding Python, you are providing your own main program. One of
the things this main program has to do is initialize the Python interpreter. At
the very least, you have to call the function :cfunc:`Py_Initialize` (on Mac OS,
call :cfunc:`PyMac_Initialize` instead). There are optional calls to pass
command line arguments to Python. Then later you can call the interpreter from
any part of the application.
There are several different ways to call the interpreter: you can pass a string
containing Python statements to :cfunc:`PyRun_SimpleString`, or you can pass a
stdio file pointer and a file name (for identification in error messages only)
to :cfunc:`PyRun_SimpleFile`. You can also call the lower-level operations
described in the previous chapters to construct and use Python objects.
A simple demo of embedding Python can be found in the directory
:file:`Demo/embed/` of the source distribution.
.. seealso::
:ref:`c-api-index`
The details of Python's C interface are given in this manual. A great deal of
necessary information can be found here.
.. _high-level-embedding:
Very High Level Embedding
=========================
The simplest form of embedding Python is the use of the very high level
interface. This interface is intended to execute a Python script without needing
to interact with the application directly. This can for example be used to
perform some operation on a file. ::
#include <Python.h>
int
main(int argc, char *argv[])
{
Py_Initialize();
PyRun_SimpleString("from time import time,ctime\n"
"print 'Today is',ctime(time())\n");
Py_Finalize();
return 0;
}
The above code first initializes the Python interpreter with
:cfunc:`Py_Initialize`, followed by the execution of a hard-coded Python script
that prints the date and time. Afterwards, the :cfunc:`Py_Finalize` call shuts
the interpreter down, followed by the end of the program. In a real program,
you may want to get the Python script from another source, perhaps a text-editor
routine, a file, or a database. Getting the Python code from a file can better
be done by using the :cfunc:`PyRun_SimpleFile` function, which saves you the
trouble of allocating memory space and loading the file contents.
.. _lower-level-embedding:
Beyond Very High Level Embedding: An overview
=============================================
The high level interface gives you the ability to execute arbitrary pieces of
Python code from your application, but exchanging data values is quite
cumbersome to say the least. If you want that, you should use lower level calls.
At the cost of having to write more C code, you can achieve almost anything.
It should be noted that extending Python and embedding Python are quite similar
activities, despite the different intent. Most topics discussed in the previous
chapters are still valid. To show this, consider what the extension code from
Python to C really does:
#. Convert data values from Python to C,
#. Perform a function call to a C routine using the converted values, and
#. Convert the data values from the call from C to Python.
When embedding Python, the interface code does:
#. Convert data values from C to Python,
#. Perform a function call to a Python interface routine using the converted
values, and
#. Convert the data values from the call from Python to C.
As you can see, the data conversion steps are simply swapped to accommodate the
different direction of the cross-language transfer. The only difference is the
routine that you call between both data conversions. When extending, you call a
C routine, when embedding, you call a Python routine.
This chapter will not discuss how to convert data from Python to C and vice
versa. Also, proper use of references and dealing with errors is assumed to be
understood. Since these aspects do not differ from extending the interpreter,
you can refer to earlier chapters for the required information.
.. _pure-embedding:
Pure Embedding
==============
The first program aims to execute a function in a Python script. Like in the
section about the very high level interface, the Python interpreter does not
directly interact with the application (but that will change in the next
section).
The code to run a function defined in a Python script is:
.. literalinclude:: ../includes/run-func.c
This code loads a Python script using ``argv[1]``, and calls the function named
in ``argv[2]``. Its integer arguments are the other values of the ``argv``
array. If you compile and link this program (let's call the finished executable
:program:`call`), and use it to execute a Python script, such as::
def multiply(a,b):
print "Will compute", a, "times", b
c = 0
for i in range(0, a):
c = c + b
return c
then the result should be::
$ call multiply multiply 3 2
Will compute 3 times 2
Result of call: 6
Although the program is quite large for its functionality, most of the code is
for data conversion between Python and C, and for error reporting. The
interesting part with respect to embedding Python starts with
.. % $
::
Py_Initialize();
pName = PyString_FromString(argv[1]);
/* Error checking of pName left out */
pModule = PyImport_Import(pName);
After initializing the interpreter, the script is loaded using
:cfunc:`PyImport_Import`. This routine needs a Python string as its argument,
which is constructed using the :cfunc:`PyString_FromString` data conversion
routine. ::
pFunc = PyObject_GetAttrString(pModule, argv[2]);
/* pFunc is a new reference */
if (pFunc && PyCallable_Check(pFunc)) {
...
}
Py_XDECREF(pFunc);
Once the script is loaded, the name we're looking for is retrieved using
:cfunc:`PyObject_GetAttrString`. If the name exists, and the object returned is
callable, you can safely assume that it is a function. The program then
proceeds by constructing a tuple of arguments as normal. The call to the Python
function is then made with::
pValue = PyObject_CallObject(pFunc, pArgs);
Upon return of the function, ``pValue`` is either *NULL* or it contains a
reference to the return value of the function. Be sure to release the reference
after examining the value.
.. _extending-with-embedding:
Extending Embedded Python
=========================
Until now, the embedded Python interpreter had no access to functionality from
the application itself. The Python API allows this by extending the embedded
interpreter. That is, the embedded interpreter gets extended with routines
provided by the application. While it sounds complex, it is not so bad. Simply
forget for a while that the application starts the Python interpreter. Instead,
consider the application to be a set of subroutines, and write some glue code
that gives Python access to those routines, just like you would write a normal
Python extension. For example::
static int numargs=0;
/* Return the number of arguments of the application command line */
static PyObject*
emb_numargs(PyObject *self, PyObject *args)
{
if(!PyArg_ParseTuple(args, ":numargs"))
return NULL;
return Py_BuildValue("i", numargs);
}
static PyMethodDef EmbMethods[] = {
{"numargs", emb_numargs, METH_VARARGS,
"Return the number of arguments received by the process."},
{NULL, NULL, 0, NULL}
};
Insert the above code just above the :cfunc:`main` function. Also, insert the
following two statements directly after :cfunc:`Py_Initialize`::
numargs = argc;
Py_InitModule("emb", EmbMethods);
These two lines initialize the ``numargs`` variable, and make the
:func:`emb.numargs` function accessible to the embedded Python interpreter.
With these extensions, the Python script can do things like ::
import emb
print "Number of arguments", emb.numargs()
In a real application, the methods will expose an API of the application to
Python.
.. % \section{For the future}
.. %
.. % You don't happen to have a nice library to get textual
.. % equivalents of numeric values do you :-) ?
.. % Callbacks here ? (I may be using information from that section
.. % ?!)
.. % threads
.. % code examples do not really behave well if errors happen
.. % (what to watch out for)
.. _embeddingincplusplus:
Embedding Python in C++
=======================
It is also possible to embed Python in a C++ program; precisely how this is done
will depend on the details of the C++ system used; in general you will need to
write the main program in C++, and use the C++ compiler to compile and link your
program. There is no need to recompile Python itself using C++.
.. _link-reqs:
Linking Requirements
====================
While the :program:`configure` script shipped with the Python sources will
correctly build Python to export the symbols needed by dynamically linked
extensions, this is not automatically inherited by applications which embed the
Python library statically, at least on Unix. This is an issue when the
application is linked to the static runtime library (:file:`libpython.a`) and
needs to load dynamic extensions (implemented as :file:`.so` files).
The problem is that some entry points are defined by the Python runtime solely
for extension modules to use. If the embedding application does not use any of
these entry points, some linkers will not include those entries in the symbol
table of the finished executable. Some additional options are needed to inform
the linker not to remove these symbols.
Determining the right options to use for any given platform can be quite
difficult, but fortunately the Python configuration already has those values.
To retrieve them from an installed Python interpreter, start an interactive
interpreter and have a short session like this::
>>> import distutils.sysconfig
>>> distutils.sysconfig.get_config_var('LINKFORSHARED')
'-Xlinker -export-dynamic'
.. index:: module: distutils.sysconfig
The contents of the string presented will be the options that should be used.
If the string is empty, there's no need to add any additional options. The
:const:`LINKFORSHARED` definition corresponds to the variable of the same name
in Python's top-level :file:`Makefile`.
File diff suppressed because it is too large
View File
@ -1,34 +0,0 @@
.. _extending-index:
##################################################
Extending and Embedding the Python Interpreter
##################################################
:Release: |version|
:Date: |today|
This document describes how to write modules in C or C++ to extend the Python
interpreter with new modules. Those modules can define new functions but also
new object types and their methods. The document also describes how to embed
the Python interpreter in another application, for use as an extension language.
Finally, it shows how to compile and link extension modules so that they can be
loaded dynamically (at run time) into the interpreter, if the underlying
operating system supports this feature.
This document assumes basic knowledge about Python. For an informal
introduction to the language, see :ref:`tutorial-index`. :ref:`reference-index`
gives a more formal definition of the language. :ref:`library-index` documents
the existing object types, functions and modules (both built-in and written in
Python) that give the language its wide application range.
For a detailed description of the whole Python/C API, see the separate
:ref:`c-api-index`.
.. toctree::
:maxdepth: 2
extending.rst
newtypes.rst
building.rst
windows.rst
embedding.rst
File diff suppressed because it is too large
View File
@ -1,280 +0,0 @@
.. highlightlang:: c
.. _building-on-windows:
****************************************
Building C and C++ Extensions on Windows
****************************************
.. %
This chapter briefly explains how to create a Windows extension module for
Python using Microsoft Visual C++, and follows with more detailed background
information on how it works. The explanatory material is useful for both the
Windows programmer learning to build Python extensions and the Unix programmer
interested in producing software which can be successfully built on both Unix
and Windows.
Module authors are encouraged to use the distutils approach for building
extension modules, instead of the one described in this section. You will still
need the C compiler that was used to build Python; typically Microsoft Visual
C++.
.. note::
This chapter mentions a number of filenames that include an encoded Python
version number. These filenames are represented with the version number shown
as ``XY``; in practice, ``'X'`` will be the major version number and ``'Y'``
will be the minor version number of the Python release you're working with. For
example, if you are using Python 2.2.1, ``XY`` will actually be ``22``.
.. _win-cookbook:
A Cookbook Approach
===================
There are two approaches to building extension modules on Windows, just as there
are on Unix: use the :mod:`distutils` package to control the build process, or
do things manually. The distutils approach works well for most extensions;
documentation on using :mod:`distutils` to build and package extension modules
is available in :ref:`distutils-index`. This section describes the manual
approach to building Python extensions written in C or C++.
To build extensions using these instructions, you need to have a copy of the
Python sources of the same version as your installed Python. You will need
Microsoft Visual C++ "Developer Studio"; project files are supplied for VC++
version 7.1, but you can use older versions of VC++. Notice that you should use
the same version of VC++ that was used to build Python itself. The example files
described here are distributed with the Python sources in the
:file:`PC\\example_nt\\` directory.
#. **Copy the example files** --- The :file:`example_nt` directory is a
subdirectory of the :file:`PC` directory, in order to keep all the PC-specific
files under the same directory in the source distribution. However, the
:file:`example_nt` directory can't actually be used from this location. You
first need to copy or move it up one level, so that :file:`example_nt` is a
sibling of the :file:`PC` and :file:`Include` directories. Do all your work
from within this new location.
#. **Open the project** --- From VC++, use the :menuselection:`File --> Open
Solution` dialog (not :menuselection:`File --> Open`!). Navigate to and select
the file :file:`example.sln`, in the *copy* of the :file:`example_nt` directory
you made above. Click Open.
#. **Build the example DLL** --- In order to check that everything is set up
right, try building:
#. Select a configuration. This step is optional. Choose
:menuselection:`Build --> Configuration Manager --> Active Solution Configuration`
and select either :guilabel:`Release` or :guilabel:`Debug`. If you skip this
step, VC++ will use the Debug configuration by default.
#. Build the DLL. Choose :menuselection:`Build --> Build Solution`. This
creates all intermediate and result files in a subdirectory called either
:file:`Debug` or :file:`Release`, depending on which configuration you selected
in the preceding step.
#. **Testing the debug-mode DLL** --- Once the Debug build has succeeded, bring
up a DOS box, and change to the :file:`example_nt\\Debug` directory. You should
now be able to repeat the following session (``C>`` is the DOS prompt, ``>>>``
is the Python prompt; note that build information and various debug output from
Python may not match this screen dump exactly)::
C>..\..\PCbuild\python_d
Adding parser accelerators ...
Done.
Python 2.2 (#28, Dec 19 2001, 23:26:37) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import example
[4897 refs]
>>> example.foo()
Hello, world
[4903 refs]
>>>
Congratulations! You've successfully built your first Python extension module.
#. **Creating your own project** --- Choose a name and create a directory for
it. Copy your C sources into it. Note that the module source file name does
not necessarily have to match the module name, but the name of the
initialization function should match the module name --- you can only import a
module :mod:`spam` if its initialization function is called :cfunc:`initspam`,
and it should call :cfunc:`Py_InitModule` with the string ``"spam"`` as its
first argument (use the minimal :file:`example.c` in this directory as a guide).
By convention, it lives in a file called :file:`spam.c` or :file:`spammodule.c`.
The output file should be called :file:`spam.dll` or :file:`spam.pyd` (the
latter is supported to avoid confusion with a system library :file:`spam.dll` to
which your module could be a Python interface) in Release mode, or
:file:`spam_d.dll` or :file:`spam_d.pyd` in Debug mode.
Now your options are:
#. Copy :file:`example.sln` and :file:`example.vcproj`, rename them to
:file:`spam.\*`, and edit them by hand, or
#. Create a brand new project; instructions are below.
In either case, copy :file:`example_nt\\example.def` to :file:`spam\\spam.def`,
and edit the new :file:`spam.def` so its second line contains the string
'``initspam``'. If you created a new project yourself, add the file
:file:`spam.def` to the project now. (This is an annoying little file with only
two lines. An alternative approach is to forget about the :file:`.def` file,
and add the option :option:`/export:initspam` somewhere to the Link settings, by
manually editing the setting in Project Properties dialog).
#. **Creating a brand new project** --- Use the :menuselection:`File --> New
--> Project` dialog to create a new Project Workspace. Select :guilabel:`Visual
C++ Projects/Win32/Win32 Project`, enter the name (``spam``), and make sure the
Location is set to the parent of the :file:`spam` directory you have created (which
should be a direct subdirectory of the Python build tree, a sibling of
:file:`Include` and :file:`PC`). Select Win32 as the platform (in my version,
this is the only choice). Make sure the Create new workspace radio button is
selected. Click OK.
You should now create the file :file:`spam.def` as instructed in the previous
section. Add the source files to the project, using :menuselection:`Project -->
Add Existing Item`. Set the pattern to ``*.*`` and select both :file:`spam.c`
and :file:`spam.def` and click OK. (Inserting them one by one is fine too.)
Now open the :menuselection:`Project --> spam properties` dialog. You only need
to change a few settings. Make sure :guilabel:`All Configurations` is selected
from the :guilabel:`Settings for:` dropdown list. Select the C/C++ tab. Choose
the General category in the popup menu at the top. Type the following text in
the entry box labeled :guilabel:`Additional Include Directories`::
..\Include,..\PC
Then, choose the General category in the Linker tab, and enter ::
..\PCbuild
in the text box labeled :guilabel:`Additional Library Directories`.
Now you need to add some mode-specific settings:
Select :guilabel:`Release` in the :guilabel:`Configuration` dropdown list.
Choose the :guilabel:`Link` tab, choose the :guilabel:`Input` category, and
append ``pythonXY.lib`` to the list in the :guilabel:`Additional Dependencies`
box.
Select :guilabel:`Debug` in the :guilabel:`Configuration` dropdown list, and
append ``pythonXY_d.lib`` to the list in the :guilabel:`Additional Dependencies`
box. Then click the C/C++ tab, select :guilabel:`Code Generation`, and select
:guilabel:`Multi-threaded Debug DLL` from the :guilabel:`Runtime library`
dropdown list.
Select :guilabel:`Release` again from the :guilabel:`Configuration` dropdown
list. Select :guilabel:`Multi-threaded DLL` from the :guilabel:`Runtime
library` dropdown list.
If your module creates a new type, you may have trouble with this line::
PyObject_HEAD_INIT(&PyType_Type)
Change it to::
PyObject_HEAD_INIT(NULL)
and add the following to the module initialization function::
MyObject_Type.ob_type = &PyType_Type;
Refer to section 3 of the `Python FAQ <http://www.python.org/doc/FAQ.html>`_ for
details on why you must do this.
.. _dynamic-linking:
Differences Between Unix and Windows
====================================
.. sectionauthor:: Chris Phoenix <cphoenix@best.com>
Unix and Windows use completely different paradigms for run-time loading of
code. Before you try to build a module that can be dynamically loaded, be aware
of how your system works.
In Unix, a shared object (:file:`.so`) file contains code to be used by the
program, and also the names of functions and data that it expects to find in the
program. When the file is joined to the program, all references to those
functions and data in the file's code are changed to point to the actual
locations in the program where the functions and data are placed in memory.
This is basically a link operation.
In Windows, a dynamic-link library (:file:`.dll`) file has no dangling
references. Instead, an access to functions or data goes through a lookup
table. So the DLL code does not have to be fixed up at runtime to refer to the
program's memory; instead, the code already uses the DLL's lookup table, and the
lookup table is modified at runtime to point to the functions and data.
In Unix, there is only one type of library file (:file:`.a`) which contains code
from several object files (:file:`.o`). During the link step to create a shared
object file (:file:`.so`), the linker may find that it doesn't know where an
identifier is defined. The linker will look for it in the object files in the
libraries; if it finds it, it will include all the code from that object file.
In Windows, there are two types of library, a static library and an import
library (both called :file:`.lib`). A static library is like a Unix :file:`.a`
file; it contains code to be included as necessary. An import library is
basically used only to reassure the linker that a certain identifier is legal,
and will be present in the program when the DLL is loaded. So the linker uses
the information from the import library to build the lookup table for using
identifiers that are not included in the DLL. When an application or a DLL is
linked, an import library may be generated, which will need to be used for all
future DLLs that depend on the symbols in the application or DLL.
Suppose you are building two dynamic-load modules, B and C, which should share
another block of code A. On Unix, you would *not* pass :file:`A.a` to the
linker for :file:`B.so` and :file:`C.so`; that would cause it to be included
twice, so that B and C would each have their own copy. In Windows, building
:file:`A.dll` will also build :file:`A.lib`. You *do* pass :file:`A.lib` to the
linker for B and C. :file:`A.lib` does not contain code; it just contains
information which will be used at runtime to access A's code.
In Windows, using an import library is sort of like using ``import spam``; it
gives you access to spam's names, but does not create a separate copy. On Unix,
linking with a library is more like ``from spam import *``; it does create a
separate copy.
.. _win-dlls:
Using DLLs in Practice
======================
.. sectionauthor:: Chris Phoenix <cphoenix@best.com>
Windows Python is built in Microsoft Visual C++; using other compilers may or
may not work (though Borland seems to). The rest of this section is MSVC++
specific.
When creating DLLs in Windows, you must pass :file:`pythonXY.lib` to the linker.
To build two DLLs, spam and ni (which uses C functions found in spam), you could
use these commands::
cl /LD /I/python/include spam.c ../libs/pythonXY.lib
cl /LD /I/python/include ni.c spam.lib ../libs/pythonXY.lib
The first command created three files: :file:`spam.obj`, :file:`spam.dll` and
:file:`spam.lib`. :file:`Spam.dll` does not contain any Python functions (such
as :cfunc:`PyArg_ParseTuple`), but it does know how to find the Python code
thanks to :file:`pythonXY.lib`.
The second command created :file:`ni.dll` (and :file:`.obj` and :file:`.lib`),
which knows how to find the necessary functions from spam, and also from the
Python executable.
Not every identifier is exported to the lookup table. If you want any other
modules (including Python) to be able to see your identifiers, you have to say
``__declspec(dllexport)``, as in ``void __declspec(dllexport) initspam(void)`` or
``PyObject __declspec(dllexport) *NiGetSpamData(void)``.
Developer Studio will throw in a lot of import libraries that you do not really
need, adding about 100K to your executable. To get rid of them, use the Project
Settings dialog, Link tab, to specify *ignore default libraries*. Add the
correct :file:`msvcrtxx.lib` to the list of libraries.

View File

@ -1,356 +0,0 @@
*************************
Python Advocacy HOWTO
*************************
:Author: A.M. Kuchling
:Release: 0.03
.. topic:: Abstract
It's usually difficult to get your management to accept open source software,
and Python is no exception to this rule. This document discusses reasons to use
Python, strategies for winning acceptance, facts and arguments you can use, and
cases where you *shouldn't* try to use Python.
Reasons to Use Python
=====================
There are several reasons to incorporate a scripting language into your
development process, and this section will discuss them, and why Python has some
properties that make it a particularly good choice.
Programmability
---------------
Programs are often organized in a modular fashion. Lower-level operations are
grouped together, and called by higher-level functions, which may in turn be
used as basic operations by still further upper levels.
For example, the lowest level might define a very low-level set of functions for
accessing a hash table. The next level might use hash tables to store the
headers of a mail message, mapping a header name like ``Date`` to a value such
as ``Tue, 13 May 1997 20:00:54 -0400``. A yet higher level may operate on
message objects, without knowing or caring that message headers are stored in a
hash table, and so forth.
Often, the lowest levels do very simple things; they implement a data structure
such as a binary tree or hash table, or they perform some simple computation,
such as converting a date string to a number. The higher levels then contain
logic connecting these primitive operations. Using this approach, the primitives
can be seen as basic building blocks which are then glued together to produce
the complete product.
Why is this design approach relevant to Python? Because Python is well suited
to functioning as such a glue language. A common approach is to write a Python
module that implements the lower level operations; for the sake of speed, the
implementation might be in C, Java, or even Fortran. Once the primitives are
available to Python programs, the logic underlying higher level operations is
written in the form of Python code. The high-level logic is then more
understandable, and easier to modify.
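As a small illustration (not from Ousterhout's paper; the function and data are
made up), the computationally heavy step below is delegated to the standard
:mod:`zlib` module, which is written in C, while the surrounding logic stays in
Python::

   import zlib

   def compress_report(lines):
       # Deciding what goes into the report is plain Python; the expensive
       # compression work is done by the C-coded zlib module.
       text = "\n".join(lines)
       return zlib.compress(text, 9)

   packed = compress_report(["header", "body", "footer"])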
John Ousterhout wrote a paper that explains this idea at greater length,
entitled "Scripting: Higher Level Programming for the 21st Century". I
recommend that you read this paper; see the references for the URL. Ousterhout
is the inventor of the Tcl language, and therefore argues that Tcl should be
used for this purpose; he only briefly refers to other languages such as Python,
Perl, and Lisp/Scheme, but in reality, Ousterhout's argument applies to
scripting languages in general, since you could equally write extensions for any
of the languages mentioned above.
Prototyping
-----------
In *The Mythical Man-Month*, Frederick Brooks suggests the following rule when
planning software projects: "Plan to throw one away; you will anyway." Brooks
is saying that the first attempt at a software design often turns out to be
wrong; unless the problem is very simple or you're an extremely good designer,
you'll find that new requirements and features become apparent once development
has actually started. If these new requirements can't be cleanly incorporated
into the program's structure, you're presented with two unpleasant choices:
hammer the new features into the program somehow, or scrap everything and write
a new version of the program, taking the new features into account from the
beginning.
Python provides you with a good environment for quickly developing an initial
prototype. That lets you get the overall program structure and logic right, and
you can fine-tune small details in the fast development cycle that Python
provides. Once you're satisfied with the GUI interface or program output, you
can translate the Python code into C++, Fortran, Java, or some other compiled
language.
Prototyping means you have to be careful not to use too many Python features
that are hard to implement in your other language. Using ``eval()``, or regular
expressions, or the :mod:`pickle` module, means that you're going to need C or
Java libraries for formula evaluation, regular expressions, and serialization,
for example. But it's not hard to avoid such tricky code, and in the end the
translation usually isn't very difficult. The resulting code can be rapidly
debugged, because any serious logical errors will have been removed from the
prototype, leaving only more minor slip-ups in the translation to track down.
This strategy builds on the earlier discussion of programmability. Using Python
as glue to connect lower-level components has obvious relevance for constructing
prototype systems. In this way Python can help you with development, even if
end users never come in contact with Python code at all. If the performance of
the Python version is adequate and corporate politics allow it, you may not need
to do a translation into C or Java, but it can still be faster to develop a
prototype and then translate it, instead of attempting to produce the final
version immediately.
One example of this development strategy is Microsoft Merchant Server. Version
1.0 was written in pure Python, by a company that subsequently was purchased by
Microsoft. Version 2.0 began to translate the code into C++, shipping with some
C++ code and some Python code. Version 3.0 didn't contain any Python at all; all
the code had been translated into C++. Even though the product doesn't contain
a Python interpreter, the Python language has still served a useful purpose by
speeding up development.
This is a very common use for Python. Past conference papers have also
described this approach for developing high-level numerical algorithms; see
David M. Beazley and Peter S. Lomdahl's paper "Feeding a Large-scale Physics
Application to Python" in the references for a good example. If an algorithm's
basic operations are things like "Take the inverse of this 4000x4000 matrix",
and are implemented in some lower-level language, then Python has almost no
additional performance cost; the extra time required for Python to evaluate an
expression like ``m.invert()`` is dwarfed by the cost of the actual computation.
It's particularly good for applications where seemingly endless tweaking is
required to get things right. GUI interfaces and Web sites are prime examples.
The Python code is also shorter and faster to write (once you're familiar with
Python), so it's easier to throw it away if you decide your approach was wrong;
if you'd spent two weeks working on it instead of just two hours, you might
waste time trying to patch up what you've got out of a natural reluctance to
admit that those two weeks were wasted. Truthfully, those two weeks haven't
been wasted, since you've learnt something about the problem and the technology
you're using to solve it, but it's human nature to view this as a failure of
some sort.
Simplicity and Ease of Understanding
------------------------------------
Python is definitely *not* a toy language that's only usable for small tasks.
The language features are general and powerful enough to enable it to be used
for many different purposes. It's useful at the small end, for 10- or 20-line
scripts, but it also scales up to larger systems that contain thousands of lines
of code.
However, this expressiveness doesn't come at the cost of an obscure or tricky
syntax. While Python has some dark corners that can lead to obscure code, there
are relatively few such corners, and proper design can isolate their use to only
a few classes or modules. It's certainly possible to write confusing code by
using too many features with too little concern for clarity, but most Python
code can look a lot like a slightly-formalized version of human-understandable
pseudocode.
In *The New Hacker's Dictionary*, Eric S. Raymond gives the following definition
for "compact":
.. epigraph::
Compact *adj.* Of a design, describes the valuable property that it can all be
apprehended at once in one's head. This generally means the thing created from
the design can be used with greater facility and fewer errors than an equivalent
tool that is not compact. Compactness does not imply triviality or lack of
power; for example, C is compact and FORTRAN is not, but C is more powerful than
FORTRAN. Designs become non-compact through accreting features and cruft that
don't merge cleanly into the overall design scheme (thus, some fans of Classic C
maintain that ANSI C is no longer compact).
(From http://www.catb.org/~esr/jargon/html/C/compact.html)
In this sense of the word, Python is quite compact, because the language has
just a few ideas, which are used in lots of places. Take namespaces, for
example. Import a module with ``import math``, and you create a new namespace
called ``math``. Classes are also namespaces that share many of the properties
of modules, and have a few of their own; for example, you can create instances
of a class. Instances? They're yet another namespace. Namespaces are currently
implemented as Python dictionaries, so they have the same methods as the
standard dictionary data type: .keys() returns all the keys, and so forth.
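A short sketch (not part of the original argument) shows the same namespace idea
at module, class, and instance level::

   import math                   # importing creates the "math" namespace
   print math.pi                 # look up a name inside that namespace

   class Point:
       "A class is a namespace too."
       def __init__(self, x, y):
           self.x = x            # each instance is yet another namespace
           self.y = y

   p = Point(2, 3)
   print p.__dict__              # instance namespaces really are dictionaries
   print Point.__dict__.keys()   # and so are class namespaces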
This simplicity arises from Python's development history. The language syntax
derives from different sources; ABC, a relatively obscure teaching language, is
one primary influence, and Modula-3 is another. (For more information about ABC
and Modula-3, consult their respective Web sites at http://www.cwi.nl/~steven/abc/
and http://www.m3.org.) Other features have come from C, Icon,
Algol-68, and even Perl. Python hasn't really innovated very much, but instead
has tried to keep the language small and easy to learn, building on ideas that
have been tried in other languages and found useful.
Simplicity is a virtue that should not be underestimated. It lets you learn the
language more quickly, and then rapidly write code, code that often works the
first time you run it.
Java Integration
----------------
If you're working with Java, Jython (http://www.jython.org/) is definitely worth
your attention. Jython is a re-implementation of Python in Java that compiles
Python code into Java bytecodes. The resulting environment has very tight,
almost seamless, integration with Java. It's trivial to access Java classes
from Python, and you can write Python classes that subclass Java classes.
Jython can be used for prototyping Java applications in much the same way
CPython is used, and it can also be used for test suites for Java code, or
embedded in a Java application to add scripting capabilities.
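As a tiny sketch (it runs only under Jython, and the class name is made up), a
Python class can subclass a Java class directly::

   from java.util import Random        # a plain Java class

   class DieRoller(Random):
       "A Python subclass of java.util.Random."
       def roll(self):
           return self.nextInt(6) + 1  # calls the inherited Java method

   r = DieRoller()
   print r.roll()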
Arguments and Rebuttals
=======================
Let's say that you've decided upon Python as the best choice for your
application. How can you convince your management, or your fellow developers,
to use Python? This section lists some common arguments against using Python,
and provides some possible rebuttals.
**Python is freely available software that doesn't cost anything. How good can
it be?**
Very good, indeed. These days Linux and Apache, two other pieces of open source
software, are becoming more respected as alternatives to commercial software,
but Python hasn't had all the publicity.
Python has been around for several years, with many users and developers.
Accordingly, the interpreter has been used by many people, and has gotten most
of the bugs shaken out of it. While bugs are still discovered at intervals,
they're usually either quite obscure (they'd have to be, for no one to have run
into them before) or they involve interfaces to external libraries. The
internals of the language itself are quite stable.
Having the source code should be viewed as making the software available for
peer review; people can examine the code, suggest (and implement) improvements,
and track down bugs. To find out more about the idea of open source code, along
with arguments and case studies supporting it, go to http://www.opensource.org.
**Who's going to support it?**
Python has a sizable community of developers, and the number is still growing.
The Internet community surrounding the language is an active one, and is worth
being considered another one of Python's advantages. Most questions posted to
the comp.lang.python newsgroup are quickly answered by someone.
Should you need to dig into the source code, you'll find it's clear and
well-organized, so it's not very difficult to write extensions and track down
bugs yourself. If you'd prefer to pay for support, there are companies and
individuals who offer commercial support for Python.
**Who uses Python for serious work?**
Lots of people; one interesting thing about Python is the surprising diversity
of applications that it's been used for. People are using Python to:
* Run Web sites
* Write GUI interfaces
* Control number-crunching code on supercomputers
* Make a commercial application scriptable by embedding the Python interpreter
inside it
* Process large XML data sets
* Build test suites for C or Java code
Whatever your application domain is, there's probably someone who's used Python
for something similar. Yet, despite being usable for such high-end
applications, Python's still simple enough to use for little jobs.
See http://wiki.python.org/moin/OrganizationsUsingPython for a list of some of
the organizations that use Python.
**What are the restrictions on Python's use?**
They're practically nonexistent. Consult the :file:`Misc/COPYRIGHT` file in the
source distribution, or http://www.python.org/doc/Copyright.html for the full
language, but it boils down to three conditions.
* You have to leave the copyright notice on the software; if you don't include
the source code in a product, you have to put the copyright notice in the
supporting documentation.
* Don't claim that the institutions that have developed Python endorse your
product in any way.
* If something goes wrong, you can't sue for damages. Practically all software
licences contain this condition.
Notice that you don't have to provide source code for anything that contains
Python or is built with it. Also, the Python interpreter and accompanying
documentation can be modified and redistributed in any way you like, and you
don't have to pay anyone any licensing fees at all.
**Why should we use an obscure language like Python instead of well-known
language X?**
I hope this HOWTO, and the documents listed in the final section, will help
convince you that Python isn't obscure, and has a healthily growing user base.
One word of advice: always present Python's positive advantages, instead of
concentrating on language X's failings. People want to know why a solution is
good, rather than why all the other solutions are bad. So instead of attacking
a competing solution on various grounds, simply show how Python's virtues can
help.
Useful Resources
================
http://www.pythonology.com/success
The Python Success Stories are a collection of stories from successful users of
Python, with the emphasis on business and corporate users.
.. % \term{\url{http://www.fsbassociates.com/books/pythonchpt1.htm}}
.. % The first chapter of \emph{Internet Programming with Python} also
.. % examines some of the reasons for using Python. The book is well worth
.. % buying, but the publishers have made the first chapter available on
.. % the Web.
http://home.pacbell.net/ouster/scripting.html
John Ousterhout's white paper on scripting is a good argument for the utility of
scripting languages, though naturally enough, he emphasizes Tcl, the language he
developed. Most of the arguments would apply to any scripting language.
http://www.python.org/workshops/1997-10/proceedings/beazley.html
The authors, David M. Beazley and Peter S. Lomdahl, describe their use of
Python at Los Alamos National Laboratory. It's another good example of how
Python can help get real work done. This quotation from the paper has been
echoed by many people:
.. epigraph::
Originally developed as a large monolithic application for massively parallel
processing systems, we have used Python to transform our application into a
flexible, highly modular, and extremely powerful system for performing
simulation, data analysis, and visualization. In addition, we describe how
Python has solved a number of important problems related to the development,
debugging, deployment, and maintenance of scientific software.
http://pythonjournal.cognizor.com/pyj1/Everitt-Feit_interview98-V1.html
This interview with Andy Feit, discussing Infoseek's use of Python, can be used
to show that choosing Python didn't introduce any difficulties into a company's
development process, and provided some substantial benefits.
.. % \term{\url{http://www.python.org/psa/Commercial.html}}
.. % Robin Friedrich wrote this document on how to support Python's use in
.. % commercial projects.
http://www.python.org/workshops/1997-10/proceedings/stein.ps
For the 6th Python conference, Greg Stein presented a paper that traced Python's
adoption and usage at a startup called eShop, and later at Microsoft.
http://www.opensource.org
Management may be doubtful of the reliability and usefulness of software that
wasn't written commercially. This site presents arguments that show how open
source software can have considerable advantages over closed-source software.
http://sunsite.unc.edu/LDP/HOWTO/mini/Advocacy.html
The Linux Advocacy mini-HOWTO was the inspiration for this document, and is also
well worth reading for general suggestions on winning acceptance for a new
technology, such as Linux or Python. In general, you won't make much progress
by simply attacking existing systems and complaining about their inadequacies;
this often ends up looking like unfocused whining. It's much better to point
out some of the many areas where Python is an improvement over other systems.

View File

@ -1,434 +0,0 @@
**********************************
Curses Programming with Python
**********************************
:Author: A.M. Kuchling, Eric S. Raymond
:Release: 2.02
.. topic:: Abstract
This document describes how to write text-mode programs with Python 2.x, using
the :mod:`curses` extension module to control the display.
What is curses?
===============
The curses library supplies a terminal-independent screen-painting and
keyboard-handling facility for text-based terminals; such terminals include
VT100s, the Linux console, and the simulated terminal provided by X11 programs
such as xterm and rxvt. Display terminals support various control codes to
perform common operations such as moving the cursor, scrolling the screen, and
erasing areas. Different terminals use widely differing codes, and often have
their own minor quirks.
In a world of X displays, one might ask "why bother"? It's true that
character-cell display terminals are an obsolete technology, but there are
niches in which being able to do fancy things with them is still valuable. One
is on small-footprint or embedded Unixes that don't carry an X server. Another
is for tools like OS installers and kernel configurators that may have to run
before X is available.
The curses library hides all the details of different terminals, and provides
the programmer with an abstraction of a display, containing multiple
non-overlapping windows. The contents of a window can be changed in various
ways---adding text, erasing it, changing its appearance---and the curses library
will automagically figure out what control codes need to be sent to the terminal
to produce the right output.
The curses library was originally written for BSD Unix; the later System V
versions of Unix from AT&T added many enhancements and new functions. BSD curses
is no longer maintained, having been replaced by ncurses, which is an
open-source implementation of the AT&T interface. If you're using an
open-source Unix such as Linux or FreeBSD, your system almost certainly uses
ncurses. Since most current commercial Unix versions are based on System V
code, all the functions described here will probably be available. The older
versions of curses carried by some proprietary Unixes may not support
everything, though.
No one has made a Windows port of the curses module. On a Windows platform, try
the Console module written by Fredrik Lundh. The Console module provides
cursor-addressable text output, plus full support for mouse and keyboard input,
and is available from http://effbot.org/efflib/console.
The Python curses module
------------------------
The Python module is a fairly simple wrapper over the C functions provided by
curses; if you're already familiar with curses programming in C, it's really
easy to transfer that knowledge to Python. The biggest difference is that the
Python interface makes things simpler, by merging different C functions such as
:func:`addstr`, :func:`mvaddstr`, and :func:`mvwaddstr` into a single
:meth:`addstr` method. You'll see this covered in more detail later.
This HOWTO is simply an introduction to writing text-mode programs with curses
and Python. It doesn't attempt to be a complete guide to the curses API; for
that, see the Python library guide's section on ncurses, and the C manual pages
for ncurses. It will, however, give you the basic ideas.
Starting and ending a curses application
========================================
Before doing anything, curses must be initialized. This is done by calling the
:func:`initscr` function, which will determine the terminal type, send any
required setup codes to the terminal, and create various internal data
structures. If successful, :func:`initscr` returns a window object representing
the entire screen; this is usually called ``stdscr``, after the name of the
corresponding C variable. ::
import curses
stdscr = curses.initscr()
Usually curses applications turn off automatic echoing of keys to the screen, in
order to be able to read keys and only display them under certain circumstances.
This requires calling the :func:`noecho` function. ::
curses.noecho()
Applications will also commonly need to react to keys instantly, without
requiring the Enter key to be pressed; this is called cbreak mode, as opposed to
the usual buffered input mode. ::
curses.cbreak()
Terminals usually return special keys, such as the cursor keys or navigation
keys such as Page Up and Home, as a multibyte escape sequence. While you could
write your application to expect such sequences and process them accordingly,
curses can do it for you, returning a special value such as
:const:`curses.KEY_LEFT`. To get curses to do the job, you'll have to enable
keypad mode. ::
stdscr.keypad(1)
Terminating a curses application is much easier than starting one. You'll need
to call ::
curses.nocbreak(); stdscr.keypad(0); curses.echo()
to reverse the curses-friendly terminal settings. Then call the :func:`endwin`
function to restore the terminal to its original operating mode. ::
curses.endwin()
A common problem when debugging a curses application is to get your terminal
messed up when the application dies without restoring the terminal to its
previous state. In Python this commonly happens when your code is buggy and
raises an uncaught exception. Keys are no longer echoed to the screen when
you type them, for example, which makes using the shell difficult.
In Python you can avoid these complications and make debugging much easier by
importing the module :mod:`curses.wrapper`. It supplies a :func:`wrapper`
function that takes a callable. It does the initializations described above,
and also initializes colors if color support is present. It then runs your
provided callable and finally deinitializes appropriately. The callable is
called inside a try-except block which catches exceptions, performs curses
deinitialization, and then passes the exception upwards. Thus, your terminal
won't be left in a funny state on exception.
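A minimal sketch of this pattern (the function name ``main`` is arbitrary)::

   import curses.wrapper

   def main(stdscr):
       stdscr.clear()
       stdscr.addstr(0, 0, "Hello from inside wrapper()")
       stdscr.refresh()
       stdscr.getch()              # wait for a key before exiting

   # Initializes curses, calls main(stdscr), and restores the terminal
   # even if main() raises an exception.
   curses.wrapper.wrapper(main)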
Windows and Pads
================
Windows are the basic abstraction in curses. A window object represents a
rectangular area of the screen, and supports various methods to display text,
erase it, allow the user to input strings, and so forth.
The ``stdscr`` object returned by the :func:`initscr` function is a window
object that covers the entire screen. Many programs may need only this single
window, but you might wish to divide the screen into smaller windows, in order
to redraw or clear them separately. The :func:`newwin` function creates a new
window of a given size, returning the new window object. ::
begin_x = 20 ; begin_y = 7
height = 5 ; width = 40
win = curses.newwin(height, width, begin_y, begin_x)
A word about the coordinate system used in curses: coordinates are always passed
in the order *y,x*, and the top-left corner of a window is coordinate (0,0).
This breaks a common convention for handling coordinates, where the *x*
coordinate usually comes first. This is an unfortunate difference from most
other computer applications, but it's been part of curses since it was first
written, and it's too late to change things now.
When you call a method to display or erase text, the effect doesn't immediately
show up on the display. This is because curses was originally written with slow
300-baud terminal connections in mind; with these terminals, minimizing the time
required to redraw the screen is very important. This lets curses accumulate
changes to the screen, and display them in the most efficient manner. For
example, if your program displays some characters in a window, and then clears
the window, there's no need to send the original characters because they'd never
be visible.
Accordingly, curses requires that you explicitly tell it to redraw windows,
using the :func:`refresh` method of window objects. In practice, this doesn't
really complicate programming with curses much. Most programs go into a flurry
of activity, and then pause waiting for a keypress or some other action on the
part of the user. All you have to do is to be sure that the screen has been
redrawn before pausing to wait for user input, by simply calling
``stdscr.refresh()`` or the :func:`refresh` method of some other relevant
window.
A pad is a special case of a window; it can be larger than the actual display
screen, and only a portion of it displayed at a time. Creating a pad simply
requires the pad's height and width, while refreshing a pad requires giving the
coordinates of the on-screen area where a subsection of the pad will be
displayed. ::
   pad = curses.newpad(100, 100)
   # These loops fill the pad with letters; this is
   # explained in the next section
   for y in range(0, 100):
       for x in range(0, 100):
           try: pad.addch(y, x, ord('a') + (x*x + y*y) % 26)
           except curses.error: pass

   # Displays a section of the pad in the middle of the screen
   pad.refresh(0, 0, 5, 5, 20, 75)
The :func:`refresh` call displays a section of the pad in the rectangle
extending from coordinate (5,5) to coordinate (20,75) on the screen; the upper
left corner of the displayed section is coordinate (0,0) on the pad. Beyond
that difference, pads are exactly like ordinary windows and support the same
methods.
If you have multiple windows and pads on screen there is a more efficient way to
go, which will prevent annoying screen flicker at refresh time. Use the
:meth:`noutrefresh` method of each window to update the data structure
representing the desired state of the screen; then change the physical screen to
match the desired state in one go with the function :func:`doupdate`. The
normal :meth:`refresh` method calls :func:`doupdate` as its last act.
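For example (a sketch; ``win1`` and ``win2`` are assumed to be window objects
created earlier with :func:`newwin`)::

   win1.addstr(0, 0, "Status: OK")
   win2.addstr(0, 0, "42 items")
   win1.noutrefresh()      # queue both updates...
   win2.noutrefresh()
   curses.doupdate()       # ...then repaint the physical screen once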
Displaying Text
===============
From a C programmer's point of view, curses may sometimes look like a twisty
maze of functions, all subtly different. For example, :func:`addstr` displays a
string at the current cursor location in the ``stdscr`` window, while
:func:`mvaddstr` moves to a given y,x coordinate first before displaying the
string. :func:`waddstr` is just like :func:`addstr`, but allows specifying a
window to use, instead of using ``stdscr`` by default. :func:`mvwaddstr` follows
similarly.
Fortunately the Python interface hides all these details; ``stdscr`` is a window
object like any other, and methods like :func:`addstr` accept multiple argument
forms. Usually there are four different forms.
+---------------------------------+-----------------------------------------------+
| Form | Description |
+=================================+===============================================+
| *str* or *ch* | Display the string *str* or character *ch* at |
| | the current position |
+---------------------------------+-----------------------------------------------+
| *str* or *ch*, *attr* | Display the string *str* or character *ch*, |
| | using attribute *attr* at the current |
| | position |
+---------------------------------+-----------------------------------------------+
| *y*, *x*, *str* or *ch* | Move to position *y,x* within the window, and |
| | display *str* or *ch* |
+---------------------------------+-----------------------------------------------+
| *y*, *x*, *str* or *ch*, *attr* | Move to position *y,x* within the window, and |
| | display *str* or *ch*, using attribute *attr* |
+---------------------------------+-----------------------------------------------+
Attributes allow displaying text in highlighted forms, such as in boldface,
underline, reverse code, or in color. They'll be explained in more detail in
the next subsection.
The :func:`addstr` function takes a Python string as the value to be displayed,
while the :func:`addch` functions take a character, which can be either a Python
string of length 1 or an integer. If it's a string, you're limited to
displaying characters between 0 and 255. SVr4 curses provides constants for
extension characters; these constants are integers greater than 255. For
example, :const:`ACS_PLMINUS` is a +/- symbol, and :const:`ACS_ULCORNER` is the
upper left corner of a box (handy for drawing borders).
Windows remember where the cursor was left after the last operation, so if you
leave out the *y,x* coordinates, the string or character will be displayed
wherever the last operation left off. You can also move the cursor with the
``move(y,x)`` method. Because some terminals always display a flashing cursor,
you may want to ensure that the cursor is positioned in some location where it
won't be distracting; it can be confusing to have the cursor blinking at some
apparently random location.
If your application doesn't need a blinking cursor at all, you can call
``curs_set(0)`` to make it invisible. Equivalently, and for compatibility with
older curses versions, there's a ``leaveok(bool)`` function. When *bool* is
true, the curses library will attempt to suppress the flashing cursor, and you
won't need to worry about leaving it in odd locations.
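For example, the following sketch (``stdscr`` as above) hides the cursor where
the terminal allows it and falls back to :func:`leaveok` otherwise::

   try:
       curses.curs_set(0)     # make the cursor invisible
   except curses.error:
       stdscr.leaveok(1)      # terminal can't hide it; at least stop tracking it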
Attributes and Color
--------------------
Characters can be displayed in different ways. Status lines in a text-based
application are commonly shown in reverse video; a text viewer may need to
highlight certain words. curses supports this by allowing you to specify an
attribute for each cell on the screen.
An attribute is an integer, each bit representing a different attribute. You can
try to display text with multiple attribute bits set, but curses doesn't
guarantee that all the possible combinations are available, or that they're all
visually distinct. That depends on the ability of the terminal being used, so
it's safest to stick to the most commonly available attributes, listed here.
+----------------------+--------------------------------------+
| Attribute | Description |
+======================+======================================+
| :const:`A_BLINK` | Blinking text |
+----------------------+--------------------------------------+
| :const:`A_BOLD` | Extra bright or bold text |
+----------------------+--------------------------------------+
| :const:`A_DIM` | Half bright text |
+----------------------+--------------------------------------+
| :const:`A_REVERSE` | Reverse-video text |
+----------------------+--------------------------------------+
| :const:`A_STANDOUT` | The best highlighting mode available |
+----------------------+--------------------------------------+
| :const:`A_UNDERLINE` | Underlined text |
+----------------------+--------------------------------------+
So, to display a reverse-video status line on the top line of the screen, you
could code::
   stdscr.addstr(0, 0, "Current mode: Typing mode",
                 curses.A_REVERSE)
   stdscr.refresh()
The curses library also supports color on those terminals that provide it. The
most common such terminal is probably the Linux console, followed by color
xterms.
To use color, you must call the :func:`start_color` function soon after calling
:func:`initscr`, to initialize the default color set (the
:func:`curses.wrapper.wrapper` function does this automatically). Once that's
done, the :func:`has_colors` function returns TRUE if the terminal in use can
actually display color. (Note: curses uses the American spelling 'color',
instead of the Canadian/British spelling 'colour'. If you're used to the
British spelling, you'll have to resign yourself to misspelling it for the sake
of these functions.)
The curses library maintains a finite number of color pairs, containing a
foreground (or text) color and a background color. You can get the attribute
value corresponding to a color pair with the :func:`color_pair` function; this
can be bitwise-OR'ed with other attributes such as :const:`A_REVERSE`, but
again, such combinations are not guaranteed to work on all terminals.
An example, which displays a line of text using color pair 1::
stdscr.addstr( "Pretty text", curses.color_pair(1) )
stdscr.refresh()
As I said before, a color pair consists of a foreground and background color.
:func:`start_color` initializes 8 basic colors when it activates color mode.
They are: 0:black, 1:red, 2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and
7:white. The curses module defines named constants for each of these colors:
:const:`curses.COLOR_BLACK`, :const:`curses.COLOR_RED`, and so forth.
The ``init_pair(n, f, b)`` function changes the definition of color pair *n* to
foreground color *f* and background color *b*. Color pair 0 is hard-wired to white
on black, and cannot be changed.
Let's put all this together. To change color 1 to red text on a white
background, you would call::
curses.init_pair(1, curses.COLOR_RED, curses.COLOR_WHITE)
When you change a color pair, any text already displayed using that color pair
will change to the new colors. You can also display new text in this color
with::
stdscr.addstr(0,0, "RED ALERT!", curses.color_pair(1) )
Very fancy terminals can change the definitions of the actual colors to a given
RGB value. This lets you change color 1, which is usually red, to purple or
blue or any other color you like. Unfortunately, the Linux console doesn't
support this, so I'm unable to try it out, and can't provide any examples. You
can check if your terminal can do this by calling :func:`can_change_color`,
which returns TRUE if the capability is there. If you're lucky enough to have
such a talented terminal, consult your system's man pages for more information.
User Input
==========
The curses library itself offers only very simple input mechanisms. Python's
support adds a text-input widget that makes up for some of this lack.
The most common way to get input to a window is to use its :meth:`getch` method.
:meth:`getch` pauses and waits for the user to hit a key, displaying it if
:func:`echo` has been called earlier. You can optionally specify a coordinate
to which the cursor should be moved before pausing.
It's possible to change this behavior with the method :meth:`nodelay`. After
``nodelay(1)``, :meth:`getch` for the window becomes non-blocking and returns
``curses.ERR`` (a value of -1) when no input is ready. There's also a
:func:`halfdelay` function, which can be used to (in effect) set a timer on each
:meth:`getch`; if no input becomes available within the number of milliseconds
specified as the argument to :func:`halfdelay`, curses raises an exception.
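A non-blocking polling loop might look like this sketch; ``stdscr`` is as
before, and ``do_background_work()`` is a made-up placeholder::

   stdscr.nodelay(1)
   while 1:
       c = stdscr.getch()
       if c == curses.ERR:
           do_background_work()   # no key was pressed; do something else
       elif c == ord('q'):
           break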
The :meth:`getch` method returns an integer; if it's between 0 and 255, it
represents the ASCII code of the key pressed. Values greater than 255 are
special keys such as Page Up, Home, or the cursor keys. You can compare the
value returned to constants such as :const:`curses.KEY_PPAGE`,
:const:`curses.KEY_HOME`, or :const:`curses.KEY_LEFT`. Usually the main loop of
your program will look something like this::
   while 1:
       c = stdscr.getch()
       if c == ord('p'): PrintDocument()
       elif c == ord('q'): break  # Exit the while()
       elif c == curses.KEY_HOME: x = y = 0
The :mod:`curses.ascii` module supplies ASCII class membership functions that
take either integer or 1-character-string arguments; these may be useful in
writing more readable tests for your command interpreters. It also supplies
conversion functions that take either integer or 1-character-string arguments
and return the same type. For example, :func:`curses.ascii.ctrl` returns the
control character corresponding to its argument.
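For instance, a small sketch (assuming ``stdscr`` and the setup shown earlier)::

   import curses.ascii

   c = stdscr.getch()
   if curses.ascii.isprint(c):                # ordinary printable character?
       stdscr.addch(c)
   elif c == curses.ascii.ctrl(ord('x')):     # Control-X?
       pass                                   # handle the command here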
There's also a method to retrieve an entire string, :meth:`getstr`. It isn't
used very often, because its functionality is quite limited; the only editing
keys available are the backspace key and the Enter key, which terminates the
string. It can optionally be limited to a fixed number of characters. ::
curses.echo() # Enable echoing of characters
# Get a 15-character string, with the cursor on the top line
s = stdscr.getstr(0,0, 15)
The Python :mod:`curses.textpad` module supplies something better. With it, you
can turn a window into a text box that supports an Emacs-like set of
keybindings. Various methods of the :class:`Textbox` class support editing with
input validation and gathering the edit results either with or without trailing
spaces. See the library documentation on :mod:`curses.textpad` for the
details.
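Inside an already-initialized curses program (for instance the callable passed
to :func:`wrapper`), the usual pattern looks roughly like this sketch; the
window geometry is arbitrary::

   import curses.textpad

   editwin = curses.newwin(5, 30, 2, 1)    # nlines, ncols, begin_y, begin_x
   box = curses.textpad.Textbox(editwin)
   box.edit()                  # let the user edit; Control-G ends editing
   contents = box.gather()     # the result, as a single string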
For More Information
====================
This HOWTO didn't cover some advanced topics, such as screen-scraping or
capturing mouse events from an xterm instance. But the Python library page for
the curses modules is now pretty complete. You should browse it next.
If you're in doubt about the detailed behavior of any of the ncurses entry
points, consult the manual pages for your curses implementation, whether it's
ncurses or a proprietary Unix vendor's. The manual pages will document any
quirks, and provide complete lists of all the functions, attributes, and
:const:`ACS_\*` characters available to you.
Because the curses API is so large, some functions aren't supported in the
Python interface, not because they're difficult to implement, but because no one
has needed them yet. Feel free to add them and then submit a patch. Also, we
don't yet have support for the menus or panels libraries associated with
ncurses; feel free to add that.
If you write an interesting little program, feel free to contribute it as
another demo. We can always use more of them!
The ncurses FAQ: http://dickey.his.com/ncurses/ncurses.faq.html

View File

@ -1,308 +0,0 @@
************************************
Idioms and Anti-Idioms in Python
************************************
:Author: Moshe Zadka
This document is placed in the public domain.
.. topic:: Abstract
This document can be considered a companion to the tutorial. It shows how to use
Python, and even more importantly, how *not* to use Python.
Language Constructs You Should Not Use
======================================
While Python has relatively few gotchas compared to other languages, it still
has some constructs which are only useful in corner cases, or are plain
dangerous.
from module import \*
---------------------
Inside Function Definitions
^^^^^^^^^^^^^^^^^^^^^^^^^^^
``from module import *`` is *invalid* inside function definitions. While many
versions of Python do not check for the invalidity, it does not make it more
valid, no more than having a smart lawyer makes a man innocent. Do not use it
like that ever. Even in versions where it was accepted, it made the function
execution slower, because the compiler could not be certain which names are
local and which are global. In Python 2.1 this construct causes warnings, and
sometimes even errors.
At Module Level
^^^^^^^^^^^^^^^
While it is valid to use ``from module import *`` at module level it is usually
a bad idea. For one, this loses an important property Python otherwise has ---
you can know where each toplevel name is defined by a simple "search" function
in your favourite editor. You also open yourself to trouble in the future, if
some module grows additional functions or classes.
One of the most awful questions asked on the newsgroup is why this code::
f = open("www")
f.read()
does not work. Of course, it works just fine (assuming you have a file called
"www".) But it does not work if somewhere in the module, the statement ``from os
import *`` is present. The :mod:`os` module has a function called :func:`open`
which returns an integer. While it is very useful, shadowing builtins is one of
its least useful properties.
Remember, you can never know for sure what names a module exports, so either
take what you need --- ``from module import name1, name2``, or keep them in the
module and access them on a per-need basis --- ``import module; print module.name``.
When It Is Just Fine
^^^^^^^^^^^^^^^^^^^^
There are situations in which ``from module import *`` is just fine:
* The interactive prompt. For example, ``from math import *`` makes Python an
amazing scientific calculator.
* When extending a module in C with a module in Python.
* When the module advertises itself as ``from import *`` safe.
Unadorned :keyword:`exec`, :func:`execfile` and friends
-------------------------------------------------------
The word "unadorned" refers to the use without an explicit dictionary, in which
case those constructs evaluate code in the *current* environment. This is
dangerous for the same reasons ``from import *`` is dangerous --- it might step
over variables you are counting on and mess up things for the rest of your code.
Simply do not do that.
Bad examples::

   >>> for name in sys.argv[1:]:
   ...     exec "%s=1" % name

   >>> def func(s, **kw):
   ...     for var, val in kw.items():
   ...         exec "s.%s=val" % var # invalid!

   >>> execfile("handler.py")
   >>> handle()

Good examples::

   >>> d = {}
   >>> for name in sys.argv[1:]:
   ...     d[name] = 1

   >>> def func(s, **kw):
   ...     for var, val in kw.items():
   ...         setattr(s, var, val)

   >>> d = {}
   >>> execfile("handle.py", d, d)
   >>> handle = d['handle']
   >>> handle()
from module import name1, name2
-------------------------------
This is a "don't" which is much weaker then the previous "don't"s but is still
something you should not do if you don't have good reasons to do that. The
reason it is usually bad idea is because you suddenly have an object which lives
in two seperate namespaces. When the binding in one namespace changes, the
binding in the other will not, so there will be a discrepancy between them. This
happens when, for example, one module is reloaded, or changes the definition of
a function at runtime.
Bad example::
   # foo.py
   a = 1

   # bar.py
   from foo import a
   if something():
       a = 2    # danger: foo.a != a
Good example::
   # foo.py
   a = 1

   # bar.py
   import foo
   if something():
       foo.a = 2
except:
-------
Python has the ``except:`` clause, which catches all exceptions. Since *every*
error in Python raises an exception, this makes many programming errors look
like runtime problems, and hinders the debugging process.
The following code shows a great example::
   try:
       foo = opne("file") # misspelled "open"
   except:
       sys.exit("could not open file!")
The second line triggers a :exc:`NameError` which is caught by the except
clause. The program will exit, and you will have no idea that this has nothing
to do with the readability of ``"file"``.
The example above is better written ::
   try:
       foo = opne("file") # will be changed to "open" as soon as we run it
   except IOError:
       sys.exit("could not open file")
There are some situations in which the ``except:`` clause is useful: for
example, in a framework when running callbacks, it is good not to let any
callback disturb the framework.
Exceptions
==========
Exceptions are a useful feature of Python. You should learn to raise them
whenever something unexpected occurs, and catch them only where you can do
something about them.
The following is a very popular anti-idiom ::
   def get_status(file):
       if not os.path.exists(file):
           print "file not found"
           sys.exit(1)
       return open(file).readline()
Consider the case the file gets deleted between the time the call to
:func:`os.path.exists` is made and the time :func:`open` is called. That means
the last line will raise an :exc:`IOError`. The same would happen if *file*
exists but has no read permission. Since testing this on a normal machine with
existing and non-existing files makes it seem bug-free, the results will look
fine in testing and the code will get shipped. Then an unhandled
:exc:`IOError` escapes to the user, who has to watch the ugly traceback.
Here is a better way to do it. ::
   def get_status(file):
       try:
           return open(file).readline()
       except (IOError, OSError):
           print "file not found"
           sys.exit(1)
In this version, *either* the file gets opened and the line is read (so it
works even on flaky NFS or SMB connections), or the message is printed and the
application aborted.
Still, :func:`get_status` makes too many assumptions --- that it will only be
used in a short running script, and not, say, in a long running server. Sure,
the caller could do something like ::
   try:
       status = get_status(log)
   except SystemExit:
       status = None
So, try to use as few ``except`` clauses as possible in your code --- those will
usually be a catch-all in :func:`main`, or inside calls which should always succeed.
So, the best version is probably ::
   def get_status(file):
       return open(file).readline()
The caller can deal with the exception if it wants (for example, if it tries
several files in a loop), or just let the exception filter upwards to *its*
caller.
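For instance, a caller that tries several candidate files might do something
like this (a sketch; the file names are made up)::

   status = None
   for name in ("local.log", "fallback.log"):
       try:
           status = get_status(name)
           break
       except IOError:
           continue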
This last version of :func:`get_status` is not very good either --- due to implementation details, the
file would not be closed when an exception is raised until the handler finishes,
and perhaps not at all in non-C implementations (e.g., Jython). ::
   def get_status(file):
       fp = open(file)
       try:
           return fp.readline()
       finally:
           fp.close()
Using the Batteries
===================
Every so often, people seem to be writing stuff in the Python library again,
usually poorly. While the occasional module has a poor interface, it is usually
much better to use the rich standard library and data types that come with
Python than inventing your own.
A useful module very few people know about is :mod:`os.path`. It always has the
correct path arithmetic for your operating system, and will usually be much
better than whatever you come up with yourself.
Compare::
# ugh!
return dir+"/"+file
# better
return os.path.join(dir, file)
More useful functions in :mod:`os.path`: :func:`basename`, :func:`dirname` and
:func:`splitext`.
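For example, on a Unix machine::

   import os.path

   path = os.path.join("/usr", "local", "lib", "setup.py")
   print os.path.dirname(path)     # /usr/local/lib
   print os.path.basename(path)    # setup.py
   print os.path.splitext(path)    # ('/usr/local/lib/setup', '.py')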
There are also many useful builtin functions people seem not to be aware of for
some reason: :func:`min` and :func:`max` can find the minimum/maximum of any
sequence with comparable semantics, for example, yet many people write their own
:func:`max`/:func:`min`. Another highly useful function is :func:`reduce`. A
classical use of :func:`reduce` is something like ::
import sys, operator
nums = map(float, sys.argv[1:])
print reduce(operator.add, nums)/len(nums)
This cute little script prints the average of all numbers given on the command
line. The :func:`reduce` adds up all the numbers, and the rest is just some
pre- and postprocessing.
On the same note, note that :func:`float`, :func:`int` and :func:`long` all
accept arguments of type string, and so are suited to parsing --- assuming you
are ready to deal with the :exc:`ValueError` they raise.
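For example, a tolerant little number parser (just a sketch)::

   def to_number(text):
       "Return an int if possible, otherwise a float."
       try:
           return int(text)
       except ValueError:
           return float(text)    # may still raise ValueError on junk input

   print to_number("42") + to_number("0.5")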
Using Backslash to Continue Statements
======================================
Since Python treats a newline as a statement terminator, and since statements
are often longer than is comfortable to put on one line, many people do::
   if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \
           calculate_number(10, 20) != forbulate(500, 360):
       pass
You should realize that this is dangerous: a stray space after the backslash would
make this line wrong, and stray spaces are notoriously hard to see in editors.
In this case, at least it would be a syntax error, but if the code was::
   value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \
           + calculate_number(10, 20)*forbulate(500, 360)
then it would just be subtly wrong.
It is usually much better to use the implicit continuation inside parentheses:
This version is bulletproof::
   value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]
            + calculate_number(10, 20)*forbulate(500, 360))

File diff suppressed because it is too large

View File

@ -1,25 +0,0 @@
***************
Python HOWTOs
***************
Python HOWTOs are documents that cover a single, specific topic,
and attempt to cover it fairly completely. Modelled on the Linux
Documentation Project's HOWTO collection, this collection is an
effort to foster documentation that's more detailed than the
Python Library Reference.
Currently, the HOWTOs are:
.. toctree::
:maxdepth: 1
advocacy.rst
pythonmac.rst
curses.rst
doanddont.rst
functional.rst
regex.rst
sockets.rst
unicode.rst
urllib2.rst

View File

@ -1,202 +0,0 @@
.. _using-on-mac:
***************************
Using Python on a Macintosh
***************************
:Author: Bob Savage <bobsavage@mac.com>
Python on a Macintosh running Mac OS X is in principle very similar to Python on
any other Unix platform, but there are a number of additional features such as
the IDE and the Package Manager that are worth pointing out.
The Mac-specific modules are documented in :ref:`mac-specific-services`.
Python on Mac OS 9 or earlier can be quite different from Python on Unix or
Windows, but is beyond the scope of this manual, as that platform is no longer
supported, starting with Python 2.4. See http://www.cwi.nl/~jack/macpython for
installers for the latest 2.3 release for Mac OS 9 and related documentation.
.. _getting-osx:
Getting and Installing MacPython
================================
Mac OS X 10.4 comes with Python 2.3 pre-installed by Apple. However, you are
encouraged to install the most recent version of Python from the Python website
(http://www.python.org). A "universal binary" build of Python 2.5, which runs
natively on the Mac's new Intel and legacy PPC CPUs, is available there.
What you get after installing is a number of things:
* A :file:`MacPython 2.5` folder in your :file:`Applications` folder. In here
you find IDLE, the development environment that is a standard part of official
Python distributions; PythonLauncher, which handles double-clicking Python
scripts from the Finder; and the "Build Applet" tool, which allows you to
package Python scripts as standalone applications on your system.
* A framework :file:`/Library/Frameworks/Python.framework`, which includes the
Python executable and libraries. The installer adds this location to your shell
path, and a symlink to the Python executable is placed in :file:`/usr/local/bin/`.
To uninstall MacPython, you can simply remove these three things (the folder,
the framework, and the symlink).
The Apple-provided build of Python is installed in
:file:`/System/Library/Frameworks/Python.framework` and :file:`/usr/bin/python`.
You should never modify or delete these, as they are
Apple-controlled and are used by Apple- or third-party software.
IDLE includes a help menu that allows you to access Python documentation. If you
are completely new to Python you should start reading the tutorial introduction
in that document.
If you are familiar with Python on other Unix platforms you should read the
section on running Python scripts from the Unix shell.
How to run a Python script
--------------------------
The best way to get started with Python on Mac OS X is through the IDLE
integrated development environment; see section :ref:`ide` and use the Help menu
when the IDE is running.
If you want to run Python scripts from the Terminal window command line or from
the Finder you first need an editor to create your script. Mac OS X comes with a
number of standard Unix command line editors, :program:`vim` and
:program:`emacs` among them. If you want a more Mac-like editor,
:program:`BBEdit` or :program:`TextWrangler` from Bare Bones Software (see
http://www.barebones.com/products/bbedit/index.shtml) are good choices, as is
:program:`TextMate` (see http://macromates.com/). Other editors include
:program:`Gvim` (http://macvim.org) and :program:`Aquamacs`
(http://aquamacs.org).
To run your script from the Terminal window you must make sure that
:file:`/usr/local/bin` is in your shell search path.
To run your script from the Finder you have two options:
* Drag it to :program:`PythonLauncher`
* Select :program:`PythonLauncher` as the default application to open your
script (or any .py script) through the Finder Info window and double-click it.
:program:`PythonLauncher` has various preferences to control how your script is
launched. Option-dragging allows you to change these for one invocation, or use
its Preferences menu to change things globally.
.. _osx-gui-scripts:
Running scripts with a GUI
--------------------------
With older versions of Python, there is one Mac OS X quirk that you need to be
aware of: programs that talk to the Aqua window manager (in other words,
anything that has a GUI) need to be run in a special way. Use :program:`pythonw`
instead of :program:`python` to start such scripts.
With Python 2.5, you can use either :program:`python` or :program:`pythonw`.
Configuration
-------------
Python on OS X honors all standard Unix environment variables such as
:envvar:`PYTHONPATH`, but setting these variables for programs started from the
Finder is non-standard as the Finder does not read your :file:`.profile` or
:file:`.cshrc` at startup. You need to create a file
:file:`~/.MacOSX/environment.plist`. See Apple's Technical Document QA1067 for
details.
For more information on installing Python packages in MacPython, see section
:ref:`mac-package-manager`.
.. _ide:
The IDE
=======
MacPython ships with the standard IDLE development environment. A good
introduction to using IDLE can be found at
http://hkn.eecs.berkeley.edu/~dyoo/python/idle_intro/index.html.
.. _mac-package-manager:
Installing Additional Python Packages
=====================================
There are several methods to install additional Python packages:
* http://pythonmac.org/packages/ contains selected compiled packages for Python
2.5, 2.4, and 2.3.
* Packages can be installed via the standard Python distutils mode (``python
setup.py install``).
* Many packages can also be installed via the :program:`setuptools` extension.
GUI Programming on the Mac
==========================
There are several options for building GUI applications on the Mac with Python.
*PyObjC* is a Python binding to Apple's Objective-C/Cocoa framework, which is
the foundation of most modern Mac development. Information on PyObjC is
available from http://pyobjc.sourceforge.net.
The standard Python GUI toolkit is :mod:`Tkinter`, based on the cross-platform
Tk toolkit (http://www.tcl.tk). An Aqua-native version of Tk is bundled with OS
X by Apple, and the latest version can be downloaded and installed from
http://www.activestate.com; it can also be built from source.
*wxPython* is another popular cross-platform GUI toolkit that runs natively on
Mac OS X. Packages and documentation are available from http://www.wxpython.org.
*PyQt* is another popular cross-platform GUI toolkit that runs natively on Mac
OS X. More information can be found at
http://www.riverbankcomputing.co.uk/pyqt/.
Distributing Python Applications on the Mac
===========================================
The "Build Applet" tool that is placed in the MacPython 2.5 folder is fine for
packaging small Python scripts on your own machine to run as a standard Mac
application. This tool, however, is not robust enough to distribute Python
applications to other users.
The standard tool for deploying standalone Python applications on the Mac is
:program:`py2app`. More information on installing and using py2app can be found
at http://undefined.org/python/#py2app.
Application Scripting
=====================
Python can also be used to script other Mac applications via Apple's Open
Scripting Architecture (OSA); see http://appscript.sourceforge.net. Appscript is
a high-level, user-friendly Apple event bridge that allows you to control
scriptable Mac OS X applications using ordinary Python scripts. Appscript makes
Python a serious alternative to Apple's own *AppleScript* language for
automating your Mac. A related package, *PyOSA*, is an OSA language component
for the Python scripting language, allowing Python code to be executed by any
OSA-enabled application (Script Editor, Mail, iTunes, etc.). PyOSA makes Python
a full peer to AppleScript.
Other Resources
===============
The MacPython mailing list is an excellent support resource for Python users and
developers on the Mac:
http://www.python.org/community/sigs/current/pythonmac-sig/
Another useful resource is the MacPython wiki:
http://wiki.python.org/moin/MacPython

File diff suppressed because it is too large

@ -1,421 +0,0 @@
****************************
Socket Programming HOWTO
****************************
:Author: Gordon McMillan
.. topic:: Abstract
Sockets are used nearly everywhere, but are one of the most severely
misunderstood technologies around. This is a 10,000 foot overview of sockets.
It's not really a tutorial - you'll still have work to do in getting things
operational. It doesn't cover the fine points (and there are a lot of them), but
I hope it will give you enough background to begin using them decently.
Sockets
=======
Sockets are used nearly everywhere, but are one of the most severely
misunderstood technologies around. This is a 10,000 foot overview of sockets.
It's not really a tutorial - you'll still have work to do in getting things
working. It doesn't cover the fine points (and there are a lot of them), but I
hope it will give you enough background to begin using them decently.
I'm only going to talk about INET sockets, but they account for at least 99% of
the sockets in use. And I'll only talk about STREAM sockets - unless you really
know what you're doing (in which case this HOWTO isn't for you!), you'll get
better behavior and performance from a STREAM socket than anything else. I will
try to clear up the mystery of what a socket is, as well as some hints on how to
work with blocking and non-blocking sockets. But I'll start by talking about
blocking sockets. You'll need to know how they work before dealing with
non-blocking sockets.
Part of the trouble with understanding these things is that "socket" can mean a
number of subtly different things, depending on context. So first, let's make a
distinction between a "client" socket - an endpoint of a conversation, and a
"server" socket, which is more like a switchboard operator. The client
application (your browser, for example) uses "client" sockets exclusively; the
web server it's talking to uses both "server" sockets and "client" sockets.
History
-------
Of the various forms of IPC (*Inter Process Communication*), sockets are by far
the most popular. On any given platform, there are likely to be other forms of
IPC that are faster, but for cross-platform communication, sockets are about the
only game in town.
They were invented in Berkeley as part of the BSD flavor of Unix. They spread
like wildfire with the Internet. With good reason --- the combination of sockets
with INET makes talking to arbitrary machines around the world unbelievably easy
(at least compared to other schemes).
Creating a Socket
=================
Roughly speaking, when you clicked on the link that brought you to this page,
your browser did something like the following::
import socket

#create an INET, STREAMing socket
s = socket.socket(
    socket.AF_INET, socket.SOCK_STREAM)
#now connect to the web server on port 80
# - the normal http port
s.connect(("www.mcmillan-inc.com", 80))
When the ``connect`` completes, the socket ``s`` can now be used to send in a
request for the text of this page. The same socket will read the reply, and then
be destroyed. That's right - destroyed. Client sockets are normally only used
for one exchange (or a small set of sequential exchanges).
What happens in the web server is a bit more complex. First, the web server
creates a "server socket". ::
#create an INET, STREAMing socket
serversocket = socket.socket(
    socket.AF_INET, socket.SOCK_STREAM)
#bind the socket to a public host,
# and a well-known port
serversocket.bind((socket.gethostname(), 80))
#become a server socket
serversocket.listen(5)
A couple of things to notice: we used ``socket.gethostname()`` so that the socket
would be visible to the outside world. If we had used ``s.bind(('', 80))`` or
``s.bind(('localhost', 80))`` or ``s.bind(('127.0.0.1', 80))`` we would still
have a "server" socket, but one that was only visible within the same machine.
A second thing to note: low-numbered ports are usually reserved for "well-known"
services (HTTP, SNMP, etc.). If you're playing around, use a nice high number
(four digits).
Finally, the argument to ``listen`` tells the socket library that we want it to
queue up as many as 5 connect requests (the normal max) before refusing outside
connections. If the rest of the code is written properly, that should be plenty.
OK, now we have a "server" socket, listening on port 80. Now we enter the
mainloop of the web server::
while 1:
    #accept connections from outside
    (clientsocket, address) = serversocket.accept()
    #now do something with the clientsocket
    #in this case, we'll pretend this is a threaded server
    ct = client_thread(clientsocket)
    ct.run()
There are actually three general ways in which this loop could work: dispatching
a thread to handle ``clientsocket``, creating a new process to handle
``clientsocket``, or restructuring this app to use non-blocking sockets and
multiplexing between our "server" socket and any active ``clientsocket``\ s using
``select``. More about that later. The important thing to understand now is
this: this is *all* a "server" socket does. It doesn't send any data. It doesn't
receive any data. It just produces "client" sockets. Each ``clientsocket`` is
created in response to some *other* "client" socket doing a ``connect()`` to the
host and port we're bound to. As soon as we've created that ``clientsocket``, we
go back to listening for more connections. The two "clients" are free to chat it
up - they are using some dynamically allocated port which will be recycled when
the conversation ends.
IPC
---
If you need fast IPC between two processes on one machine, you should look into
whatever form of shared memory the platform offers. A simple protocol based
around shared memory and locks or semaphores is by far the fastest technique.
If you do decide to use sockets, bind the "server" socket to ``'localhost'``. On
most platforms, this will take a shortcut around a couple of layers of network
code and be quite a bit faster.
Using a Socket
==============
The first thing to note is that the web browser's "client" socket and the web
server's "client" socket are identical beasts. That is, this is a "peer to peer"
conversation. Or to put it another way, *as the designer, you will have to
decide what the rules of etiquette are for a conversation*. Normally, the
``connect``\ ing socket starts the conversation, by sending in a request, or
perhaps a signon. But that's a design decision - it's not a rule of sockets.
Now there are two sets of verbs to use for communication. You can use ``send``
and ``recv``, or you can transform your client socket into a file-like beast and
use ``read`` and ``write``. The latter is the way Java presents its sockets.
I'm not going to talk about it here, except to warn you that you need to use
``flush`` on sockets. These are buffered "files", and a common mistake is to
``write`` something, and then ``read`` for a reply. Without a ``flush`` in
there, you may wait forever for the reply, because the request may still be in
your output buffer.
Now we come to the major stumbling block of sockets - ``send`` and ``recv`` operate
on the network buffers. They do not necessarily handle all the bytes you hand
them (or expect from them), because their major focus is handling the network
buffers. In general, they return when the associated network buffers have been
filled (``send``) or emptied (``recv``). They then tell you how many bytes they
handled. It is *your* responsibility to call them again until your message has
been completely dealt with.
When a ``recv`` returns 0 bytes, it means the other side has closed (or is in
the process of closing) the connection. You will not receive any more data on
this connection. Ever. You may be able to send data successfully; I'll talk
about that some more later.
A protocol like HTTP uses a socket for only one transfer. The client sends a
request, then reads a reply. That's it. The socket is discarded. This means that
a client can detect the end of the reply by receiving 0 bytes.
But if you plan to reuse your socket for further transfers, you need to realize
that *there is no "EOT" (End of Transfer) on a socket.* I repeat: if a socket
``send`` or ``recv`` returns after handling 0 bytes, the connection has been
broken. If the connection has *not* been broken, you may wait on a ``recv``
forever, because the socket will *not* tell you that there's nothing more to
read (for now). Now if you think about that a bit, you'll come to realize a
fundamental truth of sockets: *messages must either be fixed length* (yuck), *or
be delimited* (shrug), *or indicate how long they are* (much better), *or end by
shutting down the connection*. The choice is entirely yours, (but some ways are
righter than others).
Assuming you don't want to end the connection, the simplest solution is a fixed
length message::
class mysocket:
    '''demonstration class only
       - coded for clarity, not efficiency
    '''

    def __init__(self, sock=None):
        if sock is None:
            self.sock = socket.socket(
                socket.AF_INET, socket.SOCK_STREAM)
        else:
            self.sock = sock

    def connect(self, host, port):
        self.sock.connect((host, port))

    def mysend(self, msg):
        totalsent = 0
        while totalsent < MSGLEN:
            sent = self.sock.send(msg[totalsent:])
            if sent == 0:
                raise RuntimeError, \
                    "socket connection broken"
            totalsent = totalsent + sent

    def myreceive(self):
        msg = ''
        while len(msg) < MSGLEN:
            chunk = self.sock.recv(MSGLEN-len(msg))
            if chunk == '':
                raise RuntimeError, \
                    "socket connection broken"
            msg = msg + chunk
        return msg
The sending code here is usable for almost any messaging scheme - in Python you
send strings, and you can use ``len()`` to determine their length (even if they
have embedded ``\0`` characters). It's mostly the receiving code that gets more
complex. (And in C, it's not much worse, except you can't use ``strlen`` if the
message has embedded ``\0``\ s.)
The easiest enhancement is to make the first character of the message an
indicator of message type, and have the type determine the length. Now you have
two ``recv``\ s - the first to get (at least) that first character so you can
look up the length, and the second in a loop to get the rest. If you decide to
go the delimited route, you'll be receiving in some arbitrary chunk size, (4096
or 8192 is frequently a good match for network buffer sizes), and scanning what
you've received for a delimiter.
One complication to be aware of: if your conversational protocol allows multiple
messages to be sent back to back (without some kind of reply), and you pass
``recv`` an arbitrary chunk size, you may end up reading the start of a
following message. You'll need to put that aside and hold onto it, until it's
needed.
Prefixing the message with its length (say, as 5 numeric characters) gets more
complex, because (believe it or not), you may not get all 5 characters in one
``recv``. In playing around, you'll get away with it; but in high network loads,
your code will very quickly break unless you use two ``recv`` loops - the first
to determine the length, the second to get the data part of the message. Nasty.
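Purely as an illustrative sketch (the helper names below are invented, not part
of the original HOWTO), the two-loop approach with a 5-character length prefix
might look something like this::

def recv_exactly(sock, count):
    # keep calling recv until we have exactly `count` bytes
    data = ''
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if chunk == '':
            raise RuntimeError, "socket connection broken"
        data = data + chunk
    return data

def recv_message(sock):
    # first loop: read the 5-character length prefix
    length = int(recv_exactly(sock, 5))
    # second loop: read the message body itself
    return recv_exactly(sock, length)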
This is also when you'll discover that ``send`` does not always manage to get
rid of everything in one pass. And despite having read this, you will eventually
get bit by it!
In the interests of space, building your character (and preserving my
competitive position), the remaining enhancements are left as an exercise for
the reader. Let's move on to cleaning up.
Binary Data
-----------
It is perfectly possible to send binary data over a socket. The major problem is
that not all machines use the same formats for binary data. For example, a
Motorola chip will represent a 16 bit integer with the value 1 as the two hex
bytes 00 01. Intel and DEC, however, are byte-reversed - that same 1 is 01 00.
Socket libraries have calls for converting 16 and 32 bit integers - ``ntohl,
htonl, ntohs, htons`` where "n" means *network* and "h" means *host*, "s" means
*short* and "l" means *long*. Where network order is host order, these do
nothing, but where the machine is byte-reversed, these swap the bytes around
appropriately.
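In Python, the same conversions are available through the :mod:`socket` and
:mod:`struct` modules; a brief sketch (not from the original HOWTO)::

import socket, struct

value = 1
# 16-bit host/network byte order conversions
print socket.htons(value), socket.ntohs(socket.htons(value))
# struct packs/unpacks explicitly in network (big-endian) order with '!'
packed = struct.pack('!H', value)            # the two bytes '\x00\x01'
print repr(packed), struct.unpack('!H', packed)[0]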
In these days of 32-bit machines, the ASCII representation of binary data is
frequently smaller than the binary representation. That's because a surprising
amount of the time, all those longs have the value 0, or maybe 1. The string "0"
would be two bytes, while binary is four. Of course, this doesn't fit well with
fixed-length messages. Decisions, decisions.
Disconnecting
=============
Strictly speaking, you're supposed to use ``shutdown`` on a socket before you
``close`` it. The ``shutdown`` is an advisory to the socket at the other end.
Depending on the argument you pass it, it can mean "I'm not going to send
anymore, but I'll still listen", or "I'm not listening, good riddance!". Most
socket libraries, however, are so used to programmers neglecting to use this
piece of etiquette that normally a ``close`` is the same as ``shutdown();
close()``. So in most situations, an explicit ``shutdown`` is not needed.
One way to use ``shutdown`` effectively is in an HTTP-like exchange. The client
sends a request and then does a ``shutdown(1)``. This tells the server "This
client is done sending, but can still receive." The server can detect "EOF" by
a receive of 0 bytes. It can assume it has the complete request. The server
sends a reply. If the ``send`` completes successfully then, indeed, the client
was still receiving.
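A client-side sketch of that exchange (the host and the request string here are
made up for illustration)::

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("www.example.com", 80))
s.sendall("GET / HTTP/1.0\r\n\r\n")
s.shutdown(1)                # done sending, but still receiving
reply = ''
while 1:
    chunk = s.recv(4096)
    if chunk == '':
        break                # server closed its side: the reply is complete
    reply = reply + chunk
s.close()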
Python takes the automatic shutdown a step further, and says that when a socket
is garbage collected, it will automatically do a ``close`` if it's needed. But
relying on this is a very bad habit. If your socket just disappears without
doing a ``close``, the socket at the other end may hang indefinitely, thinking
you're just being slow. *Please* ``close`` your sockets when you're done.
When Sockets Die
----------------
Probably the worst thing about using blocking sockets is what happens when the
other side comes down hard (without doing a ``close``). Your socket is likely to
hang. TCP is a reliable protocol, and it will wait a long, long time
before giving up on a connection. If you're using threads, the entire thread is
essentially dead. There's not much you can do about it. As long as you aren't
doing something dumb, like holding a lock while doing a blocking read, the
thread isn't really consuming much in the way of resources. Do *not* try to kill
the thread - part of the reason that threads are more efficient than processes
is that they avoid the overhead associated with the automatic recycling of
resources. In other words, if you do manage to kill the thread, your whole
process is likely to be screwed up.
Non-blocking Sockets
====================
If you've understood the preceding, you already know most of what you need to
know about the mechanics of using sockets. You'll still use the same calls, in
much the same ways. It's just that, if you do it right, your app will be almost
inside-out.
In Python, you use ``socket.setblocking(0)`` to make it non-blocking. In C, it's
more complex, (for one thing, you'll need to choose between the BSD flavor
``O_NONBLOCK`` and the almost indistinguishable Posix flavor ``O_NDELAY``, which
is completely different from ``TCP_NODELAY``), but it's the exact same idea. You
do this after creating the socket, but before using it. (Actually, if you're
nuts, you can switch back and forth.)
The major mechanical difference is that ``send``, ``recv``, ``connect`` and
``accept`` can return without having done anything. You have (of course) a
number of choices. You can check return code and error codes and generally drive
yourself crazy. If you don't believe me, try it sometime. Your app will grow
large, buggy and suck CPU. So let's skip the brain-dead solutions and do it
right.
Use ``select``.
In C, coding ``select`` is fairly complex. In Python, it's a piece of cake, but
it's close enough to the C version that if you understand ``select`` in Python,
you'll have little trouble with it in C. ::
ready_to_read, ready_to_write, in_error = \
    select.select(
        potential_readers,
        potential_writers,
        potential_errs,
        timeout)
You pass ``select`` three lists: the first contains all sockets that you might
want to try reading; the second all the sockets you might want to try writing
to, and the last (normally left empty) those that you want to check for errors.
You should note that a socket can go into more than one list. The ``select``
call is blocking, but you can give it a timeout. This is generally a sensible
thing to do - give it a nice long timeout (say a minute) unless you have good
reason to do otherwise.
In return, you will get three lists. They have the sockets that are actually
readable, writable and in error. Each of these lists is a subset (possibly
empty) of the corresponding list you passed in. And if you put a socket in more
than one input list, it will only be (at most) in one output list.
If a socket is in the output readable list, you can be
as-close-to-certain-as-we-ever-get-in-this-business that a ``recv`` on that
socket will return *something*. Same idea for the writable list. You'll be able
to send *something*. Maybe not all you want to, but *something* is better than
nothing. (Actually, any reasonably healthy socket will return as writable - it
just means outbound network buffer space is available.)
If you have a "server" socket, put it in the potential_readers list. If it comes
out in the readable list, your ``accept`` will (almost certainly) work. If you
have created a new socket to ``connect`` to someone else, put it in the
potential_writers list. If it shows up in the writable list, you have a decent
chance that it has connected.
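To make that concrete, here is a skeletal (and much simplified) event loop along
those lines; the port number and the trivial handler are invented for the
example::

import select, socket

def handle_data(sock, data):
    # placeholder: a real server would parse the data and respond here
    print repr(data)

serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind((socket.gethostname(), 8080))
serversocket.listen(5)
serversocket.setblocking(0)

potential_readers = [serversocket]
while 1:
    readable, writable, in_error = select.select(potential_readers, [], [], 60)
    for sock in readable:
        if sock is serversocket:
            # the "server" socket is readable: a connect is waiting
            clientsocket, address = serversocket.accept()
            clientsocket.setblocking(0)
            potential_readers.append(clientsocket)
        else:
            data = sock.recv(4096)
            if data == '':
                # the other side closed; stop watching this socket
                potential_readers.remove(sock)
                sock.close()
            else:
                handle_data(sock, data)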
One very nasty problem with ``select``: if somewhere in those input lists of
sockets is one which has died a nasty death, the ``select`` will fail. You then
need to loop through every single damn socket in all those lists and do a
``select([sock],[],[],0)`` until you find the bad one. That timeout of 0 means
it won't take long, but it's ugly.
Actually, ``select`` can be handy even with blocking sockets. It's one way of
determining whether you will block - the socket returns as readable when there's
something in the buffers. However, this still doesn't help with the problem of
determining whether the other end is done, or just busy with something else.
**Portability alert**: On Unix, ``select`` works with both sockets and files.
Don't try this on Windows. On Windows, ``select`` works with sockets
only. Also note that in C, many of the more advanced socket options are done
differently on Windows. In fact, on Windows I usually use threads (which work
very, very well) with my sockets. Face it, if you want any kind of performance,
your code will look very different on Windows than on Unix. (I haven't the
foggiest how you do this stuff on a Mac.)
Performance
-----------
There's no question that the fastest sockets code uses non-blocking sockets and
select to multiplex them. You can put together something that will saturate a
LAN connection without putting any strain on the CPU. The trouble is that an app
written this way can't do much of anything else - it needs to be ready to
shuffle bytes around at all times.
Assuming that your app is actually supposed to do something more than that,
threading is the optimal solution, (and using non-blocking sockets will be
faster than using blocking sockets). Unfortunately, threading support in Unixes
varies both in API and quality. So the normal Unix solution is to fork a
subprocess to deal with each connection. The overhead for this is significant
(and don't do this on Windows - the overhead of process creation is enormous
there). It also means that unless each subprocess is completely independent,
you'll need to use another form of IPC, say a pipe, or shared memory and
semaphores, to communicate between the parent and child processes.
Finally, remember that even though blocking sockets are somewhat slower than
non-blocking, in many cases they are the "right" solution. After all, if your
app is driven by the data it receives over a socket, there's not much sense in
complicating the logic just so your app can wait on ``select`` instead of
``recv``.


@ -1,732 +0,0 @@
*****************
Unicode HOWTO
*****************
:Release: 1.02
This HOWTO discusses Python's support for Unicode, and explains various problems
that people commonly encounter when trying to work with Unicode.
Introduction to Unicode
=======================
History of Character Codes
--------------------------
In 1968, the American Standard Code for Information Interchange, better known by
its acronym ASCII, was standardized. ASCII defined numeric codes for various
characters, with the numeric values running from 0 to
127. For example, the lowercase letter 'a' is assigned 97 as its code
value.
ASCII was an American-developed standard, so it only defined unaccented
characters. There was an 'e', but no 'é' or 'Í'. This meant that languages
which required accented characters couldn't be faithfully represented in ASCII.
(Actually the missing accents matter for English, too, which contains words such
as 'naïve' and 'café', and some publications have house styles which require
spellings such as 'coöperate'.)
For a while people just wrote programs that didn't display accents. I remember
looking at Apple ][ BASIC programs, published in French-language publications in
the mid-1980s, that had lines like these::
PRINT "FICHER EST COMPLETE."
PRINT "CARACTERE NON ACCEPTE."
Those messages should contain accents, and they just look wrong to someone who
can read French.
In the 1980s, almost all personal computers were 8-bit, meaning that bytes could
hold values ranging from 0 to 255. ASCII codes only went up to 127, so some
machines assigned values between 128 and 255 to accented characters. Different
machines had different codes, however, which led to problems exchanging files.
Eventually various commonly used sets of values for the 128-255 range emerged.
Some were true standards, defined by the International Standards Organization,
and some were **de facto** conventions that were invented by one company or
another and managed to catch on.
255 characters aren't very many. For example, you can't fit both the accented
characters used in Western Europe and the Cyrillic alphabet used for Russian
into the 128-255 range because there are more than 127 such characters.
You could write files using different codes (all your Russian files in a coding
system called KOI8, all your French files in a different coding system called
Latin1), but what if you wanted to write a French document that quotes some
Russian text? In the 1980s people began to want to solve this problem, and the
Unicode standardization effort began.
Unicode started out using 16-bit characters instead of 8-bit characters. 16
bits means you have 2^16 = 65,536 distinct values available, making it possible
to represent many different characters from many different alphabets; an initial
goal was to have Unicode contain the alphabets for every single human language.
It turns out that even 16 bits isn't enough to meet that goal, and the modern
Unicode specification uses a wider range of codes, 0-1,114,111 (0x10ffff in
base-16).
There's a related ISO standard, ISO 10646. Unicode and ISO 10646 were
originally separate efforts, but the specifications were merged with the 1.1
revision of Unicode.
(This discussion of Unicode's history is highly simplified. I don't think the
average Python programmer needs to worry about the historical details; consult
the Unicode consortium site listed in the References for more information.)
Definitions
-----------
A **character** is the smallest possible component of a text. 'A', 'B', 'C',
etc., are all different characters. So are 'È' and 'Í'. Characters are
abstractions, and vary depending on the language or context you're talking
about. For example, the symbol for ohms (Ω) is usually drawn much like the
capital letter omega (Ω) in the Greek alphabet (they may even be the same in
some fonts), but these are two different characters that have different
meanings.
The Unicode standard describes how characters are represented by **code
points**. A code point is an integer value, usually denoted in base 16. In the
standard, a code point is written using the notation U+12ca to mean the
character with value 0x12ca (4810 decimal). The Unicode standard contains a lot
of tables listing characters and their corresponding code points::
0061 'a'; LATIN SMALL LETTER A
0062 'b'; LATIN SMALL LETTER B
0063 'c'; LATIN SMALL LETTER C
...
007B '{'; LEFT CURLY BRACKET
Strictly, these definitions imply that it's meaningless to say 'this is
character U+12ca'. U+12ca is a code point, which represents some particular
character; in this case, it represents the character 'ETHIOPIC SYLLABLE WI'. In
informal contexts, this distinction between code points and characters will
sometimes be forgotten.
A character is represented on a screen or on paper by a set of graphical
elements that's called a **glyph**. The glyph for an uppercase A, for example,
is two diagonal strokes and a horizontal stroke, though the exact details will
depend on the font being used. Most Python code doesn't need to worry about
glyphs; figuring out the correct glyph to display is generally the job of a GUI
toolkit or a terminal's font renderer.
Encodings
---------
To summarize the previous section: a Unicode string is a sequence of code
points, which are numbers from 0 to 0x10ffff. This sequence needs to be
represented as a set of bytes (meaning, values from 0-255) in memory. The rules
for translating a Unicode string into a sequence of bytes are called an
**encoding**.
The first encoding you might think of is an array of 32-bit integers. In this
representation, the string "Python" would look like this::
   P           y           t           h           o           n
0x50 00 00 00 79 00 00 00 74 00 00 00 68 00 00 00 6f 00 00 00 6e 00 00 00
   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
This representation is straightforward but using it presents a number of
problems.
1. It's not portable; different processors order the bytes differently.
2. It's very wasteful of space. In most texts, the majority of the code points
are less than 127, or less than 255, so a lot of space is occupied by zero
bytes. The above string takes 24 bytes compared to the 6 bytes needed for an
ASCII representation. Increased RAM usage doesn't matter too much (desktop
computers have megabytes of RAM, and strings aren't usually that large), but
expanding our usage of disk and network bandwidth by a factor of 4 is
intolerable.
3. It's not compatible with existing C functions such as ``strlen()``, so a new
family of wide string functions would need to be used.
4. Many Internet standards are defined in terms of textual data, and can't
handle content with embedded zero bytes.
Generally people don't use this encoding, instead choosing other encodings that
are more efficient and convenient.
Encodings don't have to handle every possible Unicode character, and most
encodings don't. For example, Python's default encoding is the 'ascii'
encoding. The rules for converting a Unicode string into the ASCII encoding are
simple; for each code point:
1. If the code point is < 128, each byte is the same as the value of the code
point.
2. If the code point is 128 or greater, the Unicode string can't be represented
in this encoding. (Python raises a :exc:`UnicodeEncodeError` exception in this
case.)
Latin-1, also known as ISO-8859-1, is a similar encoding. Unicode code points
0-255 are identical to the Latin-1 values, so converting to this encoding simply
requires converting code points to byte values; if a code point larger than 255
is encountered, the string can't be encoded into Latin-1.
Encodings don't have to be simple one-to-one mappings like Latin-1. Consider
IBM's EBCDIC, which was used on IBM mainframes. Letter values weren't in one
block: 'a' through 'i' had values from 129 to 137, but 'j' through 'r' were 145
through 153. If you wanted to use EBCDIC as an encoding, you'd probably use
some sort of lookup table to perform the conversion, but this is largely an
internal detail.
UTF-8 is one of the most commonly used encodings. UTF stands for "Unicode
Transformation Format", and the '8' means that 8-bit numbers are used in the
encoding. (There's also a UTF-16 encoding, but it's less frequently used than
UTF-8.) UTF-8 uses the following rules:
1. If the code point is <128, it's represented by the corresponding byte value.
2. If the code point is between 128 and 0x7ff, it's turned into two byte values
between 128 and 255.
3. Code points >0x7ff are turned into three- or four-byte sequences, where each
byte of the sequence is between 128 and 255.
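To see these rules in action, here is a small interactive sketch, jumping ahead
a little to the Python notation described below (the characters chosen are
arbitrary)::

>>> u'a'.encode('utf-8')         # code point below 128: one byte
'a'
>>> u'\u00e9'.encode('utf-8')    # between 128 and 0x7ff: two bytes
'\xc3\xa9'
>>> u'\u20ac'.encode('utf-8')    # above 0x7ff: three bytes
'\xe2\x82\xac'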
UTF-8 has several convenient properties:
1. It can handle any Unicode code point.
2. A Unicode string is turned into a string of bytes containing no embedded zero
bytes. This avoids byte-ordering issues, and means UTF-8 strings can be
processed by C functions such as ``strcpy()`` and sent through protocols that
can't handle zero bytes.
3. A string of ASCII text is also valid UTF-8 text.
4. UTF-8 is fairly compact; the majority of code points are turned into two
bytes, and values less than 128 occupy only a single byte.
5. If bytes are corrupted or lost, it's possible to determine the start of the
next UTF-8-encoded code point and resynchronize. It's also unlikely that
random 8-bit data will look like valid UTF-8.
References
----------
The Unicode Consortium site at <http://www.unicode.org> has character charts, a
glossary, and PDF versions of the Unicode specification. Be prepared for some
difficult reading. <http://www.unicode.org/history/> is a chronology of the
origin and development of Unicode.
To help understand the standard, Jukka Korpela has written an introductory guide
to reading the Unicode character tables, available at
<http://www.cs.tut.fi/~jkorpela/unicode/guide.html>.
Roman Czyborra wrote another explanation of Unicode's basic principles; it's at
<http://czyborra.com/unicode/characters.html>. Czyborra has written a number of
other Unicode-related documents, available from <http://www.czyborra.com>.
Two other good introductory articles were written by Joel Spolsky
<http://www.joelonsoftware.com/articles/Unicode.html> and Jason Orendorff
<http://www.jorendorff.com/articles/unicode/>. If this introduction didn't make
things clear to you, you should try reading one of these alternate articles
before continuing.
Wikipedia entries are often helpful; see the entries for "character encoding"
<http://en.wikipedia.org/wiki/Character_encoding> and UTF-8
<http://en.wikipedia.org/wiki/UTF-8>, for example.
Python's Unicode Support
========================
Now that you've learned the rudiments of Unicode, we can look at Python's
Unicode features.
The Unicode Type
----------------
Unicode strings are expressed as instances of the :class:`unicode` type, one of
Python's repertoire of built-in types. It derives from an abstract type called
:class:`basestring`, which is also an ancestor of the :class:`str` type; you can
therefore check if a value is a string type with ``isinstance(value,
basestring)``. Under the hood, Python represents Unicode strings as either 16-
or 32-bit integers, depending on how the Python interpreter was compiled.
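You can check which of the two your interpreter was compiled with by inspecting
``sys.maxunicode`` (a quick sketch; the value shown here assumes a 16-bit
build)::

>>> import sys
>>> sys.maxunicode    # 65535 on a 16-bit (UCS-2) build, 1114111 on a 32-bit (UCS-4) build
65535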
The :func:`unicode` constructor has the signature ``unicode(string[, encoding,
errors])``. All of its arguments should be 8-bit strings. The first argument
is converted to Unicode using the specified encoding; if you leave off the
``encoding`` argument, the ASCII encoding is used for the conversion, so
characters greater than 127 will be treated as errors::
>>> unicode('abcdef')
u'abcdef'
>>> s = unicode('abcdef')
>>> type(s)
<type 'unicode'>
>>> unicode('abcdef' + chr(255))
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 6:
ordinal not in range(128)
The ``errors`` argument specifies the response when the input string can't be
converted according to the encoding's rules. Legal values for this argument are
'strict' (raise a ``UnicodeDecodeError`` exception), 'replace' (add U+FFFD,
'REPLACEMENT CHARACTER'), or 'ignore' (just leave the character out of the
Unicode result). The following examples show the differences::
>>> unicode('\x80abc', errors='strict')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0:
ordinal not in range(128)
>>> unicode('\x80abc', errors='replace')
u'\ufffdabc'
>>> unicode('\x80abc', errors='ignore')
u'abc'
Encodings are specified as strings containing the encoding's name. Python 2.4
comes with roughly 100 different encodings; see the Python Library Reference at
<http://docs.python.org/lib/standard-encodings.html> for a list. Some encodings
have multiple names; for example, 'latin-1', 'iso_8859_1' and '8859' are all
synonyms for the same encoding.
One-character Unicode strings can also be created with the :func:`unichr`
built-in function, which takes integers and returns a Unicode string of length 1
that contains the corresponding code point. The reverse operation is the
built-in :func:`ord` function that takes a one-character Unicode string and
returns the code point value::
>>> unichr(40960)
u'\ua000'
>>> ord(u'\ua000')
40960
Instances of the :class:`unicode` type have many of the same methods as the
8-bit string type for operations such as searching and formatting::
>>> s = u'Was ever feather so lightly blown to and fro as this multitude?'
>>> s.count('e')
5
>>> s.find('feather')
9
>>> s.find('bird')
-1
>>> s.replace('feather', 'sand')
u'Was ever sand so lightly blown to and fro as this multitude?'
>>> s.upper()
u'WAS EVER FEATHER SO LIGHTLY BLOWN TO AND FRO AS THIS MULTITUDE?'
Note that the arguments to these methods can be Unicode strings or 8-bit
strings. 8-bit strings will be converted to Unicode before carrying out the
operation; Python's default ASCII encoding will be used, so characters greater
than 127 will cause an exception::
>>> s.find('Was\x9f')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0x9f in position 3: ordinal not in range(128)
>>> s.find(u'Was\x9f')
-1
Much Python code that operates on strings will therefore work with Unicode
strings without requiring any changes to the code. (Input and output code needs
more updating for Unicode; more on this later.)
Another important method is ``.encode([encoding], [errors='strict'])``, which
returns an 8-bit string version of the Unicode string, encoded in the requested
encoding. The ``errors`` parameter is the same as the parameter of the
``unicode()`` constructor, with one additional possibility; as well as 'strict',
'ignore', and 'replace', you can also pass 'xmlcharrefreplace' which uses XML's
character references. The following example shows the different results::
>>> u = unichr(40960) + u'abcd' + unichr(1972)
>>> u.encode('utf-8')
'\xea\x80\x80abcd\xde\xb4'
>>> u.encode('ascii')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character '\ua000' in position 0: ordinal not in range(128)
>>> u.encode('ascii', 'ignore')
'abcd'
>>> u.encode('ascii', 'replace')
'?abcd?'
>>> u.encode('ascii', 'xmlcharrefreplace')
'&#40960;abcd&#1972;'
Python's 8-bit strings have a ``.decode([encoding], [errors])`` method that
interprets the string using the given encoding::
>>> u = unichr(40960) + u'abcd' + unichr(1972) # Assemble a string
>>> utf8_version = u.encode('utf-8') # Encode as UTF-8
>>> type(utf8_version), utf8_version
(<type 'str'>, '\xea\x80\x80abcd\xde\xb4')
>>> u2 = utf8_version.decode('utf-8') # Decode using UTF-8
>>> u == u2 # The two strings match
True
The low-level routines for registering and accessing the available encodings are
found in the :mod:`codecs` module. However, the encoding and decoding functions
returned by this module are usually more low-level than is comfortable, so I'm
not going to describe the :mod:`codecs` module here. If you need to implement a
completely new encoding, you'll need to learn about the :mod:`codecs` module
interfaces, but implementing encodings is a specialized task that also won't be
covered here. Consult the Python documentation to learn more about this module.
The most commonly used part of the :mod:`codecs` module is the
:func:`codecs.open` function which will be discussed in the section on input and
output.
Unicode Literals in Python Source Code
--------------------------------------
In Python source code, Unicode literals are written as strings prefixed with the
'u' or 'U' character: ``u'abcdefghijk'``. Specific code points can be written
using the ``\u`` escape sequence, which is followed by four hex digits giving
the code point. The ``\U`` escape sequence is similar, but expects 8 hex
digits, not 4.
Unicode literals can also use the same escape sequences as 8-bit strings,
including ``\x``, but ``\x`` only takes two hex digits so it can't express an
arbitrary code point. Octal escapes can go up to U+01ff, which is octal 777.
::
>>> s = u"a\xac\u1234\u20ac\U00008000"
           ^^^^ two-digit hex escape
               ^^^^^^ four-digit Unicode escape
                           ^^^^^^^^^^ eight-digit Unicode escape
>>> for c in s: print ord(c),
...
97 172 4660 8364 32768
Using escape sequences for code points greater than 127 is fine in small doses,
but becomes an annoyance if you're using many accented characters, as you would
in a program with messages in French or some other accent-using language. You
can also assemble strings using the :func:`unichr` built-in function, but this is
even more tedious.
Ideally, you'd want to be able to write literals in your language's natural
encoding. You could then edit Python source code with your favorite editor
which would display the accented characters naturally, and have the right
characters used at runtime.
Python supports writing Unicode literals in any encoding, but you have to
declare the encoding being used. This is done by including a special comment as
either the first or second line of the source file::
#!/usr/bin/env python
# -*- coding: latin-1 -*-
u = u'abcdé'
print ord(u[-1])
The syntax is inspired by Emacs's notation for specifying variables local to a
file. Emacs supports many different variables, but Python only supports
'coding'. The ``-*-`` symbols indicate that the comment is special; within
them, you must supply the name ``coding`` and the name of your chosen encoding,
separated by ``':'``.
If you don't include such a comment, the default encoding used will be ASCII.
Versions of Python before 2.4 were Euro-centric and assumed Latin-1 as a default
encoding for string literals; in Python 2.4, characters greater than 127 still
work but result in a warning. For example, the following program has no
encoding declaration::
#!/usr/bin/env python
u = u'abcdé'
print ord(u[-1])
When you run it with Python 2.4, it will output the following warning::
amk:~$ python p263.py
sys:1: DeprecationWarning: Non-ASCII character '\xe9'
in file p263.py on line 2, but no encoding declared;
see http://www.python.org/peps/pep-0263.html for details
Unicode Properties
------------------
The Unicode specification includes a database of information about code points.
For each code point that's defined, the information includes the character's
name, its category, the numeric value if applicable (Unicode has characters
representing the Roman numerals and fractions such as one-third and
four-fifths). There are also properties related to the code point's use in
bidirectional text and other display-related properties.
The following program displays some information about several characters, and
prints the numeric value of one particular character::
import unicodedata

u = unichr(233) + unichr(0x0bf2) + unichr(3972) + unichr(6000) + unichr(13231)

for i, c in enumerate(u):
    print i, '%04x' % ord(c), unicodedata.category(c),
    print unicodedata.name(c)

# Get numeric value of second character
print unicodedata.numeric(u[1])
When run, this prints::
0 00e9 Ll LATIN SMALL LETTER E WITH ACUTE
1 0bf2 No TAMIL NUMBER ONE THOUSAND
2 0f84 Mn TIBETAN MARK HALANTA
3 1770 Lo TAGBANWA LETTER SA
4 33af So SQUARE RAD OVER S SQUARED
1000.0
The category codes are abbreviations describing the nature of the character.
These are grouped into categories such as "Letter", "Number", "Punctuation", or
"Symbol", which in turn are broken up into subcategories. To take the codes
from the above output, ``'Ll'`` means 'Letter, lowercase', ``'No'`` means
"Number, other", ``'Mn'`` is "Mark, nonspacing", and ``'So'`` is "Symbol,
other". See
<http://www.unicode.org/Public/UNIDATA/UCD.html#General_Category_Values> for a
list of category codes.
References
----------
The Unicode and 8-bit string types are described in the Python library reference
at :ref:`typesseq`.
The documentation for the :mod:`unicodedata` module.
The documentation for the :mod:`codecs` module.
Marc-André Lemburg gave a presentation at EuroPython 2002 titled "Python and
Unicode". A PDF version of his slides is available at
<http://www.egenix.com/files/python/Unicode-EPC2002-Talk.pdf>, and is an
excellent overview of the design of Python's Unicode features.
Reading and Writing Unicode Data
================================
Once you've written some code that works with Unicode data, the next problem is
input/output. How do you get Unicode strings into your program, and how do you
convert Unicode into a form suitable for storage or transmission?
It's possible that you may not need to do anything depending on your input
sources and output destinations; you should check whether the libraries used in
your application support Unicode natively. XML parsers often return Unicode
data, for example. Many relational databases also support Unicode-valued
columns and can return Unicode values from an SQL query.
Unicode data is usually converted to a particular encoding before it gets
written to disk or sent over a socket. It's possible to do all the work
yourself: open a file, read an 8-bit string from it, and convert the string with
``unicode(str, encoding)``. However, the manual approach is not recommended.
One problem is the multi-byte nature of encodings; one Unicode character can be
represented by several bytes. If you want to read the file in arbitrary-sized
chunks (say, 1K or 4K), you need to write error-handling code to catch the case
where only part of the bytes encoding a single Unicode character are read at the
end of a chunk. One solution would be to read the entire file into memory and
then perform the decoding, but that prevents you from working with files that
are extremely large; if you need to read a 2Gb file, you need 2Gb of RAM.
(More, really, since for at least a moment you'd need to have both the encoded
string and its Unicode version in memory.)
The solution would be to use the low-level decoding interface to catch the case
of partial coding sequences. The work of implementing this has already been
done for you: the :mod:`codecs` module includes a version of the :func:`open`
function that returns a file-like object that assumes the file's contents are in
a specified encoding and accepts Unicode parameters for methods such as
``.read()`` and ``.write()``.
The function's parameters are ``open(filename, mode='rb', encoding=None,
errors='strict', buffering=1)``. ``mode`` can be ``'r'``, ``'w'``, or ``'a'``,
just like the corresponding parameter to the regular built-in ``open()``
function; add a ``'+'`` to update the file. ``buffering`` is similarly parallel
to the standard function's parameter. ``encoding`` is a string giving the
encoding to use; if it's left as ``None``, a regular Python file object that
accepts 8-bit strings is returned. Otherwise, a wrapper object is returned, and
data written to or read from the wrapper object will be converted as needed.
``errors`` specifies the action for encoding errors and can be one of the usual
values of 'strict', 'ignore', and 'replace'.
Reading Unicode from a file is therefore simple::
import codecs

f = codecs.open('unicode.rst', encoding='utf-8')
for line in f:
    print repr(line)
It's also possible to open files in update mode, allowing both reading and
writing::
f = codecs.open('test', encoding='utf-8', mode='w+')
f.write(u'\u4500 blah blah blah\n')
f.seek(0)
print repr(f.readline()[:1])
f.close()
Unicode character U+FEFF is used as a byte-order mark (BOM), and is often
written as the first character of a file in order to assist with autodetection
of the file's byte ordering. Some encodings, such as UTF-16, expect a BOM to be
present at the start of a file; when such an encoding is used, the BOM will be
automatically written as the first character and will be silently dropped when
the file is read. There are variants of these encodings, such as 'utf-16-le'
and 'utf-16-be' for little-endian and big-endian encodings, that specify one
particular byte ordering and don't skip the BOM.
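A short sketch of that behaviour (the file name is invented)::

import codecs

f = codecs.open('bom-test.txt', encoding='utf-16', mode='w+')
f.write(u'abc')
f.seek(0)
print repr(f.read())      # u'abc' -- the BOM is dropped when reading
f.close()

# the raw bytes on disk start with the BOM ('\xff\xfe' or '\xfe\xff')
print repr(open('bom-test.txt', 'rb').read())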
Unicode filenames
-----------------
Most of the operating systems in common use today support filenames that contain
arbitrary Unicode characters. Usually this is implemented by converting the
Unicode string into some encoding that varies depending on the system. For
example, MacOS X uses UTF-8 while Windows uses a configurable encoding; on
Windows, Python uses the name "mbcs" to refer to whatever the currently
configured encoding is. On Unix systems, there will only be a filesystem
encoding if you've set the ``LANG`` or ``LC_CTYPE`` environment variables; if
you haven't, the default encoding is ASCII.
The :func:`sys.getfilesystemencoding` function returns the encoding to use on
your current system, in case you want to do the encoding manually, but there's
not much reason to bother. When opening a file for reading or writing, you can
usually just provide the Unicode string as the filename, and it will be
automatically converted to the right encoding for you::
filename = u'filename\u4500abc'
f = open(filename, 'w')
f.write('blah\n')
f.close()
Functions in the :mod:`os` module such as :func:`os.stat` will also accept Unicode
filenames.
:func:`os.listdir`, which returns filenames, raises an issue: should it return
the Unicode version of filenames, or should it return 8-bit strings containing
the encoded versions? :func:`os.listdir` will do both, depending on whether you
provided the directory path as an 8-bit string or a Unicode string. If you pass
a Unicode string as the path, filenames will be decoded using the filesystem's
encoding and a list of Unicode strings will be returned, while passing an 8-bit
path will return the 8-bit versions of the filenames. For example, assuming the
default filesystem encoding is UTF-8, running the following program::
fn = u'filename\u4500abc'
f = open(fn, 'w')
f.close()
import os
print os.listdir('.')
print os.listdir(u'.')
will produce the following output::
amk:~$ python t.py
['.svn', 'filename\xe4\x94\x80abc', ...]
[u'.svn', u'filename\u4500abc', ...]
The first list contains UTF-8-encoded filenames, and the second list contains
the Unicode versions.
Tips for Writing Unicode-aware Programs
---------------------------------------
This section provides some suggestions on writing software that deals with
Unicode.
The most important tip is:
Software should only work with Unicode strings internally, converting to a
particular encoding on output.
If you attempt to write processing functions that accept both Unicode and 8-bit
strings, you will find your program vulnerable to bugs wherever you combine the
two different kinds of strings. Python's default encoding is ASCII, so whenever
a character with an ASCII value > 127 is in the input data, you'll get a
:exc:`UnicodeDecodeError` because that character can't be handled by the ASCII
encoding.
It's easy to miss such problems if you only test your software with data that
doesn't contain any accents; everything will seem to work, but there's actually
a bug in your program waiting for the first user who attempts to use characters
> 127. A second tip, therefore, is:
Include characters > 127 and, even better, characters > 255 in your test
data.
When using data coming from a web browser or some other untrusted source, a
common technique is to check for illegal characters in a string before using the
string in a generated command line or storing it in a database. If you're doing
this, be careful to check the string once it's in the form that will be used or
stored; it's possible for encodings to be used to disguise characters. This is
especially true if the input data also specifies the encoding; many encodings
leave the commonly checked-for characters alone, but Python includes some
encodings such as ``'base64'`` that modify every single character.
For example, let's say you have a content management system that takes a Unicode
filename, and you want to disallow paths with a '/' character. You might write
this code::
def read_file(filename, encoding):
    if '/' in filename:
        raise ValueError("'/' not allowed in filenames")
    unicode_name = filename.decode(encoding)
    f = open(unicode_name, 'r')
    # ... return contents of file ...
However, if an attacker could specify the ``'base64'`` encoding, they could pass
``'L2V0Yy9wYXNzd2Q='``, which is the base-64 encoded form of the string
``'/etc/passwd'``, to read a system file. The above code looks for ``'/'``
characters in the encoded form and misses the dangerous character in the
resulting decoded form.
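One possible fix, sketched here for illustration (using the same invented helper
as above), is to perform the check on the decoded form::

def read_file(filename, encoding):
    # decode first, then check the form that will actually be used
    unicode_name = filename.decode(encoding)
    if u'/' in unicode_name:
        raise ValueError("'/' not allowed in filenames")
    f = open(unicode_name, 'r')
    # ... return contents of file ...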
References
----------
The PDF slides for Marc-André Lemburg's presentation "Writing Unicode-aware
Applications in Python" are available at
<http://www.egenix.com/files/python/LSM2005-Developing-Unicode-aware-applications-in-Python.pdf>
and discuss questions of character encodings as well as how to internationalize
and localize an application.
Revision History and Acknowledgements
=====================================
Thanks to the following people who have noted errors or offered suggestions on
this article: Nicholas Bastin, Marius Gedminas, Kent Johnson, Ken Krugler,
Marc-André Lemburg, Martin von Löwis, Chad Whitacre.
Version 1.0: posted August 5 2005.
Version 1.01: posted August 7 2005. Corrects factual and markup errors; adds
several links.
Version 1.02: posted August 16 2005. Corrects factual errors.
.. comment Additional topic: building Python w/ UCS2 or UCS4 support
.. comment Describe obscure -U switch somewhere?
.. comment Describe use of codecs.StreamRecoder and StreamReaderWriter
.. comment
Original outline:
- [ ] Unicode introduction
- [ ] ASCII
- [ ] Terms
- [ ] Character
- [ ] Code point
- [ ] Encodings
- [ ] Common encodings: ASCII, Latin-1, UTF-8
- [ ] Unicode Python type
- [ ] Writing unicode literals
- [ ] Obscurity: -U switch
- [ ] Built-ins
- [ ] unichr()
- [ ] ord()
- [ ] unicode() constructor
- [ ] Unicode type
- [ ] encode(), decode() methods
- [ ] Unicodedata module for character properties
- [ ] I/O
- [ ] Reading/writing Unicode data into files
- [ ] Byte-order marks
- [ ] Unicode filenames
- [ ] Writing Unicode programs
- [ ] Do everything in Unicode
- [ ] Declaring source code encodings (PEP 263)
- [ ] Other issues
- [ ] Building Python (UCS2, UCS4)

View File

@ -1,578 +0,0 @@
************************************************
HOWTO Fetch Internet Resources Using urllib2
************************************************
:Author: `Michael Foord <http://www.voidspace.org.uk/python/index.shtml>`_
.. note::
There is a French translation of an earlier revision of this
HOWTO, available at `urllib2 - Le Manuel manquant
<http://www.voidspace.org.uk/python/articles/urllib2_francais.shtml>`_.
Introduction
============
.. sidebar:: Related Articles
You may also find the following article on fetching web resources
with Python useful:
* `Basic Authentication <http://www.voidspace.org.uk/python/articles/authentication.shtml>`_
A tutorial on *Basic Authentication*, with examples in Python.
**urllib2** is a `Python <http://www.python.org>`_ module for fetching URLs
(Uniform Resource Locators). It offers a very simple interface, in the form of
the *urlopen* function. This is capable of fetching URLs using a variety of
different protocols. It also offers a slightly more complex interface for
handling common situations - like basic authentication, cookies, proxies and so
on. These are provided by objects called handlers and openers.
urllib2 supports fetching URLs for many "URL schemes" (identified by the string
before the ":" in the URL - for example "ftp" is the URL scheme of
"ftp://python.org/") using their associated network protocols (e.g. FTP, HTTP).
This tutorial focuses on the most common case, HTTP.
For straightforward situations *urlopen* is very easy to use. But as soon as you
encounter errors or non-trivial cases when opening HTTP URLs, you will need some
understanding of the HyperText Transfer Protocol. The most comprehensive and
authoritative reference to HTTP is :rfc:`2616`. This is a technical document and
not intended to be easy to read. This HOWTO aims to illustrate using *urllib2*,
with enough detail about HTTP to help you through. It is not intended to replace
the :mod:`urllib2` docs, but is supplementary to them.
Fetching URLs
=============
The simplest way to use urllib2 is as follows::
import urllib2
response = urllib2.urlopen('http://python.org/')
html = response.read()
Many uses of urllib2 will be that simple (note that instead of an 'http:' URL we
could have used a URL starting with 'ftp:', 'file:', etc.). However, it's the
purpose of this tutorial to explain the more complicated cases, concentrating on
HTTP.
HTTP is based on requests and responses - the client makes requests and servers
send responses. urllib2 mirrors this with a ``Request`` object which represents
the HTTP request you are making. In its simplest form you create a Request
object that specifies the URL you want to fetch. Calling ``urlopen`` with this
Request object returns a response object for the URL requested. This response is
a file-like object, which means you can for example call ``.read()`` on the
response::
import urllib2
req = urllib2.Request('http://www.voidspace.org.uk')
response = urllib2.urlopen(req)
the_page = response.read()
Note that urllib2 makes use of the same Request interface to handle all URL
schemes. For example, you can make an FTP request like so::
req = urllib2.Request('ftp://example.com/')
In the case of HTTP, there are two extra things that Request objects allow you
to do: First, you can pass data to be sent to the server. Second, you can pass
extra information ("metadata") *about* the data or the about request itself, to
the server - this information is sent as HTTP "headers". Let's look at each of
these in turn.
Data
----
Sometimes you want to send data to a URL (often the URL will refer to a CGI
(Common Gateway Interface) script [#]_ or other web application). With HTTP,
this is often done using what's known as a **POST** request. This is often what
your browser does when you submit an HTML form that you filled in on the web. Not
all POSTs have to come from forms: you can use a POST to transmit arbitrary data
to your own application. In the common case of HTML forms, the data needs to be
encoded in a standard way, and then passed to the Request object as the ``data``
argument. The encoding is done using a function from the ``urllib`` library
*not* from ``urllib2``. ::
import urllib
import urllib2
url = 'http://www.someserver.com/cgi-bin/register.cgi'
values = {'name' : 'Michael Foord',
'location' : 'Northampton',
'language' : 'Python' }
data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
the_page = response.read()
Note that other encodings are sometimes required (e.g. for file upload from HTML
forms - see `HTML Specification, Form Submission
<http://www.w3.org/TR/REC-html40/interact/forms.html#h-17.13>`_ for more
details).
If you do not pass the ``data`` argument, urllib2 uses a **GET** request. One
way in which GET and POST requests differ is that POST requests often have
"side-effects": they change the state of the system in some way (for example by
placing an order with the website for a hundredweight of tinned spam to be
delivered to your door). Though the HTTP standard makes it clear that POSTs are
intended to *always* cause side-effects, and GET requests *never* to cause
side-effects, nothing prevents a GET request from having side-effects, nor a
POST request from having no side-effects. Data can also be passed in an HTTP
GET request by encoding it in the URL itself.
This is done as follows::
>>> import urllib2
>>> import urllib
>>> data = {}
>>> data['name'] = 'Somebody Here'
>>> data['location'] = 'Northampton'
>>> data['language'] = 'Python'
>>> url_values = urllib.urlencode(data)
>>> print url_values
name=Somebody+Here&language=Python&location=Northampton
>>> url = 'http://www.example.com/example.cgi'
>>> full_url = url + '?' + url_values
>>> data = urllib2.urlopen(full_url)
Notice that the full URL is created by adding a ``?`` to the URL, followed by
the encoded values.
Headers
-------
We'll discuss here one particular HTTP header, to illustrate how to add headers
to your HTTP request.
Some websites [#]_ dislike being browsed by programs, or send different versions
to different browsers [#]_. By default urllib2 identifies itself as
``Python-urllib/x.y`` (where ``x`` and ``y`` are the major and minor version
numbers of the Python release,
e.g. ``Python-urllib/2.5``), which may confuse the site, or just plain
not work. The way a browser identifies itself is through the
``User-Agent`` header [#]_. When you create a Request object you can
pass a dictionary of headers in. The following example makes the same
request as above, but identifies itself as a version of Internet
Explorer [#]_. ::
import urllib
import urllib2
url = 'http://www.someserver.com/cgi-bin/register.cgi'
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
values = {'name' : 'Michael Foord',
'location' : 'Northampton',
'language' : 'Python' }
headers = { 'User-Agent' : user_agent }
data = urllib.urlencode(values)
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
the_page = response.read()
The response also has two useful methods. See the section on `info and geturl`_
which comes after we have a look at what happens when things go wrong.
Handling Exceptions
===================
*urlopen* raises ``URLError`` when it cannot handle a response (though as usual
with Python APIs, builtin exceptions such as ValueError, TypeError etc. may also
be raised).
``HTTPError`` is the subclass of ``URLError`` raised in the specific case of
HTTP URLs.
URLError
--------
Often, URLError is raised because there is no network connection (no route to
the specified server), or the specified server doesn't exist. In this case, the
exception raised will have a 'reason' attribute, which is a tuple containing an
error code and a text error message.
e.g. ::
>>> req = urllib2.Request('http://www.pretend_server.org')
>>> try: urllib2.urlopen(req)
>>> except URLError, e:
>>>     print e.reason
>>>
(4, 'getaddrinfo failed')
HTTPError
---------
Every HTTP response from the server contains a numeric "status code". Sometimes
the status code indicates that the server is unable to fulfil the request. The
default handlers will handle some of these responses for you (for example, if
the response is a "redirection" that requests the client fetch the document from
a different URL, urllib2 will handle that for you). For those it can't handle,
urlopen will raise an ``HTTPError``. Typical errors include '404' (page not
found), '403' (request forbidden), and '401' (authentication required).
See section 10 of RFC 2616 for a reference on all the HTTP error codes.
The ``HTTPError`` instance raised will have an integer 'code' attribute, which
corresponds to the error sent by the server.
Error Codes
~~~~~~~~~~~
Because the default handlers handle redirects (codes in the 300 range), and
codes in the 100-299 range indicate success, you will usually only see error
codes in the 400-599 range.
``BaseHTTPServer.BaseHTTPRequestHandler.responses`` is a useful dictionary of
response codes that shows all the response codes used by RFC 2616. The
dictionary is reproduced here for convenience ::
# Table mapping response codes to messages; entries have the
# form {code: (shortmessage, longmessage)}.
responses = {
100: ('Continue', 'Request received, please continue'),
101: ('Switching Protocols',
'Switching to new protocol; obey Upgrade header'),
200: ('OK', 'Request fulfilled, document follows'),
201: ('Created', 'Document created, URL follows'),
202: ('Accepted',
'Request accepted, processing continues off-line'),
203: ('Non-Authoritative Information', 'Request fulfilled from cache'),
204: ('No Content', 'Request fulfilled, nothing follows'),
205: ('Reset Content', 'Clear input form for further input.'),
206: ('Partial Content', 'Partial content follows.'),
300: ('Multiple Choices',
'Object has several resources -- see URI list'),
301: ('Moved Permanently', 'Object moved permanently -- see URI list'),
302: ('Found', 'Object moved temporarily -- see URI list'),
303: ('See Other', 'Object moved -- see Method and URL list'),
304: ('Not Modified',
'Document has not changed since given time'),
305: ('Use Proxy',
'You must use proxy specified in Location to access this '
'resource.'),
307: ('Temporary Redirect',
'Object moved temporarily -- see URI list'),
400: ('Bad Request',
'Bad request syntax or unsupported method'),
401: ('Unauthorized',
'No permission -- see authorization schemes'),
402: ('Payment Required',
'No payment -- see charging schemes'),
403: ('Forbidden',
'Request forbidden -- authorization will not help'),
404: ('Not Found', 'Nothing matches the given URI'),
405: ('Method Not Allowed',
'Specified method is invalid for this server.'),
406: ('Not Acceptable', 'URI not available in preferred format.'),
407: ('Proxy Authentication Required', 'You must authenticate with '
'this proxy before proceeding.'),
408: ('Request Timeout', 'Request timed out; try again later.'),
409: ('Conflict', 'Request conflict.'),
410: ('Gone',
'URI no longer exists and has been permanently removed.'),
411: ('Length Required', 'Client must specify Content-Length.'),
412: ('Precondition Failed', 'Precondition in headers is false.'),
413: ('Request Entity Too Large', 'Entity is too large.'),
414: ('Request-URI Too Long', 'URI is too long.'),
415: ('Unsupported Media Type', 'Entity body in unsupported format.'),
416: ('Requested Range Not Satisfiable',
'Cannot satisfy request range.'),
417: ('Expectation Failed',
'Expect condition could not be satisfied.'),
500: ('Internal Server Error', 'Server got itself in trouble'),
501: ('Not Implemented',
'Server does not support this operation'),
502: ('Bad Gateway', 'Invalid responses from another server/proxy.'),
503: ('Service Unavailable',
'The server cannot process the request due to a high load'),
504: ('Gateway Timeout',
'The gateway server did not receive a timely response'),
505: ('HTTP Version Not Supported', 'Cannot fulfill request.'),
}
When an error is raised the server responds by returning an HTTP error code
*and* an error page. You can use the ``HTTPError`` instance as a response object
for the page returned. This means that as well as the ``code`` attribute, it
also has ``read``, ``geturl``, and ``info`` methods. ::
>>> req = urllib2.Request('http://www.python.org/fish.html')
>>> try:
>>>     urllib2.urlopen(req)
>>> except URLError, e:
>>>     print e.code
>>>     print e.read()
>>>
404
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<?xml-stylesheet href="./css/ht2html.css"
type="text/css"?>
<html><head><title>Error 404: File Not Found</title>
...... etc...
Wrapping it Up
--------------
So if you want to be prepared for ``HTTPError`` *or* ``URLError`` there are two
basic approaches. I prefer the second approach.
Number 1
~~~~~~~~
::
from urllib2 import Request, urlopen, URLError, HTTPError
req = Request(someurl)
try:
    response = urlopen(req)
except HTTPError, e:
    print 'The server couldn\'t fulfill the request.'
    print 'Error code: ', e.code
except URLError, e:
    print 'We failed to reach a server.'
    print 'Reason: ', e.reason
else:
    # everything is fine
    pass
.. note::
The ``except HTTPError`` *must* come first, otherwise ``except URLError``
will *also* catch an ``HTTPError``.
Number 2
~~~~~~~~
::
from urllib2 import Request, urlopen, URLError
req = Request(someurl)
try:
    response = urlopen(req)
except URLError, e:
    if hasattr(e, 'reason'):
        print 'We failed to reach a server.'
        print 'Reason: ', e.reason
    elif hasattr(e, 'code'):
        print 'The server couldn\'t fulfill the request.'
        print 'Error code: ', e.code
else:
    # everything is fine
    pass
info and geturl
===============
The response returned by urlopen (or the ``HTTPError`` instance) has two useful
methods ``info`` and ``geturl``.
**geturl** - this returns the real URL of the page fetched. This is useful
because ``urlopen`` (or the opener object used) may have followed a
redirect. The URL of the page fetched may not be the same as the URL requested.
**info** - this returns a dictionary-like object that describes the page
fetched, particularly the headers sent by the server. It is currently an
``httplib.HTTPMessage`` instance.
Typical headers include 'Content-length', 'Content-type', and so on. See the
`Quick Reference to HTTP Headers <http://www.cs.tut.fi/~jkorpela/http.html>`_
for a useful listing of HTTP headers with brief explanations of their meaning
and use.
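For example, a short sketch of both methods (using python.org purely as an
illustration)::
import urllib2
response = urllib2.urlopen('http://www.python.org/')
print response.geturl()     # the URL actually fetched, after any redirects
info = response.info()
print info.getheader('Content-Type')   # e.g. 'text/html'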
Openers and Handlers
====================
When you fetch a URL you use an opener (an instance of the perhaps
confusingly-named :class:`urllib2.OpenerDirector`). Normally we have been using
the default opener - via ``urlopen`` - but you can create custom
openers. Openers use handlers. All the "heavy lifting" is done by the
handlers. Each handler knows how to open URLs for a particular URL scheme (http,
ftp, etc.), or how to handle an aspect of URL opening, for example HTTP
redirections or HTTP cookies.
You will want to create openers if you want to fetch URLs with specific handlers
installed, for example to get an opener that handles cookies, or to get an
opener that does not handle redirections.
To create an opener, instantiate an ``OpenerDirector``, and then call
``.add_handler(some_handler_instance)`` repeatedly.
Alternatively, you can use ``build_opener``, which is a convenience function for
creating opener objects with a single function call. ``build_opener`` adds
several handlers by default, but provides a quick way to add more and/or
override the default handlers.
Other sorts of handlers you might want can handle proxies, authentication,
and other common but slightly specialised situations.
``install_opener`` can be used to make an ``opener`` object the (global) default
opener. This means that calls to ``urlopen`` will use the opener you have
installed.
Opener objects have an ``open`` method, which can be called directly to fetch
urls in the same way as the ``urlopen`` function: there's no need to call
``install_opener``, except as a convenience.
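For example, here is a sketch of building and using an opener that handles
cookies, via the standard ``cookielib`` module; the URL is again just an
example::
import cookielib
import urllib2
# build_opener adds the default handlers for us; we only supply the extra one.
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
# Use the opener directly ...
response = opener.open('http://www.example.com/')
# ... or install it so that plain urllib2.urlopen() uses it too.
urllib2.install_opener(opener)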
Basic Authentication
====================
To illustrate creating and installing a handler we will use the
``HTTPBasicAuthHandler``. For a more detailed discussion of this subject --
including an explanation of how Basic Authentication works - see the `Basic
Authentication Tutorial
<http://www.voidspace.org.uk/python/articles/authentication.shtml>`_.
When authentication is required, the server sends a header (as well as the 401
error code) requesting authentication. This specifies the authentication scheme
and a 'realm'. The header looks like: ``Www-authenticate: SCHEME
realm="REALM"``.
e.g. ::
Www-authenticate: Basic realm="cPanel Users"
The client should then retry the request with the appropriate name and password
for the realm included as a header in the request. This is 'basic
authentication'. In order to simplify this process we can create an instance of
``HTTPBasicAuthHandler`` and an opener to use this handler.
The ``HTTPBasicAuthHandler`` uses an object called a password manager to handle
the mapping of URLs and realms to passwords and usernames. If you know what the
realm is (from the authentication header sent by the server), then you can use a
``HTTPPasswordMgr``. Frequently one doesn't care what the realm is. In that
case, it is convenient to use ``HTTPPasswordMgrWithDefaultRealm``. This allows
you to specify a default username and password for a URL. This will be supplied
in the absence of you providing an alternative combination for a specific
realm. We indicate this by providing ``None`` as the realm argument to the
``add_password`` method.
The top-level URL is the first URL that requires authentication. URLs "deeper"
than the URL you pass to .add_password() will also match. ::
# create a password manager
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
# Add the username and password.
# If we knew the realm, we could use it instead of ``None``.
top_level_url = "http://example.com/foo/"
password_mgr.add_password(None, top_level_url, username, password)
handler = urllib2.HTTPBasicAuthHandler(password_mgr)
# create "opener" (OpenerDirector instance)
opener = urllib2.build_opener(handler)
# use the opener to fetch a URL
opener.open(a_url)
# Install the opener.
# Now all calls to urllib2.urlopen use our opener.
urllib2.install_opener(opener)
.. note::
In the above example we only supplied our ``HTTPBasicAuthHandler`` to
``build_opener``. By default openers have the handlers for normal situations
-- ``ProxyHandler``, ``UnknownHandler``, ``HTTPHandler``,
``HTTPDefaultErrorHandler``, ``HTTPRedirectHandler``, ``FTPHandler``,
``FileHandler``, ``HTTPErrorProcessor``.
``top_level_url`` is in fact *either* a full URL (including the 'http:' scheme
component and the hostname and optionally the port number)
e.g. "http://example.com/" *or* an "authority" (i.e. the hostname,
optionally including the port number) e.g. "example.com" or "example.com:8080"
(the latter example includes a port number). The authority, if present, must
NOT contain the "userinfo" component - for example "joe:password@example.com" is
not correct.
Proxies
=======
**urllib2** will auto-detect your proxy settings and use those. This is through
the ``ProxyHandler``, which is part of the normal handler chain. Normally that's
a good thing, but there are occasions when it may not be helpful [#]_. One way
to disable automatic proxy handling is to set up our own ``ProxyHandler`` with
no proxies defined. This is done using similar steps to setting up a
`Basic Authentication`_ handler::
>>> proxy_support = urllib2.ProxyHandler({})
>>> opener = urllib2.build_opener(proxy_support)
>>> urllib2.install_opener(opener)
.. note::
Currently ``urllib2`` *does not* support fetching of ``https`` locations
through a proxy. However, this can be enabled by extending urllib2 as
shown in the recipe [#]_.
Sockets and Layers
==================
The Python support for fetching resources from the web is layered. urllib2 uses
the httplib library, which in turn uses the socket library.
As of Python 2.3 you can specify how long a socket should wait for a response
before timing out. This can be useful in applications which have to fetch web
pages. By default the socket module has *no timeout* and can hang. Currently,
the socket timeout is not exposed at the httplib or urllib2 levels. However,
you can set the default timeout globally for all sockets using ::
import socket
import urllib2
# timeout in seconds
timeout = 10
socket.setdefaulttimeout(timeout)
# this call to urllib2.urlopen now uses the default timeout
# we have set in the socket module
req = urllib2.Request('http://www.voidspace.org.uk')
response = urllib2.urlopen(req)
-------
Footnotes
=========
This document was reviewed and revised by John Lee.
.. [#] For an introduction to the CGI protocol see
`Writing Web Applications in Python <http://www.pyzine.com/Issue008/Section_Articles/article_CGIOne.html>`_.
.. [#] Like Google for example. The *proper* way to use google from a program
is to use `PyGoogle <http://pygoogle.sourceforge.net>`_ of course. See
`Voidspace Google <http://www.voidspace.org.uk/python/recipebook.shtml#google>`_
for some examples of using the Google API.
.. [#] Browser sniffing is a very bad practise for website design - building
sites using web standards is much more sensible. Unfortunately a lot of
sites still send different versions to different browsers.
.. [#] The user agent for MSIE 6 is
*'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)'*
.. [#] For details of more HTTP request headers, see
`Quick Reference to HTTP Headers`_.
.. [#] In my case I have to use a proxy to access the internet at work. If you
attempt to fetch *localhost* URLs through this proxy it blocks them. IE
is set to use the proxy, which urllib2 picks up on. In order to test
scripts with a localhost server, I have to prevent urllib2 from using
the proxy.
.. [#] urllib2 opener for SSL proxy (CONNECT method): `ASPN Cookbook Recipe
<http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/456195>`_.

View File

@ -1,115 +0,0 @@
#!/usr/bin/env python
"""Send the contents of a directory as a MIME message."""
import os
import sys
import smtplib
# For guessing MIME type based on file name extension
import mimetypes
from optparse import OptionParser
from email import encoders
from email.message import Message
from email.mime.audio import MIMEAudio
from email.mime.base import MIMEBase
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
COMMASPACE = ', '
def main():
parser = OptionParser(usage="""\
Send the contents of a directory as a MIME message.
Usage: %prog [options]
Unless the -o option is given, the email is sent by forwarding to your local
SMTP server, which then does the normal delivery process. Your local machine
must be running an SMTP server.
""")
parser.add_option('-d', '--directory',
type='string', action='store',
help="""Mail the contents of the specified directory,
otherwise use the current directory. Only the regular
files in the directory are sent, and we don't recurse to
subdirectories.""")
parser.add_option('-o', '--output',
type='string', action='store', metavar='FILE',
help="""Print the composed message to FILE instead of
sending the message to the SMTP server.""")
parser.add_option('-s', '--sender',
type='string', action='store', metavar='SENDER',
help='The value of the From: header (required)')
parser.add_option('-r', '--recipient',
type='string', action='append', metavar='RECIPIENT',
default=[], dest='recipients',
help='A To: header value (at least one required)')
opts, args = parser.parse_args()
if not opts.sender or not opts.recipients:
parser.print_help()
sys.exit(1)
directory = opts.directory
if not directory:
directory = '.'
# Create the enclosing (outer) message
outer = MIMEMultipart()
outer['Subject'] = 'Contents of directory %s' % os.path.abspath(directory)
outer['To'] = COMMASPACE.join(opts.recipients)
outer['From'] = opts.sender
outer.preamble = 'You will not see this in a MIME-aware mail reader.\n'
for filename in os.listdir(directory):
path = os.path.join(directory, filename)
if not os.path.isfile(path):
continue
# Guess the content type based on the file's extension. Encoding
# will be ignored, although we should check for simple things like
# gzip'd or compressed files.
ctype, encoding = mimetypes.guess_type(path)
if ctype is None or encoding is not None:
# No guess could be made, or the file is encoded (compressed), so
# use a generic bag-of-bits type.
ctype = 'application/octet-stream'
maintype, subtype = ctype.split('/', 1)
if maintype == 'text':
fp = open(path)
# Note: we should handle calculating the charset
msg = MIMEText(fp.read(), _subtype=subtype)
fp.close()
elif maintype == 'image':
fp = open(path, 'rb')
msg = MIMEImage(fp.read(), _subtype=subtype)
fp.close()
elif maintype == 'audio':
fp = open(path, 'rb')
msg = MIMEAudio(fp.read(), _subtype=subtype)
fp.close()
else:
fp = open(path, 'rb')
msg = MIMEBase(maintype, subtype)
msg.set_payload(fp.read())
fp.close()
# Encode the payload using Base64
encoders.encode_base64(msg)
# Set the filename parameter
msg.add_header('Content-Disposition', 'attachment', filename=filename)
outer.attach(msg)
# Now send or store the message
composed = outer.as_string()
if opts.output:
fp = open(opts.output, 'w')
fp.write(composed)
fp.close()
else:
s = smtplib.SMTP()
s.connect()
s.sendmail(opts.sender, opts.recipients, composed)
s.close()
if __name__ == '__main__':
main()

View File

@ -1,32 +0,0 @@
# Import smtplib for the actual sending function
import smtplib
# Here are the email package modules we'll need
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
COMMASPACE = ', '
# Create the container (outer) email message.
msg = MIMEMultipart()
msg['Subject'] = 'Our family reunion'
# me == the sender's email address
# family = the list of all recipients' email addresses
msg['From'] = me
msg['To'] = COMMASPACE.join(family)
msg.preamble = 'Our family reunion'
# Assume we know that the image files are all in PNG format
for file in pngfiles:
# Open the files in binary mode. Let the MIMEImage class automatically
# guess the specific image type.
fp = open(file, 'rb')
img = MIMEImage(fp.read())
fp.close()
msg.attach(img)
# Send the email via our own SMTP server.
s = smtplib.SMTP()
s.connect()
s.sendmail(me, family, msg.as_string())
s.close()

View File

@ -1,25 +0,0 @@
# Import smtplib for the actual sending function
import smtplib
# Import the email modules we'll need
from email.mime.text import MIMEText
# Open a plain text file for reading. For this example, assume that
# the text file contains only ASCII characters.
fp = open(textfile, 'rb')
# Create a text/plain message
msg = MIMEText(fp.read())
fp.close()
# me == the sender's email address
# you == the recipient's email address
msg['Subject'] = 'The contents of %s' % textfile
msg['From'] = me
msg['To'] = you
# Send the message via our own SMTP server, but don't include the
# envelope header.
s = smtplib.SMTP()
s.connect()
s.sendmail(me, [you], msg.as_string())
s.close()

View File

@ -1,68 +0,0 @@
#!/usr/bin/env python
"""Unpack a MIME message into a directory of files."""
import os
import sys
import email
import errno
import mimetypes
from optparse import OptionParser
def main():
parser = OptionParser(usage="""\
Unpack a MIME message into a directory of files.
Usage: %prog [options] msgfile
""")
parser.add_option('-d', '--directory',
type='string', action='store',
help="""Unpack the MIME message into the named
directory, which will be created if it doesn't already
exist.""")
opts, args = parser.parse_args()
if not opts.directory:
parser.print_help()
sys.exit(1)
try:
msgfile = args[0]
except IndexError:
parser.print_help()
sys.exit(1)
try:
os.mkdir(opts.directory)
except OSError, e:
# Ignore directory exists error
if e.errno <> errno.EEXIST:
raise
fp = open(msgfile)
msg = email.message_from_file(fp)
fp.close()
counter = 1
for part in msg.walk():
# multipart/* are just containers
if part.get_content_maintype() == 'multipart':
continue
# Applications should really sanitize the given filename so that an
# email message can't be used to overwrite important files
filename = part.get_filename()
if not filename:
ext = mimetypes.guess_extension(part.get_type())
if not ext:
# Use a generic bag-of-bits extension
ext = '.bin'
filename = 'part-%03d%s' % (counter, ext)
counter += 1
fp = open(os.path.join(opts.directory, filename), 'wb')
fp.write(part.get_payload(decode=True))
fp.close()
if __name__ == '__main__':
main()

View File

@ -1,64 +0,0 @@
import xml.dom.minidom
document = """\
<slideshow>
<title>Demo slideshow</title>
<slide><title>Slide title</title>
<point>This is a demo</point>
<point>Of a program for processing slides</point>
</slide>
<slide><title>Another demo slide</title>
<point>It is important</point>
<point>To have more than</point>
<point>one slide</point>
</slide>
</slideshow>
"""
dom = xml.dom.minidom.parseString(document)
def getText(nodelist):
rc = ""
for node in nodelist:
if node.nodeType == node.TEXT_NODE:
rc = rc + node.data
return rc
def handleSlideshow(slideshow):
print "<html>"
handleSlideshowTitle(slideshow.getElementsByTagName("title")[0])
slides = slideshow.getElementsByTagName("slide")
handleToc(slides)
handleSlides(slides)
print "</html>"
def handleSlides(slides):
for slide in slides:
handleSlide(slide)
def handleSlide(slide):
handleSlideTitle(slide.getElementsByTagName("title")[0])
handlePoints(slide.getElementsByTagName("point"))
def handleSlideshowTitle(title):
print "<title>%s</title>" % getText(title.childNodes)
def handleSlideTitle(title):
print "<h2>%s</h2>" % getText(title.childNodes)
def handlePoints(points):
print "<ul>"
for point in points:
handlePoint(point)
print "</ul>"
def handlePoint(point):
print "<li>%s</li>" % getText(point.childNodes)
def handleToc(slides):
for slide in slides:
title = slide.getElementsByTagName("title")[0]
print "<p>%s</p>" % getText(title.childNodes)
handleSlideshow(dom)

View File

@ -1,54 +0,0 @@
#include <Python.h>
typedef struct {
PyObject_HEAD
/* Type-specific fields go here. */
} noddy_NoddyObject;
static PyTypeObject noddy_NoddyType = {
PyObject_HEAD_INIT(NULL)
0, /*ob_size*/
"noddy.Noddy", /*tp_name*/
sizeof(noddy_NoddyObject), /*tp_basicsize*/
0, /*tp_itemsize*/
0, /*tp_dealloc*/
0, /*tp_print*/
0, /*tp_getattr*/
0, /*tp_setattr*/
0, /*tp_compare*/
0, /*tp_repr*/
0, /*tp_as_number*/
0, /*tp_as_sequence*/
0, /*tp_as_mapping*/
0, /*tp_hash */
0, /*tp_call*/
0, /*tp_str*/
0, /*tp_getattro*/
0, /*tp_setattro*/
0, /*tp_as_buffer*/
Py_TPFLAGS_DEFAULT, /*tp_flags*/
"Noddy objects", /* tp_doc */
};
static PyMethodDef noddy_methods[] = {
{NULL} /* Sentinel */
};
#ifndef PyMODINIT_FUNC /* declarations for DLL import/export */
#define PyMODINIT_FUNC void
#endif
PyMODINIT_FUNC
initnoddy(void)
{
PyObject* m;
noddy_NoddyType.tp_new = PyType_GenericNew;
if (PyType_Ready(&noddy_NoddyType) < 0)
return;
m = Py_InitModule3("noddy", noddy_methods,
"Example module that creates an extension type.");
Py_INCREF(&noddy_NoddyType);
PyModule_AddObject(m, "Noddy", (PyObject *)&noddy_NoddyType);
}

View File

@ -1,190 +0,0 @@
#include <Python.h>
#include "structmember.h"
typedef struct {
PyObject_HEAD
PyObject *first; /* first name */
PyObject *last; /* last name */
int number;
} Noddy;
static void
Noddy_dealloc(Noddy* self)
{
Py_XDECREF(self->first);
Py_XDECREF(self->last);
self->ob_type->tp_free((PyObject*)self);
}
static PyObject *
Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
Noddy *self;
self = (Noddy *)type->tp_alloc(type, 0);
if (self != NULL) {
self->first = PyString_FromString("");
if (self->first == NULL)
{
Py_DECREF(self);
return NULL;
}
self->last = PyString_FromString("");
if (self->last == NULL)
{
Py_DECREF(self);
return NULL;
}
self->number = 0;
}
return (PyObject *)self;
}
static int
Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
{
PyObject *first=NULL, *last=NULL, *tmp;
static char *kwlist[] = {"first", "last", "number", NULL};
if (! PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
&first, &last,
&self->number))
return -1;
if (first) {
tmp = self->first;
Py_INCREF(first);
self->first = first;
Py_XDECREF(tmp);
}
if (last) {
tmp = self->last;
Py_INCREF(last);
self->last = last;
Py_XDECREF(tmp);
}
return 0;
}
static PyMemberDef Noddy_members[] = {
{"first", T_OBJECT_EX, offsetof(Noddy, first), 0,
"first name"},
{"last", T_OBJECT_EX, offsetof(Noddy, last), 0,
"last name"},
{"number", T_INT, offsetof(Noddy, number), 0,
"noddy number"},
{NULL} /* Sentinel */
};
static PyObject *
Noddy_name(Noddy* self)
{
static PyObject *format = NULL;
PyObject *args, *result;
if (format == NULL) {
format = PyString_FromString("%s %s");
if (format == NULL)
return NULL;
}
if (self->first == NULL) {
PyErr_SetString(PyExc_AttributeError, "first");
return NULL;
}
if (self->last == NULL) {
PyErr_SetString(PyExc_AttributeError, "last");
return NULL;
}
args = Py_BuildValue("OO", self->first, self->last);
if (args == NULL)
return NULL;
result = PyString_Format(format, args);
Py_DECREF(args);
return result;
}
static PyMethodDef Noddy_methods[] = {
{"name", (PyCFunction)Noddy_name, METH_NOARGS,
"Return the name, combining the first and last name"
},
{NULL} /* Sentinel */
};
static PyTypeObject NoddyType = {
PyObject_HEAD_INIT(NULL)
0, /*ob_size*/
"noddy.Noddy", /*tp_name*/
sizeof(Noddy), /*tp_basicsize*/
0, /*tp_itemsize*/
(destructor)Noddy_dealloc, /*tp_dealloc*/
0, /*tp_print*/
0, /*tp_getattr*/
0, /*tp_setattr*/
0, /*tp_compare*/
0, /*tp_repr*/
0, /*tp_as_number*/
0, /*tp_as_sequence*/
0, /*tp_as_mapping*/
0, /*tp_hash */
0, /*tp_call*/
0, /*tp_str*/
0, /*tp_getattro*/
0, /*tp_setattro*/
0, /*tp_as_buffer*/
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/
"Noddy objects", /* tp_doc */
0, /* tp_traverse */
0, /* tp_clear */
0, /* tp_richcompare */
0, /* tp_weaklistoffset */
0, /* tp_iter */
0, /* tp_iternext */
Noddy_methods, /* tp_methods */
Noddy_members, /* tp_members */
0, /* tp_getset */
0, /* tp_base */
0, /* tp_dict */
0, /* tp_descr_get */
0, /* tp_descr_set */
0, /* tp_dictoffset */
(initproc)Noddy_init, /* tp_init */
0, /* tp_alloc */
Noddy_new, /* tp_new */
};
static PyMethodDef module_methods[] = {
{NULL} /* Sentinel */
};
#ifndef PyMODINIT_FUNC /* declarations for DLL import/export */
#define PyMODINIT_FUNC void
#endif
PyMODINIT_FUNC
initnoddy2(void)
{
PyObject* m;
if (PyType_Ready(&NoddyType) < 0)
return;
m = Py_InitModule3("noddy2", module_methods,
"Example module that creates an extension type.");
if (m == NULL)
return;
Py_INCREF(&NoddyType);
PyModule_AddObject(m, "Noddy", (PyObject *)&NoddyType);
}

View File

@ -1,243 +0,0 @@
#include <Python.h>
#include "structmember.h"
typedef struct {
PyObject_HEAD
PyObject *first;
PyObject *last;
int number;
} Noddy;
static void
Noddy_dealloc(Noddy* self)
{
Py_XDECREF(self->first);
Py_XDECREF(self->last);
self->ob_type->tp_free((PyObject*)self);
}
static PyObject *
Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
Noddy *self;
self = (Noddy *)type->tp_alloc(type, 0);
if (self != NULL) {
self->first = PyString_FromString("");
if (self->first == NULL)
{
Py_DECREF(self);
return NULL;
}
self->last = PyString_FromString("");
if (self->last == NULL)
{
Py_DECREF(self);
return NULL;
}
self->number = 0;
}
return (PyObject *)self;
}
static int
Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
{
PyObject *first=NULL, *last=NULL, *tmp;
static char *kwlist[] = {"first", "last", "number", NULL};
if (! PyArg_ParseTupleAndKeywords(args, kwds, "|SSi", kwlist,
&first, &last,
&self->number))
return -1;
if (first) {
tmp = self->first;
Py_INCREF(first);
self->first = first;
Py_DECREF(tmp);
}
if (last) {
tmp = self->last;
Py_INCREF(last);
self->last = last;
Py_DECREF(tmp);
}
return 0;
}
static PyMemberDef Noddy_members[] = {
{"number", T_INT, offsetof(Noddy, number), 0,
"noddy number"},
{NULL} /* Sentinel */
};
static PyObject *
Noddy_getfirst(Noddy *self, void *closure)
{
Py_INCREF(self->first);
return self->first;
}
static int
Noddy_setfirst(Noddy *self, PyObject *value, void *closure)
{
if (value == NULL) {
PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
return -1;
}
if (! PyString_Check(value)) {
PyErr_SetString(PyExc_TypeError,
"The first attribute value must be a string");
return -1;
}
Py_DECREF(self->first);
Py_INCREF(value);
self->first = value;
return 0;
}
static PyObject *
Noddy_getlast(Noddy *self, void *closure)
{
Py_INCREF(self->last);
return self->last;
}
static int
Noddy_setlast(Noddy *self, PyObject *value, void *closure)
{
if (value == NULL) {
PyErr_SetString(PyExc_TypeError, "Cannot delete the last attribute");
return -1;
}
if (! PyString_Check(value)) {
PyErr_SetString(PyExc_TypeError,
"The last attribute value must be a string");
return -1;
}
Py_DECREF(self->last);
Py_INCREF(value);
self->last = value;
return 0;
}
static PyGetSetDef Noddy_getseters[] = {
{"first",
(getter)Noddy_getfirst, (setter)Noddy_setfirst,
"first name",
NULL},
{"last",
(getter)Noddy_getlast, (setter)Noddy_setlast,
"last name",
NULL},
{NULL} /* Sentinel */
};
static PyObject *
Noddy_name(Noddy* self)
{
static PyObject *format = NULL;
PyObject *args, *result;
if (format == NULL) {
format = PyString_FromString("%s %s");
if (format == NULL)
return NULL;
}
args = Py_BuildValue("OO", self->first, self->last);
if (args == NULL)
return NULL;
result = PyString_Format(format, args);
Py_DECREF(args);
return result;
}
static PyMethodDef Noddy_methods[] = {
{"name", (PyCFunction)Noddy_name, METH_NOARGS,
"Return the name, combining the first and last name"
},
{NULL} /* Sentinel */
};
static PyTypeObject NoddyType = {
PyObject_HEAD_INIT(NULL)
0, /*ob_size*/
"noddy.Noddy", /*tp_name*/
sizeof(Noddy), /*tp_basicsize*/
0, /*tp_itemsize*/
(destructor)Noddy_dealloc, /*tp_dealloc*/
0, /*tp_print*/
0, /*tp_getattr*/
0, /*tp_setattr*/
0, /*tp_compare*/
0, /*tp_repr*/
0, /*tp_as_number*/
0, /*tp_as_sequence*/
0, /*tp_as_mapping*/
0, /*tp_hash */
0, /*tp_call*/
0, /*tp_str*/
0, /*tp_getattro*/
0, /*tp_setattro*/
0, /*tp_as_buffer*/
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/
"Noddy objects", /* tp_doc */
0, /* tp_traverse */
0, /* tp_clear */
0, /* tp_richcompare */
0, /* tp_weaklistoffset */
0, /* tp_iter */
0, /* tp_iternext */
Noddy_methods, /* tp_methods */
Noddy_members, /* tp_members */
Noddy_getseters, /* tp_getset */
0, /* tp_base */
0, /* tp_dict */
0, /* tp_descr_get */
0, /* tp_descr_set */
0, /* tp_dictoffset */
(initproc)Noddy_init, /* tp_init */
0, /* tp_alloc */
Noddy_new, /* tp_new */
};
static PyMethodDef module_methods[] = {
{NULL} /* Sentinel */
};
#ifndef PyMODINIT_FUNC /* declarations for DLL import/export */
#define PyMODINIT_FUNC void
#endif
PyMODINIT_FUNC
initnoddy3(void)
{
PyObject* m;
if (PyType_Ready(&NoddyType) < 0)
return;
m = Py_InitModule3("noddy3", module_methods,
"Example module that creates an extension type.");
if (m == NULL)
return;
Py_INCREF(&NoddyType);
PyModule_AddObject(m, "Noddy", (PyObject *)&NoddyType);
}

View File

@ -1,224 +0,0 @@
#include <Python.h>
#include "structmember.h"
typedef struct {
PyObject_HEAD
PyObject *first;
PyObject *last;
int number;
} Noddy;
static int
Noddy_traverse(Noddy *self, visitproc visit, void *arg)
{
int vret;
if (self->first) {
vret = visit(self->first, arg);
if (vret != 0)
return vret;
}
if (self->last) {
vret = visit(self->last, arg);
if (vret != 0)
return vret;
}
return 0;
}
static int
Noddy_clear(Noddy *self)
{
PyObject *tmp;
tmp = self->first;
self->first = NULL;
Py_XDECREF(tmp);
tmp = self->last;
self->last = NULL;
Py_XDECREF(tmp);
return 0;
}
static void
Noddy_dealloc(Noddy* self)
{
Noddy_clear(self);
self->ob_type->tp_free((PyObject*)self);
}
static PyObject *
Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
Noddy *self;
self = (Noddy *)type->tp_alloc(type, 0);
if (self != NULL) {
self->first = PyString_FromString("");
if (self->first == NULL)
{
Py_DECREF(self);
return NULL;
}
self->last = PyString_FromString("");
if (self->last == NULL)
{
Py_DECREF(self);
return NULL;
}
self->number = 0;
}
return (PyObject *)self;
}
static int
Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
{
PyObject *first=NULL, *last=NULL, *tmp;
static char *kwlist[] = {"first", "last", "number", NULL};
if (! PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
&first, &last,
&self->number))
return -1;
if (first) {
tmp = self->first;
Py_INCREF(first);
self->first = first;
Py_XDECREF(tmp);
}
if (last) {
tmp = self->last;
Py_INCREF(last);
self->last = last;
Py_XDECREF(tmp);
}
return 0;
}
static PyMemberDef Noddy_members[] = {
{"first", T_OBJECT_EX, offsetof(Noddy, first), 0,
"first name"},
{"last", T_OBJECT_EX, offsetof(Noddy, last), 0,
"last name"},
{"number", T_INT, offsetof(Noddy, number), 0,
"noddy number"},
{NULL} /* Sentinel */
};
static PyObject *
Noddy_name(Noddy* self)
{
static PyObject *format = NULL;
PyObject *args, *result;
if (format == NULL) {
format = PyString_FromString("%s %s");
if (format == NULL)
return NULL;
}
if (self->first == NULL) {
PyErr_SetString(PyExc_AttributeError, "first");
return NULL;
}
if (self->last == NULL) {
PyErr_SetString(PyExc_AttributeError, "last");
return NULL;
}
args = Py_BuildValue("OO", self->first, self->last);
if (args == NULL)
return NULL;
result = PyString_Format(format, args);
Py_DECREF(args);
return result;
}
static PyMethodDef Noddy_methods[] = {
{"name", (PyCFunction)Noddy_name, METH_NOARGS,
"Return the name, combining the first and last name"
},
{NULL} /* Sentinel */
};
static PyTypeObject NoddyType = {
PyObject_HEAD_INIT(NULL)
0, /*ob_size*/
"noddy.Noddy", /*tp_name*/
sizeof(Noddy), /*tp_basicsize*/
0, /*tp_itemsize*/
(destructor)Noddy_dealloc, /*tp_dealloc*/
0, /*tp_print*/
0, /*tp_getattr*/
0, /*tp_setattr*/
0, /*tp_compare*/
0, /*tp_repr*/
0, /*tp_as_number*/
0, /*tp_as_sequence*/
0, /*tp_as_mapping*/
0, /*tp_hash */
0, /*tp_call*/
0, /*tp_str*/
0, /*tp_getattro*/
0, /*tp_setattro*/
0, /*tp_as_buffer*/
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC, /*tp_flags*/
"Noddy objects", /* tp_doc */
(traverseproc)Noddy_traverse, /* tp_traverse */
(inquiry)Noddy_clear, /* tp_clear */
0, /* tp_richcompare */
0, /* tp_weaklistoffset */
0, /* tp_iter */
0, /* tp_iternext */
Noddy_methods, /* tp_methods */
Noddy_members, /* tp_members */
0, /* tp_getset */
0, /* tp_base */
0, /* tp_dict */
0, /* tp_descr_get */
0, /* tp_descr_set */
0, /* tp_dictoffset */
(initproc)Noddy_init, /* tp_init */
0, /* tp_alloc */
Noddy_new, /* tp_new */
};
static PyMethodDef module_methods[] = {
{NULL} /* Sentinel */
};
#ifndef PyMODINIT_FUNC /* declarations for DLL import/export */
#define PyMODINIT_FUNC void
#endif
PyMODINIT_FUNC
initnoddy4(void)
{
PyObject* m;
if (PyType_Ready(&NoddyType) < 0)
return;
m = Py_InitModule3("noddy4", module_methods,
"Example module that creates an extension type.");
if (m == NULL)
return;
Py_INCREF(&NoddyType);
PyModule_AddObject(m, "Noddy", (PyObject *)&NoddyType);
}

View File

@ -1,68 +0,0 @@
#include <Python.h>
int
main(int argc, char *argv[])
{
PyObject *pName, *pModule, *pDict, *pFunc;
PyObject *pArgs, *pValue;
int i;
if (argc < 3) {
fprintf(stderr,"Usage: call pythonfile funcname [args]\n");
return 1;
}
Py_Initialize();
pName = PyString_FromString(argv[1]);
/* Error checking of pName left out */
pModule = PyImport_Import(pName);
Py_DECREF(pName);
if (pModule != NULL) {
pFunc = PyObject_GetAttrString(pModule, argv[2]);
/* pFunc is a new reference */
if (pFunc && PyCallable_Check(pFunc)) {
pArgs = PyTuple_New(argc - 3);
for (i = 0; i < argc - 3; ++i) {
pValue = PyInt_FromLong(atoi(argv[i + 3]));
if (!pValue) {
Py_DECREF(pArgs);
Py_DECREF(pModule);
fprintf(stderr, "Cannot convert argument\n");
return 1;
}
/* pValue reference stolen here: */
PyTuple_SetItem(pArgs, i, pValue);
}
pValue = PyObject_CallObject(pFunc, pArgs);
Py_DECREF(pArgs);
if (pValue != NULL) {
printf("Result of call: %ld\n", PyInt_AsLong(pValue));
Py_DECREF(pValue);
}
else {
Py_DECREF(pFunc);
Py_DECREF(pModule);
PyErr_Print();
fprintf(stderr,"Call failed\n");
return 1;
}
}
else {
if (PyErr_Occurred())
PyErr_Print();
fprintf(stderr, "Cannot find function \"%s\"\n", argv[2]);
}
Py_XDECREF(pFunc);
Py_DECREF(pModule);
}
else {
PyErr_Print();
fprintf(stderr, "Failed to load \"%s\"\n", argv[1]);
return 1;
}
Py_Finalize();
return 0;
}

View File

@ -1,8 +0,0 @@
from distutils.core import setup, Extension
setup(name="noddy", version="1.0",
ext_modules=[
Extension("noddy", ["noddy.c"]),
Extension("noddy2", ["noddy2.c"]),
Extension("noddy3", ["noddy3.c"]),
Extension("noddy4", ["noddy4.c"]),
])

View File

@ -1,91 +0,0 @@
#include <Python.h>
typedef struct {
PyListObject list;
int state;
} Shoddy;
static PyObject *
Shoddy_increment(Shoddy *self, PyObject *unused)
{
self->state++;
return PyInt_FromLong(self->state);
}
static PyMethodDef Shoddy_methods[] = {
{"increment", (PyCFunction)Shoddy_increment, METH_NOARGS,
PyDoc_STR("increment state counter")},
{NULL, NULL},
};
static int
Shoddy_init(Shoddy *self, PyObject *args, PyObject *kwds)
{
if (PyList_Type.tp_init((PyObject *)self, args, kwds) < 0)
return -1;
self->state = 0;
return 0;
}
static PyTypeObject ShoddyType = {
PyObject_HEAD_INIT(NULL)
0, /* ob_size */
"shoddy.Shoddy", /* tp_name */
sizeof(Shoddy), /* tp_basicsize */
0, /* tp_itemsize */
0, /* tp_dealloc */
0, /* tp_print */
0, /* tp_getattr */
0, /* tp_setattr */
0, /* tp_compare */
0, /* tp_repr */
0, /* tp_as_number */
0, /* tp_as_sequence */
0, /* tp_as_mapping */
0, /* tp_hash */
0, /* tp_call */
0, /* tp_str */
0, /* tp_getattro */
0, /* tp_setattro */
0, /* tp_as_buffer */
Py_TPFLAGS_DEFAULT |
Py_TPFLAGS_BASETYPE, /* tp_flags */
0, /* tp_doc */
0, /* tp_traverse */
0, /* tp_clear */
0, /* tp_richcompare */
0, /* tp_weaklistoffset */
0, /* tp_iter */
0, /* tp_iternext */
Shoddy_methods, /* tp_methods */
0, /* tp_members */
0, /* tp_getset */
0, /* tp_base */
0, /* tp_dict */
0, /* tp_descr_get */
0, /* tp_descr_set */
0, /* tp_dictoffset */
(initproc)Shoddy_init, /* tp_init */
0, /* tp_alloc */
0, /* tp_new */
};
PyMODINIT_FUNC
initshoddy(void)
{
PyObject *m;
ShoddyType.tp_base = &PyList_Type;
if (PyType_Ready(&ShoddyType) < 0)
return;
m = Py_InitModule3("shoddy", NULL, "Shoddy module");
if (m == NULL)
return;
Py_INCREF(&ShoddyType);
PyModule_AddObject(m, "Shoddy", (PyObject *) &ShoddyType);
}

View File

@ -1,14 +0,0 @@
import sqlite3
import datetime, time
def adapt_datetime(ts):
return time.mktime(ts.timetuple())
sqlite3.register_adapter(datetime.datetime, adapt_datetime)
con = sqlite3.connect(":memory:")
cur = con.cursor()
now = datetime.datetime.now()
cur.execute("select ?", (now,))
print cur.fetchone()[0]

View File

@ -1,16 +0,0 @@
import sqlite3
class Point(object):
def __init__(self, x, y):
self.x, self.y = x, y
def __conform__(self, protocol):
if protocol is sqlite3.PrepareProtocol:
return "%f;%f" % (self.x, self.y)
con = sqlite3.connect(":memory:")
cur = con.cursor()
p = Point(4.0, -3.2)
cur.execute("select ?", (p,))
print cur.fetchone()[0]

View File

@ -1,17 +0,0 @@
import sqlite3
class Point(object):
def __init__(self, x, y):
self.x, self.y = x, y
def adapt_point(point):
return "%f;%f" % (point.x, point.y)
sqlite3.register_adapter(Point, adapt_point)
con = sqlite3.connect(":memory:")
cur = con.cursor()
p = Point(4.0, -3.2)
cur.execute("select ?", (p,))
print cur.fetchone()[0]

View File

@ -1,15 +0,0 @@
import sqlite3
def collate_reverse(string1, string2):
return -cmp(string1, string2)
con = sqlite3.connect(":memory:")
con.create_collation("reverse", collate_reverse)
cur = con.cursor()
cur.execute("create table test(x)")
cur.executemany("insert into test(x) values (?)", [("a",), ("b",)])
cur.execute("select x from test order by x collate reverse")
for row in cur:
print row
con.close()

View File

@ -1,30 +0,0 @@
# A minimal SQLite shell for experiments
import sqlite3
con = sqlite3.connect(":memory:")
con.isolation_level = None
cur = con.cursor()
buffer = ""
print "Enter your SQL commands to execute in sqlite3."
print "Enter a blank line to exit."
while True:
line = raw_input()
if line == "":
break
buffer += line
if sqlite3.complete_statement(buffer):
try:
buffer = buffer.strip()
cur.execute(buffer)
if buffer.lstrip().upper().startswith("SELECT"):
print cur.fetchall()
except sqlite3.Error, e:
print "An error occurred:", e.args[0]
buffer = ""
con.close()

View File

@ -1,3 +0,0 @@
import sqlite3
con = sqlite3.connect("mydb")

View File

@ -1,3 +0,0 @@
import sqlite3
con = sqlite3.connect(":memory:")

View File

@ -1,47 +0,0 @@
import sqlite3
class Point(object):
def __init__(self, x, y):
self.x, self.y = x, y
def __repr__(self):
return "(%f;%f)" % (self.x, self.y)
def adapt_point(point):
return "%f;%f" % (point.x, point.y)
def convert_point(s):
x, y = map(float, s.split(";"))
return Point(x, y)
# Register the adapter
sqlite3.register_adapter(Point, adapt_point)
# Register the converter
sqlite3.register_converter("point", convert_point)
p = Point(4.0, -3.2)
#########################
# 1) Using declared types
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
cur = con.cursor()
cur.execute("create table test(p point)")
cur.execute("insert into test(p) values (?)", (p,))
cur.execute("select p from test")
print "with declared types:", cur.fetchone()[0]
cur.close()
con.close()
#######################
# 1) Using column names
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_COLNAMES)
cur = con.cursor()
cur.execute("create table test(p)")
cur.execute("insert into test(p) values (?)", (p,))
cur.execute('select p as "p [point]" from test')
print "with column names:", cur.fetchone()[0]
cur.close()
con.close()

View File

@ -1,15 +0,0 @@
import sqlite3
class CountCursorsConnection(sqlite3.Connection):
def __init__(self, *args, **kwargs):
sqlite3.Connection.__init__(self, *args, **kwargs)
self.numcursors = 0
def cursor(self, *args, **kwargs):
self.numcursors += 1
return sqlite3.Connection.cursor(self, *args, **kwargs)
con = sqlite3.connect(":memory:", factory=CountCursorsConnection)
cur1 = con.cursor()
cur2 = con.cursor()
print con.numcursors

View File

@ -1,28 +0,0 @@
# Not referenced from the documentation, but builds the database file the other
# code snippets expect.
import sqlite3
import os
DB_FILE = "mydb"
if os.path.exists(DB_FILE):
os.remove(DB_FILE)
con = sqlite3.connect(DB_FILE)
cur = con.cursor()
cur.execute("""
create table people
(
name_last varchar(20),
age integer
)
""")
cur.execute("insert into people (name_last, age) values ('Yeltsin', 72)")
cur.execute("insert into people (name_last, age) values ('Putin', 51)")
con.commit()
cur.close()
con.close()

View File

@ -1,17 +0,0 @@
import sqlite3
con = sqlite3.connect("mydb")
cur = con.cursor()
SELECT = "select name_last, age from people order by age, name_last"
# 1. Iterate over the rows available from the cursor, unpacking the
# resulting sequences to yield their elements (name_last, age):
cur.execute(SELECT)
for (name_last, age) in cur:
print '%s is %d years old.' % (name_last, age)
# 2. Equivalently:
cur.execute(SELECT)
for row in cur:
print '%s is %d years old.' % (row[0], row[1])

View File

@ -1,13 +0,0 @@
import sqlite3
# Create a connection to the database file "mydb":
con = sqlite3.connect("mydb")
# Get a Cursor object that operates in the context of Connection con:
cur = con.cursor()
# Execute the SELECT statement:
cur.execute("select * from people order by age")
# Retrieve all rows as a sequence and print that sequence:
print cur.fetchall()

View File

@ -1,11 +0,0 @@
import sqlite3
con = sqlite3.connect("mydb")
cur = con.cursor()
who = "Yeltsin"
age = 72
cur.execute("select name_last, age from people where name_last=? and age=?", (who, age))
print cur.fetchone()

View File

@ -1,12 +0,0 @@
import sqlite3
con = sqlite3.connect("mydb")
cur = con.cursor()
who = "Yeltsin"
age = 72
cur.execute("select name_last, age from people where name_last=:who and age=:age",
{"who": who, "age": age})
print cur.fetchone()

View File

@ -1,12 +0,0 @@
import sqlite3
con = sqlite3.connect("mydb")
cur = con.cursor()
who = "Yeltsin"
age = 72
cur.execute("select name_last, age from people where name_last=:who and age=:age",
locals())
print cur.fetchone()

View File

@ -1,24 +0,0 @@
import sqlite3
class IterChars:
def __init__(self):
self.count = ord('a')
def __iter__(self):
return self
def next(self):
if self.count > ord('z'):
raise StopIteration
self.count += 1
return (chr(self.count - 1),) # this is a 1-tuple
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table characters(c)")
theIter = IterChars()
cur.executemany("insert into characters(c) values (?)", theIter)
cur.execute("select c from characters")
print cur.fetchall()

View File

@ -1,15 +0,0 @@
import sqlite3
def char_generator():
import string
for c in string.letters[:26]:
yield (c,)
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table characters(c)")
cur.executemany("insert into characters(c) values (?)", char_generator())
cur.execute("select c from characters")
print cur.fetchall()

View File

@ -1,24 +0,0 @@
import sqlite3
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
create table person(
firstname,
lastname,
age
);
create table book(
title,
author,
published
);
insert into book(title, author, published)
values (
'Dirk Gently''s Holistic Detective Agency',
'Douglas Adams',
1987
);
""")

View File

@ -1,16 +0,0 @@
import sqlite3
con = sqlite3.connect("mydb")
cur = con.cursor()
newPeople = (
('Lebed' , 53),
('Zhirinovsky' , 57),
)
for person in newPeople:
cur.execute("insert into people (name_last, age) values (?, ?)", person)
# The changes will not be saved unless the transaction is committed explicitly:
con.commit()

View File

@ -1,11 +0,0 @@
import sqlite3
import md5
def md5sum(t):
return md5.md5(t).hexdigest()
con = sqlite3.connect(":memory:")
con.create_function("md5", 1, md5sum)
cur = con.cursor()
cur.execute("select md5(?)", ("foo",))
print cur.fetchone()[0]

View File

@ -1,20 +0,0 @@
import sqlite3
class MySum:
def __init__(self):
self.count = 0
def step(self, value):
self.count += value
def finalize(self):
return self.count
con = sqlite3.connect(":memory:")
con.create_aggregate("mysum", 1, MySum)
cur = con.cursor()
cur.execute("create table test(i)")
cur.execute("insert into test(i) values (1)")
cur.execute("insert into test(i) values (2)")
cur.execute("select mysum(i) from test")
print cur.fetchone()[0]

View File

@ -1,8 +0,0 @@
import sqlite3
import datetime
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_COLNAMES)
cur = con.cursor()
cur.execute('select ? as "x [timestamp]"', (datetime.datetime.now(),))
dt = cur.fetchone()[0]
print dt, type(dt)

View File

@ -1,20 +0,0 @@
import sqlite3
import datetime
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES|sqlite3.PARSE_COLNAMES)
cur = con.cursor()
cur.execute("create table test(d date, ts timestamp)")
today = datetime.date.today()
now = datetime.datetime.now()
cur.execute("insert into test(d, ts) values (?, ?)", (today, now))
cur.execute("select d, ts from test")
row = cur.fetchone()
print today, "=>", row[0], type(row[0])
print now, "=>", row[1], type(row[1])
cur.execute('select current_date as "d [date]", current_timestamp as "ts [timestamp]"')
row = cur.fetchone()
print "current_date", row[0], type(row[0])
print "current_timestamp", row[1], type(row[1])

View File

@ -1,13 +0,0 @@
import sqlite3

def dict_factory(cursor, row):
    d = {}
    for idx, col in enumerate(cursor.description):
        d[col[0]] = row[idx]
    return d

con = sqlite3.connect(":memory:")
con.row_factory = dict_factory
cur = con.cursor()
cur.execute("select 1 as a")
print cur.fetchone()["a"]

View File

@ -1,12 +0,0 @@
import sqlite3

con = sqlite3.connect("mydb")
con.row_factory = sqlite3.Row

cur = con.cursor()
cur.execute("select name_last, age from people")
for row in cur:
    # sqlite3.Row allows access by index and by (case-insensitive) column name
    assert row[0] == row["name_last"]
    assert row["name_last"] == row["nAmE_lAsT"]
    assert row[1] == row["age"]
    assert row[1] == row["AgE"]

View File

@ -1,6 +0,0 @@
import sqlite3
# The shared cache is only available in SQLite versions 3.3.3 or later
# See the SQLite documentation for details.
sqlite3.enable_shared_cache(True)
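
The comment above ties the shared cache to SQLite 3.3.3 or newer. A minimal, hedged guard along those lines might look like the sketch below; it assumes nothing beyond the module-level sqlite3.sqlite_version_info attribute.

import sqlite3

# Sketch only: enable the shared cache only when the linked SQLite library
# is new enough to support it (3.3.3 or later).
if sqlite3.sqlite_version_info >= (3, 3, 3):
    sqlite3.enable_shared_cache(True)
else:
    print "shared cache not supported by SQLite", sqlite3.sqlite_version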

View File

@ -1,21 +0,0 @@
import sqlite3

persons = [
    ("Hugo", "Boss"),
    ("Calvin", "Klein")
]

con = sqlite3.connect(":memory:")

# Create the table
con.execute("create table person(firstname, lastname)")

# Fill the table
con.executemany("insert into person(firstname, lastname) values (?, ?)", persons)

# Print the table contents
for row in con.execute("select firstname, lastname from person"):
    print row

# Use a dummy WHERE clause so SQLite does not take its whole-table delete
# shortcut, which would leave rowcount meaningless.
print "I just deleted", con.execute("delete from person where 1=1").rowcount, "rows"

View File

@ -1,26 +0,0 @@
import sqlite3

FIELD_MAX_WIDTH = 20
TABLE_NAME = 'people'
SELECT = 'select * from %s order by age, name_last' % TABLE_NAME

con = sqlite3.connect("mydb")
cur = con.cursor()
cur.execute(SELECT)

# Print a header.
for fieldDesc in cur.description:
    print fieldDesc[0].ljust(FIELD_MAX_WIDTH),
print    # Finish the header with a newline.
print '-' * 78

# For each row, print the value of each field left-justified within
# the maximum possible width of that field.
fieldIndices = range(len(cur.description))
for row in cur:
    for fieldIndex in fieldIndices:
        fieldValue = str(row[fieldIndex])
        print fieldValue.ljust(FIELD_MAX_WIDTH),
    print    # Finish the row with a newline.

View File

@ -1,42 +0,0 @@
import sqlite3
con = sqlite3.connect(":memory:")
cur = con.cursor()
# Create the table
con.execute("create table person(lastname, firstname)")
AUSTRIA = u"\xd6sterreich"
# by default, rows are returned as Unicode
cur.execute("select ?", (AUSTRIA,))
row = cur.fetchone()
assert row[0] == AUSTRIA
# but we can make pysqlite always return bytestrings ...
con.text_factory = str
cur.execute("select ?", (AUSTRIA,))
row = cur.fetchone()
assert type(row[0]) == str
# the bytestrings will be encoded in UTF-8, unless you stored garbage in the
# database ...
assert row[0] == AUSTRIA.encode("utf-8")
# we can also implement a custom text_factory ...
# here we implement one that will ignore Unicode characters that cannot be
# decoded from UTF-8
con.text_factory = lambda x: unicode(x, "utf-8", "ignore")
cur.execute("select ?", ("this is latin1 and would normally create errors" + u"\xe4\xf6\xfc".encode("latin1"),))
row = cur.fetchone()
assert type(row[0]) == unicode
# pysqlite offers a builtin optimized text_factory that will return bytestring
# objects, if the data is in ASCII only, and otherwise return unicode objects
con.text_factory = sqlite3.OptimizedUnicode
cur.execute("select ?", (AUSTRIA,))
row = cur.fetchone()
assert type(row[0]) == unicode
cur.execute("select ?", ("Germany",))
row = cur.fetchone()
assert type(row[0]) == str

View File

@ -1,213 +0,0 @@
"""Test module for the noddy examples
Noddy 1:
>>> import noddy
>>> n1 = noddy.Noddy()
>>> n2 = noddy.Noddy()
>>> del n1
>>> del n2
Noddy 2
>>> import noddy2
>>> n1 = noddy2.Noddy('jim', 'fulton', 42)
>>> n1.first
'jim'
>>> n1.last
'fulton'
>>> n1.number
42
>>> n1.name()
'jim fulton'
>>> n1.first = 'will'
>>> n1.name()
'will fulton'
>>> n1.last = 'tell'
>>> n1.name()
'will tell'
>>> del n1.first
>>> n1.name()
Traceback (most recent call last):
...
AttributeError: first
>>> n1.first
Traceback (most recent call last):
...
AttributeError: first
>>> n1.first = 'drew'
>>> n1.first
'drew'
>>> del n1.number
Traceback (most recent call last):
...
TypeError: can't delete numeric/char attribute
>>> n1.number=2
>>> n1.number
2
>>> n1.first = 42
>>> n1.name()
'42 tell'
>>> n2 = noddy2.Noddy()
>>> n2.name()
' '
>>> n2.first
''
>>> n2.last
''
>>> del n2.first
>>> n2.first
Traceback (most recent call last):
...
AttributeError: first
>>> n2.first
Traceback (most recent call last):
...
AttributeError: first
>>> n2.name()
Traceback (most recent call last):
File "<stdin>", line 1, in ?
AttributeError: first
>>> n2.number
0
>>> n3 = noddy2.Noddy('jim', 'fulton', 'waaa')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: an integer is required
>>> del n1
>>> del n2
Noddy 3
>>> import noddy3
>>> n1 = noddy3.Noddy('jim', 'fulton', 42)
>>> n1 = noddy3.Noddy('jim', 'fulton', 42)
>>> n1.name()
'jim fulton'
>>> del n1.first
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: Cannot delete the first attribute
>>> n1.first = 42
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: The first attribute value must be a string
>>> n1.first = 'will'
>>> n1.name()
'will fulton'
>>> n2 = noddy3.Noddy()
>>> n2 = noddy3.Noddy()
>>> n2 = noddy3.Noddy()
>>> n3 = noddy3.Noddy('jim', 'fulton', 'waaa')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: an integer is required
>>> del n1
>>> del n2
Noddy 4
>>> import noddy4
>>> n1 = noddy4.Noddy('jim', 'fulton', 42)
>>> n1.first
'jim'
>>> n1.last
'fulton'
>>> n1.number
42
>>> n1.name()
'jim fulton'
>>> n1.first = 'will'
>>> n1.name()
'will fulton'
>>> n1.last = 'tell'
>>> n1.name()
'will tell'
>>> del n1.first
>>> n1.name()
Traceback (most recent call last):
...
AttributeError: first
>>> n1.first
Traceback (most recent call last):
...
AttributeError: first
>>> n1.first = 'drew'
>>> n1.first
'drew'
>>> del n1.number
Traceback (most recent call last):
...
TypeError: can't delete numeric/char attribute
>>> n1.number=2
>>> n1.number
2
>>> n1.first = 42
>>> n1.name()
'42 tell'
>>> n2 = noddy4.Noddy()
>>> n2 = noddy4.Noddy()
>>> n2 = noddy4.Noddy()
>>> n2 = noddy4.Noddy()
>>> n2.name()
' '
>>> n2.first
''
>>> n2.last
''
>>> del n2.first
>>> n2.first
Traceback (most recent call last):
...
AttributeError: first
>>> n2.first
Traceback (most recent call last):
...
AttributeError: first
>>> n2.name()
Traceback (most recent call last):
File "<stdin>", line 1, in ?
AttributeError: first
>>> n2.number
0
>>> n3 = noddy4.Noddy('jim', 'fulton', 'waaa')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: an integer is required
Test cyclic gc(?)
>>> import gc
>>> gc.disable()
>>> x = []
>>> l = [x]
>>> n2.first = l
>>> n2.first
[[]]
>>> l.append(n2)
>>> del l
>>> del n1
>>> del n2
>>> sys.getrefcount(x)
3
>>> ignore = gc.collect()
>>> sys.getrefcount(x)
2
>>> gc.enable()
"""
import os
import sys
from distutils.util import get_platform
PLAT_SPEC = "%s-%s" % (get_platform(), sys.version[0:3])
src = os.path.join("build", "lib.%s" % PLAT_SPEC)
sys.path.append(src)
if __name__ == "__main__":
    import doctest, __main__
    doctest.testmod(__main__)

View File

@ -1,76 +0,0 @@
typedef struct _typeobject {
    PyObject_VAR_HEAD
    char *tp_name; /* For printing, in format "<module>.<name>" */
    int tp_basicsize, tp_itemsize; /* For allocation */

    /* Methods to implement standard operations */
    destructor tp_dealloc;
    printfunc tp_print;
    getattrfunc tp_getattr;
    setattrfunc tp_setattr;
    cmpfunc tp_compare;
    reprfunc tp_repr;

    /* Method suites for standard classes */
    PyNumberMethods *tp_as_number;
    PySequenceMethods *tp_as_sequence;
    PyMappingMethods *tp_as_mapping;

    /* More standard operations (here for binary compatibility) */
    hashfunc tp_hash;
    ternaryfunc tp_call;
    reprfunc tp_str;
    getattrofunc tp_getattro;
    setattrofunc tp_setattro;

    /* Functions to access object as input/output buffer */
    PyBufferProcs *tp_as_buffer;

    /* Flags to define presence of optional/expanded features */
    long tp_flags;

    char *tp_doc; /* Documentation string */

    /* Assigned meaning in release 2.0 */
    /* call function for all accessible objects */
    traverseproc tp_traverse;

    /* delete references to contained objects */
    inquiry tp_clear;

    /* Assigned meaning in release 2.1 */
    /* rich comparisons */
    richcmpfunc tp_richcompare;

    /* weak reference enabler */
    long tp_weaklistoffset;

    /* Added in release 2.2 */
    /* Iterators */
    getiterfunc tp_iter;
    iternextfunc tp_iternext;

    /* Attribute descriptor and subclassing stuff */
    struct PyMethodDef *tp_methods;
    struct PyMemberDef *tp_members;
    struct PyGetSetDef *tp_getset;
    struct _typeobject *tp_base;
    PyObject *tp_dict;
    descrgetfunc tp_descr_get;
    descrsetfunc tp_descr_set;
    long tp_dictoffset;
    initproc tp_init;
    allocfunc tp_alloc;
    newfunc tp_new;
    freefunc tp_free; /* Low-level free-memory routine */
    inquiry tp_is_gc; /* For PyObject_IS_GC */
    PyObject *tp_bases;
    PyObject *tp_mro; /* method resolution order */
    PyObject *tp_cache;
    PyObject *tp_subclasses;
    PyObject *tp_weaklist;
} PyTypeObject;

View File

@ -1,139 +0,0 @@
from datetime import tzinfo, timedelta, datetime

ZERO = timedelta(0)
HOUR = timedelta(hours=1)

# A UTC class.

class UTC(tzinfo):
    """UTC"""

    def utcoffset(self, dt):
        return ZERO

    def tzname(self, dt):
        return "UTC"

    def dst(self, dt):
        return ZERO

utc = UTC()

# A class building tzinfo objects for fixed-offset time zones.
# Note that FixedOffset(0, "UTC") is a different way to build a
# UTC tzinfo object.

class FixedOffset(tzinfo):
    """Fixed offset in minutes east from UTC."""

    def __init__(self, offset, name):
        self.__offset = timedelta(minutes=offset)
        self.__name = name

    def utcoffset(self, dt):
        return self.__offset

    def tzname(self, dt):
        return self.__name

    def dst(self, dt):
        return ZERO

# A class capturing the platform's idea of local time.

import time as _time

STDOFFSET = timedelta(seconds=-_time.timezone)
if _time.daylight:
    DSTOFFSET = timedelta(seconds=-_time.altzone)
else:
    DSTOFFSET = STDOFFSET

DSTDIFF = DSTOFFSET - STDOFFSET

class LocalTimezone(tzinfo):

    def utcoffset(self, dt):
        if self._isdst(dt):
            return DSTOFFSET
        else:
            return STDOFFSET

    def dst(self, dt):
        if self._isdst(dt):
            return DSTDIFF
        else:
            return ZERO

    def tzname(self, dt):
        return _time.tzname[self._isdst(dt)]

    def _isdst(self, dt):
        tt = (dt.year, dt.month, dt.day,
              dt.hour, dt.minute, dt.second,
              dt.weekday(), 0, -1)
        stamp = _time.mktime(tt)
        tt = _time.localtime(stamp)
        return tt.tm_isdst > 0

Local = LocalTimezone()

# A complete implementation of current DST rules for major US time zones.

def first_sunday_on_or_after(dt):
    days_to_go = 6 - dt.weekday()
    if days_to_go:
        dt += timedelta(days_to_go)
    return dt

# In the US, DST starts at 2am (standard time) on the first Sunday in April.
DSTSTART = datetime(1, 4, 1, 2)
# and ends at 2am (DST time; 1am standard time) on the last Sunday of Oct.
# which is the first Sunday on or after Oct 25.
DSTEND = datetime(1, 10, 25, 1)

class USTimeZone(tzinfo):

    def __init__(self, hours, reprname, stdname, dstname):
        self.stdoffset = timedelta(hours=hours)
        self.reprname = reprname
        self.stdname = stdname
        self.dstname = dstname

    def __repr__(self):
        return self.reprname

    def tzname(self, dt):
        if self.dst(dt):
            return self.dstname
        else:
            return self.stdname

    def utcoffset(self, dt):
        return self.stdoffset + self.dst(dt)

    def dst(self, dt):
        if dt is None or dt.tzinfo is None:
            # An exception may be sensible here, in one or both cases.
            # It depends on how you want to treat them.  The default
            # fromutc() implementation (called by the default astimezone()
            # implementation) passes a datetime with dt.tzinfo is self.
            return ZERO
        assert dt.tzinfo is self

        # Find first Sunday in April & the last in October.
        start = first_sunday_on_or_after(DSTSTART.replace(year=dt.year))
        end = first_sunday_on_or_after(DSTEND.replace(year=dt.year))

        # Can't compare naive to aware objects, so strip the timezone from
        # dt first.
        if start <= dt.replace(tzinfo=None) < end:
            return HOUR
        else:
            return ZERO

Eastern  = USTimeZone(-5, "Eastern",  "EST", "EDT")
Central  = USTimeZone(-6, "Central",  "CST", "CDT")
Mountain = USTimeZone(-7, "Mountain", "MST", "MDT")
Pacific  = USTimeZone(-8, "Pacific",  "PST", "PDT")
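
A hedged usage sketch, not part of the original file: assuming the utc and Eastern objects defined above are in scope, an aware datetime in the Eastern zone can be converted to UTC as follows (the chosen date and the printed offsets are illustrative only).

# Sketch only: relies on the datetime import plus utc and Eastern defined above.
dt = datetime(2007, 8, 15, 14, 30, tzinfo=Eastern)   # DST (EDT, UTC-4) in effect
print dt.isoformat()                   # 2007-08-15T14:30:00-04:00
print dt.astimezone(utc).isoformat()   # 2007-08-15T18:30:00+00:00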
