March 2007 archive

pg8000 v1.02

A new version of pg8000, a Pure-Python interface for the PostgreSQL database, has been released today. This version supports DB-API 2.0 as documented in PEP 249. DB-API support was the most common request I heard after the last pg8000 release.

Also new in version 1.02 are SSL support, datetime parameter input, comprehensive unit tests, and bytea object support.
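
For readers who have not used DB-API 2.0 before, here is a minimal sketch of the connect / cursor / execute / fetch shape that PEP 249 describes. It uses the standard library's sqlite3 module purely as a stand-in, because this post does not show pg8000's own module path or connect() arguments. The table name and data are made up for illustration, and the parameter marker style (`?` here) varies between DB-API modules.

```python
import sqlite3  # stand-in DB-API 2.0 module; this is NOT pg8000

# PEP 249 core shape: connect() returns a connection, which hands out cursors.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# The "book" table and its rows are hypothetical example data.
cur.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
cur.executemany("INSERT INTO book (title) VALUES (?)", [("Dune",), ("Emma",)])
conn.commit()

# Queries go through the cursor; results come back as sequences of rows.
cur.execute("SELECT id, title FROM book ORDER BY id")
rows = cur.fetchall()

cur.close()
conn.close()
```

Code written against this shape should need only its connect() call and parameter markers adjusted to move between DB-API modules, which is the main appeal of the standard.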

pg8000 v1.00 — a new PostgreSQL/Python interface

pg8000 is a Pure-Python interface to the PostgreSQL database engine. Yesterday, it was released to the public for the first time.

pg8000’s name comes from the belief that it is probably about the 8000th PostgreSQL interface for Python. However, pg8000 is somewhat distinctive in that it is written entirely in Python and does not rely on any external libraries (such as a compiled Python module, or PostgreSQL’s libpq library). As such, it is quite small and easy to deploy. It is suitable for distribution where a compiled libpq might not be available, and it is a great alternative to shipping libpq with your package.

Why use pg8000?

  • No external dependencies other than Python’s standard library.
  • Pretty cool to hack on, since it is 100% Python with no C involved.
  • Being entirely written in Python means it should work with Jython, PyPy, or IronPython without too much difficulty.
  • libpq reads the entire result set into memory immediately following a query. pg8000 uses cursors to read chunks of rows into memory, attempting to find a balance between speed and memory usage for large datasets. You could accomplish this yourself using libpq by declaring cursors and then executing them to read rows, but this has two disadvantages:
    • You have to do it yourself.
    • You have to know when your query returns rows, because you can’t DECLARE CURSOR on an INSERT, UPDATE, DELETE, CREATE, ALTER, etc.
  • pg8000 offers objects to represent prepared statements. This makes them easy to use, which should increase their usage and improve your application’s performance.
  • It has some pretty nice documentation, I think.
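
To illustrate the chunked-reading point in the list above, here is a small sketch of consuming a result set a fixed number of rows at a time so that the whole result never has to sit in one list. It is not pg8000's implementation and says nothing about its internals. It uses sqlite3 as a stand-in DB-API module, and the helper name `iter_rows`, the table, and the chunk size are all hypothetical.

```python
import sqlite3  # stand-in DB-API 2.0 module; this is NOT pg8000


def iter_rows(cursor, chunk_size=100):
    """Yield rows one by one, pulling them from the cursor in chunks."""
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:  # an empty chunk means the result set is exhausted
            return
        for row in chunk:
            yield row


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE n (v INTEGER)")
conn.executemany("INSERT INTO n (v) VALUES (?)", [(i,) for i in range(250)])

cur = conn.cursor()
cur.execute("SELECT v FROM n ORDER BY v")

# At most chunk_size rows are held by iter_rows at any moment.
total = sum(row[0] for row in iter_rows(cur, chunk_size=100))
conn.close()
```

The chunk size is the knob that trades speed against memory: larger chunks mean fewer round trips, smaller chunks mean less data held at once.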

Now, that being said, reality kicks in. Here’s why not to use pg8000:

  • It’s pretty new. This means there are likely bugs that haven’t been found yet. It will mature over the next couple of weeks with some community feedback and some internal testing.
  • It doesn’t support the DB-API interface. I didn’t want to limit myself to DB-API, so I created a slightly different interface that made more sense to me. I intend to include a DB-API wrapper in the next release, v1.01.
  • It isn’t thread-safe. When a sequence of messages needs to be sent to the PG backend, it often needs to occur in a given order. The next release, v1.01, will address this by protecting critical areas of the code.
  • It doesn’t support every PostgreSQL type, or even the majority of them. Notably lacking are: parameter send for float, datetime, decimal, interval; data receive for interval. This will just be a matter of time as well, and hopefully some user patches to add more functions. For the case of interval, I expect to optionally link in mxDateTime, but have a reasonable fallback if it is not available.
  • It doesn’t support UNIX sockets for connection to the PostgreSQL backend. I just don’t quite know how to reliably find the socket location. It seems that information is compiled into libpq. Support could be added very easily if it was just assumed that the socket location was provided by the user.
  • It only supports authentication to the PG backend via trust, ident, or md5 hashed password.
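
The thread-safety item in the list above comes down to keeping one thread's sequence of messages from being interleaved with another's. A rough sketch of "protecting critical areas" with a lock might look like the following. The `FakeConnection` class, its `run_query` method, and the three step names are hypothetical illustrations, not pg8000 code, and nothing here talks to a real backend.

```python
import threading


class FakeConnection:
    """Hypothetical stand-in for a backend connection; it only records messages."""

    def __init__(self):
        self.sent = []                 # messages in the order they were "sent"
        self._lock = threading.Lock()  # guards the whole message sequence

    def run_query(self, tag):
        # Critical section: the three messages for one query must go out
        # back to back, with nothing from another thread in between.
        with self._lock:
            for step in ("parse", "bind", "execute"):
                self.sent.append((tag, step))


conn = FakeConnection()
threads = [threading.Thread(target=conn.run_query, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Holding the lock for the whole sequence, rather than per message, is what guarantees the ordering; the cost is that queries on a shared connection run one at a time.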

pg8000’s website is http://pybrary.net/pg8000/. The source code is directly accessible through SVN at http://svn.pybrary.net/pg8000/.