Have you investigated how to make your API forward and backwards compatible, so that you can make changes to your API without affecting your current clients? Did you cry yourself to sleep shortly afterwards?
It’s really difficult to be confident about API compatibility because:
- You are planning for an unknown future.
- You are scared that you won’t be able to support unknown future changes.
- You are not omniscient.
- You are not a time-traveler.
- There are many convincing arguments about how to handle change in your API.
- All the convincing arguments conflict with each other.
My customers will be angry if I break my API.
When you release your Web API, it’s carved into stone. It’s a scary commitment to never make an incompatible change. If you fail, you’ll have irate customers yelling in your inbox, followed by your boss, and then your boss’s boss. You have to support this API. Forever. Unless you version it, right?
After publishing The Web API Checklist, I received comments (#1, #2) regarding API versioning. Before you struggle with how to version your API, I want you to know how to design your API to avoid future incompatibilities.
If my API is well designed, it won’t be fragile.
It is possible to design your API in a manner that reduces its fragility and increases its resilience to change. The key is to design your API around its intent. In the SOA world, this is also referred to as business-orientation. It’s a difficult design concept to understand, best explained with a fictitious example:
When you’re designing, testing, or releasing a new Web API, you’re building a new system on top of an existing complex and sophisticated system. At a minimum, you’re building upon HTTP, which is built upon TCP/IP, which is built upon a series of tubes. You’re also building upon a web server, an application framework, and maybe an API framework.
Most people, myself included, are not aware of all the intricacies and nuances of every component they’re building upon. Even if you deeply understand each component, it’s probably going to be too much information to hold in your head at one time.
“We know there are known knowns: there are things we know we know. We also know there are known unknowns: that is to say we know there are things we know we don’t know. But there are also unknown unknowns — the ones we don’t know we don’t know.” —Defense Secretary Donald Rumsfeld, Defense Department briefing, Feb 12, 2002
I don’t want you to build an API and only know about your unknown unknowns when they bite you in the ass. So, here’s a list of a bunch of things, both obvious and subtle, that can easily be missed when designing, testing, implementing, and releasing a Web API.
pyPdf and pg8000 have been ported to run under Python 3.0a1, in new Mercurial repository branches.
pg8000 is a Pure-Python database driver for PostgreSQL, compatible with the standard DB API (although under Python 3.0, the Binary object expects a bytes argument). pg8000 does not yet support every standard PostgreSQL data type, but it supports some of the most common data types.
pyPdf is a Pure-Python PDF toolkit. It is capable of reading and writing PDF files, and can be easily used for operations like splitting and merging PDF files.
I am pretty happy with the upgrade to Python 3.0a1. The 2to3 conversion utility provides a good start for some of the most mechanical of changes. pyPdf and pg8000 used strings as byte buffers pretty extensively, especially pyPdf, and so the changes were pretty extensive.
Having a good test suite is essential to the upgrade process. That was why I chose these two projects to start with, as I have a pretty good pg8000 test suite, and a very comprehensive pyPdf suite. After running 2to3 on the source code, it was just a matter of beating the code into order until all the tests run. It took about 4 hours per project, but many projects wouldn’t have as many changes as these projects have.
There are a couple of unexpected behaviours (in my opinion) regarding the new bytes type:
- b"xref" != b"x". Getting a single item out of a bytes type returns an integer, which fails to compare with a bytes instance of a length 1.
- b"x" == "x" throws an exception, rather than returning False. This exception is useful for finding places where byte/string comparisons are being done by mistake, but I ran into one instance where I wanted to compare these objects and have it be false. It was easy to code around.
- You can’t derive a class from bytes. I hope that this will be fixed in future releases, since pyPdf’s StringObject class derived from str previously. (It can’t derive from str now, since the PDF files have no encoding information for strings [that I know of…])
Good work on Python 3.0a1, developers! I love the separation of strings and byte arrays, even though it took me a lot of work to fix up these couple of projects. It’s the right way to do things.
A new version of pg8000, a Pure-Python interface for the PostgreSQL database, has been released today. This version supports DB-API 2.0 as documented in PEP-249. The request to add DB-API support to pg8000 was the biggest thing I heard about over the last pg8000 release.
Also new in version 1.02 is SSL support, datetime parameter input, comprehensive unit tests, and bytea object support.