Embedding Python Tips

Python is a beautiful programming language. One of its most wonderful features is a very clean and simple C API that allows Python to be extended with dynamically loadable C modules. That same C API also allows Python to be embedded in other pieces of software. This means that any program can allow the user to enter Python code interactively (or otherwise) to affect the program in whatever way they wish. This is a powerful capability, but using it occasionally requires a few tricks to accomplish the embedder’s goals.

Today’s embedding exercise was allowing a MOO server to execute arbitrary Python code:

;py_runfunction({"import math", "return math.sqrt(2*2*2)"})
> 2.8284271247461903

Of course, a MOO server can already do square roots… that wasn’t the point. There was no point. Anyways, here are a few ideas that might help other people embed Python in a useful way.

Evaluating Statements

One of the first things most people try to do is evaluate an arbitrary statement and get its return value. This is not quite as easy as it sounds. Although Python’s eval builtin does this, it may be more limited than the embedding programmer desires. eval will only permit an expression to be evaluated, not a statement:

>>> eval("x = 2")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<string>", line 1
    x = 2
      ^
SyntaxError: invalid syntax
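
For comparison, here is a small sketch of the expression/statement split from inside Python itself. It assumes Python 3, where exec is a function, and the name namespace is just an illustrative dictionary, not something from the MOO code:

```python
# eval() accepts only expressions; exec() accepts statements.
namespace = {}

try:
    eval("x = 2", namespace)   # an assignment is a statement
    eval_failed = False
except SyntaxError:
    eval_failed = True

exec("x = 2", namespace)            # a statement: fine for exec
result = eval("x * 21", namespace)  # an expression: fine for eval
```

Note that exec runs the statement but hands nothing back, which is why getting a return value out of a block of statements takes a little more work.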

I suggest that if you want the user to be able to evaluate an arbitrary block of code, wrap an artificial function around it and call the function itself:

def f():
    import math
    class Cylinder:
        def _calcVolume(self):
            return math.pi * \
                self.radius**2 * \
                self.height
        volume = property(_calcVolume)
    c = Cylinder()
    c.radius = 12.2
    c.height = 16.12
    return c.volume

This allows the user to input much more complex functions, like the above example which uses a class and an import statement. All that needed to be artificially added was the ‘def f():’ and an arbitrary but constant amount of whitespace in front of each line of code.
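
A minimal sketch of that wrapping step, written in Python 3 rather than through the C API. The helper name run_as_function and its arguments are made up for illustration; the real py_runfunction may well work differently:

```python
def run_as_function(lines, name="f"):
    """Wrap source lines in `def <name>():`, define it, then call it."""
    # Indent every user line by a constant amount so it becomes the body.
    body = "\n".join("    " + line for line in lines) or "    pass"
    source = "def {}():\n{}\n".format(name, body)

    namespace = {}            # fresh globals for this one compile
    exec(source, namespace)   # defines the wrapper function
    return namespace[name]()  # call it and hand back its return value

value = run_as_function(["import math", "return math.sqrt(2*2*2)"])
```

One caveat with blind indentation: a multi-line string literal in the user's code would pick up the extra leading whitespace on its continuation lines.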

Compiling Code without a Module

So you’ve gotten some code from a user, and you want to compile it. Maybe you’re creating a function to wrap around the user’s code. Where does that function belong? Where do you evaluate your code?

The first instinct I had was to use PyImport_AddModule to get the __main__ module and start importing functions into its module dictionary. I had a block of code similar to this (error checking omitted):

Py_Initialize();
PyObject* module = PyImport_AddModule("__main__");
PyObject* moduleDict = PyModule_GetDict(module);
PyObject* compileRetval = PyRun_String(code, Py_file_input,
    moduleDict, moduleDict);
...
Py_Finalize();

This then allowed me to call functions on the module object and get some code back. The only real downside was the initialize and finalize around my code. I didn’t want code from one compile to mess with another, and since I was using the __main__ module, this caused problems. Eventually I decided to use random strings as the names for my modules so that I could use them all independently, but that sure was ugly.

The solution I stumbled upon was caused by my accidentally deleting some lines of code. I eventually realized that I didn’t need the module object at all. I could create a new, empty dictionary, and compile the code ‘into’ that:

PyObject* dict = PyDict_New();
PyObject* compileRetval = PyRun_String(code, Py_file_input, dict, dict);

Everything continued to work as before, except now I had to PyDict_GetItem out of dict and use PyObject_CallObject rather than the PyObject_CallMethod that I could have used before. But nothing crashed, the world continued to run, and I no longer needed to initialize and finalize around my evaluation. Yay!
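
The same idea can be tried out from Python 3, as a rough analogue of the C calls above (exec standing in for PyRun_String with Py_file_input, a plain dictionary lookup for PyDict_GetItem). The dictionary names are illustrative only:

```python
# Two independent compiles, each into its own empty dictionary.
first = {}
second = {}

exec("def f():\n    return 'first'\n", first)
exec("def f():\n    return 'second'\n", second)

# Pull each function out of its dictionary and call it.
results = (first["f"](), second["f"]())
```

Both compiles define a function called f, yet neither overwrites the other, which is the isolation that random module names were being used to fake.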

Setting `__builtins__`

There was one minor problem. Some functionality, like builtin functions and classes (Exception, for example), was missing. Oops. The fix was to put __builtins__ into the dictionary by hand when it wasn’t there:

// Check for __builtins__...
if (PyDict_GetItemString(dict, "__builtins__") == NULL)
{
    // Hm... no __builtins__ eh?
    PyObject* builtinMod = PyImport_ImportModule("__builtin__");
    if (builtinMod == NULL ||
        PyDict_SetItemString(dict, "__builtins__", builtinMod) != 0)
    {
        Py_XDECREF(builtinMod);  // may be NULL if the import failed
        Py_DECREF(dict);
        // error handling
        return;
    }
    Py_DECREF(builtinMod);
}

Hey, that fixed that right up.

I had this problem when I was using random names for modules, as well. It seems PyImport_AddModule does not set __builtins__ on a new module, but it is always set up on __main__.
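
To see what a namespace without usable builtins looks like, here is a hedged Python 3 sketch. At the Python level, exec is documented to insert the __builtins__ key itself when it is absent, so the second half supplies a deliberately empty one to reproduce the symptom. The variable names are illustrative:

```python
auto = {}
exec("n = len('abc')", auto)   # exec() adds "__builtins__" to the dict itself
has_builtins = "__builtins__" in auto

bare = {"__builtins__": {}}    # deliberately empty builtins
try:
    exec("n = len('abc')", bare)
    missing = False
except NameError:              # len cannot be found without builtins
    missing = True
```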

Getting Tracebacks using the `traceback` Module

What happened when things went wrong? Well, at first, a lot of crashing. And things were going wrong a lot, especially when I was trying to use modules that didn’t exist in the system. Heh heh.

Thankfully, Python will set up tracebacks that are useful even when you’re using the C API and screwing things up from the inside. How on earth do you get at those tracebacks, though? You can get a lot of information from the PyErr_* family of functions, but not a properly formatted Python traceback to display to the user. Eventually, I ended up using the traceback module itself to display an error:

char* getPythonTraceback()
{
    // Python equivalent:
    // import traceback, sys
    // return "".join(traceback.format_exception(sys.exc_type,
    //    sys.exc_value, sys.exc_traceback))

    PyObject *type, *value, *traceback;
    PyObject *tracebackModule;
    char *chrRetval = NULL;

    // Takes ownership of the pending error; all three may be NULL.
    PyErr_Fetch(&type, &value, &traceback);

    tracebackModule = PyImport_ImportModule("traceback");
    if (tracebackModule != NULL)
    {
        PyObject *tbList, *emptyString, *strRetval = NULL;

        tbList = PyObject_CallMethod(
            tracebackModule,
            "format_exception",
            "OOO",
            type == NULL ? Py_None : type,
            value == NULL ? Py_None : value,
            traceback == NULL ? Py_None : traceback);

        emptyString = PyString_FromString("");
        if (tbList != NULL && emptyString != NULL)
            strRetval = PyObject_CallMethod(emptyString, "join",
                "O", tbList);

        if (strRetval != NULL)
            chrRetval = strdup(PyString_AsString(strRetval));
        else
            chrRetval = strdup("Unable to format traceback.");

        Py_XDECREF(tbList);
        Py_XDECREF(emptyString);
        Py_XDECREF(strRetval);
        Py_DECREF(tracebackModule);
    }
    else
    {
        chrRetval = strdup("Unable to import traceback module.");
    }

    // Drop any error raised while formatting, then release the originals.
    PyErr_Clear();
    Py_XDECREF(type);
    Py_XDECREF(value);
    Py_XDECREF(traceback);

    return chrRetval;
}

Of course, when one can’t import the traceback module, one can’t generate a traceback explaining why not. 🙂
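
For reference, the pure-Python side of the same trick looks roughly like this in Python 3, where sys.exc_info() replaces the old sys.exc_type trio. The helper name format_current_exception and the bogus module name are invented for the example:

```python
import sys
import traceback

def format_current_exception():
    """Return the exception being handled as one formatted string."""
    exc_type, exc_value, exc_tb = sys.exc_info()
    if exc_type is None:
        return ""  # nothing is being handled right now
    return "".join(traceback.format_exception(exc_type, exc_value, exc_tb))

try:
    # Provoke the kind of failure described above: a module that isn't there.
    exec("import module_that_does_not_exist_xyz", {})
except ImportError:
    text = format_current_exception()
```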
