Embedding Python Tips
Python is a beautiful programming language. One of its most wonderful features is a very clean and simple C API that allows Python to be extended with dynamically loadable C modules. That same C API also allows Python to be embedded in other pieces of software. This means that any program can allow the user to enter Python code interactively (or otherwise) to affect the program in whatever way they wish. This is a powerful capability, but using it occasionally requires a few tricks to accomplish the embedder’s goals.
Today’s embedding exercise was allowing a MOO server to execute arbitrary Python code:
;py_runfunction({"import math", "return math.sqrt(2*2*2)"}) > 2.8284271247461903
Of course, a MOO server can already do square roots… that wasn’t the point. There was no point. Anyways, here are a few ideas that might help other people embed Python in a useful way.
Evaluating Statements
One of the first things most people try to do is evaluate an arbitrary statement and get its return value. This is not quite as easy as it sounds. Although Python’s eval builtin does this, it may be more limited than the embedding programmer desires. eval will only permit an expression to be evaluated, not a statement:
>>> eval("x = 2")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<string>", line 1
    x = 2
      ^
SyntaxError: invalid syntax
I suggest that if you want the user to be able to evaluate an arbitrary block of code, wrap an artificial function around it and call the function itself:
This allows the user to input much more complex functions, like the above example which uses a class and an import statement. All that needed to be artificially added was the ‘def f():’ and an arbitrary but constant amount of whitespace in front of each line of code.
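A minimal sketch of the wrapping idea, shown at the Python level rather than through the C API. The helper name run_wrapped and the four-space indent are assumptions made for illustration; an embedding program would build the same source string in C before compiling it.

```python
import textwrap


def run_wrapped(lines, func_name="f"):
    """Wrap user-supplied lines in an artificial function and call it.

    run_wrapped is a hypothetical helper, not part of any Python API.
    """
    # An empty body is a syntax error, so fall back to a no-op.
    lines = list(lines) or ["pass"]
    # Prefix every line with a constant amount of whitespace so that it
    # sits inside the body of the artificial def.
    body = textwrap.indent("\n".join(lines), "    ")
    source = f"def {func_name}():\n{body}\n"
    namespace = {}
    exec(compile(source, "<user code>", "exec"), namespace)
    # The def statement stored the function object in the namespace.
    return namespace[func_name]()


result = run_wrapped(["import math", "return math.sqrt(2*2*2)"])
print(result)
```

Because every line is indented blindly, a multi-line string literal in the user's code would pick up the extra whitespace as well; a real implementation may want to handle that case.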
Compiling Code without a Module
So you’ve gotten some code from a user, and you want to compile it. Maybe you’re creating a function to wrap around the user’s code. Where does that function belong? Where do you evaluate your code?
The first instinct I had was to use PyImport_AddModule to get the __main__ module and start importing functions into its module dictionary. I had a block of code similar to this (error checking omitted):
This then allowed me to call functions on the module object and get some code back. The only real downside was the initialize and finalize around my code. I didn’t want code from one compile to mess with another, and since I was using the __main__ module, this caused problems. Eventually I decided to use random strings as the names for my modules so that I could use them all independently, but that sure was ugly.
The solution I stumbled upon was caused by my accidentally deleting some lines of code. I eventually realized that I didn’t need the module object at all. I could create a new, empty dictionary, and compile the code ‘into’ that:
Everything continued to work as before, except now I had to PyDict_GetItem out of dict and use PyObject_CallObject rather than the PyObject_CallMethod that I could have used before. But nothing crashed, the world continued to run, and I no longer needed to initialize and finalize around my evaluation. Yay!
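The same pattern can be sketched in Python, where a plain dict plays the role of the dictionary handed to the C API. The helper name compile_into_dict is an assumption for illustration. Looking the function up in the dictionary and then calling it corresponds roughly to the PyDict_GetItem and PyObject_CallObject steps described above.

```python
def compile_into_dict(source, filename="<user code>"):
    """Compile source and run it with a fresh dict as its globals.

    compile_into_dict is a hypothetical helper, not a Python API.
    """
    namespace = {}
    code = compile(source, filename, "exec")
    exec(code, namespace)
    return namespace


# Two independent compiles: each gets its own dictionary, so the two
# definitions of counter and f never see each other.
first = compile_into_dict("counter = 1\ndef f():\n    return counter\n")
second = compile_into_dict("counter = 2\ndef f():\n    return counter\n")

# Fetch the function out of the dictionary, then call it.
print(first["f"](), second["f"]())
```

Since no module object is involved, nothing has to be registered under a unique name, which is what made the random module names unnecessary.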
Setting __builtins__
There was one minor problem. Some functionality like builtin functions and classes (like Exception) was missing. Oops:
// Check for __builtins__...
if (PyDict_GetItemString(dict, "__builtins__") == NULL)
{
    // Hm... no __builtins__ eh?
    // ("__builtin__" is the Python 2 name; Python 3 calls this module "builtins".)
    PyObject* builtinMod = PyImport_ImportModule("__builtin__");
    if (builtinMod == NULL ||
        PyDict_SetItemString(dict, "__builtins__", builtinMod) != 0)
    {
        Py_DECREF(dict);
        Py_XDECREF(builtinMod);  // may be NULL if the import failed
        // error handling
        return;
    }
    Py_DECREF(builtinMod);  // dict now holds its own reference
}
Hey, that fixed that right up.
I had this problem when I was using random names for modules, as well. It seems PyImport_AddModule does not set __builtins__ on a new module, but it is always set up on __main__.
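The symptom can be imitated from Python itself. This is only an analogue: exec normally inserts __builtins__ into a globals dictionary that lacks it, so the sketch below supplies an empty mapping on purpose to reproduce the "missing builtins" state, then repeats the call with the real builtins module (the Python 3 name) in place. The helper name call_f is an assumption.

```python
import builtins

SOURCE = "def f():\n    return len('abc')\n"


def call_f(namespace):
    """Compile SOURCE into namespace, then look up and call f."""
    exec(compile(SOURCE, "<user code>", "exec"), namespace)
    return namespace["f"]()


# An empty __builtins__ mapping stands in for the broken state:
# the def itself succeeds, but len cannot be resolved when f runs.
try:
    call_f({"__builtins__": {}})
    outcome = "ok"
except NameError as exc:
    outcome = f"NameError: {exc}"

# With the builtins module installed, the same code works.
fixed = call_f({"__builtins__": builtins})
print(outcome)
print(fixed)
```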
Getting Tracebacks using the traceback Module
What happened when things went wrong? Well, at first, a lot of crashing. And things were going wrong a lot, especially when I was trying to use modules that didn’t exist in the system. Heh heh.
Thankfully, Python will set up tracebacks that are useful even when you’re using the C API and screwing things up from the inside. How on earth do you get at those tracebacks, though? You can get a lot of information from the PyErr_* class of functions, but not a properly formatted Python traceback to display to the user. Eventually, I ended up using the traceback module itself to display an error:
Of course, when one can’t import the traceback module, one can’t generate a traceback explaining why not. 🙂
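The formatting step can be sketched in Python. The helper name run_and_report is an assumption; an embedding program would obtain the same three values (type, value, traceback) from the PyErr_* functions and then call into the traceback module from C.

```python
import sys
import traceback


def run_and_report(source, filename="<user code>"):
    """Run source; return a formatted traceback string, or None on success.

    run_and_report is a hypothetical helper, not a Python API.
    """
    namespace = {}
    try:
        exec(compile(source, filename, "exec"), namespace)
        return None
    except Exception:
        # The same triple that the C-level error functions expose.
        exc_type, exc_value, exc_tb = sys.exc_info()
        # format_exception returns a list of lines; join them for display.
        return "".join(traceback.format_exception(exc_type, exc_value, exc_tb))


# Importing a module that does not exist, as in the story above.
report = run_and_report("import no_such_module_xyz\n")
print(report)
```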