May 2004 archive

Plotting in Excel through Python/COM

For the past couple weeks, I’ve been thinking about mathematical model development. There are lots of great tools out there to help with such tasks, like Mathcad and Mathematica. But if you’re doing software development, once you’ve built and tested a model, what you really want is code. Your Mathcad files are great for documentation, testing, and development of your model, but they can’t be embedded in your Java or C++ application.

Additionally, it’s very easy to use functions from those pieces of software which can’t be easily replicated in your software. They have very optimized methods for root solving, matrix math, symbolic derivative calculations, and other such tasks that you can’t reproduce without years worth of effort. So, when you’re developing a model that’s going to be used in software, what’s the easiest way to do it?

Python, of course. Python code tends to be very legible and terse, while still being well suited for mathematical programming. In fact, the syntax and control structures even look pretty similar between Mathcad and Python:

/wsimages/python-riddler.png

Part of a root solver implemented in Python.

/wsimages/mathcad-riddler.png

Part of a root solver implemented in Mathcad.

So, it seems natural to me to use Python to develop models. My code can easily be translated into other programming languages after the fact, and it tends to have a more proper programming structure if it’s not translated from Mathcad. Plus, translating a big chunk code from Mathcad can be a real pain in the ass to do manually – there’s just too much to miss, and it can be very time consuming to confirm it’s been done correctly.

Python is not quite a silver bullet in this case, though. Mathcad has a lot of tools for visualization which are very useful when developing a new model. Python’s wonderful abillity to interoperate makes it fairly easy to leverage plotting capabilities of an application like Microsoft Excel though. So, without any further leadup, here’s a quick process which will get you up and running with an Excel plotting Python program:

  1. Get and install Python, and the PythonWin extensions (or install ActivePython, which contains all the necessary tools and more).

  2. Run the PythonWin IDE program, and generate static COM wrappers for Microsoft Excel. This is as easy as selecting the COM Makepy Utility option from the Tools menu of PythonWin, then selecting the most recent version of the Microsoft Excel n.m Object Library available:

    /wsimages/pythonwin-comwrapper.png

    Selecting COM Makepy utility from PythonWin’s menu.

    The static COM wrappers must be used in order to access the Excel constants (of which there are hundreds).

  3. Import the COM Dispatch function and the constants namespace into your application:

    
    from win32com.client import Dispatch, constants
    
  4. At this point, all that’s left is to go to town on automating Excel. There is a lot of documentation that comes along with Excel, as well as a big chunk of MSDN content that will also help. Let’s dive in, and I’ll just throw an XY scatter plot at you:

    
    def plot(x, y, xAxisLog=False, yAxisLog=False):
        # acquire application object, which may start application
        application = Dispatch("Excel.Application")
    
        # create new file ('Workbook' in Excel-vocabulary)
        workbook = application.Workbooks.Add()
    
        # store default worksheet object so we can delete it later
        defaultWorksheet = workbook.Worksheets(1)
    
        # build new chart (on seperate page in workbook)
        chart = workbook.Charts.Add()
        chart.ChartType = constants.xlXYScatter
        chart.Name = "Plot"
    
        # create data worksheet
        worksheet = workbook.Worksheets.Add()
        worksheet.Name = "Plot data"
    
        # install data
        xColumn = addDataColumn(worksheet, 0, x)
        yColumn = addDataColumn(worksheet, 1, y)
    
        # create series for chart
        series = chart.SeriesCollection().NewSeries()
        series.XValues = xColumn
        series.Values = yColumn
        series.Name = "Data"
        series.MarkerSize = 3
    
        # setup axises
        xAxis = chart.Axes()[0]
        yAxis = chart.Axes()[1]
        xAxis.HasMajorGridlines = True
        yAxis.HasMajorGridlines = True
        if xAxisLog:
            xAxis.ScaleType = constants.xlLogarithmic
        if yAxisLog:
            yAxis.ScaleType = constants.xlLogarithmic
    
        # remove default worksheet
        defaultWorksheet.Delete()
    
        # make stuff visible now.
        chart.Activate()
        application.Visible = True
    
    def genExcelName(row, col):
        """Translate (0,0) into "A1"."""
        if col < 26:
            colName = chr(col + ord('A'))
        else:
            colName = chr((col / 26)-1 + ord('A')) +\
                chr((col % 26) + ord('A'))
        return "%s%s" % (colName, row + 1)
    
    def addDataColumn(worksheet, columnIdx, data):
        range = worksheet.Range("%s:%s" % (
            genExcelName(0, columnIdx),
            genExcelName(len(data) - 1, columnIdx),
            ))
        for idx, cell in enumerate(range):
            cell.Value = data[idx]
        return range
    

    I suppose that at least an explanation of what this is doing would be a good thing. This program will cause Excel to be run, and a new file to be created. The new file will contain one worksheet with the contents of the x/y arrays in columns A and B. It will also contain one sheet-location chart, an XY scatter, with the x/y points plotted. The x and y arguments to plot are sequences of numbers, and the two flags indicate whether the X and Y axis should be logarithmic or linear.

  5. Some examples of using plot:

    
    # A simple example:
    plot( (1,2,3,4,5), (6,7,8,9,10) )
    
    # Some more data:
    x, y = [], []
    for i in range(100):
        x.append(i)
        y.append(i ** 2)
    plot(x, y)
    
    # Using log axises:
    plot(x, y, True, True)
    
  6. Remember, there are good tools available if you're not in Windows too. gnuplot and gnuplot-py provide a pretty nice graphing environment which is probably quite a bit more capable than Excel.

KMDI's Open Source Conference

I’ve been attending an Open Source conference at the University of Toronto, organized by the Knowledge Media Design Institute. The conference has been extremely interesting for a number of reasons, which I will iterate here in chronological order.

I’m here mostly because of Catspaw‘s involvement in the conference. I’m also here staying with her while I attend the conference. This is a major important part of understanding why many of the odd events here have occured.

  • The day before the conference, I chatted with a volunteer by the name of Leigh. Although originally I was just listening to a conversation between her and Jason Nolan, eventually just the two of us were talking. At some point the topic of the Internet retaining information came up, and we used Catspaw’s laptop to lookup her name on google. One of the relevant google groups search result was from a newsgroup alt.dragon-net, which she claimed was a MUD she used to use.

    Well, the discussion continued on MUDs, when Leigh made the interesting claim that she had once dated two wizards from a single MOO (although she hadn’t known they were on the same MOO until much later). I dug into this and discovered she was referring to two people from MOO Canada, where she had previously had a character and used. She had even previously attended a MOO picnic. She was even a close friend of a good friend of mine, Joanna.

    This relatively random encounter was with a person who I had probably met before many years prior, and knew of myself and Catspaw (only by the name Catspaw), and was a very close friend-of-a-friend relation. Small world.

  • Sunday morning, we were treated to an amazing speech by Eben Moglen. He made a great, moving, active speach supporting Free software. It was one of the highlights of this conference. "We will win."

  • During the break between the Sunday sessions, I was standing next to the refreshment table when a voice from behind asked, "What’s with the Python shirt?" (referring to my wonderful Python t-shirt, of course). I turned around to be faced with David Ascher, one of the conference speakers and the current manager of product development at ActiveState. I had hoped to talk with him during the conference, and here was my chance. Surprised as I was though, I had nothing insightful or useful to say.

  • Sundary afternoon, it got even better. A panel on the law and politics of open source became a debate on legal issues surrounding free software and the GNU general public license. David McGowan spoke amazingly well to poke a hole in the wonderful balloon of emotion that Eben had earlier generated amoung the audience. Barry Sookman followed with a continuing and more detailed evaluation of the GPL, with specific references to Canadian copyright law.

    Eben was allowed to take the stage to comment. He replied with some very breakneck comments that blew away the earlier points of discussion.

    One of the conference attendees, while posing a question, described the situation very well. "You’re all excellent speakers, with an excellent discussion. I believed everything each of you said as you said it, which is amazing since it all conflicted with each other." (Ian D. Allen).

  • I helped Nancy Frishberg transfer her presentation slides off her laptop (which was running the Java Desktop System) and into the KMDI ePresence network server. I was happy to help, but I was very confused by a couple of points of this interaction:

    1. Why does a Sun employee need help doing such a simple seeming task using their own desktop environment? I mean, clearly she didn’t write the entire system, but I would have thought she might have the channel of input necessary to tell the software developers to make such a thing easy. It was … unintuitive, but understandable.
    2. Nancy kept a copy of the terminal session I used and put it into a StarOffice document after the process was done. I’m very curious what she planned to do with that… send it to the developers and suggest she doesn’t want to do such things ever again? Place it into documentation for the desktop system?
  • After David Ascher’s presentation during the Monday morning panel, a question was asked by a dude in a red plaid shirt. His question related to Komodo, in which he refered to it as a tool aiding the development of things like MUDzilla. This was an extremely odd comment. As one of the developers of MUDzilla, I was shocked to hear anyone else refer to our small, unmaintained, and very limited-audience piece of software.

  • After the morning panel on Monday, I got a chance to intercept David Ascher while he was leaving the conference room. I asked him whether ActiveState had any plans to develop an OSX version of their wonderful Komodo IDE. His response was, "if I had a nickel for every time I was asked that question…" to which I replied "You could have more than a nickel for every time." Anyways, his answer was that it was not as simple as it seemed to port, and market research had indicated it was not likely that many full commercial copies of the software (at $300 each) would be sold. It was not going to be profitable due to a large amount of development and a small potential return.

  • Nancy Frishberg’s presentation included a quotation from Catspaw’s recent paper on Five Fundamental Issues with Open Source Software. Cecil noticed this just shortly before her presentation began as she was zipping through her presentation on screen. We informed Catspaw and someone introduced the two of them. I find it pretty ironic that one of the presenters during this conference was quoting the volunteer organizer of the conference.