Mathieu Fenniak's Weblog

2003/04/16

Google Search Term Highlighting

Filed under: programming, python — admin @ 12:33 am

Earlier today I was searching Google for some technical information, and one of the results was a link to Experts Exchange. Experts Exchange has a nifty little feature, implemented as an Apache module, that examines the HTTP referer for search engine footprints. When it finds that you arrived from a search engine, the module, named mod_suru, highlights your search terms in the resulting web page.

“Cool,” I thought. So I went to look at mod_suru and discovered that it is only available at a relatively high cost. Certainly not a cost that I’m willing to pay for my own personal website.

So I took another path: I re-implemented the same idea as mod_suru, primarily in JavaScript. My stomphighlight.js file defines a function, highlightWord, which does the magic of actually highlighting the text. The magic of determining which words to highlight depends heavily on how the web site is set up. In the case of stompstompstomp.com, the footer of every page is served by a Python function. I added the following code to my footer:

import os, re, cgi

# Highlight google-ed for words!
if os.environ.has_key("HTTP_REFERER"):
    words = None

    # Capture the query string of a Google results URL,
    # e.g. http://www.google.com/search?q=foo+bar
    m = re.search(r"google\.[a-z.]+/search\?(.*)", os.environ['HTTP_REFERER'])
    if m:
        googleQuery = cgi.parse_qs(m.group(1))
        # 'q' may be missing, and split() with no argument drops empty strings.
        words = ' '.join(googleQuery.get('q', [])).split()
        words = filter(lambda x: x.find(":") == -1, words) # remove words like 'site:blahblah.com'

    if words:
        print "<!-- Begin magical search term highlighting. -->"
        print """<script language="JavaScript" type="text/javascript" src="/stomphighlight.js"></script>"""
        colors = ("#00eeee", "#eeee00", "#ee00ee", "#ee0000", "#00ee00", "#0000ee")
        print """<script language="JavaScript" type="text/javascript"><!--"""
        # zip() stops at the shorter sequence, so only the first six words get a color.
        for word, color in zip(words, colors):
            print "highlightWord(%r, %r, document.documentElement);" % (word, color)
        print """//--></script>"""
        print "<!-- End magical search term highlighting. -->"
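One thing to watch if you adapt this: %r writes a Python repr, which only happens to look like a JavaScript string literal. Here is a minimal Python 3 sketch of a more careful way to build those calls. The highlight_script helper is a hypothetical name made up for illustration; it is not part of stomphighlight.js or of the footer code above.

```python
import json

# Same palette as the footer code.
COLORS = ("#00eeee", "#eeee00", "#ee00ee", "#ee0000", "#00ee00", "#0000ee")


def highlight_script(words, colors=COLORS):
    """Return the inline script body: one highlightWord call per word.

    Hypothetical helper, not the original author's code.
    """
    lines = []
    # zip() stops at the shorter sequence, so words beyond the palette are skipped.
    for word, color in zip(words, colors):
        # json.dumps yields a double-quoted, ASCII-only literal that JavaScript
        # accepts; escaping "<" keeps a word such as "</script>" from closing
        # the surrounding script element early.
        literal = json.dumps(word).replace("<", "\\u003c")
        lines.append("highlightWord(%s, %s, document.documentElement);"
                     % (literal, json.dumps(color)))
    return "\n".join(lines)


if __name__ == "__main__":
    print(highlight_script(["python", "o'reilly"]))
```

The returned text would go between the opening and closing script tags that the footer prints.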

Hopefully it's pretty clear how you could use the same idea on your own web site, or modify my code to support more than just Google. Yay!
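As a sketch of what supporting more than just Google could look like, here is a Python 3 version of the word-extraction step, driven by a small table of search engines. The search_terms function, the SEARCH_ENGINES table, and the Bing and Yahoo entries are illustrative assumptions; only the Google rule comes from the footer code above.

```python
import re
from urllib.parse import parse_qs, urlsplit

# (host pattern, query parameter that carries the search terms).
# The Google entry mirrors the footer code; the others are assumptions.
SEARCH_ENGINES = (
    (re.compile(r"(^|\.)google\.[a-z.]+$"), "q"),
    (re.compile(r"(^|\.)bing\.com$"), "q"),
    (re.compile(r"(^|\.)search\.yahoo\.com$"), "p"),
)


def search_terms(referer):
    """Return the search words found in a referer URL, or [] if there are none."""
    try:
        parts = urlsplit(referer)
        host = (parts.hostname or "").lower()
    except ValueError:
        # A malformed referer (it is client-supplied) yields no terms.
        return []
    if not parts.path.startswith("/search"):
        return []
    for host_pattern, param in SEARCH_ENGINES:
        if host_pattern.search(host):
            query = parse_qs(parts.query)
            words = " ".join(query.get(param, [])).split()
            # Drop operators such as 'site:example.com', as the footer code does.
            return [w for w in words if ":" not in w]
    return []


if __name__ == "__main__":
    print(search_terms("http://www.google.com/search?q=python+highlight&hl=en"))
```

Adding another engine then means adding one line to the table rather than another regular expression branch.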

5 Comments

  1. It's nice. I'm reading this with joy.

    Comment by Angello — 2010/03/12 @ 5:42 am

  2. Ernie writes,

    > But it’s not like the complexity of the concept or
    > implementation is worth forking out a twenty spot.

    Trust me Ernie, it is extremely complex to implement this server-side. Lao-Tzu’s client-side JavaScript code weighs in at 57 lines (including comments). My server-side solution to the same problem weighs in at 370 lines (and I got to use Python; mod_suru had to use C).

    http://neil.fraser.name/software/highlighter/

    Comment by Neil Fraser — 2010/03/12 @ 5:42 am

  3. lines of code shlines of shlode.. the linux kernel is 5million+ and still free

    Comment by Ernie Hershey — 2010/03/12 @ 5:42 am

  4. I’ve posted a JavaScript highlighter that doesn’t require any server-side code. You can implement it in static HTML pages. Here’s the link:

    http://www.davelemen.com/archives/000002.html

    Let me know what you think!

    Comment by Dave Lemen — 2010/03/12 @ 5:42 am

  5. Good for you. I can’t believe they’re peddling mod_suru for money. I saw it in use on a website this morning and thought “What a cool module!” But it’s not like the complexity of the concept or implementation is worth forking out a twenty spot (or the $65 they say it normally costs).

    Comment by Ernie Hershey — 2010/03/12 @ 5:42 am
