Given a function ƒ, find x such that ƒ(x) = 0. Sounds simple enough…
Root finding is a hobby of mine. It’s kind of a lame hobby. It can be lots of fun though, and of course there are many real world usages for it. Once in a while, I’ll read through papers and methods on the general subject and implement new algorithms. I once even had an “algorithm war” program which would pit two against each other: given an arbitrary function with one root, which method would find it first? Most consistently first? Highest accuracy with fewest function calls?
I was faced with an interesting more specific problem yesterday. A simple secant solver had been used on a function with great results, except at a few extreme software inputs. In general, the solver only knew that 0 < x <= ?, and, in fact, ƒ could not be evaluated for values <= 0. “Hm…. the secant method worked so well for this nice, smooth function… if only there was a way to prevent it from going negative,” I thought.
What if the secant method were drawn on a semi-log plot, rather than a cartesian plot? That would cause the root finder to never go negative. It would have greater resolution at smaller values with fewer iterations, but could suffer from requiring more iterations at higher values (for the same tolerance).
Figure A: Function plotted on a cartesian scale.
The function being solved looked similar to figure A. In figure A, if the secant method were to hit points between 5 and 10 billion, it would quickly be thrown off by the straight slope into values less than zero, where the function cannot be evaluated.
Figure B: Function plotted on a semi-log scale.
If it is instead plotted on a semi-log scale, as shown in figure B, the flattened area would throw the values off to very small numbers (around 1E-10). This function would have a very high slope at those values, and the secant method would throw it back into a reasonable range very quickly.
Implementing this modified secant method idea was pretty simple. The standard secant root solver is used, but with a variation of how the next point is chosen:
- Rather than fitting y = m * x + b to two points on the curve, an exponential function y = m * exp(x) + b was fitted to the two points.
- The m value was calculated with m = (y1 - y2) / (log(x1) - log(x2)),
- the b value was calculated as b = y1 - log(x1) * m.
- A new x coordinate by solving the equation for y == 0, which simplifies down to x = exp((y1 * log(x2) - log(x1) * y2) / (y1 - y2)).
This method proved to be as quick as a normal secant solution, and very effective for functions which cannot be evaluated at values <= 0.