gee-simulated-annealing


Simulated annealing

The Guile binding to the GNU Scientific Library (distributed with GEE) exposes an interface to the simulated annealing algorithm.

The functions described in this page allow us to solve (approximately) the problem of finding the global minimum of a univariate function f(t) using the simulated annealing (siman) algorithm.

Siman starts from a location (configuration) in the space of the solutions and moves around selecting its step by jumping randomly to other locations inside a maximum range; this offers two advantages over the analytical minimisation algorithms:

Yet another advantage: it is possible to map the search of the minimum over the domain of other problems, so turning the problem into the search of the minimum of multivariate functions f(x, y, ...) or into the search of a vector of elements, chosen from sets, that satisfy a number of constraints.

Some of the examples in this page are taken from the GSL documentation or its test suite. The text of this page is a partial rewriting of the Guile-GSL documentation.

Univariate functions minimum

To find the minimum of a univariate function is "easy": some of the functions required to do it are reduced to very simple forms. Let's take the function:

         sin t                    (define (sinc t)
f(t) = - -----                      (- (/ (sin t)
           t                              t))

it is a well known function that has many local minima and a global minium at t = 0.

Solver solution

If we know where the global minimum is, using Guile-GSL we can solve this problem with:

 (define (sinc t) (- (/ (sin t) t))) 
 (minimum sinc #,(range -1 1) 
   (make <solver> #:solver 'brent #:guess  0.5)) 
       
 ;; -> t = -8.59614841320156e-12, f_min = -1.0 

but if we do not know we can get stuck in a local minimum:

 (define (sinc t) (- (/ (sin t) t))) 
 (minimum sinc #,(range 5 10) 
   (make <solver> #:solver 'brent #:guess 9)) 
       
 ;; t = 7.72525114216453, f_min = -0.128374553525868 

Siman solution

To solve the problem with siman we do (accepting default values):

 (define (sinc t) (- (/ (sin t) t))) 
 (simulated-annealing (make <siman>) 
   #:step-function gsl-rng-uniform-alea  #:max-step-size 5 
   #:energy-function sinc                #:start-configuration 15.5) 
       
 ;; -> t = -7.63963907957077e-6 
  

of course once we have found the minimising point we can refine it with a call to `MINIMUM':

 (let ((min -7.63963907957077e-6)) 
   (minimum sinc (make-range (- min 0.2) (+ min 0.2)) 
      (make <solver> #:solver 'brent #:guess  min 
                     #:epsabs 1.0e-9 #:epsrel 1.0e-6))) 
       
 ;; t = 6.2018305727762e-11, f = -1.0 

For now let's ignore the configuration of the <siman> object. To use siman we have to call the function `SIMULATED-ANNEALING'; the configuration keywords required here are:

#:start-configuration REAL

this is the starting position for the search; we see that we are far away from the global minimum position, t = 0, but siman is able to step over the local minium at t ~= 7.725;

#:energy-function APPLICABLE

this is the procedure that computes how "good" the current location in the search is; the less the value, the better; when looking for the minimum of a univariate function, the function itself is what we need;

#:step-function APPLICABLE

this is used to randomly make a step from a given location to another location inside a range; three arguments are given: the random numbers generator selected with #:rng in the configuration of `<siman>' (or automatically created as default value), the current location, the maximum step size selected with #:max-step-size;

     min bound                               max bound
         v                                       v
---------|-------------------+-------------------|-----------> t
                             ^
                          current
         |...................|...................|
             max-step-size       max-step-size

the function GSL-RNG-UNIFORM-ALEA is perfect to generate a step for a real number;

#:max-step-size VALUE

selects an upper bound for the distance between the current location and the next location built by the function given to #:step-function.

Multivariate functions minimum

Let's take this function:

           sin sqrt(x^2 + y^2)
f(x,y) = - -------------------     -30 < x < 30    -30 < y < 30
             sqrt(x^2 + y^2)

that is:

 (define (f x y) 
   (- (/ (sin (sqrt (+ (sqr x) (sqr y)))) 
         (sqrt (+ (sqr x) (sqr y)))))) 

if we plot it in the indicated range we see that it has a global minimum at (0, 0), and an infinity of local minima in circles around the origin.

Solver solution

If we know approximately where the minimum is, we can solve this problem with:

 (minimum f (make <solver> 
          #:solver 'nmsimplex           #:guess  #,(gvr 2 0.1 0.1) 
          #:step #,(gvr 2 0.001 0.001)  #:epsabs 1.0e-6)) 
       
 ;; -> t = [-1.01526896029594e-7 -4.13302749135758e-7] 
 ;; -> f = -0.99999999999997 

but if we do not know we may end up stuck in a local minima:

 (minimum f (make <solver> 
          #:solver 'nmsimplex           #:guess  #,(gvr 2 15 15) 
          #:step #,(gvr 2 0.001 0.001)  #:epsabs 1.0e-6)) 
       
 ;; -> t = [14.4037489395141 14.4056239395148] 
 ;; -> f = -0.0490296240140742 

Siman solution

With siman we do:

 (let* ((sinxy      (lambda (xp) 
                      (let-vectors (((x y) xp)) 
                        (- (/ (sin (sqrt (+ (sqr x) (sqr y)))) 
                              (sqrt (+ (sqr x) (sqr y)))))))) 
       
        (take-step  (lambda (rng xp max) 
                      (let-vectors (((x y) xp)) 
                        (read-gsl-vector-real 2 
                          (gsl-rng-uniform-alea rng x max) 
                          (gsl-rng-uniform-alea rng y max)))))) 
       
   (simulated-annealing (make <siman> #:restart 1) 
      #:max-step-size 10 
      #:step-function take-step 
      #:energy-function sinxy 
      #:start-configuration #,(gvr 2 15 15) 
      #:copy-function deep-clone)) 
       
      ;; -> [0.115101989358664 -0.0066785141825676] 

and then we can refine it with:

(minimum sinxy (make <solver>
                 #:solver 'nmsimplex    #:step #,(gvr 2 1.0e-5 1.0e-5)
                 #:guess  #,(gvr 2  0.115101989358664
                                   -0.0066785141825676)
                 #:epsabs 1.0e-6))
     
;; t = [-7.71650281511701e-7 -1.93361466521121e-8]
;; f = -0.999999999999901

There are a few differences with the univariate minimum case because the location is represented by a vector, not a real number; this means that:

Fitting parameters

Let's take a function f(t, p), where t is an independent variable and p is a vector of parameters. The least squares criterion finds the minimum of the integral:

  b
 /
 | [f(t, p)]^2 dt
 /
a

with respect to p; that is: the set of parameters p for which the area enclosed between the function f(t) and the function g(t) = 0 is minimum.

A computer cannot handle continuous functions (do not bother me), so it does this by minimising a vector of samples of the function:

      n
Phi = S  [f(t_k, p)]^2           a < t_k < b
     k=1

Once we have this algorithm, we can use it to fit a function y to a data set. Let's say that we can sample an unknown function x(t) over a vector of values [t_k] for the independent variable. We decide to model the unknown x(t) with a function y(t, p): we want to find a set of parameters p that minimises:

      n
Phi = S  [y(t_k, p) - x(t_k)]^2           a < t_k < b
     k=1

with such a set y(t, p) is the "closest" to x(t), in the least squares sense.

The vectors of points t_k and observations x(t_k) are known and fixed, so Phi is only a function of p: Phi = Phi(p). The problem of fitting is a special problem of minimisation.

Example

We want to model an unknown x(t) with an exponential function:

y(t, p) = A exp(-lam t) + b            p = [A, lam, b]

the function to minimise is:

                 n
Phi(A, lam, b) = S  [A exp(-lam t_k) + b - x(t_k)]^2
                k=1

and we may need the jacobian matrix:

 -                     -
| d Phi   d Phi   d Phi |
| -----   -----   ----- |
|  d A    d lam    d b  |
 -                     -

The GSL requests us to provide separately the functions Phi_k:

                 n
Phi(A, lam, b) = S  [Phi_k(A, lam, b)]^2
                k=1
     
Phi_k(A, lam, b) = A exp(-lam t_k) + b - x(t_k)

through a single function that returns a vector holding Phi_k(p) evaluated for a set of parameters p:

 -        -
| Phi_1(p) |
|          |
| Phi_2(p) |
|    .     |
|    .     |
|    .     |
| Phi_n(p) |
 -        -

it also requests a function that returns a special matrix in which the k-th row is the jacobian of Phi_k evaluated for a set of parameters p:

 -                          -
| d Phi_1  d Phi_1   d Phi_1 |
| -------  -------   ------- |
|   d A     d lam      d b   |
|                            |
| d Phi_2  d Phi_2   d Phi_2 |
| -------  -------   ------- |
|   d A     d lam      d b   |
|                            |
|   ...      ...       ...   |
|                            |
| d Phi_n  d Phi_n   d Phi_n |
| -------  -------   ------- |
|   d A     d lam      d b   |
 -                          -

whose elements we compute with:

d Phi_k
------- = exp(-lam t_k)
  d A
     
d Phi_k
------- = -A t_k exp(-lam t_k)
 d lam
     
d Phi_k
------- = 1
  d b

To code this we start by putting in variables: the vector of samples, the function to minimise and the special jacobian components:

 (define t        ...) 
 (define xsamples ...) 
       
 (define (phi_k p) 
   (let-vectors (((A lam b) p)) 
     (- (+ (* A (exp (- (* lam t)))) b) 
        xsamples))) 
       
  
 (define (dPhi/dA p) 
   (let-vectors (((A lam b) p)) 
     (exp (- (* lam t))))) 
       
 (define (dPhi/dlam p) 
   (let-vectors (((A lam b) p)) 
     (- (* A t (exp (- (* lam t))))))) 
       
 (define (dPhi/db p) 
   (make-gsl-vector-real (dim t) 1)) 

then we code a function to build the special jacobians matrix:

 (define (j p) 
   (let-vectors (((A lam b) p)) 
     (let ((m (make-gsl-matrix-real (cons (dim t) (dim p))))) 
       (set! (column m 0) (dPhi/dA   p)) 
       (set! (column m 1) (dPhi/dlam p)) 
       (set! (column m 2) (dPhi/db   p)) 
       m))) 

We can generate a simulated problem with:

 (define params #,(gvr 3  5 0.1 7)) 
       
 (define (y t p) 
   (let-vectors (((A lam b) p)) 
     (+ (* A (exp (- (* lam t)))) b))) 
       
 (define t (linspace 0 10 600)) 
 (define xsamples 
   (+ (y t params) 
      (- (* 2 (gsl-rng-make-uniform-object (gsl-make-rng 'mt19937 10) 
                                           600)) 
         1))) 

Solver solution

To do the fitting, starting from a vector of guessed parameters that is "close" to an acceptable solution:

 (receive (parameters covar) 
   (nonlinear-multifitting phi_k j 
                           #:number (dim t) 
                           #:solver 'lmsder 
                           #:test 'gradient 
                           #:guess #,(gvr 3  3 0.2 4)) 
       
      (let-vectors (((A lam b) parameters)) 
         ...)) 
  
 ;; -> [4.68148956351238 0.110714940177528 7.26966782380432] 

but if we do not know, we may get:

 (nonlinear-multifitting phi_k j 
    #:number (dim t)      #:solver 'lmsder 
    #:test 'gradient      #:guess #,(gvr 3  6 3 1)) 
       
 ;; -> [-394.04933916814 -7.75997159278373e-4 405.684482357042] 

(cough...)

Siman solution

To use siman we do:

 (let* ((F             (lambda (xp) 
                         (sqr (euclidean-norm (phi_k xp))))) 
  
        (take-step     (lambda (rng xp max) 
                         (let-vectors (((A lam b) xp)) 
                           (read-gsl-vector-real 3 
                             (gsl-rng-uniform-alea rng A max) 
                             (gsl-rng-uniform-alea rng lam (* 0.1 max)) 
                             (gsl-rng-uniform-alea rng b max)))))) 
       
   (simulated-annealing (make <siman> #:iterations 20) 
      #:start-configuration #,(gvr 3  6 3 1) 
      #:max-step-size 1 
      #:step-function take-step 
      #:energy-function F 
      #:copy-function deep-clone)) 
       
 ;; -> [4.45163530949503 0.120955551322551 7.51962838787585] 

and then we can refine the result with a call to NONLINEAR-MULTIFITTING.

Traveling salesman problem

This is a classic problem to be solved with heuristics. In the following example we take a set of 12 cities and search the shortest route to touch them all and go back to the first one.

 ;; Latitude and longitude are obtained from the US Census Bureau, 
 ;; at <http://www.census.gov/cgi-bin/gazetteer>. 
  
 (define city-names #("Santa Fe" "Phoenix" "Albuquerque" "Clovis" "Durango" "Dallas" 
                      "Tesuque" "Grants" "Los Alamos" "Las Cruces" "Cortez" "Gallup")) 
 (define latitudes #,(gvr 12  35.68 33.54 35.12  34.41 37.29 32.79 
                              35.77 35.15 35.89  32.34 37.35 35.52)) 
 (define longitudes #,(gvr 12  105.95 112.07 106.62  103.20 107.87 96.77 
                               105.92 107.84 106.28  106.76 108.58 108.74)) 
 (define (city-distance idx1 idx2) 
   (if (= idx1 idx2) 
       0.0 
       (let* ((idx1      (inexact->exact idx1)) 
              (idx2      (inexact->exact idx2)) 
              (lat1      (deg->rad (elm latitudes  idx1))) 
              (lon1      (deg->rad (elm longitudes idx1))) 
              (lat2      (deg->rad (elm latitudes  idx2))) 
              (lon2      (deg->rad (elm longitudes idx2))) 
              (sla1      (sin lat1)) 
              (cla1      (cos lat1)) 
              (slo1      (sin lon1)) 
              (clo1      (cos lon1)) 
              (sla2      (sin lat2)) 
              (cla2      (cos lat2)) 
              (slo2      (sin lon2)) 
              (clo2      (cos lon2)) 
              (earth-radius 6375.0)) 
         (* earth-radius (acos (+ (* cla1 clo1 cla2 clo2) 
                                  (* cla1 slo1 cla2 slo2) 
                                  (* sla1 sla2))))))) 
  
 (define distance-matrix (let ((t (linspace 0 11 12))) 
                           (sample-xy city-distance t t))) 
  
 (define (energy xp) 
   (let ((E 0.0)) 
     (do ((i 0 (1+ i))) 
         ((> i 11)) 
       (set! E (+ E (elm distance-matrix 
                         (cons (vector-ref xp i) 
                               (vector-ref xp (modulo (1+ i) 12))))))) 
     E)) 
  
 (define route (simulated-annealing (make <siman> 
                                      #:iterations 20 
                                      #:constant 1.0     #:initial 20 
                                      #:minimum 2.0e-6   #:damping 1.01) 
                 #:step-function gsl-vector-random-swap! 
                 #:energy-function energy 
                 #:max-step-size 5 
                 #:start-configuration #(0 1 2 3 4 5 6 7 8 9 10 11) 
                 #:copy-function vector-copy)) 
  
 ;;(with-number-format "~,1f" 
 ;;  (format #t "distance matrix~%~A~%" distance-matrix)) 
 (format #t "the route is: ~A~%~A, and back to ~A; distance ~,1f~%" 
         route 
         (string-join (map (lambda (idx) (vector-ref city-names idx)) 
                           (vector->list route)) ", ") 
         (vector-ref city-names (vector-ref route 0)) 
         (energy route)) 

the output is:

the route is: #(11 10 4 7 2 8 6 0 3 5 9 1)
Gallup, Cortez, Durango, Grants, Albuquerque, Los Alamos, Tesuque, Santa Fe, Clovis, Dallas, Las Cruces, Phoenix, and back to Gallup; distance 3490.6

category-gee