sicp-ex-1.45



<< Previous exercise (1.44) | Index | Next exercise (1.46) >>


Maggyero

QUESTION

We saw in section 1.3.3 that attempting to compute square roots by naively finding a fixed point of y ↦ x/y does not converge, and that this can be fixed by average damping. The same method works for finding cube roots as fixed points of the average-damped y ↦ x/y^2. Unfortunately, the process does not work for fourth roots—a single average damp is not enough to make a fixed-point search for y ↦ x/y^3 converge. On the other hand, if we average damp twice (i.e., use the average damp of the average damp of y ↦ x/y^3) the fixed-point search does converge. Do some experiments to determine how many average damps are required to compute nth roots as a fixed-point search based upon repeated average damping of y ↦ x/y^{n − 1}. Use this to implement a simple procedure for computing nth roots using fixed-point, average-damp, and the repeated procedure of exercise 1.43. Assume that any arithmetic operations you need are available as primitives.

ANSWER

 (import (scheme small)) 
  
  
 (define (nth-root x n) 
   (define (fixed-point f first-guess) 
     (define (try guess) 
       (let ((next (f guess))) 
         (if (close-enough? guess next) 
           guess 
           (try next)))) 
     (define (close-enough? x y) 
       (< (abs (- x y)) 0.00001)) 
     (try first-guess)) 
   (define (average-damp f) 
     (lambda (x) (/ (+ x (f x)) 2))) 
   (define (repeated f n) 
     (define (compose f g) 
       (lambda (x) (f (g x)))) 
     (if (= n 1) 
       f 
       (compose f (repeated f (- n 1))))) 
   (fixed-point 
     ((repeated average-damp (floor (log n 2))) 
       (lambda (y) (/ x (expt y (- n 1))))) 
     1.0)) 
  
  
 ; Average damping of y ↦ x/y^{n − 1} must be repeated ⌊log_2(n)⌋ times to compute nth roots of x as a fixed-point search. 
  
 (display (nth-root 2.0 2)) 
 (newline) 
 (display (nth-root 2.0 3)) 
 (newline) 
 (display (nth-root 2.0 4)) 
 (newline) 
 (display (nth-root 2.0 5)) 
 (newline) 
 (display (nth-root 2.0 6)) 
 (newline) 
 (display (nth-root 2.0 7)) 
 (newline) 
 (display (nth-root 2.0 8)) 
 (newline) 


  
 (define (average x y) 
   (/ (+ x y) 2.0)) 
  
 (define (average-damp f) 
   (lambda (x) (average x (f x)))) 
  
 (define tolerance 0.00001) 
  
 (define (fixed-point f first-guess) 
   (define (close-enough? v1 v2) 
     (< (abs (- v1 v2)) tolerance)) 
   (define (try guess) 
     (let ((next (f guess))) 
       (if (close-enough? guess next) 
           next 
           (try next)))) 
   (try first-guess)) 
  
 (define (repeated f n) 
   (if (= n 1) 
       f 
       (lambda (x) (f ((repeated f (- n 1)) x))))) 
  
 (define (get-max-pow n) 
   (define (iter p r) 
     (if (< (- n r) 0) 
         (- p 1) 
         (iter (+ p 1) (* r 2)))) 
    
   (iter 1 2)) 
  
 (define (pow b p) 
   (define (even? x) 
     (= (remainder x 2) 0)) 
    
   (define (sqr x) 
     (* x x)) 
    
   (define (iter res a n) 
     (if (= n 0) 
         res 
         (if (even? n) 
             (iter res (sqr a) (/ n 2)) 
             (iter (* res a) a (- n 1))))) 
    
   (iter 1 b p)) 
  
 (define (nth-root n x) 
   (fixed-point ((repeated average-damp (get-max-pow n)) 
                 (lambda (y) (/ x (pow y (- n 1))))) 
                1.0)) 

Example: (nth-root 5 32)

2.000001512995761

The number of times to repeat average-damp can also be calculated using floor and log to the base 2 as follows

  
 (define (log2 x) (/ (log x) (log 2))) 
 (define (nth-root n x) 
 (fixed-point ((repeated average-damp (floor (log2 n)))  
               (lambda (y) (/ x (pow y (- n 1))))) 
              1.0)) 

Kaihao

I think the above number of average damps required to compute nth root is wrong.

 (define tolerance 0.00001) 
  
 (define (fixed-point f first-guess) 
   (define (close-enough? v1 v2) 
     (< (abs (- v1 v2)) tolerance)) 
   (define (try guess) 
     (let ((next (f guess))) 
       (if (close-enough? guess next) 
         next 
         (try next)))) 
   (try first-guess)) 
  
 (define (average a b) 
   (/ (+ a b) 2)) 
  
 (define (average-damp f) 
   (lambda (x) (average x (f x)))) 
  
 (define (square x) 
   (* x x)) 
  
 (define (fast-expt b n) 
   (cond ((= n 0) 1) 
         ((even? n) (square (fast-expt b (/ n 2)))) 
         (else (* b (fast-expt b (- n 1)))))) 
  
 (define (compose f g) 
   (lambda (x) 
     (g (f x)))) 
  
 (define (repeated f n) 
   (if (> n 1) 
     (compose (repeated f (- n 1)) f) 
     f)) 
  
 ;; avarage-damp d times 
 (define (nth-root x n d) 
   (fixed-point ((repeated average-damp d)  
                 (lambda (y) (/ x (fast-expt y (- n 1))))) 
                1.0)) 
  
 ;; Experimental results from 2rd to 20th root: 
 ;; Root  Required Average Damps 
 ;; 2     1 
 ;; 3     1 
 ;; 4     2 
 ;; 5     2 
 ;; 6     1 
 ;; 7     1 
 ;; 8     1 
 ;; 9     1 
 ;; 10    1 
 ;; 11    1 
 ;; 12    1 
 ;; 13    3 
 ;; 14    1 
 ;; 15    1 
 ;; 16    1 
 ;; 17    1 
 ;; 18    1 
 ;; 19    1 
 ;; 20    1 


LisScheSic

a. The exercise will be better understood if we have some knowledge of numerical analysis since it is one approximation problem.

b. Here the answer $\lfloor\log n\rfloor$ is based on its similarity with Newton's method which is a bit like the discrete version of the gradient descent method. The related mapping process figure is shown in https://deltam.blogspot.com/2015/08/sicp145ex145.html referred to in https://github.com/xxyzz/SICP.

The related maths proof is shown in https://www.zhihu.com/question/28838814 where we do the mapping in the same way as Newton's method but with one different line. The link is the reference of deltam's blog.

Following the zhihu link formula

1. when `average_damp_level` is very large, the difference between mapping adjacent values is too small (< 0.00001 here) due to the high slope, e.g. 637621501.2140495 and 637621501.2140489.

2. when `average_damp_level` is very small, we have the following loop. Here I put the reasons behind the mapping inside the bracket pair:

2.2882799226996533 -(due to the low slope $2^m y_k^{n-1}$)> 869130.4304532777 -(/2: since following zhihu link formula when y=1, m=1, x>>y, y_{n+1}\approx y_n/2)> 434565.21522663883 -(/2)> 217282.60761331941 -> ... -> 1.668911759469753 -> 4526863251949.442

Code `minimal_test/1_45_minimal_test.scm`:

 ;;; from the above wiki 
 (define (average-damp f)  
     (lambda (x) (/ (+ x (f x)) 2))) 
 (define (repeated f n)  
     (define (compose f g)  
       (lambda (x) (f (g x))))  
     (if (= n 1)  
       f  
       (compose f (repeated f (- n 1))))) 
  
 ;;; from book 
 (define tolerance 0.00001) 
  
 ;;; small modified fixed_point 
 (define (displayln x) 
   (newline) 
   (display x)) 
 (define (fixed_point_bound f first-guess bound whether_output_cnt whether_output_guess_mapping) 
   (define (close-enough? v1 v2) 
     (< (abs (- v1 v2)) tolerance)) 
   (define (try guess count) 
     (let ((next (f guess))) 
       (if whether_output_guess_mapping 
         (displayln guess) 
         ) 
       (if (or (close-enough? guess next) (= count bound)) 
           (begin  
             (displayln "finished count:") 
             (displayln count) 
             (displayln "result:") 
             (displayln next) 
             (if whether_output_cnt 
               count 
               next)) 
           (try next (+ count 1))))) 
   (if whether_output_guess_mapping 
     (displayln "fixed_point_bound mapping process:") 
     ) 
   (try first-guess 1)) 
  
 (define small_bound 100) 
 (define (init_map x y n whether_use_internal_implementation)  
   (if whether_use_internal_implementation 
     (/ x (expt y (- n 1))) 
     (/ x ((repeated (lambda (m) (* y m)) (- n 1)) 1))))  
 (define (root n x average_damp_level bound init whether_output_cnt whether_output_guess_mapping whether_use_internal_implementation) 
   (define transformed_f ((repeated average-damp average_damp_level) (lambda (y) (init_map x y n whether_use_internal_implementation))))  
   (displayln "") 
   (displayln "average_damp_level:") 
   (displayln average_damp_level) 
   (fixed_point_bound transformed_f init bound whether_output_cnt whether_output_guess_mapping))  
 (define n2 50)  
 (- (root n2 (expt 3 n2) 1 (* small_bound 2) 1.0 #f #t #f) 3)  
 ;; It will have `y` sequence including the loop above. 
   
 (- (root n2 (expt 3 n2) n2 small_bound 1.0 #f #t #f) 3) 
 ;; It will have `y` sequence:  
 ; average_damp_level: 
 ; 50 
 ; fixed_point_bound mapping process: 
 ; 1. 
 ; 637621501.2140495 
 ; finished count: 
 ; 2 
 ; result: 
 ; 637621501.2140489 
 ;Value: 637621498.2140489 

3. floor will normally do better than ceil for $log_2 n$. The analysis is as the following reusing the above codes:

 ;;; compare floor and ceil for log n 
 (define (log_base base n) 
   (/ (log n) (log base))) 
 (define (find_floor_worse_than_ceil use_power_plus_one init_exp base count init ending_exp whether_output_guess_mapping) 
   (define n ((if use_power_plus_one 
                   + 
                   -) (expt 2 init_exp) 1)) ; this will make n-floor(n) greatest for one init_exp 
   (define x (expt base n)) 
   (define floor_average_damp_level (inexact->exact (floor (log_base 2 n)))) 
   (define ceiling_average_damp_level (inexact->exact (ceiling (log_base 2 n)))) 
   (displayln "") 
   (displayln "init_exp:") 
   (displayln init_exp) 
   (let ((diff (- (root n x floor_average_damp_level count init #t whether_output_guess_mapping #f) (root n x ceiling_average_damp_level count init #t whether_output_guess_mapping #f)))) 
     (if (> diff 0) 
       (displayln "ceiling better") 
       (displayln "floor better")) 
     (if (= init_exp ending_exp) 
       (displayln "finish") 
       (find_floor_worse_than_ceil use_power_plus_one (+ init_exp 1) base count init ending_exp whether_output_guess_mapping)))) 
  
 (define big_bound (* small_bound 10)) 
 (define base 1.1) 
 ;; Here init_exp=1 will floor to 0 which is not considered in repeated. 
 (find_floor_worse_than_ceil #f 2 base big_bound 1.0 10 #f) 
 ;; the above will change from "ceiling better" to "floor better" when init_exp change from 6 to 7 when base=1.1. 
 ;; result will become inf (in Racket)/nan (in MIT/GNU Scheme) when init_exp becomes 10 if choosing base=3 since I calculate $\sqrt[2^n-1]{base^{2^n-1}}=base$ to test the result correctness. 
 (find_floor_worse_than_ceil #t 1 base big_bound 1.0 10 #f)  
 ;; the above only have "ceiling better" for init_exp=1. 
  
 (find_floor_worse_than_ceil #f 3 3 big_bound 1.0 4 #t) 
 ;; Here when `init_exp=3`,  
 ;; floor will first map to one further point $y_1$ due to the low slope, then map to one point $y_2$ closer to the zero point $z<y_2$, then step by step converge to $z$. 
 ;; ceil will do the inverse, i.e. map to one relatively closer point $y_2$ compared with floor and $y_2‘$ further from $z$ than $y_2$ ... 
 ;; In summary, floor will do worse at the 1st step but better in the latter steps while ceil is the inverse situation. 
 ;; So normally, floor will do better. But we need larger latter step number to amortize, i.e. when `init_exp` is small ceil may do better. 
 ;; 
 ;; the above will output 
 ; init_exp: 
 ; 3 
 ;  
 ; average_damp_level: 
 ; 3 
 ; fixed_point_bound mapping process: 
 ; 1. 
 ; 274.25 
 ; 239.96875000000063 
 ; 209.972656250002 
 ; ... 
 ; 3.000002412349743 
 ; finished count: 
 ; 42 
 ; result: 
 ; 3.00000030154881 
 ; 
 ; average_damp_level: 
 ; 2 
 ; fixed_point_bound mapping process: 
 ; 1. 
 ; 547.5 
 ; 410.625 
 ; 307.9687500000001 
 ; ... 
 ; 3.0000074196296165 
 ; 2.999994435374126 
 ; finished count: 
 ; 55 
 ; result: 
 ; 3.0000041735235943 
 ; ceiling better 
 ; 
 ; init_exp: 
 ; 4 
 ; 
 ; average_damp_level: 
 ; 4 
 ; fixed_point_bound mapping process: 
 ; 1. 
 ; 896807.625 
 ; 840757.1484375 
 ; ... 
 ; 3.000037933721064 
 ; 3.0000023740050947 
 ; finished count: 
 ; 202 
 ; result: 
 ; 3.000000148387647 
 ; 
 ; average_damp_level: 
 ; 3 
 ; fixed_point_bound mapping process: 
 ; 1. 
 ; 1793614.25 
 ; 1569412.46875 
 ; 1373235.91015625 
 ; ... 
 ; 2.9999951975788592 
 ; finished count: 
 ; 166 
 ; result: 
 ; 3.000004202219401 
 ; floor better 
 ;; 

c. Notice: Same as Newton's method, the initial guess matters. The result is also implied by the recurrence relation. Use the above function implementation to show the differences:

 ;;; check negative init 
 (root 2 2.0 1 small_bound 1.0 #f #t #f) 
 ;Value: 1.4142135623746899 
 (root 2 2.0 1 small_bound -1.0 #f #t #f) 
 ;Value: -1.4142135623746899 
 (root 3 2.0 1 small_bound 1.0 #f #t #f) 
 ;Value: 1.259923236422975 
 (root 3 2.0 1 small_bound -1.0 #f #t #f) 
 ;Value: 1.2599233631602056 

d. Kaihao's comment doesn't give its statistics data. Actually although ";; 6 1" is right, the convergence iteration number is very large (319396 tested for $sqrt[6]{2}$). But for floor it only needs 15 (See the code comments and outputs). This is similar to https://web.archive.org/web/20161018022729/https://sicp.g.hatena.ne.jp/n-oohira/20090207/1233979052 (reference of deltam's blog) says "I felt like the calculation was a bit sluggish, so I guess it's just something like "it will converge if you try hard."" (translated by Google)

Here is the test codes with also `find_valid_average_damp_level` to find all needed average-damp nested level:

 (load "minimal_test/1_45_minimal_test.scm") 
 ;;; same as the wiki parameter choice. 
 (define (find_valid_average_damp_level max_root start_root bound init) 
   (define x 2.0) 
   (define (helper root_order average_damp_level) 
     ;; same as Kaihao's implementation, otherwise the above will output differently although it also converges 
     (if (< (root root_order x average_damp_level bound init #t #f #t) bound) 
       (begin 
         (displayln "") 
         (displayln "needed average_damp_level:") 
         average_damp_level) 
       (helper root_order (+ average_damp_level 1)))) 
   (if (> start_root max_root) 
     (displayln "finish") 
     (begin 
       (displayln "") 
       (displayln "start_root:") 
       (displayln start_root) 
       (displayln (helper start_root 1)) 
       (find_valid_average_damp_level max_root (+ start_root 1) bound init)))) 
 (find_valid_average_damp_level 20 2 small_bound 1.0) 
 ;; outputs 
 ; ... 
 ; start_root: 
 ; 6 
 ; 
 ; average_damp_level: 
 ; 1 
 ; finished count: 
 ; 100 
 ; result: 
 ; .9019353334996821 
 ; 
 ; average_damp_level: 
 ; 2 
 ; finished count: 
 ; 15 
 ; result: 
 ; 1.1224648393618204 
 ; 
 ; needed average_damp_level: 
 ; 2 
 ; ... 
  
 ;;; check how $sqrt[6]{2}$ converges when "Required Average Damps" is 1. 
 (root 6 2.0 1 1000000000 1.0 #t #f #t) 
 ;; outputs with the same result as the following `(nth-root 2.0 6 1)`. 
 ; average_damp_level: 
 ; 1 
 ; finished count: 
 ; 319396 
 ; result: 
 ; 1.1224584896267153 
 ;Value: 319396 
  
  
 ;;; from wiki Kaihao's comment 
 (define tolerance 0.00001)  
    
 (define (fixed-point f first-guess)  
   (define (close-enough? v1 v2)  
     (< (abs (- v1 v2)) tolerance))  
   (define (try guess)  
     (let ((next (f guess))) 
       ; (displayln next)  
       (if (close-enough? guess next)  
         next 
         (try next))))  
   (try first-guess))  
  
 (define (average a b)  
   (/ (+ a b) 2))  
  
 (define (average-damp f)  
   (lambda (x) (average x (f x))))  
  
 (define (square x)  
   (* x x))  
  
 (define (fast-expt b n)  
   (cond ((= n 0) 1)  
         ((even? n) (square (fast-expt b (/ n 2))))  
         (else (* b (fast-expt b (- n 1))))))  
  
 (define (compose f g)  
   (lambda (x)  
     (g (f x))))  
  
 (define (repeated f n)  
   (if (> n 1)  
     (compose (repeated f (- n 1)) f)  
     f))  
  
 ;; avarage-damp d times  
 (define (nth-root x n d)  
   (fixed-point ((repeated average-damp d)   
                 ; (lambda (y) (/ x (fast-expt y (- n 1)))))  
                 ;; `fast-expt` seems to be same as the internal `expt` since they have the same result. 
                 (lambda (y) (/ x (expt y (- n 1)))))  
               1.0)) 
 (nth-root 2.0 6 1)