# sicp-ex-1.45

Maggyero

QUESTION

We saw in section 1.3.3 that attempting to compute square roots by naively finding a fixed point of y ↦ x/y does not converge, and that this can be fixed by average damping. The same method works for finding cube roots as fixed points of the average-damped y ↦ x/y^2. Unfortunately, the process does not work for fourth roots—a single average damp is not enough to make a fixed-point search for y ↦ x/y^3 converge. On the other hand, if we average damp twice (i.e., use the average damp of the average damp of y ↦ x/y^3) the fixed-point search does converge. Do some experiments to determine how many average damps are required to compute nth roots as a fixed-point search based upon repeated average damping of y ↦ x/y^{n − 1}. Use this to implement a simple procedure for computing nth roots using fixed-point, average-damp, and the repeated procedure of exercise 1.43. Assume that any arithmetic operations you need are available as primitives.

 (import (scheme small))

(define (nth-root x n)
(define (fixed-point f first-guess)
(define (try guess)
(let ((next (f guess)))
(if (close-enough? guess next)
guess
(try next))))
(define (close-enough? x y)
(< (abs (- x y)) 0.00001))
(try first-guess))
(define (average-damp f)
(lambda (x) (/ (+ x (f x)) 2)))
(define (repeated f n)
(define (compose f g)
(lambda (x) (f (g x))))
(if (= n 1)
f
(compose f (repeated f (- n 1)))))
(fixed-point
((repeated average-damp (floor (log n 2)))
(lambda (y) (/ x (expt y (- n 1)))))
1.0))

; Average damping of y ↦ x/y^{n − 1} must be repeated ⌊log_2(n)⌋ times to compute nth roots of x as a fixed-point search.

(display (nth-root 2.0 2))
(newline)
(display (nth-root 2.0 3))
(newline)
(display (nth-root 2.0 4))
(newline)
(display (nth-root 2.0 5))
(newline)
(display (nth-root 2.0 6))
(newline)
(display (nth-root 2.0 7))
(newline)
(display (nth-root 2.0 8))
(newline)



(define (average x y)
(/ (+ x y) 2.0))

(define (average-damp f)
(lambda (x) (average x (f x))))

(define tolerance 0.00001)

(define (fixed-point f first-guess)
(define (close-enough? v1 v2)
(< (abs (- v1 v2)) tolerance))
(define (try guess)
(let ((next (f guess)))
(if (close-enough? guess next)
next
(try next))))
(try first-guess))

(define (repeated f n)
(if (= n 1)
f
(lambda (x) (f ((repeated f (- n 1)) x)))))

(define (get-max-pow n)
(define (iter p r)
(if (< (- n r) 0)
(- p 1)
(iter (+ p 1) (* r 2))))

(iter 1 2))

(define (pow b p)
(define (even? x)
(= (remainder x 2) 0))

(define (sqr x)
(* x x))

(define (iter res a n)
(if (= n 0)
res
(if (even? n)
(iter res (sqr a) (/ n 2))
(iter (* res a) a (- n 1)))))

(iter 1 b p))

(define (nth-root n x)
(fixed-point ((repeated average-damp (get-max-pow n))
(lambda (y) (/ x (pow y (- n 1)))))
1.0))


Example: (nth-root 5 32)

2.000001512995761

The number of times to repeat average-damp can also be calculated using floor and log to the base 2 as follows


(define (log2 x) (/ (log x) (log 2)))
(define (nth-root n x)
(fixed-point ((repeated average-damp (floor (log2 n)))
(lambda (y) (/ x (pow y (- n 1)))))
1.0))


Kaihao

I think the above number of average damps required to compute nth root is wrong.

 (define tolerance 0.00001)

(define (fixed-point f first-guess)
(define (close-enough? v1 v2)
(< (abs (- v1 v2)) tolerance))
(define (try guess)
(let ((next (f guess)))
(if (close-enough? guess next)
next
(try next))))
(try first-guess))

(define (average a b)
(/ (+ a b) 2))

(define (average-damp f)
(lambda (x) (average x (f x))))

(define (square x)
(* x x))

(define (fast-expt b n)
(cond ((= n 0) 1)
((even? n) (square (fast-expt b (/ n 2))))
(else (* b (fast-expt b (- n 1))))))

(define (compose f g)
(lambda (x)
(g (f x))))

(define (repeated f n)
(if (> n 1)
(compose (repeated f (- n 1)) f)
f))

;; avarage-damp d times
(define (nth-root x n d)
(fixed-point ((repeated average-damp d)
(lambda (y) (/ x (fast-expt y (- n 1)))))
1.0))

;; Experimental results from 2rd to 20th root:
;; Root  Required Average Damps
;; 2     1
;; 3     1
;; 4     2
;; 5     2
;; 6     1
;; 7     1
;; 8     1
;; 9     1
;; 10    1
;; 11    1
;; 12    1
;; 13    3
;; 14    1
;; 15    1
;; 16    1
;; 17    1
;; 18    1
;; 19    1
;; 20    1


LisScheSic

a. The exercise will be better understood if we have some knowledge of numerical analysis since it is one approximation problem.

b. Here the answer $\lfloor\log n\rfloor$ is based on its similarity with Newton's method which is a bit like the discrete version of the gradient descent method. The related mapping process figure is shown in https://deltam.blogspot.com/2015/08/sicp145ex145.html referred to in https://github.com/xxyzz/SICP.

The related maths proof is shown in https://www.zhihu.com/question/28838814 where we do the mapping in the same way as Newton's method but with one different line. The link is the reference of deltam's blog.

1. when average_damp_level is very large, the difference between mapping adjacent values is too small (< 0.00001 here) due to the high slope, e.g. 637621501.2140495 and 637621501.2140489.

2. when average_damp_level is very small, we have the following loop. Here I put the reasons behind the mapping inside the bracket pair:

2.2882799226996533 -(due to the low slope $2^m y_k^{n-1}$)> 869130.4304532777 -(/2: since following zhihu link formula when y=1, m=1, x>>y, y_{n+1}\approx y_n/2)> 434565.21522663883 -(/2)> 217282.60761331941 -> ... -> 1.668911759469753 -> 4526863251949.442

Code minimal_test/1_45_minimal_test.scm:

 ;;; from the above wiki
(define (average-damp f)
(lambda (x) (/ (+ x (f x)) 2)))
(define (repeated f n)
(define (compose f g)
(lambda (x) (f (g x))))
(if (= n 1)
f
(compose f (repeated f (- n 1)))))

;;; from book
(define tolerance 0.00001)

;;; small modified fixed_point
(define (displayln x)
(newline)
(display x))
(define (fixed_point_bound f first-guess bound whether_output_cnt whether_output_guess_mapping)
(define (close-enough? v1 v2)
(< (abs (- v1 v2)) tolerance))
(define (try guess count)
(let ((next (f guess)))
(if whether_output_guess_mapping
(displayln guess)
)
(if (or (close-enough? guess next) (= count bound))
(begin
(displayln "finished count:")
(displayln count)
(displayln "result:")
(displayln next)
(if whether_output_cnt
count
next))
(try next (+ count 1)))))
(if whether_output_guess_mapping
(displayln "fixed_point_bound mapping process:")
)
(try first-guess 1))

(define small_bound 100)
(define (init_map x y n whether_use_internal_implementation)
(if whether_use_internal_implementation
(/ x (expt y (- n 1)))
(/ x ((repeated (lambda (m) (* y m)) (- n 1)) 1))))
(define (root n x average_damp_level bound init whether_output_cnt whether_output_guess_mapping whether_use_internal_implementation)
(define transformed_f ((repeated average-damp average_damp_level) (lambda (y) (init_map x y n whether_use_internal_implementation))))
(displayln "")
(displayln "average_damp_level:")
(displayln average_damp_level)
(fixed_point_bound transformed_f init bound whether_output_cnt whether_output_guess_mapping))
(define n2 50)
(- (root n2 (expt 3 n2) 1 (* small_bound 2) 1.0 #f #t #f) 3)
;; It will have y sequence including the loop above.

(- (root n2 (expt 3 n2) n2 small_bound 1.0 #f #t #f) 3)
;; It will have y sequence:
; average_damp_level:
; 50
; fixed_point_bound mapping process:
; 1.
; 637621501.2140495
; finished count:
; 2
; result:
; 637621501.2140489
;Value: 637621498.2140489


3. floor will normally do better than ceil for $log_2 n$. The analysis is as the following reusing the above codes:

 ;;; compare floor and ceil for log n
(define (log_base base n)
(/ (log n) (log base)))
(define (find_floor_worse_than_ceil use_power_plus_one init_exp base count init ending_exp whether_output_guess_mapping)
(define n ((if use_power_plus_one
+
-) (expt 2 init_exp) 1)) ; this will make n-floor(n) greatest for one init_exp
(define x (expt base n))
(define floor_average_damp_level (inexact->exact (floor (log_base 2 n))))
(define ceiling_average_damp_level (inexact->exact (ceiling (log_base 2 n))))
(displayln "")
(displayln "init_exp:")
(displayln init_exp)
(let ((diff (- (root n x floor_average_damp_level count init #t whether_output_guess_mapping #f) (root n x ceiling_average_damp_level count init #t whether_output_guess_mapping #f))))
(if (> diff 0)
(displayln "ceiling better")
(displayln "floor better"))
(if (= init_exp ending_exp)
(displayln "finish")
(find_floor_worse_than_ceil use_power_plus_one (+ init_exp 1) base count init ending_exp whether_output_guess_mapping))))

(define big_bound (* small_bound 10))
(define base 1.1)
;; Here init_exp=1 will floor to 0 which is not considered in repeated.
(find_floor_worse_than_ceil #f 2 base big_bound 1.0 10 #f)
;; the above will change from "ceiling better" to "floor better" when init_exp change from 6 to 7 when base=1.1.
;; result will become inf (in Racket)/nan (in MIT/GNU Scheme) when init_exp becomes 10 if choosing base=3 since I calculate $\sqrt[2^n-1]{base^{2^n-1}}=base$ to test the result correctness.
(find_floor_worse_than_ceil #t 1 base big_bound 1.0 10 #f)
;; the above only have "ceiling better" for init_exp=1.

(find_floor_worse_than_ceil #f 3 3 big_bound 1.0 4 #t)
;; Here when init_exp=3,
;; floor will first map to one further point $y_1$ due to the low slope, then map to one point $y_2$ closer to the zero point $z<y_2$, then step by step converge to $z$.
;; ceil will do the inverse, i.e. map to one relatively closer point $y_2$ compared with floor and $y_2‘$ further from $z$ than $y_2$ ...
;; In summary, floor will do worse at the 1st step but better in the latter steps while ceil is the inverse situation.
;; So normally, floor will do better. But we need larger latter step number to amortize, i.e. when init_exp is small ceil may do better.
;;
;; the above will output
; init_exp:
; 3
;
; average_damp_level:
; 3
; fixed_point_bound mapping process:
; 1.
; 274.25
; 239.96875000000063
; 209.972656250002
; ...
; 3.000002412349743
; finished count:
; 42
; result:
; 3.00000030154881
;
; average_damp_level:
; 2
; fixed_point_bound mapping process:
; 1.
; 547.5
; 410.625
; 307.9687500000001
; ...
; 3.0000074196296165
; 2.999994435374126
; finished count:
; 55
; result:
; 3.0000041735235943
; ceiling better
;
; init_exp:
; 4
;
; average_damp_level:
; 4
; fixed_point_bound mapping process:
; 1.
; 896807.625
; 840757.1484375
; ...
; 3.000037933721064
; 3.0000023740050947
; finished count:
; 202
; result:
; 3.000000148387647
;
; average_damp_level:
; 3
; fixed_point_bound mapping process:
; 1.
; 1793614.25
; 1569412.46875
; 1373235.91015625
; ...
; 2.9999951975788592
; finished count:
; 166
; result:
; 3.000004202219401
; floor better
;;


c. Notice: Same as Newton's method, the initial guess matters. The result is also implied by the recurrence relation. Use the above function implementation to show the differences:

 ;;; check negative init
(root 2 2.0 1 small_bound 1.0 #f #t #f)
;Value: 1.4142135623746899
(root 2 2.0 1 small_bound -1.0 #f #t #f)
;Value: -1.4142135623746899
(root 3 2.0 1 small_bound 1.0 #f #t #f)
;Value: 1.259923236422975
(root 3 2.0 1 small_bound -1.0 #f #t #f)
;Value: 1.2599233631602056


d. Kaihao's comment doesn't give its statistics data. Actually although ";; 6 1" is right, the convergence iteration number is very large (319396 tested for $sqrt[6]{2}$). But for floor it only needs 15 (See the code comments and outputs). This is similar to https://web.archive.org/web/20161018022729/https://sicp.g.hatena.ne.jp/n-oohira/20090207/1233979052 (reference of deltam's blog) says "I felt like the calculation was a bit sluggish, so I guess it's just something like "it will converge if you try hard."" (translated by Google)

Here is the test codes with also find_valid_average_damp_level to find all needed average-damp nested level:

 (load "minimal_test/1_45_minimal_test.scm")
;;; same as the wiki parameter choice.
(define (find_valid_average_damp_level max_root start_root bound init)
(define x 2.0)
(define (helper root_order average_damp_level)
;; same as Kaihao's implementation, otherwise the above will output differently although it also converges
(if (< (root root_order x average_damp_level bound init #t #f #t) bound)
(begin
(displayln "")
(displayln "needed average_damp_level:")
average_damp_level)
(helper root_order (+ average_damp_level 1))))
(if (> start_root max_root)
(displayln "finish")
(begin
(displayln "")
(displayln "start_root:")
(displayln start_root)
(displayln (helper start_root 1))
(find_valid_average_damp_level max_root (+ start_root 1) bound init))))
(find_valid_average_damp_level 20 2 small_bound 1.0)
;; outputs
; ...
; start_root:
; 6
;
; average_damp_level:
; 1
; finished count:
; 100
; result:
; .9019353334996821
;
; average_damp_level:
; 2
; finished count:
; 15
; result:
; 1.1224648393618204
;
; needed average_damp_level:
; 2
; ...

;;; check how $sqrt[6]{2}$ converges when "Required Average Damps" is 1.
(root 6 2.0 1 1000000000 1.0 #t #f #t)
;; outputs with the same result as the following (nth-root 2.0 6 1).
; average_damp_level:
; 1
; finished count:
; 319396
; result:
; 1.1224584896267153
;Value: 319396

;;; from wiki Kaihao's comment
(define tolerance 0.00001)

(define (fixed-point f first-guess)
(define (close-enough? v1 v2)
(< (abs (- v1 v2)) tolerance))
(define (try guess)
(let ((next (f guess)))
; (displayln next)
(if (close-enough? guess next)
next
(try next))))
(try first-guess))

(define (average a b)
(/ (+ a b) 2))

(define (average-damp f)
(lambda (x) (average x (f x))))

(define (square x)
(* x x))

(define (fast-expt b n)
(cond ((= n 0) 1)
((even? n) (square (fast-expt b (/ n 2))))
(else (* b (fast-expt b (- n 1))))))

(define (compose f g)
(lambda (x)
(g (f x))))

(define (repeated f n)
(if (> n 1)
(compose (repeated f (- n 1)) f)
f))

;; avarage-damp d times
(define (nth-root x n d)
(fixed-point ((repeated average-damp d)
; (lambda (y) (/ x (fast-expt y (- n 1)))))
;; fast-expt seems to be same as the internal expt since they have the same result.
(lambda (y) (/ x (expt y (- n 1)))))
1.0))
(nth-root 2.0 6 1)