sicp-ex-2.70



<< Previous exercise (2.69) | Index | Next exercise (2.71) >>


frandibar

The huffmann tree:

 ((((leaf A 8) (leaf YIP 9) (A YIP) 17) ((((leaf JOB 2) (leaf GET 2) (JOB GET) 4) (((leaf WAH 1) (leaf BOOM 1) (WAH BOOM) 2) (leaf SHA 3) (WAH BOOM SHA) 5) (JOB GET WAH BOOM SHA) 9) (leaf NA 16) (JOB GET WAH BOOM SHA NA) 25) (A YIP JOB GET WAH BOOM SHA NA) 42))  

The encoded lyrics:

(0 1 0 0 1 0 0 0 0 1 0 0 0 0 1 0 1 1 0 1 1 0 1 1 0 1 1 0 1 1 0 1 1 0 1 1 0 1 1 0 1 1 0 1 0 0 1 0 0 0 0 1 0 0 0 0 1 0 1 1 0 1 1 0 1 1 0 1 1 0 1 1 0 1 1 0 1 1 0 1 1 0 1 1 0 1 0 1 0 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 1 1 0 1 0 1 0 1)

How many bits are required for the encoding? 128

What is the smallest number of bits that would be needed to encode this song if we used a fixed-length code for the eight-symbol alphabet? 108

2^3 = 8 so we need 3 bits to represent symbols

The lyrics has 36 symbols

=> 3 x 36 = 108


pluies

frandibar, I think there is a problem in your Huffman tree. It's illogical that the Huffman-encoded message it creates is longer than the fixed-length one, given that the whole point of Huffman's algorithm is to shorten messages containing repetitive parts.

My code gives:

 (define rocktree (generate-huffman-tree '((A 2) (NA 16) (BOOM  1) (SHA 3) (GET 2) (YIP 9) (JOB 2) (WAH 1)))) 
  
 rocktree 
  
 ((leaf na 16) ((leaf yip 9) (((leaf a 2) ((leaf wah 1) (leaf boom 1) (wah boom) 2) (a wah boom) 4) ((leaf sha 3) ((leaf job 2) (leaf get 2) (job get) 4) (sha job get) 7) (a wah boom sha job get) 11) (yip a wah boom sha job get) 20) (na yip a wah boom sha job get) 36) 

We can then encode the song:

 (define rock-song '(Get a job Sha na na na na na na na na Get a job Sha na na na na na na na na Wah yip yip yip yip yip yip yip yip yip Sha boom)) 
  
 (define encoded-rock-song (encode rock-song rocktree)) 
  
 encoded-rock-song 
  
 (1 1 1 1 1 1 1 0 0 1 1 1 1 0 1 1 1 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 1 1 1 1 0 1 1 1 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 1 1 0 1 1 0 1 1) 

And compare the length of the encoded message vs. its fixed-length version:

 (length encoded-rock-song) 
 84 
  
 ; If we were to use a fixed-length encoding on that rock song, we would need 3 bits (8 = 2^3) per symbol, i.e.: 
 (* 3 (length rock-song)) 
 108 

A 22% gain in size seems to be coherent.