
Problem 3 - Coding Theory

Consider an information source consisting of 5 characters $ s_1$, $ s_2$, $ s_3$, $ s_4$, $ s_5$ with probabilities $ \frac{1}{16}$, $ \frac{2}{16}$, $ \frac{3}{16}$, $ \frac{4}{16}$, $ \frac{6}{16}$, respectively.

1.
Find a Huffman code for this information source, and compute the average code length, where $ \log_2(3) \simeq 1.585$.
2.
Compute the entropy of this information source.
3.
Describe the relation between the entropy and the average code length.


1.
The code words are read off the Huffman tree:
\includegraphics{huffman1.ps}
One assignment consistent with this construction is $ s_1 \to 1111$, $ s_2 \to 1110$, $ s_3 \to 110$, $ s_4 \to 10$, $ s_5 \to 0$, i.e. code lengths $ 4$, $ 4$, $ 3$, $ 2$, $ 1$.
The average code length is:

$\displaystyle \bar{L} = 4\cdot\frac{1}{16} + 4\cdot\frac{2}{16} + 3\cdot\frac{3}{16} + 2\cdot\frac{4}{16}
+ 1\cdot\frac{6}{16} = \frac{35}{16} = 2.1875.$
(A numerical check of this construction is sketched after this solution.)

2.
The entropy is:
$ H(A) = \sum_{a\in A} p_a \log_2(\frac{1}{p_a}),$
$ H(A) = 4\cdot\frac{1}{16} + 3\cdot\frac{2}{16} + (4 - \log_2(3))\frac{3}{16} +
2\cdot\frac{4}{16} + (3 - \log_2(3))\frac{6}{16},$
$ H(A) \approx 2.108$ (see the second sketch after this solution for a direct computation).
3.
The efficiency of the code is:
$ e = \frac{H(A)}{\bar{L}} = \frac{2.108}{2.1875} \approx 0.964,$
which means that only about $ 4\%$ of the bits are redundant.
We can also point out that:
$ H(A) \le \bar{L} < H(A) + 1.$
Huffman coding can approach the entropy asymptotically by encoding blocks of source symbols (source extensions): for the $ n$-th extension of the source, $ H(A) \le \frac{\bar{L}_n}{n} < H(A) + \frac{1}{n}$, so for large blocks the proportion of redundant bits becomes negligible (the last sketch below checks the efficiency and the bound numerically).
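
As a numerical check for part 1, here is a small Python sketch (not part of the original solution) that rebuilds the Huffman code lengths with the standard heap-based construction and recomputes the average code length; the function name huffman_code_lengths is only an illustrative choice.

import heapq
from fractions import Fraction

# Source probabilities of s1..s5, as given in the problem statement.
probs = [Fraction(1, 16), Fraction(2, 16), Fraction(3, 16),
         Fraction(4, 16), Fraction(6, 16)]

def huffman_code_lengths(probs):
    """Return the Huffman codeword length of every symbol."""
    # Heap entries: (subtree probability, tie-breaker, symbols in the subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every codeword below them.
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, syms1 + syms2))
        counter += 1
    return lengths

lengths = huffman_code_lengths(probs)
# Tie-breaking may yield a tree different from the figure (e.g. lengths
# 3,3,2,2,2 instead of 4,4,3,2,1), but every optimal code has the same
# average length.
avg_len = sum(l * p for l, p in zip(lengths, probs))
print(lengths)
print(avg_len, float(avg_len))   # 35/16 = 2.1875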
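The entropy of part 2 can be checked directly; this is a minimal sketch using Python's math.log2, again not part of the original solution.

from math import log2

# H(A) = sum over a of p_a * log2(1 / p_a)
probs = [1 / 16, 2 / 16, 3 / 16, 4 / 16, 6 / 16]
H = sum(p * log2(1 / p) for p in probs)
print(H)   # about 2.108, matching the hand computation with log2(3) ~ 1.585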
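Finally, a short sketch for part 3 (an illustrative addition rather than part of the original solution) that recomputes the efficiency and verifies the bound $ H(A) \le \bar{L} < H(A) + 1$.

from math import log2

probs = [1 / 16, 2 / 16, 3 / 16, 4 / 16, 6 / 16]
H = sum(p * log2(1 / p) for p in probs)   # entropy from part 2, ~2.108
L_bar = 35 / 16                           # average Huffman code length from part 1

efficiency = H / L_bar
print(round(efficiency, 4))      # 0.9639: roughly 4% of the bits are redundant
print(H <= L_bar < H + 1)        # True: L_bar lies within one bit of the entropy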



Reynald AFFELDT
2000-06-08