The log sum inequality is used for proving theorems in information theory.
Statement
Let \( a_1,\ldots,a_n \) and \( b_1,\ldots,b_n \) be nonnegative numbers. Denote the sum of all \( a_i \) by \( a=\sum_{i=1}^{n}a_i \) and the sum of all \( b_i \) by \( b=\sum_{i=1}^{n}b_i \). The log sum inequality states that
\( \sum_{i=1}^{n} a_i \log\frac{a_i}{b_i} \geq a\log\frac{a}{b}, \)
with equality if and only if the ratios \( \frac{a_i}{b_i} \) are equal for all \( i \), in other words \( a_i = cb_i \) for some constant \( c \) and all \( i \).[1]
(Take \( a_i \log\frac{a_i}{b_i} \) to be \( 0 \) if \( a_i = 0 \), and \( \infty \) if \( a_i > 0 \) and \( b_i = 0 \). These are the limiting values obtained as the relevant number tends to \( 0 \).)[1]
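The statement and its limiting conventions can be checked numerically. The sketch below is illustrative only; the function names `log_sum_lhs` and `log_sum_rhs` are assumptions, not standard identifiers.

```python
import math

def log_sum_lhs(a, b):
    """Left-hand side: sum_i a_i * log(a_i / b_i), with the stated conventions."""
    total = 0.0
    for ai, bi in zip(a, b):
        if ai == 0:
            continue          # a_i log(a_i/b_i) taken as 0 when a_i = 0
        if bi == 0:
            return math.inf   # taken as infinity when a_i > 0, b_i = 0
        total += ai * math.log(ai / bi)
    return total

def log_sum_rhs(a, b):
    """Right-hand side: a * log(a / b), where a = sum(a_i), b = sum(b_i)."""
    A, B = sum(a), sum(b)
    if A == 0:
        return 0.0
    if B == 0:
        return math.inf
    return A * math.log(A / B)

a = [1.0, 2.0, 3.0]
b = [3.0, 2.0, 1.0]
assert log_sum_lhs(a, b) >= log_sum_rhs(a, b)

# Equality case: a_i = c * b_i for all i.
c = 2.5
a2 = [c * bi for bi in b]
assert abs(log_sum_lhs(a2, b) - log_sum_rhs(a2, b)) < 1e-12
```

The equality check confirms that both sides collapse to \( cb\log c \) when every ratio equals the same constant \( c \).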
Proof
Notice that after setting \( f(x)=x\log x \) we have
\( \begin{aligned}\sum_{i=1}^{n} a_i \log\frac{a_i}{b_i} &= \sum_{i=1}^{n} b_i f\left(\frac{a_i}{b_i}\right) = b\sum_{i=1}^{n}\frac{b_i}{b} f\left(\frac{a_i}{b_i}\right)\\ &\geq b f\left(\sum_{i=1}^{n}\frac{b_i}{b}\,\frac{a_i}{b_i}\right) = b f\left(\frac{1}{b}\sum_{i=1}^{n} a_i\right) = b f\left(\frac{a}{b}\right)\\ &= a\log\frac{a}{b}, \end{aligned} \)
where the inequality follows from Jensen's inequality since \( \frac{b_i}{b}\geq 0 \), \( \sum_{i=1}^{n}\frac{b_i}{b}=1 \), and \( f \) is convex.[1]
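The Jensen step above can be verified directly: with weights \( w_i = b_i/b \) and points \( x_i = a_i/b_i \), convexity of \( f(x)=x\log x \) gives \( f(\sum_i w_i x_i) \leq \sum_i w_i f(x_i) \). A minimal sketch of that single step:

```python
import math

def f(x):
    """f(x) = x * log(x), convex on (0, infinity)."""
    return x * math.log(x)

a = [1.0, 2.0, 3.0]
b = [3.0, 2.0, 1.0]
B = sum(b)
w = [bi / B for bi in b]                   # weights b_i / b, nonnegative, sum to 1
x = [ai / bi for ai, bi in zip(a, b)]      # points a_i / b_i

mean = sum(wi * xi for wi, xi in zip(w, x))
# Jensen's inequality for the convex f:
assert f(mean) <= sum(wi * f(xi) for wi, xi in zip(w, x))
```

Multiplying both sides of this Jensen inequality by \( b \) recovers the middle line of the derivation.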
Generalizations
The inequality remains valid for \( n=\infty \), provided that \( a<\infty \) and \( b<\infty \). The proof above holds for any function \( g \) such that \( f(x)=xg(x) \) is convex, such as all continuous non-decreasing functions. Generalizations to non-decreasing functions other than the logarithm are given in Csiszár & Shields (2004).
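As one instance of this generalization, taking \( g(x)=x \) (so that \( f(x)=xg(x)=x^2 \) is convex) turns the argument into the inequality \( \sum_i a_i^2/b_i \geq a^2/b \). A quick numerical check of that instance (illustrative values, not from the source):

```python
# Generalized form with g(x) = x, so f(x) = x * g(x) = x^2 is convex:
#   sum_i a_i * g(a_i/b_i) >= a * g(a/b),  i.e.  sum_i a_i^2/b_i >= a^2/b.
a = [1.0, 2.0, 3.0]
b = [3.0, 2.0, 1.0]

lhs = sum(ai**2 / bi for ai, bi in zip(a, b))
rhs = sum(a)**2 / sum(b)
assert lhs >= rhs
```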
Applications
The log sum inequality can be used to prove inequalities in information theory. Gibbs' inequality states that the Kullback–Leibler divergence is non-negative, and equal to zero precisely when its arguments are equal;[2] one proof uses the log sum inequality.
The inequality can also be used to prove convexity of the Kullback–Leibler divergence.[3]
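Both consequences can be checked numerically. The sketch below uses an assumed helper name `kl_divergence` and illustrative distributions; it demonstrates Gibbs' inequality and the joint convexity of \( D(p\|q) \) in the pair \( (p,q) \).

```python
import math

def kl_divergence(p, q):
    """D(p || q) = sum_i p_i * log(p_i / q_i), with the usual 0-conventions."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue          # 0 * log(0 / q_i) taken as 0
        if qi == 0:
            return math.inf   # p_i > 0, q_i = 0
        total += pi * math.log(pi / qi)
    return total

# Gibbs' inequality: D(p || q) >= 0, with equality iff p = q.
p = [0.5, 0.25, 0.25]
q = [0.4, 0.3, 0.3]
assert kl_divergence(p, q) >= 0.0
assert kl_divergence(p, p) == 0.0

# Joint convexity: for 0 <= lam <= 1,
#   D(lam*p1 + (1-lam)*p2 || lam*q1 + (1-lam)*q2)
#     <= lam*D(p1 || q1) + (1-lam)*D(p2 || q2).
p1, q1 = [0.7, 0.3], [0.5, 0.5]
p2, q2 = [0.2, 0.8], [0.6, 0.4]
lam = 0.3
p_mix = [lam * x + (1 - lam) * y for x, y in zip(p1, p2)]
q_mix = [lam * x + (1 - lam) * y for x, y in zip(q1, q2)]
assert kl_divergence(p_mix, q_mix) <= (
    lam * kl_divergence(p1, q1) + (1 - lam) * kl_divergence(p2, q2)
)
```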
Notes
1. Cover & Thomas (1991), p. 29.
2. MacKay (2003), p. 34.
3. Cover & Thomas (1991), p. 30.
References
Thomas M. Cover; Joy A. Thomas (1991). Elements of Information Theory. Hoboken, New Jersey: Wiley. ISBN 978-0-471-24195-9.
Csiszár, I.; Shields, P. (2004). "Information Theory and Statistics: A Tutorial" (PDF). Foundations and Trends in Communications and Information Theory. 1 (4): 417–528. doi:10.1561/0100000004. Retrieved 2009-06-14.
Han, T. S.; Kobayashi, K. (2001). Mathematics of Information and Coding. American Mathematical Society. ISBN 0-8218-0534-7.
Information Theory course materials, Utah State University. Retrieved 2009-06-14.
MacKay, David J.C. (2003). Information Theory, Inference, and Learning Algorithms. Cambridge University Press. ISBN 0-521-64298-1.