
In information theory, Fano's inequality (also known as the Fano converse and the Fano lemma) relates the average information lost in a noisy channel to the probability of the categorization error. It was derived by Robert Fano in the early 1950s while teaching a Ph.D. seminar in information theory at MIT, and later recorded in his 1961 textbook.

It is used to find a lower bound on the error probability of any decoder, as well as lower bounds on minimax risks in density estimation.

Let the random variables X and Y represent input and output messages with a joint probability P(x,y). Let e represent an occurrence of error; i.e., that \( X\neq {\tilde {X}} \), with \( {\tilde {X}}=f(Y) \) being an approximate version of X. Fano's inequality is

\( H(X|Y)\leq H(e)+P(e)\log(|{\mathcal {X}}|-1), \)

where \( {\mathcal {X}} \) denotes the support of X,

\( H(X|Y)=-\sum _{i,j}P(x_{i},y_{j})\log P(x_{i}|y_{j}) \)

is the conditional entropy,

\( P(e)=P(X\neq {\tilde {X}}) \)

is the probability of the communication error, and

\( H(e)=-P(e)\log P(e)-(1-P(e))\log(1-P(e)) \)

is the corresponding binary entropy.
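
For readers who prefer a computational check, the following minimal Python sketch verifies the inequality on a toy channel. The joint distribution P below and the variable names are illustrative assumptions, not from the text; the decoder f is taken to be the maximum-a-posteriori rule, and natural logarithms are used throughout (so all quantities are in nats; any fixed base works as long as it is used consistently).

    # Minimal numerical check of Fano's inequality on an illustrative 3-symbol channel.
    import numpy as np

    # P[i, j] = P(X = i, Y = j): a toy joint distribution (assumption for this example)
    P = np.array([[0.30, 0.03, 0.02],
                  [0.02, 0.28, 0.03],
                  [0.03, 0.02, 0.27]])

    Py = P.sum(axis=0)              # marginal P(Y = j)
    Pxy = P / Py                    # conditional P(X = i | Y = j), column by column

    # Conditional entropy H(X|Y) = -sum_{i,j} P(x_i, y_j) log P(x_i | y_j)
    H_XY = -np.sum(P * np.log(Pxy))

    # MAP decoder X~ = f(Y); error probability P(e) = P(X != X~)
    decoded = Pxy.argmax(axis=0)    # f(y_j) for each column j
    P_correct = sum(P[decoded[j], j] for j in range(P.shape[1]))
    Pe = 1.0 - P_correct

    # Binary entropy H(e) and the Fano bound H(e) + P(e) log(|X| - 1)
    H_e = -Pe * np.log(Pe) - (1 - Pe) * np.log(1 - Pe)
    bound = H_e + Pe * np.log(P.shape[0] - 1)

    print(f"H(X|Y) = {H_XY:.4f} <= {bound:.4f} = H(e) + P(e) log(|X|-1)")

On this particular example the MAP decoder gives P(e) = 0.15 and the two sides differ by only a few thousandths of a nat, so the bound is nearly tight here.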
Alternative formulation

Let X be a random variable with density equal to one of r + 1 possible densities \( f_{1},\ldots ,f_{r+1} \). Furthermore, suppose that the Kullback–Leibler divergence between any pair of densities is not too large,

\( D_{KL}(f_{i}\|f_{j})\leq \beta \) for all \( i\neq j \).

Let \( \psi (X)\in \{1,\ldots ,r+1\} \) be an estimate of the index. Then

\( \sup _{i}P_{i}(\psi (X)\neq i)\geq 1-{\frac {\beta +\log 2}{\log r}} \)

where \( P_{i} \) is the probability measure induced by \( f_{i} \).
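
As a quick numerical illustration, the bound can be evaluated directly; the function name and the values of r and β below are assumptions chosen only for the example, and natural logarithms are used as above.

    # Sketch: evaluating the lower bound from the alternative formulation.
    import math

    def fano_alternative_bound(r, beta):
        """Lower bound on sup_i P_i(psi(X) != i) when there are r + 1 candidate
        densities whose pairwise KL divergences are all at most beta."""
        return 1.0 - (beta + math.log(2)) / math.log(r)

    # Example: 101 densities (r = 100) with pairwise KL divergence at most 0.5
    print(fano_alternative_bound(r=100, beta=0.5))   # ~0.74: no test identifies the index reliably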

Generalization

The following generalization is due to Ibragimov and Khasminskii (1979) and to Assouad and Birge (1983).

Let F be a class of densities with a subclass of r + 1 densities \( f_{\theta } \) such that for any \( \theta \neq \theta ' \),

\( \|f_{\theta }-f_{\theta '}\|_{L_{1}}\geq \alpha , \)
\( D_{KL}(f_{\theta }\|f_{\theta '})\leq \beta . \)

Then in the worst case the expected value of the estimation error is bounded from below,

\( \sup _{f\in \mathbf {F} }E\|f_{n}-f\|_{L_{1}}\geq {\frac {\alpha }{2}}\left(1-{\frac {n\beta +\log 2}{\log r}}\right) \)

where \( f_{n} \) is any density estimator based on a sample of size n.
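
A corresponding sketch for the minimax bound follows, again with illustrative values of α, β, r and n (and a function name of my own) rather than values taken from any particular estimation problem.

    # Sketch: evaluating the minimax L1 lower bound from the generalization above.
    import math

    def minimax_l1_lower_bound(alpha, beta, r, n):
        """Lower bound on sup_F E ||f_n - f||_L1 given a subclass of r + 1 densities
        that are alpha-separated in L1 and have pairwise KL divergence at most beta."""
        return (alpha / 2.0) * (1.0 - (n * beta + math.log(2)) / math.log(r))

    # The bound is informative only while n * beta stays small compared to log r
    for n in (1, 10, 100):
        print(n, minimax_l1_lower_bound(alpha=0.2, beta=0.01, r=1000, n=n))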

References

P. Assouad, "Deux remarques sur l'estimation", Comptes Rendus de l'Académie des Sciences de Paris, Vol. 296, pp. 1021–1024, 1983.
L. Birge, "Estimating a density under order restrictions: nonasymptotic minimax risk", Technical report, UER de Sciences Économiques, Universite Paris X, Nanterre, France, 1983.
T. Cover, J. Thomas (1991). Elements of Information Theory. pp. 38–42. ISBN 978-0-471-06259-2.
L. Devroye, A Course in Density Estimation. Progress in probability and statistics, Vol 14. Boston, Birkhauser, 1987. ISBN 0-8176-3365-0, ISBN 3-7643-3365-0.
R. Fano, Transmission of Information: A Statistical Theory of Communications, M.I.T. Press, Cambridge, Mass., 1961, ISBN 0-262-06001-9; also MIT Press, 1968, ISBN 978-0-262-56169-3, OCLC 804123877.
R. Fano, "Fano inequality", Scholarpedia, 2008.
I. A. Ibragimov, R. Z. Has′minskii, Statistical estimation, asymptotic theory. Applications of Mathematics, vol. 16, Springer-Verlag, New York, 1981. ISBN 0-387-90523-5
