In the theory of probability and statistics, the Dvoretzky–Kiefer–Wolfowitz inequality bounds how close an empirically determined distribution function will be to the distribution function from which the empirical samples are drawn. It is named after Aryeh Dvoretzky, Jack Kiefer, and Jacob Wolfowitz, who in 1956 proved the inequality with an unspecified multiplicative constant C in front of the exponent on the right-hand side.[1] In 1990, Pascal Massart proved the inequality with the sharp constant C = 2,[2] confirming a conjecture due to Birnbaum and McCarty.[3]
The DKW inequality
Given a natural number n, let X1, X2, …, Xn be real-valued independent and identically distributed random variables with cumulative distribution function F(·). Let Fn denote the associated empirical distribution function defined by
\( F_{n}(x)={\frac 1n}\sum _{{i=1}}^{n}{\mathbf {1}}_{{\{X_{i}\leq x\}}},\qquad x\in {\mathbb {R}}. \)
So F(x) is the probability that a single random variable X is smaller than x, and \( F_n(x) \) is the fraction of random variables that are smaller than } x.
The Dvoretzky–Kiefer–Wolfowitz inequality bounds the probability that the random function Fn differs from F by more than a given constant ε > 0 anywhere on the real line. More precisely, there is the one-sided estimate
\( \Pr {\Bigl (}\sup _{{x\in {\mathbb R}}}{\bigl (}F_{n}(x)-F(x){\bigr )}>\varepsilon {\Bigr )}\leq e^{{-2n\varepsilon ^{2}}}\qquad {\text{for every }}\varepsilon \geq {\sqrt {{\tfrac {1}{2n}}\ln 2}}, \)
which also implies a two-sided estimate[4]
\( \Pr {\Bigl (}\sup _{{x\in {\mathbb R}}}|F_{n}(x)-F(x)|>\varepsilon {\Bigr )}\leq 2e^{{-2n\varepsilon ^{2}}}\qquad {\text{for every }}\varepsilon >0. \)
This strengthens the Glivenko–Cantelli theorem by quantifying the rate of convergence as n tends to infinity. It also estimates the tail probability of the Kolmogorov–Smirnov statistic. The inequalities above follow from the case where F corresponds to be the uniform distribution on [0,1] in view of the fact[5] that Fn has the same distributions as Gn(F) where Gn is the empirical distribution of U1, U2, …, Un where these are independent and Uniform(0,1), and noting that
\( {\displaystyle \sup _{x\in \mathbb {R} }|F_{n}(x)-F(x)|\;{\stackrel {d}{=}}\;\sup _{x\in \mathbb {R} }|G_{n}(F(x))-F(x)|\leq \sup _{0\leq t\leq 1}|G_{n}(t)-t|,} \)
with equality if and only if F is continuous.
B
uilding CDF bands
See also: CDF-based nonparametric confidence interval
The Dvoretzky–Kiefer–Wolfowitz inequality is one method for generating CDF-based confidence bounds and producing a confidence band. The purpose of this confidence interval is to contain the entire CDF at the specified confidence level, while alternative approaches attempt to only achieve the confidence level on each individual point which can allow for a tighter bound. The DKW bounds runs parallel to, and is equally above and below, the empirical CDF. The equally spaced confidence interval around the empirical CDF allows for different rates of violations across the support of the distribution. In particular, it is more common for a CDF to be outside of the CDF bound estimated using the DKW inequality near the median of the distribution than near the endpoints of the distribution.
The interval that contains the true CDF, F ( x ) {\displaystyle F(x)} F(x), with probability \( 1-\alpha \) is often specified as
\( {\displaystyle F_{n}(x)-\varepsilon \leq F(x)\leq F_{n}(x)+\varepsilon \;{\text{ where }}\varepsilon ={\sqrt {\frac {\ln {\frac {2}{\alpha }}}{2n}}}.} \)
See also
Concentration inequality – a summary of bounds on sets of random variables.
References
Dvoretzky, A.; Kiefer, J.; Wolfowitz, J. (1956), "Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator", The Annals of Mathematical Statistics, 27 (3): 642–669, doi:10.1214/aoms/1177728174, MR 0083864
Massart, P. (1990), "The tight constant in the Dvoretzky–Kiefer–Wolfowitz inequality", Annals of Probability, 18 (3): 1269–1283, doi:10.1214/aop/1176990746, MR 1062069
Birnbaum, Z. W.; McCarty, R. C. (1958). "A distribution-free upper confidence bound for Pr{Y<X}, based on independent samples of X and Y". The Annals of Mathematical Statistics. 29: 558–562. doi:10.1214/aoms/1177706631. MR 0093874. Zbl 0087.34002.
Kosorok, M.R. (2008), "Chapter 11: Additional Empirical Process Results", Introduction to Empirical Processes and Semiparametric Inference, Springer, p. 210, ISBN 9780387749778
Shorack, G.R.; Wellner, J.A. (1986), Empirical Processes with Applications to Statistics, Wiley, ISBN 0-471-86725-X
Undergraduate Texts in Mathematics
Graduate Studies in Mathematics
Hellenica World - Scientific Library
Retrieved from "http://en.wikipedia.org/"
All text is available under the terms of the GNU Free Documentation License