
   
Confidence Intervals

When classifying images using test data, the true classifier performance ($P_c$) is estimated as

\begin{displaymath}\widehat{P_c}=\frac{\textrm{Number of correct classifications}}
{\textrm{Number of classified samples}}
\end{displaymath} (1.4)

Each classified sample is therefore a Bernoulli trial, so the number of correct classifications among $N$ classified samples is a binomial random variable, and the estimate of classifier performance ($\widehat{P_c}$) has a mean of $P_c$ and a variance of $\frac{P_c(1-P_c)}{N}$. As the number of samples is increased, the binomial distribution can be approximated by a Gaussian distribution with the same mean and variance. It is then possible to assign a confidence interval to the performance estimate $\widehat{P_c}$ [18, p. 250]:

 \begin{displaymath}P_c = \widehat{P_c}\pm z_u\sqrt{\frac{\widehat{P_c}(1-\widehat{P_c})}{N}}
\end{displaymath} (1.5)

where $z_u$ satisfies

\begin{displaymath}u=\frac{1}{\sqrt{2\pi}}\int_{-z_u}^{z_u}e^{-\frac{z^2}{2}}dz
\end{displaymath} (1.6)

for a particular value of $u$ ($u=0.95$ and $z_u=1.96$ below, unless otherwise stated).
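As a rough illustration of Equations 1.5 and 1.6, the following Python sketch computes the interval from a count of correct classifications. The function names `z_for_coverage` and `accuracy_interval` and the counts in the usage example (850 correct out of 1000) are assumptions made for illustration only; they do not come from the experiments described here. The sketch uses the Gaussian approximation, so it is only reasonable when $N$ is large and $\widehat{P_c}$ is not too close to 0 or 1.

```python
from math import sqrt
from statistics import NormalDist


def z_for_coverage(u: float) -> float:
    """Return z_u such that a standard normal variable falls in
    [-z_u, z_u] with probability u (Equation 1.6)."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    # The two tails share the remaining mass (1 - u) equally, so z_u is
    # the (1 + u) / 2 quantile of the standard normal distribution.
    return NormalDist().inv_cdf((1.0 + u) / 2.0)


def accuracy_interval(n_correct: int, n_samples: int, u: float = 0.95):
    """Return (p_hat, lower, upper) using the Gaussian approximation
    of Equation 1.5. Hypothetical helper, not the author's code."""
    if n_samples <= 0 or not 0 <= n_correct <= n_samples:
        raise ValueError("need 0 <= n_correct <= n_samples and n_samples > 0")
    p_hat = n_correct / n_samples  # Equation 1.4
    half_width = z_for_coverage(u) * sqrt(p_hat * (1.0 - p_hat) / n_samples)
    return p_hat, p_hat - half_width, p_hat + half_width


if __name__ == "__main__":
    # Illustrative counts only (assumed, not measured results).
    p_hat, lower, upper = accuracy_interval(n_correct=850, n_samples=1000)
    print(f"estimate {p_hat:.3f}, interval ({lower:.3f}, {upper:.3f})")
```

Because the half-width shrinks as $1/\sqrt{N}$, quadrupling the number of classified samples roughly halves the width of the interval for a given $\widehat{P_c}$.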



Copyright ©1999 Michael V. Boland
1999-09-18