Digital search trees again revisited: the internal path length perspective
Abstract.
This paper studies the
asymptotics of the variance for the internal
path length in a symmetric digital search tree. The problem was open up to
now. We prove that for the binary digital search tree the variance is
asymptotically equal to $0.26600\dots
\cdot N+N\delta(\log_2 N)$ where N is the
number of stored records and $\delta(x)$ is a periodic function of mean zero
and a very small amplitude. This result implies that the internal path
length becomes asymptotically $N\cdot\log_2 N$
{\it with probability one} (i.e.
almost surely). In our previous work we have argued that the variance of the
internal (external) path length is a good indicator how well the digital
trees are balanced. We shall show that the digital search tree is the best
balanced digital tree in the sense that a random shape of this tree strongly
resembles a shape of a complete tree. Therefore, we conclude that a
symmetric digital tree is a good candidate for a dictionary structure, and a
typical search time is asymptotically equal to the optimal one for these
type of structures. Finally, in order to prove our result we had to solve
a number of nontrivial problems concerning analytic continuations and some
others of numerical nature. In fact, our results and techniques are motivated
by the methodology introduced in an influential paper by Flajolet and
Sedgewick.