At the first level, we set the hash space size to be exactly the number of keys to be stored, which is n. We sample a hash function h^(1) : K → {0, 1, …, n−1} from a universal hash family H^(1) to distribute the keys into n buckets B_0, …, B_{n−1} according to their hash values. That is, for each bucket j,

B_j := { k_i : h^(1)(k_i) = j }.
The total number of cells consumed by the construction is

M := ∑_{j=1}^{n} m_j = ∑_{j=1}^{n} n_j²,

where n_j := |B_j| is the number of keys in bucket j and m_j := n_j² is the size of bucket j's second-level table.
If M ≥ 4n, we resample the first-level hash function and repeat the construction process until M < 4n.
Finally, for each key k_i in bucket B_j, we store the key-value pair (k_i, v_i) in cell h^(2,j)(k_i) of the second-level table for bucket j.
At query time, given a query key k, we first compute j = h^(1)(k) to locate the corresponding first-level bucket, and then compute the second-level address h^(2,j)(k) inside bucket j. If the cell at that address stores the query key k, we return the value stored with it; otherwise, we report that k is not in the table.
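The two-level construction and query procedure can be sketched in Python. This is a minimal illustration under stated assumptions, not the notes' official implementation: the universal families H^(1) and H^(2,j) are simulated by salting Python's built-in `hash` with random seeds, and `build_fks` / `query` are hypothetical names.

```python
import random

def build_fks(pairs):
    """Two-level (FKS) perfect hash table for a static set of key-value pairs."""
    n = len(pairs)
    keys = [k for k, _ in pairs]
    val = dict(pairs)
    # First level: hash into n buckets; resample until M = sum n_j^2 < 4n.
    while True:
        s1 = random.randrange(2**61)  # seed simulating a draw from H^(1)
        buckets = [[] for _ in range(n)]
        for k in keys:
            buckets[hash((s1, k)) % n].append(k)
        if sum(len(b) ** 2 for b in buckets) < 4 * n:
            break
    # Second level: bucket j gets m_j = n_j^2 cells; resample its seed
    # until the bucket's keys land in distinct cells (a perfect hash).
    seeds, tables = [], []
    for b in buckets:
        m = len(b) ** 2
        while True:
            s2 = random.randrange(2**61)  # seed simulating a draw from H^(2,j)
            cells = [None] * m
            ok = True
            for k in b:
                c = hash((s2, k)) % m
                if cells[c] is not None:
                    ok = False  # collision inside the bucket: resample
                    break
                cells[c] = (k, val[k])
            if ok:
                seeds.append(s2)
                tables.append(cells)
                break
    return s1, seeds, tables

def query(table, k):
    """Return the value stored for k, or None if k is absent."""
    s1, seeds, tables = table
    j = hash((s1, k)) % len(tables)                  # first-level bucket
    cells = tables[j]
    if not cells:
        return None
    entry = cells[hash((seeds[j], k)) % len(cells)]  # second-level cell
    return entry[1] if entry is not None and entry[0] == k else None
```

By the standard analysis, both resampling loops terminate quickly in expectation: the first-level check M < 4n succeeds with probability at least 1/2 per attempt, and each bucket with n_j keys and n_j² cells avoids collisions with probability at least 1/2.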
Theorem 2.2. Let X be a random variable and f:R→R be a function. If Pr[f(X)≥0]=1, then for any a>0,
Pr[f(X) ≥ a] ≤ E[f(X)] / a.
Theorem 2.3. If Pr[X ≤ u] = 1 for some u ∈ ℝ, then for any a < u,

Pr[X ≤ a] ≤ (u − E[X]) / (u − a).
Theorem 2.5 (Chebyshev's inequality). For any random variable X and any ϵ>0,
Pr[|X − E[X]| ≥ ϵ] ≤ Var(X) / ϵ².
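As a quick sanity check (an illustrative example, not from the notes), take X to be a fair six-sided die: E[X] = 7/2 and Var(X) = 35/12, so Chebyshev with ϵ = 2 bounds Pr[|X − 7/2| ≥ 2] by 35/48 ≈ 0.73, while the exact probability is Pr[X ∈ {1, 6}] = 1/3:

```python
from fractions import Fraction

vals = [Fraction(i) for i in range(1, 7)]      # a fair six-sided die
mean = sum(vals) / 6                           # E[X] = 7/2
var = sum((v - mean) ** 2 for v in vals) / 6   # Var(X) = 35/12
eps = Fraction(2)
exact = Fraction(sum(1 for v in vals if abs(v - mean) >= eps), 6)
bound = var / eps ** 2
assert exact <= bound                          # 1/3 <= 35/48: Chebyshev holds
```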
Definition (Standard deviation). The standard deviation of a random variable X is defined as
σ(X) := √Var(X).
Theorem 2.6. For any random variable X and any α>0,
Pr[|X − E[X]| ≥ α·σ(X)] ≤ 1/α².
Theorem 2.7. For any random variable X and any constant c∈R,
Var(cX) = c²·Var(X).
Theorem 2.8. If X and Y are independent random variables, then
Var(X+Y)=Var(X)+Var(Y).
Theorem 2.9 (Chebyshev for sample averages). Let X̄ := (1/n) ∑_{i=1}^{n} X_i be the sample average of n i.i.d. random variables X_1, …, X_n, each with mean μ and variance σ². Then for every ϵ > 0,

Pr[|X̄ − μ| ≥ ϵ] ≤ σ² / (nϵ²).
Theorem 2.11 (Generic Chernoff bound (upper tail)). Let X be a random variable with mean μ and MGF M(t). Then for any a≥μ,
Pr[X ≥ a] ≤ inf_{t∈ℝ} { e^{−ta}·M(t) }.
Theorem 2.12 (Generic Chernoff bound (lower tail)). Let X be a random variable with mean μ and MGF M(t). Then for any a≤μ,
Pr[X ≤ a] ≤ inf_{t∈ℝ} { e^{−ta}·M(t) }.
Definition (Rate function). The rate function of a random variable X is defined as
I_X(a) := sup_{t∈ℝ} { ta − log M_X(t) },

with the convention that I_X(a) = +∞ if the supremum is unbounded.
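For a concrete case (a standard computation, not spelled out here in the notes): if X ~ Bernoulli(p), then M_X(t) = 1 − p + p·e^t, the supremum is attained at t* = log(a(1−p)/((1−a)p)) for a ∈ (0, 1), and I_X(a) works out to the KL divergence a·log(a/p) + (1−a)·log((1−a)/(1−p)). The sketch below checks this numerically:

```python
import math

def objective(t, a, p):
    # ta - log M_X(t) for X ~ Bernoulli(p)
    return t * a - math.log(1 - p + p * math.exp(t))

def bernoulli_rate(a, p):
    # closed form: KL divergence between Bernoulli(a) and Bernoulli(p)
    return a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))

a, p = 0.75, 0.5
t_star = math.log(a * (1 - p) / ((1 - a) * p))  # stationary point of the objective
assert abs(objective(t_star, a, p) - bernoulli_rate(a, p)) < 1e-12
# t_star really attains the sup: nearby t give strictly smaller values
assert objective(t_star - 0.1, a, p) < objective(t_star, a, p)
assert objective(t_star + 0.1, a, p) < objective(t_star, a, p)
```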
Theorem 2.13 (Generic Chernoff bound in terms of the rate function). Let X be a random variable with mean μ and rate function I(a). Then
∀a ≥ μ: Pr[X ≥ a] ≤ e^{−I(a)},
∀a ≤ μ: Pr[X ≤ a] ≤ e^{−I(a)}.
Theorem 2.14 (Properties of the rate function). Let X be a random variable with mean μ and rate function I(a). Let ℓ and u be the minimum and maximum values that X can take, and assume ℓ<u. Then
Lemma. Let X̄ := (1/n) ∑_{i=1}^{n} X_i be the sample average of n i.i.d. random variables X_1, …, X_n, each with rate function I(a). Then

I_{X̄}(a) = n·I(a).
Theorem 2.15 (Generic Chernoff bound for sample averages). Let X̄ := (1/n) ∑_{i=1}^{n} X_i be the sample average of n i.i.d. random variables X_1, …, X_n, each with mean μ and rate function I(a). Then

∀a ≥ μ: Pr[X̄ ≥ a] ≤ e^{−n·I(a)},
∀a ≤ μ: Pr[X̄ ≤ a] ≤ e^{−n·I(a)}.
**Theorem 2.16 (Cramér's theorem).** Let D be a probability distribution with mean μ and rate function I(a). Let X̄^{(n)} := (1/n) ∑_{i=1}^{n} X_i be the sample average of n i.i.d. random variables X_1, …, X_n drawn from D. Then as n → ∞,

for any fixed a ≥ μ: Pr[X̄^{(n)} ≥ a] = e^{−n(I(a)+o(1))},
for any fixed a ≤ μ: Pr[X̄^{(n)} ≤ a] = e^{−n(I(a)+o(1))}.
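The convergence in Cramér's theorem can be seen numerically (an illustrative check, not from the notes) for fair-coin flips, where the probability that the sample average is at least a is an exact binomial tail sum. The generic Chernoff bound for sample averages forces −(1/n)·log Pr ≥ I(a) for every n, and as n grows the ratio approaches I(a) from above:

```python
import math

def binom_tail(n, p, a):
    # Pr[sample average >= a] for n i.i.d. Bernoulli(p) variables, exactly
    k0 = math.ceil(a * n)
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k0, n + 1))

def rate(a, p):
    # rate function of Bernoulli(p): the KL divergence KL(a || p)
    return a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))

p, a = 0.5, 0.6
for n in (100, 1000):
    ratio = -math.log(binom_tail(n, p, a)) / n
    assert ratio >= rate(a, p)   # the non-asymptotic Chernoff bound
# the o(1) term is already small: within 0.01 of I(a) at n = 1000
assert -math.log(binom_tail(1000, p, a)) / 1000 < rate(a, p) + 0.01
```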
Theorem 2.20 (Hoeffding's Theorem). Let X be a random variable taking values in [ℓ,u] for some ℓ≤u. Then for all t∈R,
E[e^{t(X−E[X])}] ≤ e^{t²(u−ℓ)²/8}.
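The inequality can be checked numerically (an illustrative check, not from the notes) for a Bernoulli(p) variable, where [ℓ, u] = [0, 1] and the centered MGF E[e^{t(X−E[X])}] has a simple closed form:

```python
import math

def centered_mgf(t, p):
    # E[exp(t(X - E[X]))] for X ~ Bernoulli(p); here E[X] = p
    return (1 - p) * math.exp(-t * p) + p * math.exp(t * (1 - p))

# Hoeffding's bound exp(t^2 (u - l)^2 / 8) with (u - l) = 1,
# checked over a grid of t for several values of p
for p in (0.1, 0.5, 0.9):
    for t in [x / 10 for x in range(-50, 51)]:
        assert centered_mgf(t, p) <= math.exp(t * t / 8) + 1e-12
```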
Theorem 2.21. Let X be a random variable taking values in [ℓ,u] for some ℓ≤u. Then for all ϵ∈R,
I_X(E[X] + ϵ) ≥ 2ϵ² / (u−ℓ)².
Theorem 2.22 (Hoeffding's inequality, i.i.d. case). Let X̄ := (1/n) ∑_{i=1}^{n} X_i be the sample average of n i.i.d. random variables X_1, …, X_n taking values in [ℓ, u] for some ℓ ≤ u. Then for all ϵ ≥ 0,

Pr[X̄ ≥ E[X̄] + ϵ] ≤ e^{−2nϵ²/(u−ℓ)²},
Pr[X̄ ≤ E[X̄] − ϵ] ≤ e^{−2nϵ²/(u−ℓ)²}.
Theorem 2.23 (Hoeffding's inequality). Let X̄ := (1/n) ∑_{i=1}^{n} X_i be the sample average of n independent random variables X_1, …, X_n taking values in intervals [ℓ_1, u_1], …, [ℓ_n, u_n], respectively. Let s² be the mean squared interval length:

s² := (1/n) ∑_{i=1}^{n} (u_i − ℓ_i)².
Then for all ϵ≥0,
Pr[X̄ ≥ E[X̄] + ϵ] ≤ e^{−2nϵ²/s²},
Pr[X̄ ≤ E[X̄] − ϵ] ≤ e^{−2nϵ²/s²}.
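A standard use of this bound is sample-size planning (an illustrative computation; `hoeffding_n` is a hypothetical helper, not from the notes). Taking a union bound over the two one-sided tails gives Pr[|X̄ − E[X̄]| ≥ ϵ] ≤ 2e^{−2nϵ²/s²}, so driving the failure probability below a target δ requires n ≥ s²·log(2/δ)/(2ϵ²):

```python
import math

def hoeffding_n(eps, delta, s2=1.0):
    # smallest n making the two-sided Hoeffding bound 2*exp(-2*n*eps^2/s2) <= delta
    return math.ceil(s2 * math.log(2 / delta) / (2 * eps ** 2))

# e.g. estimating the mean of [0,1]-valued samples (so s^2 = 1) to within
# eps = 0.01, with failure probability at most 5%
n = hoeffding_n(0.01, 0.05)
assert 2 * math.exp(-2 * n * 0.01 ** 2) <= 0.05
```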
Theorem 2.24. Let X := ∑_{i=1}^{n} Δ_i be the sum of n independent random variables Δ_1, …, Δ_n taking values in intervals [ℓ̃_1, ũ_1], …, [ℓ̃_n, ũ_n], respectively. Let L² be the sum of the squared interval lengths:

L² := ∑_{i=1}^{n} (ũ_i − ℓ̃_i)².

Then for all ϵ ≥ 0,

Pr[X ≥ E[X] + ϵ] ≤ e^{−2ϵ²/L²},
Pr[X ≤ E[X] − ϵ] ≤ e^{−2ϵ²/L²}.
Definition (Martingale). Let Z_0, Z_1, …, Z_n be random variables. We say that (Z_i)_{i=0}^{n} is a martingale if for every i ≥ 1,

E[Z_i | Z_0, …, Z_{i−1}] = Z_{i−1}.
Theorem 2.25 (Azuma's lemma). Let (Z_i)_{i=0}^{n} be a martingale. Assume that for each i ∈ [n], the increment Δ_i := Z_i − Z_{i−1} always lies in an interval [ℓ̃_i, ũ_i] for some ℓ̃_i, ũ_i ∈ ℝ. Let L² be the sum of the squared interval lengths:

L² := ∑_{i=1}^{n} (ũ_i − ℓ̃_i)².
Then for every t∈R,
E[e^{t(Z_n−Z_0)}] ≤ e^{t²L²/8}.
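For example (a quick numerical check, not from the notes), a simple ±1 random walk with fair coin steps is a martingale with every increment in [−1, 1], so L² = 4n. Here E[e^{t(Z_n−Z_0)}] = cosh(t)^n exactly, and the lemma's bound e^{t²·4n/8} holds because cosh(t) ≤ e^{t²/2}:

```python
import math

n = 10
for t in [x / 10 for x in range(-30, 31)]:
    lhs = math.cosh(t) ** n             # E[exp(t(Z_n - Z_0))] for fair +-1 steps
    rhs = math.exp(t * t * 4 * n / 8)   # exp(t^2 L^2 / 8) with L^2 = 4n
    assert lhs <= rhs + 1e-9
```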
**Theorem 2.26 (Azuma–Hoeffding inequality).** Let (Z_i)_{i=0}^{n} be a martingale. Assume that for each i ∈ [n], the increment Δ_i := Z_i − Z_{i−1} always lies in an interval [ℓ̃_i, ũ_i] for some ℓ̃_i, ũ_i ∈ ℝ. Let L² be the sum of the squared interval lengths: