Analysis of Lithium's algorithm
I’ve recently been working on automated testcase reduction tools and thus I had the chance to study Jesse Ruderman’s Lithium tool, itself inspired by the ddmin algorithm. The ddmin paper contains good ideas, like for example the fact that the reduction could be improved if we rely on the testcase structure, like XML nodes or grammar tokens, instead of just characters or lines. However, the authors of the ddmin paper do not really analyse the complexity of the algorithm precisely, except for the best and worst cases, and there is a large gap between the two. Jesse’s analysis is much better and in particular introduces the concepts of monotonic testcase and clustered reduction, where the algorithm performs the best and which intuitively seem to be the usual testcases that we meet in practice. However, the monotonic+clustered case complexity is only “guessed” and the bound of $M \log_2 N$ verifications for a monotonic testcase (of size $N$, with a final reduction of size $M$) is not optimal. For example, if the final reduction is relatively small compared to $N$, say $M = N/\log_2 N$, then $M \log_2 N = N$ and we cannot say that the number of verifications is small compared to $N$. In particular, Jesse cannot deduce from his bound that Lithium’s algorithm is better than an approach based on $M$ binary search executions! In this blog post, I shall give the optimal bound for the monotonic case and formalize that, in some sense, the clustered reduction is near the best case. I’ll also compare Lithium’s algorithm with the binary search approach and with the ddmin algorithm. I shall explain that Lithium is the best in the monotonic case (or actually matches ddmin in that case).
Thus suppose that we are given a large testcase exhibiting an unwanted behavior. We want to find a smaller testcase exhibiting the same behavior, and one way is to isolate subtestcases that cannot be reduced any further. A testcase can be quite general, so here are basic definitions to formalize the problem a bit:
• A testcase $T$ is a nonempty finite set of elements (lines, characters, tree nodes, user actions) exhibiting an “interesting” behavior (crash, hang and other bugs…).
• A reduction $T'$ of $T$ is a testcase $T' \subseteq T$ with the same “interesting” behavior as $T$.
• A testcase $T$ is minimal if $\forall T' \subsetneq T$, $T'$ is not a reduction of $T$.
Note that, by definition, $T$ is a reduction of itself and $\emptyset$ is not a reduction of $T$. Also, the relation “is a reduction of” is transitive, that is, a reduction of a reduction of $T$ is a reduction of $T$.
We assume that verifying whether one subset has the “interesting” behavior is what takes the most time (think e.g. of testing a hang or a sequence of user actions), so we want to minimize the number of subsets verified. Moreover, the original testcase is large, say of size $N$, and so a fast reduction algorithm should use a number of verifications that is small compared to $N$, ideally $o(N)$. Of course, we also expect to find a small reduction, that is one of size $o(N)$.
Without information on the structure of a given testcase $T$ or on the properties of the reduction, we must consider all the $2^N$ subsets $T' \subseteq T$ to find a minimal reduction. And we only know how to do that in $O(2^N)$ operations (or $O(2^{N/2})$ with Grover’s algorithm ;-)). Similarly, even determining whether $T$ is minimal may require testing the $2^N - 2$ proper nonempty subsets of $T$ (e.g. when $T$ is actually minimal), which is not feasible either. Hence we consider the following definitions:
• For any integer $n \geq 1$, $T$ is $n$-minimal if $\forall E \subseteq T$ with $1 \leq |E| \leq n$, $T \setminus E$ is not a reduction of $T$.
• In particular, $T$ is $1$-minimal if $\forall x \in T$, $T \setminus \{x\}$ is not a reduction of $T$.
• $T$ is monotonic if $\forall T'' \subseteq T' \subseteq T$, whenever $T''$ is a reduction of $T$ then $T'$ is also a reduction of $T$.
Finding an $n$-minimal reduction will give a testcase that is no longer interesting if we remove any portion of size at most $n$. Clearly, $T$ is minimal if it is $n$-minimal for all $n \geq 1$. Moreover, a minimal testcase is always $n$-minimal for any $n \geq 1$. We still need to test exponentially many subsets to find an $n$-minimal reduction in general. To decide whether $T$ is $n$-minimal, we need to consider the subsets obtained by removing portions of size at most $n$, that is $\sum_{j=1}^{n} \binom{N}{j}$ subsets. In particular, deciding whether a testcase is $1$-minimal only requires as many verifications as the testcase has elements, and so $o(N)$ verifications if it is of size $o(N)$. If $T$ is monotonic then so is any reduction of $T$. Moreover, if $T'$ is a reduction of $T$ and $x \in T \setminus T'$, then $T' \subseteq T \setminus \{x\} \subseteq T$ and so $T \setminus \{x\}$ is a reduction of $T$ by monotonicity. Hence, when $T$ is monotonic, $T$ is $1$-minimal if and only if it is minimal. We will target $1$-minimal reductions in what follows.
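To fix ideas, here is a small Python sketch of what deciding $n$-minimality costs by brute force. The `interesting` predicate is a hypothetical stand-in for one verification of the “interesting” behavior; it is not part of any existing tool.

```python
from itertools import combinations

def is_n_minimal(testcase, n, interesting):
    """Brute-force check that removing any portion of size 1..n from
    'testcase' never keeps the "interesting" behavior.  This performs at
    most sum(binomial(N, j) for j in 1..n) calls to interesting()."""
    elements = list(testcase)
    for j in range(1, n + 1):
        for removed in combinations(range(len(elements)), j):
            kept = [x for i, x in enumerate(elements) if i not in set(removed)]
            if kept and interesting(kept):
                return False   # a strictly smaller reduction exists
    return True

# 1-minimality (n = 1) only needs as many verifications as there are elements.
```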
Let’s consider Lithium’s algorithm. We assume that $T$ is ordered and so can be identified with the interval $\{1, 2, \ldots, N\}$ (think for example of line numbers). For simplicity, let’s first assume that the size of the original testcase is a power of two, that is $N = 2^n$. Lithium starts by $n-1$ steps $k = 1, 2, \ldots, n-1$. At step $k$, we consider the chunks among the intervals of size $2^{n-k}$ that are still present in the current testcase. Lithium verifies whether removing each chunk provides a reduction. If so, it permanently removes that chunk and tries another chunk. Because $\emptyset$ is not a reduction of $T$, we immediately increment $k$ if only one chunk remains. The $n$-th step is the same, with chunks of size 1, but we stop only when we are sure that the current testcase is $1$-minimal, that is when, after as many unsuccessful attempts as the current testcase has elements, we have not reduced any further. If $N$ is not a power of 2 then $2^{n-1} < N < 2^n$ where $n = \lceil \log_2 N \rceil$. In that case, we apply the same algorithm as for a testcase of size $2^n$ (i.e. as if there were $2^n - N$ dummy elements at the end) except that we do not need to test the chunks that are entirely contained in that additional part. This saves at most $n$ subtests (those that would be obtained by removing the dummy chunks at the end, whose sizes are given by the binary decomposition of $2^n - N$). Hence in general, if $V$ is the number of subsets of $T$ tried by Lithium, $V$ differs by at most $O(\log_2 N)$ from the count obtained in the power-of-two case, so we can safely reason on powers of two. Let $M$ be the size of the $1$-minimal testcase found by Lithium and $m = \lceil \log_2 M \rceil$.
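Here is a rough Python sketch of the procedure just described (a simplification for illustration, not Lithium’s actual code), assuming a hypothetical `interesting(subset)` predicate that performs one verification per call and is true for the original testcase:

```python
def lithium_reduce(testcase, interesting):
    """Chunk-based reduction as described above (simplified sketch)."""
    current = list(testcase)          # assumed to be interesting
    chunk_size = 1
    while chunk_size * 2 < len(current):   # initial chunk size: 2^(n-1)
        chunk_size *= 2

    while chunk_size >= 1:
        while True:
            removed_something = False
            i = 0
            while i < len(current):
                if i == 0 and len(current) <= chunk_size:
                    break              # one chunk left: removing it would give the empty set
                candidate = current[:i] + current[i + chunk_size:]
                if interesting(candidate):
                    current = candidate        # permanently remove this chunk
                    removed_something = True
                else:
                    i += chunk_size            # keep this chunk, try the next one
            # Chunks of size >= 2: one pass per step.  Chunks of size 1:
            # repeat until a full pass removes nothing, which gives 1-minimality.
            if chunk_size > 1 or not removed_something:
                break
        chunk_size //= 2
    return current
```

For example, with the hypothetical monotonic predicate `lambda t: {3, 7, 40} <= set(t)` (the behavior only needs the elements 3, 7 and 40), `lithium_reduce(list(range(100)), ...)` returns `[3, 7, 40]`.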
Lithium will always perform the $n$ initial steps above and check at least one subset at each step. At the end, it needs at least $M$ additional verifications to be sure that the testcase is $1$-minimal. So $V = \Omega(M + \log_2 N)$. Now, consider the case where $T$ is monotonic and has exactly one minimal reduction $T_{\min}$, made of the $M$ first elements of $T$. Then $T_{\min}$ is included in the first chunk at every step $k \leq n - m$. Because $T$ is monotonic, this means that at step 1 we do two verifications and the second chunk is removed because it does not contain any element of $T_{\min}$ (and the third one too if $N$ is not a power of two); at step 2 it remains two chunks, we do two verifications and the second chunk is removed; and so on until step $n - m$. For $k > n - m$, the number of chunks can grow again: 2, 4, 8… that is we handle at most $2 + 4 + \cdots + 2^m \leq 4M$ chunks in total from step $n - m + 1$ to step $n$. At step $n$, a first round of at most $2M$ verifications ensures that the testcase is of size $M$ and a second round of $M$ verifications ensures that it is $1$-minimal. So $V = O(M + \log_2(N/M)) \subseteq O(M + \log_2 N)$. Hence the lower bound is optimal. The previous example suggests the following generalization: a testcase is $C$-clustered if it can be written as the union of $C$ nonempty closed intervals, $I_1 \cup I_2 \cup \cdots \cup I_C$. If the minimal testcase $T_{\min}$ found by Lithium is $C$-clustered, each $I_j$ is of length at most $M$ and so intersects at most 2 chunks of length $2^{n-k} \geq M$, that is chunks from the steps $k \leq n - m$. So $T_{\min}$ intersects at most $2C$ chunks at step $n - m$ and a fortiori at most $2C$ chunks at each of the steps $k \leq n - m$. Suppose that $T$ is monotonic. Then if $c$ is a chunk that does not contain any element of $T_{\min}$, the subset $T \setminus c \supseteq T_{\min}$ is a reduction of $T$ and so Lithium will remove the chunk $c$. Hence at each step $k \leq n - m$, at most $2C$ chunks survive and so there are at most $4C$ chunks at the next step. A computation similar to what we have done above shows that $V = O(C(M + \log_2 N))$ if the final testcase found by Lithium is $C$-clustered. Note that we always have $1 \leq C \leq M \leq N$. So if $C$ is bounded then $V = O(M + \log_2 N)$ is small as wanted. Also, the final testcase is always $M$-clustered (a union of $M$ intervals that are singletons!) so we found that the monotonic case is $O(M(M + \log_2 N)) = O(M^2 + M \log_2 N)$. We shall give a better bound below.
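Roughly, the count behind the $C$-clustered bound just mentioned is the following (at most $4C$ chunks per step while the chunks are at least as long as $M$, then at most the surviving $O(CM)$ elements afterwards):

$$
V \;=\; O\!\Big(\underbrace{C \log_2 \tfrac{N}{M}}_{\text{steps } k \le n-m} \;+\; \underbrace{C M}_{\text{steps } k > n-m} \;+\; \underbrace{M}_{\text{final 1-minimality round}}\Big) \;=\; O\big(C\,(M + \log_2 N)\big).
$$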
Now, at each step $k$, Lithium splits the current testcase into at most $2^k$ chunks and tries to remove each of them, so the steps $k \leq n - 1$ contribute at most $2^n \leq 2N$ verifications in total. At the last step, the current testcase has size at most $N$: Lithium does at most $N$ verifications before stopping or removing one element (so the testcase becomes of size at most $N - 1$), then it does at most $N - 1$ verifications before stopping or removing one more element (so the testcase becomes of size at most $N - 2$), …, then it does at most $M + 1$ verifications before stopping or removing one more element (so the testcase becomes of size at most $M$). Then the testcase is exactly of size $M$ and Lithium does at most $M$ additional verifications. This gives $V = O(N^2)$ verifications. This bound is asymptotically optimal under our assumption that $M = o(N)$: one can construct testcases whose proper reductions are arranged so that nothing can be removed during the first phase and so that, at the last step, each successful removal is only found after browsing most of the remaining elements; the last step then costs about $N + (N-1) + \cdots + M$ verifications, so $V = \Omega(N^2)$.
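Writing out this count gives:

$$
V \;\le\; \underbrace{2N}_{\text{steps } k < n} \;+\; \underbrace{N + (N-1) + \cdots + (M+1)}_{\text{last step}} \;+\; \underbrace{M}_{\text{final round}}
\;=\; \frac{N^2 - M^2}{2} + O(N) \;=\; O(N^2).
$$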
We now come back to the case where $T$ is monotonic. We will prove that the worst case is $V = O(M \log_2(N/M))$, and so our assumption $M = o(N)$ gives $V = o(N)$ as we expected. During the steps $k \leq m$, we test at most $2^k \leq 2M$ chunks per step. When $k > m$, there may be more than $2M$ chunks, but at most $M$ distinct chunks contain an element from the final reduction. By monotonicity, at most $M$ chunks survive the step and there are at most $2M$ chunks at step $k + 1$. Again, only $M$ chunks will survive at step $k + 1$, and so on until step $n$. At the final step, it remains at most $2M$ elements. Again by monotonicity, a first round of at most $2M$ tests will make only $M$ elements survive and we finally need $M$ additional tests to ensure that the testcase is $1$-minimal. Hence $V = O(M \log_2(2N/M))$. This bound is optimal: consider for example a monotonic testcase whose only minimal reduction is made of $M$ elements evenly spaced at distance $N/M$ (say with $M$ and $N/M$ powers of two; the construction can be adapted otherwise). Then for each step $k \leq m$, every chunk contains one of these elements and no chunk is removed. Then, for the steps $m < k \leq n$, there are always exactly $2M$ chunks to handle, of which $M$ get removed. So $V = \Omega(M \log_2(N/M))$.
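Summing these contributions (and ignoring rounding) gives:

$$
V \;\le\; \underbrace{\sum_{k=1}^{m} 2^{k}}_{\text{steps } k \le m} \;+\; \underbrace{\sum_{k=m+1}^{n} 2M}_{\text{steps } k > m} \;+\; \underbrace{3M}_{\text{final rounds}}
\;\le\; 4M + 2M\,(n - m) + 3M \;=\; O\!\Big(M \log_2 \tfrac{2N}{M}\Big).
$$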
We note that we have used two different methods to bound the number of verifications: one for the general monotonic case and one for the case where the final testcase is $C$-clustered. One naturally wonders what happens when we combine the two techniques. So let $c = \lceil \log_2 C \rceil$. From step $1$ to $c$, the best bound we found was $2^k$ chunks per step; from step $c$ to $n - m$, it was at most $4C$ chunks per step; from step $n - m$ to $n - c$, it was again a doubling bound, at most $4C \cdot 2^{k - (n - m)}$ chunks at step $k$; from step $n - c$ to $n$, it was at most $2M$ chunks per step; and finally the last step, including the final verifications, costs $O(M)$. Taking the sum, we get $V = O(C + C(n - m - c) + M + Mc + M)$. Because $n - m - c \leq \log_2 \frac{N}{CM} + O(1)$ and $c \leq \log_2(2C)$, this becomes $V = O(C \log_2 \frac{N}{CM} + M \log_2(2C))$. If $C = 1$, then we get $O(M + \log_2 N)$. At the opposite, if $C = M$, we get $O(M \log_2 \frac{2N}{M})$, that is the bound of the general monotonic case. If $C$ is not $1$ but remains bounded, then $C \log_2 \frac{N}{CM} = O(\log_2 N)$ and $M \log_2(2C) = O(M)$, and so the expression can again be simplified to $O(M + \log_2 N)$. Hence we have obtained an intermediate result between the worst and the best monotonic cases and shown the role played by the number of clusters: the more clustered the final testcase (the smaller $C$), the faster Lithium finds it. The results are summarized in the following table:
| | Number of tests |
|---|---|
| Best case | $\Theta(M + \log_2 N)$ |
| $T$ is monotonic ; $T_{\min}$ is $1$-clustered | $\Theta(M + \log_2 N)$ |
| $T$ is monotonic ; $T_{\min}$ is $C$-clustered ($C$ and $M$ unbounded) | $O\big(C \log_2 \frac{N}{CM} + M \log_2 C\big)$ |
| $T$ is monotonic | $O\big(M \log_2 \frac{N}{M}\big)$ |
| Worst case | $O(N^2)$ |
In the ddmin algorithm, at each step we add a preliminary round where we try to immediately reduce the testcase to a single chunk (or equivalently to remove the complement of a chunk). Actually, the ddmin algorithm only does this preliminary round at steps where there are more than 2 chunks, for otherwise it would do the same work twice. For each step $k$, if a single chunk $c$ is a reduction of $T$ then $c \subseteq c'$ for some chunk $c'$ of the previous step $k - 1$. Now if $T$ is monotonic then, at level $k - 1$, removing all the chunks but $c'$ gives a subset that contains $c$ and so a reduction of $T$ by monotonicity. Hence only the chunk $c'$ survives at level $k - 1$, there are exactly 2 chunks at level $k$, the preliminary round is skipped, and so the ddmin algorithm is exactly Lithium’s algorithm when $T$ is monotonic. The ddmin algorithm also keeps in memory the subsets that were not found interesting in order to avoid testing them again. However, if we only reduce to the complement of a chunk, we can never test the same subset twice and so this additional bookkeeping is useless. That is the case when $T$ is monotonic.
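As an illustration, here is a hypothetical Python sketch of one such step, with the preliminary “reduce to a single chunk” round in front of the usual “remove one chunk” round (a simplified reading of the description above, not the actual ddmin implementation); `chunks` is the list of chunks of the current testcase and `interesting` is the same hypothetical predicate as before:

```python
def ddmin_style_step(chunks, interesting):
    """One step on a fixed chunk decomposition: first try to keep a single
    chunk, then fall back to removing chunks one by one."""
    flatten = lambda cs: [x for c in cs for x in c]
    # Preliminary round ("reduce to subset"), only with more than 2 chunks.
    if len(chunks) > 2:
        for chunk in chunks:
            if interesting(chunk):
                return [chunk]          # the whole testcase reduces to one chunk
    # Lithium-style round ("reduce to complement"): try removing each chunk.
    kept = list(chunks)
    i = 0
    while i < len(kept):
        if len(kept) == 1:
            break                       # removing the last chunk would give the empty set
        candidate = kept[:i] + kept[i + 1:]
        if interesting(flatten(candidate)):
            kept = candidate            # permanently remove this chunk
        else:
            i += 1
    return kept
```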
Finally, if $T$ is monotonic, Jesse proposes a simpler approach based on a binary search. Suppose first that there is only one minimal testcase $T_{\min}$ and let $x_1$ be its least element. If $j \geq x_1$ then the prefix $\{1, \ldots, j\}$ intersects $T_{\min}$ and so the suffix $\{j+1, \ldots, N\}$ does not contain $T_{\min}$. Then $\{j+1, \ldots, N\}$ is not a reduction of $T$, for otherwise a minimal reduction of $\{j+1, \ldots, N\}$ would be a minimal reduction of $T$ distinct from $T_{\min}$, which we exclude by hypothesis. If instead $j < x_1$ then $\{1, \ldots, j\}$ does not intersect $T_{\min}$ and $\{j+1, \ldots, N\} \supseteq T_{\min}$ is a reduction of $T$ because $T$ is monotonic. So we can use a binary search to find $x_1$ by testing at most $\log_2 N$ testcases (modulo some constant). Then we try with the sets $\{x_1\} \cup \{j+1, \ldots, N\}$ to find the second least element of $T_{\min}$ in at most $\log_2 N$ verifications. We continue until we find the $M$-th and last element of $T_{\min}$. Clearly, this gives $O(M \log_2 N)$ verifications, which sounds equivalent to Jesse’s bound, with even a better constant factor. Note that the algorithm still works if we remove the assumption that there is only one minimal testcase $T_{\min}$. We start by searching, again by binary search, for the largest $x_1$ such that $\{x_1, \ldots, N\}$ is a reduction: if $j \leq x_1$ then $\{j, \ldots, N\}$ contains a minimal reduction and so is a reduction because $T$ is monotonic; if $j > x_1$ then $\{j, \ldots, N\}$ is not a reduction of $T$, and a minimal reduction of $\{x_1, \ldots, N\}$ cannot have a least element greater than $x_1$, for otherwise a larger suffix would be a reduction of $T$, contradicting the choice of $x_1$. So $\{x_1, \ldots, N\}$ is a reduction of $T$ containing a minimal reduction whose least element is exactly $x_1$. The algorithm continues with the sets $\{x_1\} \cup \{j, \ldots, N\}$ to find the next element, and so on, and finally returns a minimal reduction of $T$. However, it is not clear that this approach can work if $T$ is not monotonic, while we can hope that Lithium remains efficient if $T$ is “almost” monotonic. We remark that when the only minimal testcase is made of the $M$ first elements of $T$, the binary search approach requires something like $M \log_2 N$ verifications. So that would be the worst case of the binary search approach, whereas Lithium handles this case very nicely in $\Theta(M + \log_2 N)$! In general, if there is only one minimal testcase of size $M$, its elements can be located anywhere between $1$ and $N$ and, if $T_{\min}$ is placed at random, with probability at least $1/2$ a constant fraction of its elements lie in the first half of $T$. So the average complexity of the binary search approach in that case will be at least $\Omega(M \log_2 N)$, which is still not as good as Lithium’s optimal worst case of $O(M \log_2(N/M))$…
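Here is a hypothetical Python sketch of that binary search approach for a monotonic testcase identified with `range(n)`, with the same hypothetical `interesting(subset)` predicate (a sketch of the idea, not code from Lithium or ddmin):

```python
def binary_search_reduce(n, interesting):
    """Build a reduction element by element; each element is located with a
    binary search over the remaining suffix (about log2(n) verifications).
    Assumes range(n) itself is interesting and the behavior is monotonic."""
    found = []                 # elements of the reduction found so far, in order
    start = 0
    while not (found and interesting(found)):
        # Monotonic case: found + [j..n) is a reduction exactly when j is at
        # most the next element to keep, so binary search for the largest such j.
        lo, hi = start, n - 1
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if interesting(found + list(range(mid, n))):
                lo = mid        # still a reduction: the next element is at least mid
            else:
                hi = mid - 1    # not a reduction anymore: the next element is before mid
        found.append(lo)
        start = lo + 1
    return found
```

With the same hypothetical predicate as before (`lambda t: {3, 7, 40} <= set(t)`) and `n = 100`, this returns `[3, 7, 40]` after roughly $3 \log_2 100$ verifications, whereas a reduction made of the $M$ first elements would cost about $M \log_2 N$ verifications, as discussed above.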