Efficient compression of congruence class automations

Efficient compression of congruence class automations

As already discussed in the previous post, any radix-b number nn with nmodmEZn \bmod{m} \in E \subseteq Z can be accepted by finite automata in a digit-by-digit manner. However, the construction is not always optimal. The amount of states required is not always the amount of congruence classes. In this post we will examine when exactly the finite automata can be simplified by combining states. Reducing the amount of states will also help produce a much simpler regular expression, for example, with the Kleene construction.

General way to optimize an automation

Of course, we can optimize the DFA with the well-known algorithm based on the Myhill-Nerode theorem. Unfortunately, even an optimized implementation is not efficient enough to do the optimization for large mm and bb, because the complexity of the algorithm expressed in these terms is O(bm2)O(b \cdot m^2), as the algorithm needs to iterate over all m2m^2 pairs of states and consider all the outgoing transitions (bb in each state). With this tool we can run the algorithm and get the following results for b=10b = 10 and 1m201 \le m \le 20:

m= 1 |Z|= 1  Z = [[0]]
m= 2 |Z|= 2  Z = [[0], [1]]
m= 3 |Z|= 3  Z = [[0], [1], [2]]
m= 4 |Z|= 3  Z = [[0], [3, 1], [2]]
m= 5 |Z|= 2  Z = [[0], [2, 1, 3, 4]]
m= 6 |Z|= 4  Z = [[0], [4, 1], [5, 2], [3]]
m= 7 |Z|= 7  Z = [[0], [1], [2], [3], [4], [5], [6]]
m= 8 |Z|= 5  Z = [[0], [5, 1], [6, 2], [7, 3], [4]]
m= 9 |Z|= 9  Z = [[0], [1], [2], [3], [4], [5], [6], [7], [8]]
m=10 |Z|= 2  Z = [[0], [5, 4, 3, 2, 7, 6, 9, 8, 1]]
m=11 |Z|=11  Z = [[0], [1], [2], [3], [4], [5], [6], [7], [8], [9], [10]]
m=12 |Z|= 7  Z = [[0], [7, 1], [8, 2], [9, 3], [10, 4], [11, 5], [6]]
m=13 |Z|=13  Z = [[0], [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]]
m=14 |Z|= 8  Z = [[0], [8, 1], [9, 2], [10, 3], [11, 4], [12, 5], [13, 6], [7]]
m=15 |Z|= 4  Z = [[0], [13, 10, 7, 4, 1], [5, 2, 8, 11, 14], [9, 6, 12, 3]]
m=16 |Z|= 9  Z = [[0], [9, 1], [10, 2], [11, 3], [12, 4], [13, 5], [14, 6], [15, 7], [8]]
m=17 |Z|=17  Z = [[0], [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16]]
m=18 |Z|=10  Z = [[0], [10, 1], [11, 2], [12, 3], [13, 4], [14, 5], [15, 6], [16, 7], [17, 8], [9]]
m=19 |Z|=19  Z = [[0], [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18]]
m=20 |Z|= 3  Z = [[0], [13, 11, 15, 17, 5, 3, 7, 9, 19, 1], [12, 10, 14, 16, 4, 2, 6, 8, 18]]

Here, ZZ is the set of states in the new, minimal DFA. ZZ is a set of equivalence classes that represent which states can be combined into one state. These equivalence classes have been computed for E:={0}E := \{0\}. That is why, for example, for m=8m = 8 states 0 and 4 aren’t in one equivalence class. They can’t be equivalent, because state 4 is intermediate while state 0 is final.

Computing equivalence classes ahead-of-time

As you can see, there is clearly a pattern in the equivalence states above. For example, we can see that no states are equivalent when mm is prime. By further examining the data above we can even conclude that the DFA probably can’t be simplified if gcd(b,m)=1\gcd(b, m) = 1 i.e. if bb and mm are coprime.

After formally examining when the states can be combined I came up with the following result:

Theorem

If mbm \le b and EE \neq \varnothing, then the minimum finite automation can be constructed as follows: Z=AZ{AE,A\E}\{}Z = \overset{\cdot}{\bigcup_{A\in Z'}} \{A \cap E, A \backslash E\}\backslash\{\varnothing\}

where ZZ' is the set of equivalence classes of the following equivalence relation: xyxymodmgcd(b,m)x \sim y \Leftrightarrow x \equiv y \mod{\frac{m}{\gcd(b, m)}}

Proof

By definition of δ\delta, we know that: δ(r,d)=k    δ(r,d+1modb)=k+1modm\delta(r, d) = k \implies \delta(r, d + 1 \bmod b) = k + 1 \bmod m

Because mbm \le b, it follows that the mapping δ(x,Σ)\delta(x, \Sigma) is surjective. In other words, mbm \le b implies that every state has at least one transition to any other state. Any change to this mapping will shift it. As EE \neq \varnothing, any shifted mapping δ\delta' will lead to at least 1 digit dd such that δ(x,d)E\delta(x, d) \in E and δ(x,d)E\delta'(x, d) \notin E.

Thus, 2 states x,yZx,y \in Z are equivalent if and only if they have equivalent mappings which is equivalent to: δ(x,Σ)=δ(y,Σ)δ(x,d)=δ(y,d)dΣxb+dmodm=yb+dmodmxb+dyb+dmodmxbybmodmxbyb0modm(xy)b0modmxy0modmgcd(b,m)xymodmgcd(b,m)\begin{aligned} \delta(x, \Sigma) = \delta(y, \Sigma) &\Leftrightarrow \delta(x, d) = \delta(y, d) \quad \forall d \in \Sigma \\ &\Leftrightarrow xb + d \bmod m = yb + d \bmod m \\ &\Leftrightarrow xb + d \equiv yb + d \mod{m} \\ &\Leftrightarrow xb \equiv yb \mod{m} \\ &\Leftrightarrow xb - yb \equiv 0 \mod{m} \\ &\Leftrightarrow (x-y) \cdot b \equiv 0 \mod{m} \\ &\Leftrightarrow x-y \equiv 0 \mod{\frac{m}{\gcd(b, m)}} \\ &\Leftrightarrow x \equiv y \mod{\frac{m}{\gcd(b, m)}} \end{aligned}

First corollary

It follows, that if mbm \le b, EE \neq \varnothing and gcd(b,m)=1\gcd(b, m) = 1, then no states can be combined and the minimal automation has mm states.

Second corollary

If mbm \le b and EE \neq \varnothing, then the amount of states ZZ' (before splitting each equivalence class in final states and non final ones as shown in the theorem) is equal to the following set of orbits: Z=(Z/m)/m/gcd(b,m)Z' = (\mathbb{Z}/m)/\langle \overline{m / \gcd(b, m)} \rangle

Example implementation

This Java method uses the results above to compute the equivalence classes efficiently:

public static List<List<Integer>> computeEquivalenceClasses(int b, int m, HashSet<Integer> finalStates) {
    assert m <= b;
    assert !finalStates.isEmpty();

    // Union-Find datastructure, preferrably with union-by-rank and path compression
    UnionFind equivalenceClasses = new UnionFind(m);

    int bmgcd = Euklidian.gcd(b, m);

    if (bmgcd == 1) {
        return equivalenceClasses.getDisjointSets();
    }

    int optimizedM = m / bmgcd;

    // loop through every set in the new set of equivalence classes
    for (int anchor = 0; anchor < optimizedM; anchor++) {

        int current = anchor; // [anchor] is the equivalence class
        // oppositeAnchor is the element that is final if anchor is intermediate
        // and intermediate if anchor is final
        // if oppositeAnchor == anchor, then it is considered as not yet found
        int oppositeAnchor = anchor;

        boolean anchorFinal = finalStates.contains(anchor);

        for (;;) {
            current = (current + optimizedM) % m;

            if (current == anchor) {
                break;
            }

            boolean currentFinal = finalStates.contains(current);

            if (currentFinal == anchorFinal) {
                equivalenceClasses.union(current, anchor);
            }
            else if (oppositeAnchor == anchor) {
                // "initialize" oppositeAnchor
                oppositeAnchor = current;
            }
            else {
                equivalenceClasses.union(current, oppositeAnchor);
            }
        }

    }

    return equivalenceClasses.getDisjointSets();
}

If we use a union-find datastructure with union in O(1)O(1) (e.g. union-by-rank with path compression), then the complexity of the algorithm is O(m)O(m) which is much better than O(bm2)O(b \cdot m^2).

Case when m > b

In case m>bm > b the equivalence relation depends very much on the set of final states EE. With the Myhill Nerode theorem we get: xy(xcLycLcΣ)(xbc+cmodmEybc+cmodmE)\begin{aligned} x \sim y &\Leftrightarrow (xc \in L \Leftrightarrow yc \in L \quad \forall c \in \Sigma^*) \\ &\Leftrightarrow (x \cdot b^{|c|} + c \bmod m \in E \Leftrightarrow y \cdot b^{|c|} + c \bmod m \in E) \end{aligned}

Conclusion

With the above theorem it is easy to implement a much faster algorithm that computes equivalent states. It also gives an intuition, why for example it is harder to test whether a decimal number is divisible by 9 than to test whether a decimal number is divisible by 5, if we measure “hardness” by the amount of states in the minimal DFA. It is the case because gcd(5,10)=5\gcd(5, 10) = 5 and gcd(9,10)=1\gcd(9, 10) = 1.


Copyright © 2019 — 2023 Alexander Mayorov. All rights reserved.