Learning notes on 2-SAT

Algorithm to solve 2-SAT

First decompose the implication graph into SCCs, and then if $x_i$ and $\neg x_i$ is in the same SCC, then the answer is impossible. Else it suffices to choose the variable that appears later in the topological order to be true.

Problems and Techniques

Enforcing one variable to be true: link
To do that, we can just add an edge from $\neg x_i$ to $x_i$

Enforcing at most one variable among several to be true
This can be done in $O(n)$. Order the nodes arbitrarily. Then we can build prefix nodes for each prefix part of the nodes and establish implication between those prefix nodes and the variables themselves.

Notes on Codeforces Problems: Rating 2400 (2)

Reduction to LIS: 76F
First, we need to understand the LIS problem with more depth. LIS problem, more generally, finds the longest sequence $X = [x_1, x_2,… , x_n]$ where $p(x_i) < p(x_{i+1})$ and $q(x_i) < q(x_{i+1})$ for $1 \leq i \leq n – 1$ for some function $p$ and $q$.
Then we return to the original problem to see how we can reduce it to LIS. After doing transformation to the formula, we see we can reach from $i$ to $j$ if and only if $|x_i – x_j| \leq (t_j – t_i) * V$. Then let $p_i = – x_i + t_i * V$ and $q_i = x_i + t_i * V$, then we can use LIS to solve this problem.

AC Automaton/ DP: 86C
AC Automaton is a good choice for maintaining “where we are when matching several strings”, but we also need to record how many characters are not matched yet when we are doing DP.
Another Note: To check whether a node is the end of a string, we need to check along the fail pointer until we reach the root.

Bottleneck Edge of Paths /Minimum Spanning Tree: 632F
The inequality in the problem looks like triangular inequality, which means we can somehow extend it into paths between two vertices, i.e. for any path between vertex $i$ and $j$, the maximum edge is large than the direct edge between them.
Then the way to solve this is to construct the MST and try to check each direct edge.

Euler’s Formula on Planar Graph: 933C
The solution of this problem utilizes Euler’s Formula, which relates the number of edges, vertices, and regions in a connected planar graph. $v-e+f = 2$, where $v$ is number of vertices, $e$ is number of edges, and $f$ is number of regions.

Reduction to MST / DP: 1120D
The first solution is to reduce this problem to MST. The idea of substituting between different tree nodes is similar to the way we make exchange argument in MST.
The second solution uses DP and the observation that we only need to care the cases where only one or zero vertice is left uncovered.

Almost Tree / Diameter: 835F
The graph is a almost tree, which contains a cycle and a bunch of trees attached to each node of the cycle. Ideally we want to enumerate the edge to delete. To do this, we can pick an arbitrary edge on the cycle, and do case analysis on whether it will be deleted. If yes, we can just run the diameter finding algorithm; If no, we can do some prefix/suffix sum around the cycle.

FFT / Segment Tree / Combinatorics: 1218E
1. While a good way to maintain single point/range update is to use a segment tree, it is still surprising to me that we can keep a frequency/sum list in each node and do convolution when merging nodes.
2. There is a step where we need to calculate array $s$ such that $$s_i = \sum_{j = 0}^{i} {{N – j}\choose{i – j}} \cdot t_j \cdot d^{i-j}$$. We can also use convolution to calculate this by spliting terms to those only related with $j$, $i – j$, $i$ respectively.
3. If we want to merge several intervals with complexity $O((n + m) \log (n + m))$. We can do this by pairing the smallest intervals together each time. This will yield a complexity of $L \log L$, where $L$ is the sum of all the intervals.

Segment Maintaining Max Prefix Sum: 1220F
We can keep the max prefix sums in a segment in two ways. One is using single point update and use a slightly more complicated merge function. Another is using lazy propagation directly on the prefix sum.

Divide-and-Conquer to Fit in Memory: 101E
If there is sufficient memory, we can do a straight-forward $O(n^2)$ DP. However, due to the tight memory constraint, in order to recover the answer, we will need to recursively find the middle point in the optimal answer using meet-in-the-middle tenique and divide the problems into the half, which yields a linear space complexity. In this way, the time complexity will be $O((n + m) ^ 2)$, which is still OK.

Reduction to Query in Segment Tree: 117D
One can see, when trying to caculate the sum, it is exactly like the process of querying sum in a segment tree.

HDU Multi-University Training 2019, Contest 4/5

Brute Force?/ Case Analysis: Contest 4, Problem M
First we can brute-forcely check factorize $n$ with the primes under $x = \sqrt[^5]{n}$. Then if $n$ is still larger than 1, other primes left to check is greater than $x$, which means there can only be at most 4 such primes. Analyzing the cases, we can see the answer will be 1 unless $n = p^2, p^3 \text{ or } p^4$ for some $p$.

XOR Operation/ Greedy?: Contest 5, Problem B
For each $a[i]$, define the favorite number as the best $b$ that $a[i]$ can choose (i.e. $a[i] \oplus b$ is smallest). For each $b[i]$ we define it similarly.
Without loss of generality, we assume there is no duplicate number in $A$ nor $B$. (The solution can be easily extended to the general case.)

Lemma 1: If for some number $a[i]$ and $b[j]$, they are each other’s favorite number, then $a[i] \oplus b[j]$ must be in the optimal solution. (If not, we can repair them together and the answer becomes better.)

Then we see this problem from a graph perspective: See each number as a vertex in a directed graph, and if $b$ is $a$’s favorite number, there is an directed edge from $a$ to $b$.

Lemma 2: There will be not cycle of length $>$ 2.
Proof (by contradiction): Suppose there is some cycle $x_1, x_2, x_3, …, x_n$. Then we have $x_1 \oplus x_2 < x_2 \oplus x_3 …. < x_n \oplus x_1 < x_1 \oplus x_2$. Contradiction.

Therefore, the idea to solve this problem is to effectively finding the cycle of length 2, putting the pair into the answer, and solve the problem again.

HDU Multi-University Training 2019, Contest 3

Segment tree for optimizing DP: Problem B
First we can binary search on the answer, and then we consider a $O(n^2)$ DP where $dp[i]$ means the maximum number of partitions we can have ending at $i$th position. Then transition is as follows $$dp[i] = max_{j} (dp[j]) \text{ s.t. } sum[i] – sum[j] \leq val$$. Since the constraints is in a continuous range of prefix sum, we can keep them in a segment tree to do fast maximum queries.

Wilson’s Theorem: Problem F
For any prime $p$, we have $$(p – 1)! = -1 \text{ (mod } p)$$

HDU Multi-University Training 2019, Contest 2

Longest Increasing Subsequence Revisited: Problem B
An alternative way to solve LIS: Using a segment or BIT to keep the maximum length ending with number $i$. To choose the lexicographically smallest/largest, we can greedy pick the first/last available number.

Contribution Technique?: Problem F
The key thing to notice is the product of any two unit cube. No matter how we split the cubes, they always stays the same. That means the answer is always the same, which equals to the sum of products of all pairs of unit cube. We can calculate that with the help of FWT.

Tricky Build-up of min-cut graph: Problem H
Consider having two items $A, B$ in a min-cut graph and the source $S$ and the sink $T$. Consider 5 edges between those vertices: $SA, SB, AB, AT, BT$. We can set up the cost of those edges properly so that cutting in some edges corresponds the profit that we are not able to get by seperating them. There are 5 variables and 4 equations, so we can have some solutions. In the setting of the problem, we can always get a non-negative solution. Then we get add the graphs of two vertices up and get the final graph.

Palindrome Tree: Problem I
Using DFS/Binary Lifting on the fail tree, we can know if there is a suffix palindrome of certain length. Then can add them to the answer.

Learning notes on Palindrome Tree

Basic things this structure can do
  1. the number of distinct palindrome substring in a string (which turns out to be the number of extra states, which is linear)
  2. the number of times each palindrome appears
  3. The number of palindromes (or sum of the corresponding properties) that ends in a certain location.
  1. The size of the tree is $O(n)$. This is a result of the fact that at most one new distinct palindrome can be added when we add a new character. (Otherwise, choose the longest of them, $s_{longest}$, and the shorter one will definitely appear if we flip it from the middle point of $s_{longest}$.)
  2. Each fail pointer is pointing to a smaller index. So when we want to do tree DP, we can just use the index as the topological order.
Other tricks:

Adding characters from both sides: HDU 5421
To do such thing, we can keep a extra pointer at the begining of the string. Both pointer will be the longest palindrome prefix/suffix. The operations will be symmetric, and, additionally, when the new distinct palindrome we get is the new whole string, we need to update both pointer to keep the property of the pointers.

Series Link

To be continued…

Learning notes on Fast Bitwise Transform

FWT (Fast Walsh-Hadamard Transform) is a good tool to do fast bitwise convolution. For example, in xor operation, given sequence $A$ and $B$, we can calculate $C$ in $O(n \log n)$ such that $C[i] = \sum_{j \oplus k = i} A[j] \cdot B[k]$.

There are two main properties for the $FWT$ functions for different bitwise operations, where are pretty similar to FFT. The first one is that $FWT(C) = FWT(A) * FWT(B)$, where “$*$” is element-wise multiplication. The second is that there is also $UFWT$ which let us convert the sequence back.

Especially, for “or” operation, $FWT(A)[i] = \sum_{i | j = i} A[j]$, which essentially means the sum over all the subsets for $i$; for “and” operation, $FWT(A)[i] = \sum_{i \land j = i} A[j]$, i.e., sum over supersets.

Related Problems

Frequency Convolution: SRM518 Hard, HDU 5909, HDU 6596
This is the basic application for this transform. Sometimes we can also use fast exponentiation to speed up the same convolution.

Accelerating Brute Force: CF662C
We can see for each way of brute force as a convolution between the frequency and the minimum one we can have for certain number. Thus we can do a convolution for once and get the answer for all.

Making the problem easier: CF800D
If we are asked to solve a problem for some values $x$ and solving for the supersets of $x$ is easier, then we can solve the superset version and transform it back to the exact value version.

HDU Multi-University Training 2019, Contest 1

Dynamic Programming: Problem A
Since there can only be four numbers, we can do a four-dimensional DP to record the last occurrences of the numbers. The time complexity is $O(n^4)$. One dimension can be used to record the current number, so we can reduce it to $O(n^3)$ space to fit into the space.

XOR Basis: Problem B
This can be solved by segment tree in $O(n \log n \log A)$, but this is too slow. Instead, we can keep a upper-triangular basis for each right bound. We should put the number with higher index as high as possible and put other numbers down. In this way, when we are querying the maximum for range $[l, r]$, we take the number in the “$r$ basis” when it is in the bound ($l_i \geq l$), and it can be guaranteed that every number we take is valid, i.e. can be expressed as a linear combination of the number in $[l_i, l]$.

Sliding Windows for String Matching on SAM: Problem F
There are several things to notice: The first is that the range for possible transition for copying text is increasing, which means we can use a deque to maintain the minimum. Furthermore, since we can prove the DP value is non-decreasing, we can just choose the smallest position to do the copy. To find that position $j$, we maintain a SAM for $s[1;j-1]$ and keeping adding $s[j]$ into the SAM until we can find $s[j;i]$ in $s[1;j-1]$. We use a pointer $pt$ to record the current matching. To delete a character, we just keep moving $pt$ until $curlen \leq st[pt].len$.

Generating Function/ Convolution: Problem L
Write the operation as generating function, we have $$\sum_i b_i x^i = (\sum_i a_i x^i) (\sum_i x^{ik})$$, which tells us that the order of operations does not matter. We also see that the operation of the same kind can be combined together by using the fact that the coefficient of $x^t$ in $(\sum_i x^{i}) ^ l$ is ${t + l – 1}\choose{l – 1}$. Then we can use three NTTs to get the answer.

Prüfer Code, Cayley’s Formula, etc

Prufer Code is a way to represent a labeled tree as an array, which is related some tree counting problems

The Prüfer code is a way of encoding a labeled tree with $n$ vertices using a sequence of $n-2$ integers in the interval $[0, n−1]$

Conversion and Bijection between Tree and Prüfer code

We can construct Prüfer Code from a given tree by removing the smallest leaf and add the number of the only vertex adjacent to it to the list until there are only two vertices.

We can also constructing a tree from a given Prüfer Code by the following procedure.
for i from 1 to n – 2:
    Let the smallest leaf not in the current sequence and not used be u
    Let the first element in the sequence be v;
    Mark v as used;
    Connect u and v;
    Delete the first element in the sequence;
Connect the last two unused vertices;

From the construction, we know each Prüfer Code has a tree, and each tree has a Prüfer Code. Let $f: \text{Tree} \rightarrow \text{Prüfer Code}$ be the function we use to convert a tree to a Prüfer Code. We can further prove $f^{-1}(f(t)) = t$ for any tree $t$ by induction, which implies that there is a bijection between labeled trees with $n$ vertices and Prüfer Code with length $n – 2$. The direct result of the bijection is Cayley’s Formula as shown below.

Cayley’s Formula and Generalized Version

Theorem 1 (Cayley’s Formula): For any integer $n > 0$, the number of trees of $n$ labeled vertices is $n^{n-2}$.

Theorem 2 (Generalized Cayley’s Formula): Let $T_{n, k}$ be the number of labelled forests on $n$ vertices with $k$ connected components, such that vertices $1, 2, …, k$ all belong to different connected components. Then $T_{n, k} = k n^{n-k-1}$.

Property of Prüfer Code

1. The vertex with the highest number will not be in the Code
2. Each vertex $i$ appears $d_i – 1$ times in the Code, where $d_i$ is the degree of vertex $i$.

Lemma 3: Given the degree of each vertex in a tree of size $n$, $d_1, d_2, …, d_n$, the number of possible trees is $$\frac{(n – 2)!}{(d_1 – 1)! (d_2 – 1)! … (d_n – 1)!}$$

Related Problems

Prüfer-Code-like Bijection: CF156D
Problem in short: Given a graph, find the number of ways to connect the graph with minimum edges.
Idea: We can construct sequence $P$ and $Q$ that represent the final graph similar to Prüfer Code. Let $k$ be the current number of connected component. $P$ denotes the vertex $v$ that we connect to when we remove the smallest labeled leaf component $C$ one by one. $Q$ denotes the vertex in $C$ that we use to connect $v$. The bijection between the sequences can be proved in a similar manner. Since for each sequence $P$, we will have $\prod_i sz_i$ different $Q$s, where $sz_i$ is the size of the $i$th component, the answer will be $$n^{k-2} \cdot \prod_i sz_i $$.

Generalized Cayley’s Formula: CF1109D
Trivial Application

About Suffix Automaton

Some Properties
  • Each state in the SAM corresponds an equivalent class of substrings of $s$ which appear in the same set of positions.
  • For some state $v$, $len(v) – minlen(v) + 1$ is the number of distinct substring in the corresponding class.
About String Matching

To check if $t$ occurs in $s$, we can keep matching $t$ on the SAM of $s$, go to the next state if the current character is matched, or get through suffix link otherwise.

If we want to know how much a string $t$ can be matched for each state in the SAM of $s$, we can first keep matching it and record whenever the character is matched, and then we can propagate the value through the links.

About Paths in the DAG

Fact: Every distinct path from the root is a distinct substring.

Following from this fact, we can use DP to calculate lots of things, such as number of distinct substrings, sum of distinct integers.
Example: sum of dictinct integers, Morse Code

About Suffix-Link Tree

The suffix links forms a tree in which you can propagate information upwards, like the first/last ocurring positions, number of characters matched in a suffix, etc.
Example: propagation of available indices

Related Problems

Sliding Windows for String Matching: HDU 6583
Analysis in the blog for the original contest.