Problems Notes on Codeforces String Problems

My goal is to solve all the problems related to strings/hashing/suffix structure starting from rating 2600 in Codeforces. I will keep summarizing problem ideas here.

If a problems falls into a specific large category, I will summarize it in those independent blogs:
Palindrome Tree
Suffix Automaton
Hashing

Shortest/Longest Path Cut Vertex Checking1214D650D, MCPC19B
Sometimes, if we want to check if a vertex is neccesary for a shortest path/longest path, one way to do is hashing as mentioned in the hashing blog. Another much easier way is to group the vertices by distance. The vertex is the cut vertex if and only if it is the only one in that “distance level”.

Non-Palindrome Cutting: Timus 2057
Consider that the only “impossible” cases are like $aaaaa$, $aaaabaaaa$, or $abababa$ (not really sure how to prove, though).
Otherwise the minimum answer is either $1$ or $2$. Check it brute-forcely.
For the maximum answer, we have $dp[i] = max\{dp[j] + 1\}$ for all $j$ s.t. $s[j + 1, i]$ is possible to split. Then we do the dp transition by excluding the three impossible cases.

Reference:
1. URAL Palindromic Contest (in Chinese)

Learning Notes on Generating Functions

Generating Function is a way to represent a sequence of number using coefficients of a polynomial.

The two most commonly used generating functions is:

1. Ordinary generating function (OGF)
$$G(a_n; x) = \sum_{n = 0}^{\infty}a_n x^n $$

2. Exponential generating function (EGF)
$$EG(a_n; x) = \sum_{n = 0}^{\infty}a_n \frac{ x^n }{n!} $$

The first one is used in combination, while the second is used in permutations.

For the second one, suppose we have two EGFs $f$ and $g$, then $$f(x) g(x) = \sum_{n=0}^{\infty} (\sum_{i=0}^{n} {{n}\choose{i}} f_i g_{n-i}) \frac{x^n}{n!}$$

Related Problems:

1. Representing $(x_1 + x_2 + .. + x_n)^k$: Codechef PSUM, HackerRank A Summation Problem
First of all, we can solve the problem in a knapsack fashion for the No. of the number item and the sum of items we are considering. Then we can keep a EGF $f$ s.t. the $i$-th term is the sum of all possible $(a_1 + a_2 + … )^i$. Then by multiplying $g(i) = \sum_{i = 0}^{k} \frac{{ v_i }^k}{n!}$, we put the $i$-th item in and permute with the terms in the sum.

2. Generating Functions and Knapsack Problem: Asia Shenyang 2018 M
In this problem we try to formulate knapsack problem to a generating function: $$\prod_{i = l}^{r} \frac{1 – x^{(a + 1) b}}{1 – x^b}$$. Also we have $$\frac{1}{1 – x^k} = 1 + x^k + x^{2k} + x^{3k} + …$$. Then we can precompute the prefix product and inverse product. To answer a query, we need to calculate the coefficients with the terms of powers $\leq c$ in the product.
Note:
1. From the generating function, we can also see undoing complete knapsack (where each item can be used many times) is the doing 0/1 knapsack (where each item can be used at most once) with a opposite coefficient. Just like the “add” and “del” function in this submission.
2. In each query, directly computing the product will be too costly. A neat trick is to make each coefficient the prefix sum of the correpsonding position for one of the polynomial beforehand, and then just take the coefficient of $x^c$ in the product. Then each query can be answer in $O(C)$.

3. Exponential Generating Functions: ZOJ4069
First key observation is that if $m < n$ (we also need to handle other cases very carefully), it must be composed by a $n – m$ chains. For one chain, the number ways it can be compsed by $n$ vertices is $\frac{n!}{2}$ if $n \geq 2$ and $1$ if $n = 1$. Then after dividing $n!$ for each term, the EGF is in a nice form $EG = \frac{1}{2}(2x+ x^2 + x^3 + x^4…) = x {(1 + \frac{1}{1 – x})}$. Suppose the coefficient of $x^n$ in $EG^m$ is $k$. The answer will be $\frac{k \cdot n!}{(n – m)!}$ because we didn’t specify the order of the $n – m$ chains. We can brute-forcedly calculate the answer since the form is not complicated.

Reference:
1. Generating Function – Wikipedia

Learning Notes on Multiplicative Functions, Dirichlet Convolution, Mobius Inversion, etc

Common Functions
  • Unit Function: $\epsilon(n)=[n=1]$
  • Constant Function: $1(n)=1$
  • Euler Function: $\varphi(n)=\sum_{i=1}^n [\gcd(i,n)=1]$
  • Mobius Function:$ \mu(n)=\left\{
    \begin{aligned}
    1 & \text{, if $n=1$} \\
    0 & \text{, if $n$ has some square}\\
    (-1)^k & \text{, ($k$ is the number of prime divisors of $n$) otherwise}
    \end{aligned}
    \right.
    $
Dirichlet Convolution

$$ (f*g)(n)=\sum_{d\mid n}f(d)g(\frac{n}{d}) $$

Common Convolutions:
$$ \varepsilon=\mu * 1 \iff \varepsilon(n)=\sum_{d\mid n}\mu(d)$$ $$ d=1 * 1 \iff d(n)=\sum_{d\mid n}1 $$ $$ \sigma=d * 1 \iff \sigma (n)=\sum_{d\mid n}d$$ $$\varphi=\mu * \text{ID} \iff \varphi(n)=\sum_{d\mid n}d\cdot\mu(\frac{n}{d})$$


Mobius Inversion

The first form of Mobius Inversion:
$\displaystyle F(n)=\sum_{d|n}f(d)\implies f(n)=\sum_{d|n}\bigg(\mu(d)\cdot F\Big(\frac{n}{d}\Big)\bigg)$

Proof of the first form:
$\displaystyle \sum_{d|n}\bigg(\mu(d)\cdot F\Big(\frac{n}{d}\Big)\bigg)$ $=\sum_{d|n}\bigg(\mu(d)\cdot \sum_{k|\frac{n}{d}}f(k)\bigg)=\sum_{k|n}\bigg(f(k)\cdot \sum_{d|\frac{n}{k}}\mu(d)\bigg)=f(n)$

Second form:
$\displaystyle F(n)=\sum_{n|d}f(d)\implies f(n)=\sum_{n|d}\bigg(\mu\Big(\frac{d}{n}\Big)\cdot F(d)\bigg)$

Related Problems

Euler’s Functions/ Mobius Inversion: CF645F
Let $F[n]$($f[n]$) be the number of $k$-tuples the g.c.d. of whose numbers is a multiple of k (exactly k ), respectively. First we have $F[n] = {{cnt[n]}\choose{k}}$, where $cnt[n]$ is the number of species that has number divisible by $n$.
By Mobius Inversion, we have $f(d) = \sum_{d|n} \mu(\frac{n}{d}) F(n)$, and the answer for the problem equals $\sum_{d = 1}^{10^6} d \cdot f(d)$
$$= \sum_{d = 1}^{10^6} d \cdot \sum_{d|n} \mu(\frac{n}{d}) F(n) $$
We can change the summation to sum over $n$ instead, and then we have $$= \sum_{n = 1}^{10^6} \sum_{d|n} d \cdot \mu(\frac{n}{d}) F(n)$$ $$= \sum_{n = 1}^{10^6} F(n) \sum_{d|n} d \cdot \mu(\frac{n}{d}) $$ $$= \sum_{n = 1}^{10^6} F(n) \varphi(n)$$.
Therefore, each update can only affect $O(maxdivs)$ values of $F$, which is good enough for this problem.

Changing representation of the sequence: 1097F
In this problem, we can store the frequency of the multiples instead of the number itself. This is essentially Mobius Inversion, which is also a linear transformation, we can use bitwise xor for addition, and bitwise and for Cartisian product.

LCM: SPOJ LCMSUM
Key takeaways:
1. The formula $lcm(a, b) = \frac{a \cdot b}{gcd(a, b)}$ can turn the problem to some gcd-related stuff where we can apply Mobius Inversion.
2. TODO

No. of Divisors: CF235E
Let $d(x)$ be the number of divisors of $x$.
Then $ d(mn)=\sum\limits_{i|m}\sum\limits_{j|n}[(i,j)=1] $
Also, $d(r \cdot s \cdot t)=\sum\limits_{i|r}\sum\limits_{j|st}[(i,j)=1]=\sum\limits_{i|r}\sum\limits_{j|s}\sum\limits_{k|t}[(i,jk)=1][(j,k)=1]$ $=\sum\limits_{i|r}\sum\limits_{j|s}\sum\limits_{k|t}[(i,j)=1][(i,k)=1][(j,k)=1]$
Refer to this blog for more details.

Centroid Decomposition: CF809E
By enumerating the GCD, we need to calculate $$\sum_{i = 1}^{n} \frac{f_i \cdot i}{\varphi(i)}$$ where $$f_i = \sum_i \sum_j [(a_i, b_j) = g] \varphi (a_i) \varphi (b_j) d(i)$$. Instead of calculating $f_i$, we can calculate $g_d = \sum_{d | a_i} \sum_{d | b_j} \varphi (a_i) \varphi (b_j) d(i) $ by using centroid decomposition.

Total number of divisors of all the numbers is $O(n \log n)$ and each of them is used $O(\log n)$ because of centroid decomposition. So the total complexity is $O(n \log^2 n)$

Reference

HDU Multi-University Training 2019, Contest 4/5

Brute Force?/ Case Analysis: Contest 4, Problem M
First we can brute-forcely check factorize $n$ with the primes under $x = \sqrt[^5]{n}$. Then if $n$ is still larger than 1, other primes left to check is greater than $x$, which means there can only be at most 4 such primes. Analyzing the cases, we can see the answer will be 1 unless $n = p^2, p^3 \text{ or } p^4$ for some $p$.

XOR Operation/ Greedy?: Contest 5, Problem B
For each $a[i]$, define the favorite number as the best $b$ that $a[i]$ can choose (i.e. $a[i] \oplus b$ is smallest). For each $b[i]$ we define it similarly.
Without loss of generality, we assume there is no duplicate number in $A$ nor $B$. (The solution can be easily extended to the general case.)

Lemma 1: If for some number $a[i]$ and $b[j]$, they are each other’s favorite number, then $a[i] \oplus b[j]$ must be in the optimal solution. (If not, we can repair them together and the answer becomes better.)

Then we see this problem from a graph perspective: See each number as a vertex in a directed graph, and if $b$ is $a$’s favorite number, there is an directed edge from $a$ to $b$.

Lemma 2: There will be not cycle of length $>$ 2.
Proof (by contradiction): Suppose there is some cycle $x_1, x_2, x_3, …, x_n$. Then we have $x_1 \oplus x_2 < x_2 \oplus x_3 …. < x_n \oplus x_1 < x_1 \oplus x_2$. Contradiction.

Therefore, the idea to solve this problem is to effectively finding the cycle of length 2, putting the pair into the answer, and solve the problem again.

HDU Multi-University Training 2019, Contest 3

Segment tree for optimizing DP: Problem B
First we can binary search on the answer, and then we consider a $O(n^2)$ DP where $dp[i]$ means the maximum number of partitions we can have ending at $i$th position. Then transition is as follows $$dp[i] = max_{j} (dp[j]) \text{ s.t. } sum[i] – sum[j] \leq val$$. Since the constraints is in a continuous range of prefix sum, we can keep them in a segment tree to do fast maximum queries.

Wilson’s Theorem: Problem F
For any prime $p$, we have $$(p – 1)! = -1 \text{ (mod } p)$$

Learning notes on Fast Bitwise Transform

FWT (Fast Walsh-Hadamard Transform) is a good tool to do fast bitwise convolution. For example, in xor operation, given sequence $A$ and $B$, we can calculate $C$ in $O(n \log n)$ such that $C[i] = \sum_{j \oplus k = i} A[j] \cdot B[k]$.

There are two main properties for the $FWT$ functions for different bitwise operations, where are pretty similar to FFT. The first one is that $FWT(C) = FWT(A) * FWT(B)$, where “$*$” is element-wise multiplication. The second is that there is also $UFWT$ which let us convert the sequence back.

Especially, for “or” operation, $FWT(A)[i] = \sum_{i | j = i} A[j]$, which essentially means the sum over all the subsets for $i$; for “and” operation, $FWT(A)[i] = \sum_{i \land j = i} A[j]$, i.e., sum over supersets.

Related Problems

Frequency Convolution: SRM518 Hard, HDU 5909, HDU 6596
This is the basic application for this transform. Sometimes we can also use fast exponentiation to speed up the same convolution.

Accelerating Brute Force: CF662C
We can see for each way of brute force as a convolution between the frequency and the minimum one we can have for certain number. Thus we can do a convolution for once and get the answer for all.

Making the problem easier: CF800D
If we are asked to solve a problem for some values $x$ and solving for the supersets of $x$ is easier, then we can solve the superset version and transform it back to the exact value version.