Upsolve Contest Tracker, Apr – Jun, 2020.

April 25, 2020 (1st Div1 Contest Done!)
Codeforces Round #635 (Div. 1): C, D, E, F (Done)
E: linear algebra, FWT
This problem is about relating bitwise-xor FWT operation $B[x]=\sum_y (-1)^{\langle x,y\rangle}A[y]$ with linear orthorgnal complement when $A$ is a linear subspace. It turns out that when the the rank of $A$ is large, the rank of the linear orthorgaal complement will be small, and the number of non-zero entries in $B = FWT(A)$ will be small.

April 29, 2020 (2nd Div1 Contest Done!)
Codeforces Round #637 (Div. 1) – Thanks, Ivan Belonogov!: D, F (Done)
F: bracket sequence, segment tree, hash
Consider a segment tree to merge bracket information. We can know that a bracket in a node should be of form “…}]){{({…”, we can merge two nodes if they can match the adjacent parts (we can check by hash).
The key property of this merging action is that the prefix of the left node (i.e. the “…}])” part) and the suffix of the right node (i.e. the ” {{({… ” part) will never get shorter when merged. Then we have storing in each node the longest valid prefix and suffix even if the sequence in it is not valid. By doing this, we can do the query in the same way as update since we can query the prefix and suffix of certain length as in update.
Code: 78524960

April 30, 2020
Educational Codeforces Round 86 (Rated for Div. 2): E, F (Done)
F: bitmask, brute force
DP usually had its value as another hidden dimension, so when our DP is too large, always think about putting one dimension to the value.

Jun 3, 2020
Codeforces Round #646 (Div. 2) : F (Done)

CF Problems: 2020 Jan. ~ Feb.

Matroid Intersection: 1284G
The problem asks us to find a spanning tree where certian nodes cannot be leaves. The key thing to think about is “What does it mean to be non-leaves?”, and the idea is that the total degree of a node is at least 2. Therefore, we can try to find an acyclic edge set where the degree of each special node is exactly 2. This can be done using intersection of a graphic matroid and a colorful matroid (We restrict each color to have at most two instances, and check whether it is exactly 2 in the final maximum independent set).

CF Problems: 2020 Mar. ~ Apr.

Problems are sorted from most recent to least recent

FFT, String Matching: 1334G
When we want to matching two string with more flexible criterion, we can try to come up with some formula $f$ for comparing two character where $f(u, v) = 0$ only when $u = v$. In this case, $(s – t)^2 (s’ – t)^2$ can be a good choice. After expanding the equation, we only need 3 FFTs to get the answer.

Bitmask, SOS DP, Brute Force: 1326F2

There are several techniques that we could use in this kind of problems: DP over subset/superset. Usually we can solve the “superset” version of a problem easier than itself. Like in this case, if we only care about the string location that must be 1, we only need to care about each component.

Then we can realize the split of the component (i.e. the partition of the sequence) can be at most as $P(18) = 385$ (recall NAC20J, which uses P(30) = 5040), and each split, no matter how ordered, has the same number ways to achieve. Then we can solve for each configuration and merge them together.

To merge them, we can use subset transformation (subset transform is for “OR”, superset transform is for “AND”) to make it easier to unite different component in a partition. When we get the answer, we can transform the result back from “superset” version.

One side note on the easier subtask is that it used the fact that $\sum_{k = 0}^{n} {{n}\choose{k}} 2^k = 3^k$ by binomial theorem. so we can do bitmask DP using two dimensional DP (one on the submask of permutation and another on the submask of string), which is $O(3^k k /2)$.

Problems Notes on Codeforces String Problems

My goal is to solve all the problems related to strings/hashing/suffix structure starting from rating 2600 in Codeforces. I will keep summarizing problem ideas here.

If a problems falls into a specific large category, I will summarize it in those independent blogs:
Palindrome Tree
Suffix Automaton
Hashing

Shortest/Longest Path Cut Vertex Checking1214D650D, MCPC19B
Sometimes, if we want to check if a vertex is neccesary for a shortest path/longest path, one way to do is hashing as mentioned in the hashing blog. Another much easier way is to group the vertices by distance. The vertex is the cut vertex if and only if it is the only one in that “distance level”.

Non-Palindrome Cutting: Timus 2057
Consider that the only “impossible” cases are like $aaaaa$, $aaaabaaaa$, or $abababa$ (not really sure how to prove, though).
Otherwise the minimum answer is either $1$ or $2$. Check it brute-forcely.
For the maximum answer, we have $dp[i] = max\{dp[j] + 1\}$ for all $j$ s.t. $s[j + 1, i]$ is possible to split. Then we do the dp transition by excluding the three impossible cases.

Reference:
1. URAL Palindromic Contest (in Chinese)

Learning Notes on Hashing

I wouldn’t think of summarizing problems on hashing if I didn’t include “hashing” in the tag for my training plan and see those nice problems recently… While hashing can be used in string checking (so-called “polynomial hashing”), it can be extended to do things like set identity checking.

Problems like this is one of the direct application: We can give each number a random number. Then if two set has the same xor sum, the content of the set will likely be the same.

Shortest/Longest Path Cut Vertex Checking: 1214D, 650D
Sometimes, if we want to check if a vertex is neccesary for a shortest path/longest path, we can use hashing to calculate the number of ways and see if $f(s, v) * f(v, t) = f(s, t)$, where $f(u, v)$ is number of ways to go from $u$ to $v$.
P.S. There are simpler ways as talked here.

Permutation Checking: 213E
We want to maintain a data structure which characterizes a permutation, and supports addition and deletion of an element. How to characterize a permutation? Well, we can put the question like this: ” What uniquely determines the permutation? ” One way that can be implemented relatively easily is inversion sequence, where is $i$th term is the number of numbers before the number $i$. We can keep this structure efficiently using a Fenwick Tree.

Symmetry Checking: 452F
In this problem, we want to check if a certain group of number is symmetric. To solve this, we can store two values for a group, giving powers to numbers in reverse order. Say the frequency list is $[1, 1, 1, 0, 0]$, we can give them two string $”11100″$ and $”00111″$. To check whether they are symmetric around, for example, the second element, we can shift them (in hash) and then check whether they are the same.

Learning Notes on Generating Functions

Generating Function is a way to represent a sequence of number using coefficients of a polynomial.

The two most commonly used generating functions is:

1. Ordinary generating function (OGF)
$$G(a_n; x) = \sum_{n = 0}^{\infty}a_n x^n $$

2. Exponential generating function (EGF)
$$EG(a_n; x) = \sum_{n = 0}^{\infty}a_n \frac{ x^n }{n!} $$

The first one is used in combination, while the second is used in permutations.

For the second one, suppose we have two EGFs $f$ and $g$, then $$f(x) g(x) = \sum_{n=0}^{\infty} (\sum_{i=0}^{n} {{n}\choose{i}} f_i g_{n-i}) \frac{x^n}{n!}$$

Related Problems:

1. Representing $(x_1 + x_2 + .. + x_n)^k$: Codechef PSUM, HackerRank A Summation Problem
First of all, we can solve the problem in a knapsack fashion for the No. of the number item and the sum of items we are considering. Then we can keep a EGF $f$ s.t. the $i$-th term is the sum of all possible $(a_1 + a_2 + … )^i$. Then by multiplying $g(i) = \sum_{i = 0}^{k} \frac{{ v_i }^k}{n!}$, we put the $i$-th item in and permute with the terms in the sum.

2. Generating Functions and Knapsack Problem: Asia Shenyang 2018 M
In this problem we try to formulate knapsack problem to a generating function: $$\prod_{i = l}^{r} \frac{1 – x^{(a + 1) b}}{1 – x^b}$$. Also we have $$\frac{1}{1 – x^k} = 1 + x^k + x^{2k} + x^{3k} + …$$. Then we can precompute the prefix product and inverse product. To answer a query, we need to calculate the coefficients with the terms of powers $\leq c$ in the product.
Note:
1. From the generating function, we can also see undoing complete knapsack (where each item can be used many times) is the doing 0/1 knapsack (where each item can be used at most once) with a opposite coefficient. Just like the “add” and “del” function in this submission.
2. In each query, directly computing the product will be too costly. A neat trick is to make each coefficient the prefix sum of the correpsonding position for one of the polynomial beforehand, and then just take the coefficient of $x^c$ in the product. Then each query can be answer in $O(C)$.

3. Exponential Generating Functions: ZOJ4069
First key observation is that if $m < n$ (we also need to handle other cases very carefully), it must be composed by a $n – m$ chains. For one chain, the number ways it can be compsed by $n$ vertices is $\frac{n!}{2}$ if $n \geq 2$ and $1$ if $n = 1$. Then after dividing $n!$ for each term, the EGF is in a nice form $EG = \frac{1}{2}(2x+ x^2 + x^3 + x^4…) = x {(1 + \frac{1}{1 – x})}$. Suppose the coefficient of $x^n$ in $EG^m$ is $k$. The answer will be $\frac{k \cdot n!}{(n – m)!}$ because we didn’t specify the order of the $n – m$ chains. We can brute-forcedly calculate the answer since the form is not complicated.

Reference:
1. Generating Function – Wikipedia

Learning Notes on Multiplicative Functions, Dirichlet Convolution, Mobius Inversion, etc

Common Functions
  • Unit Function: $\epsilon(n)=[n=1]$
  • Constant Function: $1(n)=1$
  • Euler Function: $\varphi(n)=\sum_{i=1}^n [\gcd(i,n)=1]$
  • Mobius Function:$ \mu(n)=\left\{
    \begin{aligned}
    1 & \text{, if $n=1$} \\
    0 & \text{, if $n$ has some square}\\
    (-1)^k & \text{, ($k$ is the number of prime divisors of $n$) otherwise}
    \end{aligned}
    \right.
    $
Dirichlet Convolution

$$ (f*g)(n)=\sum_{d\mid n}f(d)g(\frac{n}{d}) $$

Common Convolutions:
$$ \varepsilon=\mu * 1 \iff \varepsilon(n)=\sum_{d\mid n}\mu(d)$$ $$ d=1 * 1 \iff d(n)=\sum_{d\mid n}1 $$ $$ \sigma=d * 1 \iff \sigma (n)=\sum_{d\mid n}d$$ $$\varphi=\mu * \text{ID} \iff \varphi(n)=\sum_{d\mid n}d\cdot\mu(\frac{n}{d})$$


Mobius Inversion

The first form of Mobius Inversion:
$\displaystyle F(n)=\sum_{d|n}f(d)\implies f(n)=\sum_{d|n}\bigg(\mu(d)\cdot F\Big(\frac{n}{d}\Big)\bigg)$

Proof of the first form:
$\displaystyle \sum_{d|n}\bigg(\mu(d)\cdot F\Big(\frac{n}{d}\Big)\bigg)$ $=\sum_{d|n}\bigg(\mu(d)\cdot \sum_{k|\frac{n}{d}}f(k)\bigg)=\sum_{k|n}\bigg(f(k)\cdot \sum_{d|\frac{n}{k}}\mu(d)\bigg)=f(n)$

Second form:
$\displaystyle F(n)=\sum_{n|d}f(d)\implies f(n)=\sum_{n|d}\bigg(\mu\Big(\frac{d}{n}\Big)\cdot F(d)\bigg)$

Related Problems

Euler’s Functions/ Mobius Inversion: CF645F
Let $F[n]$($f[n]$) be the number of $k$-tuples the g.c.d. of whose numbers is a multiple of k (exactly k ), respectively. First we have $F[n] = {{cnt[n]}\choose{k}}$, where $cnt[n]$ is the number of species that has number divisible by $n$.
By Mobius Inversion, we have $f(d) = \sum_{d|n} \mu(\frac{n}{d}) F(n)$, and the answer for the problem equals $\sum_{d = 1}^{10^6} d \cdot f(d)$
$$= \sum_{d = 1}^{10^6} d \cdot \sum_{d|n} \mu(\frac{n}{d}) F(n) $$
We can change the summation to sum over $n$ instead, and then we have $$= \sum_{n = 1}^{10^6} \sum_{d|n} d \cdot \mu(\frac{n}{d}) F(n)$$ $$= \sum_{n = 1}^{10^6} F(n) \sum_{d|n} d \cdot \mu(\frac{n}{d}) $$ $$= \sum_{n = 1}^{10^6} F(n) \varphi(n)$$.
Therefore, each update can only affect $O(maxdivs)$ values of $F$, which is good enough for this problem.

Changing representation of the sequence: 1097F
In this problem, we can store the frequency of the multiples instead of the number itself. This is essentially Mobius Inversion, which is also a linear transformation, we can use bitwise xor for addition, and bitwise and for Cartisian product.

LCM: SPOJ LCMSUM
Key takeaways:
1. The formula $lcm(a, b) = \frac{a \cdot b}{gcd(a, b)}$ can turn the problem to some gcd-related stuff where we can apply Mobius Inversion.
2. TODO

No. of Divisors: CF235E
Let $d(x)$ be the number of divisors of $x$.
Then $ d(mn)=\sum\limits_{i|m}\sum\limits_{j|n}[(i,j)=1] $
Also, $d(r \cdot s \cdot t)=\sum\limits_{i|r}\sum\limits_{j|st}[(i,j)=1]=\sum\limits_{i|r}\sum\limits_{j|s}\sum\limits_{k|t}[(i,jk)=1][(j,k)=1]$ $=\sum\limits_{i|r}\sum\limits_{j|s}\sum\limits_{k|t}[(i,j)=1][(i,k)=1][(j,k)=1]$
Refer to this blog for more details.

Centroid Decomposition: CF809E
By enumerating the GCD, we need to calculate $$\sum_{i = 1}^{n} \frac{f_i \cdot i}{\varphi(i)}$$ where $$f_i = \sum_i \sum_j [(a_i, b_j) = g] \varphi (a_i) \varphi (b_j) d(i)$$. Instead of calculating $f_i$, we can calculate $g_d = \sum_{d | a_i} \sum_{d | b_j} \varphi (a_i) \varphi (b_j) d(i) $ by using centroid decomposition.

Total number of divisors of all the numbers is $O(n \log n)$ and each of them is used $O(\log n)$ because of centroid decomposition. So the total complexity is $O(n \log^2 n)$

Reference

Notes on Codeforces Problems: Rating 2500

Complexity Analysis for Tree DP: 815C
In this problem, we have a straight-forward DP which has $sz_u \cdot sz_v$ for each children merging. In this case the total running time will be $O(n^2)$ since every pair of vertices will only be merged once.

$O(\sqrt{n})$ possibilities of n / i: 830B
We can express the sum of cut-off as $$\sum_{i} ( d \lceil \frac{a_i}{d}\rceil ) – \text{sum of } a_i$$ $$= d \sum_{i} (\lceil \frac{a_i}{d}\rceil ) – \text{sum of } a_i$$.
The sigma term will only change only $O(n \cdot \sqrt{a_i})$ many times and for each of them, we try to maximize $d$ within the range.

Cycle Under Modulo/ Fermat’s Little Theorem: 311D
It turns out that the length of cycle for a number cubed is always a divisor of 48. This can be explained by the fact that $3^{48} = 1 \bmod ( 95542721 – 1 )$. Therefore, by Fermat’s Little Theorem, we have $a^{3^k} = a^{3^{(k \bmod 48)}} \bmod 95542721$.

Binomial Coefficient/ Pascal’s Triangle: 407C
Don’t forget we have binomial coefficients in Pascal’s triangle! This is the way that we can update those coefficients offline in an array efficiently.

Centroid: 708C
Observation: Find any centroid $v$. We will only cut the edges adjacent to $v$. Otherwise, it will either be not possible (too large) or not optimal (can do better).

Meet-in-the-Middle: 327E, 1257F
When the constraints are really small, always try bit-mask DP or meet-in-the-middle.

MST/ Edge Groups of Kruskal’s Algorithm: 160D, 891C
The process of Krusal’s algorithm can be interpreted as grouping the edges with the same weights and processing them in increasing order of weights. Then for each weight, we want to find a spanning forest. This idea can be used to find the edges that can be/cannot be/might be in the MST.

Dealing with Double Divisions: 138C
If we are frequently doing double multiplicaitons and divisions during sliding window, the error might explode. Instead, either we can take logarithm beforehand and do additions/subtractions instead, or we can use a segment tree to avoid divisions.

Bipartite Graphs over points on 2D plane: 870E, 547D
When we are dealing with configurations of points on 2D plane, we can consider building a bipartite graph where the left (right) part are all possible x (y) coordinates and the points are edges that connects their x and y values.
By doing this construction, 870E becomes directing edges so that $|deg_{in} – deg_{out}| \leq 1$ for each node, and 547D becomes directing edges so that $deg_{in} \geq 1$ for each node.

Doing DFS to enumerate paths in the tree: 490F
We can start DFS from each node and do exactly like the original LIS problem by keeping a DP array. The time for update in each node is $O(1)$ so we can easily roll back when we exit a node.

“Construct optimal answer from any”: 460D
The main difficulty of the problem comes from whether we can choose three integers than have xor sum 0. We can start by any solution and try to tweak it. Then we can see if there are such solution exists, there must be some solution of form “$11000000…, 10111111…, 1111111…$”.

Maximum Pseudoforest/ Greedy: 875F
One can show that this problem is equivelant to finding the maximum weight pseudoforest. Since it is a matroid, we can use greedy to solve it using a way similar to Kruskal’s Algorithm.

Combinatorics / DP: 886E
To solve this problem, we try to use DP and see how to reduce to a smaller problem. Consider $f(n)$ as the number of permutations of $1$ to $n$ that exit early. Then we can consider the position of the largest value: if the largest value is in a early position $i <= n – k – 1$, the algorithm will alway exit early; otherwise the problem will become a smaller one.

Combinatorics: 348D
Let $f(x, y)$ be the number of paths from $x$ to $y$. We can shown the answer is $f((1, 0), (n, m-1)) \cdot f((0, 1), (n – 1, m)) – f((0, 1), (n, m-1)) \cdot f((1, 0), (n – 1, m)) $. The subtracted term is equal to the number of intersecting pairs by consider exchanging the membership of the path at the first time they intersect. The more general form of this problem is discussed here: Lindström–Gessel–Viennot lemma.

Segment Tree: 687D
One can notice that for any certain range of the edges, there will be at most $O(n)$ edges that matters (edges for a spanning tree and an violating edge). Therefore, we can store those edges for each interval in the segment tree, and for each query, we only have $O(n \log n)$ edges to consider.

Maximum Matching: 1198E
First consider covering the grid with unit strip (i.e. $1 \times n$ or $n \times 1$). Then we can reduce the problem into minimum vertex cover, which can be solved by max flow. The number of vertices is large but we can compress the identical ones.

HLD? / Brute Force: 442D
First of all, the answer will be at most $O(\log n)$ because an upper bound is HLD. Then first each newly added vertex, we can just iterative update its parent, and stop when nothing is updated. The amortized complexity is $O(n \log n)$ because each node will be updated at most $O(\log n)$ times.

Subset Sum / Bitset: 356D
While the problem itself is NP-hard, we can use bitset to squeeze the DP solution. To recover the solution, we can record the first item that make sum $j$ available, which can also be done by using bitset.

Constructive Algorithm: 1225F
Sometimes, when dealing with such problems like “finding the maximum answer”, it will be helpful to try to find a upper bound for the answer and show it is always achievable.

Greedy/ We-dont-care-about-everything DP: 822E
While doing brute force transition is too slow, we can just match as long as possible (i.e. longest common prefix) if we do match in a transition. Then the solution will work in $O(n \cdot k)$ after preprocessing LCP and RMQ array.

Cycle under Modulo / Constructive Algorithm: 763C
Consider a (cyclic) sequence where the next number of each number is $(x + d) \text{ mod } m$, where $x$ is the current number and $d, m$ is fixed. Then the answer we want is exactly a segment on this cyclic sequence. Then if $n \leq \frac{m}{2}$, we can see the difference $d$ of a two arbitrary numbers will appear $n – cnt$ times, where $cnt$ is the number of numbers between those two. If $n > \frac{m}{2}$, we can the complement and do the same thing.

Dilworth’s Theorem: 1257G, SPOJ MDOLLS
Dilworth’s Theorem states that the longest anti-chain in a DAG (i.e. the DAG version of “max independent set”) is equal to the size of the smallest path cover. In 1257G, we can prove that the latter is the size of the largest group of divisors that have the same number of prime factors.

Data Structure Keeping the sum of DP Values: 1129D
We try to do fast queries (of sum of values at positions of values of at most $k$) and update (for adding $+1$ or $-1$ to a interval) in a data structure that stores the DP values.
To do this we can do square root decompostion, where in each block we store the count of a each number and a offset to support fast update. This will cost $O(n \sqrt{n})$ for space (which is a little tight) and time.
To save more space, we can observe that the absolute difference in each block can be at most $\sqrt{n}$. Therefore, we can just store a prefix sum of a certain range of values, which reduces the space complexity to $O(n)$.