## Notes on Codeforces Problems: Rating 2500

Complexity Analysis for Tree DP: 815C
In this problem, we have a straight-forward DP which has $sz_u \cdot sz_v$ for each children merging. In this case the total running time will be $O(n^2)$ since every pair of vertices will only be merged once.

$O(\sqrt{n})$ possibilities of n / i: 830B
We can express the sum of cut-off as $$\sum_{i} ( d \lceil \frac{a_i}{d}\rceil ) – \text{sum of } a_i$$ $$= d \sum_{i} (\lceil \frac{a_i}{d}\rceil ) – \text{sum of } a_i$$.
The sigma term will only change only $O(n \cdot \sqrt{a_i})$ many times and for each of them, we try to maximize $d$ within the range.

Cycle Under Modulo/ Fermat’s Little Theorem: 311D
It turns out that the length of cycle for a number cubed is always a divisor of 48. This can be explained by the fact that $3^{48} = 1 \bmod ( 95542721 – 1 )$. Therefore, by Fermat’s Little Theorem, we have $a^{3^k} = a^{3^{(k \bmod 48)}} \bmod 95542721$.

Binomial Coefficient/ Pascal’s Triangle: 407C
Don’t forget we have binomial coefficients in Pascal’s triangle! This is the way that we can update those coefficients offline in an array efficiently.

Centroid: 708C
Observation: Find any centroid $v$. We will only cut the edges adjacent to $v$. Otherwise, it will either be not possible (too large) or not optimal (can do better).

Meet-in-the-Middle: 327E, 1257F
When the constraints are really small, always try bit-mask DP or meet-in-the-middle.

MST/ Edge Groups of Kruskal’s Algorithm: 160D, 891C
The process of Krusal’s algorithm can be interpreted as grouping the edges with the same weights and processing them in increasing order of weights. Then for each weight, we want to find a spanning forest. This idea can be used to find the edges that can be/cannot be/might be in the MST.

Dealing with Double Divisions: 138C
If we are frequently doing double multiplicaitons and divisions during sliding window, the error might explode. Instead, either we can take logarithm beforehand and do additions/subtractions instead, or we can use a segment tree to avoid divisions.

Bipartite Graphs over points on 2D plane: 870E, 547D
When we are dealing with configurations of points on 2D plane, we can consider building a bipartite graph where the left (right) part are all possible x (y) coordinates and the points are edges that connects their x and y values.
By doing this construction, 870E becomes directing edges so that $|deg_{in} – deg_{out}| \leq 1$ for each node, and 547D becomes directing edges so that $deg_{in} \geq 1$ for each node.

Doing DFS to enumerate paths in the tree: 490F
We can start DFS from each node and do exactly like the original LIS problem by keeping a DP array. The time for update in each node is $O(1)$ so we can easily roll back when we exit a node.

“Construct optimal answer from any”: 460D
The main difficulty of the problem comes from whether we can choose three integers than have xor sum 0. We can start by any solution and try to tweak it. Then we can see if there are such solution exists, there must be some solution of form “$11000000…, 10111111…, 1111111…$”.

Maximum Pseudoforest/ Greedy: 875F
One can show that this problem is equivelant to finding the maximum weight pseudoforest. Since it is a matroid, we can use greedy to solve it using a way similar to Kruskal’s Algorithm.

Combinatorics / DP: 886E
To solve this problem, we try to use DP and see how to reduce to a smaller problem. Consider $f(n)$ as the number of permutations of $1$ to $n$ that exit early. Then we can consider the position of the largest value: if the largest value is in a early position $i <= n – k – 1$, the algorithm will alway exit early; otherwise the problem will become a smaller one.

Combinatorics: 348D
Let $f(x, y)$ be the number of paths from $x$ to $y$. We can shown the answer is $f((1, 0), (n, m-1)) \cdot f((0, 1), (n – 1, m)) – f((0, 1), (n, m-1)) \cdot f((1, 0), (n – 1, m))$. The subtracted term is equal to the number of intersecting pairs by consider exchanging the membership of the path at the first time they intersect. The more general form of this problem is discussed here: Lindström–Gessel–Viennot lemma.

Segment Tree: 687D
One can notice that for any certain range of the edges, there will be at most $O(n)$ edges that matters (edges for a spanning tree and an violating edge). Therefore, we can store those edges for each interval in the segment tree, and for each query, we only have $O(n \log n)$ edges to consider.

Maximum Matching: 1198E
First consider covering the grid with unit strip (i.e. $1 \times n$ or $n \times 1$). Then we can reduce the problem into minimum vertex cover, which can be solved by max flow. The number of vertices is large but we can compress the identical ones.

HLD? / Brute Force: 442D
First of all, the answer will be at most $O(\log n)$ because an upper bound is HLD. Then first each newly added vertex, we can just iterative update its parent, and stop when nothing is updated. The amortized complexity is $O(n \log n)$ because each node will be updated at most $O(\log n)$ times.

Subset Sum / Bitset: 356D
While the problem itself is NP-hard, we can use bitset to squeeze the DP solution. To recover the solution, we can record the first item that make sum $j$ available, which can also be done by using bitset.

Constructive Algorithm: 1225F
Sometimes, when dealing with such problems like “finding the maximum answer”, it will be helpful to try to find a upper bound for the answer and show it is always achievable.

While doing brute force transition is too slow, we can just match as long as possible (i.e. longest common prefix) if we do match in a transition. Then the solution will work in $O(n \cdot k)$ after preprocessing LCP and RMQ array.

Cycle under Modulo / Constructive Algorithm: 763C
Consider a (cyclic) sequence where the next number of each number is $(x + d) \text{ mod } m$, where $x$ is the current number and $d, m$ is fixed. Then the answer we want is exactly a segment on this cyclic sequence. Then if $n \leq \frac{m}{2}$, we can see the difference $d$ of a two arbitrary numbers will appear $n – cnt$ times, where $cnt$ is the number of numbers between those two. If $n > \frac{m}{2}$, we can the complement and do the same thing.

Dilworth’s Theorem: 1257G, SPOJ MDOLLS
Dilworth’s Theorem states that the longest anti-chain in a DAG (i.e. the DAG version of “max independent set”) is equal to the size of the smallest path cover. In 1257G, we can prove that the latter is the size of the largest group of divisors that have the same number of prime factors.

Data Structure Keeping the sum of DP Values: 1129D
We try to do fast queries (of sum of values at positions of values of at most $k$) and update (for adding $+1$ or $-1$ to a interval) in a data structure that stores the DP values.
To do this we can do square root decompostion, where in each block we store the count of a each number and a offset to support fast update. This will cost $O(n \sqrt{n})$ for space (which is a little tight) and time.
To save more space, we can observe that the absolute difference in each block can be at most $\sqrt{n}$. Therefore, we can just store a prefix sum of a certain range of values, which reduces the space complexity to $O(n)$.

## Notes on Codeforces Problems: Rating 2400 (2)

Reduction to LIS: 76F
First, we need to understand the LIS problem with more depth. LIS problem, more generally, finds the longest sequence $X = [x_1, x_2,… , x_n]$ where $p(x_i) < p(x_{i+1})$ and $q(x_i) < q(x_{i+1})$ for $1 \leq i \leq n – 1$ for some function $p$ and $q$.
Then we return to the original problem to see how we can reduce it to LIS. After doing transformation to the formula, we see we can reach from $i$ to $j$ if and only if $|x_i – x_j| \leq (t_j – t_i) * V$. Then let $p_i = – x_i + t_i * V$ and $q_i = x_i + t_i * V$, then we can use LIS to solve this problem.

AC Automaton/ DP: 86C
AC Automaton is a good choice for maintaining “where we are when matching several strings”, but we also need to record how many characters are not matched yet when we are doing DP.
Another Note: To check whether a node is the end of a string, we need to check along the fail pointer until we reach the root.

Bottleneck Edge of Paths /Minimum Spanning Tree: 632F
The inequality in the problem looks like triangular inequality, which means we can somehow extend it into paths between two vertices, i.e. for any path between vertex $i$ and $j$, the maximum edge is large than the direct edge between them.
Then the way to solve this is to construct the MST and try to check each direct edge.

Euler’s Formula on Planar Graph: 933C
The solution of this problem utilizes Euler’s Formula, which relates the number of edges, vertices, and regions in a connected planar graph. $v-e+f = 2$, where $v$ is number of vertices, $e$ is number of edges, and $f$ is number of regions.

Reduction to MST / DP: 1120D
The first solution is to reduce this problem to MST. The idea of substituting between different tree nodes is similar to the way we make exchange argument in MST.
The second solution uses DP and the observation that we only need to care the cases where only one or zero vertice is left uncovered.

Almost Tree / Diameter: 835F
The graph is a almost tree, which contains a cycle and a bunch of trees attached to each node of the cycle. Ideally we want to enumerate the edge to delete. To do this, we can pick an arbitrary edge on the cycle, and do case analysis on whether it will be deleted. If yes, we can just run the diameter finding algorithm; If no, we can do some prefix/suffix sum around the cycle.

FFT / Segment Tree / Combinatorics: 1218E
1. While a good way to maintain single point/range update is to use a segment tree, it is still surprising to me that we can keep a frequency/sum list in each node and do convolution when merging nodes.
2. There is a step where we need to calculate array $s$ such that $$s_i = \sum_{j = 0}^{i} {{N – j}\choose{i – j}} \cdot t_j \cdot d^{i-j}$$. We can also use convolution to calculate this by spliting terms to those only related with $j$, $i – j$, $i$ respectively.
3. If we want to merge several intervals with complexity $O((n + m) \log (n + m))$. We can do this by pairing the smallest intervals together each time. This will yield a complexity of $L \log L$, where $L$ is the sum of all the intervals.

Segment Maintaining Max Prefix Sum: 1220F
We can keep the max prefix sums in a segment in two ways. One is using single point update and use a slightly more complicated merge function. Another is using lazy propagation directly on the prefix sum.

Divide-and-Conquer to Fit in Memory: 101E
If there is sufficient memory, we can do a straight-forward $O(n^2)$ DP. However, due to the tight memory constraint, in order to recover the answer, we will need to recursively find the middle point in the optimal answer using meet-in-the-middle tenique and divide the problems into the half, which yields a linear space complexity. In this way, the time complexity will be $O((n + m) ^ 2)$, which is still OK.

Reduction to Query in Segment Tree: 117D
One can see, when trying to caculate the sum, it is exactly like the process of querying sum in a segment tree.

## HDU Multi-University Training 2019, Contest 2

Longest Increasing Subsequence Revisited: Problem B
An alternative way to solve LIS: Using a segment or BIT to keep the maximum length ending with number $i$. To choose the lexicographically smallest/largest, we can greedy pick the first/last available number.

Contribution Technique?: Problem F
The key thing to notice is the product of any two unit cube. No matter how we split the cubes, they always stays the same. That means the answer is always the same, which equals to the sum of products of all pairs of unit cube. We can calculate that with the help of FWT.

Tricky Build-up of min-cut graph: Problem H
Consider having two items $A, B$ in a min-cut graph and the source $S$ and the sink $T$. Consider 5 edges between those vertices: $SA, SB, AB, AT, BT$. We can set up the cost of those edges properly so that cutting in some edges corresponds the profit that we are not able to get by seperating them. There are 5 variables and 4 equations, so we can have some solutions. In the setting of the problem, we can always get a non-negative solution. Then we get add the graphs of two vertices up and get the final graph.

Palindrome Tree: Problem I
Using DFS/Binary Lifting on the fail tree, we can know if there is a suffix palindrome of certain length. Then can add them to the answer.

## HDU Multi-University Training 2019, Contest 1

Dynamic Programming: Problem A
Since there can only be four numbers, we can do a four-dimensional DP to record the last occurrences of the numbers. The time complexity is $O(n^4)$. One dimension can be used to record the current number, so we can reduce it to $O(n^3)$ space to fit into the space.

XOR Basis: Problem B
This can be solved by segment tree in $O(n \log n \log A)$, but this is too slow. Instead, we can keep a upper-triangular basis for each right bound. We should put the number with higher index as high as possible and put other numbers down. In this way, when we are querying the maximum for range $[l, r]$, we take the number in the “$r$ basis” when it is in the bound ($l_i \geq l$), and it can be guaranteed that every number we take is valid, i.e. can be expressed as a linear combination of the number in $[l_i, l]$.

Sliding Windows for String Matching on SAM: Problem F
There are several things to notice: The first is that the range for possible transition for copying text is increasing, which means we can use a deque to maintain the minimum. Furthermore, since we can prove the DP value is non-decreasing, we can just choose the smallest position to do the copy. To find that position $j$, we maintain a SAM for $s[1;j-1]$ and keeping adding $s[j]$ into the SAM until we can find $s[j;i]$ in $s[1;j-1]$. We use a pointer $pt$ to record the current matching. To delete a character, we just keep moving $pt$ until $curlen \leq st[pt].len$.

Generating Function/ Convolution: Problem L
Write the operation as generating function, we have $$\sum_i b_i x^i = (\sum_i a_i x^i) (\sum_i x^{ik})$$, which tells us that the order of operations does not matter. We also see that the operation of the same kind can be combined together by using the fact that the coefficient of $x^t$ in $(\sum_i x^{i}) ^ l$ is ${t + l – 1}\choose{l – 1}$. Then we can use three NTTs to get the answer.

## Notes on Codeforces Problems: Rating 2300

This post is used to record my notes on some of the problems (Mainly for difficulty 2300) in Codeforces

126D Canonical Decompositon into Fibonacci number: Link
We need to use a so-called “canonical” way to represent the decomposition of fibonacci number, and do DP on that representation. “You can imagine Fibonacci coding by following way: i-th bit of number corresponds to the i-th Fibonacci number. For example, 16=13+3 will be written as 100100. You can represent into this code any positive integer, for that no two neighbouring 1-bit will be present.”

690C3 Diameter of the tree: Link
Keeping the diameter of the tree online. “keep a pair of vertices (u, v)” and compare (u, w), (v, w), (u, v) when w is added.  Pretty nice property of diameter!

1149C Between tree diameter and bracket sequence: Link
This problem needs us to exploit the property of tree diameter/LCA in bracket sequences, which is pretty similar to a solution to solve the unweighted version of QTREE4 in SPOJ. The two solutions uses different interesting observations to convert it into something solvable by the segment.
1. The LCA of two nodes is essentially the shallowest point between those two nodes in the bracket sequences.
2. Or you can take the substring between the two nodes, for example, “] [ [ ] [ ] ] ] [ ] [ [” , and you take away all of the matched parenthesis between them. It becomes “] ] [ [“, which is essentially the same as the path from one node to the other “]” means going up, and “[” means going down.)
or either of those two ways, we can use a segment tree to maintain the information in the bracket sequence.

228C Fractal Structure and Sparse Table: Link
While rolling hash yields a valid solution for this problem, the one used by the tutorial is more elegant. Fractal is actually a good place to reduce the structure into sub-structure. We can use the doubling technique (like in suffix array construction/ sparse table construction) and use dynamic programming to solve it.

While the problem here is solvable with standard divide and conquer, we can just do a smart brute force to pass by enumerating every maximum number and choose the smaller side to iterate. The complexity analysis is pretty similar to that of divide and conquer (smaller to large): “Every element will be processed no more than logn times because, if we process it in a segment of size m, the smaller part of it contains no more than m/2 elements (which we will process later, and the smaller part of this segment contains no more than m/4 elements, and so on).”

Bitmask DP here. The DP dp[mask] can be define as “the prefix of elements we have made so far”, and we can get a $O(3^n + n \cdot 2^n)$. We can additionally record how much we have covered for the next element, and then we can have a $O(n \cdot 2^n)$ solution.

718C Query on Fibonacci Sum: Link
Classic problem. There are two ways of solving this problem (both using lazy propagation in the segment tree):
1. Using the explicit formula of n-th Fibonacci number, the updating is just range mulplication. In this way, we need to use fast exponentiation.
2. There is another way. Recall that we can get the $n$-th Fibonacci number by matrix multiplication and we can keep a 1 by 2 matrix in each node and use matrix exponentiation to maintain the matrix.
Both of the ways have a complexity of $O(n \log^2{n})$

The most important thing for this problem is realizing that at most one of the three inequality of a triangle can be violated at the same time. That means we can count all the possibilities and substract the violating ones from that.

The key part to figure out for this problem is to see how we can do efficient updates from one level to another. While the solution used some “square root” optimization method, we can also take advantage of the property of Manhattan Distance by (spliting into 4 case and) using some 2D segment tree or just collecting the maximum for each row.

246E DSU on tree / query on DFS order: Link
There are two different solutions: one with range queries in the dfs order, another with DSU on tree. The DSU solution is involved with keeping the (depth, set of integers) pairs in each vertex. That can be done by using a deque of sets since in this way we can easily append the new depth level in the front when we move into a new vertex.

For the same kind of souvenirs, we always take the most expensive ones of them. Then we can see the cost function with the number of pieces taken is convex. However, it is noteworthy that when we do ternary search, the function of weight-2 items are not convex if we are using the amount of space left as the input, (the amount of space has a step size of 1 and the weight-2 items has a step size of 2.)

756C Looking from backward? / Balance Value: Link
The key observation of this problem is that we should look at the operation from backwards and calculate the so-called “balanced value” of each operation. It turns out that the element on the top of the stack is the first element (from backwards) that is larger than zero.

Unusual solution, which is pretty similar to hill climbing: Start a arbitrary solution to turn things around. It turns out it will ultimately “converge” (i.e, become a valid answer).

593D Jumping using union find: Link
Basic observation: 1. the order of the division doesn’t matter. 2. every number larger than 2 divides the number into halves or less. Then we can skip those “one”s by using a technique like union find.

## Notes on Codeforces Problems: Rating 2400 (1)

Currently I am solving Codeforces problems with difficulty 2400. This post is to record some new/fancy ideas to solve the problems.

Linearity of Expectation: 280C
Calculate the contribution of each vertex being directly deleted, which is $\frac{1}{depth(v) + 1}$, and sum them up.

DFS Spanning Tree / Edge 2-connected Component: 732F
The key idea of this problem is directing edges in a 2-connected Component into a strongly connected component. The idea is pretty simple: Do a DFS and add the edge along the way. The correctness can be proved using the DFS spanning tree: We can always go back through the back-edges, otherwise there will be a bridge.

Change the problem! / DFS Spanning Tree: 405E
There is a better way to interpret this problem: directing edges so that the every vertex’s in-degree becomes even. Then the problem becomes more free and we can use arbitrarily assigning directions to edges not in the spanning tree and then assign others according to the parities.
Note: The DFS spanning tree has both “forward edges” and “backward edges” to the visited edges.

Complexity Analysis: 850C
While the solution for this problem is brute-forcely calculating the SG function, it is tricky to prove it will work in time. The analysis goes as follows: For a certain mask with length $k$, the number of masks with highest bit $m$ it goes through during DFS can be limited. If $1 \leq m \leq k / 2$, then there can be at most $2^m$ masks. If $k / 2 \leq m \leq k$, there can be at most $2^{k – m}$ masks since only the first $k – m$ bits can change.

Merging Tries: 601D
The amotized complexity for merging a trie (in this problem) is, surprisingly, linear, if we merge it smartly: Suppose we are merging trie $A$ to tree $B$. Keep descending along two tries and stop whenever one tree has not children. If it is $B$, then it is fine. If it is $A$, then we “clip” the subtree (instead of visiting all of them) and add it into $B$. In this way, every trie node will be visited exactly once when it is merged into other nodes.

Multiplicative Function: 757E, 1097D
The key part of both of the two problems are seeing that the function asked is a mulplicative function. Then, we can calculate the answer for each prime divisors and then multiply them to get the final answer.
A useful lemma: A function $$f(n) = \sum_{d | n} g(d)$$ is multiplicative if $g$ is multiplicative.

While the intuition is that we will most likely choose numbers as large as possible, we can prove that we will either choose the largest numbers one by one or choose $max / 2$, $max / 3$, $max / 5$.

Connected Component on 2D Plane/ Combinatorics: 870E
1. When we build the graph such that each point is connected to the neighboring points in all four directions, we can get a graph with a nice property: Every leaf has its unique x-value/y-value in the connected component.
2. Another Key point of this problem is to think about what kind of configuration cannot be in the result. (Or intuitively realize that almost all the configurations are possible.)

Knapsack with Huge Amount of Small Values: 1132E
We can see that if we choose lots of items with some weight, we can group them together to a block of size $S$, where $S$ is LCM of $1$ to $8$. Then we can do a DP with the remaining items.

Knapsack with Few Categories of Same Values: 95E
The first observation is that there can be at most $O(\sqrt n)$ different sizes of component. We should notice this especially when we can solve the problems of the same size efficienctly. (Another example: 666C)
Then we can use the following way to solve the knapsack problem of same values: represent the number of items $C_i$ into $2 ^ 0 + 2 ^ 1 + …$ that we can separate them into $O(\log C_i)$ items and do the classic knapsack problem one by one.

Centroid Decomposition: 914E
By centroid decomposition, we can take all the paths into consideration. Note that when counting for a certain node, we can propagate the value into its parent.

Mincost-Maxflow: 277E, 863F
When a problem includes different variables constraining each other and requires finding the minimum cost, we should think about mincost-maxflow.

Divide-and-Conquer/ Persistent Segment Tree/ Randomization: 840D
Three are three practical ways to solve this problem:
1. Divide-and-Conquer. If we apply D&C, we can see that each of the $k$ most frequent values in a query that goes across the mid point must be one of the $k$ most frequent in $[L, mid]$ or $[mid+1, R]$.
2. Persistent Segment Tree: Keep the a tree of frequency and the maximum. We can answer each query in $O(k \log n)$
3. Since we try to find some element that appears a lot of time in a range, we can have several random picks in the array. Then with high probability, the intended element will be found.

Reduction to finding a point covered by largest number of rectangles: 377D
The key approach to think about the solution is to think $(L, R)$ as a point in the 2D-plane. Then we can reduce this problem to a classic one.

Probability/ Two pointers: 1194F
We deduce the expectation by thinking about how each crossword can contribute to the answer: $P(i) = \frac{1}{2^i}\sum\limits_{k = 0}^{m_i}{\binom{i}{k}}$
Then the key observation is that we can transform from $(i, m_i)$ to the neighboring pairs efficiently by using: $2 \cdot \sum\limits_{k = 0}^{x}{\binom{n}{k}} + \binom{n}{x + 1} = \sum\limits_{k = 0}^{x + 1}{\binom{n + 1}{k}}$, which essentially somes from Pascal Triangle Formula $\binom{n}{k} + \binom{n}{k + 1} = \binom{n + 1}{k + 1}$

Sweeping Line: 1194E
Note: While the limit for $n$ is 5000, since the set of segments will be splited into vertical and horizontal ones, $O(n^2 \log n)$ solutions will work fine.
For sweeping line, putting things in a certain order can be very important.

First we can see that if the adjacency lists of two vertices are the same, then we can always assign the same value to them and merge them together. In the compressed graph we know that if a the degree of vertex is $\geq 3$, there will be no answer based on the fact that the vertices with different adjacency list must have different number.
Useful fact: Any path between two vertices can be expressed as: “xor-sum of edge on any path” $\oplus$ “an element in the cycle space of the graph”.
Both with $O(n)$ preprocessing, while Z-function can check the cycle in the string in $O(n)$, the prefix function can do that in $O(1)$, which is crucial for this problem.
Define $dp[i]$ as the maximum profit we can get ending at $i$ (only considering the segment ending at or before $i$). Then we have $dp[i] = max_{j} (dp[j] + p[j + 1, i] – cost[j + 1, i])$ since the solution will look at several consecutive segment. Then we can move from left and keep the values in a segment tree and add value into a range when we have a new segment.