Problem Solving Journal 2

Published Apr 21, 2025

Problems: Binary Subsequence Value Sum · Counting Squarefree GCD Pairs · Duplicates · Forming Groups · Interval Queries on the Greatest GCD Pair

During late February and early March I feel like I found a breakthrough in OI skill, from often struggling with Div 2. Ds in-contest to consistently solving Es. However, life and college has been quite overwhemling since then. Now that summer’s here with more free time hopefully I can perform in-contest to Master soon :). Although still far, qualifying for ICPC seems just a bit more possible now!

Binary Subsequence Value Sum

For a binary string $v$ , the score of $v$ is defined as the maximum value of
$F(v,1,i) \cdot F(v,i+1,|v|)$
over all $0 \leq i \leq \lvert v \rvert$ . Here, $F(v,l,r)=r - l + 1 - 2 \cdot \text{zero}(v,l,r)$ , where $\text{zero}(v,l,r)$ denotes the number of $0$ s in $v_l v_{l+1} \dots v_r$ . If $l > r$ , then $F(v, l, r) = 0$ . You are given a binary string $s$ of length $n$ and a positive integer $q$ . You will be asked $q$ queries. In each query, you are given an integer $1 \leq i \leq n$ . You must flip $s_i$ . Find the sum of the scores over all non-empty subsequences of s after each modification query, modulo some prime $p$ .

—Source: Codeforces 2077C

Solution

This can be done in $\mathcal{O}(n \log n + q)$ . First, let’s consider how to find the score of a binary string $v$ efficiently. We can expect to do this, considering all the other work we have to squeeze in time. The score is the maximum value of

\Bigl (i - 2 \cdot \text{zero}(v, 1, i) \Bigr ) \Bigl ( \lvert v \rvert - i - 2 \cdot \text{zero}(v, i + 1, \lvert v \rvert) \Bigr)

One thing we can note is that the sum of these two factors is constant up to the choice of $v$ , it always sums to $S = \lvert v \rvert - 2 \cdot \text{zero}(v, 1, \lvert v \rvert)$ . Since these factors are both integers, the best we can do is then

\text{score}(v) \leq \left \lfloor \frac{\lvert S \rvert}{2} \right \rfloor \left \lceil \frac{\lvert S \rvert}{2} \right \rceil

This is clear by “AM-GM”, proveable by a simple smoothing argument. Let’s check that equality is achieveable: the first term starts at $0$ , then either increases by $1$ ( $v_i = 1$ ), or decreases by $1$ ( $v_i = 0$ ) each time $i$ increases, and ends up at $S$ . Therefore, it attains all the integers from $0$ to $S$ and we establish

\text{score}(v) = \left \lfloor \frac{\lvert S \rvert}{2} \right \rfloor \left \lceil \frac{\lvert S \rvert}{2} \right \rceil

We are now tasked with finding the sum of the scores over all non-empty subsequences of a binary string $s$ . Notice that the score function depends only on the length $n$ of $v$ and the number of zeroes $k$ in $v$ . Moreover, it only depends on the quantity $n - 2k$ . Then we should evaluate

\texttt{ans} = \sum_n \sum_k f(n - 2k) \cdot (\text{\# of subsequences with length $n$ and $k$ zeroes})

Where $f(x) = \left \lfloor \frac{\lvert x \rvert}{2} \right \rfloor \left \lceil \frac{\lvert x \rvert}{2} \right \rceil$ .

It’s easy to count that the number of subsequences with length $n$ and $k$ zeroes is $\binom{A}{n - k} \binom{B}{k}$ , if $v$ has $A$ ones and $B$ zeroes. This looks like a potential application of Vandermonde’s, with a bit of manipulation first.

\begin{align*} \texttt{ans} &= \sum_n \sum_k f(n - 2k) \binom{A}{n - k} \binom{B}{k} \\ &= \sum_n \sum_k f(n - 2k) \binom{A}{n - k} \binom{B}{B - k} \\ &= \sum_t f(t - B) \sum_{a + b = t} \binom{A}{a} \binom{B}{b} \\ &= \sum_t f(t - B) \binom{\lvert v \rvert}{t} \end{align*}

So we have a solution in $\mathcal{O}(nq)$ . Note that, for some $v$ , only the value of $B$ matters. This sum looks tricky to dynamically recompute for each query, but we can notice it is a convolution. So this motivates an FFT-based approach:

\texttt{ans} = [x^B]\left(\sum f(t) x^{-t} \right)\left(\sum \binom{n}{t} x^t\right)

Where the sums are over the positive integers. However, from the coefficients of the second sum, it is clear we only care about $t \in [0, n]$ . Since $B \leq n$ , we can restrict the first sum to $t \in [-n, n]$ . FFT precomputes the answer for all $0 \leq B \leq n$ in $\mathcal{O}(n \log n)$ , and we’re done. One quick implementation note is that since the first sum can have negative powers, we can simply multiply a factor $x^n$ and query $x^{B + n}$ to make things easier. $\blacksquare$

I found this very tricky but satisfying to solve, with many possible ideas and approaches that never panned out.

Counting Squarefree GCD Pairs

An integer is square-biased if it is divisible by $x^2$ for some integer $x > 1$ . You are given an array of $1 \leq n \leq 10^5$ integers a_i ( $1 \leq a_i \leq 10^{12}$ ). Count the number pairs $i \leq j$ of entries such that $\gcd(a_i, a_j)$ is not square biased.

—Source: 2024 ICPC Pacific Northwest Regional Contest G

Solution

This was a problem which, if our team solved it last year, would’ve made us #4 in the entire NA West Division! Unfortunately, we weren’t that good at the time. I never read the solution or retried it post-contest, but recently remembered to. On this new attempt I was able to AC in 20 minutes! :)

The idea builds off from CSES Counting Coprime Pairs. First, we count the complement: the number of pairs which are square biased. We can then subtract this from ${n \choose 2}$ to obtain the answer.

Define $S_p = \{(i, j) : p^2 \mid \gcd(a_i, a_j) \}$ . Then we wish to find

\left \lvert \bigcup_{p} S_p \right \rvert = \sum_k (-1)^{k - 1} \sum_{p_1, \dots p_k} \left \lvert \bigcap S_{p_i} \right \rvert

By the Principle of Inclusion-Exclusion. Further note,

\bigcap S_{p_i} = S_{\prod p_i}

So that, in addition to the Fundamental Theorem of Arithmetic,

\left \lvert \bigcup_{p} S_p \right \rvert = \sum_{k > 1 \text{ squarefree}} (-1)^{\omega(k) - 1} \left \lvert S_k \right \rvert

Of course, since $a_i \leq 10^{12}$ , we only need to check $k \leq 10^6$ . Using a linear sieve we can easily compute $\omega(k)$ . It remains to be able to compute $\lvert S_k \rvert$ efficiently. Then let $s(k)$ be the number of $i$ such $k^2 \mid a_i$ . We know $\lvert S_k \rvert = {s(k) \choose 2}$ .

How do we find $s(k)$ ? We cannot sieve to $10^{12}$ to do so easily, but the key observation is that we don’t need to! Consider if $a \leq 10^{12}$ has no prime factors less or equal to $10^4$ . Then, $a$ has at most two prime factors, counting multiplicity. So $a$ is square biased if and only if $a = p^2$ for some prime $p$ . Therefore, we thoroughly divide $a$ by each $p \leq 10^4$ through brute force, and handle the remaining relevant square biased divisors through a $\mathcal{O}(1)$ check. This part is $\mathcal{O}(n \sqrt[3]{a_{\text{max}}}/ \log \sqrt[3]{a_{\text{max}}})$ from the Prime Number Theorem $\pi (\sqrt[3]{a_{\text{max}}})$ , which is fast enough. $\blacksquare$

Duplicates

A sequence contains duplicates if there is an element that appears more than once in the sequence. You are given an $n \times n$ matrix $X$ ( $3 \leq n \leq 100$ ), where each entry is between $1$ and $n$ . You can modify zero or more entries in $X$ to arbitrary integers between $1$ and $n$ . Your task is to make modifications to entries of $X$ such each row contains duplicates, and each column contains duplicates. Compute the minimum number of entries that need to be modified to achieve this.

—Source: 2024 ICPC Asia Pacific Championship E

Solution

Since each row and column have $n$ entries, it does not contain duplicates if and only if it’s a permutation of $(1, \dots, n)$ . Call this characterization unique. If a row/column is unique, we can quite easily change it to not be just by changing some one element while not making another row/column unique ( $n = 1, 2$ has weird cases otherwise, which justifies $n \geq 3$ ). The optimization part comes where changing one cell entry may simultaneously solve both a unique row and column. Then we have to strategically choose those, and it seems like a tricky process.

For these types of problems we can appeal to the classic row/column bipartition representation that has appeared in problems like 2021 Google Kickstart Round A - D. Checksum. We can have $n$ nodes for rows and $n$ for columns, with up to $n^2$ edges between these bipartitions corresponding to cells of their intersection. Here, the cells of interest motivate a graph where there is an edge between $r_i$ and $c_j$ if and only if both $r_i$ and $c_j$ is unique. We can also color unique row/column nodes red.

Under this construction, double-solving corresponds to “removing” both $r_i$ and $c_j$ . So having as much double-solves as possible is finding the maximal matching of this bipartite graph! For the remaining red nodes, we just need to single-solve. Hopcroft-Karp gives us a final $\mathcal{O}(n^2 \sqrt{n})$ , but Edmonds-Karp is fast enough too. $\blacksquare$

Forming Groups

There are $n$ students, numbered from $1$ to $n$ ( $2 \leq n \leq 10^6$ ), each student having skill $a_i$ . You are student $1$ , the captain of the students. Students $2$ to $n$ are standing in a line from left to right in order. You can choose to stand in between any two students, to the left of student $2$ , or to the right of student $n$ . Then you can choose a number $k$ ( $k > 1$ and $k$ must be a divisor of $n$ ), and the students will form $k$ groups based on their residue classes modulo $k$ . The skill of a group is the sum of skills of its members. What’s the minimum ratio $x_{\text{max}}/x_{\text{min}}$ you can achieve by choosing $k$ and your position, where $x_{\text{max}}, x_{\text{min}}$ is the maximum and minimum skills of the groups?

—Source: 2024 ICPC Asia Pacific Championship F

Solution

Of course, given some $k$ we can find this mininum ratio easily in $\mathcal{O}(n \log n)$ by trying each position you are in and maintaining the group sums in a multiset. However, a factor of $k$ is too much. But without choosing $k$ , it seems incredibly hard, which suggests that perhaps the set of viable $k$ is small.

We observe that the variance of group skills is larger the more groups there are, which makes the ratio large. This suggests that larger group sizes might be ideal. Let’s consider having group sizes of $nm$ , with $x_{\text{max}}, x_{\text{min}}$ . What if we instead of group sizes of $n$ ? Then consider the $n$ new groups that split from an original group. The skills satisfy

\sum_{n \text{ new groups}} x_i = x_{\text{original}}

Which implies

\min_{n \text{ new groups}} x_i \leq x_{\text{original}}/n \\ \max_{n \text{ new groups}} x_i \geq x_{\text{original}}/n

Therefore, over the new groups,

\frac{x_{\text{new max}}}{x_{\text{new min}}} \geq \frac{x_{\text{max}}/n}{x_{\text{min}}/n} = \frac{x_{\text{max}}}{x_{\text{min}}}

Which completes the proof. Thus, we only need to check $k = p$ for some prime, making the solution $\mathcal{O}(n \log^2 n)$ . $\blacksquare$

This is apparently a 2400-rated problem, which makes it my first red problem solve on CF! However, it took half an hour and I think 2100 better suits it…

Interval Queries on the Greatest GCD Pair

You are given a sequence of $2 \leq n \leq 10^5$ integers $a_i$ ( $1 \leq a_i \leq 10^5$ ) and $1 \leq q \leq 10^5$ intervals in the sequence. The intervals are specified by $[l_i, r_i]$ . An interval consisting of $k$ integers has $k(k−1)/2$ pairs of integers at different positions, which have their greatest common divisors. For each given interval, find the greatest one among such greatest common divisors.

—Source: 2024 ICPC Asia Yokohama Regional Contest I

Solution

Of course, $a_i \leq 10^5$ should motivate looking at the problem from the perspective of GCD values rather than pairs of $a_i$ . If $g$ is the GCD of two numbers in a range, then there must be two numbers in the range divisible by $g$ . So it is useful to compute the indices which are a multiple of each $1 \leq g \leq 10^5$ , which in total is bounded by $o(n^{4/3})$ from $d = o(n^{1/3})$ (the divisor count function).

Now we employ the classic way of handling offline queries by line sweeping. For each $g$ , we constantly keep track of the second last index $j$ which $g \mid a_j$ in an array $\texttt{b[g]} = j$ . Once we hit a $r$ in some query, then simply want to find the largest index $g$ of $b$ such $\texttt{b[g]} \geq l$ . This is a classic task which can be done by binary searching on a max segment tree in $\mathcal{O}(n \log^2 n)$ or by walking on the segment tree in $\mathcal{O}(n \log n)$ . Both pass. $\blacksquare$