Problem Solving Journal 2

Published Apr 21, 2025

#problem-journal #math #cs #algorithms

Problems: Binary Subsequence Value Sum · Counting Squarefree GCD Pairs · Duplicates · Forming Groups · Interval Queries on the Greatest GCD Pair

During late February and early March I feel like I found a breakthrough in OI skill, from often struggling with Div 2. Ds in-contest to consistently solving Es. However, life and college has been quite overwhemling since then. Now that summer’s here with more free time hopefully I can perform in-contest to Master soon :). Although still far, qualifying for ICPC seems just a bit more possible now!

Binary Subsequence Value Sum

For a binary string vv, the score of vv is defined as the maximum value of

F(v,1,i)F(v,i+1,v)F(v,1,i) \cdot F(v,i+1,|v|)

over all 0iv0 \leq i \leq \lvert v \rvert. Here, F(v,l,r)=rl+12zero(v,l,r)F(v,l,r)=r - l + 1 - 2 \cdot \text{zero}(v,l,r), where zero(v,l,r)\text{zero}(v,l,r) denotes the number of 00s in vlvl+1vrv_l v_{l+1} \dots v_r. If l>rl > r, then F(v,l,r)=0F(v, l, r) = 0. You are given a binary string ss of length nn and a positive integer qq. You will be asked qq queries. In each query, you are given an integer 1in1 \leq i \leq n. You must flip sis_i. Find the sum of the scores over all non-empty subsequences of s after each modification query, modulo some prime pp.

—Source: Codeforces 2077C

Solution

This can be done in O(nlogn+q)\mathcal{O}(n \log n + q). First, let’s consider how to find the score of a binary string vv efficiently. We can expect to do this, considering all the other work we have to squeeze in time. The score is the maximum value of

(i2zero(v,1,i))(vi2zero(v,i+1,v))\Bigl (i - 2 \cdot \text{zero}(v, 1, i) \Bigr ) \Bigl ( \lvert v \rvert - i - 2 \cdot \text{zero}(v, i + 1, \lvert v \rvert) \Bigr)

One thing we can note is that the sum of these two factors is constant up to the choice of vv, it always sums to S=v2zero(v,1,v)S = \lvert v \rvert - 2 \cdot \text{zero}(v, 1, \lvert v \rvert). Since these factors are both integers, the best we can do is then

score(v)S2S2\text{score}(v) \leq \left \lfloor \frac{\lvert S \rvert}{2} \right \rfloor \left \lceil \frac{\lvert S \rvert}{2} \right \rceil

This is clear by “AM-GM”, proveable by a simple smoothing argument. Let’s check that equality is achieveable: the first term starts at 00, then either increases by 11 (vi=1v_i = 1), or decreases by 11 (vi=0v_i = 0) each time ii increases, and ends up at SS. Therefore, it attains all the integers from 00 to SS and we establish

score(v)=S2S2\text{score}(v) = \left \lfloor \frac{\lvert S \rvert}{2} \right \rfloor \left \lceil \frac{\lvert S \rvert}{2} \right \rceil

We are now tasked with finding the sum of the scores over all non-empty subsequences of a binary string ss. Notice that the score function depends only on the length nn of vv and the number of zeroes kk in vv. Moreover, it only depends on the quantity n2kn - 2k. Then we should evaluate

ans=nkf(n2k)(# of subsequences with length n and k zeroes)\texttt{ans} = \sum_n \sum_k f(n - 2k) \cdot (\text{\# of subsequences with length $n$ and $k$ zeroes})

Where f(x)=x2x2f(x) = \left \lfloor \frac{\lvert x \rvert}{2} \right \rfloor \left \lceil \frac{\lvert x \rvert}{2} \right \rceil.

It’s easy to count that the number of subsequences with length nn and kk zeroes is (Ank)(Bk)\binom{A}{n - k} \binom{B}{k}, if vv has AA ones and BB zeroes. This looks like a potential application of Vandermonde’s, with a bit of manipulation first.

ans=nkf(n2k)(Ank)(Bk)=nkf(n2k)(Ank)(BBk)=tf(tB)a+b=t(Aa)(Bb)=tf(tB)(vt)\begin{align*} \texttt{ans} &= \sum_n \sum_k f(n - 2k) \binom{A}{n - k} \binom{B}{k} \\ &= \sum_n \sum_k f(n - 2k) \binom{A}{n - k} \binom{B}{B - k} \\ &= \sum_t f(t - B) \sum_{a + b = t} \binom{A}{a} \binom{B}{b} \\ &= \sum_t f(t - B) \binom{\lvert v \rvert}{t} \end{align*}

So we have a solution in O(nq)\mathcal{O}(nq). Note that, for some vv, only the value of BB matters. This sum looks tricky to dynamically recompute for each query, but we can notice it is a convolution. So this motivates an FFT-based approach:

ans=[xB](f(t)xt)((nt)xt)\texttt{ans} = [x^B]\left(\sum f(t) x^{-t} \right)\left(\sum \binom{n}{t} x^t\right)

Where the sums are over the positive integers. However, from the coefficients of the second sum, it is clear we only care about t[0,n]t \in [0, n]. Since BnB \leq n, we can restrict the first sum to t[n,n]t \in [-n, n]. FFT precomputes the answer for all 0Bn0 \leq B \leq n in O(nlogn)\mathcal{O}(n \log n), and we’re done. One quick implementation note is that since the first sum can have negative powers, we can simply multiply a factor xnx^n and query xB+nx^{B + n} to make things easier. \blacksquare

I found this very tricky but satisfying to solve, with many possible ideas and approaches that never panned out.

Counting Squarefree GCD Pairs

An integer is square-biased if it is divisible by x2x^2 for some integer x>1x > 1. You are given an array of 1n1051 \leq n \leq 10^5 integers a_i (1ai10121 \leq a_i \leq 10^{12}). Count the number pairs iji \leq j of entries such that gcd(ai,aj)\gcd(a_i, a_j) is not square biased.

—Source: 2024 ICPC Pacific Northwest Regional Contest G

Solution

This was a problem which, if our team solved it last year, would’ve made us #4 in the entire NA West Division! Unfortunately, we weren’t that good at the time. I never read the solution or retried it post-contest, but recently remembered to. On this new attempt I was able to AC in 20 minutes! :)

The idea builds off from CSES Counting Coprime Pairs. First, we count the complement: the number of pairs which are square biased. We can then subtract this from (n2){n \choose 2} to obtain the answer.

Define Sp={(i,j):p2gcd(ai,aj)}S_p = \{(i, j) : p^2 \mid \gcd(a_i, a_j) \}. Then we wish to find

pSp=k(1)k1p1,pkSpi\left \lvert \bigcup_{p} S_p \right \rvert = \sum_k (-1)^{k - 1} \sum_{p_1, \dots p_k} \left \lvert \bigcap S_{p_i} \right \rvert

By the Principle of Inclusion-Exclusion. Further note,

Spi=Spi\bigcap S_{p_i} = S_{\prod p_i}

So that, in addition to the Fundamental Theorem of Arithmetic,

pSp=k>1 squarefree(1)ω(k)1Sk\left \lvert \bigcup_{p} S_p \right \rvert = \sum_{k > 1 \text{ squarefree}} (-1)^{\omega(k) - 1} \left \lvert S_k \right \rvert

Of course, since ai1012a_i \leq 10^{12}, we only need to check k106k \leq 10^6. Using a linear sieve we can easily compute ω(k)\omega(k). It remains to be able to compute Sk\lvert S_k \rvert efficiently. Then let s(k)s(k) be the number of ii such k2aik^2 \mid a_i. We know Sk=(s(k)2)\lvert S_k \rvert = {s(k) \choose 2}.

How do we find s(k)s(k)? We cannot sieve to 101210^{12} to do so easily, but the key observation is that we don’t need to! Consider if a1012a \leq 10^{12} has no prime factors less or equal to 10410^4. Then, aa has at most two prime factors, counting multiplicity. So aa is square biased if and only if a=p2a = p^2 for some prime pp. Therefore, we thoroughly divide aa by each p104p \leq 10^4 through brute force, and handle the remaining relevant square biased divisors through a O(1)\mathcal{O}(1) check. This part is O(namax3/logamax3)\mathcal{O}(n \sqrt[3]{a_{\text{max}}}/ \log \sqrt[3]{a_{\text{max}}}) from the Prime Number Theorem π(amax3)\pi (\sqrt[3]{a_{\text{max}}}), which is fast enough. \blacksquare

Duplicates

A sequence contains duplicates if there is an element that appears more than once in the sequence. You are given an n×nn \times n matrix XX (3n1003 \leq n \leq 100), where each entry is between 11 and nn. You can modify zero or more entries in XX to arbitrary integers between 11 and nn. Your task is to make modifications to entries of XX such each row contains duplicates, and each column contains duplicates. Compute the minimum number of entries that need to be modified to achieve this.

—Source: 2024 ICPC Asia Pacific Championship E

Solution

Since each row and column have nn entries, it does not contain duplicates if and only if it’s a permutation of (1,,n)(1, \dots, n). Call this characterization unique. If a row/column is unique, we can quite easily change it to not be just by changing some one element while not making another row/column unique (n=1,2n = 1, 2 has weird cases otherwise, which justifies n3n \geq 3). The optimization part comes where changing one cell entry may simultaneously solve both a unique row and column. Then we have to strategically choose those, and it seems like a tricky process.

For these types of problems we can appeal to the classic row/column bipartition representation that has appeared in problems like 2021 Google Kickstart Round A - D. Checksum. We can have nn nodes for rows and nn for columns, with up to n2n^2 edges between these bipartitions corresponding to cells of their intersection. Here, the cells of interest motivate a graph where there is an edge between rir_i and cjc_j if and only if both rir_i and cjc_j is unique. We can also color unique row/column nodes red.

Under this construction, double-solving corresponds to “removing” both rir_i and cjc_j. So having as much double-solves as possible is finding the maximal matching of this bipartite graph! For the remaining red nodes, we just need to single-solve. Hopcroft-Karp gives us a final O(n2n)\mathcal{O}(n^2 \sqrt{n}), but Edmonds-Karp is fast enough too. \blacksquare

Forming Groups

There are nn students, numbered from 11 to nn (2n1062 \leq n \leq 10^6), each student having skill aia_i. You are student 11, the captain of the students. Students 22 to nn are standing in a line from left to right in order. You can choose to stand in between any two students, to the left of student 22, or to the right of student nn. Then you can choose a number kk (k>1k > 1 and kk must be a divisor of nn), and the students will form kk groups based on their residue classes modulo kk. The skill of a group is the sum of skills of its members. What’s the minimum ratio xmax/xminx_{\text{max}}/x_{\text{min}} you can achieve by choosing kk and your position, where xmax,xminx_{\text{max}}, x_{\text{min}} is the maximum and minimum skills of the groups?

—Source: 2024 ICPC Asia Pacific Championship F

Solution

Of course, given some kk we can find this mininum ratio easily in O(nlogn)\mathcal{O}(n \log n) by trying each position you are in and maintaining the group sums in a multiset. However, a factor of kk is too much. But without choosing kk, it seems incredibly hard, which suggests that perhaps the set of viable kk is small.

We observe that the variance of group skills is larger the more groups there are, which makes the ratio large. This suggests that larger group sizes might be ideal. Let’s consider having group sizes of nmnm, with xmax,xminx_{\text{max}}, x_{\text{min}}. What if we instead of group sizes of nn? Then consider the nn new groups that split from an original group. The skills satisfy

n new groupsxi=xoriginal\sum_{n \text{ new groups}} x_i = x_{\text{original}}

Which implies

minn new groupsxixoriginal/nmaxn new groupsxixoriginal/n\min_{n \text{ new groups}} x_i \leq x_{\text{original}}/n \\ \max_{n \text{ new groups}} x_i \geq x_{\text{original}}/n

Therefore, over the new groups,

xnew maxxnew minxmax/nxmin/n=xmaxxmin\frac{x_{\text{new max}}}{x_{\text{new min}}} \geq \frac{x_{\text{max}}/n}{x_{\text{min}}/n} = \frac{x_{\text{max}}}{x_{\text{min}}}

Which completes the proof. Thus, we only need to check k=pk = p for some prime, making the solution O(nlog2n)\mathcal{O}(n \log^2 n). \blacksquare

This is apparently a 2400-rated problem, which makes it my first red problem solve on CF! However, it took half an hour and I think 2100 better suits it…

Interval Queries on the Greatest GCD Pair

You are given a sequence of 2n1052 \leq n \leq 10^5 integers aia_i (1ai1051 \leq a_i \leq 10^5) and 1q1051 \leq q \leq 10^5 intervals in the sequence. The intervals are specified by [li,ri][l_i, r_i]. An interval consisting of kk integers has k(k1)/2k(k−1)/2 pairs of integers at different positions, which have their greatest common divisors. For each given interval, find the greatest one among such greatest common divisors.

—Source: 2024 ICPC Asia Yokohama Regional Contest I

Solution

Of course, ai105a_i \leq 10^5 should motivate looking at the problem from the perspective of GCD values rather than pairs of aia_i. If gg is the GCD of two numbers in a range, then there must be two numbers in the range divisible by gg. So it is useful to compute the indices which are a multiple of each 1g1051 \leq g \leq 10^5, which in total is bounded by o(n4/3)o(n^{4/3}) from d=o(n1/3)d = o(n^{1/3}) (the divisor count function).

Now we employ the classic way of handling offline queries by line sweeping. For each gg, we constantly keep track of the second last index jj which gajg \mid a_j in an array b[g]=j\texttt{b[g]} = j. Once we hit a rr in some query, then simply want to find the largest index gg of bb such b[g]l\texttt{b[g]} \geq l. This is a classic task which can be done by binary searching on a max segment tree in O(nlog2n)\mathcal{O}(n \log^2 n) or by walking on the segment tree in O(nlogn)\mathcal{O}(n \log n). Both pass. \blacksquare


© 2020-2025 • Last Updated 2025-04-21