Problem Solving Journal 2
Published Apr 21, 2025
#problem-journal #math #cs #algorithmsProblems: Binary Subsequence Value Sum · Counting Squarefree GCD Pairs · Duplicates · Forming Groups · Interval Queries on the Greatest GCD Pair
During late February and early March I feel like I found a breakthrough in OI skill, from often struggling with Div 2. Ds in-contest to consistently solving Es. However, life and college has been quite overwhemling since then. Now that summer’s here with more free time hopefully I can perform in-contest to Master soon :). Although still far, qualifying for ICPC seems just a bit more possible now!
Binary Subsequence Value Sum
For a binary string , the score of is defined as the maximum value of
over all . Here, , where denotes the number of s in . If , then . You are given a binary string of length and a positive integer . You will be asked queries. In each query, you are given an integer . You must flip . Find the sum of the scores over all non-empty subsequences of s after each modification query, modulo some prime .
—Source: Codeforces 2077C
Solution
This can be done in . First, let’s consider how to find the score of a binary string efficiently. We can expect to do this, considering all the other work we have to squeeze in time. The score is the maximum value of
One thing we can note is that the sum of these two factors is constant up to the choice of , it always sums to . Since these factors are both integers, the best we can do is then
This is clear by “AM-GM”, proveable by a simple smoothing argument. Let’s check that equality is achieveable: the first term starts at , then either increases by (), or decreases by () each time increases, and ends up at . Therefore, it attains all the integers from to and we establish
We are now tasked with finding the sum of the scores over all non-empty subsequences of a binary string . Notice that the score function depends only on the length of and the number of zeroes in . Moreover, it only depends on the quantity . Then we should evaluate
Where .
It’s easy to count that the number of subsequences with length and zeroes is , if has ones and zeroes. This looks like a potential application of Vandermonde’s, with a bit of manipulation first.
So we have a solution in . Note that, for some , only the value of matters. This sum looks tricky to dynamically recompute for each query, but we can notice it is a convolution. So this motivates an FFT-based approach:
Where the sums are over the positive integers. However, from the coefficients of the second sum, it is clear we only care about . Since , we can restrict the first sum to . FFT precomputes the answer for all in , and we’re done. One quick implementation note is that since the first sum can have negative powers, we can simply multiply a factor and query to make things easier.
I found this very tricky but satisfying to solve, with many possible ideas and approaches that never panned out.
Counting Squarefree GCD Pairs
An integer is square-biased if it is divisible by for some integer . You are given an array of integers a_i (). Count the number pairs of entries such that is not square biased.
Solution
This was a problem which, if our team solved it last year, would’ve made us #4 in the entire NA West Division! Unfortunately, we weren’t that good at the time. I never read the solution or retried it post-contest, but recently remembered to. On this new attempt I was able to AC in 20 minutes! :)
The idea builds off from CSES Counting Coprime Pairs. First, we count the complement: the number of pairs which are square biased. We can then subtract this from to obtain the answer.
Define . Then we wish to find
By the Principle of Inclusion-Exclusion. Further note,
So that, in addition to the Fundamental Theorem of Arithmetic,
Of course, since , we only need to check . Using a linear sieve we can easily compute . It remains to be able to compute efficiently. Then let be the number of such . We know .
How do we find ? We cannot sieve to to do so easily, but the key observation is that we don’t need to! Consider if has no prime factors less or equal to . Then, has at most two prime factors, counting multiplicity. So is square biased if and only if for some prime . Therefore, we thoroughly divide by each through brute force, and handle the remaining relevant square biased divisors through a check. This part is from the Prime Number Theorem , which is fast enough.
Duplicates
A sequence contains duplicates if there is an element that appears more than once in the sequence. You are given an matrix (), where each entry is between and . You can modify zero or more entries in to arbitrary integers between and . Your task is to make modifications to entries of such each row contains duplicates, and each column contains duplicates. Compute the minimum number of entries that need to be modified to achieve this.
Solution
Since each row and column have entries, it does not contain duplicates if and only if it’s a permutation of . Call this characterization unique. If a row/column is unique, we can quite easily change it to not be just by changing some one element while not making another row/column unique ( has weird cases otherwise, which justifies ). The optimization part comes where changing one cell entry may simultaneously solve both a unique row and column. Then we have to strategically choose those, and it seems like a tricky process.
For these types of problems we can appeal to the classic row/column bipartition representation that has appeared in problems like 2021 Google Kickstart Round A - D. Checksum. We can have nodes for rows and for columns, with up to edges between these bipartitions corresponding to cells of their intersection. Here, the cells of interest motivate a graph where there is an edge between and if and only if both and is unique. We can also color unique row/column nodes red.
Under this construction, double-solving corresponds to “removing” both and . So having as much double-solves as possible is finding the maximal matching of this bipartite graph! For the remaining red nodes, we just need to single-solve. Hopcroft-Karp gives us a final , but Edmonds-Karp is fast enough too.
Forming Groups
There are students, numbered from to (), each student having skill . You are student , the captain of the students. Students to are standing in a line from left to right in order. You can choose to stand in between any two students, to the left of student , or to the right of student . Then you can choose a number ( and must be a divisor of ), and the students will form groups based on their residue classes modulo . The skill of a group is the sum of skills of its members. What’s the minimum ratio you can achieve by choosing and your position, where is the maximum and minimum skills of the groups?
Solution
Of course, given some we can find this mininum ratio easily in by trying each position you are in and maintaining the group sums in a multiset. However, a factor of is too much. But without choosing , it seems incredibly hard, which suggests that perhaps the set of viable is small.
We observe that the variance of group skills is larger the more groups there are, which makes the ratio large. This suggests that larger group sizes might be ideal. Let’s consider having group sizes of , with . What if we instead of group sizes of ? Then consider the new groups that split from an original group. The skills satisfy
Which implies
Therefore, over the new groups,
Which completes the proof. Thus, we only need to check for some prime, making the solution .
This is apparently a 2400-rated problem, which makes it my first red problem solve on CF! However, it took half an hour and I think 2100 better suits it…
Interval Queries on the Greatest GCD Pair
You are given a sequence of integers () and intervals in the sequence. The intervals are specified by . An interval consisting of integers has pairs of integers at different positions, which have their greatest common divisors. For each given interval, find the greatest one among such greatest common divisors.
Solution
Of course, should motivate looking at the problem from the perspective of GCD values rather than pairs of . If is the GCD of two numbers in a range, then there must be two numbers in the range divisible by . So it is useful to compute the indices which are a multiple of each , which in total is bounded by from (the divisor count function).
Now we employ the classic way of handling offline queries by line sweeping. For each , we constantly keep track of the second last index which in an array . Once we hit a in some query, then simply want to find the largest index of such . This is a classic task which can be done by binary searching on a max segment tree in or by walking on the segment tree in . Both pass.