Computing a Number Theoretic Sum

Published Jan 4, 2025

#math #cs #algorithms

While I was attempting this problem, I misread it after writing it down on paper and instead solved

S=xynxygcd(x,y) S = \sum_{xy \leq n} xy \gcd(x, y)

Here the sum is over positive integers (x,y)(x, y) satisfying xynxy \leq n, rather than x,ynx, y \leq n. In my opinion it turned out more interesting than the original one, so I wanted to share.

Compute S=xynxygcd(x,y)S = \sum_{xy \leq n} xy \gcd(x, y) in sublinear time. Here I present O(n2/3)\mathcal{O}(n^{2/3}) but it can be much faster.

—Source: Accidental misread

Solution

Setting up notation first, let MM be set of multiplicative functions, and * be the Dirichlet product, then (M,)(M, *) is an abelian group. Let ϵ\epsilon, 11, II, τ\tau, σ\sigma, μ\mu, ϕ\phi be the *-identity (11-indicator), constant 11, identity, divisor count, divisor sum, Mobius, and totient functions respectively. Note that they are all multiplicative.

Write k=xyk = xy so that we can iterate over kk as follows:

S=knkdkgcd(d,k/d)S = \sum_{k \leq n} k \sum_{d \mid k} \gcd(d, k/d)
Key Lemma
d2kτ(k/d2)ϕ(d)=dkgcd(d,k/d)\sum_{d^2 \mid k} \tau(k/d^2) \phi(d) = \sum_{d \mid k} \gcd(d, k/d)

Note here that dd does not represent the same iterator, but I kept the variable name anyways.

Proof of Lemma. A cool fact is that any sum of a product of multiplicative functions with the form of the left side is multiplicative. We can define here

f(d2)={0d is not a squareϕ(d)otherwisef(d^2) = \begin{cases} 0 & \text{$d$ is not a square} \\ \phi(d) & \text{otherwise} \end{cases}

Note that ff is multiplicative, and the left side is actually fτf * \tau, making it multiplicative.

We can verify the right side is multiplicative too, with just the textbook approach. One can notice that if gcd(p,q)=1\gcd(p, q) = 1, dpq    d=rsd \mid pq \implies d = rs, where rpr \mid p and sqs \mid q. Also, gcd(rs,pq/(rs))=gcd(r,p/r)gcd(s,q/s)\gcd(rs, pq/(rs)) = \gcd(r, p/r) \gcd(s, q/s). Therefore,

dpqgcd(d,pq/d)=rpgcd(r,p/r)sqgcd(s,q/s)\sum_{d \mid pq} \gcd(d, pq/d) = \sum_{r \mid p} \gcd(r, p/r) \sum_{s \mid q} \gcd(s, q/s)

It remains to simply check both sides coincide at prime powers:

ia/2τ(pa2i)ϕ(pi)=τ(pa)+1ia/2(a2i+1)pi1(p1)\sum_{i \leq \lfloor a/2\rfloor} \tau(p^{a - 2i}) \phi(p^i) = \tau(p^a) + \sum_{1 \leq i \leq \lfloor a/2\rfloor} (a - 2i + 1) p^{i - 1}(p - 1)
=ia/212pi+(a2a/2+1)pa/2=iapmin(i,ai)= \sum_{i \leq \lfloor a/2\rfloor - 1} 2p^i + (a - 2\lfloor a/2\rfloor + 1) p^{\lfloor a/2\rfloor} = \sum_{i \leq a} p^{\min(i, a - i)}

So we’re done. The sum telescopes in the middle. \square

Therefore

S=knkd2kτ(k/d2)ϕ(d)S = \sum_{k \leq n} k \sum_{d^2 \mid k} \tau(k/d^2) \phi(d)

The goal here was to turn the sum into one consisting of multiplicative functions, so we can try applying Dirichlet’s Hyperbola Method. We can quickly make the observation d2knd^2 \leq k \leq n, and it looks promising to try iterating over dnd \leq \sqrt{n} instead:

S=dnϕ(d)d2kkτ(k/d2)S = \sum_{d \leq \sqrt{n}} \phi(d) \sum_{d^2 \mid k} k \tau(k/d^2)

Now kk has a really simple characterization: it is just d2αd^2 \alpha for αn/d2\alpha \leq n/d^2. Let’s iterate over α\alpha,

S=dnϕ(d)d2αn/d2ατ(α)S = \sum_{d \leq \sqrt{n}} \phi(d) d^2 \sum_{\alpha \leq n/d^2} \alpha \tau(\alpha)

So we would be happy if we can compute prefix sums of IτI \tau quickly, luckily it is multiplicative and simple enough, so let’s try to write it as a Dirichlet product:

Iτ=II=Iμσ=ϕσI\tau = I * I = I * \mu * \sigma = \phi * \sigma

At this point we are almost at the end, as many methods to calculate prefix sums of ϕ\phi and σ\sigma are well known:

σ=I1\sigma = I * 1, which decomposes it into two functions whose prefix sums can be computed in O(1)\mathcal{O}(1). With the Hyperbola Method we can compute prefix sums of σ\sigma in n\sqrt{n}.

Prefix sums of ϕ\phi can be computed in O(n2/3)\mathcal{O}(n^{2/3}) by linear sieving the first n2/3n^{2/3} values. Then using a recursion with another Hyperbola Method on the rest, with ϕ=1μ\phi = 1 * \mu, while making use of the fact that there are Θ(n)\Theta(\sqrt{n}) unique values of n/i\lfloor n/i \rfloor. See here! This balances out to O(x1/3x/i)=O(x2/3)\mathcal{O} \left(\sum^{x^{1/3}} \sqrt{x/i} \right) = \mathcal{O}(x^{2/3}). In fact, through the recursion we get all the “important values”, the prefix sums up to ϕ(n/i)\phi(\lfloor n/i \rfloor), for free.

So now using the Hyperbola Method to consolidate IτI \tau, choose splitting point pq=spq = s:

sIτ=pϕ(i)s/iσ(j)+qσ(i)s/iϕ(j)qσ(i)pϕ(j)\sum^s I\tau = \sum^{p} \phi(i) \sum^{s/i} \sigma(j) + \sum^{q} \sigma(i) \sum^{s/i} \phi(j) -\sum^{q} \sigma(i) \sum^{p} \phi(j)

As the computation of ϕ\phi’s prefix sums dominate, we can simply pick p=q=sp = q = \sqrt{s} while also not worrying about quickly obtaining all the “important values” of σ\sigma, which only contributes O(nlogn)\mathcal{O}(\sqrt{n} \log n).

Now to resolve

S=dnϕ(d)d2αn/d2ατ(α)S = \sum_{d \leq \sqrt{n}} \phi(d) d^2 \sum_{\alpha \leq n/d^2} \alpha \tau(\alpha)

We can linear sieve ϕ\phi for the other part in the SS sum at no asymptotic cost. Thus the total cost of this is

dnO((n/d2)2/3)=O(n2/3)\sum_{d \leq \sqrt{n}} \mathcal{O}((n/d^2)^{2/3}) = \mathcal{O}(n^{2/3})

Although this is asymptotically optimal, in practice for values like n1010n \approx 10^{10} the constant factors come into play. Some experimentation shows that it’s better to decrease the precomputation of the linear sieves and rely more on the Hyperbola Method-induced recursion. \blacksquare

Further Improvements

I’m sure that many improvements can be made to this. I haven’t explored too much about it, but here’s some immediate ideas.

  • A quick skim of search results tell me prefix sums of ϕ\phi and σ\sigma can be calculated even quicker using more advanced methods that I can’t understand.
  • I have seen this remarkable blog, showing how to compute the “important values” of the Dirichlet product of two functions whose “important values” we already know in O(n polylog n)\mathcal{O}(\sqrt{n} \text{ polylog } n). Combined with the above the whole algorithm can also be O(n polylog n)\mathcal{O}(\sqrt{n} \text{ polylog } n).
  • A non-improvement: Although ϕI2\phi I^2, the other part of the multiplicand, is also multiplicative, the Hyperbola Method is limited by a n\sqrt{n} time. So trying to use it on the overall sum SS with a fast ϕI2\phi I^2 prefix sum gains nothing as long as Iτ=ϕσI\tau = \phi * \sigma is used.
  • If anyone wants to correct me or share any improvements please do. With that being said I should implement blog commenting.


© 2020-2025 • Last Updated 2025-01-04