Communications of the ACM

Disentangling Hype from Practicality: On Realistically Achieving Quantum Advantage

What is a realistic quantum advantage for quantum computers? That is the central question the authors of Disentangling Hype from Practicality: On Realistically Achieving Quantum Advantage ask themselves. What the researchers from Microsoft find ought to dash the hopes of many a pundit:

[A] wide range of often-cited applications is unlikely to result in a practical quantum advantage[.]

Barring massive breakthroughs across the quantum stack, it means: a general quantum advantage is not bloody likely. At least not anytime soon.

The I/O Bottleneck

To initialize and measure the qubits on a quantum computer, it is safe to assume that at least one operation is required. After all, if no gates are used for initialization, the qubits will be in random states. So, at what rate can the qubits be read or written with a single gate each? Just divide the number of logical qubits by the gate time.

Why only the logical qubits? Because the ancilla qubits needed to perform quantum error correction do not encode data as such. Their initialization is determined by the quantum error correction algorithm.

The authors compute a maximum I/O bandwidth of 1 Gbit/s for a 10,000-qubit quantum computer with 10 μs gate times. Even with such a marvellous machine, we are still several orders of magnitude away from modern CPUs and GPUs, which range from 100 Gbit/s (e.g. Intel i9 and Apple M2) to 10,000 Gbit/s (e.g. NVIDIA A100). Quantum computers of the NISQ and early fault-tolerant eras will therefore be limited by this I/O bottleneck. What the article fails to mention is that 10,000 logical qubits are well beyond the reach of the next decade.

A gate time of 10 μs is pretty common for ion traps, but solid-state technologies, such as superconductors and semiconductors (e.g. quantum dots), are much faster. I have summarized average gate times for various technologies in the table below; the gate times for one- and two-qubit gates generally differ. The three bandwidth figures listed are for 100, 1,000, and 10,000 logical qubits, respectively.

Modality          Gate time   Bandwidth (100)   Bandwidth (1k)   Bandwidth (10k)
Superconductors   10 ns       10 Gbit/s         100 Gbit/s       1,000 Gbit/s
Photonics         <100 ns     1 Gbit/s          10 Gbit/s        100 Gbit/s
Quantum dots      100 ns      1 Gbit/s          10 Gbit/s        100 Gbit/s
Trapped ions      10 μs       10 Mbit/s         100 Mbit/s       1 Gbit/s
Neutral atoms     10 μs       10 Mbit/s         100 Mbit/s       1 Gbit/s
NV centres        10 μs       10 Mbit/s         100 Mbit/s       1 Gbit/s

Only a quantum computer with 10,000 superconducting qubits can match the I/O bandwidth of a modern server. Unfortunately, we do not yet have a straightforward path towards 10,000 logical qubits. Quantum memory, or QRAM, is not a feasible solution to the I/O bottleneck either, as Connor Hann argued in Practicality of Quantum Random Access Memory.

Necessary Conditions for Speed-ups

It is clear that any algorithm that requires lots of data to be loaded onto a quantum computer or read out from one will be constrained by the I/O bottleneck. Quantum machine learning, solving large linear systems of equations (e.g. HHL algorithm), and database search (e.g. Grover’s algorithm) clearly fall in that category.

So, what kind of problems can be solved on quantum computers that offer any practical quantum advantage, or a speed-up over solving such problems on classical computers? Well, asymptotic speed-ups are nice in theory, but no one is going to wait around forever. We can instead look at the maximum number of useful operations that can be completed within a runtime of, say, \(\tau = 10^6\,\mathrm{s}\) or almost 11 days and 14 hours. While that particular number is arbitrary, it highlights the focus on long-running computations, where small speed-ups can have a huge impact. Such computations still complete within a reasonable time frame, so that quantum advantage is more than an exercise in exceptional patience.

Suppose each function call (a.k.a. oracle invocation) takes \(M\) operations. A classical computation that needs \(N^k\) invocations then runs for \(T_{\mathrm{c}} = M N^k t_{\mathrm{c}}\), with \(t_{\mathrm{c}}\) the classical time per operation; the parameter \(k\) models the asymptotic behaviour of the algorithm. A quantum computer needs only \(N\) invocations, so it finishes in \(T_{\mathrm{q}} = M N t_{\mathrm{q}}\), with \(t_{\mathrm{q}}\) the quantum time per operation. For a quadratic speed-up, \(k=2\).
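The runtime model can be written out directly. A sketch with hypothetical timings (the function names are mine; the symbols match the text):

```python
def t_classical(M: float, N: float, k: float, t_c: float) -> float:
    """Classical runtime: N**k oracle invocations of M operations, t_c each."""
    return M * N**k * t_c

def t_quantum(M: float, N: float, t_q: float) -> float:
    """Quantum runtime: only N oracle invocations of M operations, t_q each."""
    return M * N * t_q

def break_even_N(t_c: float, t_q: float, k: float) -> float:
    """Smallest N at which the quantum runtime matches the classical one:
    N = (t_q / t_c) ** (1 / (k - 1))."""
    return (t_q / t_c) ** (1.0 / (k - 1))
```

With, say, \(t_{\mathrm{q}} = 1\,\mu\mathrm{s}\), \(t_{\mathrm{c}} = 1\,\mathrm{ns}\), and \(k=2\) (all illustrative), the break-even point is \(N = 1000\): a slower quantum machine only pays off once the problem needs at least a thousand oracle invocations.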

As of August 2023, the official ACM edition contains typos in the exponent of the quantum runtime and in the denominator of the (k-1)th root that are corrected in the arXiv version.

For a practical quantum advantage, as per the authors’ definition, a quantum computer must finish in \(T_{\mathrm{q}} \leq \tau\), which implies that \(N\leq \frac{\tau}{Mt_{\mathrm{q}}}\). What is more, a quantum computer must finish before its classical competitor, so \(T_{\mathrm{q}} \leq T_{\mathrm{c}}\). With that in hand, we see that

\[\sqrt[k-1]{\frac{t_{\mathrm{q}}}{t_{\mathrm{c}}}} \leq N \leq \frac{\tau}{Mt_{\mathrm{q}}},\]

because \(M, N > 0\), so both sides of \(T_{\mathrm{q}} \leq T_{\mathrm{c}}\) may be divided by \(M N t_{\mathrm{c}}\).

Each oracle invocation can therefore comprise at most \(M\leq \tau\cdot \sqrt[k-1]{\frac{t_{\mathrm{c}}}{t_{\mathrm{q}}^{k}}}\) operations. The inequality highlights that both the speed-up \(k\) and compute speed are important to achieve quantum advantage. The bigger the speed-up and the faster the operations on the quantum hardware, the better.
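The bound is a direct transcription of the inequality (again with a function name of my own choosing):

```python
def max_ops_per_call(tau: float, t_c: float, t_q: float, k: float) -> float:
    """Upper bound on M, the operations per oracle call, from combining
    T_q <= tau with T_q <= T_c:  M <= tau * (t_c / t_q**k) ** (1 / (k - 1))."""
    return tau * (t_c / t_q**k) ** (1.0 / (k - 1))
```

With the same illustrative timings as before (\(\tau = 10^6\,\mathrm{s}\), \(t_{\mathrm{c}} = 1\,\mathrm{ns}\), \(t_{\mathrm{q}} = 1\,\mu\mathrm{s}\), \(k=2\)), the bound works out to \(M \leq 10^6 \cdot 10^{-9}/10^{-12} = 10^9\) operations per call. Note how sensitive the bound is to \(t_{\mathrm{q}}\), which enters with exponent \(k/(k-1)\).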

Can we use the peak I/O bandwidth figures, or rather their inverses, for \(t_{\mathrm{c}}\) and \(t_{\mathrm{q}}\)? Unfortunately not. These are based on the assumption that a single operation suffices. A more interesting case is multiplication, which is considerably more complicated.

The authors rely on a particular implementation of \(n\)-bit multiplication by Gidney and Fowler. What they find is that a quantum computer with 10,000 logical qubits can only handle 10,500 half-precision floating-point operations per second at a 10 μs gate time. Compared to a GPU or ASIC and with only a quadratic speed-up, they find \(M<1\). The researchers therefore conclude that a quadratic speed-up (\(k=2\)) is simply not sufficient for a practical quantum advantage.
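As a rough sanity check of that conclusion, we can plug numbers into the bound from the previous section. The classical figure below is my own assumption (a GPU sustaining on the order of \(3\times 10^{14}\) FP16 FLOP/s), not a number from the paper; the quantum rate is the 10,500 ops/s quoted above:

```python
tau = 1e6          # runtime budget in seconds
t_q = 1 / 10_500   # quantum: ~10,500 FP16 multiplications per second (paper)
t_c = 3e-15        # classical: ~3e14 FP16 FLOP/s, an assumed GPU figure
k = 2              # quadratic speed-up

# M <= tau * (t_c / t_q**k) ** (1 / (k - 1))
M = tau * (t_c / t_q**k) ** (1.0 / (k - 1))
print(f"M <= {M:.2f}")  # comes out around 0.33, i.e. below 1
```

Under these assumptions there is not even room for a single multiplication per oracle call, consistent with the authors' \(M<1\).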

The Insight

Hoefler & Co. come to the same conclusion as Babbush et al. (2021) from Google, who claimed a quadratic speed-up would not be sufficient for quantum advantage. That analysis, however, assumed an oracle of only 100 Toffoli gates, far too few to be of any practical use. This paper improves upon it by looking at a more realistic problem: multiplication.

The key insight is that, in the authors’ words, “quantum computers will be practical for ‘big compute’ problems on small data, not big data problems.” That may come as a surprise to the many companies that look into optimization and machine learning as key use cases for quantum computers, for which the speed-ups are often not even known except for isolated use cases. Where we do know such speed-ups, they tend to be quadratic only. Even though the HHL algorithm for solving sparse linear systems of equations has an exponential speed-up, it bumps into the I/O bottleneck and the fact that it may require circuits with \(\mathcal{O}(10^{25})\) gates, well beyond the coherence that current quantum computers can sustain.

The simulation of quantum systems in physics and chemistry is therefore the most promising avenue of investigation for the foreseeable future, which also happens to be the original proposal for quantum computers by Richard Feynman in the early 1980s:

[N]ature isn’t classical, dammit, and if you want to make a simulation of nature, you’d better make it quantum mechanical, and by golly it’s a wonderful problem, because it doesn’t look so easy.

More than forty years later, it still is not so easy, although tremendous progress has already been made.