Project proposal funded under Italian ISCA-C competitive grants
P.I. Iacopo Colonnelli, PhD candidate, University of Torino
Extended Abstract. Nowadays, quantum supremacy concerns only a restricted set of problems, for which quantum processors represent a game-changing technology. Nevertheless, typical scientific applications running on classical supercomputers are composed of very diverse steps. Some of them, like combinatorial optimizations, can benefit enormously from quantum annealers. Others, such as generic matrix multiplications or in-situ data visualization, are better handled by more traditional hardware like GPUs or multi/many-core CPUs. Moreover, current Noisy Intermediate-Scale Quantum (NISQ) computers are insufficient, in terms of both qubit count and error rate, to implement conventional quantum algorithms. Still, they can already work in conjunction with classical hardware to solve specific tasks significantly faster than a purely non-quantum setting.
Therefore, an accelerator-like programming paradigm is a reasonable approach to exploit quantum processors, not least because an ever-larger portion of the HPC community is already accustomed to such a paradigm for taking advantage of GPUs and FPGAs in heterogeneous computing environments. This technique executes a program’s main flow on general-purpose hardware (the “host”), offloading subroutines (the “kernels”) to a special-purpose architecture (the “device”) whenever advantageous.
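The host/device pattern described above can be illustrated with a minimal sketch. All names here (`offload`, `kernel_sum`, the `device` argument) are hypothetical stand-ins for a real offload runtime such as CUDA, OpenCL, or a quantum vendor API:

```python
# Hypothetical sketch of the accelerator ("host"/"device") pattern:
# the main flow runs on classical hardware and offloads a kernel
# to a special-purpose device only when advantageous.

def classical_preprocess(data):
    # Host-side work on general-purpose hardware.
    return [x * 2 for x in data]

def offload(kernel, payload, device="gpu"):
    # Stand-in for a real offload runtime; in this sketch the
    # "device" simply runs the kernel locally.
    return kernel(payload)

def kernel_sum(payload):
    # The subroutine worth accelerating on the device.
    return sum(payload)

result = offload(kernel_sum, classical_preprocess([1, 2, 3]))
print(result)  # 12
```

The key property is that the host code stays device-agnostic: only the `offload` call knows which special-purpose architecture runs the kernel.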
Implementing an efficient accelerator paradigm for quantum computing is not trivial. First of all, offloading a kernel to a quantum device typically translates into sending data to a remote API through the network, with data transfer overheads on the order of seconds. Therefore, it is necessary to group multiple experiments into batched executions whenever possible to keep performance high. Moreover, the limited availability of quantum hardware forces researchers to rely on quantum simulators for development and testing, which typically run on GPUs. Seamlessly switching from simulators to real quantum processors is mandatory for portability, reproducibility, and maintainability.
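A back-of-the-envelope model shows why batching matters. The overhead and run-time figures below are illustrative assumptions, not measurements:

```python
# Illustrative cost model: grouping experiments into one batched remote
# call amortizes the per-request network overhead (order of seconds).

NETWORK_OVERHEAD_S = 2.0   # assumed round-trip cost per remote API call
RUN_TIME_S = 0.01          # assumed on-device time per experiment

def cost_unbatched(n_experiments):
    # One remote call per experiment: overhead paid n times.
    return n_experiments * (NETWORK_OVERHEAD_S + RUN_TIME_S)

def cost_batched(n_experiments):
    # One request carries the whole batch; device time still scales with n.
    return NETWORK_OVERHEAD_S + n_experiments * RUN_TIME_S

print(cost_unbatched(100))  # ~201 seconds
print(cost_batched(100))    # ~3 seconds
```

Under these assumptions, batching 100 experiments turns minutes of accumulated network latency into a single round trip.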
Several existing quantum computing libraries adhere to these principles, exposing a multi-backend accelerator programming model to easily switch between simulated and real quantum environments and providing batched execution APIs to improve performance. Some solutions, like Rigetti's pyQuil or IBM's Qiskit, focus on proprietary quantum technologies, while others embrace a wider set of backends, e.g., the Azure Quantum Development Kit, Google Cirq, and Amazon Braket. In all cases, classical and quantum portions of a program are interleaved in the same host code.
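The multi-backend switch these libraries expose can be sketched in a library-agnostic way. The class and function names below (`SimulatorBackend`, `RemoteQPUBackend`, `get_backend`) are invented for illustration and do not reproduce any vendor's actual API:

```python
# Hypothetical multi-backend sketch: the same host code runs against a
# simulator or real quantum hardware, chosen by a configuration string.

class SimulatorBackend:
    def run(self, circuit, shots):
        # A real implementation would simulate the circuit (often on GPUs);
        # here we fake the counts of an ideal Bell-state measurement.
        return {"00": shots // 2, "11": shots - shots // 2}

class RemoteQPUBackend:
    def run(self, circuit, shots):
        # A real implementation would send the circuit to a remote API.
        raise NotImplementedError("no QPU credentials in this sketch")

BACKENDS = {"simulator": SimulatorBackend, "qpu": RemoteQPUBackend}

def get_backend(name):
    # The backend name can come from a config file or environment variable,
    # so switching from simulation to real hardware needs no code change.
    return BACKENDS[name]()

counts = get_backend("simulator").run(circuit="bell", shots=1000)
print(counts)  # {'00': 500, '11': 500}
```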
In this project, we aim to investigate how to build hybrid computational workflows that combine QC and HPC in a picture where quantum devices act as accelerators for traditional HPC platforms, while the two worlds remain as loosely coupled as possible. A viable method is to rely on modern workflow systems, such as the StreamFlow WMS. These systems build complex pipelines by composing fully segregated modules that can be independently deployed onto different, possibly heterogeneous platforms. Modules are then assembled according to their true data-dependency relations, which are kept separate from deployment indications.
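The separation between data dependencies and deployment indications can be sketched as follows. This is an illustrative model, not StreamFlow's actual API: the step names, the `dependencies` mapping, and the `deployment` mapping are all assumptions made for the example:

```python
# Illustrative sketch: modules are fully segregated steps; the
# data-dependency graph and the deployment targets live in separate
# mappings, mirroring the loose coupling described above.

modules = {
    "optimize": lambda inputs: min(inputs),      # quantum-annealer candidate
    "postprocess": lambda best: {"best": best},  # classical step
}

# True data-dependency relations: step -> upstream step it consumes.
dependencies = {"postprocess": "optimize"}

# Deployment indications are kept apart from the dependency graph, so the
# same workflow can be remapped to different platforms without edits.
deployment = {"optimize": "quantum-annealer", "postprocess": "hpc-cluster"}

def run_workflow(initial):
    results = {"optimize": modules["optimize"](initial)}
    for step, upstream in dependencies.items():
        results[step] = modules[step](results[upstream])
    return results

print(run_workflow([5, 3, 9]))
```

Because `deployment` is never consulted by the dependency logic, retargeting a step (e.g., from a simulator to a real annealer) changes only configuration, not workflow code.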
This approach has a number of advantages over traditional accelerator implementations:
1) Modules do not need to be compiled together, allowing different modules to live in very different software stacks;
2) Data movements can be semi-automatically marshaled/unmarshaled to match the different encodings of data in different systems;
3) Moving the classical execution near the quantum hardware can heavily reduce data transfer overheads;
4) The quantum backend can be configured outside the host code as a workflow parameter, allowing different quantum computations to be offloaded to the most suitable hardware (e.g., preferring quantum annealers to general-purpose hardware for QA-based optimization steps).
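Advantage 4 can be made concrete with a small routing sketch. The parameter names and step labels here are hypothetical:

```python
# Hypothetical sketch of advantage 4: the quantum backend is a workflow
# parameter, so each step is routed to the most suitable hardware
# without any change to the host code of the steps themselves.

workflow_params = {
    "qubo_optimization": {"backend": "quantum-annealer"},
    "state_preparation": {"backend": "gate-based-simulator"},
}

def route(step_name):
    # Resolved from the workflow configuration, not from the host code.
    return workflow_params[step_name]["backend"]

print(route("qubo_optimization"))  # quantum-annealer
```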