Lazy Qubit Reordering for Accelerating Parallel State-Vector-based Quantum Circuit Simulation
Abstract
This article proposes two quantum operation scheduling methods for accelerating parallel state-vector-based quantum circuit simulation on multiple graphics processing units (GPUs). The proposed methods reduce the all-to-all communication caused by qubit reordering, which can dominate the overhead of parallel simulation. Our out-of-order approach eliminates redundant reorderings by intentionally delaying reordering communications so that multiple reorderings can be aggregated into a single one. The delays are introduced according to the principles of time-space tiling, a cache optimization technique for classical computers, which we use to arrange the execution order of quantum operations. We tailor these methods to the two primary procedures in variational quantum eigensolver simulation: quantum state update (QSU) and expectation value computation (EVC). Our QSU simulation takes advantage of the hierarchical interconnection of GPU systems to avoid slow inter-node communication, whereas our EVC simulation reduces the number of reorderings by diagonalizing Pauli strings. Experimental validation on 32-GPU executions demonstrates speedups of up to 54× for QSU and up to 1,657× for EVC compared to an in-order method. We believe that our out-of-order approach is useful for accelerating large-scale quantum circuit simulations, including QSU and/or EVC, that operate on qubits in a regular manner.
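To make the idea of aggregating reorderings concrete, the following toy model counts how many qubit reorderings an in-order schedule and a lazy, out-of-order schedule would need. It is only a sketch under stated assumptions and is not the time-space-tiling scheduler described in the paper: gates are reduced to the tuple of qubits they act on, a gate can run only when all of its qubits are "local", gates on disjoint qubits are treated as freely commuting, and the layout policy `next_layout` and both counting functions are hypothetical names introduced here for illustration.

```python
from typing import List, Set, Tuple

Gate = Tuple[int, ...]  # a gate is represented only by the qubits it touches


def next_layout(pending: List[Gate], local: Set[int], n_local: int) -> Set[int]:
    """Pick the next set of local qubits (hypothetical greedy policy).

    Walk the pending gates in program order and collect their qubits until
    the local capacity is exhausted, then pad with qubits that were already local.
    """
    wanted: List[int] = []
    for g in pending:
        new = [q for q in g if q not in wanted]
        if len(wanted) + len(new) > n_local:
            break
        wanted.extend(new)
    if not wanted:
        raise ValueError("a gate acts on more qubits than fit locally")
    for q in sorted(local):
        if len(wanted) >= n_local:
            break
        if q not in wanted:
            wanted.append(q)
    return set(wanted)


def count_reorderings_in_order(gates: List[Gate], n_local: int) -> int:
    """Execute gates strictly in program order; reorder whenever one is not local."""
    local = set(range(n_local))
    reorders = 0
    for i, g in enumerate(gates):
        if not set(g) <= local:
            local = next_layout(gates[i:], local, n_local)
            reorders += 1
    return reorders


def count_reorderings_lazy(gates: List[Gate], n_local: int) -> int:
    """Delay reordering: first run every gate that is executable on the current layout."""
    pending = list(gates)
    local = set(range(n_local))
    reorders = 0
    while True:
        blocked: Set[int] = set()  # qubits used by an earlier gate that is still pending
        remaining: List[Gate] = []
        for g in pending:
            qs = set(g)
            if qs <= local and not (qs & blocked):
                continue  # all qubits local and no pending predecessor: run it now
            blocked |= qs
            remaining.append(g)
        pending = remaining
        if not pending:
            return reorders
        local = next_layout(pending, local, n_local)  # one aggregated reordering
        reorders += 1


# Three layers of single-qubit gates on 4 qubits, with only 2 qubits local at a time.
circuit = [(0,), (1,), (2,), (3,)] * 3
print(count_reorderings_in_order(circuit, 2), count_reorderings_lazy(circuit, 2))
```

On this regular circuit the in-order schedule reorders every time it crosses from qubits {0, 1} to {2, 3} and back, while the lazy schedule first exhausts all gates on the current local qubits and then reorders once. The size of the gap depends entirely on the circuit and on the assumed layout policy.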