Versatile Cross-platform Compilation Toolchain for Schrödinger-style Quantum Circuit Simulation
AI Breakdown
Get a structured breakdown of this paper — what it's about, the core idea, and key takeaways for the field.
Abstract
While existing quantum hardware resources have limited availability and reliability, there is a growing demand for exploring and verifying quantum algorithms. Efficient classical simulators for high-performance quantum simulation are critical to meeting this demand. However, due to the vastly varied characteristics of classical hardware, implementing hardware-specific optimizations for different hardware platforms is challenging. To address such needs, we propose CAST (Cross-platform Adaptive Schrödinger-style Simulation Toolchain), a novel compilation toolchain with cross-platform (CPU and Nvidia GPU) optimization and high-performance backend supports. CAST exploits a novel sparsity-aware gate fusion algorithm that automatically selects the best fusion strategy and backend configuration for targeted hardware platforms. CAST also aims to offer versatile and high-performance backend for different hardware platforms. To this end, CAST provides an LLVM IR-based vectorization optimization for various CPU architectures and instruction sets, and a PTX-based code generator for Nvidia GPU support. We benchmark CAST against IBM Qiskit, Google QSimCirq, Nvidia cuQuantum backend, and other high-performance simulators. On various 32-qubit CPU-based benchmarks, CAST achieves up to 8.03x speedup than Qiskit. On various 30-qubit GPU-based benchmarks, CAST achieves up to 39.3x speedup than Nvidia cuQuantum backend.