Quantum Bandit With Amplitude Amplification Exploration in an Adversarial Environment
AI Breakdown
Get a structured breakdown of this paper — what it's about, the core idea, and key takeaways for the field.
Abstract
The rapid proliferation of learning systems in an arbitrarily changing environment mandates the need to manage tensions between exploration and exploitation. This work proposes a quantum-inspired bandit learning approach for the learning-and-adapting-based offloading problem where a client observes and learns the costs of each task offloaded to the candidate resource providers, e.g., fog nodes. In this approach, a new action update strategy and novel probabilistic action selection are adopted, provoked by the amplitude amplification and collapse postulate in quantum computation theory. We devise a locally linear mapping between a quantum-mechanical phase in a quantum domain, e.g., Grover-type search algorithm, and a distilled probability-magnitude in a value-based decision-making domain, e.g., adversarial multi-armed bandit algorithm. The proposed algorithm is generalized, via the devised mapping, for better learning weight adjustments on favorable/unfavorable actions, and its effectiveness is verified via simulation.