On the Interpretability of Neural Network Decoders
Abstract
Neural-network (NN) based decoders are becoming increasingly popular in the field of quantum error correction (QEC), including for decoding state-of-the-art quantum computation experiments. In this work, established interpretability methods from the field of machine learning are used to introduce a toolbox for understanding the underlying decoding logic of NN decoders, which, once trained, typically operate as black-box models. To illustrate the capabilities of the employed interpretability method, which is based on Shapley value approximation, an exemplary case study is provided of a NN decoder trained for flag-qubit-based fault-tolerant (FT) QEC with the Steane code. Particular decoding decisions of the NN are interpreted, revealing how the NN learns to capture fundamental structures in the information gained from syndrome and flag-qubit measurements in order to reach a FT correction decision. Further, it is shown that understanding how the NN arrives at a decoding decision can be used both to identify flawed processing of error-syndrome information by the NN, which results in decreased decoding performance, and to make well-informed improvements to the NN architecture. The diagnostic capabilities of the interpretability method presented here can help ensure the successful application of machine learning for decoding of QEC protocols.
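To make the Shapley-based interpretability idea concrete, below is a minimal sketch of the standard permutation-sampling Monte Carlo approximation of Shapley values, applied to the inputs of a decoder-like model. The paper's actual NN decoder, training data, and implementation are not reproduced here; `toy_decoder`, its weights, and the 7-bit syndrome/flag input layout are hypothetical stand-ins chosen only to loosely mirror a Steane-code setting.

```python
import numpy as np

def shapley_values(model, x, baseline, n_samples=2000, rng=None):
    """Monte Carlo approximation of Shapley values for one input x.

    model    : callable mapping a (n_features,) array to a scalar output
               (e.g. the decoder's probability for a given correction).
    x        : the input to explain (e.g. concatenated syndrome/flag bits).
    baseline : reference input representing "absent" features
               (e.g. the trivial all-zero syndrome).
    """
    rng = np.random.default_rng(rng)
    n = x.shape[0]
    phi = np.zeros(n)
    for _ in range(n_samples):
        perm = rng.permutation(n)       # random feature ordering
        z = baseline.copy()
        prev = model(z)
        for i in perm:                  # reveal features one by one
            z[i] = x[i]
            curr = model(z)
            phi[i] += curr - prev       # marginal contribution of feature i
            prev = curr
    return phi / n_samples

# Hypothetical stand-in for one output head of a trained NN decoder:
# probability that a given data qubit needs an X correction, given
# 6 syndrome bits and 1 flag bit (7 binary inputs in total).
def toy_decoder(bits):
    w = np.array([0.9, -0.2, 0.4, 0.1, -0.5, 0.3, 1.2])
    return 1.0 / (1.0 + np.exp(-(bits @ w - 0.5)))

x = np.array([1., 0., 1., 0., 0., 1., 1.])  # observed syndrome + raised flag
baseline = np.zeros_like(x)                  # trivial (error-free) syndrome
phi = shapley_values(toy_decoder, x, baseline, rng=0)
print("Shapley attribution per input bit:", np.round(phi, 3))
# Efficiency property: attributions sum to f(x) - f(baseline).
print(np.round(phi.sum(), 3), "vs",
      np.round(toy_decoder(x) - toy_decoder(baseline), 3))
```

In this kind of analysis, a large positive attribution on a flag bit, for instance, would indicate that the raised flag drives the model toward a particular FT correction, which is the sort of decoding logic the paper's toolbox is designed to surface.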