Exploring Quantum Natural Language Processing: A Breakthrough
In the realm of Quantum Natural Language Processing (QNLP), a significant milestone has been achieved with the advent of practical applications on quantum computers.
Sentences as Complex Networks. Words form intricate networks rather than mere "bags of words." About a decade ago, one of the authors, Bob Coecke, alongside Mehrnoosh Sadrzadeh and Steve Clark, began illustrating these networks, leading to a visual representation of how word meanings combine to convey the overall meaning of a sentence. This contrasts with the conventional view of sentences as unstructured collections of individual word meanings. These findings gained recognition, even featuring prominently in New Scientist.
Visualizing Meaning Networks. To better understand the functionality of these networks, consider a simplified example where boxes symbolize word meanings, and lines represent channels for meaning transmission. In this context, the subject "Alice" and the object "Bob" connect to the verb "hates," collectively defining the sentence's meaning. The foundational work for this concept can be traced back to the 1950s, primarily through the contributions of Noam Chomsky and Jean Lambek, who aimed to unify grammatical structures across languages within a singular mathematical framework. The meaning-flow network of a sentence is built upon a compositional mathematical model of semantics.
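To make this concrete, the wires in such a diagram can be read as tensor contractions: nouns are vectors in a meaning space, a transitive verb is a higher-order tensor, and connecting the wires composes them into the sentence's meaning. Below is a minimal sketch in plain Python with made-up two-dimensional toy meanings; the numbers carry no linguistic significance.

```python
# Toy compositional semantics: the diagram's wires become tensor contractions.
# Nouns live in a 2-dimensional meaning space; a transitive verb is a tensor
# that consumes a subject and an object.  All values are illustrative.

alice = [1.0, 0.0]          # noun vector for "Alice"
bob   = [0.0, 1.0]          # noun vector for "Bob"

# hates[i][j] = degree to which basis-subject i hates basis-object j
hates = [[0.1, 0.9],
         [0.3, 0.2]]

def transitive_sentence(subj, verb, obj):
    """Contract the subject and object wires into the verb tensor."""
    return sum(subj[i] * verb[i][j] * obj[j]
               for i in range(len(subj)) for j in range(len(obj)))

print(transitive_sentence(alice, hates, bob))  # -> 0.9, the degree of truth
```

The sentence's meaning is not a bag of the three word vectors but the result of the verb mediating between subject and object, exactly as the diagram's wiring prescribes.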
Language's Quantum Nature. A fascinating aspect of this graphical linguistic framework is its lineage from prior work that integrated quantum theory into a network-centric language. This foundational research culminated in a comprehensive textbook authored by Coecke and Aleks Kissinger.
Bridging Words and Quantum States. A pivotal connection was drawn between word meanings and quantum states, and between grammatical structures and quantum measurements. This relationship raises the question: can quantum computers process natural language? The initial proposal by Will Zeng and Coecke in 2016 laid the groundwork for a novel NLP paradigm in quantum computing, yet it faced significant hurdles: the lack of quantum computers capable of executing the proposed NLP tasks, and the ambitious goal of encoding word meanings using quantum random access memory (QRAM).
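The correspondence rests on a simple observation: a classical meaning vector, once normalized, is a valid quantum state whose entries become measurement amplitudes, and loading such vectors into a quantum register is precisely the job QRAM was meant to do. A sketch of the normalization step, with an illustrative made-up vector:

```python
import math

def amplitude_encode(vec):
    """Normalise a classical meaning vector so it is a valid quantum state:
    the squared amplitudes must sum to 1."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

word = [3.0, 4.0]                   # made-up 2-d meaning vector (one qubit)
state = amplitude_encode(word)
print(state)                        # -> [0.6, 0.8]
assert abs(sum(a * a for a in state) - 1.0) < 1e-9
```

A 2-dimensional vector fits on one qubit; each extra qubit doubles the available dimensions, which is what makes the encoding attractive and the loading step hard.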
Our Recent Initiatives. The concept of leveraging quantum computers for natural language processing is not only intriguing but also a logical progression. Recently, Intel has backed initial attempts to explore ideas from the Zeng-Coecke paper on their quantum simulator. The inaugural conference on Quantum Natural Language Processing took place in Oxford in December 2019, showcasing a simulation of our experiment. Since then, efforts have shifted towards utilizing existing NISQ devices, particularly IBM’s quantum systems. However, the networks illustrated earlier cannot be directly interpreted by IBM's hardware, necessitating a transformation into a "quantum circuit" format.
Implementing QNLP on Quantum Devices. In this revised form, we illustrate that QNLP can indeed be executed on NISQ devices, with promising scalability as these technologies advance. The ZX-calculus language, developed by Coecke and CQC's Ross Duncan, was employed to depict quantum circuits, aligning seamlessly with the network language of quantum theory that supports QNLP.
Advancing Without QRAM. Our approach circumvents the need for QRAM. While a comprehensive explanation is beyond this discussion, we utilize quantum machine learning to build a framework where quantum states and processes derive meanings directly from text rather than explicitly encoding word meanings. In this context, quantum circuits replace classical neural networks to identify patterns within data. Interestingly, while neural architectures dominate classical NLP, most methods overlook grammatical structures. Our QNLP methodology naturally integrates both grammar and meaning.
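As a toy illustration of a parameterized circuit playing the role of a classical neuron, consider a single qubit rotated by a trainable angle, with the probability of measuring |1⟩ read as the model's output. This is a deliberately minimal sketch, not the ansatz used in the actual experiment:

```python
import math

def ry_circuit_output(theta):
    """Simulate a one-qubit circuit: start in |0>, apply Ry(theta), and
    return the probability of measuring |1>.  This probability plays the
    role of the model's output, as a neuron's activation would classically."""
    # Ry(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>
    amp1 = math.sin(theta / 2)
    return amp1 * amp1

print(ry_circuit_output(0.0))       # -> 0.0, read as "false"
print(ry_circuit_output(math.pi))   # -> 1.0, read as "true"
```

Training then means adjusting the angles until the circuit's outputs match the patterns in the data, just as one would adjust the weights of a neural network.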
Encoding Meaning in Quantum States. Once words and phrases are represented as quantum states, we can generate quantum states that encapsulate the meanings of grammatical sentences on quantum hardware. This allows us to pose questions to the quantum computer, which responds based on the vocabulary and grammar it has assimilated.
Experiment Design and Execution. Our experimental framework is predicated on the scalability of our designs. As the number of qubits increases, the dimensionality of the meaning space grows exponentially (a register of n qubits spans a 2^n-dimensional state space), whereas the circuit size dictated by grammar remains manageable. Notably, QNLP on NISQ devices offers a fresh opportunity to assess the scalability of quantum machine learning algorithms while experimenting with various quantum meaning spaces.
Technical Framework Details. The workflow of our experiment involves creating grammar categories to generate grammatical diagrams that represent the information flow of meanings within sentences. These diagrams are then instantiated as quantum circuits, where the meanings of words are encoded in quantum states. Each state is prepared from a classical reference state, and the combination of words in a sentence corresponds to the composition of circuits that prepare a state encoding that sentence's meaning.
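The grammatical diagrams follow Lambek's pregroup grammar: a noun has type n, a transitive verb has type n^r · s · n^l, and a string of words is grammatical precisely when adjacent adjoint pairs cancel (the "cups" in the diagram), leaving the sentence type s. The following is a simplified reduction checker in plain Python, not DisCoPy itself; the word typings follow the standard pregroup analysis:

```python
# Pregroup grammar sketch: each word gets a list of basic types, where "nr"
# is the right adjoint of "n" and "nl" its left adjoint.  A word string is
# grammatical if cancelling adjacent (n, nr) and (nl, n) pairs -- the cups
# of the diagram -- leaves exactly the sentence type "s".

LEXICON = {
    "Alice": ["n"],
    "Bob":   ["n"],
    "hates": ["nr", "s", "nl"],   # transitive verb: n^r . s . n^l
}

def reduces_to_sentence(words):
    types = [t for w in words for t in LEXICON[w]]
    stack = []
    for t in types:
        if stack and (stack[-1], t) in {("n", "nr"), ("nl", "n")}:
            stack.pop()           # adjoint pair cancels: draw a cup
        else:
            stack.append(t)
    return stack == ["s"]

print(reduces_to_sentence(["Alice", "hates", "Bob"]))   # -> True
print(reduces_to_sentence(["Alice", "Bob", "hates"]))   # -> False
```

When the reduction succeeds, the pattern of cups tells us exactly which wires to connect in the quantum circuit that prepares the sentence's meaning state.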
Evaluating Quantum Circuits. To compute meanings, the constructed quantum circuits need evaluation, which can be performed on classical computers using advanced techniques for handling large matrices or on quantum devices directly. The parameterized quantum circuit for a sentence like "Alice hates Bob" exemplifies this process.
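Classically, evaluating such a circuit amounts to multiplying the state vector by the matrices of its gates. The toy simulation below applies two parameterized Ry rotations followed by a CNOT to a two-qubit register; it is a stand-in for the sentence circuits, not the experiment's actual ansatz:

```python
import math

# Minimal statevector simulation of a two-qubit parameterised circuit:
# Ry(a) on qubit 0, Ry(b) on qubit 1, then CNOT with qubit 0 as control.

def ry(theta):
    """Matrix of the Ry rotation gate."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def evaluate(a, b):
    """Return the state vector over the basis |00>, |01>, |10>, |11>."""
    u, v = ry(a), ry(b)
    # product state after the two single-qubit rotations, starting from |00>
    state = [u[i][0] * v[j][0] for i in range(2) for j in range(2)]
    # CNOT with qubit 0 as control swaps the |10> and |11> amplitudes
    state[2], state[3] = state[3], state[2]
    return state

print(evaluate(math.pi, 0.0))   # amplitudes ~ [0, 0, 0, 1], i.e. |11>
```

On real hardware the same amplitudes are estimated statistically by repeatedly running the circuit and counting measurement outcomes, which is why circuit evaluation scales where explicit matrix manipulation does not.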
Training and Parameter Learning. To ensure effective execution on near-term NISQ devices, we selected a limited vocabulary and generated all possible grammatical sentences from these words. For each sentence, the corresponding parameterized circuit was created by interpreting its grammatical diagram in a one-dimensional sentence space, so that the circuit's output encodes the sentence's truth value.
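The training loop can be sketched as follows: each word receives a trainable rotation angle, a sentence's circuit composes the rotations of its words, and the angles are adjusted until the circuit outputs match the truth labels. The vocabulary, labels, and coordinate-descent optimizer below are all simplified stand-ins for the experiment's actual setup:

```python
import math, random

# Toy QNLP training sketch: every word gets one trainable rotation angle,
# a sentence's circuit composes the Ry rotations of its words on a single
# qubit, and the probability of measuring |1> is read as its truth value.

random.seed(1)
data = [(("Alice", "hates", "Bob"), 1.0),   # made-up labelled sentences
        (("Alice", "loves", "Bob"), 0.0)]
theta = {w: random.uniform(0.0, math.pi)
         for sent, _ in data for w in sent}

def predict(sent):
    # consecutive Ry rotations on one qubit compose by adding their angles
    total = sum(theta[w] for w in sent)
    return math.sin(total / 2) ** 2         # probability of measuring |1>

def loss():
    return sum((predict(s) - y) ** 2 for s, y in data)

# crude coordinate descent over the word angles
for _ in range(300):
    for w in theta:
        for step in (0.05, -0.05):
            before = loss()
            theta[w] += step
            if loss() >= before:
                theta[w] -= step            # keep the step only if it helps

print(round(predict(("Alice", "hates", "Bob")), 2))  # close to 1.0
```

Because "Alice" and "Bob" are shared between sentences, the optimizer must find word meanings that work in every context they appear in; this sharing across sentences is what the actual experiment exploits at scale.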
Post-Training Evaluation. After training, new sentences are used to assess how well the truth labels of previously unseen sentences can be inferred. These new sentences share the same vocabulary but differ grammatically and semantically from the training set.
Questioning the Quantum Computer. With our learning framework established, we can ask the quantum computer grammatical questions derived from the training vocabulary. Interestingly, questions can be constructed from learned sentence compositions, enabling the evaluation of more complex circuits.
Compiling and Optimizing Quantum Circuits. A critical component of our experiment is the effective compilation and optimization of circuits for execution on quantum devices. We employed CQC's quantum software development platform, t|ket>, to adapt the circuits for device-native operations while accommodating connectivity constraints.
Access to Experimental Resources. The experiments discussed are available in the following repository: https://github.com/oxford-quantum-group/discopy/blob/ab2b356bd3cad1dfb55ca6606d6c4b4181fe590c/notebooks/qnlp-experiment.ipynb
Future Directions. There are several avenues for advancing this experiment. Variations in hardware, such as using ion traps or optical systems instead of superconducting qubits, can be easily implemented via the hardware-agnostic t|ket>. Additionally, we aim to explore different computational models, such as measurement-based quantum computing (MBQC), and expand beyond single sentences to process larger texts. Future articles will provide updates on our continued experiments and potential tasks like language generation and summarization. As quantum hardware capabilities expand, we anticipate scaling up the complexity of tasks and meaning spaces, fulfilling our overarching objective.
Notes and References
¹ The term "bag" is commonly used to describe the neglect of grammatical structure in many NLP applications.
² New Scientist article on quantum links and language understanding.
³ Foundational work integrating quantum theory and linguistics.
⁴ References to seminal papers and developments in quantum computing.