
Quantum Transformers for Natural Language Processing

Would the transformer architecture — the backbone of large language models (LLMs) like ChatGPT — still work if we replaced its neural networks with quantum unitary operations?

The answer is yes!

This repository contains two Quantum Transformer (QT) models, each in its own Jupyter notebook, that demonstrate how this works. For a detailed explanation, see the code-block descriptions in the notebooks or my blog post on this work. While early results on the tiny Shakespeare dataset are modest, the architecture is promising.

1. Interferometric Transformer (IT) Model

Replaces classical linear layers with interferometric networks: phase shifters and beamsplitters (Fourier transforms).
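For intuition, here is a minimal sketch of such an interferometric layer, not code from the notebooks: trainable phase shifters alternate with fixed Fourier transforms, which play the role of a balanced beamsplitter mesh, so the whole map stays unitary. The class name, depth, and PyTorch framing are illustrative assumptions.

```python
import torch

class InterferometricLayer(torch.nn.Module):
    """Sketch of a unitary 'linear layer': phase shifters + Fourier transforms."""

    def __init__(self, dim: int, depth: int = 3):
        super().__init__()
        # One trainable vector of phases per sub-layer (the phase shifters).
        self.phases = torch.nn.Parameter(torch.zeros(depth, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Treat the input as a vector of complex amplitudes.
        z = x.to(torch.complex64)
        for phi in self.phases:
            z = torch.exp(1j * phi) * z          # phase shifters: a diagonal unitary
            z = torch.fft.fft(z, norm="ortho")   # beamsplitter mesh as a unitary Fourier transform
        return z
```

Because each sub-layer is unitary, the layer preserves the norm of the state vector, unlike a generic classical linear layer.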

2. Rotational Transformer (RT) Model

Replaces classical linear layers with qubit-rotation networks: only single-qubit Ry rotations.
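Again for intuition, a minimal sketch, not the notebooks' code: one trainable Ry angle per qubit, applied to an n-qubit state vector as a tensor product of 2x2 rotations. The names and the PyTorch framing are assumptions.

```python
import torch

def ry(theta: torch.Tensor) -> torch.Tensor:
    """Single-qubit rotation Ry(theta) = exp(-i * theta * Y / 2) as a real 2x2 matrix."""
    c, s = torch.cos(theta / 2), torch.sin(theta / 2)
    return torch.stack([torch.stack([c, -s]), torch.stack([s, c])])

class RyLayer(torch.nn.Module):
    """Sketch of a rotation layer: an independent Ry on each of n qubits."""

    def __init__(self, n_qubits: int):
        super().__init__()
        self.thetas = torch.nn.Parameter(0.1 * torch.randn(n_qubits))

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # state: (2**n_qubits,) amplitude vector. Reshape to expose one axis
        # per qubit, then rotate each qubit's axis in turn.
        n = self.thetas.numel()
        psi = state.reshape([2] * n)
        for i, theta in enumerate(self.thetas):
            psi = torch.tensordot(ry(theta), psi, dims=([1], [i]))
            psi = torch.movedim(psi, 0, i)  # restore the rotated qubit's axis position
        return psi.reshape(-1)
```

For example, `RyLayer(2)(torch.tensor([1.0, 0.0, 0.0, 0.0]))` rotates the |00⟩ state by the current angles.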

[Figure: QT Model]